2014-01-04 18:07:18

by Mikulas Patocka

[permalink] [raw]
Subject: [PATCH] kobject: provide kobject_put_wait to fix module unload race

Hi

I noticed that Jeff Mahoney added a new structure kobj_completion, defined
in include/linux/kobj_completion.h to the kernel 3.13-rc1 in the patch
eee031649707db3c9920d9498f8d03819b74fc23. In the current upstream kernel,
this interface is still unused.

However, converting the drivers to use kobj_completion is not trivial
(note that all users of the original kobject interface are buggy - so all
of them need to be converted).

I came up with a simpler patch to achieve the same purpose - this patch
makes fixing the drivers easy - the driver is fixed just by replacing
"kobject_put" with "kobject_put_wait" in the unload routine.

I'd like to ask if you could revert
eee031649707db3c9920d9498f8d03819b74fc23 (no code uses it) and replace it
with this patch.

See http://www.redhat.com/archives/dm-devel/2013-October/msg00141.html for
the bug that this patch fixes.

Mikulas



From: Mikulas Patocka <[email protected]>

This patch introduces a new function kobject_put_wait. It decrements the
kobject reference count, waits until the count reaches zero. When this
function returns, it is guaranteed that the kobject was freed.

A rationale for this function:

The kobject is keeps a reference count. The driver unload routine
decrements the reference count, however, references to the kobject may
still be held by other kernel subsystems. The driver must not free the
memory that contains the kobject. Instead, the driver provides a "release"
method. The "release" method is called by the kernel when the last kobject
refernce is dropped. The "release" method should free the memory that
contains the kobject.

However, this pattern is buggy with respect to modules. The release method
is placed in the driver's module. When the driver exits, the module
reference count is zero, thus the module may be freed. However, there may
still be references to the kobject. If the module is unloaded and then the
release method is called, a crash happens.

Recently, CONFIG_DEBUG_KOBJECT_RELEASE was added. This option deliberately
provokes this race condition.

This patch fixes the bug by providing new function kobject_put_wait.
kobject_put_wait works like kobject_put, but it also waits until all other
references were dropped and until the kobject was freed. When
kobject_put_wait returns, it is guaranteed that the kobject was released
with the release method.

Thus, we can change kobject_put to kobject_put_wait in the unload routine
of various drivers to fix the above race condition.

This patch fixes it for device mapper. Note that all kobject users in
modules should be fixed to use this function.

Signed-off-by: Mikulas Patocka <[email protected]>
Cc: [email protected]

---
drivers/md/dm-sysfs.c | 2 +-
include/linux/kobject.h | 3 +++
lib/kobject.c | 34 +++++++++++++++++++++++++++++-----
3 files changed, 33 insertions(+), 6 deletions(-)

Index: linux-3.13-rc6/include/linux/kobject.h
===================================================================
--- linux-3.13-rc6.orig/include/linux/kobject.h 2014-01-02 23:13:24.000000000 +0100
+++ linux-3.13-rc6/include/linux/kobject.h 2014-01-02 23:14:02.000000000 +0100
@@ -27,6 +27,7 @@
#include <linux/wait.h>
#include <linux/atomic.h>
#include <linux/workqueue.h>
+#include <linux/completion.h>

#define UEVENT_HELPER_PATH_LEN 256
#define UEVENT_NUM_ENVP 32 /* number of env pointers */
@@ -66,6 +67,7 @@ struct kobject {
struct kobj_type *ktype;
struct sysfs_dirent *sd;
struct kref kref;
+ struct completion *free_completion;
#ifdef CONFIG_DEBUG_KOBJECT_RELEASE
struct delayed_work release;
#endif
@@ -106,6 +108,7 @@ extern int __must_check kobject_move(str

extern struct kobject *kobject_get(struct kobject *kobj);
extern void kobject_put(struct kobject *kobj);
+extern void kobject_put_wait(struct kobject *kobj);

extern const void *kobject_namespace(struct kobject *kobj);
extern char *kobject_get_path(struct kobject *kobj, gfp_t flag);
Index: linux-3.13-rc6/lib/kobject.c
===================================================================
--- linux-3.13-rc6.orig/lib/kobject.c 2014-01-02 23:13:23.000000000 +0100
+++ linux-3.13-rc6/lib/kobject.c 2014-01-02 23:17:01.000000000 +0100
@@ -172,6 +172,7 @@ static void kobject_init_internal(struct
if (!kobj)
return;
kref_init(&kobj->kref);
+ kobj->free_completion = NULL;
INIT_LIST_HEAD(&kobj->entry);
kobj->state_in_sysfs = 0;
kobj->state_add_uevent_sent = 0;
@@ -577,15 +578,11 @@ static void kobject_cleanup(struct kobje
{
struct kobj_type *t = get_ktype(kobj);
const char *name = kobj->name;
+ struct completion *free_completion = kobj->free_completion;

pr_debug("kobject: '%s' (%p): %s, parent %p\n",
kobject_name(kobj), kobj, __func__, kobj->parent);

- if (t && !t->release)
- pr_debug("kobject: '%s' (%p): does not have a release() "
- "function, it is broken and must be fixed.\n",
- kobject_name(kobj), kobj);
-
/* send "remove" if the caller did not do it but sent "add" */
if (kobj->state_add_uevent_sent && !kobj->state_remove_uevent_sent) {
pr_debug("kobject: '%s' (%p): auto cleanup 'remove' event\n",
@@ -611,6 +608,10 @@ static void kobject_cleanup(struct kobje
pr_debug("kobject: '%s': free name\n", name);
kfree(name);
}
+
+ /* if someone is waiting for the kobject to be freed, wake him up */
+ if (free_completion)
+ complete(free_completion);
}

#ifdef CONFIG_DEBUG_KOBJECT_RELEASE
@@ -651,6 +652,28 @@ void kobject_put(struct kobject *kobj)
}
}

+/**
+ * kobject_put - decrement refcount for object and wait until it reaches zero.
+ * @kobj: object.
+ *
+ * Decrement the refcount, and wait until the refcount reaches zero and the
+ * kobject is freed.
+ *
+ * This function should be called from the driver unload routine. It must not
+ * be called concurrently on the same kobject. When this function returns, it
+ * is guaranteed that the kobject was freed.
+ */
+void kobject_put_wait(struct kobject *kobj)
+{
+ if (kobj) {
+ DECLARE_COMPLETION_ONSTACK(completion);
+ BUG_ON(kobj->free_completion);
+ kobj->free_completion = &completion;
+ kobject_put(kobj);
+ wait_for_completion(&completion);
+ }
+}
+
static void dynamic_kobj_release(struct kobject *kobj)
{
pr_debug("kobject: (%p): %s\n", kobj, __func__);
@@ -1076,6 +1099,7 @@ void kobj_ns_drop(enum kobj_ns_type type

EXPORT_SYMBOL(kobject_get);
EXPORT_SYMBOL(kobject_put);
+EXPORT_SYMBOL(kobject_put_wait);
EXPORT_SYMBOL(kobject_del);

EXPORT_SYMBOL(kset_register);
Index: linux-3.13-rc6/drivers/md/dm-sysfs.c
===================================================================
--- linux-3.13-rc6.orig/drivers/md/dm-sysfs.c 2014-01-02 23:13:24.000000000 +0100
+++ linux-3.13-rc6/drivers/md/dm-sysfs.c 2014-01-02 23:14:02.000000000 +0100
@@ -104,5 +104,5 @@ int dm_sysfs_init(struct mapped_device *
*/
void dm_sysfs_exit(struct mapped_device *md)
{
- kobject_put(dm_kobject(md));
+ kobject_put_wait(dm_kobject(md));
}


2014-01-04 18:14:29

by Jeff Mahoney

[permalink] [raw]
Subject: Re: [PATCH] kobject: provide kobject_put_wait to fix module unload race

On 1/4/14, 1:06 PM, Mikulas Patocka wrote:
> Hi
>
> I noticed that Jeff Mahoney added a new structure kobj_completion, defined
> in include/linux/kobj_completion.h to the kernel 3.13-rc1 in the patch
> eee031649707db3c9920d9498f8d03819b74fc23. In the current upstream kernel,
> this interface is still unused.
>
> However, converting the drivers to use kobj_completion is not trivial
> (note that all users of the original kobject interface are buggy - so all
> of them need to be converted).
>
> I came up with a simpler patch to achieve the same purpose - this patch
> makes fixing the drivers easy - the driver is fixed just by replacing
> "kobject_put" with "kobject_put_wait" in the unload routine.
>
> I'd like to ask if you could revert
> eee031649707db3c9920d9498f8d03819b74fc23 (no code uses it) and replace it
> with this patch.

I have no objections to reverting it. There were concerns from Al Viro
that it'd be tough to get right by callers and I had assumed it got
dropped after that. I had planned on using it in my btrfs sysfs exports
patchset but came up with a better way.

-Jeff

> See http://www.redhat.com/archives/dm-devel/2013-October/msg00141.html for
> the bug that this patch fixes.
>
> Mikulas
>
>
>
> From: Mikulas Patocka <[email protected]>
>
> This patch introduces a new function kobject_put_wait. It decrements the
> kobject reference count, waits until the count reaches zero. When this
> function returns, it is guaranteed that the kobject was freed.
>
> A rationale for this function:
>
> The kobject is keeps a reference count. The driver unload routine
> decrements the reference count, however, references to the kobject may
> still be held by other kernel subsystems. The driver must not free the
> memory that contains the kobject. Instead, the driver provides a "release"
> method. The "release" method is called by the kernel when the last kobject
> refernce is dropped. The "release" method should free the memory that
> contains the kobject.
>
> However, this pattern is buggy with respect to modules. The release method
> is placed in the driver's module. When the driver exits, the module
> reference count is zero, thus the module may be freed. However, there may
> still be references to the kobject. If the module is unloaded and then the
> release method is called, a crash happens.
>
> Recently, CONFIG_DEBUG_KOBJECT_RELEASE was added. This option deliberately
> provokes this race condition.
>
> This patch fixes the bug by providing new function kobject_put_wait.
> kobject_put_wait works like kobject_put, but it also waits until all other
> references were dropped and until the kobject was freed. When
> kobject_put_wait returns, it is guaranteed that the kobject was released
> with the release method.
>
> Thus, we can change kobject_put to kobject_put_wait in the unload routine
> of various drivers to fix the above race condition.
>
> This patch fixes it for device mapper. Note that all kobject users in
> modules should be fixed to use this function.
>
> Signed-off-by: Mikulas Patocka <[email protected]>
> Cc: [email protected]
>
> ---
> drivers/md/dm-sysfs.c | 2 +-
> include/linux/kobject.h | 3 +++
> lib/kobject.c | 34 +++++++++++++++++++++++++++++-----
> 3 files changed, 33 insertions(+), 6 deletions(-)
>
> Index: linux-3.13-rc6/include/linux/kobject.h
> ===================================================================
> --- linux-3.13-rc6.orig/include/linux/kobject.h 2014-01-02 23:13:24.000000000 +0100
> +++ linux-3.13-rc6/include/linux/kobject.h 2014-01-02 23:14:02.000000000 +0100
> @@ -27,6 +27,7 @@
> #include <linux/wait.h>
> #include <linux/atomic.h>
> #include <linux/workqueue.h>
> +#include <linux/completion.h>
>
> #define UEVENT_HELPER_PATH_LEN 256
> #define UEVENT_NUM_ENVP 32 /* number of env pointers */
> @@ -66,6 +67,7 @@ struct kobject {
> struct kobj_type *ktype;
> struct sysfs_dirent *sd;
> struct kref kref;
> + struct completion *free_completion;
> #ifdef CONFIG_DEBUG_KOBJECT_RELEASE
> struct delayed_work release;
> #endif
> @@ -106,6 +108,7 @@ extern int __must_check kobject_move(str
>
> extern struct kobject *kobject_get(struct kobject *kobj);
> extern void kobject_put(struct kobject *kobj);
> +extern void kobject_put_wait(struct kobject *kobj);
>
> extern const void *kobject_namespace(struct kobject *kobj);
> extern char *kobject_get_path(struct kobject *kobj, gfp_t flag);
> Index: linux-3.13-rc6/lib/kobject.c
> ===================================================================
> --- linux-3.13-rc6.orig/lib/kobject.c 2014-01-02 23:13:23.000000000 +0100
> +++ linux-3.13-rc6/lib/kobject.c 2014-01-02 23:17:01.000000000 +0100
> @@ -172,6 +172,7 @@ static void kobject_init_internal(struct
> if (!kobj)
> return;
> kref_init(&kobj->kref);
> + kobj->free_completion = NULL;
> INIT_LIST_HEAD(&kobj->entry);
> kobj->state_in_sysfs = 0;
> kobj->state_add_uevent_sent = 0;
> @@ -577,15 +578,11 @@ static void kobject_cleanup(struct kobje
> {
> struct kobj_type *t = get_ktype(kobj);
> const char *name = kobj->name;
> + struct completion *free_completion = kobj->free_completion;
>
> pr_debug("kobject: '%s' (%p): %s, parent %p\n",
> kobject_name(kobj), kobj, __func__, kobj->parent);
>
> - if (t && !t->release)
> - pr_debug("kobject: '%s' (%p): does not have a release() "
> - "function, it is broken and must be fixed.\n",
> - kobject_name(kobj), kobj);
> -
> /* send "remove" if the caller did not do it but sent "add" */
> if (kobj->state_add_uevent_sent && !kobj->state_remove_uevent_sent) {
> pr_debug("kobject: '%s' (%p): auto cleanup 'remove' event\n",
> @@ -611,6 +608,10 @@ static void kobject_cleanup(struct kobje
> pr_debug("kobject: '%s': free name\n", name);
> kfree(name);
> }
> +
> + /* if someone is waiting for the kobject to be freed, wake him up */
> + if (free_completion)
> + complete(free_completion);
> }
>
> #ifdef CONFIG_DEBUG_KOBJECT_RELEASE
> @@ -651,6 +652,28 @@ void kobject_put(struct kobject *kobj)
> }
> }
>
> +/**
> + * kobject_put - decrement refcount for object and wait until it reaches zero.
> + * @kobj: object.
> + *
> + * Decrement the refcount, and wait until the refcount reaches zero and the
> + * kobject is freed.
> + *
> + * This function should be called from the driver unload routine. It must not
> + * be called concurrently on the same kobject. When this function returns, it
> + * is guaranteed that the kobject was freed.
> + */
> +void kobject_put_wait(struct kobject *kobj)
> +{
> + if (kobj) {
> + DECLARE_COMPLETION_ONSTACK(completion);
> + BUG_ON(kobj->free_completion);
> + kobj->free_completion = &completion;
> + kobject_put(kobj);
> + wait_for_completion(&completion);
> + }
> +}
> +
> static void dynamic_kobj_release(struct kobject *kobj)
> {
> pr_debug("kobject: (%p): %s\n", kobj, __func__);
> @@ -1076,6 +1099,7 @@ void kobj_ns_drop(enum kobj_ns_type type
>
> EXPORT_SYMBOL(kobject_get);
> EXPORT_SYMBOL(kobject_put);
> +EXPORT_SYMBOL(kobject_put_wait);
> EXPORT_SYMBOL(kobject_del);
>
> EXPORT_SYMBOL(kset_register);
> Index: linux-3.13-rc6/drivers/md/dm-sysfs.c
> ===================================================================
> --- linux-3.13-rc6.orig/drivers/md/dm-sysfs.c 2014-01-02 23:13:24.000000000 +0100
> +++ linux-3.13-rc6/drivers/md/dm-sysfs.c 2014-01-02 23:14:02.000000000 +0100
> @@ -104,5 +104,5 @@ int dm_sysfs_init(struct mapped_device *
> */
> void dm_sysfs_exit(struct mapped_device *md)
> {
> - kobject_put(dm_kobject(md));
> + kobject_put_wait(dm_kobject(md));
> }
>


--
Jeff Mahoney
SUSE Labs


Attachments:
signature.asc (841.00 B)
OpenPGP digital signature

2014-01-04 18:16:12

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] kobject: provide kobject_put_wait to fix module unload race

On Sat, Jan 04, 2014 at 01:06:01PM -0500, Mikulas Patocka wrote:
> Hi
>
> I noticed that Jeff Mahoney added a new structure kobj_completion, defined
> in include/linux/kobj_completion.h to the kernel 3.13-rc1 in the patch
> eee031649707db3c9920d9498f8d03819b74fc23. In the current upstream kernel,
> this interface is still unused.

There are pending btrfs patches to use this interface.

> However, converting the drivers to use kobj_completion is not trivial
> (note that all users of the original kobject interface are buggy - so all
> of them need to be converted).

Wait, what? How are "all users" buggy? Please explain this in detail.

> I came up with a simpler patch to achieve the same purpose - this patch
> makes fixing the drivers easy - the driver is fixed just by replacing
> "kobject_put" with "kobject_put_wait" in the unload routine.

No, that's not ok at all.

> I'd like to ask if you could revert
> eee031649707db3c9920d9498f8d03819b74fc23 (no code uses it) and replace it
> with this patch.
>
> See http://www.redhat.com/archives/dm-devel/2013-October/msg00141.html for
> the bug that this patch fixes.
>
> Mikulas
>
>
>
> From: Mikulas Patocka <[email protected]>
>
> This patch introduces a new function kobject_put_wait. It decrements the
> kobject reference count, waits until the count reaches zero. When this
> function returns, it is guaranteed that the kobject was freed.
>
> A rationale for this function:
>
> The kobject is keeps a reference count. The driver unload routine
> decrements the reference count, however, references to the kobject may
> still be held by other kernel subsystems. The driver must not free the
> memory that contains the kobject. Instead, the driver provides a "release"
> method. The "release" method is called by the kernel when the last kobject
> refernce is dropped. The "release" method should free the memory that
> contains the kobject.
>
> However, this pattern is buggy with respect to modules. The release method
> is placed in the driver's module. When the driver exits, the module
> reference count is zero, thus the module may be freed. However, there may
> still be references to the kobject. If the module is unloaded and then the
> release method is called, a crash happens.

Yes, module unloading while a kobject is still "active" is not a good
thing, what modules do you have that cause this problem? Why not just
grab the module reference in your kobject if you need this type of
protection? It's not the kobject's code fault that this issue is there,
or that we now have a "delayed release" function to expose this type of
thing, it's the user of the kobject.

Please fix the broken users of the kobject first.

thanks,

greg k-h

2014-01-04 18:34:16

by Al Viro

[permalink] [raw]
Subject: Re: [PATCH] kobject: provide kobject_put_wait to fix module unload race

On Sat, Jan 04, 2014 at 10:16:20AM -0800, Greg Kroah-Hartman wrote:

> > I came up with a simpler patch to achieve the same purpose - this patch
> > makes fixing the drivers easy - the driver is fixed just by replacing
> > "kobject_put" with "kobject_put_wait" in the unload routine.
>
> No, that's not ok at all.

Agreed - all it takes is one cargo-culter who religoiusly does such
conversion and drops a ref to parent before that to child.

> > However, this pattern is buggy with respect to modules. The release method
> > is placed in the driver's module. When the driver exits, the module
> > reference count is zero, thus the module may be freed. However, there may
> > still be references to the kobject. If the module is unloaded and then the
> > release method is called, a crash happens.
>
> Yes, module unloading while a kobject is still "active" is not a good
> thing, what modules do you have that cause this problem? Why not just
> grab the module reference in your kobject if you need this type of
> protection? It's not the kobject's code fault that this issue is there,
> or that we now have a "delayed release" function to expose this type of
> thing, it's the user of the kobject.
>
> Please fix the broken users of the kobject first.

<snide> Are you saying that there is another kind? </snide>

When would you grab that reference to module? More to the point, when
would you *drop* it? Doing so from module_exit is not going to work,
obviously...

2014-01-04 20:35:59

by Mikulas Patocka

[permalink] [raw]
Subject: Re: [PATCH] kobject: provide kobject_put_wait to fix module unload race



On Sat, 4 Jan 2014, Greg Kroah-Hartman wrote:

> On Sat, Jan 04, 2014 at 01:06:01PM -0500, Mikulas Patocka wrote:
> > Hi
> >
> > I noticed that Jeff Mahoney added a new structure kobj_completion, defined
> > in include/linux/kobj_completion.h to the kernel 3.13-rc1 in the patch
> > eee031649707db3c9920d9498f8d03819b74fc23. In the current upstream kernel,
> > this interface is still unused.
>
> There are pending btrfs patches to use this interface.
>
> > However, converting the drivers to use kobj_completion is not trivial
> > (note that all users of the original kobject interface are buggy - so all
> > of them need to be converted).
>
> Wait, what? How are "all users" buggy? Please explain this in detail.

1) some code takes a reference to a kobject
2) the user unloads the device
3) the device driver unload routine calls kobject_put (but there is still
reference, so the kobject is not destroyed)
4) the device is unregistered, it drops the module reference count to zero
5) the user unloads the module
6) the piece of code that grabbed the kobject reference at step 1) drops
it
7) the kobject code calls ->release method, it points into the unloaded
module


You could try to fix it by incrementing the module reference count between
steps 2) and 3) and decrementing it in the kobject release routine, but it
doesn't work - you'd have to run "module_put(THIS_MODULE);" from the
kobject release routine - it is still buggy, because the module may be
unloaded just after "module_put(THIS_MODULE);" and before the release
routine returns.


Another problem with incrementing the module reference count happens if
the above sequence is executing as a consequence of module unloading -
suppose that steps 2)-4) are being executed from the module unload routine
itself (for example, network drivers have this property that the module
may be removed while the device is in use and the module unload routine
shuts the device down). In this case, there is no way to delay unloading
of the module from the unload routine itself.


> > I came up with a simpler patch to achieve the same purpose - this patch
> > makes fixing the drivers easy - the driver is fixed just by replacing
> > "kobject_put" with "kobject_put_wait" in the unload routine.
>
> No, that's not ok at all.

What exactly do you think is wrong with this patch?

> > I'd like to ask if you could revert
> > eee031649707db3c9920d9498f8d03819b74fc23 (no code uses it) and replace it
> > with this patch.
> >
> > See http://www.redhat.com/archives/dm-devel/2013-October/msg00141.html for
> > the bug that this patch fixes.
> >
> > Mikulas
> >
> >
> >
> > From: Mikulas Patocka <[email protected]>
> >
> > This patch introduces a new function kobject_put_wait. It decrements the
> > kobject reference count, waits until the count reaches zero. When this
> > function returns, it is guaranteed that the kobject was freed.
> >
> > A rationale for this function:
> >
> > The kobject is keeps a reference count. The driver unload routine
> > decrements the reference count, however, references to the kobject may
> > still be held by other kernel subsystems. The driver must not free the
> > memory that contains the kobject. Instead, the driver provides a "release"
> > method. The "release" method is called by the kernel when the last kobject
> > refernce is dropped. The "release" method should free the memory that
> > contains the kobject.
> >
> > However, this pattern is buggy with respect to modules. The release method
> > is placed in the driver's module. When the driver exits, the module
> > reference count is zero, thus the module may be freed. However, there may
> > still be references to the kobject. If the module is unloaded and then the
> > release method is called, a crash happens.
>
> Yes, module unloading while a kobject is still "active" is not a good
> thing, what modules do you have that cause this problem?

All modules - Do you have some code in the generic module unload routine
that prevents unloading the module while someone holds a reference to a
kobject created by the module? If not, then all modules have this bug.

> Why not just
> grab the module reference in your kobject if you need this type of
> protection?

See above - you'd have to drop the reference count from the module itself.
And it doesn't work if you are destroying the kobject from the module
unload routine itself.

> It's not the kobject's code fault that this issue is there,
> or that we now have a "delayed release" function to expose this type of
> thing, it's the user of the kobject.

So, show me any single driver that do this correctly. I haven't looked at
the whole kernel, but all drivers I have looked on so far are buggy.


Just a grep '\.release' drivers/*/*.c shows this:

These are buggy because they don't manage module references in the kobject
release routine: pkt_kobj_release, cpufreq_sysfs_release,
cpuidle_sysfs_release, ktype_state_cpuidle, cpuidle_driver_sysfs_release,
dmi_entry_free, dmi_sysfs_entry_release, release_firmware_map_entry,
rdev_free, md_free, pps_cdev_release, iscsi_boot_kobj_release,
map_release, portio_release

These are decrementing the own module count from the release method, so
they are less buggy than the above, but still buggy:
edac_device_ctrl_instance_release, edac_device_ctrl_master_release,
edac_device_ctrl_instance_release

> Please fix the broken users of the kobject first.
>
> thanks,
>
> greg k-h

Mikulas

2014-01-04 22:42:11

by Dmitry Torokhov

[permalink] [raw]
Subject: Re: [PATCH] kobject: provide kobject_put_wait to fix module unload race

On Sat, Jan 04, 2014 at 06:34:03PM +0000, Al Viro wrote:
> On Sat, Jan 04, 2014 at 10:16:20AM -0800, Greg Kroah-Hartman wrote:
>
> > > I came up with a simpler patch to achieve the same purpose - this patch
> > > makes fixing the drivers easy - the driver is fixed just by replacing
> > > "kobject_put" with "kobject_put_wait" in the unload routine.
> >
> > No, that's not ok at all.
>
> Agreed - all it takes is one cargo-culter who religoiusly does such
> conversion and drops a ref to parent before that to child.
>
> > > However, this pattern is buggy with respect to modules. The release method
> > > is placed in the driver's module. When the driver exits, the module
> > > reference count is zero, thus the module may be freed. However, there may
> > > still be references to the kobject. If the module is unloaded and then the
> > > release method is called, a crash happens.
> >
> > Yes, module unloading while a kobject is still "active" is not a good
> > thing, what modules do you have that cause this problem? Why not just
> > grab the module reference in your kobject if you need this type of
> > protection? It's not the kobject's code fault that this issue is there,
> > or that we now have a "delayed release" function to expose this type of
> > thing, it's the user of the kobject.
> >
> > Please fix the broken users of the kobject first.
>
> <snide> Are you saying that there is another kind? </snide>
>
> When would you grab that reference to module? More to the point, when
> would you *drop* it? Doing so from module_exit is not going to work,
> obviously...

You normally have subsystem core module that does handle release of its
objects and users of said objects so it is usually OK for objects to
outlive the users, you just need to make sure the core stays around.

In input we grab module reference to input core when we allocate input
device and drop it when input device is freed. This way we can be sure
that input core stays around until all input devices are gone. The same
for serio.

Thanks.

--
Dmitry

2014-01-05 03:42:18

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] kobject: provide kobject_put_wait to fix module unload race

On Sat, Jan 04, 2014 at 03:35:39PM -0500, Mikulas Patocka wrote:
>
>
> On Sat, 4 Jan 2014, Greg Kroah-Hartman wrote:
>
> > On Sat, Jan 04, 2014 at 01:06:01PM -0500, Mikulas Patocka wrote:
> > > Hi
> > >
> > > I noticed that Jeff Mahoney added a new structure kobj_completion, defined
> > > in include/linux/kobj_completion.h to the kernel 3.13-rc1 in the patch
> > > eee031649707db3c9920d9498f8d03819b74fc23. In the current upstream kernel,
> > > this interface is still unused.
> >
> > There are pending btrfs patches to use this interface.
> >
> > > However, converting the drivers to use kobj_completion is not trivial
> > > (note that all users of the original kobject interface are buggy - so all
> > > of them need to be converted).
> >
> > Wait, what? How are "all users" buggy? Please explain this in detail.
>
> 1) some code takes a reference to a kobject
> 2) the user unloads the device
> 3) the device driver unload routine calls kobject_put (but there is still
> reference, so the kobject is not destroyed)

A driver should never be messing around with "raw" kobjects, they should
be using a 'struct device' which is created/managed by the subsystem
they belong to. See Dmitry's example of input and serio as ways to do
this, also USB and PCI do this properly.

Perhaps your sybsystem isn't doing this properly? What code do you have
that creates raw kobjects and has this problem?

thanks,

greg k-h

2014-01-05 03:48:14

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] kobject: provide kobject_put_wait to fix module unload race

On Sat, Jan 04, 2014 at 01:14:14PM -0500, Jeff Mahoney wrote:
> On 1/4/14, 1:06 PM, Mikulas Patocka wrote:
> > Hi
> >
> > I noticed that Jeff Mahoney added a new structure kobj_completion, defined
> > in include/linux/kobj_completion.h to the kernel 3.13-rc1 in the patch
> > eee031649707db3c9920d9498f8d03819b74fc23. In the current upstream kernel,
> > this interface is still unused.
> >
> > However, converting the drivers to use kobj_completion is not trivial
> > (note that all users of the original kobject interface are buggy - so all
> > of them need to be converted).
> >
> > I came up with a simpler patch to achieve the same purpose - this patch
> > makes fixing the drivers easy - the driver is fixed just by replacing
> > "kobject_put" with "kobject_put_wait" in the unload routine.
> >
> > I'd like to ask if you could revert
> > eee031649707db3c9920d9498f8d03819b74fc23 (no code uses it) and replace it
> > with this patch.
>
> I have no objections to reverting it. There were concerns from Al Viro
> that it'd be tough to get right by callers and I had assumed it got
> dropped after that. I had planned on using it in my btrfs sysfs exports
> patchset but came up with a better way.

Ok, thanks, I'll revert it for 3.14.

greg k-h

2014-01-05 06:05:37

by Dmitry Torokhov

[permalink] [raw]
Subject: Re: [PATCH] kobject: provide kobject_put_wait to fix module unload race

On Sat, Jan 04, 2014 at 07:42:28PM -0800, Greg Kroah-Hartman wrote:
> On Sat, Jan 04, 2014 at 03:35:39PM -0500, Mikulas Patocka wrote:
> >
> >
> > On Sat, 4 Jan 2014, Greg Kroah-Hartman wrote:
> >
> > > On Sat, Jan 04, 2014 at 01:06:01PM -0500, Mikulas Patocka wrote:
> > > > Hi
> > > >
> > > > I noticed that Jeff Mahoney added a new structure kobj_completion, defined
> > > > in include/linux/kobj_completion.h to the kernel 3.13-rc1 in the patch
> > > > eee031649707db3c9920d9498f8d03819b74fc23. In the current upstream kernel,
> > > > this interface is still unused.
> > >
> > > There are pending btrfs patches to use this interface.
> > >
> > > > However, converting the drivers to use kobj_completion is not trivial
> > > > (note that all users of the original kobject interface are buggy - so all
> > > > of them need to be converted).
> > >
> > > Wait, what? How are "all users" buggy? Please explain this in detail.
> >
> > 1) some code takes a reference to a kobject
> > 2) the user unloads the device
> > 3) the device driver unload routine calls kobject_put (but there is still
> > reference, so the kobject is not destroyed)
>
> A driver should never be messing around with "raw" kobjects, they should
> be using a 'struct device' which is created/managed by the subsystem
> they belong to. See Dmitry's example of input and serio as ways to do
> this, also USB and PCI do this properly.

Well, Mikulas is correct in the sense that there is still a race between
release function invoking the final module_put() and getting preempted
and module getting unloaded by another thread. Hitting this race is
pretty hard though.

--
Dmitry

2014-01-05 16:44:17

by Bart Van Assche

[permalink] [raw]
Subject: Re: [dm-devel] [PATCH] kobject: provide kobject_put_wait to fix module unload race

On 01/04/14 19:06, Mikulas Patocka wrote:
> - if (t && !t->release)
> - pr_debug("kobject: '%s' (%p): does not have a release() "
> - "function, it is broken and must be fixed.\n",
> - kobject_name(kobj), kobj);
> -

Has it been considered to issue a warning if no release function has
been defined and free_completion == NULL instead of removing the above
debug message entirely ? I think even with this patch applied it is
still wrong to invoke kobject_put() on an object without defining a
release function.

Thanks,

Bart.

2014-01-05 18:26:29

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [dm-devel] [PATCH] kobject: provide kobject_put_wait to fix module unload race

On Sun, Jan 05, 2014 at 05:43:56PM +0100, Bart Van Assche wrote:
> On 01/04/14 19:06, Mikulas Patocka wrote:
> > - if (t && !t->release)
> > - pr_debug("kobject: '%s' (%p): does not have a release() "
> > - "function, it is broken and must be fixed.\n",
> > - kobject_name(kobj), kobj);
> > -
>
> Has it been considered to issue a warning if no release function has
> been defined and free_completion == NULL instead of removing the above
> debug message entirely ? I think even with this patch applied it is
> still wrong to invoke kobject_put() on an object without defining a
> release function.

This patch isn't going to be applied, and I've reverted the original
commit, so there shouldn't be any issues anymore with this code.

thanks,

greg k-h

2014-01-05 18:27:01

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] kobject: provide kobject_put_wait to fix module unload race

On Sat, Jan 04, 2014 at 10:05:31PM -0800, Dmitry Torokhov wrote:
> On Sat, Jan 04, 2014 at 07:42:28PM -0800, Greg Kroah-Hartman wrote:
> > On Sat, Jan 04, 2014 at 03:35:39PM -0500, Mikulas Patocka wrote:
> > >
> > >
> > > On Sat, 4 Jan 2014, Greg Kroah-Hartman wrote:
> > >
> > > > On Sat, Jan 04, 2014 at 01:06:01PM -0500, Mikulas Patocka wrote:
> > > > > Hi
> > > > >
> > > > > I noticed that Jeff Mahoney added a new structure kobj_completion, defined
> > > > > in include/linux/kobj_completion.h to the kernel 3.13-rc1 in the patch
> > > > > eee031649707db3c9920d9498f8d03819b74fc23. In the current upstream kernel,
> > > > > this interface is still unused.
> > > >
> > > > There are pending btrfs patches to use this interface.
> > > >
> > > > > However, converting the drivers to use kobj_completion is not trivial
> > > > > (note that all users of the original kobject interface are buggy - so all
> > > > > of them need to be converted).
> > > >
> > > > Wait, what? How are "all users" buggy? Please explain this in detail.
> > >
> > > 1) some code takes a reference to a kobject
> > > 2) the user unloads the device
> > > 3) the device driver unload routine calls kobject_put (but there is still
> > > reference, so the kobject is not destroyed)
> >
> > A driver should never be messing around with "raw" kobjects, they should
> > be using a 'struct device' which is created/managed by the subsystem
> > they belong to. See Dmitry's example of input and serio as ways to do
> > this, also USB and PCI do this properly.
>
> Well, Mikulas is correct in the sense that there is still a race between
> release function invoking the final module_put() and getting preempted
> and module getting unloaded by another thread. Hitting this race is
> pretty hard though.

Yes, that is true, especially as no one auto-unloads modules for reasons
like this :)

thanks,

greg k-h

2014-01-05 22:04:49

by Mikulas Patocka

[permalink] [raw]
Subject: Re: [PATCH] kobject: provide kobject_put_wait to fix module unload race



On Sat, 4 Jan 2014, Greg Kroah-Hartman wrote:

> On Sat, Jan 04, 2014 at 03:35:39PM -0500, Mikulas Patocka wrote:
> >
> >
> > On Sat, 4 Jan 2014, Greg Kroah-Hartman wrote:
> >
> > > On Sat, Jan 04, 2014 at 01:06:01PM -0500, Mikulas Patocka wrote:
> > > > Hi
> > > >
> > > > I noticed that Jeff Mahoney added a new structure kobj_completion, defined
> > > > in include/linux/kobj_completion.h to the kernel 3.13-rc1 in the patch
> > > > eee031649707db3c9920d9498f8d03819b74fc23. In the current upstream kernel,
> > > > this interface is still unused.
> > >
> > > There are pending btrfs patches to use this interface.
> > >
> > > > However, converting the drivers to use kobj_completion is not trivial
> > > > (note that all users of the original kobject interface are buggy - so all
> > > > of them need to be converted).
> > >
> > > Wait, what? How are "all users" buggy? Please explain this in detail.
> >
> > 1) some code takes a reference to a kobject
> > 2) the user unloads the device
> > 3) the device driver unload routine calls kobject_put (but there is still
> > reference, so the kobject is not destroyed)
>
> A driver should never be messing around with "raw" kobjects, they should
> be using a 'struct device' which is created/managed by the subsystem
> they belong to. See Dmitry's example of input and serio as ways to do
> this, also USB and PCI do this properly.
>
> Perhaps your sybsystem isn't doing this properly? What code do you have
> that creates raw kobjects and has this problem?
>
> thanks,
>
> greg k-h

So, are you saying that a module shouldn't ever be able to create a
kobject type?

Do "grep -rw kobj_type drivers/ fs/* net/bridge/" to see how much code
uses kobjects. There are 77 line. Majority of them may be compiled as
modules.

What do you want to do with all those kobject users? Hide them behind
another interface that doesn't exists yet?

Mikulas

2014-01-05 22:05:11

by Mikulas Patocka

[permalink] [raw]
Subject: Re: [dm-devel] [PATCH] kobject: provide kobject_put_wait to fix module unload race



On Sun, 5 Jan 2014, Bart Van Assche wrote:

> On 01/04/14 19:06, Mikulas Patocka wrote:
> > - if (t && !t->release)
> > - pr_debug("kobject: '%s' (%p): does not have a release() "
> > - "function, it is broken and must be fixed.\n",
> > - kobject_name(kobj), kobj);
> > -
>
> Has it been considered to issue a warning if no release function has
> been defined and free_completion == NULL instead of removing the above
> debug message entirely ? I think even with this patch applied it is
> still wrong to invoke kobject_put() on an object without defining a
> release function.
>
> Thanks,
>
> Bart.

With the above patch, you don't need the release method for correct
behavior. Therefore you don't have to issue warning when it is missing.

Mikulas

2014-01-05 22:11:40

by Mikulas Patocka

[permalink] [raw]
Subject: Re: [PATCH] kobject: provide kobject_put_wait to fix module unload race



On Sat, 4 Jan 2014, Dmitry Torokhov wrote:

> On Sat, Jan 04, 2014 at 06:34:03PM +0000, Al Viro wrote:
> > On Sat, Jan 04, 2014 at 10:16:20AM -0800, Greg Kroah-Hartman wrote:
> >
> > > > I came up with a simpler patch to achieve the same purpose - this patch
> > > > makes fixing the drivers easy - the driver is fixed just by replacing
> > > > "kobject_put" with "kobject_put_wait" in the unload routine.
> > >
> > > No, that's not ok at all.
> >
> > Agreed - all it takes is one cargo-culter who religoiusly does such
> > conversion and drops a ref to parent before that to child.
> >
> > > > However, this pattern is buggy with respect to modules. The release method
> > > > is placed in the driver's module. When the driver exits, the module
> > > > reference count is zero, thus the module may be freed. However, there may
> > > > still be references to the kobject. If the module is unloaded and then the
> > > > release method is called, a crash happens.
> > >
> > > Yes, module unloading while a kobject is still "active" is not a good
> > > thing, what modules do you have that cause this problem? Why not just
> > > grab the module reference in your kobject if you need this type of
> > > protection? It's not the kobject's code fault that this issue is there,
> > > or that we now have a "delayed release" function to expose this type of
> > > thing, it's the user of the kobject.
> > >
> > > Please fix the broken users of the kobject first.
> >
> > <snide> Are you saying that there is another kind? </snide>
> >
> > When would you grab that reference to module? More to the point, when
> > would you *drop* it? Doing so from module_exit is not going to work,
> > obviously...
>
> You normally have subsystem core module that does handle release of its
> objects and users of said objects so it is usually OK for objects to
> outlive the users, you just need to make sure the core stays around.
>
> In input we grab module reference to input core when we allocate input
> device and drop it when input device is freed. This way we can be sure
> that input core stays around until all input devices are gone. The same
> for serio.
>
> Thanks.
>
> --
> Dmitry

But sometimes, the driver itself needs to create nodes in the sysfs
filesystem (for example drivers/md/dm-sysfs.c). I don't quite see how
would you push all driver-specific sysfs nodes into the generic non-module
code.

Mikulas

2014-01-05 22:23:46

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] kobject: provide kobject_put_wait to fix module unload race

On Sun, Jan 05, 2014 at 05:04:31PM -0500, Mikulas Patocka wrote:
>
>
> On Sat, 4 Jan 2014, Greg Kroah-Hartman wrote:
>
> > On Sat, Jan 04, 2014 at 03:35:39PM -0500, Mikulas Patocka wrote:
> > >
> > >
> > > On Sat, 4 Jan 2014, Greg Kroah-Hartman wrote:
> > >
> > > > On Sat, Jan 04, 2014 at 01:06:01PM -0500, Mikulas Patocka wrote:
> > > > > Hi
> > > > >
> > > > > I noticed that Jeff Mahoney added a new structure kobj_completion, defined
> > > > > in include/linux/kobj_completion.h to the kernel 3.13-rc1 in the patch
> > > > > eee031649707db3c9920d9498f8d03819b74fc23. In the current upstream kernel,
> > > > > this interface is still unused.
> > > >
> > > > There are pending btrfs patches to use this interface.
> > > >
> > > > > However, converting the drivers to use kobj_completion is not trivial
> > > > > (note that all users of the original kobject interface are buggy - so all
> > > > > of them need to be converted).
> > > >
> > > > Wait, what? How are "all users" buggy? Please explain this in detail.
> > >
> > > 1) some code takes a reference to a kobject
> > > 2) the user unloads the device
> > > 3) the device driver unload routine calls kobject_put (but there is still
> > > reference, so the kobject is not destroyed)
> >
> > A driver should never be messing around with "raw" kobjects, they should
> > be using a 'struct device' which is created/managed by the subsystem
> > they belong to. See Dmitry's example of input and serio as ways to do
> > this, also USB and PCI do this properly.
> >
> > Perhaps your sybsystem isn't doing this properly? What code do you have
> > that creates raw kobjects and has this problem?
> >
> > thanks,
> >
> > greg k-h
>
> So, are you saying that a module shouldn't ever be able to create a
> kobject type?
>
> Do "grep -rw kobj_type drivers/ fs/* net/bridge/" to see how much code
> uses kobjects. There are 77 line. Majority of them may be compiled as
> modules.
>
> What do you want to do with all those kobject users? Hide them behind
> another interface that doesn't exists yet?

Most of them should be using the driver/device interface to sysfs (the
drivers/* files, with the exception of the driver core code). I'll look
at the others later.

And note, as module unloading can only happen by the root user, and
never happens "automatically", this is an issue, but a very minor one,
and can usually be solved by having a central "place" that handles the
kobject lifetimes.

thanks,

greg k-h

2014-01-05 22:39:31

by Dmitry Torokhov

[permalink] [raw]
Subject: Re: [PATCH] kobject: provide kobject_put_wait to fix module unload race

On Sun, Jan 05, 2014 at 05:11:23PM -0500, Mikulas Patocka wrote:
>
>
> On Sat, 4 Jan 2014, Dmitry Torokhov wrote:
>
> > On Sat, Jan 04, 2014 at 06:34:03PM +0000, Al Viro wrote:
> > > On Sat, Jan 04, 2014 at 10:16:20AM -0800, Greg Kroah-Hartman wrote:
> > >
> > > > > I came up with a simpler patch to achieve the same purpose - this patch
> > > > > makes fixing the drivers easy - the driver is fixed just by replacing
> > > > > "kobject_put" with "kobject_put_wait" in the unload routine.
> > > >
> > > > No, that's not ok at all.
> > >
> > > Agreed - all it takes is one cargo-culter who religoiusly does such
> > > conversion and drops a ref to parent before that to child.
> > >
> > > > > However, this pattern is buggy with respect to modules. The release method
> > > > > is placed in the driver's module. When the driver exits, the module
> > > > > reference count is zero, thus the module may be freed. However, there may
> > > > > still be references to the kobject. If the module is unloaded and then the
> > > > > release method is called, a crash happens.
> > > >
> > > > Yes, module unloading while a kobject is still "active" is not a good
> > > > thing, what modules do you have that cause this problem? Why not just
> > > > grab the module reference in your kobject if you need this type of
> > > > protection? It's not the kobject's code fault that this issue is there,
> > > > or that we now have a "delayed release" function to expose this type of
> > > > thing, it's the user of the kobject.
> > > >
> > > > Please fix the broken users of the kobject first.
> > >
> > > <snide> Are you saying that there is another kind? </snide>
> > >
> > > When would you grab that reference to module? More to the point, when
> > > would you *drop* it? Doing so from module_exit is not going to work,
> > > obviously...
> >
> > You normally have subsystem core module that does handle release of its
> > objects and users of said objects so it is usually OK for objects to
> > outlive the users, you just need to make sure the core stays around.
> >
> > In input we grab module reference to input core when we allocate input
> > device and drop it when input device is freed. This way we can be sure
> > that input core stays around until all input devices are gone. The same
> > for serio.
> >
> > Thanks.
> >
> > --
> > Dmitry
>
> But sometimes, the driver itself needs to create nodes in the sysfs
> filesystem (for example drivers/md/dm-sysfs.c). I don't quite see how
> would you push all driver-specific sysfs nodes into the generic non-module
> code.

Then you need to make sure your driver does not allow unloading while
its objects are active. I.e. require that all your devices are gone
(by increasing module count when you create a DM object and decreasing
it when you release DM object) before you allow unloading the driver.

Basically we should avoid kobject_put() in exit paths of the module.
Then we are left with that tiny race with release being preempted and
module getting unloaded.

Thanks.

--
Dmitry

2014-01-06 18:43:23

by Mikulas Patocka

[permalink] [raw]
Subject: Re: [PATCH] kobject: provide kobject_put_wait to fix module unload race



On Sun, 5 Jan 2014, Dmitry Torokhov wrote:

> > But sometimes, the driver itself needs to create nodes in the sysfs
> > filesystem (for example drivers/md/dm-sysfs.c). I don't quite see how
> > would you push all driver-specific sysfs nodes into the generic non-module
> > code.
>
> Then you need to make sure your driver does not allow unloading while
> its objects are active. I.e. require that all your devices are gone
> (by increasing module count when you create a DM object and decreasing
> it when you release DM object) before you allow unloading the driver.

For drivers that register to various subsystems (for example with
pci_register_driver and pci_unregister_driver) this can't be done
correctly - pci_unregister_driver is called from the module unload path,
it destroys all instances of the device and their appropriate sysfs nodes.

The sysfs nodes exist even if the driver is unused and has zero module
count.

> Basically we should avoid kobject_put() in exit paths of the module.



> Then we are left with that tiny race with release being preempted and
> module getting unloaded.

Majority of kobject users aren't managing module refcount in the
release routine. They do not have a tiny race - they have a big race that
is hapenning with CONFIG_DEBUG_KOBJECT_RELEASE.



These use completion to wait for the released object (thus, they are
correct):
cpufreq_sysfs_release, cpuidle_sysfs_release, cpuidle_state_sysfs_release,
cpuidle_driver_sysfs_release, ext4_sb_release, ext4_feat_release,
f2fs_sb_release

These have no protection against module unload at all:
pkt_kobj_release, map_release, portio_release, ib_port_release,
cm_release_port_obj, mlx4_port_release, ttm_bo_global_kobj_release,
ttm_pool_kobj_release, ttm_mem_zone_kobj_release,
ttm_mem_global_kobj_release, rdev_free, md_free, efivar_release,
dmi_entry_free, dmi_sysfs_entry_release, edd_release,
iscsi_boot_kobj_release, lockspace_kobj_release, gfs2_sbd_release,
release_nbp

These have empty or non-existent release routine (thus having no
protection): dm-sysfs.c, qib_port_release

These use module refcount: edac_device_ctrl_master_release,
edac_device_ctrl_instance_release, edac_device_ctrl_block_release

> Thanks.
>
> --
> Dmitry

Mikulas

2014-01-06 18:55:29

by Mikulas Patocka

[permalink] [raw]
Subject: Re: [dm-devel] [PATCH] kobject: provide kobject_put_wait to fix module unload race



On Sun, 5 Jan 2014, Greg Kroah-Hartman wrote:

> On Sun, Jan 05, 2014 at 05:43:56PM +0100, Bart Van Assche wrote:
> > On 01/04/14 19:06, Mikulas Patocka wrote:
> > > - if (t && !t->release)
> > > - pr_debug("kobject: '%s' (%p): does not have a release() "
> > > - "function, it is broken and must be fixed.\n",
> > > - kobject_name(kobj), kobj);
> > > -
> >
> > Has it been considered to issue a warning if no release function has
> > been defined and free_completion == NULL instead of removing the above
> > debug message entirely ? I think even with this patch applied it is
> > still wrong to invoke kobject_put() on an object without defining a
> > release function.
>
> This patch isn't going to be applied, and I've reverted the original
> commit, so there shouldn't be any issues anymore with this code.

Why? This patch does the same thing as
eee031649707db3c9920d9498f8d03819b74fc23, but it's smaller. So why did you
accept eee031649707db3c9920d9498f8d03819b74fc23 and not this?

The code to wait for kobject destruction using completion already exists
in cpufreq_sysfs_release, cpuidle_sysfs_release,
cpuidle_state_sysfs_release, cpuidle_driver_sysfs_release,
ext4_sb_release, ext4_feat_release, f2fs_sb_release (these are the only
kobject users that are correct w.r.t. module unloading), so if you accept
this patch, you can simplify them to use kobject_put_wait.

Mikulas

2014-01-06 19:23:27

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [dm-devel] [PATCH] kobject: provide kobject_put_wait to fix module unload race

On Mon, Jan 06, 2014 at 01:55:11PM -0500, Mikulas Patocka wrote:
>
>
> On Sun, 5 Jan 2014, Greg Kroah-Hartman wrote:
>
> > On Sun, Jan 05, 2014 at 05:43:56PM +0100, Bart Van Assche wrote:
> > > On 01/04/14 19:06, Mikulas Patocka wrote:
> > > > - if (t && !t->release)
> > > > - pr_debug("kobject: '%s' (%p): does not have a release() "
> > > > - "function, it is broken and must be fixed.\n",
> > > > - kobject_name(kobj), kobj);
> > > > -
> > >
> > > Has it been considered to issue a warning if no release function has
> > > been defined and free_completion == NULL instead of removing the above
> > > debug message entirely ? I think even with this patch applied it is
> > > still wrong to invoke kobject_put() on an object without defining a
> > > release function.
> >
> > This patch isn't going to be applied, and I've reverted the original
> > commit, so there shouldn't be any issues anymore with this code.
>
> Why? This patch does the same thing as
> eee031649707db3c9920d9498f8d03819b74fc23, but it's smaller. So why did you
> accept eee031649707db3c9920d9498f8d03819b74fc23 and not this?

I have now reverted that commit, it will not be in 3.14, so consider it
rejected as well :)

thanks,

greg k-h

2014-01-06 21:31:27

by Mike Snitzer

[permalink] [raw]
Subject: Re: kobject: provide kobject_put_wait to fix module unload race

On Mon, Jan 06 2014 at 1:55pm -0500,
Mikulas Patocka <[email protected]> wrote:

>
>
> On Sun, 5 Jan 2014, Greg Kroah-Hartman wrote:
>
> > On Sun, Jan 05, 2014 at 05:43:56PM +0100, Bart Van Assche wrote:
> > > On 01/04/14 19:06, Mikulas Patocka wrote:
> > > > - if (t && !t->release)
> > > > - pr_debug("kobject: '%s' (%p): does not have a release() "
> > > > - "function, it is broken and must be fixed.\n",
> > > > - kobject_name(kobj), kobj);
> > > > -
> > >
> > > Has it been considered to issue a warning if no release function has
> > > been defined and free_completion == NULL instead of removing the above
> > > debug message entirely ? I think even with this patch applied it is
> > > still wrong to invoke kobject_put() on an object without defining a
> > > release function.
> >
> > This patch isn't going to be applied, and I've reverted the original
> > commit, so there shouldn't be any issues anymore with this code.
>
> Why? This patch does the same thing as
> eee031649707db3c9920d9498f8d03819b74fc23, but it's smaller. So why did you
> accept eee031649707db3c9920d9498f8d03819b74fc23 and not this?
>
> The code to wait for kobject destruction using completion already exists
> in cpufreq_sysfs_release, cpuidle_sysfs_release,
> cpuidle_state_sysfs_release, cpuidle_driver_sysfs_release,
> ext4_sb_release, ext4_feat_release, f2fs_sb_release (these are the only
> kobject users that are correct w.r.t. module unloading), so if you accept
> this patch, you can simplify them to use kobject_put_wait.

Hi Mikulas,

Please just submit a DM-only patch that follows the same racey pattern
of firing a completion from the kobj_type .release method in dm_mod.
I'll get it queued up for 3.14.

If/when we gets reports of a crash due to dm_mod unload racing with
kobject_put we can revisit this.

Thanks,
Mike

2014-01-07 04:01:41

by Mikulas Patocka

[permalink] [raw]
Subject: Re: kobject: provide kobject_put_wait to fix module unload race



On Mon, 6 Jan 2014, Mike Snitzer wrote:

> On Mon, Jan 06 2014 at 1:55pm -0500,
> Mikulas Patocka <[email protected]> wrote:
>
> >
> >
> > On Sun, 5 Jan 2014, Greg Kroah-Hartman wrote:
> >
> > > On Sun, Jan 05, 2014 at 05:43:56PM +0100, Bart Van Assche wrote:
> > > > On 01/04/14 19:06, Mikulas Patocka wrote:
> > > > > - if (t && !t->release)
> > > > > - pr_debug("kobject: '%s' (%p): does not have a release() "
> > > > > - "function, it is broken and must be fixed.\n",
> > > > > - kobject_name(kobj), kobj);
> > > > > -
> > > >
> > > > Has it been considered to issue a warning if no release function has
> > > > been defined and free_completion == NULL instead of removing the above
> > > > debug message entirely ? I think even with this patch applied it is
> > > > still wrong to invoke kobject_put() on an object without defining a
> > > > release function.
> > >
> > > This patch isn't going to be applied, and I've reverted the original
> > > commit, so there shouldn't be any issues anymore with this code.
> >
> > Why? This patch does the same thing as
> > eee031649707db3c9920d9498f8d03819b74fc23, but it's smaller. So why did you
> > accept eee031649707db3c9920d9498f8d03819b74fc23 and not this?
> >
> > The code to wait for kobject destruction using completion already exists
> > in cpufreq_sysfs_release, cpuidle_sysfs_release,
> > cpuidle_state_sysfs_release, cpuidle_driver_sysfs_release,
> > ext4_sb_release, ext4_feat_release, f2fs_sb_release (these are the only
> > kobject users that are correct w.r.t. module unloading), so if you accept
> > this patch, you can simplify them to use kobject_put_wait.
>
> Hi Mikulas,
>
> Please just submit a DM-only patch that follows the same racey pattern
> of firing a completion from the kobj_type .release method in dm_mod.
> I'll get it queued up for 3.14.
>
> If/when we gets reports of a crash due to dm_mod unload racing with
> kobject_put we can revisit this.
>
> Thanks,
> Mike

Here I'm sending dm-only patch.



dm: wait until kobject is destroyed

There may be other parts of the kernel taking reference to the dm kobject.
We must wait until they drop the references before deallocating the md
structure.

Signed-off-by: Mikulas Patocka <[email protected]>
Cc: [email protected]

---
drivers/md/dm-sysfs.c | 10 +++++++++-
drivers/md/dm.c | 11 +++++++++++
drivers/md/dm.h | 2 ++
3 files changed, 22 insertions(+), 1 deletion(-)

Index: linux-3.13-rc7/drivers/md/dm-sysfs.c
===================================================================
--- linux-3.13-rc7.orig/drivers/md/dm-sysfs.c 2014-01-07 02:06:08.000000000 +0100
+++ linux-3.13-rc7/drivers/md/dm-sysfs.c 2014-01-07 02:07:09.000000000 +0100
@@ -79,6 +79,11 @@ static const struct sysfs_ops dm_sysfs_o
.show = dm_attr_show,
};

+static void dm_kobject_release(struct kobject *kobj)
+{
+ complete(dm_get_completion_from_kobject(kobj));
+}
+
/*
* dm kobject is embedded in mapped_device structure
* no need to define release function here
@@ -86,6 +91,7 @@ static const struct sysfs_ops dm_sysfs_o
static struct kobj_type dm_ktype = {
.sysfs_ops = &dm_sysfs_ops,
.default_attrs = dm_attrs,
+ .release = dm_kobject_release,
};

/*
@@ -104,5 +110,7 @@ int dm_sysfs_init(struct mapped_device *
*/
void dm_sysfs_exit(struct mapped_device *md)
{
- kobject_put(dm_kobject(md));
+ struct kobject *kobj = dm_kobject(md);
+ kobject_put(kobj);
+ wait_for_completion(dm_get_completion_from_kobject(kobj));
}
Index: linux-3.13-rc7/drivers/md/dm.c
===================================================================
--- linux-3.13-rc7.orig/drivers/md/dm.c 2014-01-07 02:07:09.000000000 +0100
+++ linux-3.13-rc7/drivers/md/dm.c 2014-01-07 04:58:37.000000000 +0100
@@ -203,6 +203,9 @@ struct mapped_device {
/* sysfs handle */
struct kobject kobj;

+ /* wait until the kobject is released */
+ struct completion kobj_completion;
+
/* zero-length flush that will be cloned and submitted to targets */
struct bio flush_bio;

@@ -2049,6 +2052,7 @@ static struct mapped_device *alloc_dev(i
init_waitqueue_head(&md->wait);
INIT_WORK(&md->work, dm_wq_work);
init_waitqueue_head(&md->eventq);
+ init_completion(&md->kobj_completion);

md->disk->major = _major;
md->disk->first_minor = minor;
@@ -2931,6 +2935,13 @@ struct mapped_device *dm_get_from_kobjec
return md;
}

+struct completion *dm_get_completion_from_kobject(struct kobject *kobj)
+{
+ struct mapped_device *md = container_of(kobj, struct mapped_device, kobj);
+
+ return &md->kobj_completion;
+}
+
int dm_suspended_md(struct mapped_device *md)
{
return test_bit(DMF_SUSPENDED, &md->flags);
Index: linux-3.13-rc7/drivers/md/dm.h
===================================================================
--- linux-3.13-rc7.orig/drivers/md/dm.h 2014-01-07 02:06:08.000000000 +0100
+++ linux-3.13-rc7/drivers/md/dm.h 2014-01-07 02:07:09.000000000 +0100
@@ -15,6 +15,7 @@
#include <linux/list.h>
#include <linux/blkdev.h>
#include <linux/hdreg.h>
+#include <linux/completion.h>

#include "dm-stats.h"

@@ -152,6 +153,7 @@ int dm_sysfs_init(struct mapped_device *
void dm_sysfs_exit(struct mapped_device *md);
struct kobject *dm_kobject(struct mapped_device *md);
struct mapped_device *dm_get_from_kobject(struct kobject *kobj);
+struct completion *dm_get_completion_from_kobject(struct kobject *kobj);

/*
* Targets for linear and striped mappings

2014-01-07 05:26:01

by Linus Torvalds

[permalink] [raw]
Subject: Re: kobject: provide kobject_put_wait to fix module unload race

This looks completely broken to me. You do a "kobject_put()" and then
after you've dropped that last use, you wait for the completion of
something that may already have been free'd.

Wtf? Am I missing something?

Linus

On Tue, Jan 7, 2014 at 12:01 PM, Mikulas Patocka <[email protected]> wrote:
>
>
> On Mon, 6 Jan 2014, Mike Snitzer wrote:
>
>> On Mon, Jan 06 2014 at 1:55pm -0500,
>> Mikulas Patocka <[email protected]> wrote:
>>
>> >
>> >
>> > On Sun, 5 Jan 2014, Greg Kroah-Hartman wrote:
>> >
>> > > On Sun, Jan 05, 2014 at 05:43:56PM +0100, Bart Van Assche wrote:
>> > > > On 01/04/14 19:06, Mikulas Patocka wrote:
>> > > > > - if (t && !t->release)
>> > > > > - pr_debug("kobject: '%s' (%p): does not have a release() "
>> > > > > - "function, it is broken and must be fixed.\n",
>> > > > > - kobject_name(kobj), kobj);
>> > > > > -
>> > > >
>> > > > Has it been considered to issue a warning if no release function has
>> > > > been defined and free_completion == NULL instead of removing the above
>> > > > debug message entirely ? I think even with this patch applied it is
>> > > > still wrong to invoke kobject_put() on an object without defining a
>> > > > release function.
>> > >
>> > > This patch isn't going to be applied, and I've reverted the original
>> > > commit, so there shouldn't be any issues anymore with this code.
>> >
>> > Why? This patch does the same thing as
>> > eee031649707db3c9920d9498f8d03819b74fc23, but it's smaller. So why did you
>> > accept eee031649707db3c9920d9498f8d03819b74fc23 and not this?
>> >
>> > The code to wait for kobject destruction using completion already exists
>> > in cpufreq_sysfs_release, cpuidle_sysfs_release,
>> > cpuidle_state_sysfs_release, cpuidle_driver_sysfs_release,
>> > ext4_sb_release, ext4_feat_release, f2fs_sb_release (these are the only
>> > kobject users that are correct w.r.t. module unloading), so if you accept
>> > this patch, you can simplify them to use kobject_put_wait.
>>
>> Hi Mikulas,
>>
>> Please just submit a DM-only patch that follows the same racey pattern
>> of firing a completion from the kobj_type .release method in dm_mod.
>> I'll get it queued up for 3.14.
>>
>> If/when we gets reports of a crash due to dm_mod unload racing with
>> kobject_put we can revisit this.
>>
>> Thanks,
>> Mike
>
> Here I'm sending dm-only patch.
>
>
>
> dm: wait until kobject is destroyed
>
> There may be other parts of the kernel taking reference to the dm kobject.
> We must wait until they drop the references before deallocating the md
> structure.
>
> Signed-off-by: Mikulas Patocka <[email protected]>
> Cc: [email protected]
>
> ---
> drivers/md/dm-sysfs.c | 10 +++++++++-
> drivers/md/dm.c | 11 +++++++++++
> drivers/md/dm.h | 2 ++
> 3 files changed, 22 insertions(+), 1 deletion(-)
>
> Index: linux-3.13-rc7/drivers/md/dm-sysfs.c
> ===================================================================
> --- linux-3.13-rc7.orig/drivers/md/dm-sysfs.c 2014-01-07 02:06:08.000000000 +0100
> +++ linux-3.13-rc7/drivers/md/dm-sysfs.c 2014-01-07 02:07:09.000000000 +0100
> @@ -79,6 +79,11 @@ static const struct sysfs_ops dm_sysfs_o
> .show = dm_attr_show,
> };
>
> +static void dm_kobject_release(struct kobject *kobj)
> +{
> + complete(dm_get_completion_from_kobject(kobj));
> +}
> +
> /*
> * dm kobject is embedded in mapped_device structure
> * no need to define release function here
> @@ -86,6 +91,7 @@ static const struct sysfs_ops dm_sysfs_o
> static struct kobj_type dm_ktype = {
> .sysfs_ops = &dm_sysfs_ops,
> .default_attrs = dm_attrs,
> + .release = dm_kobject_release,
> };
>
> /*
> @@ -104,5 +110,7 @@ int dm_sysfs_init(struct mapped_device *
> */
> void dm_sysfs_exit(struct mapped_device *md)
> {
> - kobject_put(dm_kobject(md));
> + struct kobject *kobj = dm_kobject(md);
> + kobject_put(kobj);
> + wait_for_completion(dm_get_completion_from_kobject(kobj));
> }
> Index: linux-3.13-rc7/drivers/md/dm.c
> ===================================================================
> --- linux-3.13-rc7.orig/drivers/md/dm.c 2014-01-07 02:07:09.000000000 +0100
> +++ linux-3.13-rc7/drivers/md/dm.c 2014-01-07 04:58:37.000000000 +0100
> @@ -203,6 +203,9 @@ struct mapped_device {
> /* sysfs handle */
> struct kobject kobj;
>
> + /* wait until the kobject is released */
> + struct completion kobj_completion;
> +
> /* zero-length flush that will be cloned and submitted to targets */
> struct bio flush_bio;
>
> @@ -2049,6 +2052,7 @@ static struct mapped_device *alloc_dev(i
> init_waitqueue_head(&md->wait);
> INIT_WORK(&md->work, dm_wq_work);
> init_waitqueue_head(&md->eventq);
> + init_completion(&md->kobj_completion);
>
> md->disk->major = _major;
> md->disk->first_minor = minor;
> @@ -2931,6 +2935,13 @@ struct mapped_device *dm_get_from_kobjec
> return md;
> }
>
> +struct completion *dm_get_completion_from_kobject(struct kobject *kobj)
> +{
> + struct mapped_device *md = container_of(kobj, struct mapped_device, kobj);
> +
> + return &md->kobj_completion;
> +}
> +
> int dm_suspended_md(struct mapped_device *md)
> {
> return test_bit(DMF_SUSPENDED, &md->flags);
> Index: linux-3.13-rc7/drivers/md/dm.h
> ===================================================================
> --- linux-3.13-rc7.orig/drivers/md/dm.h 2014-01-07 02:06:08.000000000 +0100
> +++ linux-3.13-rc7/drivers/md/dm.h 2014-01-07 02:07:09.000000000 +0100
> @@ -15,6 +15,7 @@
> #include <linux/list.h>
> #include <linux/blkdev.h>
> #include <linux/hdreg.h>
> +#include <linux/completion.h>
>
> #include "dm-stats.h"
>
> @@ -152,6 +153,7 @@ int dm_sysfs_init(struct mapped_device *
> void dm_sysfs_exit(struct mapped_device *md);
> struct kobject *dm_kobject(struct mapped_device *md);
> struct mapped_device *dm_get_from_kobject(struct kobject *kobj);
> +struct completion *dm_get_completion_from_kobject(struct kobject *kobj);
>
> /*
> * Targets for linear and striped mappings

2014-01-07 14:16:18

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: kobject: provide kobject_put_wait to fix module unload race

On Mon, Jan 06, 2014 at 11:01:22PM -0500, Mikulas Patocka wrote:
>
>
> On Mon, 6 Jan 2014, Mike Snitzer wrote:
>
> > On Mon, Jan 06 2014 at 1:55pm -0500,
> > Mikulas Patocka <[email protected]> wrote:
> >
> > >
> > >
> > > On Sun, 5 Jan 2014, Greg Kroah-Hartman wrote:
> > >
> > > > On Sun, Jan 05, 2014 at 05:43:56PM +0100, Bart Van Assche wrote:
> > > > > On 01/04/14 19:06, Mikulas Patocka wrote:
> > > > > > - if (t && !t->release)
> > > > > > - pr_debug("kobject: '%s' (%p): does not have a release() "
> > > > > > - "function, it is broken and must be fixed.\n",
> > > > > > - kobject_name(kobj), kobj);
> > > > > > -
> > > > >
> > > > > Has it been considered to issue a warning if no release function has
> > > > > been defined and free_completion == NULL instead of removing the above
> > > > > debug message entirely ? I think even with this patch applied it is
> > > > > still wrong to invoke kobject_put() on an object without defining a
> > > > > release function.
> > > >
> > > > This patch isn't going to be applied, and I've reverted the original
> > > > commit, so there shouldn't be any issues anymore with this code.
> > >
> > > Why? This patch does the same thing as
> > > eee031649707db3c9920d9498f8d03819b74fc23, but it's smaller. So why did you
> > > accept eee031649707db3c9920d9498f8d03819b74fc23 and not this?
> > >
> > > The code to wait for kobject destruction using completion already exists
> > > in cpufreq_sysfs_release, cpuidle_sysfs_release,
> > > cpuidle_state_sysfs_release, cpuidle_driver_sysfs_release,
> > > ext4_sb_release, ext4_feat_release, f2fs_sb_release (these are the only
> > > kobject users that are correct w.r.t. module unloading), so if you accept
> > > this patch, you can simplify them to use kobject_put_wait.
> >
> > Hi Mikulas,
> >
> > Please just submit a DM-only patch that follows the same racey pattern
> > of firing a completion from the kobj_type .release method in dm_mod.
> > I'll get it queued up for 3.14.
> >
> > If/when we gets reports of a crash due to dm_mod unload racing with
> > kobject_put we can revisit this.
> >
> > Thanks,
> > Mike
>
> Here I'm sending dm-only patch.
>
>
>
> dm: wait until kobject is destroyed
>
> There may be other parts of the kernel taking reference to the dm kobject.
> We must wait until they drop the references before deallocating the md
> structure.
>
> Signed-off-by: Mikulas Patocka <[email protected]>
> Cc: [email protected]
>
> ---
> drivers/md/dm-sysfs.c | 10 +++++++++-
> drivers/md/dm.c | 11 +++++++++++
> drivers/md/dm.h | 2 ++
> 3 files changed, 22 insertions(+), 1 deletion(-)
>
> Index: linux-3.13-rc7/drivers/md/dm-sysfs.c
> ===================================================================
> --- linux-3.13-rc7.orig/drivers/md/dm-sysfs.c 2014-01-07 02:06:08.000000000 +0100
> +++ linux-3.13-rc7/drivers/md/dm-sysfs.c 2014-01-07 02:07:09.000000000 +0100
> @@ -79,6 +79,11 @@ static const struct sysfs_ops dm_sysfs_o
> .show = dm_attr_show,
> };
>
> +static void dm_kobject_release(struct kobject *kobj)
> +{
> + complete(dm_get_completion_from_kobject(kobj));
> +}

Please read the kobject documentation in the kernel tree for why this
isn't ok. The fact that you didn't have a release function at all means
this code has always been broken, why have you been ignoring the kernel
complaining about this for so long before?

You need to free the memory in the release function, not just sit around
and wait for potentially forever.

thanks,

greg k-h

2014-01-07 18:01:14

by Mikulas Patocka

[permalink] [raw]
Subject: Re: kobject: provide kobject_put_wait to fix module unload race



On Tue, 7 Jan 2014, Linus Torvalds wrote:

> This looks completely broken to me. You do a "kobject_put()" and then
> after you've dropped that last use, you wait for the completion of
> something that may already have been free'd.
>
> Wtf? Am I missing something?
>
> Linus

It is correct. The release method dm_kobject_release doesn't free the
kobject. It just signals the completion and returns.

This is the sequence of operations:
* call kobject_put
* wait until all users stop using the kobject, when it happens the release
method is called
* the release method signals the completion and returns
* the unload code that waits on the completion continues
* the unload code frees the mapped_device structure that contains the
kobject

Using kobject this way avoids the module unload race that was mentioned at
the beginning of this thread.

Mikulas

> On Tue, Jan 7, 2014 at 12:01 PM, Mikulas Patocka <[email protected]> wrote:
> >
> >
> > On Mon, 6 Jan 2014, Mike Snitzer wrote:
> >
> >> On Mon, Jan 06 2014 at 1:55pm -0500,
> >> Mikulas Patocka <[email protected]> wrote:
> >>
> >> >
> >> >
> >> > On Sun, 5 Jan 2014, Greg Kroah-Hartman wrote:
> >> >
> >> > > On Sun, Jan 05, 2014 at 05:43:56PM +0100, Bart Van Assche wrote:
> >> > > > On 01/04/14 19:06, Mikulas Patocka wrote:
> >> > > > > - if (t && !t->release)
> >> > > > > - pr_debug("kobject: '%s' (%p): does not have a release() "
> >> > > > > - "function, it is broken and must be fixed.\n",
> >> > > > > - kobject_name(kobj), kobj);
> >> > > > > -
> >> > > >
> >> > > > Has it been considered to issue a warning if no release function has
> >> > > > been defined and free_completion == NULL instead of removing the above
> >> > > > debug message entirely ? I think even with this patch applied it is
> >> > > > still wrong to invoke kobject_put() on an object without defining a
> >> > > > release function.
> >> > >
> >> > > This patch isn't going to be applied, and I've reverted the original
> >> > > commit, so there shouldn't be any issues anymore with this code.
> >> >
> >> > Why? This patch does the same thing as
> >> > eee031649707db3c9920d9498f8d03819b74fc23, but it's smaller. So why did you
> >> > accept eee031649707db3c9920d9498f8d03819b74fc23 and not this?
> >> >
> >> > The code to wait for kobject destruction using completion already exists
> >> > in cpufreq_sysfs_release, cpuidle_sysfs_release,
> >> > cpuidle_state_sysfs_release, cpuidle_driver_sysfs_release,
> >> > ext4_sb_release, ext4_feat_release, f2fs_sb_release (these are the only
> >> > kobject users that are correct w.r.t. module unloading), so if you accept
> >> > this patch, you can simplify them to use kobject_put_wait.
> >>
> >> Hi Mikulas,
> >>
> >> Please just submit a DM-only patch that follows the same racey pattern
> >> of firing a completion from the kobj_type .release method in dm_mod.
> >> I'll get it queued up for 3.14.
> >>
> >> If/when we gets reports of a crash due to dm_mod unload racing with
> >> kobject_put we can revisit this.
> >>
> >> Thanks,
> >> Mike
> >
> > Here I'm sending dm-only patch.
> >
> >
> >
> > dm: wait until kobject is destroyed
> >
> > There may be other parts of the kernel taking reference to the dm kobject.
> > We must wait until they drop the references before deallocating the md
> > structure.
> >
> > Signed-off-by: Mikulas Patocka <[email protected]>
> > Cc: [email protected]
> >
> > ---
> > drivers/md/dm-sysfs.c | 10 +++++++++-
> > drivers/md/dm.c | 11 +++++++++++
> > drivers/md/dm.h | 2 ++
> > 3 files changed, 22 insertions(+), 1 deletion(-)
> >
> > Index: linux-3.13-rc7/drivers/md/dm-sysfs.c
> > ===================================================================
> > --- linux-3.13-rc7.orig/drivers/md/dm-sysfs.c 2014-01-07 02:06:08.000000000 +0100
> > +++ linux-3.13-rc7/drivers/md/dm-sysfs.c 2014-01-07 02:07:09.000000000 +0100
> > @@ -79,6 +79,11 @@ static const struct sysfs_ops dm_sysfs_o
> > .show = dm_attr_show,
> > };
> >
> > +static void dm_kobject_release(struct kobject *kobj)
> > +{
> > + complete(dm_get_completion_from_kobject(kobj));
> > +}
> > +
> > /*
> > * dm kobject is embedded in mapped_device structure
> > * no need to define release function here
> > @@ -86,6 +91,7 @@ static const struct sysfs_ops dm_sysfs_o
> > static struct kobj_type dm_ktype = {
> > .sysfs_ops = &dm_sysfs_ops,
> > .default_attrs = dm_attrs,
> > + .release = dm_kobject_release,
> > };
> >
> > /*
> > @@ -104,5 +110,7 @@ int dm_sysfs_init(struct mapped_device *
> > */
> > void dm_sysfs_exit(struct mapped_device *md)
> > {
> > - kobject_put(dm_kobject(md));
> > + struct kobject *kobj = dm_kobject(md);
> > + kobject_put(kobj);
> > + wait_for_completion(dm_get_completion_from_kobject(kobj));
> > }
> > Index: linux-3.13-rc7/drivers/md/dm.c
> > ===================================================================
> > --- linux-3.13-rc7.orig/drivers/md/dm.c 2014-01-07 02:07:09.000000000 +0100
> > +++ linux-3.13-rc7/drivers/md/dm.c 2014-01-07 04:58:37.000000000 +0100
> > @@ -203,6 +203,9 @@ struct mapped_device {
> > /* sysfs handle */
> > struct kobject kobj;
> >
> > + /* wait until the kobject is released */
> > + struct completion kobj_completion;
> > +
> > /* zero-length flush that will be cloned and submitted to targets */
> > struct bio flush_bio;
> >
> > @@ -2049,6 +2052,7 @@ static struct mapped_device *alloc_dev(i
> > init_waitqueue_head(&md->wait);
> > INIT_WORK(&md->work, dm_wq_work);
> > init_waitqueue_head(&md->eventq);
> > + init_completion(&md->kobj_completion);
> >
> > md->disk->major = _major;
> > md->disk->first_minor = minor;
> > @@ -2931,6 +2935,13 @@ struct mapped_device *dm_get_from_kobjec
> > return md;
> > }
> >
> > +struct completion *dm_get_completion_from_kobject(struct kobject *kobj)
> > +{
> > + struct mapped_device *md = container_of(kobj, struct mapped_device, kobj);
> > +
> > + return &md->kobj_completion;
> > +}
> > +
> > int dm_suspended_md(struct mapped_device *md)
> > {
> > return test_bit(DMF_SUSPENDED, &md->flags);
> > Index: linux-3.13-rc7/drivers/md/dm.h
> > ===================================================================
> > --- linux-3.13-rc7.orig/drivers/md/dm.h 2014-01-07 02:06:08.000000000 +0100
> > +++ linux-3.13-rc7/drivers/md/dm.h 2014-01-07 02:07:09.000000000 +0100
> > @@ -15,6 +15,7 @@
> > #include <linux/list.h>
> > #include <linux/blkdev.h>
> > #include <linux/hdreg.h>
> > +#include <linux/completion.h>
> >
> > #include "dm-stats.h"
> >
> > @@ -152,6 +153,7 @@ int dm_sysfs_init(struct mapped_device *
> > void dm_sysfs_exit(struct mapped_device *md);
> > struct kobject *dm_kobject(struct mapped_device *md);
> > struct mapped_device *dm_get_from_kobject(struct kobject *kobj);
> > +struct completion *dm_get_completion_from_kobject(struct kobject *kobj);
> >
> > /*
> > * Targets for linear and striped mappings
>

2014-01-07 18:16:29

by Mikulas Patocka

[permalink] [raw]
Subject: Re: kobject: provide kobject_put_wait to fix module unload race



On Tue, 7 Jan 2014, Greg Kroah-Hartman wrote:

> > Index: linux-3.13-rc7/drivers/md/dm-sysfs.c
> > ===================================================================
> > --- linux-3.13-rc7.orig/drivers/md/dm-sysfs.c 2014-01-07 02:06:08.000000000 +0100
> > +++ linux-3.13-rc7/drivers/md/dm-sysfs.c 2014-01-07 02:07:09.000000000 +0100
> > @@ -79,6 +79,11 @@ static const struct sysfs_ops dm_sysfs_o
> > .show = dm_attr_show,
> > };
> >
> > +static void dm_kobject_release(struct kobject *kobj)
> > +{
> > + complete(dm_get_completion_from_kobject(kobj));
> > +}
>
> Please read the kobject documentation in the kernel tree for why this
> isn't ok. The fact that you didn't have a release function at all means
> this code has always been broken, why have you been ignoring the kernel
> complaining about this for so long before?

I read that file and I say that the usage pattern presented in that file
is broken w.r.t. module unload.

I don't want to fix one race condition and introduce another one (albeit
smaller).

> You need to free the memory in the release function, not just sit around
> and wait for potentially forever.

How could that code wait forever? Even if userspace keeps the appropriate
files in sysfs open, the dm device can be unloaded, it doesn't wait
forever.

> thanks,
>
> greg k-h

Mikulas

2014-01-07 18:26:37

by Dmitry Torokhov

[permalink] [raw]
Subject: Re: kobject: provide kobject_put_wait to fix module unload race

On Tue, Jan 07, 2014 at 06:16:22AM -0800, Greg Kroah-Hartman wrote:
> On Mon, Jan 06, 2014 at 11:01:22PM -0500, Mikulas Patocka wrote:
> >
> >
> > On Mon, 6 Jan 2014, Mike Snitzer wrote:
> >
> > > On Mon, Jan 06 2014 at 1:55pm -0500,
> > > Mikulas Patocka <[email protected]> wrote:
> > >
> > > >
> > > >
> > > > On Sun, 5 Jan 2014, Greg Kroah-Hartman wrote:
> > > >
> > > > > On Sun, Jan 05, 2014 at 05:43:56PM +0100, Bart Van Assche wrote:
> > > > > > On 01/04/14 19:06, Mikulas Patocka wrote:
> > > > > > > - if (t && !t->release)
> > > > > > > - pr_debug("kobject: '%s' (%p): does not have a release() "
> > > > > > > - "function, it is broken and must be fixed.\n",
> > > > > > > - kobject_name(kobj), kobj);
> > > > > > > -
> > > > > >
> > > > > > Has it been considered to issue a warning if no release function has
> > > > > > been defined and free_completion == NULL instead of removing the above
> > > > > > debug message entirely ? I think even with this patch applied it is
> > > > > > still wrong to invoke kobject_put() on an object without defining a
> > > > > > release function.
> > > > >
> > > > > This patch isn't going to be applied, and I've reverted the original
> > > > > commit, so there shouldn't be any issues anymore with this code.
> > > >
> > > > Why? This patch does the same thing as
> > > > eee031649707db3c9920d9498f8d03819b74fc23, but it's smaller. So why did you
> > > > accept eee031649707db3c9920d9498f8d03819b74fc23 and not this?
> > > >
> > > > The code to wait for kobject destruction using completion already exists
> > > > in cpufreq_sysfs_release, cpuidle_sysfs_release,
> > > > cpuidle_state_sysfs_release, cpuidle_driver_sysfs_release,
> > > > ext4_sb_release, ext4_feat_release, f2fs_sb_release (these are the only
> > > > kobject users that are correct w.r.t. module unloading), so if you accept
> > > > this patch, you can simplify them to use kobject_put_wait.
> > >
> > > Hi Mikulas,
> > >
> > > Please just submit a DM-only patch that follows the same racey pattern
> > > of firing a completion from the kobj_type .release method in dm_mod.
> > > I'll get it queued up for 3.14.
> > >
> > > If/when we gets reports of a crash due to dm_mod unload racing with
> > > kobject_put we can revisit this.
> > >
> > > Thanks,
> > > Mike
> >
> > Here I'm sending dm-only patch.
> >
> >
> >
> > dm: wait until kobject is destroyed
> >
> > There may be other parts of the kernel taking reference to the dm kobject.
> > We must wait until they drop the references before deallocating the md
> > structure.
> >
> > Signed-off-by: Mikulas Patocka <[email protected]>
> > Cc: [email protected]
> >
> > ---
> > drivers/md/dm-sysfs.c | 10 +++++++++-
> > drivers/md/dm.c | 11 +++++++++++
> > drivers/md/dm.h | 2 ++
> > 3 files changed, 22 insertions(+), 1 deletion(-)
> >
> > Index: linux-3.13-rc7/drivers/md/dm-sysfs.c
> > ===================================================================
> > --- linux-3.13-rc7.orig/drivers/md/dm-sysfs.c 2014-01-07 02:06:08.000000000 +0100
> > +++ linux-3.13-rc7/drivers/md/dm-sysfs.c 2014-01-07 02:07:09.000000000 +0100
> > @@ -79,6 +79,11 @@ static const struct sysfs_ops dm_sysfs_o
> > .show = dm_attr_show,
> > };
> >
> > +static void dm_kobject_release(struct kobject *kobj)
> > +{
> > + complete(dm_get_completion_from_kobject(kobj));
> > +}
>
> Please read the kobject documentation in the kernel tree for why this
> isn't ok.

If documentation says that this is not allowed then we need to fix
documentation.

> The fact that you didn't have a release function at all means
> this code has always been broken, why have you been ignoring the kernel
> complaining about this for so long before?
>
> You need to free the memory in the release function, not just sit around
> and wait for potentially forever.

Why? I understand that normally freeing is what's happening but not
necessarily. Release is simply called when last reference to the
[k]object is dropped, that's it.

Saying that every release function has to free memory is just a
cargo-cult programming to me. We already have (as far as I can see)
correct examples of release functions not freeing memory:
fs/char_dev.c::cdev_default_release().

Thanks.

--
Dmitry

2014-01-07 19:19:39

by Mike Snitzer

[permalink] [raw]
Subject: Re: kobject: provide kobject_put_wait to fix module unload race

On Tue, Jan 07 2014 at 1:00pm -0500,
Mikulas Patocka <[email protected]> wrote:

>
>
> On Tue, 7 Jan 2014, Linus Torvalds wrote:
>
> > This looks completely broken to me. You do a "kobject_put()" and then
> > after you've dropped that last use, you wait for the completion of
> > something that may already have been free'd.
> >
> > Wtf? Am I missing something?
> >
> > Linus
>
> It is correct. The release method dm_kobject_release doesn't free the
> kobject. It just signals the completion and returns.
>
> This is the sequence of operations:
> * call kobject_put
> * wait until all users stop using the kobject, when it happens the release
> method is called
> * the release method signals the completion and returns
> * the unload code that waits on the completion continues
> * the unload code frees the mapped_device structure that contains the
> kobject
>
> Using kobject this way avoids the module unload race that was mentioned at
> the beginning of this thread.

I've staged your patch in linux-next for 3.14, see:
http://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=for-next&id=af7b1e5c767fc895788c971c8f4686402ac8344f

2014-01-07 20:17:18

by Mikulas Patocka

[permalink] [raw]
Subject: Re: kobject: provide kobject_put_wait to fix module unload race



On Tue, 7 Jan 2014, Mike Snitzer wrote:

> On Tue, Jan 07 2014 at 1:00pm -0500,
> Mikulas Patocka <[email protected]> wrote:
>
> >
> >
> > On Tue, 7 Jan 2014, Linus Torvalds wrote:
> >
> > > This looks completely broken to me. You do a "kobject_put()" and then
> > > after you've dropped that last use, you wait for the completion of
> > > something that may already have been free'd.
> > >
> > > Wtf? Am I missing something?
> > >
> > > Linus
> >
> > It is correct. The release method dm_kobject_release doesn't free the
> > kobject. It just signals the completion and returns.
> >
> > This is the sequence of operations:
> > * call kobject_put
> > * wait until all users stop using the kobject, when it happens the release
> > method is called
> > * the release method signals the completion and returns
> > * the unload code that waits on the completion continues
> > * the unload code frees the mapped_device structure that contains the
> > kobject
> >
> > Using kobject this way avoids the module unload race that was mentioned at
> > the beginning of this thread.
>
> I've staged your patch in linux-next for 3.14, see:
> http://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=for-next&id=af7b1e5c767fc895788c971c8f4686402ac8344f

Looking at this patch, I realize that it is buggy too. If module unload
happens at this point (after the completion is signaled, but before the
release function returns), it crashes.

static void dm_kobject_release(struct kobject *kobj)
{
complete(dm_get_completion_from_kobject(kobj));
>========== module unload here ===============<
}

The patch that I sent initially in this thread doesn't have this bug
because the completion is signaled in non-module code.

That goes back to my initial claim - it is impossible to use the kobject
interface correctly! But if Greg doesn't want a patch that fixes the
kobject interface, I don't really know what to do about it.

Maybe another possibility - maintain a list of all kobjects and walk them
in the generic module unload code to check if their release method ponits
to the module that is being unloaded? Greg, would you accept a patch like
this?

Mikulas

2014-01-07 23:19:27

by Mike Snitzer

[permalink] [raw]
Subject: Re: kobject: provide kobject_put_wait to fix module unload race

On Tue, Jan 07 2014 at 3:16pm -0500,
Mikulas Patocka <[email protected]> wrote:

> On Tue, 7 Jan 2014, Mike Snitzer wrote:
>
> > I've staged your patch in linux-next for 3.14, see:
> > http://git.kernel.org/cgit/linux/kernel/git/device-mapper/linux-dm.git/commit/?h=for-next&id=af7b1e5c767fc895788c971c8f4686402ac8344f
>
> Looking at this patch, I realize that it is buggy too. If module unload
> happens at this point (after the completion is signaled, but before the
> release function returns), it crashes.
>
> static void dm_kobject_release(struct kobject *kobj)
> {
> complete(dm_get_completion_from_kobject(kobj));
> >========== module unload here ===============<
> }

That race isn't a surprise, thought you pointed it out earlier in this
thread? Anyway, I'm inclined to keep the racey patch staged in
linux-dm.git until the stalemate with gregkh is resolved. DM is no
worse than other code that follows this same wait for completion
pattern.

> The patch that I sent initially in this thread doesn't have this bug
> because the completion is signaled in non-module code.
>
> That goes back to my initial claim - it is impossible to use the kobject
> interface correctly! But if Greg doesn't want a patch that fixes the
> kobject interface, I don't really know what to do about it.

Right, I'm missing what is wrong with your proposed kobject_put_wait
interface.

Greg, can you please establish that you understand the problem, and
existing kobject patterns, before you dismiss a fix?