2020-05-13 15:24:11

by Heikki Krogerus

[permalink] [raw]
Subject: [PATCH] kobject: Make sure the parent does not get released before its children

In the function kobject_cleanup(), kobject_del(kobj) is
called before the kobj->release(). That makes it possible to
release the parent of the kobject before the kobject itself.

To fix that, adding function __kboject_del() that does
everything that kobject_del() does except release the parent
reference. kobject_cleanup() then calls __kobject_del()
instead of kobject_del(), and separately decrements the
reference count of the parent kobject after kobj->release()
has been called.

Reported-by: Naresh Kamboju <[email protected]>
Reported-by: kernel test robot <[email protected]>
Fixes: 7589238a8cf3 ("Revert "software node: Simplify software_node_release() function"")
Cc: Brendan Higgins <[email protected]>
Cc: Randy Dunlap <[email protected]>
Suggested-by: "Rafael J. Wysocki" <[email protected]>
Signed-off-by: Heikki Krogerus <[email protected]>
---
lib/kobject.c | 30 ++++++++++++++++++++----------
1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/lib/kobject.c b/lib/kobject.c
index 65fa7bf70c57..32432036bef8 100644
--- a/lib/kobject.c
+++ b/lib/kobject.c
@@ -599,14 +599,7 @@ int kobject_move(struct kobject *kobj, struct kobject *new_parent)
}
EXPORT_SYMBOL_GPL(kobject_move);

-/**
- * kobject_del() - Unlink kobject from hierarchy.
- * @kobj: object.
- *
- * This is the function that should be called to delete an object
- * successfully added via kobject_add().
- */
-void kobject_del(struct kobject *kobj)
+static void __kobject_del(struct kobject *kobj)
{
struct kernfs_node *sd;
const struct kobj_type *ktype;
@@ -625,9 +618,23 @@ void kobject_del(struct kobject *kobj)

kobj->state_in_sysfs = 0;
kobj_kset_leave(kobj);
- kobject_put(kobj->parent);
kobj->parent = NULL;
}
+
+/**
+ * kobject_del() - Unlink kobject from hierarchy.
+ * @kobj: object.
+ *
+ * This is the function that should be called to delete an object
+ * successfully added via kobject_add().
+ */
+void kobject_del(struct kobject *kobj)
+{
+ struct kobject *parent = kobj->parent;
+
+ __kobject_del(kobj);
+ kobject_put(parent);
+}
EXPORT_SYMBOL(kobject_del);

/**
@@ -663,6 +670,7 @@ EXPORT_SYMBOL(kobject_get_unless_zero);
*/
static void kobject_cleanup(struct kobject *kobj)
{
+ struct kobject *parent = kobj->parent;
struct kobj_type *t = get_ktype(kobj);
const char *name = kobj->name;

@@ -684,7 +692,7 @@ static void kobject_cleanup(struct kobject *kobj)
if (kobj->state_in_sysfs) {
pr_debug("kobject: '%s' (%p): auto cleanup kobject_del\n",
kobject_name(kobj), kobj);
- kobject_del(kobj);
+ __kobject_del(kobj);
}

if (t && t->release) {
@@ -698,6 +706,8 @@ static void kobject_cleanup(struct kobject *kobj)
pr_debug("kobject: '%s': free name\n", name);
kfree_const(name);
}
+
+ kobject_put(parent);
}

#ifdef CONFIG_DEBUG_KOBJECT_RELEASE
--
2.26.2


2020-05-13 15:24:41

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] kobject: Make sure the parent does not get released before its children

On Wed, May 13, 2020 at 5:18 PM Heikki Krogerus
<[email protected]> wrote:
>
> In the function kobject_cleanup(), kobject_del(kobj) is
> called before the kobj->release(). That makes it possible to
> release the parent of the kobject before the kobject itself.
>
> To fix that, adding function __kboject_del() that does
> everything that kobject_del() does except release the parent
> reference. kobject_cleanup() then calls __kobject_del()
> instead of kobject_del(), and separately decrements the
> reference count of the parent kobject after kobj->release()
> has been called.
>
> Reported-by: Naresh Kamboju <[email protected]>
> Reported-by: kernel test robot <[email protected]>
> Fixes: 7589238a8cf3 ("Revert "software node: Simplify software_node_release() function"")
> Cc: Brendan Higgins <[email protected]>
> Cc: Randy Dunlap <[email protected]>
> Suggested-by: "Rafael J. Wysocki" <[email protected]>
> Signed-off-by: Heikki Krogerus <[email protected]>

Reviewed-by: Rafael J. Wysocki <[email protected]>

> ---
> lib/kobject.c | 30 ++++++++++++++++++++----------
> 1 file changed, 20 insertions(+), 10 deletions(-)
>
> diff --git a/lib/kobject.c b/lib/kobject.c
> index 65fa7bf70c57..32432036bef8 100644
> --- a/lib/kobject.c
> +++ b/lib/kobject.c
> @@ -599,14 +599,7 @@ int kobject_move(struct kobject *kobj, struct kobject *new_parent)
> }
> EXPORT_SYMBOL_GPL(kobject_move);
>
> -/**
> - * kobject_del() - Unlink kobject from hierarchy.
> - * @kobj: object.
> - *
> - * This is the function that should be called to delete an object
> - * successfully added via kobject_add().
> - */
> -void kobject_del(struct kobject *kobj)
> +static void __kobject_del(struct kobject *kobj)
> {
> struct kernfs_node *sd;
> const struct kobj_type *ktype;
> @@ -625,9 +618,23 @@ void kobject_del(struct kobject *kobj)
>
> kobj->state_in_sysfs = 0;
> kobj_kset_leave(kobj);
> - kobject_put(kobj->parent);
> kobj->parent = NULL;
> }
> +
> +/**
> + * kobject_del() - Unlink kobject from hierarchy.
> + * @kobj: object.
> + *
> + * This is the function that should be called to delete an object
> + * successfully added via kobject_add().
> + */
> +void kobject_del(struct kobject *kobj)
> +{
> + struct kobject *parent = kobj->parent;
> +
> + __kobject_del(kobj);
> + kobject_put(parent);
> +}
> EXPORT_SYMBOL(kobject_del);
>
> /**
> @@ -663,6 +670,7 @@ EXPORT_SYMBOL(kobject_get_unless_zero);
> */
> static void kobject_cleanup(struct kobject *kobj)
> {
> + struct kobject *parent = kobj->parent;
> struct kobj_type *t = get_ktype(kobj);
> const char *name = kobj->name;
>
> @@ -684,7 +692,7 @@ static void kobject_cleanup(struct kobject *kobj)
> if (kobj->state_in_sysfs) {
> pr_debug("kobject: '%s' (%p): auto cleanup kobject_del\n",
> kobject_name(kobj), kobj);
> - kobject_del(kobj);
> + __kobject_del(kobj);
> }
>
> if (t && t->release) {
> @@ -698,6 +706,8 @@ static void kobject_cleanup(struct kobject *kobj)
> pr_debug("kobject: '%s': free name\n", name);
> kfree_const(name);
> }
> +
> + kobject_put(parent);
> }
>
> #ifdef CONFIG_DEBUG_KOBJECT_RELEASE
> --
> 2.26.2
>

2020-05-13 15:54:52

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] kobject: Make sure the parent does not get released before its children

On Wed, May 13, 2020 at 5:42 PM Greg Kroah-Hartman
<[email protected]> wrote:
>
> On Wed, May 13, 2020 at 06:18:40PM +0300, Heikki Krogerus wrote:
> > In the function kobject_cleanup(), kobject_del(kobj) is
> > called before the kobj->release(). That makes it possible to
> > release the parent of the kobject before the kobject itself.
> >
> > To fix that, adding function __kboject_del() that does
> > everything that kobject_del() does except release the parent
> > reference. kobject_cleanup() then calls __kobject_del()
> > instead of kobject_del(), and separately decrements the
> > reference count of the parent kobject after kobj->release()
> > has been called.
> >
> > Reported-by: Naresh Kamboju <[email protected]>
> > Reported-by: kernel test robot <[email protected]>
> > Fixes: 7589238a8cf3 ("Revert "software node: Simplify software_node_release() function"")
> > Cc: Brendan Higgins <[email protected]>
> > Cc: Randy Dunlap <[email protected]>
> > Suggested-by: "Rafael J. Wysocki" <[email protected]>
> > Signed-off-by: Heikki Krogerus <[email protected]>
> > ---
> > lib/kobject.c | 30 ++++++++++++++++++++----------
> > 1 file changed, 20 insertions(+), 10 deletions(-)
> >
> > diff --git a/lib/kobject.c b/lib/kobject.c
> > index 65fa7bf70c57..32432036bef8 100644
> > --- a/lib/kobject.c
> > +++ b/lib/kobject.c
> > @@ -599,14 +599,7 @@ int kobject_move(struct kobject *kobj, struct kobject *new_parent)
> > }
> > EXPORT_SYMBOL_GPL(kobject_move);
> >
> > -/**
> > - * kobject_del() - Unlink kobject from hierarchy.
> > - * @kobj: object.
> > - *
> > - * This is the function that should be called to delete an object
> > - * successfully added via kobject_add().
> > - */
> > -void kobject_del(struct kobject *kobj)
> > +static void __kobject_del(struct kobject *kobj)
> > {
> > struct kernfs_node *sd;
> > const struct kobj_type *ktype;
> > @@ -625,9 +618,23 @@ void kobject_del(struct kobject *kobj)
> >
> > kobj->state_in_sysfs = 0;
> > kobj_kset_leave(kobj);
> > - kobject_put(kobj->parent);
> > kobj->parent = NULL;
> > }
> > +
> > +/**
> > + * kobject_del() - Unlink kobject from hierarchy.
> > + * @kobj: object.
> > + *
> > + * This is the function that should be called to delete an object
> > + * successfully added via kobject_add().
> > + */
> > +void kobject_del(struct kobject *kobj)
> > +{
> > + struct kobject *parent = kobj->parent;
> > +
> > + __kobject_del(kobj);
> > + kobject_put(parent);
> > +}
> > EXPORT_SYMBOL(kobject_del);
> >
> > /**
> > @@ -663,6 +670,7 @@ EXPORT_SYMBOL(kobject_get_unless_zero);
> > */
> > static void kobject_cleanup(struct kobject *kobj)
> > {
> > + struct kobject *parent = kobj->parent;
> > struct kobj_type *t = get_ktype(kobj);
> > const char *name = kobj->name;
> >
> > @@ -684,7 +692,7 @@ static void kobject_cleanup(struct kobject *kobj)
> > if (kobj->state_in_sysfs) {
> > pr_debug("kobject: '%s' (%p): auto cleanup kobject_del\n",
> > kobject_name(kobj), kobj);
> > - kobject_del(kobj);
> > + __kobject_del(kobj);
> > }
> >
> > if (t && t->release) {
> > @@ -698,6 +706,8 @@ static void kobject_cleanup(struct kobject *kobj)
> > pr_debug("kobject: '%s': free name\n", name);
> > kfree_const(name);
> > }
> > +
> > + kobject_put(parent);
> > }
> >
> > #ifdef CONFIG_DEBUG_KOBJECT_RELEASE
> > --
> > 2.26.2
> >
>
> Is this the older patch we talked about before, or something else?
>
> I can't remember how we left that thread...

This is an alternative that you said might work. :-)

2020-05-13 20:23:07

by Brendan Higgins

[permalink] [raw]
Subject: Re: [PATCH] kobject: Make sure the parent does not get released before its children

On Wed, May 13, 2020 at 8:18 AM Heikki Krogerus
<[email protected]> wrote:
>
> In the function kobject_cleanup(), kobject_del(kobj) is
> called before the kobj->release(). That makes it possible to
> release the parent of the kobject before the kobject itself.
>
> To fix that, adding function __kboject_del() that does
> everything that kobject_del() does except release the parent
> reference. kobject_cleanup() then calls __kobject_del()
> instead of kobject_del(), and separately decrements the
> reference count of the parent kobject after kobj->release()
> has been called.

I was starting to wonder if anything else needed to happen with this. :-)

Thanks for taking care of this!

> Reported-by: Naresh Kamboju <[email protected]>
> Reported-by: kernel test robot <[email protected]>
> Fixes: 7589238a8cf3 ("Revert "software node: Simplify software_node_release() function"")
> Cc: Brendan Higgins <[email protected]>
> Cc: Randy Dunlap <[email protected]>
> Suggested-by: "Rafael J. Wysocki" <[email protected]>
> Signed-off-by: Heikki Krogerus <[email protected]>

Didn't I and someone else test this?

Either way, I will test this out in a little bit.

Thanks!

2020-05-13 20:48:59

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] kobject: Make sure the parent does not get released before its children

On Wed, May 13, 2020 at 06:18:40PM +0300, Heikki Krogerus wrote:
> In the function kobject_cleanup(), kobject_del(kobj) is
> called before the kobj->release(). That makes it possible to
> release the parent of the kobject before the kobject itself.
>
> To fix that, adding function __kboject_del() that does
> everything that kobject_del() does except release the parent
> reference. kobject_cleanup() then calls __kobject_del()
> instead of kobject_del(), and separately decrements the
> reference count of the parent kobject after kobj->release()
> has been called.
>
> Reported-by: Naresh Kamboju <[email protected]>
> Reported-by: kernel test robot <[email protected]>
> Fixes: 7589238a8cf3 ("Revert "software node: Simplify software_node_release() function"")
> Cc: Brendan Higgins <[email protected]>
> Cc: Randy Dunlap <[email protected]>
> Suggested-by: "Rafael J. Wysocki" <[email protected]>
> Signed-off-by: Heikki Krogerus <[email protected]>
> ---
> lib/kobject.c | 30 ++++++++++++++++++++----------
> 1 file changed, 20 insertions(+), 10 deletions(-)
>
> diff --git a/lib/kobject.c b/lib/kobject.c
> index 65fa7bf70c57..32432036bef8 100644
> --- a/lib/kobject.c
> +++ b/lib/kobject.c
> @@ -599,14 +599,7 @@ int kobject_move(struct kobject *kobj, struct kobject *new_parent)
> }
> EXPORT_SYMBOL_GPL(kobject_move);
>
> -/**
> - * kobject_del() - Unlink kobject from hierarchy.
> - * @kobj: object.
> - *
> - * This is the function that should be called to delete an object
> - * successfully added via kobject_add().
> - */
> -void kobject_del(struct kobject *kobj)
> +static void __kobject_del(struct kobject *kobj)
> {
> struct kernfs_node *sd;
> const struct kobj_type *ktype;
> @@ -625,9 +618,23 @@ void kobject_del(struct kobject *kobj)
>
> kobj->state_in_sysfs = 0;
> kobj_kset_leave(kobj);
> - kobject_put(kobj->parent);
> kobj->parent = NULL;
> }
> +
> +/**
> + * kobject_del() - Unlink kobject from hierarchy.
> + * @kobj: object.
> + *
> + * This is the function that should be called to delete an object
> + * successfully added via kobject_add().
> + */
> +void kobject_del(struct kobject *kobj)
> +{
> + struct kobject *parent = kobj->parent;
> +
> + __kobject_del(kobj);
> + kobject_put(parent);
> +}
> EXPORT_SYMBOL(kobject_del);
>
> /**
> @@ -663,6 +670,7 @@ EXPORT_SYMBOL(kobject_get_unless_zero);
> */
> static void kobject_cleanup(struct kobject *kobj)
> {
> + struct kobject *parent = kobj->parent;
> struct kobj_type *t = get_ktype(kobj);
> const char *name = kobj->name;
>
> @@ -684,7 +692,7 @@ static void kobject_cleanup(struct kobject *kobj)
> if (kobj->state_in_sysfs) {
> pr_debug("kobject: '%s' (%p): auto cleanup kobject_del\n",
> kobject_name(kobj), kobj);
> - kobject_del(kobj);
> + __kobject_del(kobj);
> }
>
> if (t && t->release) {
> @@ -698,6 +706,8 @@ static void kobject_cleanup(struct kobject *kobj)
> pr_debug("kobject: '%s': free name\n", name);
> kfree_const(name);
> }
> +
> + kobject_put(parent);
> }
>
> #ifdef CONFIG_DEBUG_KOBJECT_RELEASE
> --
> 2.26.2
>

Is this the older patch we talked about before, or something else?

I can't remember how we left that thread...

thanks,

greg k-h

2020-05-13 20:56:40

by Randy Dunlap

[permalink] [raw]
Subject: Re: [PATCH] kobject: Make sure the parent does not get released before its children

On 5/13/20 1:18 PM, Brendan Higgins wrote:
> On Wed, May 13, 2020 at 8:18 AM Heikki Krogerus
> <[email protected]> wrote:
>>
>> In the function kobject_cleanup(), kobject_del(kobj) is
>> called before the kobj->release(). That makes it possible to
>> release the parent of the kobject before the kobject itself.
>>
>> To fix that, adding function __kboject_del() that does
>> everything that kobject_del() does except release the parent
>> reference. kobject_cleanup() then calls __kobject_del()
>> instead of kobject_del(), and separately decrements the
>> reference count of the parent kobject after kobj->release()
>> has been called.
>
> I was starting to wonder if anything else needed to happen with this. :-)
>
> Thanks for taking care of this!
>
>> Reported-by: Naresh Kamboju <[email protected]>
>> Reported-by: kernel test robot <[email protected]>
>> Fixes: 7589238a8cf3 ("Revert "software node: Simplify software_node_release() function"")
>> Cc: Brendan Higgins <[email protected]>
>> Cc: Randy Dunlap <[email protected]>
>> Suggested-by: "Rafael J. Wysocki" <[email protected]>
>> Signed-off-by: Heikki Krogerus <[email protected]>
>
> Didn't I and someone else test this?
>
> Either way, I will test this out in a little bit.
>
> Thanks!
>

Yes, I tested the earlier patch and acked it.
(using lib/test_printf.ko)

--
~Randy

2020-05-13 21:34:58

by Brendan Higgins

[permalink] [raw]
Subject: Re: [PATCH] kobject: Make sure the parent does not get released before its children

On Wed, May 13, 2020 at 8:18 AM Heikki Krogerus
<[email protected]> wrote:
>
> In the function kobject_cleanup(), kobject_del(kobj) is
> called before the kobj->release(). That makes it possible to
> release the parent of the kobject before the kobject itself.
>
> To fix that, adding function __kboject_del() that does
> everything that kobject_del() does except release the parent
> reference. kobject_cleanup() then calls __kobject_del()
> instead of kobject_del(), and separately decrements the
> reference count of the parent kobject after kobj->release()
> has been called.
>
> Reported-by: Naresh Kamboju <[email protected]>
> Reported-by: kernel test robot <[email protected]>
> Fixes: 7589238a8cf3 ("Revert "software node: Simplify software_node_release() function"")
> Cc: Brendan Higgins <[email protected]>
> Cc: Randy Dunlap <[email protected]>
> Suggested-by: "Rafael J. Wysocki" <[email protected]>
> Signed-off-by: Heikki Krogerus <[email protected]>

Reviewed-by: Brendan Higgins <[email protected]>
Tested-by: Brendan Higgins <[email protected]>

2020-05-13 23:16:57

by Randy Dunlap

[permalink] [raw]
Subject: Re: [PATCH] kobject: Make sure the parent does not get released before its children

On 5/13/20 2:30 PM, Brendan Higgins wrote:
> On Wed, May 13, 2020 at 8:18 AM Heikki Krogerus
> <[email protected]> wrote:
>>
>> In the function kobject_cleanup(), kobject_del(kobj) is
>> called before the kobj->release(). That makes it possible to
>> release the parent of the kobject before the kobject itself.
>>
>> To fix that, adding function __kboject_del() that does
>> everything that kobject_del() does except release the parent
>> reference. kobject_cleanup() then calls __kobject_del()
>> instead of kobject_del(), and separately decrements the
>> reference count of the parent kobject after kobj->release()
>> has been called.
>>
>> Reported-by: Naresh Kamboju <[email protected]>
>> Reported-by: kernel test robot <[email protected]>
>> Fixes: 7589238a8cf3 ("Revert "software node: Simplify software_node_release() function"")
>> Cc: Brendan Higgins <[email protected]>
>> Cc: Randy Dunlap <[email protected]>
>> Suggested-by: "Rafael J. Wysocki" <[email protected]>
>> Signed-off-by: Heikki Krogerus <[email protected]>
>
> Reviewed-by: Brendan Higgins <[email protected]>
> Tested-by: Brendan Higgins <[email protected]>
>

Acked-by: Randy Dunlap <[email protected]>
Tested-by: Randy Dunlap <[email protected]>

Thanks.
--
~Randy

2020-05-14 06:59:00

by Heikki Krogerus

[permalink] [raw]
Subject: Re: [PATCH] kobject: Make sure the parent does not get released before its children

On Wed, May 13, 2020 at 04:14:51PM -0700, Randy Dunlap wrote:
> On 5/13/20 2:30 PM, Brendan Higgins wrote:
> > On Wed, May 13, 2020 at 8:18 AM Heikki Krogerus
> > <[email protected]> wrote:
> >>
> >> In the function kobject_cleanup(), kobject_del(kobj) is
> >> called before the kobj->release(). That makes it possible to
> >> release the parent of the kobject before the kobject itself.
> >>
> >> To fix that, adding function __kboject_del() that does
> >> everything that kobject_del() does except release the parent
> >> reference. kobject_cleanup() then calls __kobject_del()
> >> instead of kobject_del(), and separately decrements the
> >> reference count of the parent kobject after kobj->release()
> >> has been called.
> >>
> >> Reported-by: Naresh Kamboju <[email protected]>
> >> Reported-by: kernel test robot <[email protected]>
> >> Fixes: 7589238a8cf3 ("Revert "software node: Simplify software_node_release() function"")
> >> Cc: Brendan Higgins <[email protected]>
> >> Cc: Randy Dunlap <[email protected]>
> >> Suggested-by: "Rafael J. Wysocki" <[email protected]>
> >> Signed-off-by: Heikki Krogerus <[email protected]>
> >
> > Reviewed-by: Brendan Higgins <[email protected]>
> > Tested-by: Brendan Higgins <[email protected]>
> >
>
> Acked-by: Randy Dunlap <[email protected]>
> Tested-by: Randy Dunlap <[email protected]>

Thanks guys. Sorry about the mix-up.

Br,

--
heikki

2020-05-15 15:12:59

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] kobject: Make sure the parent does not get released before its children

On Thu, May 14, 2020 at 09:54:15AM +0300, Heikki Krogerus wrote:
> On Wed, May 13, 2020 at 04:14:51PM -0700, Randy Dunlap wrote:
> > On 5/13/20 2:30 PM, Brendan Higgins wrote:
> > > On Wed, May 13, 2020 at 8:18 AM Heikki Krogerus
> > > <[email protected]> wrote:
> > >>
> > >> In the function kobject_cleanup(), kobject_del(kobj) is
> > >> called before the kobj->release(). That makes it possible to
> > >> release the parent of the kobject before the kobject itself.
> > >>
> > >> To fix that, adding function __kboject_del() that does
> > >> everything that kobject_del() does except release the parent
> > >> reference. kobject_cleanup() then calls __kobject_del()
> > >> instead of kobject_del(), and separately decrements the
> > >> reference count of the parent kobject after kobj->release()
> > >> has been called.
> > >>
> > >> Reported-by: Naresh Kamboju <[email protected]>
> > >> Reported-by: kernel test robot <[email protected]>
> > >> Fixes: 7589238a8cf3 ("Revert "software node: Simplify software_node_release() function"")
> > >> Cc: Brendan Higgins <[email protected]>
> > >> Cc: Randy Dunlap <[email protected]>
> > >> Suggested-by: "Rafael J. Wysocki" <[email protected]>
> > >> Signed-off-by: Heikki Krogerus <[email protected]>
> > >
> > > Reviewed-by: Brendan Higgins <[email protected]>
> > > Tested-by: Brendan Higgins <[email protected]>
> > >
> >
> > Acked-by: Randy Dunlap <[email protected]>
> > Tested-by: Randy Dunlap <[email protected]>
>
> Thanks guys. Sorry about the mix-up.

I didn't get the chance to review this today, will do so on Monday,
thanks.

greg k-h

2020-05-23 13:23:48

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH] kobject: Make sure the parent does not get released before its children

On Wed, May 13, 2020 at 06:18:40PM +0300, Heikki Krogerus wrote:
> In the function kobject_cleanup(), kobject_del(kobj) is
> called before the kobj->release(). That makes it possible to
> release the parent of the kobject before the kobject itself.
>
> To fix that, adding function __kboject_del() that does

s/kboject/kobject/

> everything that kobject_del() does except release the parent
> reference. kobject_cleanup() then calls __kobject_del()
> instead of kobject_del(), and separately decrements the
> reference count of the parent kobject after kobj->release()
> has been called.
>
> Reported-by: Naresh Kamboju <[email protected]>
> Reported-by: kernel test robot <[email protected]>
> Fixes: 7589238a8cf3 ("Revert "software node: Simplify software_node_release() function"")
> Cc: Brendan Higgins <[email protected]>
> Cc: Randy Dunlap <[email protected]>
> Suggested-by: "Rafael J. Wysocki" <[email protected]>
> Signed-off-by: Heikki Krogerus <[email protected]>
> Reviewed-by: Rafael J. Wysocki <[email protected]>
> Reviewed-by: Brendan Higgins <[email protected]>
> Tested-by: Brendan Higgins <[email protected]>
> Acked-by: Randy Dunlap <[email protected]>
> Tested-by: Randy Dunlap <[email protected]>

All my arm64be (arm64 big endian) boot tests crash with this patch
applied. Reverting it fixes the problem. Crash log and bisect results
(from pending-fixes branch) below.

Guenter

---
[ 14.944149] ------------[ cut here ]------------
[ 14.946182] Unable to handle kernel paging request at virtual address 006b6b6b6b6b6b6b
[ 14.946194] Mem abort info:
[ 14.946197] ESR = 0x96000004
[ 14.946200] EC = 0x25: DABT (current EL), IL = 32 bits
[ 14.946203] SET = 0, FnV = 0
[ 14.946205] EA = 0, S1PTW = 0
[ 14.946208] Data abort info:
[ 14.946210] ISV = 0, ISS = 0x00000004
[ 14.946213] CM = 0, WnR = 0
[ 14.946216] [006b6b6b6b6b6b6b] address between user and kernel address ranges
[ 14.946218] Internal error: Oops: 96000004 [#1] PREEMPT SMP
[ 14.946229] Modules linked in:
[ 14.946239] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0-rc6-00275-gdbacbfd47d67 #1
[ 14.946242] Hardware name: linux,dummy-virt (DT)
[ 14.946245] pstate: 00000085 (nzcv daIf -PAN -UAO)
[ 14.946247] pc : string_nocheck+0x14/0x90
[ 14.946250] lr : string+0x5c/0x70
[ 14.946252] sp : ffff80001000b580
[ 14.946255] x29: ffff80001000b580 x28: ffff800012d04fa4
[ 14.946262] x27: ffff800011702686 x26: 0000000000000020
[ 14.946269] x25: 00000000000003e0 x24: ffff800011702686
[ 14.946275] x23: ffff800012d04f98 x22: 00000000ffffffd0
[ 14.946282] x21: 04ffffff000affff x20: 6b6b6b6b6b6b6b6b
[ 14.946289] x19: ffff800012d05378 x18: 0000000000000010
[ 14.946296] x17: 0000000000001800 x16: 0000000000001000
[ 14.946302] x15: 00000000ffffffff x14: ffff800011fa0a48
[ 14.946309] x13: ffff80009000b5d7 x12: ffff800011fc45e0
[ 14.946316] x11: ffff80001000b930 x10: ffff80001000b930
[ 14.946322] x9 : ffff80001000b930 x8 : ffff80001000b930
[ 14.946329] x7 : ffff80001000b930 x6 : 00000000ffffffff
[ 14.946336] x5 : 0000000000000000 x4 : 0000000000000000
[ 14.946343] x3 : 04ffffff000affff x2 : 6b6b6b6b6b6b6b6b
[ 14.946349] x1 : ffff800012d05378 x0 : ffff800012d04fa4
[ 14.946356] Call trace:
[ 14.946359] string_nocheck+0x14/0x90
[ 14.946361] string+0x5c/0x70
[ 14.946364] vsnprintf+0x340/0x700
[ 14.946366] vscnprintf+0x30/0x58
[ 14.946369] vprintk_store+0x64/0x230
[ 14.946371] vprintk_emit+0xd4/0x358
[ 14.946374] vprintk_default+0x3c/0x48
[ 14.946376] vprintk_func+0xec/0x230
[ 14.946378] vprintk+0x28/0x38
[ 14.946381] __warn_printk+0x7c/0xa0
[ 14.946383] kobject_put+0xe4/0x138
[ 14.946386] put_device+0x14/0x28
[ 14.946388] put_i2c_dev+0x9c/0xb0
[ 14.946391] i2cdev_detach_adapter.part.5+0x20/0x30
[ 14.946394] i2cdev_notifier_call+0x80/0x88
[ 14.946396] notifier_call_chain+0x54/0x98
[ 14.946399] blocking_notifier_call_chain+0x5c/0x80
[ 14.946401] device_del+0x84/0x3b0
[ 14.946404] device_unregister+0x18/0x38
[ 14.946406] i2c_del_adapter+0x1e0/0x238
[ 14.946409] i2c_mux_del_adapters+0xa0/0xf0
[ 14.946412] unittest_i2c_mux_remove+0x14/0x28
[ 14.946414] i2c_device_remove+0x54/0xe0
[ 14.946417] device_release_driver_internal+0xf8/0x1d0
[ 14.946419] driver_detach+0x50/0xe0
[ 14.946422] bus_remove_driver+0x58/0xb0
[ 14.946424] driver_unregister+0x30/0x60
[ 14.946427] i2c_del_driver+0x28/0x38
[ 14.946429] of_unittest+0x269c/0x2c00
[ 14.946432] do_one_initcall+0x88/0x418
[ 14.946434] kernel_init_freeable+0x29c/0x314
[ 14.946437] kernel_init+0x14/0x108
[ 14.946439] ret_from_fork+0x10/0x1c
[ 14.946442] Code: a9bf7bfd 13003c66 910003fd 34000306 (39400045)
[ 14.950079] ---[ end trace 793dbb9150f3d51a ]---

---
# bad: [dbacbfd47d67736d5ebd724391a8d0d82f36d30f] Merge remote-tracking branch 'drm-misc-fixes/for-linux-next-fixes'
# good: [fb33c6510d5595144d585aa194d377cf74d31911] Linux 5.6-rc6
git bisect start 'HEAD' 'v5.6-rc6'
# good: [f365ab31efacb70bed1e821f7435626e0b2528a6] Merge tag 'drm-next-2020-04-01' of git://anongit.freedesktop.org/drm/drm
git bisect good f365ab31efacb70bed1e821f7435626e0b2528a6
# good: [9c94b39560c3a013de5886ea21ef1eaf21840cb9] Merge tag 'ext4_for_linus' of git://git.kernel.org/pub/scm/linux/kernel/git/tytso/ext4
git bisect good 9c94b39560c3a013de5886ea21ef1eaf21840cb9
# good: [c0259664c6879e1045d6d6703f37501690f6760f] netlabel: Kconfig: Update reference for NetLabel Tools project
git bisect good c0259664c6879e1045d6d6703f37501690f6760f
# good: [d5fef88ccbd3a2d3674e6cc868804a519ef9e5b6] Merge tag 'renesas-fixes-for-v5.7-tag2' of git://git.kernel.org/pub/scm/linux/kernel/git/geert/renesas-devel into arm/fixes
git bisect good d5fef88ccbd3a2d3674e6cc868804a519ef9e5b6
# good: [ce24729667cf8aaebf290613a6026a50a22c3eee] Merge tag 'linux-kselftest-5.7-rc6' of git://git.kernel.org/pub/scm/linux/kernel/git/shuah/linux-kselftest
git bisect good ce24729667cf8aaebf290613a6026a50a22c3eee
# good: [3c9e66568ad40dc17518fa00e2b28c3b450040d4] Merge tag 'arc-5.7-rc7' of git://git.kernel.org/pub/scm/linux/kernel/git/vgupta/arc
git bisect good 3c9e66568ad40dc17518fa00e2b28c3b450040d4
# good: [4aada7791fa2bd3e927502bc67ba604b95bb21bf] Merge remote-tracking branch 'sound-current/for-linus'
git bisect good 4aada7791fa2bd3e927502bc67ba604b95bb21bf
# bad: [2611695ba42692c4d887d379309b9b33edc25055] Merge remote-tracking branch 'usb-chipidea-fixes/ci-for-usb-stable'
git bisect bad 2611695ba42692c4d887d379309b9b33edc25055
# good: [80aa3db7fe3cd90b18a227c9f25798c1743ffe1e] Merge remote-tracking branch 'sound-asoc-fixes/for-linus'
git bisect good 80aa3db7fe3cd90b18a227c9f25798c1743ffe1e
# good: [83c813e237b8edf0fe8184ccdf3dc9622202084c] Merge remote-tracking branch 'spi/for-5.7' into spi-linus
git bisect good 83c813e237b8edf0fe8184ccdf3dc9622202084c
# good: [b054578b1ea052664a1b19fde1c163d29b0856b5] Merge remote-tracking branch 'spi-fixes/for-linus'
git bisect good b054578b1ea052664a1b19fde1c163d29b0856b5
# bad: [e0dd35dbc905307e249cee2301304b6a194848ef] Merge remote-tracking branch 'driver-core.current/driver-core-linus'
git bisect bad e0dd35dbc905307e249cee2301304b6a194848ef
# good: [44e960490ddf868fc9135151c4a658936e771dc2] driver core: Fix handling of SYNC_STATE_ONLY + STATELESS device links
git bisect good 44e960490ddf868fc9135151c4a658936e771dc2
# bad: [4ef12f7198023c09ad6d25b652bd8748c965c7fa] kobject: Make sure the parent does not get released before its children
git bisect bad 4ef12f7198023c09ad6d25b652bd8748c965c7fa
# first bad commit: [4ef12f7198023c09ad6d25b652bd8748c965c7fa] kobject: Make sure the parent does not get released before its children

2020-05-23 13:32:03

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH] kobject: Make sure the parent does not get released before its children

On Sat, May 23, 2020 at 06:21:01AM -0700, Guenter Roeck wrote:
> On Wed, May 13, 2020 at 06:18:40PM +0300, Heikki Krogerus wrote:
> > In the function kobject_cleanup(), kobject_del(kobj) is
> > called before the kobj->release(). That makes it possible to
> > release the parent of the kobject before the kobject itself.
> >
> > To fix that, adding function __kboject_del() that does
>
> s/kboject/kobject/
>
> > everything that kobject_del() does except release the parent
> > reference. kobject_cleanup() then calls __kobject_del()
> > instead of kobject_del(), and separately decrements the
> > reference count of the parent kobject after kobj->release()
> > has been called.
> >
> > Reported-by: Naresh Kamboju <[email protected]>
> > Reported-by: kernel test robot <[email protected]>
> > Fixes: 7589238a8cf3 ("Revert "software node: Simplify software_node_release() function"")
> > Cc: Brendan Higgins <[email protected]>
> > Cc: Randy Dunlap <[email protected]>
> > Suggested-by: "Rafael J. Wysocki" <[email protected]>
> > Signed-off-by: Heikki Krogerus <[email protected]>
> > Reviewed-by: Rafael J. Wysocki <[email protected]>
> > Reviewed-by: Brendan Higgins <[email protected]>
> > Tested-by: Brendan Higgins <[email protected]>
> > Acked-by: Randy Dunlap <[email protected]>
> > Tested-by: Randy Dunlap <[email protected]>
>
> All my arm64be (arm64 big endian) boot tests crash with this patch
> applied. Reverting it fixes the problem. Crash log and bisect results
> (from pending-fixes branch) below.
>

arm64 images don't crash but report lots of "poison overwritten" backtraces
like the one below. On arm, I see "refcount_t: underflow", also attached.
I didn't bisect those, but given the context I would suspect the same
culprit.

Guenter

---
[ 15.017361] =============================================================================
[ 15.017561] BUG kmalloc-2k (Not tainted): Poison overwritten
[ 15.017632] -----------------------------------------------------------------------------
[ 15.017632]
[ 15.017757] Disabling lock debugging due to kernel taint
[ 15.017900] INFO: 0x(____ptrval____)-0x(____ptrval____) @offset=8272. First byte 0x6a instead of 0x6b
[ 15.018039] INFO: Allocated in i2cdev_attach_adapter.part.10+0x44/0x180 age=18 cpu=0 pid=1
[ 15.018122] __slab_alloc.isra.91+0x5c/0xc8
[ 15.018182] kmem_cache_alloc_trace+0x228/0x248
[ 15.018235] i2cdev_attach_adapter.part.10+0x44/0x180
[ 15.018284] i2cdev_notifier_call+0x70/0x88
[ 15.018329] notifier_call_chain+0x54/0x98
[ 15.018372] blocking_notifier_call_chain+0x5c/0x80
[ 15.018423] device_add+0x3bc/0x770
[ 15.018462] device_register+0x20/0x30
[ 15.018502] i2c_register_adapter+0xf0/0x400
[ 15.018546] i2c_add_adapter+0x80/0xd8
[ 15.018587] i2c_add_numbered_adapter+0x2c/0x38
[ 15.018634] unittest_i2c_bus_probe+0x9c/0xf0
[ 15.018679] platform_drv_probe+0x54/0xa8
[ 15.018722] really_probe+0xd8/0x330
[ 15.018762] driver_probe_device+0x58/0xf0
[ 15.018805] device_driver_attach+0x74/0x80
[ 15.018871] INFO: Freed in i2cdev_dev_release+0x14/0x20 age=4 cpu=0 pid=1
[ 15.018933] kfree+0x3d0/0x3e0
[ 15.018969] i2cdev_dev_release+0x14/0x20
[ 15.019011] device_release+0x2c/0x88
[ 15.019054] kobject_put+0x7c/0x138
[ 15.019092] kobject_put+0x90/0x138
[ 15.019133] cdev_del+0x2c/0x40
[ 15.019169] cdev_device_del+0x40/0x50
[ 15.019210] put_i2c_dev+0x94/0xb0
[ 15.019248] i2cdev_detach_adapter.part.5+0x20/0x30
[ 15.019296] i2cdev_notifier_call+0x80/0x88
[ 15.019339] notifier_call_chain+0x54/0x98
[ 15.019381] blocking_notifier_call_chain+0x5c/0x80
[ 15.019428] device_del+0x84/0x3b0
[ 15.019466] device_unregister+0x18/0x38
[ 15.019508] i2c_del_adapter+0x1e8/0x240
[ 15.019549] unittest_i2c_bus_remove+0x18/0x28
[ 15.019632] INFO: Slab 0x(____ptrval____) objects=5 used=5 fp=0x0000000000000000 flags=0xffff00000010200
[ 15.019717] INFO: Object 0x(____ptrval____) @offset=8192 fp=0x(____ptrval____)
[ 15.019717]

---
[ 22.415374] ### dt-test ### EXPECT / : i2c i2c-1: Added multiplexed i2c bus 3
[ 22.419097] ------------[ cut here ]------------
[ 22.419586] WARNING: CPU: 0 PID: 1 at lib/refcount.c:28 i2cdev_notifier_call+0x54/0x5c
[ 22.419708] refcount_t: underflow; use-after-free.
[ 22.419860] Modules linked in:
[ 22.420074] CPU: 0 PID: 1 Comm: swapper/0 Not tainted 5.7.0-rc6-00275-gdbacbfd47d67 #1
[ 22.420227] Hardware name: Generic OMAP3-GP (Flattened Device Tree)
[ 22.420440] [<c03128e0>] (unwind_backtrace) from [<c030c900>] (show_stack+0x10/0x14)
[ 22.420593] [<c030c900>] (show_stack) from [<c08c8df8>] (dump_stack+0xe0/0x10c)
[ 22.420715] [<c08c8df8>] (dump_stack) from [<c0348380>] (__warn+0xf4/0x10c)
[ 22.420867] [<c0348380>] (__warn) from [<c0348410>] (warn_slowpath_fmt+0x78/0xbc)
[ 22.420989] [<c0348410>] (warn_slowpath_fmt) from [<c0eed4c8>] (i2cdev_notifier_call+0x54/0x5c)
[ 22.421142] [<c0eed4c8>] (i2cdev_notifier_call) from [<c03745dc>] (notifier_call_chain+0x48/0x84)
[ 22.421264] [<c03745dc>] (notifier_call_chain) from [<c0374dc0>] (blocking_notifier_call_chain+0x44/0x5c)
[ 22.421386] [<c0374dc0>] (blocking_notifier_call_chain) from [<c0ba9c5c>] (device_del+0x80/0x3d4)
[ 22.421508] [<c0ba9c5c>] (device_del) from [<c0ba9fbc>] (device_unregister+0xc/0x20)
[ 22.421600] [<c0ba9fbc>] (device_unregister) from [<c0ee83d4>] (i2c_del_adapter+0x1ac/0x1f8)
[ 22.421722] [<c0ee83d4>] (i2c_del_adapter) from [<c0eee888>] (i2c_mux_del_adapters+0x90/0xc8)
[ 22.421874] [<c0eee888>] (i2c_mux_del_adapters) from [<c0fd4d50>] (unittest_i2c_mux_remove+0xc/0x14)
[ 22.421997] [<c0fd4d50>] (unittest_i2c_mux_remove) from [<c0ee7b1c>] (i2c_device_remove+0x54/0xa8)
[ 22.422119] [<c0ee7b1c>] (i2c_device_remove) from [<c0baea40>] (device_release_driver_internal+0xe8/0x1b8)
[ 22.422241] [<c0baea40>] (device_release_driver_internal) from [<c0baeb6c>] (driver_detach+0x44/0x80)
[ 22.422363] [<c0baeb6c>] (driver_detach) from [<c0bad6e4>] (bus_remove_driver+0x4c/0xa0)
[ 22.422485] [<c0bad6e4>] (bus_remove_driver) from [<c1ca89e4>] (of_unittest_overlay+0xc90/0x11a8)
[ 22.422576] [<c1ca89e4>] (of_unittest_overlay) from [<c1cab52c>] (of_unittest+0x24a0/0x2af0)
[ 22.422698] [<c1cab52c>] (of_unittest) from [<c03022d4>] (do_one_initcall+0x8c/0x3bc)
[ 22.422821] [<c03022d4>] (do_one_initcall) from [<c1c0103c>] (kernel_init_freeable+0x1a0/0x204)
[ 22.422943] [<c1c0103c>] (kernel_init_freeable) from [<c11fe8c8>] (kernel_init+0x8/0x118)
[ 22.423065] [<c11fe8c8>] (kernel_init) from [<c0300174>] (ret_from_fork+0x14/0x20)
[ 22.423187] Exception stack(0xcb0bdfb0 to 0xcb0bdff8)
[ 22.423339] dfa0: 00000000 00000000 00000000 00000000
[ 22.423492] dfc0: 00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
[ 22.423645] dfe0: 00000000 00000000 00000000 00000000 00000013 00000000
[ 22.423797] irq event stamp: 774333
[ 22.423919] hardirqs last enabled at (774341): [<c03c1ab0>] console_unlock+0x458/0x648
[ 22.424011] hardirqs last disabled at (774348): [<c03c171c>] console_unlock+0xc4/0x648
[ 22.424163] softirqs last enabled at (774258): [<c0301664>] __do_softirq+0x3bc/0x5b4
[ 22.424255] softirqs last disabled at (774235): [<c03519a4>] irq_exit+0x160/0x170
[ 22.424377] ---[ end trace ae0b985481f6b675 ]---

2020-05-23 14:08:48

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] kobject: Make sure the parent does not get released before its children

On Sat, May 23, 2020 at 06:21:01AM -0700, Guenter Roeck wrote:
> On Wed, May 13, 2020 at 06:18:40PM +0300, Heikki Krogerus wrote:
> > In the function kobject_cleanup(), kobject_del(kobj) is
> > called before the kobj->release(). That makes it possible to
> > release the parent of the kobject before the kobject itself.
> >
> > To fix that, adding function __kboject_del() that does
>
> s/kboject/kobject/
>
> > everything that kobject_del() does except release the parent
> > reference. kobject_cleanup() then calls __kobject_del()
> > instead of kobject_del(), and separately decrements the
> > reference count of the parent kobject after kobj->release()
> > has been called.
> >
> > Reported-by: Naresh Kamboju <[email protected]>
> > Reported-by: kernel test robot <[email protected]>
> > Fixes: 7589238a8cf3 ("Revert "software node: Simplify software_node_release() function"")
> > Cc: Brendan Higgins <[email protected]>
> > Cc: Randy Dunlap <[email protected]>
> > Suggested-by: "Rafael J. Wysocki" <[email protected]>
> > Signed-off-by: Heikki Krogerus <[email protected]>
> > Reviewed-by: Rafael J. Wysocki <[email protected]>
> > Reviewed-by: Brendan Higgins <[email protected]>
> > Tested-by: Brendan Higgins <[email protected]>
> > Acked-by: Randy Dunlap <[email protected]>
> > Tested-by: Randy Dunlap <[email protected]>
>
> All my arm64be (arm64 big endian) boot tests crash with this patch
> applied. Reverting it fixes the problem. Crash log and bisect results
> (from pending-fixes branch) below.

Ugh, ok, just when I send Linus a pull request. Let me go fix that...

greg k-h

2020-05-23 15:38:50

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] kobject: Make sure the parent does not get released before its children

On Wed, May 13, 2020 at 06:18:40PM +0300, Heikki Krogerus wrote:
> In the function kobject_cleanup(), kobject_del(kobj) is
> called before the kobj->release(). That makes it possible to
> release the parent of the kobject before the kobject itself.
>
> To fix that, adding function __kboject_del() that does
> everything that kobject_del() does except release the parent
> reference. kobject_cleanup() then calls __kobject_del()
> instead of kobject_del(), and separately decrements the
> reference count of the parent kobject after kobj->release()
> has been called.
>
> Reported-by: Naresh Kamboju <[email protected]>
> Reported-by: kernel test robot <[email protected]>
> Fixes: 7589238a8cf3 ("Revert "software node: Simplify software_node_release() function"")
> Suggested-by: "Rafael J. Wysocki" <[email protected]>
> Signed-off-by: Heikki Krogerus <[email protected]>
> Reviewed-by: Rafael J. Wysocki <[email protected]>
> Reviewed-by: Brendan Higgins <[email protected]>
> Tested-by: Brendan Higgins <[email protected]>
> Acked-by: Randy Dunlap <[email protected]>
> ---
> lib/kobject.c | 30 ++++++++++++++++++++----------
> 1 file changed, 20 insertions(+), 10 deletions(-)

Stepping back, now that it turns out this patch causes more problems
than it fixes, how is everyone reproducing the original crash here?

Is it just the KUNIT_DRIVER_PE_TEST that is causing the issue?

In looking at 7589238a8cf3 ("Revert "software node: Simplify
software_node_release() function""), the log messages there look
correct. sysfs can't create a duplicate file, and so when your test is
written to try to create software nodes, you always have to check the
return value. If you run the test in parallel, or before another test
has had a chance to clean up, the function will fail, correctly.

So what real-world thing is this test "failure" trying to show?

thanks,

greg k-h

2020-05-23 15:46:48

by Randy Dunlap

[permalink] [raw]
Subject: Re: [PATCH] kobject: Make sure the parent does not get released before its children

On 5/23/20 8:36 AM, Greg Kroah-Hartman wrote:
> On Wed, May 13, 2020 at 06:18:40PM +0300, Heikki Krogerus wrote:
>> In the function kobject_cleanup(), kobject_del(kobj) is
>> called before the kobj->release(). That makes it possible to
>> release the parent of the kobject before the kobject itself.
>>
>> To fix that, adding function __kboject_del() that does
>> everything that kobject_del() does except release the parent
>> reference. kobject_cleanup() then calls __kobject_del()
>> instead of kobject_del(), and separately decrements the
>> reference count of the parent kobject after kobj->release()
>> has been called.
>>
>> Reported-by: Naresh Kamboju <[email protected]>
>> Reported-by: kernel test robot <[email protected]>
>> Fixes: 7589238a8cf3 ("Revert "software node: Simplify software_node_release() function"")
>> Suggested-by: "Rafael J. Wysocki" <[email protected]>
>> Signed-off-by: Heikki Krogerus <[email protected]>
>> Reviewed-by: Rafael J. Wysocki <[email protected]>
>> Reviewed-by: Brendan Higgins <[email protected]>
>> Tested-by: Brendan Higgins <[email protected]>
>> Acked-by: Randy Dunlap <[email protected]>
>> ---
>> lib/kobject.c | 30 ++++++++++++++++++++----------
>> 1 file changed, 20 insertions(+), 10 deletions(-)
>
> Stepping back, now that it turns out this patch causes more problems
> than it fixes, how is everyone reproducing the original crash here?

Just load lib/test_printf.ko and boom!


> Is it just the KUNIT_DRIVER_PE_TEST that is causing the issue?
>
> In looking at 7589238a8cf3 ("Revert "software node: Simplify
> software_node_release() function""), the log messages there look
> correct. sysfs can't create a duplicate file, and so when your test is
> written to try to create software nodes, you always have to check the
> return value. If you run the test in parallel, or before another test
> has had a chance to clean up, the function will fail, correctly.
>
> So what real-world thing is this test "failure" trying to show?


--
~Randy

2020-05-23 19:07:45

by Dmitry Torokhov

[permalink] [raw]
Subject: Re: [PATCH] kobject: Make sure the parent does not get released before its children

On Sat, May 23, 2020 at 8:48 AM Randy Dunlap <[email protected]> wrote:
>
> On 5/23/20 8:36 AM, Greg Kroah-Hartman wrote:
> > On Wed, May 13, 2020 at 06:18:40PM +0300, Heikki Krogerus wrote:
> >> In the function kobject_cleanup(), kobject_del(kobj) is
> >> called before the kobj->release(). That makes it possible to
> >> release the parent of the kobject before the kobject itself.
> >>
> >> To fix that, adding function __kboject_del() that does
> >> everything that kobject_del() does except release the parent
> >> reference. kobject_cleanup() then calls __kobject_del()
> >> instead of kobject_del(), and separately decrements the
> >> reference count of the parent kobject after kobj->release()
> >> has been called.
> >>
> >> Reported-by: Naresh Kamboju <[email protected]>
> >> Reported-by: kernel test robot <[email protected]>
> >> Fixes: 7589238a8cf3 ("Revert "software node: Simplify software_node_release() function"")
> >> Suggested-by: "Rafael J. Wysocki" <[email protected]>
> >> Signed-off-by: Heikki Krogerus <[email protected]>
> >> Reviewed-by: Rafael J. Wysocki <[email protected]>
> >> Reviewed-by: Brendan Higgins <[email protected]>
> >> Tested-by: Brendan Higgins <[email protected]>
> >> Acked-by: Randy Dunlap <[email protected]>
> >> ---
> >> lib/kobject.c | 30 ++++++++++++++++++++----------
> >> 1 file changed, 20 insertions(+), 10 deletions(-)
> >
> > Stepping back, now that it turns out this patch causes more problems
> > than it fixes, how is everyone reproducing the original crash here?
>
> Just load lib/test_printf.ko and boom!
>
>
> > Is it just the KUNIT_DRIVER_PE_TEST that is causing the issue?
> >
> > In looking at 7589238a8cf3 ("Revert "software node: Simplify
> > software_node_release() function""), the log messages there look
> > correct. sysfs can't create a duplicate file, and so when your test is
> > written to try to create software nodes, you always have to check the
> > return value. If you run the test in parallel, or before another test
> > has had a chance to clean up, the function will fail, correctly.
> >
> > So what real-world thing is this test "failure" trying to show?

Well, not sure about the test, but speaking more generally, should not
we postpone releasing parent's reference until we are in
kobj->release() handler? I.e. after all child state is cleared, and
all memory is freed, _then_ we unpin the parent?

Thanks.

--
Dmitry

2020-05-24 11:44:41

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] kobject: Make sure the parent does not get released before its children

On Sat, May 23, 2020 at 12:04:30PM -0700, Dmitry Torokhov wrote:
> On Sat, May 23, 2020 at 8:48 AM Randy Dunlap <[email protected]> wrote:
> >
> > On 5/23/20 8:36 AM, Greg Kroah-Hartman wrote:
> > > On Wed, May 13, 2020 at 06:18:40PM +0300, Heikki Krogerus wrote:
> > >> In the function kobject_cleanup(), kobject_del(kobj) is
> > >> called before the kobj->release(). That makes it possible to
> > >> release the parent of the kobject before the kobject itself.
> > >>
> > >> To fix that, adding function __kboject_del() that does
> > >> everything that kobject_del() does except release the parent
> > >> reference. kobject_cleanup() then calls __kobject_del()
> > >> instead of kobject_del(), and separately decrements the
> > >> reference count of the parent kobject after kobj->release()
> > >> has been called.
> > >>
> > >> Reported-by: Naresh Kamboju <[email protected]>
> > >> Reported-by: kernel test robot <[email protected]>
> > >> Fixes: 7589238a8cf3 ("Revert "software node: Simplify software_node_release() function"")
> > >> Suggested-by: "Rafael J. Wysocki" <[email protected]>
> > >> Signed-off-by: Heikki Krogerus <[email protected]>
> > >> Reviewed-by: Rafael J. Wysocki <[email protected]>
> > >> Reviewed-by: Brendan Higgins <[email protected]>
> > >> Tested-by: Brendan Higgins <[email protected]>
> > >> Acked-by: Randy Dunlap <[email protected]>
> > >> ---
> > >> lib/kobject.c | 30 ++++++++++++++++++++----------
> > >> 1 file changed, 20 insertions(+), 10 deletions(-)
> > >
> > > Stepping back, now that it turns out this patch causes more problems
> > > than it fixes, how is everyone reproducing the original crash here?
> >
> > Just load lib/test_printf.ko and boom!
> >
> >
> > > Is it just the KUNIT_DRIVER_PE_TEST that is causing the issue?
> > >
> > > In looking at 7589238a8cf3 ("Revert "software node: Simplify
> > > software_node_release() function""), the log messages there look
> > > correct. sysfs can't create a duplicate file, and so when your test is
> > > written to try to create software nodes, you always have to check the
> > > return value. If you run the test in parallel, or before another test
> > > has had a chance to clean up, the function will fail, correctly.
> > >
> > > So what real-world thing is this test "failure" trying to show?
>
> Well, not sure about the test, but speaking more generally, should not
> we postpone releasing parent's reference until we are in
> kobj->release() handler? I.e. after all child state is cleared, and
> all memory is freed, _then_ we unpin the parent?

That's what the patch was trying to do in a way. But I think you are
right, we should _only_ be doing it at that point in time, and no other,
which the patch was not doing.

Let me go try that and see what happens...

thanks,

greg k-h

2020-05-24 12:59:51

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] kobject: Make sure the parent does not get released before its children

On Sat, May 23, 2020 at 08:44:06AM -0700, Randy Dunlap wrote:
> On 5/23/20 8:36 AM, Greg Kroah-Hartman wrote:
> > On Wed, May 13, 2020 at 06:18:40PM +0300, Heikki Krogerus wrote:
> >> In the function kobject_cleanup(), kobject_del(kobj) is
> >> called before the kobj->release(). That makes it possible to
> >> release the parent of the kobject before the kobject itself.
> >>
> >> To fix that, adding function __kboject_del() that does
> >> everything that kobject_del() does except release the parent
> >> reference. kobject_cleanup() then calls __kobject_del()
> >> instead of kobject_del(), and separately decrements the
> >> reference count of the parent kobject after kobj->release()
> >> has been called.
> >>
> >> Reported-by: Naresh Kamboju <[email protected]>
> >> Reported-by: kernel test robot <[email protected]>
> >> Fixes: 7589238a8cf3 ("Revert "software node: Simplify software_node_release() function"")
> >> Suggested-by: "Rafael J. Wysocki" <[email protected]>
> >> Signed-off-by: Heikki Krogerus <[email protected]>
> >> Reviewed-by: Rafael J. Wysocki <[email protected]>
> >> Reviewed-by: Brendan Higgins <[email protected]>
> >> Tested-by: Brendan Higgins <[email protected]>
> >> Acked-by: Randy Dunlap <[email protected]>
> >> ---
> >> lib/kobject.c | 30 ++++++++++++++++++++----------
> >> 1 file changed, 20 insertions(+), 10 deletions(-)
> >
> > Stepping back, now that it turns out this patch causes more problems
> > than it fixes, how is everyone reproducing the original crash here?
>
> Just load lib/test_printf.ko and boom!

Thanks, that helps.

Ok, in messing around with the kobject core more, originally we thought
this was an issue of the kobject uevent happening for the parent pointer
(when the parent was invalid). so, moving things around some more, and
now I'm crashing in software_node_release() when we are trying to access
swnode->parent->child_ids as parent is invalid there.

So I feel like this is a swnode bug, or a use of swnode in a way it
shouldn't be that the testing framework is exposing somehow.

Let me dig deeper...

greg k-h

2020-05-24 13:16:58

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] kobject: Make sure the parent does not get released before its children

On Sun, May 24, 2020 at 02:57:27PM +0200, Greg Kroah-Hartman wrote:
> On Sat, May 23, 2020 at 08:44:06AM -0700, Randy Dunlap wrote:
> > On 5/23/20 8:36 AM, Greg Kroah-Hartman wrote:
> > > On Wed, May 13, 2020 at 06:18:40PM +0300, Heikki Krogerus wrote:
> > >> In the function kobject_cleanup(), kobject_del(kobj) is
> > >> called before the kobj->release(). That makes it possible to
> > >> release the parent of the kobject before the kobject itself.
> > >>
> > >> To fix that, adding function __kboject_del() that does
> > >> everything that kobject_del() does except release the parent
> > >> reference. kobject_cleanup() then calls __kobject_del()
> > >> instead of kobject_del(), and separately decrements the
> > >> reference count of the parent kobject after kobj->release()
> > >> has been called.
> > >>
> > >> Reported-by: Naresh Kamboju <[email protected]>
> > >> Reported-by: kernel test robot <[email protected]>
> > >> Fixes: 7589238a8cf3 ("Revert "software node: Simplify software_node_release() function"")
> > >> Suggested-by: "Rafael J. Wysocki" <[email protected]>
> > >> Signed-off-by: Heikki Krogerus <[email protected]>
> > >> Reviewed-by: Rafael J. Wysocki <[email protected]>
> > >> Reviewed-by: Brendan Higgins <[email protected]>
> > >> Tested-by: Brendan Higgins <[email protected]>
> > >> Acked-by: Randy Dunlap <[email protected]>
> > >> ---
> > >> lib/kobject.c | 30 ++++++++++++++++++++----------
> > >> 1 file changed, 20 insertions(+), 10 deletions(-)
> > >
> > > Stepping back, now that it turns out this patch causes more problems
> > > than it fixes, how is everyone reproducing the original crash here?
> >
> > Just load lib/test_printf.ko and boom!
>
> Thanks, that helps.
>
> Ok, in messing around with the kobject core more, originally we thought
> this was an issue of the kobject uevent happening for the parent pointer
> (when the parent was invalid). so, moving things around some more, and
> now I'm crashing in software_node_release() when we are trying to access
> swnode->parent->child_ids as parent is invalid there.
>
> So I feel like this is a swnode bug, or a use of swnode in a way it
> shouldn't be that the testing framework is exposing somehow.
>
> Let me dig deeper...

Ah, ick, static software nodes trying to be cleaned up in the totally
wrong order. You can't just try to randomly clean up a kobject anywhere
in the middle of the hierarchy, that's flat out not going to work
properly. let me unwind it...


greg k-h

2020-05-24 13:33:19

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] kobject: Make sure the parent does not get released before its children

On Sun, May 24, 2020 at 03:14:05PM +0200, Greg Kroah-Hartman wrote:
> On Sun, May 24, 2020 at 02:57:27PM +0200, Greg Kroah-Hartman wrote:
> > On Sat, May 23, 2020 at 08:44:06AM -0700, Randy Dunlap wrote:
> > > On 5/23/20 8:36 AM, Greg Kroah-Hartman wrote:
> > > > On Wed, May 13, 2020 at 06:18:40PM +0300, Heikki Krogerus wrote:
> > > >> In the function kobject_cleanup(), kobject_del(kobj) is
> > > >> called before the kobj->release(). That makes it possible to
> > > >> release the parent of the kobject before the kobject itself.
> > > >>
> > > >> To fix that, adding function __kboject_del() that does
> > > >> everything that kobject_del() does except release the parent
> > > >> reference. kobject_cleanup() then calls __kobject_del()
> > > >> instead of kobject_del(), and separately decrements the
> > > >> reference count of the parent kobject after kobj->release()
> > > >> has been called.
> > > >>
> > > >> Reported-by: Naresh Kamboju <[email protected]>
> > > >> Reported-by: kernel test robot <[email protected]>
> > > >> Fixes: 7589238a8cf3 ("Revert "software node: Simplify software_node_release() function"")
> > > >> Suggested-by: "Rafael J. Wysocki" <[email protected]>
> > > >> Signed-off-by: Heikki Krogerus <[email protected]>
> > > >> Reviewed-by: Rafael J. Wysocki <[email protected]>
> > > >> Reviewed-by: Brendan Higgins <[email protected]>
> > > >> Tested-by: Brendan Higgins <[email protected]>
> > > >> Acked-by: Randy Dunlap <[email protected]>
> > > >> ---
> > > >> lib/kobject.c | 30 ++++++++++++++++++++----------
> > > >> 1 file changed, 20 insertions(+), 10 deletions(-)
> > > >
> > > > Stepping back, now that it turns out this patch causes more problems
> > > > than it fixes, how is everyone reproducing the original crash here?
> > >
> > > Just load lib/test_printf.ko and boom!
> >
> > Thanks, that helps.
> >
> > Ok, in messing around with the kobject core more, originally we thought
> > this was an issue of the kobject uevent happening for the parent pointer
> > (when the parent was invalid). so, moving things around some more, and
> > now I'm crashing in software_node_release() when we are trying to access
> > swnode->parent->child_ids as parent is invalid there.
> >
> > So I feel like this is a swnode bug, or a use of swnode in a way it
> > shouldn't be that the testing framework is exposing somehow.
> >
> > Let me dig deeper...
>
> Ah, ick, static software nodes trying to be cleaned up in the totally
> wrong order. You can't just try to randomly clean up a kobject anywhere
> in the middle of the hierarchy, that's flat out not going to work
> properly. let me unwind it...

Ok, the patch below fixes the test, there's not really anything wrong
with the kobject core, except maybe the kobject uevent for removal,
which I'll send a patch for.

I'll write these up as a real set of patches after a bit.

thanks,

greg k-h

diff --git a/drivers/base/swnode.c b/drivers/base/swnode.c
index de8d3543e8fe..34bc2bbb3ada 100644
--- a/drivers/base/swnode.c
+++ b/drivers/base/swnode.c
@@ -715,14 +715,10 @@ EXPORT_SYMBOL_GPL(software_node_register_nodes);
*/
void software_node_unregister_nodes(const struct software_node *nodes)
{
- struct swnode *swnode;
int i;

- for (i = 0; nodes[i].name; i++) {
- swnode = software_node_to_swnode(&nodes[i]);
- if (swnode)
- fwnode_remove_software_node(&swnode->fwnode);
- }
+ for (i = 0; nodes[i].name; i++)
+ software_node_unregister(&nodes[i]);
}
EXPORT_SYMBOL_GPL(software_node_unregister_nodes);

@@ -741,6 +737,20 @@ int software_node_register(const struct software_node *node)
}
EXPORT_SYMBOL_GPL(software_node_register);

+/**
+ * software_node_unregister - Unregister static software node
+ * @node: The software node to be unregistered
+ */
+void software_node_unregister(const struct software_node *node)
+{
+ struct swnode *swnode;
+
+ swnode = software_node_to_swnode(node);
+ if (swnode)
+ fwnode_remove_software_node(&swnode->fwnode);
+}
+EXPORT_SYMBOL_GPL(software_node_unregister);
+
struct fwnode_handle *
fwnode_create_software_node(const struct property_entry *properties,
const struct fwnode_handle *parent)
diff --git a/include/linux/property.h b/include/linux/property.h
index d86de017c689..0d4099b4ce1f 100644
--- a/include/linux/property.h
+++ b/include/linux/property.h
@@ -441,6 +441,7 @@ int software_node_register_nodes(const struct software_node *nodes);
void software_node_unregister_nodes(const struct software_node *nodes);

int software_node_register(const struct software_node *node);
+void software_node_unregister(const struct software_node *node);

int software_node_notify(struct device *dev, unsigned long action);

diff --git a/lib/test_printf.c b/lib/test_printf.c
index 6b1622f4d7c2..b320c733af70 100644
--- a/lib/test_printf.c
+++ b/lib/test_printf.c
@@ -637,7 +637,9 @@ static void __init fwnode_pointer(void)
test(second_name, "%pfwP", software_node_fwnode(&softnodes[1]));
test(third_name, "%pfwP", software_node_fwnode(&softnodes[2]));

- software_node_unregister_nodes(softnodes);
+ software_node_unregister(&softnodes[2]);
+ software_node_unregister(&softnodes[1]);
+ software_node_unregister(&softnodes[0]);
}

static void __init

2020-05-24 15:39:15

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] kobject: Make sure the parent does not get released before its children

On Sun, May 24, 2020 at 03:28:12PM +0200, Greg Kroah-Hartman wrote:
> On Sun, May 24, 2020 at 03:14:05PM +0200, Greg Kroah-Hartman wrote:
> > On Sun, May 24, 2020 at 02:57:27PM +0200, Greg Kroah-Hartman wrote:
> > > On Sat, May 23, 2020 at 08:44:06AM -0700, Randy Dunlap wrote:
> > > > On 5/23/20 8:36 AM, Greg Kroah-Hartman wrote:
> > > > > On Wed, May 13, 2020 at 06:18:40PM +0300, Heikki Krogerus wrote:
> > > > >> In the function kobject_cleanup(), kobject_del(kobj) is
> > > > >> called before the kobj->release(). That makes it possible to
> > > > >> release the parent of the kobject before the kobject itself.
> > > > >>
> > > > >> To fix that, adding function __kboject_del() that does
> > > > >> everything that kobject_del() does except release the parent
> > > > >> reference. kobject_cleanup() then calls __kobject_del()
> > > > >> instead of kobject_del(), and separately decrements the
> > > > >> reference count of the parent kobject after kobj->release()
> > > > >> has been called.
> > > > >>
> > > > >> Reported-by: Naresh Kamboju <[email protected]>
> > > > >> Reported-by: kernel test robot <[email protected]>
> > > > >> Fixes: 7589238a8cf3 ("Revert "software node: Simplify software_node_release() function"")
> > > > >> Suggested-by: "Rafael J. Wysocki" <[email protected]>
> > > > >> Signed-off-by: Heikki Krogerus <[email protected]>
> > > > >> Reviewed-by: Rafael J. Wysocki <[email protected]>
> > > > >> Reviewed-by: Brendan Higgins <[email protected]>
> > > > >> Tested-by: Brendan Higgins <[email protected]>
> > > > >> Acked-by: Randy Dunlap <[email protected]>
> > > > >> ---
> > > > >> lib/kobject.c | 30 ++++++++++++++++++++----------
> > > > >> 1 file changed, 20 insertions(+), 10 deletions(-)
> > > > >
> > > > > Stepping back, now that it turns out this patch causes more problems
> > > > > than it fixes, how is everyone reproducing the original crash here?
> > > >
> > > > Just load lib/test_printf.ko and boom!
> > >
> > > Thanks, that helps.
> > >
> > > Ok, in messing around with the kobject core more, originally we thought
> > > this was an issue of the kobject uevent happening for the parent pointer
> > > (when the parent was invalid). so, moving things around some more, and
> > > now I'm crashing in software_node_release() when we are trying to access
> > > swnode->parent->child_ids as parent is invalid there.
> > >
> > > So I feel like this is a swnode bug, or a use of swnode in a way it
> > > shouldn't be that the testing framework is exposing somehow.
> > >
> > > Let me dig deeper...
> >
> > Ah, ick, static software nodes trying to be cleaned up in the totally
> > wrong order. You can't just try to randomly clean up a kobject anywhere
> > in the middle of the hierarchy, that's flat out not going to work
> > properly. let me unwind it...
>
> Ok, the patch below fixes the test, there's not really anything wrong
> with the kobject core, except maybe the kobject uevent for removal,
> which I'll send a patch for.
>
> I'll write these up as a real set of patches after a bit.

They are now here:
https://lore.kernel.org/lkml/[email protected]/

thanks,

greg k-h