2010-06-28 01:48:52

by Benjamin Herrenschmidt

Subject: kmem_cache_destroy() badness with SLUB

Hi folks !

Internally, I'm hitting a little "nit"...

sysfs_slab_add() has this check:

	if (slab_state < SYSFS)
		/* Defer until later */
		return 0;

But sysfs_slab_remove() doesn't.

So if the slab is created -and- destroyed at, for example, arch_initcall
time, then we hit a WARN in the kobject code, trying to dispose of a
non-existing kobject.

Now, at first sight, just adding the same test to sysfs_slab_remove()
would do the job... but it all seems very racy to me.
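
A minimal user-space sketch of the asymmetry described above, assuming a stripped-down model: the `cache_model` struct, the `_model` function names, the `guard` parameter and the `-1` return (standing in for the kobject WARN) are all hypothetical and are not the kernel's code. Only the `slab_state < SYSFS` test is taken from the snippet quoted above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the boot-progress variable (assumption). */
enum slab_state_model { DOWN, PARTIAL, UP, SYSFS };
static enum slab_state_model slab_state = DOWN;

/* Hypothetical cache: only tracks whether a kobject was registered. */
struct cache_model {
	const char *name;
	bool has_kobject;
};

/* Mirrors the quoted check: before SYSFS, registration is deferred. */
int sysfs_slab_add_model(struct cache_model *s)
{
	assert(s != NULL);
	if (slab_state < SYSFS)
		/* Defer until later */
		return 0;
	s->has_kobject = true;
	return 0;
}

/*
 * guard == false models removal without the test, guard == true models
 * removal with the same test added. Returns -1 where the real code would
 * hit the WARN for disposing of a kobject that was never created.
 */
int sysfs_slab_remove_model(struct cache_model *s, bool guard)
{
	assert(s != NULL);
	if (guard && slab_state < SYSFS)
		return 0;	/* nothing was registered yet */
	if (!s->has_kobject)
		return -1;	/* would WARN in the kobject code */
	s->has_kobject = false;
	return 0;
}
```

In this single-threaded model the guard is enough: a cache added and removed while `slab_state` is still `UP` never reaches the kobject path. It says nothing about the concurrent case raised below.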

I don't understand in fact how this slab_state deals with races at all.

What prevents us from hitting slab_sysfs_init() at the same time as
another CPU does sysfs_slab_add() ? How does that deal with collisions
trying to register the same kobject twice ? Similar race with remove...

Shouldn't we have a mutex around those guys ?
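
One way to picture the mutex idea is the user-space sketch below. It assumes a hypothetical `sysfs_lock` that covers both the state change and the registration decision; the pthread mutex, the fixed-size `slab_caches` array and every `_model` name are illustrative stand-ins, not what mm/slub.c does.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

#define MAX_CACHES 8

enum slab_state_model { UP, SYSFS };

struct cache_model {
	const char *name;
	int registrations;	/* times a kobject was registered */
};

/* Hypothetical lock serialising the state change and registration. */
static pthread_mutex_t sysfs_lock = PTHREAD_MUTEX_INITIALIZER;
static enum slab_state_model slab_state = UP;
static struct cache_model *slab_caches[MAX_CACHES];
static size_t nr_caches;

/* Caller must hold sysfs_lock. */
static void register_kobject_locked(struct cache_model *s)
{
	s->registrations++;
}

/* Track a new cache; register it now only if sysfs is already up. */
int cache_add_model(struct cache_model *s)
{
	int ret = 0;

	assert(s != NULL);
	pthread_mutex_lock(&sysfs_lock);
	if (nr_caches == MAX_CACHES) {
		ret = -1;
	} else {
		slab_caches[nr_caches++] = s;
		if (slab_state >= SYSFS)
			register_kobject_locked(s);
	}
	pthread_mutex_unlock(&sysfs_lock);
	return ret;
}

/* Flip the state and register every deferred cache under the same lock. */
void slab_sysfs_init_model(void)
{
	pthread_mutex_lock(&sysfs_lock);
	slab_state = SYSFS;
	for (size_t i = 0; i < nr_caches; i++)
		if (slab_caches[i]->registrations == 0)
			register_kobject_locked(slab_caches[i]);
	pthread_mutex_unlock(&sysfs_lock);
}
```

Because the state test and the registration happen inside one critical section, a cache is either on the list before the init pass (and registered by it) or added afterwards (and registered directly), so in this model each cache is registered exactly once.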

Cheers,
Ben.


2010-06-28 09:03:23

by David Rientjes

Subject: Re: kmem_cache_destroy() badness with SLUB

On Mon, 28 Jun 2010, Benjamin Herrenschmidt wrote:

> Hi folks !
>
> Internally, I'm hitting a little "nit"...
>
> sysfs_slab_add() has this check:
>
> 	if (slab_state < SYSFS)
> 		/* Defer until later */
> 		return 0;
>
> But sysfs_slab_remove() doesn't.
>
> So if the slab is created -and- destroyed at, for example, arch_initcall
> time, then we hit a WARN in the kobject code, trying to dispose of a
> non-existing kobject.
>

Indeed, but shouldn't we be appropriately handling the return value of
sysfs_slab_add() so that it fails cache creation? We wouldn't be calling
sysfs_slab_remove() on a cache that was never created.

> Now, at first sight, just adding the same test to sysfs_slab_remove()
> would do the job... but it all seems very racy to me.
>
> I don't understand in fact how this slab_state deals with races at all.
>

All modifiers of slab_state are intended to be run only on the boot cpu so
the only concern is the ordering. We need slab_state to indicate how far
slab has been initialized since we can't otherwise enforce how code uses
slab in between things like kmem_cache_init(), kmem_cache_init_late(), and
initcalls on the boot cpu.

2010-06-28 21:45:16

by Benjamin Herrenschmidt

Subject: Re: kmem_cache_destroy() badness with SLUB

On Mon, 2010-06-28 at 02:03 -0700, David Rientjes wrote:
> On Mon, 28 Jun 2010, Benjamin Herrenschmidt wrote:
>
> > Hi folks !
> >
> > Internally, I'm hitting a little "nit"...
> >
> > sysfs_slab_add() has this check:
> >
> > 	if (slab_state < SYSFS)
> > 		/* Defer until later */
> > 		return 0;
> >
> > But sysfs_slab_remove() doesn't.
> >
> > So if the slab is created -and- destroyed at, for example, arch_initcall
> > time, then we hit a WARN in the kobject code, trying to dispose of a
> > non-existing kobject.
> >
> Indeed, but shouldn't we be appropriately handling the return value of
> sysfs_slab_add() so that it fails cache creation? We wouldn't be calling
> sysfs_slab_remove() on a cache that was never created.

It's eventually created, but yes, we should probably store a state,
unless we have a clean way to know the kobject in there is uninitialized
and test for that.
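
A rough sketch of the "store a state" option, assuming a hypothetical per-cache `sysfs_registered` flag. The struct, the flag and the `_model` function names are made up for illustration, and the model is single-threaded, so it does not address the race question.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

enum slab_state_model { UP, SYSFS };
static enum slab_state_model slab_state = UP;

/* Hypothetical per-cache record of whether registration happened. */
struct cache_model {
	const char *name;
	bool sysfs_registered;
};

int sysfs_slab_add_model(struct cache_model *s)
{
	assert(s != NULL);
	if (slab_state < SYSFS)
		return 0;	/* deferred, flag stays false */
	s->sysfs_registered = true;
	return 0;
}

/* Returns true only if a kobject was actually torn down. */
bool sysfs_slab_remove_model(struct cache_model *s)
{
	assert(s != NULL);
	if (!s->sysfs_registered)
		return false;	/* never registered: skip teardown */
	s->sysfs_registered = false;
	return true;
}
```

Testing the cache's own flag rather than the global `slab_state` means removal depends on what happened to that particular cache, not on how far boot has progressed when the removal runs.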

> > Now, at first sight, just adding the same test to sysfs_slab_remove()
> > would do the job... but it all seems very racy to me.
> >
> > I don't understand in fact how this slab_state deals with races at all.
> >
> All modifiers of slab_state are intended to be run only on the boot cpu so
> the only concern is the ordering. We need slab_state to indicate how far
> slab has been initialized since we can't otherwise enforce how code uses
> slab in between things like kmem_cache_init(), kmem_cache_init_late(), and
> initcalls on the boot cpu.

But initcalls aren't pinned to the boot CPU... i.e. I don't see how the
sysfs creation avoids racing with slab creation, or am I missing
something ?

Cheers,
Ben.


2010-06-29 15:47:13

by Christoph Lameter

Subject: Re: kmem_cache_destroy() badness with SLUB

On Mon, 28 Jun 2010, Benjamin Herrenschmidt wrote:

> So if the slab is created -and- destroyed at, for example, arch_initcall
> time, then we hit a WARN in the kobject code, trying to dispose of a
> non-existing kobject.

Yes, don't do that.

> Now, at first sight, just adding the same test to sysfs_slab_remove()
> would do the job... but it all seems very racy to me.

Yes, let's leave it as is. Don't remove slabs during boot.

> Shouldn't we have a mutex around those guys ?

At boot time?

2010-07-06 03:58:32

by Roland Dreier

Subject: Re: kmem_cache_destroy() badness with SLUB


> Hi folks !
>
> Internally, I'm hitting a little "nit"...
>
> sysfs_slab_add() has this check:
>
> 	if (slab_state < SYSFS)
> 		/* Defer until later */
> 		return 0;
>
> But sysfs_slab_remove() doesn't.
>
> So if the slab is created -and- destroyed at, for example, arch_initcall
> time, then we hit a WARN in the kobject code, trying to dispose of a
> non-existing kobject.
>
> Now, at first sight, just adding the same test to sysfs_slab_remove()
> would do the job... but it all seems very racy to me.
>
> I don't understand in fact how this slab_state deals with races at all.
>
> What prevents us from hitting slab_sysfs_init() at the same time as
> another CPU does sysfs_slab_add() ? How does that deal with collisions
> trying to register the same kobject twice ? Similar race with remove...
>
> Shouldn't we have a mutex around those guys ?
>
> Cheers,
> Ben.