From: David Moore <[email protected]>
The idr_remove_all() function returns unused slabs to the kmem cache,
but needs to zero them first or else they will be uninitialized upon
next use. This fixes crashes which have been observed in the firewire
subsystem.
Signed-off-by: David Moore <[email protected]>
---
lib/idr.c | 16 +++++++++++++++-
1 files changed, 15 insertions(+), 1 deletions(-)
diff --git a/lib/idr.c b/lib/idr.c
index 1c4f928..69c3455 100644
--- a/lib/idr.c
+++ b/lib/idr.c
@@ -65,6 +65,20 @@ static inline void free_layer(struct idr_layer *p)
call_rcu(&p->rcu_head, idr_layer_rcu_free);
}
+static void idr_layer_rcu_free_zero(struct rcu_head *head)
+{
+ struct idr_layer *layer;
+
+ layer = container_of(head, struct idr_layer, rcu_head);
+ memset(layer, 0, sizeof(struct idr_layer));
+ kmem_cache_free(idr_layer_cache, layer);
+}
+
+static inline void free_layer_zero(struct idr_layer *p)
+{
+ call_rcu(&p->rcu_head, idr_layer_rcu_free_zero);
+}
+
/* only called when idp->lock is held */
static void __move_to_free_list(struct idr *idp, struct idr_layer *p)
{
@@ -462,7 +476,7 @@ void idr_remove_all(struct idr *idp)
id += 1 << n;
while (n < fls(id)) {
if (p)
- free_layer(p);
+ free_layer_zero(p);
n += IDR_BITS;
p = *--paa;
}
--
1.6.0.6
David Moore wrote:
> From: David Moore <[email protected]>
>
> The idr_remove_all() function returns unused slabs to the kmem cache,
> but needs to zero them first or else they will be uninitialized upon
> next use. This fixes crashes which have been observed in the firewire
> subsystem.
>
> Signed-off-by: David Moore <[email protected]>
Tested-by: Stefan Richter <[email protected]>
> ---
> lib/idr.c | 16 +++++++++++++++-
> 1 files changed, 15 insertions(+), 1 deletions(-)
>
> diff --git a/lib/idr.c b/lib/idr.c
> index 1c4f928..69c3455 100644
> --- a/lib/idr.c
> +++ b/lib/idr.c
> @@ -65,6 +65,20 @@ static inline void free_layer(struct idr_layer *p)
> call_rcu(&p->rcu_head, idr_layer_rcu_free);
> }
>
> +static void idr_layer_rcu_free_zero(struct rcu_head *head)
> +{
> + struct idr_layer *layer;
> +
> + layer = container_of(head, struct idr_layer, rcu_head);
> + memset(layer, 0, sizeof(struct idr_layer));
> + kmem_cache_free(idr_layer_cache, layer);
> +}
> +
> +static inline void free_layer_zero(struct idr_layer *p)
> +{
> + call_rcu(&p->rcu_head, idr_layer_rcu_free_zero);
> +}
> +
> /* only called when idp->lock is held */
> static void __move_to_free_list(struct idr *idp, struct idr_layer *p)
> {
> @@ -462,7 +476,7 @@ void idr_remove_all(struct idr *idp)
> id += 1 << n;
> while (n < fls(id)) {
> if (p)
> - free_layer(p);
> + free_layer_zero(p);
> n += IDR_BITS;
> p = *--paa;
> }
Nadia,
it appears as if post-2.6.26 commit
cf481c20c476ad2c0febdace9ce23f5a4db19582 "idr: make idr_remove rcu-safe"
was buggy as it removed a memset(...0...) from idr_remove_all() without
any obvious replacement. And this patch fixes it. Is this correct?
This was observed by David in Fedora 2.6.27.* kernels and in 2.6.28, and
I have it seen in vanilla 2.6.28 --- but only after I disabled some
debug kconfig options. The trigger for the bug is not the existing
usage of idr in drivers/firewire/, but a new usage which is not yet in
mainline. More details:
http://marc.info/?l=linux1394-devel&m=123140439522563
The symptom is that after a few destructions of idr trees (which involve
idr_remove_all() of course), there appear spurious idr entries in
subsequently newly created idr trees. These spurious entries then crash
the driver when it iterates over them.
Andrew,
the triggering code are feature additions which I vaguely hoped of still
getting ready for pull before 2.6.29-rc1. I see as my options now
- to queue up this lib/idr fix --- if reviewers like it --- together
with my drivers/firewire updates for a pull request very very soon,
- to send my firewire updates independently of this idr patch but
with a simple temporary workaround at the new idr using driver code,
- to wait with these firewire features for 2.6.30.
It's about these updates:
http://git.kernel.org/?p=linux/kernel/git/ieee1394/linux1394-2.6.git;a=shortlog;h=test
What do you think?
--
Stefan Richter
-=====-==--= ---= -=-=-
http://arcgraph.de/sr/
On Sat, 10 Jan 2009 10:03:33 +0100 Stefan Richter <[email protected]> wrote:
> David Moore wrote:
> > From: David Moore <[email protected]>
> >
> > The idr_remove_all() function returns unused slabs to the kmem cache,
> > but needs to zero them first or else they will be uninitialized upon
> > next use. This fixes crashes which have been observed in the firewire
> > subsystem.
> >
hm.
>
> > ---
> > lib/idr.c | 16 +++++++++++++++-
> > 1 files changed, 15 insertions(+), 1 deletions(-)
> >
> > diff --git a/lib/idr.c b/lib/idr.c
> > index 1c4f928..69c3455 100644
> > --- a/lib/idr.c
> > +++ b/lib/idr.c
> > @@ -65,6 +65,20 @@ static inline void free_layer(struct idr_layer *p)
> > call_rcu(&p->rcu_head, idr_layer_rcu_free);
> > }
> >
> > +static void idr_layer_rcu_free_zero(struct rcu_head *head)
> > +{
> > + struct idr_layer *layer;
> > +
> > + layer = container_of(head, struct idr_layer, rcu_head);
> > + memset(layer, 0, sizeof(struct idr_layer));
> > + kmem_cache_free(idr_layer_cache, layer);
> > +}
> > +
> > +static inline void free_layer_zero(struct idr_layer *p)
> > +{
> > + call_rcu(&p->rcu_head, idr_layer_rcu_free_zero);
> > +}
> > +
> > /* only called when idp->lock is held */
> > static void __move_to_free_list(struct idr *idp, struct idr_layer *p)
> > {
> > @@ -462,7 +476,7 @@ void idr_remove_all(struct idr *idp)
> > id += 1 << n;
> > while (n < fls(id)) {
> > if (p)
> > - free_layer(p);
> > + free_layer_zero(p);
> > n += IDR_BITS;
> > p = *--paa;
> > }
>
> Nadia,
>
> it appears as if post-2.6.26 commit
> cf481c20c476ad2c0febdace9ce23f5a4db19582 "idr: make idr_remove rcu-safe"
> was buggy as it removed a memset(...0...) from idr_remove_all() without
> any obvious replacement. And this patch fixes it. Is this correct?
>
> This was observed by David in Fedora 2.6.27.* kernels and in 2.6.28, and
> I have it seen in vanilla 2.6.28 --- but only after I disabled some
> debug kconfig options. The trigger for the bug is not the existing
> usage of idr in drivers/firewire/, but a new usage which is not yet in
> mainline. More details:
> http://marc.info/?l=linux1394-devel&m=123140439522563
>
> The symptom is that after a few destructions of idr trees (which involve
> idr_remove_all() of course), there appear spurious idr entries in
> subsequently newly created idr trees. These spurious entries then crash
> the driver when it iterates over them.
>
> Andrew,
>
> the triggering code are feature additions which I vaguely hoped of still
> getting ready for pull before 2.6.29-rc1. I see as my options now
> - to queue up this lib/idr fix --- if reviewers like it --- together
> with my drivers/firewire updates for a pull request very very soon,
> - to send my firewire updates independently of this idr patch but
> with a simple temporary workaround at the new idr using driver code,
> - to wait with these firewire features for 2.6.30.
> It's about these updates:
> http://git.kernel.org/?p=linux/kernel/git/ieee1394/linux1394-2.6.git;a=shortlog;h=test
Are we sure that all the other callers of free_layer() are freeing
zeroed objects?
It would be cleaner, safer and quite possibly faster to remove the
constructor altogether and use kmem_cache_zalloc() to allocate new
objects.
Andrew Morton wrote:
> On Sat, 10 Jan 2009 10:03:33 +0100 Stefan Richter <[email protected]> wrote:
>
>> David Moore wrote:
>>> From: David Moore <[email protected]>
>>>
>>> The idr_remove_all() function returns unused slabs to the kmem cache,
>>> but needs to zero them first or else they will be uninitialized upon
>>> next use. This fixes crashes which have been observed in the firewire
>>> subsystem.
>>>
>
> hm.
>
>>> ---
>>> lib/idr.c | 16 +++++++++++++++-
>>> 1 files changed, 15 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/lib/idr.c b/lib/idr.c
>>> index 1c4f928..69c3455 100644
>>> --- a/lib/idr.c
>>> +++ b/lib/idr.c
>>> @@ -65,6 +65,20 @@ static inline void free_layer(struct idr_layer *p)
>>> call_rcu(&p->rcu_head, idr_layer_rcu_free);
>>> }
>>>
>>> +static void idr_layer_rcu_free_zero(struct rcu_head *head)
>>> +{
>>> + struct idr_layer *layer;
>>> +
>>> + layer = container_of(head, struct idr_layer, rcu_head);
>>> + memset(layer, 0, sizeof(struct idr_layer));
>>> + kmem_cache_free(idr_layer_cache, layer);
>>> +}
>>> +
>>> +static inline void free_layer_zero(struct idr_layer *p)
>>> +{
>>> + call_rcu(&p->rcu_head, idr_layer_rcu_free_zero);
>>> +}
>>> +
>>> /* only called when idp->lock is held */
>>> static void __move_to_free_list(struct idr *idp, struct idr_layer *p)
>>> {
>>> @@ -462,7 +476,7 @@ void idr_remove_all(struct idr *idp)
>>> id += 1 << n;
>>> while (n < fls(id)) {
>>> if (p)
>>> - free_layer(p);
>>> + free_layer_zero(p);
>>> n += IDR_BITS;
>>> p = *--paa;
>>> }
>> Nadia,
>>
>> it appears as if post-2.6.26 commit
>> cf481c20c476ad2c0febdace9ce23f5a4db19582 "idr: make idr_remove rcu-safe"
>> was buggy as it removed a memset(...0...) from idr_remove_all() without
>> any obvious replacement. And this patch fixes it. Is this correct?
>>
>> This was observed by David in Fedora 2.6.27.* kernels and in 2.6.28, and
>> I have it seen in vanilla 2.6.28 --- but only after I disabled some
>> debug kconfig options. The trigger for the bug is not the existing
>> usage of idr in drivers/firewire/, but a new usage which is not yet in
>> mainline. More details:
>> http://marc.info/?l=linux1394-devel&m=123140439522563
>>
>> The symptom is that after a few destructions of idr trees (which involve
>> idr_remove_all() of course), there appear spurious idr entries in
>> subsequently newly created idr trees. These spurious entries then crash
>> the driver when it iterates over them.
...
> Are we sure that all the other callers of free_layer() are freeing
> zeroed objects?
>
> It would be cleaner, safer and quite possibly faster to remove the
> constructor altogether and use kmem_cache_zalloc() to allocate new
> objects.
Yes, it sounds at least safer if the allocation path should be fixed up.
The zeroing was done in idr_remove_all() though since Kristian added it
in 2.6.23, until 2.6.26 inclusive.
Kristian, was there a deeper reason to do it at deallocation instead of
allocation, and does the reason still apply today?
--
Stefan Richter
-=====-==--= ---= -=-=-
http://arcgraph.de/sr/
On Sat, 2009-01-10 at 11:05 +0100, Stefan Richter wrote:
> Andrew Morton wrote:
...
> > Are we sure that all the other callers of free_layer() are freeing
> > zeroed objects?
> >
> > It would be cleaner, safer and quite possibly faster to remove the
> > constructor altogether and use kmem_cache_zalloc() to allocate new
> > objects.
>
> Yes, it sounds at least safer if the allocation path should be fixed up.
>
> The zeroing was done in idr_remove_all() though since Kristian added it
> in 2.6.23, until 2.6.26 inclusive.
>
> Kristian, was there a deeper reason to do it at deallocation instead of
> allocation, and does the reason still apply today?
Oh wow, that's a long time ago... The idr implementation caches unused,
zeroed out idr_layers in a free list in the idr struct for later use.
If no layers are in the cache, the idr_pre_get() function will allocate
one from the kmem_cache, which will zero out the layer in the ctor, and
then add it to the idr free list. There are two other ways a layer can
go back to the free list: 1) when we free up a layer by freeing the
entries one by one, in which case the layer is already zeroed out, or in
idr_remove_all(), where we just zero it out brute force. The problem
isn't about returning un-zeroed-out objects to the kmem cache, the
problem is returning them to the idr free list.
As far as I know the kmem_cache allocator is plenty fast and we could
just drop the entire free list and allocate out of the kmem cache
everytime in idr_pre_alloc(). If we're doing that, we should really,
please, just drop the stupid idr_pre_get() then idr_get_new() and retry
if fail scheme. Every idr use I've seen could just do the whole thing
under a mutex and not worry about the awkward retry idea. We don't have
to break API, we can just add a new function idr_alloc_new(idr, ptr, id,
gfp) that just does idr_pre_alloc() and then idr_get_new() under the
assumption that the client has taken the required mutex. If the client
protects access to the idr using a mutex or spinlock, we can do
idr_pre_get() and idr_get_new() in succession without anybody else
grabbing that new layer from under us in the meantime. Same thing for
idr_get_new_above(), of course. Anyways... :)
cheers,
Kristian
Kristian H?gsberg wrote:
> The problem
> isn't about returning un-zeroed-out objects to the kmem cache, the
> problem is returning them to the idr free list.
>
I think this is wrong:
The slab allocator assumes that the objects that are given to
kmem_cache_free() are properly constructed.
I.e.: No additional constructor is called prior to returning the object
from the next kmem_cache_alloc() call.
> Every idr use I've seen could just do the whole thing
> under a mutex and not worry about the awkward retry idea.
Unfortunately there are some users that do idr_get_new() within a spinlock.
e.g. from drivers/gpu/drm/drm_gem.c:
> if (idr_pre_get(&file_priv->object_idr, GFP_KERNEL) == 0)
> return -ENOMEM;
>
> /* do the allocation under our spinlock */
> spin_lock(&file_priv->table_lock);
> ret = idr_get_new_above(&file_priv->object_idr, obj, 1, handlep);
> spin_unlock(&file_priv->table_lock);
:-(
--
Manfred
On Mon, 2009-01-12 at 20:53 +0100, Manfred Spraul wrote:
> Kristian Høgsberg wrote:
> > The problem
> > isn't about returning un-zeroed-out objects to the kmem cache, the
> > problem is returning them to the idr free list.
> >
> I think this is wrong:
> The slab allocator assumes that the objects that are given to
> kmem_cache_free() are properly constructed.
> I.e.: No additional constructor is called prior to returning the object
> from the next kmem_cache_alloc() call.
That's fine, the ctor associated with the kmem cache is called, and in
the case of idr, it does a memset().
> > Every idr use I've seen could just do the whole thing
> > under a mutex and not worry about the awkward retry idea.
> Unfortunately there are some users that do idr_get_new() within a spinlock.
> e.g. from drivers/gpu/drm/drm_gem.c:
> > if (idr_pre_get(&file_priv->object_idr, GFP_KERNEL) == 0)
> > return -ENOMEM;
> >
> > /* do the allocation under our spinlock */
> > spin_lock(&file_priv->table_lock);
> > ret = idr_get_new_above(&file_priv->object_idr, obj, 1, handlep);
> > spin_unlock(&file_priv->table_lock);
> :-(
That's valid use of the idr interface, idr_get_new_above() won't block
or allocate, just return -EAGAIN if the idr_layer struct allocated by
idr_pre_get() got snatched up in between the two calls. My complaint
was that in many cases you don't need to access the idr from interrupt
context and you can then just put the idr_pre_get() and
idr_get_new_above() under one big mutex. Even so, many people just
copy-n-paste the boiler plate code that does the
idr_pre_get()/idr_get_new_above()/if(EAGAIN) goto retry sequence.
cheers,
Kristian
Kristian Høgsberg wrote:
> On Mon, 2009-01-12 at 20:53 +0100, Manfred Spraul wrote:
>
>> Kristian Høgsberg wrote:
>>
>>> The problem
>>> isn't about returning un-zeroed-out objects to the kmem cache, the
>>> problem is returning them to the idr free list.
>>>
>>>
>> I think this is wrong:
>> The slab allocator assumes that the objects that are given to
>> kmem_cache_free() are properly constructed.
>> I.e.: No additional constructor is called prior to returning the object
>> from the next kmem_cache_alloc() call.
>>
>
> That's fine, the ctor associated with the kmem cache is called, and in
> the case of idr, it does a memset().
>
No.
As I said, the construtor is not called.
An object that is given to kmem_cache_free() must be properly constructed.
kmem_cache_free() just adds the obj pointer to a list, the next
kmem_cache_alloc returns the pointer.
This is also documented in mm/slab.c:
* The memory is organized in caches, one cache for each object type.
* (e.g. inode_cache, dentry_cache, buffer_head, vm_area_struct)
* Each cache consists out of many slabs (they are small (usually one
* page long) and always contiguous), and each slab contains multiple
* initialized objects.
*
* This means, that your constructor is used only for newly allocated
* slabs and you must pass objects with the same initializations to
* kmem_cache_free.
*
If the idr code passes uninitialized objects to kmem_cache_free(), then
the next kmem_cache_alloc will return a bad object.
--
Manfred
On Mon, 12 Jan 2009 21:50:36 +0100
Manfred Spraul <[email protected]> wrote:
> Kristian H__gsberg wrote:
> > On Mon, 2009-01-12 at 20:53 +0100, Manfred Spraul wrote:
> >
> >> Kristian H__gsberg wrote:
> >>
> >>> The problem
> >>> isn't about returning un-zeroed-out objects to the kmem cache, the
> >>> problem is returning them to the idr free list.
> >>>
> >>>
> >> I think this is wrong:
> >> The slab allocator assumes that the objects that are given to
> >> kmem_cache_free() are properly constructed.
> >> I.e.: No additional constructor is called prior to returning the object
> >> from the next kmem_cache_alloc() call.
> >>
> >
> > That's fine, the ctor associated with the kmem cache is called, and in
> > the case of idr, it does a memset().
> >
> No.
> As I said, the construtor is not called.
> An object that is given to kmem_cache_free() must be properly constructed.
> kmem_cache_free() just adds the obj pointer to a list, the next
> kmem_cache_alloc returns the pointer.
>
> This is also documented in mm/slab.c:
> * The memory is organized in caches, one cache for each object type.
> * (e.g. inode_cache, dentry_cache, buffer_head, vm_area_struct)
> * Each cache consists out of many slabs (they are small (usually one
> * page long) and always contiguous), and each slab contains multiple
> * initialized objects.
> *
> * This means, that your constructor is used only for newly allocated
> * slabs and you must pass objects with the same initializations to
> * kmem_cache_free.
> *
>
> If the idr code passes uninitialized objects to kmem_cache_free(), then
> the next kmem_cache_alloc will return a bad object.
>
None of this got us much closer to fixing the bug ;)
What do we think of just removing the constructor and using
kmem_cache_zalloc()?
--- a/lib/idr.c~a
+++ a/lib/idr.c
@@ -121,7 +121,7 @@ int idr_pre_get(struct idr *idp, gfp_t g
{
while (idp->id_free_cnt < IDR_FREE_MAX) {
struct idr_layer *new;
- new = kmem_cache_alloc(idr_layer_cache, gfp_mask);
+ new = kmem_cache_zalloc(idr_layer_cache, gfp_mask);
if (new == NULL)
return (0);
move_to_free_list(idp, new);
@@ -623,16 +623,10 @@ void *idr_replace(struct idr *idp, void
}
EXPORT_SYMBOL(idr_replace);
-static void idr_cache_ctor(void *idr_layer)
-{
- memset(idr_layer, 0, sizeof(struct idr_layer));
-}
-
void __init idr_init_cache(void)
{
idr_layer_cache = kmem_cache_create("idr_layer_cache",
- sizeof(struct idr_layer), 0, SLAB_PANIC,
- idr_cache_ctor);
+ sizeof(struct idr_layer), 0, SLAB_PANIC, NULL);
}
/**
_
On Tue, 2009-01-13 at 14:48 -0800, Andrew Morton wrote:
> None of this got us much closer to fixing the bug ;)
>
> What do we think of just removing the constructor and using
> kmem_cache_zalloc()?
>
Yes, I agree that using kmem_cache_zalloc would be a good solution.
-David
On Wed, Jan 14, 2009 at 12:48 AM, Andrew Morton
<[email protected]> wrote:
>> If the idr code passes uninitialized objects to kmem_cache_free(), then
>> the next kmem_cache_alloc will return a bad object.
>>
>
> None of this got us much closer to fixing the bug ;)
>
> What do we think of just removing the constructor and using
> kmem_cache_zalloc()?
Why do I get the feeling that we have merged a similar patch before?
Acked-by: Pekka Enberg <[email protected]>
>
> --- a/lib/idr.c~a
> +++ a/lib/idr.c
> @@ -121,7 +121,7 @@ int idr_pre_get(struct idr *idp, gfp_t g
> {
> while (idp->id_free_cnt < IDR_FREE_MAX) {
> struct idr_layer *new;
> - new = kmem_cache_alloc(idr_layer_cache, gfp_mask);
> + new = kmem_cache_zalloc(idr_layer_cache, gfp_mask);
> if (new == NULL)
> return (0);
> move_to_free_list(idp, new);
> @@ -623,16 +623,10 @@ void *idr_replace(struct idr *idp, void
> }
> EXPORT_SYMBOL(idr_replace);
>
> -static void idr_cache_ctor(void *idr_layer)
> -{
> - memset(idr_layer, 0, sizeof(struct idr_layer));
> -}
> -
> void __init idr_init_cache(void)
> {
> idr_layer_cache = kmem_cache_create("idr_layer_cache",
> - sizeof(struct idr_layer), 0, SLAB_PANIC,
> - idr_cache_ctor);
> + sizeof(struct idr_layer), 0, SLAB_PANIC, NULL);
> }
>
> /**
> _
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
On Wed, 14 Jan 2009 09:19:01 +0200 "Pekka Enberg" <[email protected]> wrote:
> On Wed, Jan 14, 2009 at 12:48 AM, Andrew Morton
> <[email protected]> wrote:
> >> If the idr code passes uninitialized objects to kmem_cache_free(), then
> >> the next kmem_cache_alloc will return a bad object.
> >>
> >
> > None of this got us much closer to fixing the bug ;)
> >
> > What do we think of just removing the constructor and using
> > kmem_cache_zalloc()?
>
> Why do I get the feeling that we have merged a similar patch before?
Dunno - maybe we had the same bug in other places.
> Acked-by: Pekka Enberg <[email protected]>
Here 'tis:
From: Andrew Morton <[email protected]>
David points out that the idr_remove_all() function returns unused slabs
to the kmem cache, but needs to zero them first or else they will be
uninitialized upon next use. This causes crashes which have been observed
in the firewire subsystem.
He fixed this by zeroing the object before freeing it in idr_remove_all().
But we agree that simply removing the constructor and zeroing the object
at allocation time is simpler than relying upon slab constructor machinery
and might even be faster.
This problem was introduced by
commit cf481c20c476ad2c0febdace9ce23f5a4db19582
Author: Nadia Derbey <[email protected]>
AuthorDate: Fri Jul 25 01:48:02 2008 -0700
Commit: Linus Torvalds <[email protected]>
CommitDate: Fri Jul 25 10:53:42 2008 -0700
idr: make idr_remove rcu-safe
which was first released in 2.6.27.
There are no known codesites which trigger this bug in 2.6.27 or 2.6.28.
The post-2.6.28 firewire changes are the only known triggerer.
There might of course be not-yet-discovered triggerers in 2.6.27 and
2.6.28, and there might be out-of-tree triggerers which are added to those
kernel versions. I'll let the -stable guys decide whether they want to
backport this fix.
Reported-by: David Moore <[email protected]>
Cc: Stefan Richter <[email protected]>
Cc: Nadia Derbey <[email protected]>
Cc: Paul E. McKenney <[email protected]>
Cc: Manfred Spraul <[email protected]>
Cc: Kristian Hgsberg <[email protected]>
Acked-by: Pekka Enberg <[email protected]>
Cc: <[email protected]>
Signed-off-by: Andrew Morton <[email protected]>
---
lib/idr.c | 10 ++--------
1 file changed, 2 insertions(+), 8 deletions(-)
diff -puN lib/idr.c~lib-idrc-use-kmem_cache_zalloc-for-the-idr_layer-cache lib/idr.c
--- a/lib/idr.c~lib-idrc-use-kmem_cache_zalloc-for-the-idr_layer-cache
+++ a/lib/idr.c
@@ -121,7 +121,7 @@ int idr_pre_get(struct idr *idp, gfp_t g
{
while (idp->id_free_cnt < IDR_FREE_MAX) {
struct idr_layer *new;
- new = kmem_cache_alloc(idr_layer_cache, gfp_mask);
+ new = kmem_cache_zalloc(idr_layer_cache, gfp_mask);
if (new == NULL)
return (0);
move_to_free_list(idp, new);
@@ -623,16 +623,10 @@ void *idr_replace(struct idr *idp, void
}
EXPORT_SYMBOL(idr_replace);
-static void idr_cache_ctor(void *idr_layer)
-{
- memset(idr_layer, 0, sizeof(struct idr_layer));
-}
-
void __init idr_init_cache(void)
{
idr_layer_cache = kmem_cache_create("idr_layer_cache",
- sizeof(struct idr_layer), 0, SLAB_PANIC,
- idr_cache_ctor);
+ sizeof(struct idr_layer), 0, SLAB_PANIC, NULL);
}
/**
_
Andrew Morton wrote:
> This problem was introduced by
>
> commit cf481c20c476ad2c0febdace9ce23f5a4db19582
...
> which was first released in 2.6.27.
>
> There are no known codesites which trigger this bug in 2.6.27 or 2.6.28.
> The post-2.6.28 firewire changes are the only known triggerer.
(They became post-2.6.29 changes since I missed my chance by two days or
so.)
> There might of course be not-yet-discovered triggerers in 2.6.27 and
> 2.6.28, and there might be out-of-tree triggerers which are added to those
> kernel versions. I'll let the -stable guys decide whether they want to
> backport this fix.
I vote for the cf481c20c breakage be fixed in 2.6.27.y and 2.6.28.y.
Either by your patch or by something equivalent.
*Every* call to idr_remove_all() will add unclean idr_layers to the
pool. There may be more actual occurrences of this than what was found
so far because
- the results may be kernel panics without traces, or with traces that
don't point to an idr problem so clearly,
- one or another kernel debugging build option prevents this bug from
happening. A kernel developer who does most of his tests with debug
options enabled won't notice.
...
> diff -puN lib/idr.c~lib-idrc-use-kmem_cache_zalloc-for-the-idr_layer-cache lib/idr.c
> --- a/lib/idr.c~lib-idrc-use-kmem_cache_zalloc-for-the-idr_layer-cache
> +++ a/lib/idr.c
> @@ -121,7 +121,7 @@ int idr_pre_get(struct idr *idp, gfp_t g
> {
> while (idp->id_free_cnt < IDR_FREE_MAX) {
> struct idr_layer *new;
> - new = kmem_cache_alloc(idr_layer_cache, gfp_mask);
> + new = kmem_cache_zalloc(idr_layer_cache, gfp_mask);
> if (new == NULL)
> return (0);
> move_to_free_list(idp, new);
...
I wonder if it would be more robust --- or even necessary --- to instead
add proper initialization code to get_from_free_list().
As far as David and I tested the new idr using code in firewire, we
called idr_remove_all() *and* idr_destroy() before any subsequent
idr_get_new(). But in practice, idr_get_new() may of course also happen
between idr_remove_all() and idr_destroy().
And then this fix won't be sufficient, would it?
--
Stefan Richter
-=====-==--= ---= -===-
http://arcgraph.de/sr/
On Wed, Jan 14, 2009 at 12:48 AM, Andrew Morton
<[email protected]> wrote:
>> >> If the idr code passes uninitialized objects to kmem_cache_free(), then
>> >> the next kmem_cache_alloc will return a bad object.
>> >>
>> >
>> > None of this got us much closer to fixing the bug ;)
>> >
>> > What do we think of just removing the constructor and using
>> > kmem_cache_zalloc()?
On Wed, 14 Jan 2009 09:19:01 +0200 "Pekka Enberg"
<[email protected]> wrote:
>> Why do I get the feeling that we have merged a similar patch before?
On Wed, Jan 14, 2009 at 10:17 AM, Andrew Morton
<[email protected]> wrote:
> Dunno - maybe we had the same bug in other places.
It's probably commit 571817849c76aabf34d534c905b5e604f2e824c5 ("msi:
use kmem_cache_zalloc()"). But the changelog is bit, uhm, limited on
the subject... oh well.
Pekka
On Wed, 14 Jan 2009 09:59:07 +0100 Stefan Richter <[email protected]> wrote:
> > --- a/lib/idr.c~lib-idrc-use-kmem_cache_zalloc-for-the-idr_layer-cache
> > +++ a/lib/idr.c
> > @@ -121,7 +121,7 @@ int idr_pre_get(struct idr *idp, gfp_t g
> > {
> > while (idp->id_free_cnt < IDR_FREE_MAX) {
> > struct idr_layer *new;
> > - new = kmem_cache_alloc(idr_layer_cache, gfp_mask);
> > + new = kmem_cache_zalloc(idr_layer_cache, gfp_mask);
> > if (new == NULL)
> > return (0);
> > move_to_free_list(idp, new);
> ...
>
> I wonder if it would be more robust --- or even necessary --- to instead
> add proper initialization code to get_from_free_list().
>
> As far as David and I tested the new idr using code in firewire, we
> called idr_remove_all() *and* idr_destroy() before any subsequent
> idr_get_new(). But in practice, idr_get_new() may of course also happen
> between idr_remove_all() and idr_destroy().
>
> And then this fix won't be sufficient, would it?
Maybe I'm having a thick day, but I'm not following you at all here.
What do you think the remaining problem is? get_from_free_list()
starts out with a not-fully-zeroed object? Something else?
Andrew Morton wrote:
> On Wed, 14 Jan 2009 09:59:07 +0100 Stefan Richter <[email protected]> wrote:
>
>> > --- a/lib/idr.c~lib-idrc-use-kmem_cache_zalloc-for-the-idr_layer-cache
>> > +++ a/lib/idr.c
>> > @@ -121,7 +121,7 @@ int idr_pre_get(struct idr *idp, gfp_t g
>> > {
>> > while (idp->id_free_cnt < IDR_FREE_MAX) {
>> > struct idr_layer *new;
>> > - new = kmem_cache_alloc(idr_layer_cache, gfp_mask);
>> > + new = kmem_cache_zalloc(idr_layer_cache, gfp_mask);
>> > if (new == NULL)
>> > return (0);
>> > move_to_free_list(idp, new);
>> ...
>>
>> I wonder if it would be more robust --- or even necessary --- to instead
>> add proper initialization code to get_from_free_list().
>>
>> As far as David and I tested the new idr using code in firewire, we
>> called idr_remove_all() *and* idr_destroy() before any subsequent
>> idr_get_new(). But in practice, idr_get_new() may of course also happen
>> between idr_remove_all() and idr_destroy().
>>
>> And then this fix won't be sufficient, would it?
>
> Maybe I'm having a thick day, but I'm not following you at all here.
>
> What do you think the remaining problem is? get_from_free_list()
> starts out with a not-fully-zeroed object? Something else?
AFAICS:
Before the faulty commit, all code sites which moved something into the
per-idr free list cleared the respective idr_layer.
After the faulty commit, idr_remove_all() forgot to do so.
After your patch, idr_remove_all() still doesn't clear anything. But
there will typically be idr_destroy() called right after an
idr_remove_all(). idr_destroy() moves all idr_layers out of this idr's
free list back into the kernel-global idr_layer_cache.
Your fix only clears idr_layers which an idr_get_new*() pulls out of the
idr_layer_cache, but not any one which it pulls out of the idr's own
free list.
Hmm, OK. Seems to be harmless as long as idr users don't start to add
new entries to an idr after they did idr_remove_all().
I presume there is none such user, is there? Such a usage would be
unusual but not illegal, I suppose.
(Furthermore, I may by thoroughly mistaken about how lib/idr.c works.)
--
Stefan Richter
-=====-==--= ---= -===-
http://arcgraph.de/sr/
I wrote:
> Your fix only clears idr_layers which an idr_get_new*() pulls out of the
> idr_layer_cache, but not any one which it pulls out of the idr's own
> free list.
>
> Hmm, OK. Seems to be harmless as long as idr users don't start to add
> new entries to an idr after they did idr_remove_all()
...on this very same idr instance.
--
Stefan Richter
-=====-==--= ---= -===-
http://arcgraph.de/sr/
On Tue, 2009-01-13 at 14:48 -0800, Andrew Morton wrote:
> On Mon, 12 Jan 2009 21:50:36 +0100
> Manfred Spraul <[email protected]> wrote:
>
> > Kristian H__gsberg wrote:
> > > On Mon, 2009-01-12 at 20:53 +0100, Manfred Spraul wrote:
> > >
> > >> Kristian H__gsberg wrote:
> > >>
> > >>> The problem
> > >>> isn't about returning un-zeroed-out objects to the kmem cache, the
> > >>> problem is returning them to the idr free list.
> > >>>
> > >>>
> > >> I think this is wrong:
> > >> The slab allocator assumes that the objects that are given to
> > >> kmem_cache_free() are properly constructed.
> > >> I.e.: No additional constructor is called prior to returning the object
> > >> from the next kmem_cache_alloc() call.
> > >>
> > >
> > > That's fine, the ctor associated with the kmem cache is called, and in
> > > the case of idr, it does a memset().
> > >
> > No.
> > As I said, the construtor is not called.
> > An object that is given to kmem_cache_free() must be properly constructed.
> > kmem_cache_free() just adds the obj pointer to a list, the next
> > kmem_cache_alloc returns the pointer.
> >
> > This is also documented in mm/slab.c:
> > * The memory is organized in caches, one cache for each object type.
> > * (e.g. inode_cache, dentry_cache, buffer_head, vm_area_struct)
> > * Each cache consists out of many slabs (they are small (usually one
> > * page long) and always contiguous), and each slab contains multiple
> > * initialized objects.
> > *
> > * This means, that your constructor is used only for newly allocated
> > * slabs and you must pass objects with the same initializations to
> > * kmem_cache_free.
> > *
> >
> > If the idr code passes uninitialized objects to kmem_cache_free(), then
> > the next kmem_cache_alloc will return a bad object.
> >
>
> None of this got us much closer to fixing the bug ;)
>
> What do we think of just removing the constructor and using
> kmem_cache_zalloc()?
We still need to zero out the idr_layer before returning it to the idr's
internal free list. I had a memset(p, 0, sizeof *p) just before the
free_layer() call in idr_remove_all() in the original version. I don't
know why that was taken out (some rcu thing?) but if you're looking for
a minimal change to just fix the bug, just put that back in. Or maybe
it now needs to be in idr_layer_rcu_free(), I never really figured out
how rcu works.
> --- a/lib/idr.c~a
> +++ a/lib/idr.c
> @@ -121,7 +121,7 @@ int idr_pre_get(struct idr *idp, gfp_t g
> {
> while (idp->id_free_cnt < IDR_FREE_MAX) {
> struct idr_layer *new;
> - new = kmem_cache_alloc(idr_layer_cache, gfp_mask);
> + new = kmem_cache_zalloc(idr_layer_cache, gfp_mask);
> if (new == NULL)
> return (0);
> move_to_free_list(idp, new);
> @@ -623,16 +623,10 @@ void *idr_replace(struct idr *idp, void
> }
> EXPORT_SYMBOL(idr_replace);
>
> -static void idr_cache_ctor(void *idr_layer)
> -{
> - memset(idr_layer, 0, sizeof(struct idr_layer));
> -}
> -
> void __init idr_init_cache(void)
> {
> idr_layer_cache = kmem_cache_create("idr_layer_cache",
> - sizeof(struct idr_layer), 0, SLAB_PANIC,
> - idr_cache_ctor);
> + sizeof(struct idr_layer), 0, SLAB_PANIC, NULL);
> }
>
> /**
> _
>
Kristian H?gsberg wrote:
> On Tue, 2009-01-13 at 14:48 -0800, Andrew Morton wrote:
>> What do we think of just removing the constructor and using
>> kmem_cache_zalloc()?
>
> We still need to zero out the idr_layer before returning it to the idr's
> internal free list.
I think so too. But a more robust solution would IMO be to initialize
an idr_layer /before/ use, not /after/ use. Will send a patch later.
--
Stefan Richter
-=====-==--= ---= -===-
http://arcgraph.de/sr/
On Wed, 2009-01-14 at 17:21 +0100, Stefan Richter wrote:
> Kristian Høgsberg wrote:
> > On Tue, 2009-01-13 at 14:48 -0800, Andrew Morton wrote:
> >> What do we think of just removing the constructor and using
> >> kmem_cache_zalloc()?
> >
> > We still need to zero out the idr_layer before returning it to the idr's
> > internal free list.
>
> I think so too. But a more robust solution would IMO be to initialize
> an idr_layer /before/ use, not /after/ use. Will send a patch later.
The reason it was done this way is that normally, when the layers are
returned to the free list they're already zero'ed out (since their
elements have been removed one by one), so no need to do this on later
re-initialization or when freeing. It's a silly little
sup-optimization. idr_remove_all() is different in that it doesn't
incrementally zero out the layer, and so it has to do it using a memset.
If you want rework how this works, I'd suggest just not using the free
list except for in the idr_pre_get()+idr_get() sequence. When a layer
is no longer used, just free it, don't put it back on the free list.
And use kmem_cache_zalloc() in idr_pre_get() as Andrew suggests.
cheers,
Kristian
Kristian H?gsberg wrote:
> On Wed, 2009-01-14 at 17:21 +0100, Stefan Richter wrote:
>> Kristian H?gsberg wrote:
>>> On Tue, 2009-01-13 at 14:48 -0800, Andrew Morton wrote:
>>>> What do we think of just removing the constructor and using
>>>> kmem_cache_zalloc()?
>>> We still need to zero out the idr_layer before returning it to the idr's
>>> internal free list.
>> I think so too. But a more robust solution would IMO be to initialize
>> an idr_layer /before/ use, not /after/ use. Will send a patch later.
>
> The reason it was done this way is that normally, when the layers are
> returned to the free list they're already zero'ed out (since their
> elements have been removed one by one), so no need to do this on later
> re-initialization or when freeing. It's a silly little
> sup-optimization.
Ah, right.
> idr_remove_all() is different in that it doesn't
> incrementally zero out the layer, and so it has to do it using a memset.
>
> If you want rework how this works, I'd suggest just not using the free
> list except for in the idr_pre_get()+idr_get() sequence. When a layer
> is no longer used, just free it, don't put it back on the free list.
> And use kmem_cache_zalloc() in idr_pre_get() as Andrew suggests.
Wait a moment: Nadia's change (which introduced the bug) also already
implements your suggestion. idr_remove_all feeds to the kmem cache now,
not to the free list anymore.
So, Andrew, I take back my assertion that the sequence
idr_remove_all(idr);
if (idr_pre_get(idr, GFP_KERNEL))
id = idr_get_new(idr, p, h);
would be unsafe. Your fix has this covered as well.
--
Stefan Richter
-=====-==--= ---= -===-
http://arcgraph.de/sr/