It needed for crypto.ko
Signed-off-by: Kirill A. Shutemov <[email protected]>
---
mm/slab.c | 1 +
mm/slob.c | 1 +
mm/slub.c | 1 +
3 files changed, 3 insertions(+), 0 deletions(-)
diff --git a/mm/slab.c b/mm/slab.c
index ddc41f3..4d00855 100644
--- a/mm/slab.c
+++ b/mm/slab.c
@@ -4457,3 +4457,4 @@ size_t ksize(const void *objp)
return obj_size(virt_to_cache(objp));
}
+EXPORT_SYMBOL(ksize);
diff --git a/mm/slob.c b/mm/slob.c
index bf7e8fc..52bc8a2 100644
--- a/mm/slob.c
+++ b/mm/slob.c
@@ -521,6 +521,7 @@ size_t ksize(const void *block)
} else
return sp->page.private;
}
+EXPORT_SYMBOL(ksize);
struct kmem_cache {
unsigned int size, align;
diff --git a/mm/slub.c b/mm/slub.c
index bdc9abb..0280eee 100644
--- a/mm/slub.c
+++ b/mm/slub.c
@@ -2736,6 +2736,7 @@ size_t ksize(const void *object)
*/
return s->size;
}
+EXPORT_SYMBOL(ksize);
void kfree(const void *x)
{
--
1.6.1.3.GIT
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to [email protected]. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"[email protected]"> [email protected] </a>
On Tue, Feb 10, 2009 at 3:21 PM, Kirill A. Shutemov
<[email protected]> wrote:
> It needed for crypto.ko
>
> Signed-off-by: Kirill A. Shutemov <[email protected]>
That's bit terse for a changelog. I did a quick grep but wasn't able
to find the offending call-site. Where is it?
We unexported ksize() because it's a problematic interface and you
almost certainly want to use the alternatives (e.g. krealloc). I think
I need bit more convincing to apply this patch...
Pekka
On Tue, Feb 10, 2009 at 03:35:03PM +0200, Pekka Enberg wrote:
> On Tue, Feb 10, 2009 at 3:21 PM, Kirill A. Shutemov
> <[email protected]> wrote:
> > It needed for crypto.ko
> >
> > Signed-off-by: Kirill A. Shutemov <[email protected]>
>
> That's bit terse for a changelog. I did a quick grep but wasn't able
> to find the offending call-site. Where is it?
Commit 7b2cd92a in lastest Linus's git.
>
> We unexported ksize() because it's a problematic interface and you
> almost certainly want to use the alternatives (e.g. krealloc). I think
> I need bit more convincing to apply this patch...
It just a quick fix. If anybody knows better solution, I have no
objections.
--
Regards, Kirill A. Shutemov
+ Belarus, Minsk
+ ALT Linux Team, http://www.altlinux.org/
On Tue, Feb 10, 2009 at 03:35:03PM +0200, Pekka Enberg wrote:
> > We unexported ksize() because it's a problematic interface and you
> > almost certainly want to use the alternatives (e.g. krealloc). I think
> > I need bit more convincing to apply this patch...
On Tue, 10 Feb 2009, Kirill A. Shutemov wrote:
> It just a quick fix. If anybody knows better solution, I have no
> objections.
Herbert, what do you think of this (untested) patch? Alternatively, we
could do something like kfree_secure() but it seems overkill for this one
call-site.
Pekka
>From 7dddc378c19a6d203f147e503d5254df34eda6ae Mon Sep 17 00:00:00 2001
From: Pekka Enberg <[email protected]>
Date: Tue, 10 Feb 2009 16:01:27 +0200
Subject: [PATCH] crypto: api - don't use ksize()
The ksize() function is not exported to modules because it has non-standard
behavour across different slab allocators. While it is technically ok for this
particular call-site, we do not want the use of the API to spread so fix it up
by adding a ->size member to struct crypto_tfm and use that instead.
Cc: Geert Uytterhoeven <[email protected]>
Cc: Herbert Xu <[email protected]>
Signed-off-by: Pekka Enberg <[email protected]>
---
crypto/api.c | 6 +++---
include/linux/crypto.h | 2 ++
2 files changed, 5 insertions(+), 3 deletions(-)
diff --git a/crypto/api.c b/crypto/api.c
index efe77df..3526cc0 100644
--- a/crypto/api.c
+++ b/crypto/api.c
@@ -374,6 +374,7 @@ struct crypto_tfm *__crypto_alloc_tfm(struct crypto_alg *alg, u32 type,
goto out_err;
tfm->__crt_alg = alg;
+ tfm->size = tfm_size;
err = crypto_init_ops(tfm, type, mask);
if (err)
@@ -471,6 +472,7 @@ struct crypto_tfm *crypto_create_tfm(struct crypto_alg *alg,
tfm = (struct crypto_tfm *)(mem + tfmsize);
tfm->__crt_alg = alg;
+ tfm->size = total;
err = frontend->init_tfm(tfm, frontend);
if (err)
@@ -569,19 +571,17 @@ EXPORT_SYMBOL_GPL(crypto_alloc_tfm);
void crypto_destroy_tfm(void *mem, struct crypto_tfm *tfm)
{
struct crypto_alg *alg;
- int size;
if (unlikely(!mem))
return;
alg = tfm->__crt_alg;
- size = ksize(mem);
if (!tfm->exit && alg->cra_exit)
alg->cra_exit(tfm);
crypto_exit_ops(tfm);
crypto_mod_put(alg);
- memset(mem, 0, size);
+ memset(mem, 0, tfm->size);
kfree(mem);
}
EXPORT_SYMBOL_GPL(crypto_destroy_tfm);
diff --git a/include/linux/crypto.h b/include/linux/crypto.h
index 1f2e902..4fd363a 100644
--- a/include/linux/crypto.h
+++ b/include/linux/crypto.h
@@ -487,6 +487,8 @@ struct crypto_tfm {
struct crypto_alg *__crt_alg;
+ size_t size;
+
void *__crt_ctx[] CRYPTO_MINALIGN_ATTR;
};
--
1.5.4.3
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to [email protected]. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"[email protected]"> [email protected] </a>
On Tue, Feb 10, 2009 at 04:06:53PM +0200, Pekka J Enberg wrote:
>
> Herbert, what do you think of this (untested) patch? Alternatively, we
> could do something like kfree_secure() but it seems overkill for this one
> call-site.
I don't understand why you want to limit the use of ksize. If
kmalloc is reserving more memory than what the user is asking for,
then we should let them know the exact amount so that the extra
bits aren't wasted.
The only way to do that is through ksize.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
On Tue, Feb 10, 2009 at 04:06:53PM +0200, Pekka J Enberg wrote:
> > Herbert, what do you think of this (untested) patch? Alternatively, we
> > could do something like kfree_secure() but it seems overkill for this one
> > call-site.
On Thu, 2009-02-12 at 18:43 +0800, Herbert Xu wrote:
> I don't understand why you want to limit the use of ksize.
Because the API was being widely abused in the nommu code, for example.
I'd rather not add it back for this special case which can be handled
otherwise.
Pekka
On Thu, Feb 12, 2009 at 12:45:21PM +0200, Pekka Enberg wrote:
>
> Because the API was being widely abused in the nommu code, for example.
> I'd rather not add it back for this special case which can be handled
> otherwise.
I'm sorry but that's like banning the use of heaters just because
they can abused and cause fires.
I think I've said this to you before but in networking we very much
want to use ksize because the standard case of a 1500-byte packet
has loads of extra room given by kmalloc which all goes to waste
right now.
If we could use ksize then we can stuff loads of metadata in that
space.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
On Thursday 12 February 2009 21:50:34 Herbert Xu wrote:
> On Thu, Feb 12, 2009 at 12:45:21PM +0200, Pekka Enberg wrote:
> > Because the API was being widely abused in the nommu code, for example.
> > I'd rather not add it back for this special case which can be handled
> > otherwise.
>
> I'm sorry but that's like banning the use of heaters just because
> they can abused and cause fires.
>
> I think I've said this to you before but in networking we very much
> want to use ksize because the standard case of a 1500-byte packet
> has loads of extra room given by kmalloc which all goes to waste
> right now.
I'm not against the idea of exporting ksize. It is a fairly well
commented function.
I'd be up for nearly anything in the slab layer that speeds up
networking, to be honest ;)
> If we could use ksize then we can stuff loads of metadata in that
> space.
I would be interested to know how that goes. You always have this
circular issue that if a little more space helps significantly, then
maybe it is a good idea to explicitly ask for those bytes. Of course
that larger allocation is also likely to have some slack bytes.
So the benefit you get from using these slack bytes has to be larger
than the cost of using ksize, but smaller than the cost of explicitly
asking for more bytes at alloc time. Interesting...
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to [email protected]. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"[email protected]"> [email protected] </a>
On Thu, Feb 12, 2009 at 12:45:21PM +0200, Pekka Enberg wrote:
> >
> > Because the API was being widely abused in the nommu code, for example.
> > I'd rather not add it back for this special case which can be handled
> > otherwise.
On Thu, 2009-02-12 at 18:50 +0800, Herbert Xu wrote:
> I'm sorry but that's like banning the use of heaters just because
> they can abused and cause fires.
>
> I think I've said this to you before but in networking we very much
> want to use ksize because the standard case of a 1500-byte packet
> has loads of extra room given by kmalloc which all goes to waste
> right now.
>
> If we could use ksize then we can stuff loads of metadata in that
> space.
OK, fair enough, I applied Kirill's patch. Thanks.
Pekka
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to [email protected]. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"[email protected]"> [email protected] </a>
On Fri, Feb 13, 2009 at 12:10:45AM +1100, Nick Piggin wrote:
>
> I would be interested to know how that goes. You always have this
> circular issue that if a little more space helps significantly, then
> maybe it is a good idea to explicitly ask for those bytes. Of course
> that larger allocation is also likely to have some slack bytes.
Well, the thing is we don't know apriori whether we need the
extra space. The idea is to use the extra space if available
to avoid reallocation when we hit things like IPsec.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
On Thu, Feb 12, 2009 at 05:55:04PM +0200, Pekka Enberg wrote:
>
> OK, fair enough, I applied Kirill's patch. Thanks.
Thanks Pekka!
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to [email protected]. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"[email protected]"> [email protected] </a>
On Fri, 2009-02-13 at 07:09 +0800, Herbert Xu wrote:
> On Fri, Feb 13, 2009 at 12:10:45AM +1100, Nick Piggin wrote:
> >
> > I would be interested to know how that goes. You always have this
> > circular issue that if a little more space helps significantly, then
> > maybe it is a good idea to explicitly ask for those bytes. Of course
> > that larger allocation is also likely to have some slack bytes.
>
> Well, the thing is we don't know apriori whether we need the
> extra space. The idea is to use the extra space if available
> to avoid reallocation when we hit things like IPsec.
I'm not entirely convinced by this argument. If you're concerned about
space rather than performance, then you want an allocator that doesn't
waste space in the first place and you don't try to do "sub-allocations"
by hand. If you're concerned about performance, you instead optimize
your allocator to be as fast as possible and again avoid conditional
branches for sub-allocations.
--
http://selenic.com : development and support for Mercurial and Linux
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to [email protected]. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"[email protected]"> [email protected] </a>
On Friday 13 February 2009 10:37:01 Matt Mackall wrote:
> On Fri, 2009-02-13 at 07:09 +0800, Herbert Xu wrote:
> > On Fri, Feb 13, 2009 at 12:10:45AM +1100, Nick Piggin wrote:
> > > I would be interested to know how that goes. You always have this
> > > circular issue that if a little more space helps significantly, then
> > > maybe it is a good idea to explicitly ask for those bytes. Of course
> > > that larger allocation is also likely to have some slack bytes.
> >
> > Well, the thing is we don't know apriori whether we need the
> > extra space. The idea is to use the extra space if available
> > to avoid reallocation when we hit things like IPsec.
>
> I'm not entirely convinced by this argument. If you're concerned about
> space rather than performance, then you want an allocator that doesn't
> waste space in the first place and you don't try to do "sub-allocations"
> by hand. If you're concerned about performance, you instead optimize
> your allocator to be as fast as possible and again avoid conditional
> branches for sub-allocations.
Well, my earlier reasoning is no longer so clear cut if eg. there
are common cases where no extra space is required, but rare cases
where extra space might be a big win if it eg avoids extra
alloc, copy, free or something.
Because even with performance oriented allocators, there is a non-zero
cost to explicitly asking for more memory -- queues tend to get smaller
at larger object sizes, and page allocation orders can increase. So if
it is very uncommon to need extra space you don't want to burden the
common case with it.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to [email protected]. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"[email protected]"> [email protected] </a>
On Fri, Feb 13, 2009 at 8:20 AM, Nick Piggin <[email protected]> wrote:
> On Friday 13 February 2009 10:37:01 Matt Mackall wrote:
>> On Fri, 2009-02-13 at 07:09 +0800, Herbert Xu wrote:
>> > On Fri, Feb 13, 2009 at 12:10:45AM +1100, Nick Piggin wrote:
>> > > I would be interested to know how that goes. You always have this
>> > > circular issue that if a little more space helps significantly, then
>> > > maybe it is a good idea to explicitly ask for those bytes. Of course
>> > > that larger allocation is also likely to have some slack bytes.
>> >
>> > Well, the thing is we don't know apriori whether we need the
>> > extra space. The idea is to use the extra space if available
>> > to avoid reallocation when we hit things like IPsec.
>>
>> I'm not entirely convinced by this argument. If you're concerned about
>> space rather than performance, then you want an allocator that doesn't
>> waste space in the first place and you don't try to do "sub-allocations"
>> by hand. If you're concerned about performance, you instead optimize
>> your allocator to be as fast as possible and again avoid conditional
>> branches for sub-allocations.
>
> Well, my earlier reasoning is no longer so clear cut if eg. there
> are common cases where no extra space is required, but rare cases
> where extra space might be a big win if it eg avoids extra
> alloc, copy, free or something.
>
> Because even with performance oriented allocators, there is a non-zero
> cost to explicitly asking for more memory -- queues tend to get smaller
> at larger object sizes, and page allocation orders can increase. So if
> it is very uncommon to need extra space you don't want to burden the
> common case with it.
My concern would be that such extra-space reuse would be a very
non-obvious performance hit if allocation patterns changed slightly.
If being able to use the extra space really is a noticeable "big win"
for the rare case, then minor changes to the memory allocator could
dramatically impact performance in a totally nondeterministic way. If
the change isn't performance-significant in the grand scheme of
things, then the use of ksize() would just be code obfuscation. On
the other hand if it *is* performance-significant, it should be
redesigned to be able to guarantee that the space is available when it
is needed.
Cheers,
Kyle Moffett
On Thu, 12 Feb 2009 17:55:04 +0200 Pekka Enberg <[email protected]> wrote:
> On Thu, Feb 12, 2009 at 12:45:21PM +0200, Pekka Enberg wrote:
> > >
> > > Because the API was being widely abused in the nommu code, for example.
> > > I'd rather not add it back for this special case which can be handled
> > > otherwise.
>
> On Thu, 2009-02-12 at 18:50 +0800, Herbert Xu wrote:
> > I'm sorry but that's like banning the use of heaters just because
> > they can abused and cause fires.
> >
> > I think I've said this to you before but in networking we very much
> > want to use ksize because the standard case of a 1500-byte packet
> > has loads of extra room given by kmalloc which all goes to waste
> > right now.
> >
> > If we could use ksize then we can stuff loads of metadata in that
> > space.
>
> OK, fair enough, I applied Kirill's patch. Thanks.
>
Could we please have more details regarding this:
> The ksize() function is not exported to modules because it has non-standard
> behavour across different slab allocators.
How does the behaviour differ? It this documented? Can we fix it?
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to [email protected]. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"[email protected]"> [email protected] </a>
On Sun, 2009-02-15 at 13:36 -0800, Andrew Morton wrote:
> On Thu, 12 Feb 2009 17:55:04 +0200 Pekka Enberg <[email protected]> wrote:
>
> > On Thu, Feb 12, 2009 at 12:45:21PM +0200, Pekka Enberg wrote:
> > > >
> > > > Because the API was being widely abused in the nommu code, for example.
> > > > I'd rather not add it back for this special case which can be handled
> > > > otherwise.
> >
> > On Thu, 2009-02-12 at 18:50 +0800, Herbert Xu wrote:
> > > I'm sorry but that's like banning the use of heaters just because
> > > they can abused and cause fires.
> > >
> > > I think I've said this to you before but in networking we very much
> > > want to use ksize because the standard case of a 1500-byte packet
> > > has loads of extra room given by kmalloc which all goes to waste
> > > right now.
> > >
> > > If we could use ksize then we can stuff loads of metadata in that
> > > space.
> >
> > OK, fair enough, I applied Kirill's patch. Thanks.
> >
>
> Could we please have more details regarding this:
>
> > The ksize() function is not exported to modules because it has non-standard
> > behavour across different slab allocators.
>
> How does the behaviour differ? It this documented? Can we fix it?
SLAB and SLUB support calling ksize() on objects returned by
kmem_cache_alloc.
SLOB only supports it on objects from kmalloc. This is because it does
not store any size or type information in kmem_cache_alloc'ed objects.
Instead, it infers them from the cache argument.
Ideally SLAB and SLUB would complain about using ksize inappropriately
when debugging was enabled.
--
http://selenic.com : development and support for Mercurial and Linux
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to [email protected]. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"[email protected]"> [email protected] </a>
On Sun, 15 Feb 2009 15:43:14 -0600 Matt Mackall <[email protected]> wrote:
> On Sun, 2009-02-15 at 13:36 -0800, Andrew Morton wrote:
> > On Thu, 12 Feb 2009 17:55:04 +0200 Pekka Enberg <[email protected]> wrote:
> >
> > > On Thu, Feb 12, 2009 at 12:45:21PM +0200, Pekka Enberg wrote:
> > > > >
> > > > > Because the API was being widely abused in the nommu code, for example.
> > > > > I'd rather not add it back for this special case which can be handled
> > > > > otherwise.
> > >
> > > On Thu, 2009-02-12 at 18:50 +0800, Herbert Xu wrote:
> > > > I'm sorry but that's like banning the use of heaters just because
> > > > they can abused and cause fires.
> > > >
> > > > I think I've said this to you before but in networking we very much
> > > > want to use ksize because the standard case of a 1500-byte packet
> > > > has loads of extra room given by kmalloc which all goes to waste
> > > > right now.
> > > >
> > > > If we could use ksize then we can stuff loads of metadata in that
> > > > space.
> > >
> > > OK, fair enough, I applied Kirill's patch. Thanks.
> > >
> >
> > Could we please have more details regarding this:
> >
> > > The ksize() function is not exported to modules because it has non-standard
> > > behavour across different slab allocators.
> >
> > How does the behaviour differ? It this documented? Can we fix it?
>
> SLAB and SLUB support calling ksize() on objects returned by
> kmem_cache_alloc.
>
> SLOB only supports it on objects from kmalloc. This is because it does
> not store any size or type information in kmem_cache_alloc'ed objects.
> Instead, it infers them from the cache argument.
OK. This is really bad, isn't it? People will write code which
happily works under slab and slub, only to have it crash for those small
number of people who (very much later) test with slob?
> Ideally SLAB and SLUB would complain about using ksize inappropriately
> when debugging was enabled.
>
OK, thanks.
Ideally we would support ksize() for both kmalloc() and
kmem_cache_alloc() memory across all implementations.
Could we change ksize()'s argument so that callers must provide the
cache pointer? Then for kmalloc() callers, provide a
kmem_cache *get_cache_for_kmalloc(size_t) function?
Or could we have separate interfaces:
size_t kmalloc_ksize(size_t kmalloced_size);
size_t kmem_cache_alloc_ksize(struct kmem_cache *cachep);
?
Gee this sucks. Biggest mistake I ever made. Are we working hard
enough to remove some of these sl?b implementations? Would it help if
I randomly deleted a couple?
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to [email protected]. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"[email protected]"> [email protected] </a>
On Sun, 2009-02-15 at 13:55 -0800, Andrew Morton wrote:
> On Sun, 15 Feb 2009 15:43:14 -0600 Matt Mackall <[email protected]> wrote:
>
> > On Sun, 2009-02-15 at 13:36 -0800, Andrew Morton wrote:
> > > On Thu, 12 Feb 2009 17:55:04 +0200 Pekka Enberg <[email protected]> wrote:
> > >
> > > > On Thu, Feb 12, 2009 at 12:45:21PM +0200, Pekka Enberg wrote:
> > > > > >
> > > > > > Because the API was being widely abused in the nommu code, for example.
> > > > > > I'd rather not add it back for this special case which can be handled
> > > > > > otherwise.
> > > >
> > > > On Thu, 2009-02-12 at 18:50 +0800, Herbert Xu wrote:
> > > > > I'm sorry but that's like banning the use of heaters just because
> > > > > they can abused and cause fires.
> > > > >
> > > > > I think I've said this to you before but in networking we very much
> > > > > want to use ksize because the standard case of a 1500-byte packet
> > > > > has loads of extra room given by kmalloc which all goes to waste
> > > > > right now.
> > > > >
> > > > > If we could use ksize then we can stuff loads of metadata in that
> > > > > space.
> > > >
> > > > OK, fair enough, I applied Kirill's patch. Thanks.
> > > >
> > >
> > > Could we please have more details regarding this:
> > >
> > > > The ksize() function is not exported to modules because it has non-standard
> > > > behavour across different slab allocators.
> > >
> > > How does the behaviour differ? It this documented? Can we fix it?
> >
> > SLAB and SLUB support calling ksize() on objects returned by
> > kmem_cache_alloc.
> >
> > SLOB only supports it on objects from kmalloc. This is because it does
> > not store any size or type information in kmem_cache_alloc'ed objects.
> > Instead, it infers them from the cache argument.
>
> OK. This is really bad, isn't it?
No. There are very few ksize callers and very few of those are making
this particular category error.
And it -is- a category error. The fact that kmalloc is implemented on
top of kmem_cache_alloc is an implementation detail that callers should
not assume. They shouldn't call kfree() on kmem_cache_alloc objects
(even though it might just happen to work), nor should they call
ksize().
> > Ideally SLAB and SLUB would complain about using ksize inappropriately
> > when debugging was enabled.
> >
>
> OK, thanks.
>
> Ideally we would support ksize() for both kmalloc() and
> kmem_cache_alloc() memory across all implementations.
There's never a good reason to call ksize on a kmem_cache_alloced
object. You -must- statically know what type of object you have already
to be able to free it later with kmem_cache_free, ergo, you can
statically know how big it is too.
Another alternative to the above is to throw sparse at it, and have it
track what allocators a pointer might have come through.
But as far as I'm aware, there's only been one actual bug in this area:
nommu was calling ksize on pointers of all kinds, including stuff
allocated at compile time.
> Gee this sucks. Biggest mistake I ever made. Are we working hard
> enough to remove some of these sl?b implementations? Would it help if
> I randomly deleted a couple?
Again, I think there's a strong argument for having two. We can't
reasonably expect one allocator to work well on supercomputers and
phones. One will likely value performance significantly higher than
memory usage and vice-versa.
I think most of the pain here is actually peripheral. SLUB in particular
has churned a lot of interfaces. But we would have had that had we
instead decided to throw a lot of effort into making SLAB better.
--
http://selenic.com : development and support for Mercurial and Linux
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to [email protected]. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"[email protected]"> [email protected] </a>
On Sun, 15 Feb 2009 17:49:41 -0600 Matt Mackall <[email protected]> wrote:
> On Sun, 2009-02-15 at 13:55 -0800, Andrew Morton wrote:
> > On Sun, 15 Feb 2009 15:43:14 -0600 Matt Mackall <[email protected]> wrote:
> >
> > > On Sun, 2009-02-15 at 13:36 -0800, Andrew Morton wrote:
> > > > On Thu, 12 Feb 2009 17:55:04 +0200 Pekka Enberg <[email protected]> wrote:
> > > >
> > > > > On Thu, Feb 12, 2009 at 12:45:21PM +0200, Pekka Enberg wrote:
> > > > > > >
> > > > > > > Because the API was being widely abused in the nommu code, for example.
> > > > > > > I'd rather not add it back for this special case which can be handled
> > > > > > > otherwise.
> > > > >
> > > > > On Thu, 2009-02-12 at 18:50 +0800, Herbert Xu wrote:
> > > > > > I'm sorry but that's like banning the use of heaters just because
> > > > > > they can abused and cause fires.
> > > > > >
> > > > > > I think I've said this to you before but in networking we very much
> > > > > > want to use ksize because the standard case of a 1500-byte packet
> > > > > > has loads of extra room given by kmalloc which all goes to waste
> > > > > > right now.
> > > > > >
> > > > > > If we could use ksize then we can stuff loads of metadata in that
> > > > > > space.
> > > > >
> > > > > OK, fair enough, I applied Kirill's patch. Thanks.
> > > > >
> > > >
> > > > Could we please have more details regarding this:
> > > >
> > > > > The ksize() function is not exported to modules because it has non-standard
> > > > > behavour across different slab allocators.
> > > >
> > > > How does the behaviour differ? It this documented? Can we fix it?
> > >
> > > SLAB and SLUB support calling ksize() on objects returned by
> > > kmem_cache_alloc.
> > >
> > > SLOB only supports it on objects from kmalloc. This is because it does
> > > not store any size or type information in kmem_cache_alloc'ed objects.
> > > Instead, it infers them from the cache argument.
> >
> > OK. This is really bad, isn't it?
>
> No. There are very few ksize callers and very few of those are making
> this particular category error.
>
> And it -is- a category error. The fact that kmalloc is implemented on
> top of kmem_cache_alloc is an implementation detail that callers should
> not assume. They shouldn't call kfree() on kmem_cache_alloc objects
> (even though it might just happen to work), nor should they call
> ksize().
But they could call a new kmem_cache_size(cachep, obj)?
> > > Ideally SLAB and SLUB would complain about using ksize inappropriately
> > > when debugging was enabled.
> > >
> >
> > OK, thanks.
> >
> > Ideally we would support ksize() for both kmalloc() and
> > kmem_cache_alloc() memory across all implementations.
>
> There's never a good reason to call ksize on a kmem_cache_alloced
> object. You -must- statically know what type of object you have already
> to be able to free it later with kmem_cache_free, ergo, you can
> statically know how big it is too.
But kmem_cache_size() would tell you how much extra secret memory there
is available after the object?
How that gets along with redzoning is a bit of a mystery though.
The whole concept is quite hacky and nasty, isn't it?. Does
networking/crypto actually show any gain from pulling this stunt?
> Another alternative to the above is to throw sparse at it, and have it
> track what allocators a pointer might have come through.
>
> But as far as I'm aware, there's only been one actual bug in this area:
> nommu was calling ksize on pointers of all kinds, including stuff
> allocated at compile time.
>
> > Gee this sucks. Biggest mistake I ever made. Are we working hard
> > enough to remove some of these sl?b implementations? Would it help if
> > I randomly deleted a couple?
>
> Again, I think there's a strong argument for having two. We can't
> reasonably expect one allocator to work well on supercomputers and
> phones.
We can't reasonably expect an OS to work well on supercomputers and
phones ;) It's a matter of how much person-power gets tossed at it.
> One will likely value performance significantly higher than
> memory usage and vice-versa.
>
> I think most of the pain here is actually peripheral. SLUB in particular
> has churned a lot of interfaces. But we would have had that had we
> instead decided to throw a lot of effort into making SLAB better.
hm.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to [email protected]. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"[email protected]"> [email protected] </a>
On Sun, Feb 15, 2009 at 05:00:52PM -0800, Andrew Morton wrote:
>
> But kmem_cache_size() would tell you how much extra secret memory there
> is available after the object?
>
> How that gets along with redzoning is a bit of a mystery though.
>
> The whole concept is quite hacky and nasty, isn't it?. Does
> networking/crypto actually show any gain from pulling this stunt?
I see no point in calling ksize on memory that's not kmalloced.
So no there is nothing to be gained from having kmem_cache_ksize.
However, for kmalloced memory we're wasting hundreds of bytes
for the standard 1500 byte allocation without ksize which means
that we're doing reallocations (and sometimes copying) when it
isn't necessary.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
On Mon, 2009-02-16 at 09:21 +0800, Herbert Xu wrote:
> On Sun, Feb 15, 2009 at 05:00:52PM -0800, Andrew Morton wrote:
> >
> > But kmem_cache_size() would tell you how much extra secret memory there
> > is available after the object?
> >
> > How that gets along with redzoning is a bit of a mystery though.
> >
> > The whole concept is quite hacky and nasty, isn't it?. Does
> > networking/crypto actually show any gain from pulling this stunt?
>
> I see no point in calling ksize on memory that's not kmalloced.
> So no there is nothing to be gained from having kmem_cache_ksize.
>
> However, for kmalloced memory we're wasting hundreds of bytes
> for the standard 1500 byte allocation without ksize which means
> that we're doing reallocations (and sometimes copying) when it
> isn't necessary.
Yeah. That sucks. We should probably stick in an skb-friendly slab size
and see what happens on network benchmarks.
--
http://selenic.com : development and support for Mercurial and Linux
On Sun, 2009-02-15 at 17:00 -0800, Andrew Morton wrote:
> On Sun, 15 Feb 2009 17:49:41 -0600 Matt Mackall <[email protected]> wrote:
> The whole concept is quite hacky and nasty, isn't it?.
It is, which is part of why we were trying to kill it. The primary users
were thing growing buffers ala realloc. So we were pushing to change the
callers to just do a realloc. But IPSEC doesn't fit well into that mold.
The fundamental problem here for networking is that 1500 is not very
close to a power of two and just about everything in the VM wants it to
be. If we could get SKBs fitting more nicely in memory, I think it would
cease to be a concern.
--
http://selenic.com : development and support for Mercurial and Linux
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to [email protected]. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"[email protected]"> [email protected] </a>
On Sun, Feb 15, 2009 at 07:28:46PM -0600, Matt Mackall wrote:
>
> Yeah. That sucks. We should probably stick in an skb-friendly slab size
> and see what happens on network benchmarks.
I don't see how that's going to help since we don't want it to
cross page boundaries either (having just wasted a day tracking
down a virtual networking bug because of slab debugging and
crossing page boundaries).
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to [email protected]. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"[email protected]"> [email protected] </a>
On Mon, 2009-02-16 at 09:52 +0800, Herbert Xu wrote:
> On Sun, Feb 15, 2009 at 07:28:46PM -0600, Matt Mackall wrote:
> >
> > Yeah. That sucks. We should probably stick in an skb-friendly slab size
> > and see what happens on network benchmarks.
>
> I don't see how that's going to help since we don't want it to
> cross page boundaries either (having just wasted a day tracking
> down a virtual networking bug because of slab debugging and
> crossing page boundaries).
I'll bite.. what's wrong with page boundaries? Do we play per-SKB TLB
games in virtual network drivers?
--
http://selenic.com : development and support for Mercurial and Linux
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to [email protected]. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"[email protected]"> [email protected] </a>
On Sun, Feb 15, 2009 at 07:54:39PM -0600, Matt Mackall wrote:
>
> I'll bite.. what's wrong with page boundaries? Do we play per-SKB TLB
> games in virtual network drivers?
Because virtual physical memory is not physically contiguous.
Cheers,
--
Visit Openswan at http://www.openswan.org/
Email: Herbert Xu ~{PmV>HI~} <[email protected]>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt
On Tue, Feb 10, 2009 at 04:06:53PM +0200, Pekka J Enberg wrote:
> On Tue, Feb 10, 2009 at 03:35:03PM +0200, Pekka Enberg wrote:
> > > We unexported ksize() because it's a problematic interface and you
> > > almost certainly want to use the alternatives (e.g. krealloc). I think
> > > I need bit more convincing to apply this patch...
>
> On Tue, 10 Feb 2009, Kirill A. Shutemov wrote:
> > It just a quick fix. If anybody knows better solution, I have no
> > objections.
>
> Herbert, what do you think of this (untested) patch? Alternatively, we
> could do something like kfree_secure() but it seems overkill for this one
> call-site.
There are more callsites which do memset() + kfree():
arch/s390/crypto/prng.c
drivers/s390/crypto/zcrypt_pcixcc.c
drivers/md/dm-crypt.c
drivers/usb/host/hwa-hc.c
drivers/usb/wusbcore/cbaf.c
(drivers/w1/w1{,_int}.c)
fs/cifs/misc.c
fs/cifs/connect.c
fs/ecryptfs/keystore.c
fs/ecryptfs/messaging.c
net/atm/mpoa_caches.c
How about the attached patch? One problem is that zeroing ksize()
bytes can have an overhead of nearly twice the actual allocation size.
So we would need an interface that lets the caller pass in either a
number of bytes it wants to have zeroed out or say idontknow.
Perhaps add a size parameter that is cut to ksize() if it's too big?
Or (ssize_t)-1 for figureitoutyourself?
Hannes
---
Subject: slab: introduce kzfree()
kzfree() is a wrapper for kfree() that additionally zeroes the
underlying memory before releasing it to the slab allocator.
---
include/linux/slab.h | 1 +
mm/util.c | 20 ++++++++++++++++++++
2 files changed, 21 insertions(+)
--- a/include/linux/slab.h
+++ b/include/linux/slab.h
@@ -127,6 +127,7 @@ int kmem_ptr_validate(struct kmem_cache
void * __must_check __krealloc(const void *, size_t, gfp_t);
void * __must_check krealloc(const void *, size_t, gfp_t);
void kfree(const void *);
+void kzfree(const void *);
size_t ksize(const void *);
/*
--- a/mm/util.c
+++ b/mm/util.c
@@ -129,6 +129,26 @@ void *krealloc(const void *p, size_t new
}
EXPORT_SYMBOL(krealloc);
+/**
+ * kzfree - like kfree but zero memory
+ * @p: object to free memory of
+ * @zsize: size of the memory region to zero
+ *
+ * The memory of the object @p points to is zeroed before freed.
+ * If @p is %NULL, kzfree() does nothing.
+ */
+void kzfree(const void *p)
+{
+ size_t ks;
+ void *mem = (void *)p;
+
+ if (unlikely(ZERO_OR_NULL_PTR(mem)))
+ return;
+ ks = ksize(mem);
+ memset(mem, 0, ks);
+ kfree(mem);
+}
+
/*
* strndup_user - duplicate an existing string from user space
* @s: The string to duplicate
Hi Johannes,
On Mon, 2009-02-16 at 14:56 +0100, Johannes Weiner wrote:
> On Tue, Feb 10, 2009 at 04:06:53PM +0200, Pekka J Enberg wrote:
> > On Tue, Feb 10, 2009 at 03:35:03PM +0200, Pekka Enberg wrote:
> > > > We unexported ksize() because it's a problematic interface and you
> > > > almost certainly want to use the alternatives (e.g. krealloc). I think
> > > > I need bit more convincing to apply this patch...
> >
> > On Tue, 10 Feb 2009, Kirill A. Shutemov wrote:
> > > It just a quick fix. If anybody knows better solution, I have no
> > > objections.
> >
> > Herbert, what do you think of this (untested) patch? Alternatively, we
> > could do something like kfree_secure() but it seems overkill for this one
> > call-site.
>
> There are more callsites which do memset() + kfree():
>
> arch/s390/crypto/prng.c
> drivers/s390/crypto/zcrypt_pcixcc.c
> drivers/md/dm-crypt.c
> drivers/usb/host/hwa-hc.c
> drivers/usb/wusbcore/cbaf.c
> (drivers/w1/w1{,_int}.c)
> fs/cifs/misc.c
> fs/cifs/connect.c
> fs/ecryptfs/keystore.c
> fs/ecryptfs/messaging.c
> net/atm/mpoa_caches.c
>
> How about the attached patch? One problem is that zeroing ksize()
> bytes can have an overhead of nearly twice the actual allocation size.
>
> So we would need an interface that lets the caller pass in either a
> number of bytes it wants to have zeroed out or say idontknow.
>
> Perhaps add a size parameter that is cut to ksize() if it's too big?
> Or (ssize_t)-1 for figureitoutyourself?
I'd prefer the kzfree() interface as-is. I don't think you want to do
the memset/kfree in a fast-path anyway.
If you can convince Andrew to pick this patch up and maybe convert some
call-sites to actually use it, then:
Acked-by: Pekka Enberg <[email protected]>
Pekka
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to [email protected]. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"[email protected]"> [email protected] </a>
On Mon, 2009-02-16 at 14:56 +0100, Johannes Weiner wrote:
> One problem is that zeroing ksize()
> bytes can have an overhead of nearly twice the actual allocation size.
A possible good thing is when linux has a
mechanism to use known zeroed memory in
kzalloc or kcalloc, it's already good to go.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to [email protected]. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"[email protected]"> [email protected] </a>
On Mon, 2009-02-16 at 14:56 +0100, Johannes Weiner wrote:
>> One problem is that zeroing ksize()
>> bytes can have an overhead of nearly twice the actual allocation size.
On Mon, Feb 16, 2009 at 6:32 PM, Joe Perches <[email protected]> wrote:
> A possible good thing is when linux has a
> mechanism to use known zeroed memory in
> kzalloc or kcalloc, it's already good to go.
Hmm, kzfree() is not going to be all that common operation so there
won't be that many known zeroed regions and I suspect tracking them
will have more overhead than just doing the memset().
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to [email protected]. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"[email protected]"> [email protected] </a>
On Sun, 15 Feb 2009, Matt Mackall wrote:
> On Sun, 2009-02-15 at 17:00 -0800, Andrew Morton wrote:
> > On Sun, 15 Feb 2009 17:49:41 -0600 Matt Mackall <[email protected]> wrote:
>
> > The whole concept is quite hacky and nasty, isn't it?.
>
> It is, which is part of why we were trying to kill it. The primary users
> were thing growing buffers ala realloc. So we were pushing to change the
> callers to just do a realloc. But IPSEC doesn't fit well into that mold.
>
> The fundamental problem here for networking is that 1500 is not very
> close to a power of two and just about everything in the VM wants it to
> be. If we could get SKBs fitting more nicely in memory, I think it would
> cease to be a concern.
Does it help to remind that 4 KiB - 2 * 1500 = 1096 \approx 1024?
(hmm, the difference is 72 \approx 64)?
If 1500 is so common as an allocation size, it may make sense to start
special-casing it...
With kind regards,
Geert Uytterhoeven
Software Architect
Sony Techsoft Centre Europe
The Corporate Village ? Da Vincilaan 7-D1 ? B-1935 Zaventem ? Belgium
Phone: +32 (0)2 700 8453
Fax: +32 (0)2 700 8622
E-mail: [email protected]
Internet: http://www.sony-europe.com/
A division of Sony Europe (Belgium) N.V.
VAT BE 0413.825.160 ? RPR Brussels
Fortis ? BIC GEBABEBB ? IBAN BE41293037680010
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to [email protected]. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"[email protected]"> [email protected] </a>
On Sun, 15 Feb 2009, Matt Mackall wrote:
> And it -is- a category error. The fact that kmalloc is implemented on
> top of kmem_cache_alloc is an implementation detail that callers should
> not assume. They shouldn't call kfree() on kmem_cache_alloc objects
> (even though it might just happen to work), nor should they call
> ksize().
ksize does not take a kmem_cache pointer and it is mainly used for
figuring out how much space kmalloc really allocated for an object. As
such its more part of the kmalloc/kfree set of calls than the
kmem_cache_* calls.
We could add another call
kmem_cache_size()
for symmetries sake.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to [email protected]. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"[email protected]"> [email protected] </a>
On Tue, Feb 17, 2009 at 6:17 PM, Christoph Lameter
<[email protected]> wrote:
> On Sun, 15 Feb 2009, Matt Mackall wrote:
>
>> And it -is- a category error. The fact that kmalloc is implemented on
>> top of kmem_cache_alloc is an implementation detail that callers should
>> not assume. They shouldn't call kfree() on kmem_cache_alloc objects
>> (even though it might just happen to work), nor should they call
>> ksize().
>
> ksize does not take a kmem_cache pointer and it is mainly used for
> figuring out how much space kmalloc really allocated for an object. As
> such its more part of the kmalloc/kfree set of calls than the
> kmem_cache_* calls.
>
> We could add another call
>
> kmem_cache_size()
>
> for symmetries sake.
Hmm, kmem_cache_size() seems bit pointless to me. For
kmem_cache_create()'d caches, actual allocated size should be more or
less optimal with no extra space.
Pekka
On Tue, 17 Feb 2009, Pekka Enberg wrote:
> Hmm, kmem_cache_size() seems bit pointless to me. For
> kmem_cache_create()'d caches, actual allocated size should be more or
> less optimal with no extra space.
Cacheline alignment and word alignment etc etc can still add some space to
the object.
--
To unsubscribe, send a message with 'unsubscribe linux-mm' in
the body to [email protected]. For more info on Linux MM,
see: http://www.linux-mm.org/ .
Don't email: <a href=mailto:"[email protected]"> [email protected] </a>