2008-03-05 21:21:16

by Pekka Enberg

[permalink] [raw]
Subject: [PATCH] netfilter: replace horrible hack with ksize()

From: Pekka Enberg <[email protected]>

There's a horrible slab abuse in net/netfilter/nf_conntrack_extend.c that
can be replaced with a call to ksize().

Cc: Christoph Lameter <[email protected]>
Signed-off-by: Pekka Enberg <[email protected]>
---
include/net/netfilter/nf_conntrack_extend.h | 1 -
net/netfilter/nf_conntrack_extend.c | 19 +++----------------
2 files changed, 3 insertions(+), 17 deletions(-)

Index: linux-2.6/include/net/netfilter/nf_conntrack_extend.h
===================================================================
--- linux-2.6.orig/include/net/netfilter/nf_conntrack_extend.h
+++ linux-2.6/include/net/netfilter/nf_conntrack_extend.h
@@ -17,7 +17,6 @@ enum nf_ct_ext_id
struct nf_ct_ext {
u8 offset[NF_CT_EXT_NUM];
u8 len;
- u8 real_len;
char data[0];
};

Index: linux-2.6/net/netfilter/nf_conntrack_extend.c
===================================================================
--- linux-2.6.orig/net/netfilter/nf_conntrack_extend.c
+++ linux-2.6/net/netfilter/nf_conntrack_extend.c
@@ -19,14 +19,6 @@
static struct nf_ct_ext_type *nf_ct_ext_types[NF_CT_EXT_NUM];
static DEFINE_MUTEX(nf_ct_ext_type_mutex);

-/* Horrible trick to figure out smallest amount worth kmallocing. */
-#define CACHE(x) (x) + 0 *
-enum {
- NF_CT_EXT_MIN_SIZE =
-#include <linux/kmalloc_sizes.h>
- 1 };
-#undef CACHE
-
void __nf_ct_ext_destroy(struct nf_conn *ct)
{
unsigned int i;
@@ -53,7 +45,7 @@ EXPORT_SYMBOL(__nf_ct_ext_destroy);
static void *
nf_ct_ext_create(struct nf_ct_ext **ext, enum nf_ct_ext_id id, gfp_t gfp)
{
- unsigned int off, len, real_len;
+ unsigned int off, len;
struct nf_ct_ext_type *t;

rcu_read_lock();
@@ -61,16 +53,14 @@ nf_ct_ext_create(struct nf_ct_ext **ext,
BUG_ON(t == NULL);
off = ALIGN(sizeof(struct nf_ct_ext), t->align);
len = off + t->len;
- real_len = t->alloc_size;
rcu_read_unlock();

- *ext = kzalloc(real_len, gfp);
+ *ext = kzalloc(t->alloc_size, gfp);
if (!*ext)
return NULL;

(*ext)->offset[id] = off;
(*ext)->len = len;
- (*ext)->real_len = real_len;

return (void *)(*ext) + off;
}
@@ -95,7 +85,7 @@ void *__nf_ct_ext_add(struct nf_conn *ct
newlen = newoff + t->len;
rcu_read_unlock();

- if (newlen >= ct->ext->real_len) {
+ if (newlen >= ksize(ct->ext)) {
new = kmalloc(newlen, gfp);
if (!new)
return NULL;
@@ -114,7 +104,6 @@ void *__nf_ct_ext_add(struct nf_conn *ct
rcu_read_unlock();
}
kfree(ct->ext);
- new->real_len = newlen;
ct->ext = new;
}

@@ -156,8 +145,6 @@ static void update_alloc_size(struct nf_
t1->alloc_size = ALIGN(t1->alloc_size, t2->align)
+ t2->len;
}
- if (t1->alloc_size < NF_CT_EXT_MIN_SIZE)
- t1->alloc_size = NF_CT_EXT_MIN_SIZE;
}
}


2008-03-05 21:56:51

by Christoph Lameter

[permalink] [raw]
Subject: Re: [PATCH] netfilter: replace horrible hack with ksize()

Acked-by: Christoph Lameter <[email protected]>

I guess this should go through the netdev tree?

2008-03-05 21:58:36

by Pekka Enberg

[permalink] [raw]
Subject: Re: [PATCH] netfilter: replace horrible hack with ksize()

Hi,

Christoph Lameter wrote:
> Acked-by: Christoph Lameter <[email protected]>
>
> I guess this should go through the netdev tree?

Yup.

Pekka

2008-03-05 22:21:30

by David Miller

[permalink] [raw]
Subject: Re: [PATCH] netfilter: replace horrible hack with ksize()

From: Christoph Lameter <[email protected]>
Date: Wed, 5 Mar 2008 13:56:13 -0800 (PST)

> Acked-by: Christoph Lameter <[email protected]>
>
> I guess this should go through the netdev tree?

It should go through Patrick McHardy and the netfilter-devel
folks, who in turn will queue it up to me and -stable if
appropriate.

2008-03-06 14:04:53

by Patrick McHardy

[permalink] [raw]
Subject: Re: [PATCH] netfilter: replace horrible hack with ksize()

Pekka J Enberg wrote:
> From: Pekka Enberg <[email protected]>
>
> There's a horrible slab abuse in net/netfilter/nf_conntrack_extend.c that
> can be replaced with a call to ksize().

This doesn't look right.

> @@ -95,7 +85,7 @@ void *__nf_ct_ext_add(struct nf_conn *ct
> newlen = newoff + t->len;
> rcu_read_unlock();
>
> - if (newlen >= ct->ext->real_len) {
> + if (newlen >= ksize(ct->ext)) {


This needs to look at the currently allocated size, otherwise
it will always realloc when adding new extensions after having
used up ksize(ct->ext) space.

> new = kmalloc(newlen, gfp);

And this should use ksize(newlen) and store the real length
in real_len below.

> if (!new)
> return NULL;
> @@ -114,7 +104,6 @@ void *__nf_ct_ext_add(struct nf_conn *ct
> rcu_read_unlock();
> }
> kfree(ct->ext);
> - new->real_len = newlen;
> ct->ext = new;
> }

2008-03-06 14:14:41

by Pekka Enberg

[permalink] [raw]
Subject: Re: [PATCH] netfilter: replace horrible hack with ksize()

Hi Patrick,

On Thu, 6 Mar 2008, Patrick McHardy wrote:
> > @@ -95,7 +85,7 @@ void *__nf_ct_ext_add(struct nf_conn *ct
> > newlen = newoff + t->len;
> > rcu_read_unlock();
> >
> > - if (newlen >= ct->ext->real_len) {
> > + if (newlen >= ksize(ct->ext)) {
>
> This needs to look at the currently allocated size, otherwise
> it will always realloc when adding new extensions after having
> used up ksize(ct->ext) space.

Lets say you

p = kmalloc(8, ...);

Then ksize(p) will return the currently allocated size which is 32 bytes
when page size is 4 KB, and not 8 bytes. So it should be equivalent of
what the current code does.

What am I missing here?

Pekka

2008-03-06 14:21:21

by Pekka Enberg

[permalink] [raw]
Subject: Re: [PATCH] netfilter: replace horrible hack with ksize()

On Thu, 6 Mar 2008, Pekka J Enberg wrote:
> > > - if (newlen >= ct->ext->real_len) {
> > > + if (newlen >= ksize(ct->ext)) {
> >
> > This needs to look at the currently allocated size, otherwise
> > it will always realloc when adding new extensions after having
> > used up ksize(ct->ext) space.
>
> Lets say you
>
> p = kmalloc(8, ...);
>
> Then ksize(p) will return the currently allocated size which is 32 bytes
> when page size is 4 KB, and not 8 bytes. So it should be equivalent of
> what the current code does.
>
> What am I missing here?

Ok, it's not equivalent. We have two sizes: object size (8 bytes) and
buffer size (32 bytes) here. In netfilter, ->real_len is same as object
size, not buffer size as ksize() is.

But now I am officially even more confused, why does the netfilter code
decided whether to reallocate based on _object size_ and not _buffer size_
(as krealloc() does, for example)?

Pekka

2008-03-06 14:36:32

by Patrick McHardy

[permalink] [raw]
Subject: Re: [PATCH] netfilter: replace horrible hack with ksize()

Pekka J Enberg wrote:
> On Thu, 6 Mar 2008, Pekka J Enberg wrote:
>>>> - if (newlen >= ct->ext->real_len) {
>>>> + if (newlen >= ksize(ct->ext)) {
>>> This needs to look at the currently allocated size, otherwise
>>> it will always realloc when adding new extensions after having
>>> used up ksize(ct->ext) space.
>> Lets say you
>>
>> p = kmalloc(8, ...);
>>
>> Then ksize(p) will return the currently allocated size which is 32 bytes
>> when page size is 4 KB, and not 8 bytes. So it should be equivalent of
>> what the current code does.
>>
>> What am I missing here?
>
> Ok, it's not equivalent. We have two sizes: object size (8 bytes) and
> buffer size (32 bytes) here. In netfilter, ->real_len is same as object
> size, not buffer size as ksize() is.
>
> But now I am officially even more confused, why does the netfilter code
> decided whether to reallocate based on _object size_ and not _buffer size_
> (as krealloc() does, for example)?


It decides to reallocate when the remaining space isn't enough
to hold the new data. NF_CT_EXT_MIN_SIZE is used to make sure it
doesn't allocate anything smaller than the minimum slab size and
hopefully avoid reallocations in the future. Unless I'm
misunderstanding what ksize() does, the easiest way to get
rid of this would be to replace NF_CT_EXT_MIN_SIZE by ksize(0).

Even better would be to not only avoid waste on the first allocation,
but also on reallocations. This would look something like this:

__nf_ct_ext_add():
{
struct nf_ct_ext *new;
- int i, newlen, newoff;
+ int i, newlen, newoff, reallen;
...
if (newlen >= ct->ext->real_len) {
+ reallen = ksize(newlen);
- new = kmalloc(newlen, gfp);
+ new = kmalloc(reallen, gfp);
if (!new)
return NULL;
...
- new->real_len = newlen;
+ new->real_len = reallen;
ct->ext = new;
}

2008-03-06 14:42:18

by Pekka Enberg

[permalink] [raw]
Subject: Re: [PATCH] netfilter: replace horrible hack with ksize()

On Thu, 6 Mar 2008, Patrick McHardy wrote:
> It decides to reallocate when the remaining space isn't enough
> to hold the new data. NF_CT_EXT_MIN_SIZE is used to make sure it
> doesn't allocate anything smaller than the minimum slab size and
> hopefully avoid reallocations in the future. Unless I'm
> misunderstanding what ksize() does, the easiest way to get
> rid of this would be to replace NF_CT_EXT_MIN_SIZE by ksize(0).

I think you are misunderstanding ksize() (see mm/slub.c::ksize() for
example). Furthermore, I think your current reallocation code is broken
too as explained in a previous mail and my patch fixes that to behave as
krealloc() does.

Pekka

2008-03-06 14:51:49

by Patrick McHardy

[permalink] [raw]
Subject: Re: [PATCH] netfilter: replace horrible hack with ksize()

Pekka J Enberg wrote:
> On Thu, 6 Mar 2008, Patrick McHardy wrote:
>> It decides to reallocate when the remaining space isn't enough
>> to hold the new data. NF_CT_EXT_MIN_SIZE is used to make sure it
>> doesn't allocate anything smaller than the minimum slab size and
>> hopefully avoid reallocations in the future. Unless I'm
>> misunderstanding what ksize() does, the easiest way to get
>> rid of this would be to replace NF_CT_EXT_MIN_SIZE by ksize(0).
>
> I think you are misunderstanding ksize() (see mm/slub.c::ksize() for
> example).

The ksize() description in mm/slab.c matches exactly what netfilter
wants to do:

* kmalloc may internally round up allocations and return more memory
* than requested. ksize() can be used to determine the actual amount
of
* memory allocated. The caller may use this additional memory, even though
* a smaller amount of memory was initially specified with the kmalloc
call.

> Furthermore, I think your current reallocation code is broken
> too as explained in a previous mail and my patch fixes that to behave as
> krealloc() does.

I don't think there is anything broken with that code.

The initial allocation size is calculated as max(size, min slab size)
and is stored as ext->alloc_size. When adding the first extension,
it allocates ext->alloc_size of memory and stores both the real amount
of space used (ext->len) and the actual size (ext->real_len).
When adding further extensions, it calculates the new total amount of
space needed (newlen). If that is larger than the real amount of
memory allocated (real_len), it reallocates.

What am I missing here?

2008-03-06 15:50:40

by Pekka Enberg

[permalink] [raw]
Subject: Re: [PATCH] netfilter: replace horrible hack with ksize()

Hi Patrick,

Patrick McHardy wrote:
> > I think you are misunderstanding ksize() (see mm/slub.c::ksize() for
> > example).
>
> The ksize() description in mm/slab.c matches exactly what netfilter
> wants to do:

Agreed.

Patrick McHardy wrote:
> The initial allocation size is calculated as max(size, min slab size)
> and is stored as ext->alloc_size. When adding the first extension,

Yes, this part is correct, however...

> it allocates ext->alloc_size of memory and stores both the real amount
> of space used (ext->len) and the actual size (ext->real_len).
> When adding further extensions, it calculates the new total amount of
> space needed (newlen). If that is larger than the real amount of
> memory allocated (real_len), it reallocates.

...looking at nf_ct_ext_create() you do:

*ext = kzalloc(real_len, gfp);
^^^^^^^^
if (!*ext)
return NULL;

(*ext)->offset[id] = off;
(*ext)->len = len;
(*ext)->real_len = real_len;
^^^^^^^^

You are storing the _object size_ (total amount of memory requested) and
not the _buffer size_ (total amount of memory allocated). Keep in mind
that object size < buffer size and that ksize() returns the latter.

Now continuing in __nf_ct_ext_add() you do:

if (newlen >= ct->ext->real_len) {
^^^^^^^^
new = kmalloc(newlen, gfp);
if (!new)
return NULL;

So you're comparing newlen to the object size and not the buffer size
which is what you want and what ksize() and consequently my patch does.

Take a look at mm/util.c::krealloc(). It does exactly what you want
modulo the RCU bits. My patch converts the netfilter code to follow the
exact same semantics.

Pekka

2008-03-06 16:41:47

by Christoph Lameter

[permalink] [raw]
Subject: Re: [PATCH] netfilter: replace horrible hack with ksize()

On Thu, 6 Mar 2008, Pekka J Enberg wrote:

> Then ksize(p) will return the currently allocated size which is 32 bytes
> when page size is 4 KB, and not 8 bytes. So it should be equivalent of
> what the current code does.

True for SLAB. SLUB will actually allocate only 8 bytes.

2008-03-10 17:58:39

by Patrick McHardy

[permalink] [raw]
Subject: Re: [PATCH] netfilter: replace horrible hack with ksize()

Pekka Enberg wrote:
> Hi Patrick,
>
> Patrick McHardy wrote:
>> > I think you are misunderstanding ksize() (see mm/slub.c::ksize() for
>> > example).
>>
>> The ksize() description in mm/slab.c matches exactly what netfilter
>> wants to do:
>
> Agreed.
>
> Patrick McHardy wrote:
>> The initial allocation size is calculated as max(size, min slab size)
>> and is stored as ext->alloc_size. When adding the first extension,
>
> Yes, this part is correct, however...
>
>> it allocates ext->alloc_size of memory and stores both the real amount
>> of space used (ext->len) and the actual size (ext->real_len).
>> When adding further extensions, it calculates the new total amount of
>> space needed (newlen). If that is larger than the real amount of
>> memory allocated (real_len), it reallocates.
>
> ...looking at nf_ct_ext_create() you do:
>
> *ext = kzalloc(real_len, gfp);
> ^^^^^^^^
> if (!*ext)
> return NULL;
>
> (*ext)->offset[id] = off;
> (*ext)->len = len;
> (*ext)->real_len = real_len;
> ^^^^^^^^
>
> You are storing the _object size_ (total amount of memory requested) and
> not the _buffer size_ (total amount of memory allocated). Keep in mind
> that object size < buffer size and that ksize() returns the latter.


For all length <= minimum slab size alloc_size (and thus
real_len) is equal to the buffer size. You are correct
however that your patch is fine, I somehow misread the

+ if (newlen >= ksize(ct->ext)) {

part and thought you would always compare against the
minimum slab size.

I've queued your patch and will pass it upstream after
some testing, thanks.