2001-04-09 16:04:07

by Imran.Patel

Subject: skb allocation problems

hello all,

For the past week I have been having problems with (possibly buggy) memory
allocation by the kernel in my NAT-PT module. I posted this to netdev earlier
but did not get any response, so here I am with some more proof :)

In my IPv6-IPv4 translator module, I capture an IPv4 packet by registering
at the netfilter pre-routing hook, translate it to an IPv6 packet, allocate
a new skb for the translated packet, and put that packet back into the IPv6
stack. The problems start when I allocate the skb for the translated packet:
a call to alloc_skb gives me a memory area that is already pointed to by the
IPv4 packet which came into the hook.

I have written a test module which closely mirrors what my code tries to
do (attached below). This is what I get:

PRE_R: old skb:c371ee40 new skb:c371ee30
<7>NAT: 3 dropping untracked packet c350fa00 1 192.168.102.22 ->
192.168.102.29

The above message was generated by sending a UDP packet from a simple UDP
client on the host 192.168.102.29 to 192.168.102.22. The test module was on
the host 192.168.102.22.

As you can see, the memory allocated to the new skb is already being pointed
to by the packet that came in (old skb). Isn't that strange/buggy?
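
Whether the two data areas really overlap depends on the packet length, which the log line does not show. The following is a minimal userspace sketch of that check, not kernel code. The function name ranges_overlap and the example lengths in the note below are assumptions made for illustration; only the two addresses come from the log.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * True if the half-open byte ranges [a, a + a_len) and [b, b + b_len)
 * share at least one byte. Hypothetical helper, not a kernel function.
 */
static bool ranges_overlap(uintptr_t a, size_t a_len,
                           uintptr_t b, size_t b_len)
{
    return a < b + b_len && b < a + a_len;
}

/* The two ->data pointers printed in the PRE_R line above. */
static const uintptr_t old_data = 0xc371ee40u; /* (*pskb)->data */
static const uintptr_t new_data = 0xc371ee30u; /* skb->data after skb_reserve(skb, 16) */
```

The two pointers are 0x10 (16) bytes apart, so with these addresses any new data area longer than 16 bytes would run into the old one: ranges_overlap(new_data, 100, old_data, 100) is true for an assumed 100-byte packet, while ranges_overlap(new_data, 16, old_data, 100) is false.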

And regarding the netfilter message, the packet that is being dropped by
netfilter is an ICMP port unreachable message, sent because there is no UDP
server listening on 192.168.102.22. Why conntrack drops the ICMP error
packet is a question for the netfilter guys :)

I have been fighting this bug, or whatever it is, for a week now and I can't
get anywhere. I also cannot reproduce this behaviour reliably: the test
module allocates a different memory area most of the time and seems to work
fine, but once it goes bad it keeps on allocating the same overlapping
addresses and never stops misbehaving until I reboot the machine.

Any help would be appreciated.

PS: I am not on the linux-kernel mailing list, and yes, I use the 2.4.1 kernel.

TIA,
imran

-------------------------



#define __KERNEL__
#define MODULE

#include <linux/module.h>
#include <linux/netfilter_ipv4.h>
#include <linux/ip.h>

static unsigned int
prehook(unsigned int hook, struct sk_buff **pskb,
        const struct net_device *indev,
        const struct net_device *outdev,
        int (*okfn)(struct sk_buff *))
{
        struct sk_buff *skb;

        printk("\nPRE_R: old skb:%p", (*pskb)->data);

        /* translation happens in the real code here */

        skb = alloc_skb((*pskb)->len, GFP_ATOMIC);
        if(!skb)
                printk("alloc failed");
        skb_reserve(skb, 16);
        printk(" new skb:%p", skb->data);
        kfree_skb(skb);

        /* actually it is something like this in my real code:
        kfree(*pskb);
        return NF_STOLEN;
        */
        return NF_ACCEPT;
}

static struct nf_hook_ops pre =
{
        {NULL, NULL},           /* link list */
        prehook,                /* the hook fn */
        PF_INET,                /* protocol family */
        NF_IP_PRE_ROUTING,      /* hook no */
        NF_IP_PRI_CONNTRACK     /* priority */
};


static int init_module(void) {
        /* propagate the registration result instead of falling off the end */
        return nf_register_hook(&pre);
}

static void cleanup_module(void) {
        nf_unregister_hook(&pre);
}


2001-04-10 06:07:44

by Andi Kleen

Subject: Re: skb allocation problems

On Mon, Apr 09, 2001 at 07:03:46PM +0300, [email protected] wrote:
> I have written a test module which closely mirrors what my code tries to
> do(attached below). This is what i get:
>
> PRE_R: old skb:c371ee40 new skb:c371ee30

I guess oldskb->len is <=0xc, and the slab allocator packs them near together
in the same zone.

> printk("\nPRE_R: old skb:%p", (*pskb)->data);
>
> /* translation happens in the real code here */
>
> skb = alloc_skb((*pskb)->len, GFP_ATOMIC);
> if(!skb)
> printk("alloc failed");

I guess you want a return here.

> skb_reserve(skb, 16);

You cannot do that if you didn't make sure that the old skb had enough
room for it (or rather it'll sometimes panic)
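
One way to read the remark above: a buffer allocated with exactly (*pskb)->len bytes and then given 16 bytes of headroom has only len - 16 bytes of tailroom left, so a later skb_put() of the full length cannot fit. The following is a toy userspace model of the head/data/tail/end bookkeeping, a sketch rather than the kernel implementation. All toy_* names are hypothetical, and the model ignores any rounding up of the size that the real alloc_skb may do for alignment.

```c
#include <stddef.h>

/* Toy model of the four sk_buff buffer pointers, kept as offsets. */
struct toy_skb {
    size_t head; /* start of the buffer (always 0 here) */
    size_t data; /* start of the packet data */
    size_t tail; /* end of the packet data */
    size_t end;  /* end of the buffer */
};

/* Empty buffer of `size` bytes: data and tail both sit at head. */
static struct toy_skb toy_alloc(size_t size)
{
    struct toy_skb s = { 0, 0, 0, size };
    return s;
}

static size_t toy_headroom(const struct toy_skb *s) { return s->data - s->head; }
static size_t toy_tailroom(const struct toy_skb *s) { return s->end - s->tail; }

/* Models skb_reserve(): shifts data and tail forward on an empty buffer. */
static void toy_reserve(struct toy_skb *s, size_t len)
{
    s->data += len;
    s->tail += len;
}

/* Models skb_put(), but returns -1 where the kernel would panic. */
static int toy_put(struct toy_skb *s, size_t len)
{
    if (len > toy_tailroom(s))
        return -1;
    s->tail += len;
    return 0;
}
```

In this model, toy_alloc(100) followed by toy_reserve(&s, 16) leaves 84 bytes of tailroom, so toy_put(&s, 100) fails. Allocating 100 + 16 bytes up front makes the same sequence succeed with 16 bytes of headroom and no tailroom left over.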


-Andi

2001-04-10 16:27:56

by Imran.Patel

Subject: RE: skb allocation problems

> On Mon, Apr 09, 2001 at 07:03:46PM +0300, [email protected] wrote:
> > I have written a test module which closely mirrors what my code tries to
> > do(attached below). This is what i get:
> >
> > PRE_R: old skb:c371ee40 new skb:c371ee30
>
> I guess oldskb->len is <=0xc, and the slab allocator packs them near together
> in the same zone.

Nope. I have checked it; the length of the old skb is perfectly OK. I even
found that this weird behaviour happens only when the old skb buffer length
is between 80 and 224 bytes.

Apart from this problem, I also found that when I send a packet >= 186 bytes,
the kernel allocates a lot more memory for the skb. The output from a simple
test module is below:

PRE_R: skb: c7aa0da0 skblen: 185 headroom: 32 tailroom: 7 total_len: 224
PRE_R: skb: c7a43020 skblen: 186 headroom: 32 tailroom: 1334 total_len: 1552

This was generated by sending IPv4 ping packets of different sizes. I
tested it using 2.4.1 and 2.4.3 on two machines (both dual stack).
What the hell's going on :( ???
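
The two PRE_R lines are at least internally consistent: in both, headroom + skblen + tailroom equals total_len. The following small userspace sketch checks that arithmetic. It assumes that total_len was computed as the full buffer size (end minus head), which the message does not state, and the struct and function names are hypothetical.

```c
#include <stddef.h>

/* One PRE_R report line; field names mirror the labels in the output. */
struct skb_report {
    size_t len;      /* skblen */
    size_t headroom;
    size_t tailroom;
    size_t total;    /* total_len, assumed to be end - head */
};

/* Figures copied from the two PRE_R lines above. */
static const struct skb_report reports[] = {
    { 185, 32,    7,  224 },
    { 186, 32, 1334, 1552 },
};

/* Nonzero if headroom + len + tailroom adds up to the reported total. */
static int report_is_consistent(const struct skb_report *r)
{
    return r->headroom + r->len + r->tailroom == r->total;
}
```

Both entries pass the check (32 + 185 + 7 = 224 and 32 + 186 + 1334 = 1552), so the jump between 185 and 186 bytes is in the size of the buffer that was allocated, not in the accounting.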

imran

2001-04-10 17:23:59

by Andi Kleen

Subject: Re: skb allocation problems

On Tue, Apr 10, 2001 at 07:27:29PM +0300, [email protected] wrote:
> > On Mon, Apr 09, 2001 at 07:03:46PM +0300, [email protected] wrote:
> > > I have written a test module which closely mirrors what my code tries to
> > > do(attached below). This is what i get:
> > >
> > > PRE_R: old skb:c371ee40 new skb:c371ee30
> >
> > I guess oldskb->len is <=0xc, and the slab allocator packs them near together
> > in the same zone.
>
> nope. i have checked it, the length of the older skb is perfectly ok.....and
> i even found that this weird behaviour happens only when the old skb buffer
> length is between 80 and 224 bytes.

Well, I don't know then. You have to debug it. It's probably something stupid
(if fundamental services like alloc_skb/kfree_skb were completely buggy
someone surely would have noticed earlier)

-Andi