2000-12-19 14:35:04

by Tom Leete

[permalink] [raw]
Subject: Re: ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter locking)

Harald Welte wrote:
>
> On Mon, Dec 18, 2000 at 10:11:14AM -0800, David S. Miller wrote:
> > From: Rusty Russell <[email protected]>
> > Date: Mon, 18 Dec 2000 14:15:52 +1100
> >
> > Alexey is right, locking is screwed (explains some reports of
> > occasional failure during rmmod).
> >
> > Patch applied, thank you.
> >
> > I have a patch which compiles for non-linear skb mods to netfilter
> > (NAT uses linear packets still, but connection tracking and packet
> > filtering only linearize minimal requirements). Waiting for
> > DaveM's solution to ip_ct_gather_frags()...
> >
> > I feel it's best to just skb_clone() the skb arg to ip_defrag
> > and this will close the whole thing, I think.
>
> no. The clone()d skb will still have a skb->dev pointing to NULL.
>
> The problem occurs only for locally-generated outgoing packets, which
> need to be fragmented:
>
> - ip_build_xmit() discoveres it has to fragment
> - ip_build_xmit_slow() generates fragments and calls
> - NF_HOOK() for NF_IP_LOCAL_OUT
> - ip_conntrack_local() is called, which in turn calls
> - ip_conntrack_in(), which calls
> - ip_ct_gather_frags(), which calls
> - ip_defrag(), which calls
>
> [now we have two possible oops - caues]
>
> a ip_find(), which calls
> a ip_frag_create(), which initialises a timer with the function
> a ip_expire(), which dereferences qp->iif
>
> b ip_frag_queue(), which dereferences it in qp->iif= skb->dev->ifindex
>
> as andi kleen pointed out:
>
> > > Also is it sure that the backtrace involves ip_rcv ? A more likely
> > > guess is that it happens during the IP_LOCAL_OUT hook, when skb->dev
> > > isn't set yet, but conntrack already has to already reassemble fragments.
>
>
> > Actually, I do not understand how current code could even have worked
> > in the past. Once the SKB is passed to ip_defrag, it is nobody's
> > buisness to reference that SKB anymore. This ip_defrag call is (from
>
> mmh... we really don't do this. We use the return value of ip_defrag(),
> which is what ip_frag_reasm() returns (== the new datagram consisting
> out of all its fragments).
>
> > Alexey, what have I missed? I don't like the ip_fragment.c proposed
> > fix for this reason, what netfilter is doing with ip_defrag here looks
> > just wrong.
>
> Well, my conclusion is:
>
> - the defragmentation code in the ipv4 stack assumes that skb->dev points
> to a valid device. It does this primaryly to send icmp reassembly errors
> if a fragment reassembly timeout occurs
>
> - netfilter wants to use the ip_defrag for defragmentation, not only for
> incoming packets from the network at NF_IP_PRE_ROUTING, but also for
> locally-generated fragmented packets (at NF_IP_LOCAL_OUT). Unfortunately
> we don't have a valid skb->dev at this point.
>
> I don't think that there is any skb_clone()ing needed, nor this would solve
> the problem. The solution is
>
> a) netfilter conntrack (more exactly: ip_ct_gather_frags) sets skb->dev to
> something valid (skb->dst->dev). Dirty hack.
>
> b) make ip_defrag(), ... aware of the case where skb->dev == NULL. Sounds
> like a good idea, since it is only one if(skb->dev) clause.
>
> c) netfilter stops using ip_defrag() for this case. Bad idea, it had to
> reinvent the wheel :(
>
> - Harald Welte / [email protected] http://www.gnumonks.org

I'm now testing this small patch based on your suggestion b). I have
netfilter running, nfs mounts are behaving well, and fragmented pings dont
kill the box. I made no attempt to resolve anything but the derefernce of a
netdevice *dev.= NULL

The patch only deals with the ip_defrag_queue path. I have not seen the
alternate one happen. It's only been up an hour, I'll put some more time on
it.

Cheers,
Tom

--- linux/net/ipv4/ip_fragment.c~ Tue Dec 12 06:56:29 2000
+++ linux/net/ipv4/ip_fragment.c Tue Dec 19 07:29:53 2000
@@ -485,7 +485,8 @@
else
qp->fragments = skb;

- qp->iif = skb->dev->ifindex;
+ if (skb->dev)
+ qp->iif = skb->dev->ifindex;
skb->dev = NULL;
qp->meat += skb->len;
atomic_add(skb->truesize, &ip_frag_mem);


2000-12-19 14:59:52

by David Miller

[permalink] [raw]
Subject: Re: ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter locking)

Date: Tue, 19 Dec 2000 08:55:35 -0500
From: Tom Leete <[email protected]>

The patch only deals with the ip_defrag_queue path. I have not seen the
alternate one happen. It's only been up an hour, I'll put some more time on
it.

--- linux/net/ipv4/ip_fragment.c~ Tue Dec 12 06:56:29 2000
+++ linux/net/ipv4/ip_fragment.c Tue Dec 19 07:29:53 2000
@@ -485,7 +485,8 @@

This is basically what is in my tree right now. However, there was
one reporter who claimed that after this kind of change he still was
able to lockup/OOPS his machine by logging into X as a user who had
his home directory over NFS. This was with netfilter enabled as well
so it has to be the same bug.

Later,
David S. Miller
[email protected]

2000-12-19 15:26:15

by Alexey Kuznetsov

[permalink] [raw]
Subject: Re: ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter

Hello!

> able to lockup/OOPS his machine by logging into X as a user who had
> his home directory over NFS.

I believe this report is to be ignored. It is fully meaningless.
X has nothing to do with NFS, NFS is with X, and defragmenter is
at least with one of them.

Alexey

2000-12-20 02:16:17

by Mohammad A. Haque

[permalink] [raw]
Subject: Re: ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter

how is this meaningless?

This just confirms what I and others have found in test12 wrt to the
netfilter issue.

[email protected] wrote:
>
> Hello!
>
> > able to lockup/OOPS his machine by logging into X as a user who had
> > his home directory over NFS.
>
> I believe this report is to be ignored. It is fully meaningless.
> X has nothing to do with NFS, NFS is with X, and defragmenter is
> at least with one of them.

--

=====================================================================
Mohammad A. Haque http://www.haque.net/
[email protected]

"Alcohol and calculus don't mix. Project Lead
Don't drink and derive." --Unknown http://wm.themes.org/
[email protected]
=====================================================================

2000-12-20 02:22:40

by David Miller

[permalink] [raw]
Subject: Re: ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter

Date: Tue, 19 Dec 2000 20:45:14 -0500
From: "Mohammad A. Haque" <[email protected]>

how is this meaningless?

This just confirms what I and others have found in test12 wrt to the
netfilter issue.

Alexey is talking about test12/netfilter + our ip_fragment.c fix,
not vanilla test12.

Later,
David S. Miller
[email protected]

2000-12-20 03:18:08

by Mohammad A. Haque

[permalink] [raw]
Subject: Re: ip_defrag / ip_conntrack issues (was Re: [PATCH] Fix netfilter

Woops, my bad. These finally exams are getting to me.

Sorry.

"David S. Miller" wrote:
> Alexey is talking about test12/netfilter + our ip_fragment.c fix,
> not vanilla test12.

--

=====================================================================
Mohammad A. Haque http://www.haque.net/
[email protected]

"Alcohol and calculus don't mix. Project Lead
Don't drink and derive." --Unknown http://wm.themes.org/
[email protected]
=====================================================================