2001-11-27 11:45:49

by Rolf Fokkens

Subject: [BUG] vanilla 2.4.15 iptables/REDIRECT kernel oops

I got another kernel oops related to iptables/REDIRECT; this time it's on a plain vanilla 2.4.15 kernel.

iptables -t nat -I PREROUTING -p tcp --dst 145.66.1.1 \
--dport 1080 -j REDIRECT --to 80 > /dev/null 2>&1
iptables -t nat -I OUTPUT -p tcp --dst 145.66.1.1 \
--dport 1080 -j REDIRECT --to 80 > /dev/null 2>&1

This redirects connections to the machine itself (145.66.1.1) on port 1080 to port
80, which is where apache listens. Connecting to http://145.66.1.1:1080/ seems
to trigger the oops below, as reported before. It's very reproducible
here, on several machines. So enjoy!

Rolf

[fokkensr@iasdev fokkensr]$ ksymoops -k ksyms-2.4.15 -i -m /boot/System.map-2.4.15 -o /lib/modules/2.4.15/ < oops-2.4.15.txt

ksymoops 2.4.3 on i686 2.4.8-clk. Options used
-V (default)
-k ksyms-2.4.15 (specified)
-l /proc/modules (default)
-o /lib/modules/2.4.15/ (specified)
-m /boot/System.map-2.4.15 (specified)
-i

Unable to handle kernel NULL pointer dereference at virtual address 0000034e
c0200cb5
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<c0200cb5>] not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010246
eax: 000005dc ebx: f76b0ea0 ecx: f76be800 edx: 00000000
esi: f76b30ac edi: 00000000 ebp: f72ab224 esp: f7193d98
ds: 0010 es: 0018 ss: 0018
Stack: 00000001 f7192000 00000003 00000000 f76be800 c01f102d f76b0ea0 f76bc0ac
f72fa6d4 f76b0130 f76bc0c0 f7193dd8 c034bed8 c01ffb46 00000002 00000003
f76b0ea0 00000000 f76be800 c0200c14 0000003c f72fa6d4 f76b0130 f76bc0c0
Call Trace: <c01f102d> <c01ffb46> <c0200c14> <c0198c23> <c0216179>
<c0210238> <c02107f2> <c0210b23> <c020617f> <c0223acc> <c0223b06>
<c01e115d> <c0223acc> <c01e137b> <c01437ec> <c01076a3>
Code: 8a 87 4c 03 00 00 3c 02 74 0a 3c 01 75 0b f6 45 20 04 75 05

>>EIP; c0200cb4 <ip_queue_xmit2+a0/22c> <=====
Trace; c01f102c <nf_hook_slow+19c/21c>
Trace; c01ffb46 <ip_queue_xmit+28e/300>
Trace; c0200c14 <ip_queue_xmit2+0/22c>
Trace; c0198c22 <generic_unplug_device+62/b0>
Trace; c0216178 <tcp_v4_send_check+6c/ac>
Trace; c0210238 <tcp_cwnd_restart+18/a4>
Trace; c02107f2 <tcp_transmit_skb+52e/5ec>
Trace; c0210b22 <tcp_push_one+8a/114>
Trace; c020617e <tcp_sendmsg+b3a/1370>
Trace; c0223acc <inet_sendmsg+0/40>
Trace; c0223b06 <inet_sendmsg+3a/40>
Trace; c01e115c <sock_sendmsg+80/a4>
Trace; c0223acc <inet_sendmsg+0/40>
Trace; c01e137a <sock_write+a2/ac>
Trace; c01437ec <sys_write+8c/c4>
Trace; c01076a2 <system_call+2e/34>
Code; c0200cb4 <ip_queue_xmit2+a0/22c>
00000000 <_EIP>:
Code; c0200cb4 <ip_queue_xmit2+a0/22c> <=====
0: 8a 87 4c 03 00 00 mov 0x34c(%edi),%al <=====
Code; c0200cba <ip_queue_xmit2+a6/22c>
6: 3c 02 cmp $0x2,%al
Code; c0200cbc <ip_queue_xmit2+a8/22c>
8: 74 0a je 14 <_EIP+0x14> c0200cc8 <ip_queue_xmit2+b4/22c>
Code; c0200cbe <ip_queue_xmit2+aa/22c>
a: 3c 01 cmp $0x1,%al
Code; c0200cc0 <ip_queue_xmit2+ac/22c>
c: 75 0b jne 19 <_EIP+0x19> c0200ccc <ip_queue_xmit2+b8/22c>
Code; c0200cc2 <ip_queue_xmit2+ae/22c>
e: f6 45 20 04 testb $0x4,0x20(%ebp)
Code; c0200cc6 <ip_queue_xmit2+b2/22c>
12: 75 05 jne 19 <_EIP+0x19> c0200ccc <ip_queue_xmit2+b8/22c>


2001-11-27 15:04:55

by Martin Josefsson

Subject: Re: [BUG] vanilla 2.4.15 iptables/REDIRECT kernel oops

On Tue, 27 Nov 2001, Rolf Fokkens wrote:

I've forwarded this report to the place where it should be reported, the
[email protected] mailing list, as I haven't seen this report
there.

/Martin

> I got another kernel oops related to iptables/REDIRECT, this time it's a plain vanilla kernel 2.4.15

2001-11-27 23:03:54

by Rolf Fokkens

Subject: Re: [BUG] vanilla 2.4.15 iptables/REDIRECT kernel oops

OK, there hasn't been much response to this bug report, so I did a little
investigating myself.

Using objdump, I get the impression that the oops occurs where
ip_dont_fragment is called (it is expanded inline). To be more specific: sk
seems to be NULL.

This is rather weird. ip_queue_xmit2 is also the caller of nf_hook_slow (via
ip_queue_xmit) at which time skb->sk must be OK. After nf_hook_slow things
suddenly are wrong.
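
For readers following the register dump, the arithmetic behind the small fault address can be sketched in C. The oops shows edi = 00000000 and a faulting instruction mov 0x34c(%edi),%al, so the load goes to a NULL base plus a field offset. The struct below is a hypothetical stand-in, padded only so that its pmtudisc byte sits at offset 0x34c; it is not the real 2.4.15 struct sock layout, and pmtudisc_fault_address is a name made up for this sketch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for struct sock. The padding is chosen only so that
 * the pmtudisc byte lands at offset 0x34c, matching the displacement in
 * "mov 0x34c(%edi),%al". The real 2.4.15 layout is not reproduced here. */
struct mock_sock {
    unsigned char pad[0x34c];
    unsigned char pmtudisc;
};

/* Address the CPU would try to read for base->pmtudisc. It is computed with
 * integer arithmetic, so nothing is actually dereferenced. */
uintptr_t pmtudisc_fault_address(uintptr_t base)
{
    return base + offsetof(struct mock_sock, pmtudisc);
}
```

With a base of 0 this returns 0x34c, an address inside the first page. That is why a NULL sk would show up as a "NULL pointer dereference" at a small nonzero address rather than at address 0.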

For those interested, I can send a disassembly listing with matching C lines.

On Tuesday 27 November 2001 07:02, Martin Josefsson wrote:
> On Tue, 27 Nov 2001, Rolf Fokkens wrote:
>
> I've forwarded this report to the place where it should be reported, the
> [email protected] mailinglist as I havn't seen this report
> there.

I haven't had much response from either list, so I am reporting this to both.

2001-11-28 14:13:31

by Stephan von Krawczynski

Subject: Re: [BUG] vanilla 2.4.15 iptables/REDIRECT kernel oops

On Tue, 27 Nov 2001 23:55:53 -0800
Rolf Fokkens <[email protected]> wrote:

> OK. So there hasn't been much response for this BUG report. So I did a little
> investigating myself.
>
> By use of objdump I get the impression that the Oops shows up when
> ip_dont_fragment is called (expanded). To be more specific: sk seems to be
> NULL.
>
> This is rather weird. ip_queue_xmit2 is also the caller of nf_hook_slow (via
> ip_queue_xmit) at which time skb->sk must be OK. After nf_hook_slow things
> suddenly are wrong.

Well, I have another impression. Look at this code from ip_queue_xmit2:

    if (skb_headroom(skb) < dev->hard_header_len && dev->hard_header) {
        struct sk_buff *skb2;

        skb2 = skb_realloc_headroom(skb, (dev->hard_header_len + 15) & ~15);
        kfree_skb(skb);
        if (skb2 == NULL)
            return -ENOMEM;
        if (sk)
            skb_set_owner_w(skb2, sk);
        skb = skb2;
        iph = skb->nh.iph;
    }

What you see here is that the original author obviously thought it possible
that sk is NULL. Otherwise he would not have put the "if (sk)" in. Let us
trust that he knew what he was doing, and read on.

    if (skb->len > rt->u.dst.pmtu)
        goto fragment;

    if (ip_dont_fragment(sk, &rt->u.dst))
        iph->frag_off |= __constant_htons(IP_DF);

Shit. You have it.

static inline
int ip_dont_fragment(struct sock *sk, struct dst_entry *dst)
{
    return (sk->protinfo.af_inet.pmtudisc == IP_PMTUDISC_DO ||
            (sk->protinfo.af_inet.pmtudisc == IP_PMTUDISC_WANT &&
             !(dst->mxlock&(1<<RTAX_MTU))));
}

Obviously this breaks in case of sk == NULL.
This leaves you with the following alternatives:

1) He should do
    if (sk && ip_dont_fragment(sk, &rt->u.dst))

or

2) ip_dont_fragment is in itself broken, as it is obviously called with
sk==NULL and should handle that.
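
Alternative 2 can be tried outside the kernel with a small user-space sketch. The struct mock_sock and struct mock_dst types and the name ip_dont_fragment_checked are hypothetical stand-ins, not kernel API, and the constant values are assumptions (they agree with the cmp $0x2, cmp $0x1 and testb $0x4 in the disassembly, but check them against your own headers). Returning 0 for a NULL sk means "do not set DF" and is a policy choice of this sketch.

```c
#include <stddef.h>

/* Assumed values, for illustration only. */
enum { IP_PMTUDISC_WANT = 1, IP_PMTUDISC_DO = 2 };
enum { RTAX_MTU = 2 };

/* Hypothetical, flattened stand-ins for the kernel structures. */
struct mock_sock { unsigned char pmtudisc; };
struct mock_dst  { unsigned int mxlock; };

/* Same decision as ip_dont_fragment, but a NULL socket is handled first
 * instead of being dereferenced. */
int ip_dont_fragment_checked(const struct mock_sock *sk,
                             const struct mock_dst *dst)
{
    if (sk == NULL)
        return 0;   /* no socket: leave the DF bit alone */

    return sk->pmtudisc == IP_PMTUDISC_DO ||
           (sk->pmtudisc == IP_PMTUDISC_WANT &&
            !(dst->mxlock & (1u << RTAX_MTU)));
}
```

Calling ip_dont_fragment_checked(NULL, &dst) returns 0 instead of faulting, while a socket set to IP_PMTUDISC_DO still returns 1. Whether dropping DF for socket-less packets is the right behaviour is a separate question for the IP stack maintainers.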

Anyway, this is a serious bug, because the code is definitely inconsistent.
ip_dont_fragment shows up 6 times in the source (according to my counting). The
safe patch would look like this:

--- ip.h-orig Wed Nov 28 14:55:50 2001
+++ ip.h Wed Nov 28 14:56:25 2001
@@ -181,9 +181,9 @@
 static inline
 int ip_dont_fragment(struct sock *sk, struct dst_entry *dst)
 {
-    return (sk->protinfo.af_inet.pmtudisc == IP_PMTUDISC_DO ||
+    return (sk && ( sk->protinfo.af_inet.pmtudisc == IP_PMTUDISC_DO ||
         (sk->protinfo.af_inet.pmtudisc == IP_PMTUDISC_WANT &&
-         !(dst->mxlock&(1<<RTAX_MTU))));
+         !(dst->mxlock&(1<<RTAX_MTU)))));
 }

extern void __ip_select_ident(struct iphdr *iph, struct dst_entry *dst);



Can you check it out?

Can any ip-stack guru comment?

Regards,
Stephan

PS: ip_select_ident looks clean in terms of sk==NULL, BTW.

2001-11-30 10:57:34

by Harald Welte

Subject: [RESOLVED] [BUG] vanilla 2.4.15 iptables/REDIRECT kernel oops

On Tue, Nov 27, 2001 at 11:55:53PM -0800, Rolf Fokkens wrote:
> OK. So there hasn't been much response for this BUG report. So I did a little
> investigating myself.

The bug is only known to exist in the 2.4.15-pre series, not in the final 2.4.15.

Anyway, 2.4.16 is certainly (we just confirmed that by testing) not affected by
this bug, since the problematic patch has been pulled out again.

--
Live long and prosper
- Harald Welte / [email protected] http://www.gnumonks.org/
============================================================================
GCS/E/IT d- s-: a-- C+++ UL++++$ P+++ L++++$ E--- W- N++ o? K- w--- O- M-
V-- PS+ PE-- Y+ PGP++ t++ 5-- !X !R tv-- b+++ DI? !D G+ e* h+ r% y+(*)

2001-12-06 05:57:32

by Rusty Russell

Subject: Re: [BUG] vanilla 2.4.15 iptables/REDIRECT kernel oops

On Tue, 27 Nov 2001 12:43:09 +0100
Rolf Fokkens <[email protected]> wrote:

> I got another kernel oops related to iptables/REDIRECT, this time it's a plain vanilla kernel 2.4.15

Hi Rolf!

This bug does not exist in the final 2.4.15: it was my mistake in
the -pre kernels, and was removed for 2.4.15. Please check again.

Hope that helps,
Rusty.