2009-11-03 17:50:19

by Valdis Klētnieks

Subject: 2.6.32-rc5-mmotm1101 - kernel BUG at net/ipv4/tcp_input.c:3707!

Seen right after I started 'fetchmail'. Reproducible - 3 out of 3.
I'll bisect this tonight if nobody jumps up and yells they know what it is...

Looking at the traceback, I wonder if we started sending the SYN packet,
but didn't finish the paperwork before the SYN/ACK came back?

[ 87.269743] ------------[ cut here ]------------
[ 87.270011] kernel BUG at net/ipv4/tcp_input.c:3707!
[ 87.270011] invalid opcode: 0000 [#1] PREEMPT SMP
[ 87.270011] last sysfs file: /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0C0A:00/power_supply/BAT0/charge_full
[ 87.270011] CPU 0
[ 87.270011] Modules linked in: ext4 jbd2 crc16 [last unloaded: microcode]
[ 87.270011] Pid: 2421, comm: fetchmail Not tainted 2.6.32-rc5-mmotm1101 #1 Latitude D820
[ 87.270011] RIP: 0010:[<ffffffff813d13c2>] [<ffffffff813d13c2>] tcp_parse_options+0x62/0x273
[ 87.270011] RSP: 0018:ffff880002603af8 EFLAGS: 00010202
[ 87.270011] RAX: 0000000000000001 RBX: ffff880002603b78 RCX: 000000000000000a
[ 87.270011] RDX: 0000000000000000 RSI: 0000000000000001 RDI: ffffffff81ad9fb0
[ 87.270011] RBP: ffff880002603b48 R08: ffff88007e2d9000 R09: 0000000000000001
[ 87.270011] R10: 00000000000006f6 R11: ffff880002603a78 R12: 0000000000000000
[ 87.270011] R13: ffff88007eb0ece8 R14: 0000000000000000 R15: ffff88007e693168
[ 87.270011] FS: 00007fa02ff827c0(0000) GS:ffff880002600000(0000) knlGS:0000000000000000
[ 87.270011] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 87.270011] CR2: 00000032cf7411e0 CR3: 000000007e41d000 CR4: 00000000000006f0
[ 87.270011] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 87.270011] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 87.270011] Process fetchmail (pid: 2421, threadinfo ffff880079234000, task ffff88007d2bc300)
[ 87.270011] Stack:
[ 87.270011] ffff880002603b48 ffffffff00000001 000000000000000a ffffffff00000000
[ 87.270011] <0> 0000000000000000 ffff88007903bbc0 ffff88007e693168 ffff88007c6eb2c0
[ 87.270011] <0> 0000000000000000 0000000000000000 ffff880002603bc8 ffffffff81430486
[ 87.270011] Call Trace:
[ 87.270011] <IRQ>
[ 87.270011] [<ffffffff81430486>] tcp_v6_conn_request+0x171/0x3cb
[ 87.270011] [<ffffffff813d3bf8>] tcp_rcv_state_process+0x5f/0x857
[ 87.270011] [<ffffffff8142ff4d>] tcp_v6_do_rcv+0x313/0x445
[ 87.270011] [<ffffffff814bd489>] ? _spin_lock_nested+0x42/0x49
[ 87.270011] [<ffffffff8143155b>] ? tcp_v6_rcv+0x2aa/0x61c
[ 87.270011] [<ffffffff814316f6>] tcp_v6_rcv+0x445/0x61c
[ 87.270011] [<ffffffff81414bb9>] ip6_input_finish+0x1bf/0x31f
[ 87.270011] [<ffffffff81414d66>] ip6_input+0x4d/0x54
[ 87.270011] [<ffffffff81414502>] ip6_rcv_finish+0x22/0x26
[ 87.270011] [<ffffffff81414964>] ipv6_rcv+0x45e/0x4b4
[ 87.270011] [<ffffffff81391662>] netif_receive_skb+0x29e/0x2c8
[ 87.270011] [<ffffffff81391717>] process_backlog+0x8b/0xc1
[ 87.270011] [<ffffffff81391e6a>] net_rx_action+0xed/0x2b0
[ 87.270011] [<ffffffff8108518a>] ? handle_edge_irq+0x16a/0x176
[ 87.270011] [<ffffffff81041c2a>] __do_softirq+0x127/0x23c
[ 87.270011] [<ffffffff813922f4>] ? rcu_read_unlock_bh+0x21/0x23
[ 87.270011] [<ffffffff8100347c>] call_softirq+0x1c/0x34
[ 87.270011] <EOI>
[ 87.270011] [<ffffffff810049cc>] do_softirq+0x44/0xf0
[ 87.270011] [<ffffffff813922f4>] ? rcu_read_unlock_bh+0x21/0x23
[ 87.270011] [<ffffffff810414c7>] _local_bh_enable_ip+0x120/0x16e
[ 87.270011] [<ffffffff8104152d>] local_bh_enable+0xd/0xf
[ 87.270011] [<ffffffff813922f4>] rcu_read_unlock_bh+0x21/0x23
[ 87.270011] [<ffffffff81392e10>] dev_queue_xmit+0x3e4/0x408
[ 87.270011] [<ffffffff81392b70>] ? dev_queue_xmit+0x144/0x408
[ 87.270011] [<ffffffff8139afa8>] neigh_resolve_output+0x1ef/0x240
[ 87.270011] [<ffffffff81410f50>] ? ip6_output_finish+0x0/0xfc
[ 87.270011] [<ffffffff81410fec>] ip6_output_finish+0x9c/0xfc
[ 87.270011] [<ffffffff8141242d>] ip6_output2+0x2bf/0x2c8
[ 87.270011] [<ffffffff81413107>] ip6_output+0xcd1/0xce6
[ 87.270011] [<ffffffff813ab63a>] ? rcu_read_unlock+0x21/0x23
[ 87.270011] [<ffffffff813ab89b>] ? nf_hook_slow+0xca/0xdb
[ 87.270011] [<ffffffff81410c9c>] ? dst_output+0x0/0xd
[ 87.270011] [<ffffffff81410ca7>] dst_output+0xb/0xd
[ 87.270011] [<ffffffff81413517>] ip6_xmit+0x3fb/0x4d4
[ 87.270011] [<ffffffff8143dba6>] ? __inet6_hash+0xe5/0x122
[ 87.270011] [<ffffffff814358b2>] inet6_csk_xmit+0x265/0x274
[ 87.270011] [<ffffffff811caa4c>] ? _raw_spin_lock+0xe9/0x1ab
[ 87.270011] [<ffffffff813d557d>] tcp_transmit_skb+0x816/0x85f
[ 87.270011] [<ffffffff813d6ed5>] tcp_connect+0x3ae/0x409
[ 87.270011] [<ffffffff8142f74f>] tcp_v6_connect+0x4f0/0x55e
[ 87.270011] [<ffffffff813e78d0>] inet_stream_connect+0xa0/0x268
[ 87.270011] [<ffffffff81383c71>] sys_connect+0x75/0x98
[ 87.270011] [<ffffffff810e2328>] ? path_put+0x1d/0x22
[ 87.270011] [<ffffffff81066193>] ? trace_hardirqs_on_caller+0x16/0x13c
[ 87.270011] [<ffffffff8107fdf1>] ? audit_syscall_entry+0xcb/0x19c
[ 87.270011] [<ffffffff8100246b>] system_call_fastpath+0x16/0x1b
[ 87.270011] Code: e9 04 4d 85 f6 88 4d c0 0f 94 c2 31 c0 45 85 e4 0f 94 c0 21 d0 31 d2 89 c6 89 45 b8 e8 70 6a cc ff 8b 45 b8 8a 4d c0 85 c0 74 04 <0f> 0b eb fe 0f b6 d1 49 83 c7 38 8d 14 95 ec ff ff ff 49 8d 45
[ 87.270011] RIP [<ffffffff813d13c2>] tcp_parse_options+0x62/0x273
[ 87.270011] RSP <ffff880002603af8>



2009-11-03 18:03:10

by Eric Dumazet

Subject: Re: 2.6.32-rc5-mmotm1101 - kernel BUG at net/ipv4/tcp_input.c:3707!

[email protected] wrote:
> Seen right after I started 'fetchmail'. Reproducible - 3 out of 3.
> I'll bisect this tonight if nobody jumps up and yells they know what it is...
>
> Looking at the traceback, I wonder if we started sending the SYN packet,
> but didn't finish the paperwork before the SYN/ACK came back?
>


BUG_ON(!estab && !dst);

Probably comes from commit 022c3f7d82f0f1c68018696f2f027b87b9bb45c2

(Allow tcp_parse_options to consult dst entry)

CC Gilad Ben-Yossef <[email protected]> and Ori Finkelman <[email protected]> for a diagnostic
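
The condition in the assertion quoted above can be modelled in userspace. This is only a minimal sketch: `struct dst_entry` here is an empty stand-in and `parse_would_bug()` is a hypothetical helper, neither is kernel code. It shows that the check trips exactly when options are parsed for a connection that is not yet established and no route entry is passed in.

```c
#include <stddef.h>

/* Stand-in for the kernel's route cache entry; only NULL vs. non-NULL matters. */
struct dst_entry {
	int unused;
};

/*
 * Hypothetical helper mirroring the condition in BUG_ON(!estab && !dst).
 * Returns 1 when the assertion would fire, 0 otherwise.
 */
static int parse_would_bug(int estab, const struct dst_entry *dst)
{
	return !estab && dst == NULL;
}
```

Under these assumptions, `parse_would_bug(0, NULL)` is the only combination that fires, which matches a SYN being parsed on a path that never looked up a route.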



2009-11-03 19:44:08

by Gilad Ben-Yossef

Subject: Re: 2.6.32-rc5-mmotm1101 - kernel BUG at net/ipv4/tcp_input.c:3707!

Eric Dumazet wrote:

> [email protected] wrote:
>
>> Seen right after I started 'fetchmail'. Reproducible - 3 out of 3.
>> I'll bisect this tonight if nobody jumps up and yells they know what it is...
>>
Bah... this is most probably my fault. Sorry about that.

Can you please try the patch in the next email?

But also, can you please send me the route table in effect when this
happened and the fetchmail command line/config (removing any passwords
or account details of course)? I want to understand better when this
happens.

Thanks!
Gilad


--
Gilad Ben-Yossef
Chief Coffee Drinker & CTO
Codefidence Ltd.

Web: http://codefidence.com
Cell: +972-52-8260388
Skype: gilad_codefidence
Tel: +972-8-9316883 ext. 201
Fax: +972-8-9316884
Email: [email protected]

Check out our Open Source technology and training blog - http://tuxology.net

"The biggest risk you can take it is to take no risk."
-- Mark Zuckerberg and probably others

2009-11-03 19:44:06

by Gilad Ben-Yossef

Subject: [PATCH 1/1] Use defaults when no route options are available

Parsing the options of a SYN packet for which we have
no route entry should just use the global defaults
for route entry options.

Signed-off-by: Gilad Ben-Yossef <[email protected]>
---
include/net/dst.h | 2 +-
net/ipv4/tcp_input.c | 2 --
2 files changed, 1 insertions(+), 3 deletions(-)

diff --git a/include/net/dst.h b/include/net/dst.h
index b562be3..0654c27 100644
--- a/include/net/dst.h
+++ b/include/net/dst.h
@@ -114,7 +114,7 @@ dst_metric(const struct dst_entry *dst, int metric)
 static inline u32
 dst_feature(const struct dst_entry *dst, u32 feature)
 {
-	return dst_metric(dst, RTAX_FEATURES) & feature;
+	return (dst ? dst_metric(dst, RTAX_FEATURES) & feature : 0);
 }

 static inline u32 dst_mtu(const struct dst_entry *dst)
diff --git a/net/ipv4/tcp_input.c b/net/ipv4/tcp_input.c
index 4262da5..57e99e1 100644
--- a/net/ipv4/tcp_input.c
+++ b/net/ipv4/tcp_input.c
@@ -3704,8 +3704,6 @@ void tcp_parse_options(struct sk_buff *skb, struct tcp_options_received *opt_rx,
 	struct tcphdr *th = tcp_hdr(skb);
 	int length = (th->doff * 4) - sizeof(struct tcphdr);

-	BUG_ON(!estab && !dst);
-
 	ptr = (unsigned char *)(th + 1);
 	opt_rx->saw_tstamp = 0;

--
1.5.6.3
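
The NULL-tolerant accessor in the patch above can be exercised outside the kernel. The following is a sketch under stated assumptions: `struct mock_dst`, `mock_dst_metric()`, `mock_dst_feature()` and the `MOCK_*` constants are hypothetical stand-ins written for illustration, not the kernel definitions. It shows that a missing route entry simply yields "no per-route feature bits set" instead of a NULL dereference.

```c
#include <stddef.h>
#include <stdint.h>

#define MOCK_RTAX_MAX      16	/* hypothetical size of the metrics array */
#define MOCK_RTAX_FEATURES 12	/* hypothetical 1-based index of the feature word */

/* Stand-in for a route entry that carries per-route metrics. */
struct mock_dst {
	uint32_t metrics[MOCK_RTAX_MAX];
};

/* Metrics are addressed with 1-based indices in this sketch. */
static uint32_t mock_dst_metric(const struct mock_dst *dst, int metric)
{
	return dst->metrics[metric - 1];
}

/* NULL-safe variant: no route entry means no feature bits are reported. */
static uint32_t mock_dst_feature(const struct mock_dst *dst, uint32_t feature)
{
	return dst ? mock_dst_metric(dst, MOCK_RTAX_FEATURES) & feature : 0;
}
```

With this shape, callers that have no route entry fall back to the global defaults without needing their own NULL checks.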

2009-11-03 21:34:17

by Ilpo Järvinen

Subject: Re: 2.6.32-rc5-mmotm1101 - kernel BUG at net/ipv4/tcp_input.c:3707!

On Tue, 3 Nov 2009, Gilad Ben-Yossef wrote:

> Eric Dumazet wrote:
>
> > [email protected] wrote:
> >
> > > Seen right after I started 'fetchmail'. Reproducible - 3 out of 3.
> > > I'll bisect this tonight if nobody jumps up and yells they know what it
> > > is...
> > >
> Bah... this is most probably my fault. Sorry about that.
>
> Can you please try the patch in the next email?
>
> But also, can you please send me the route table in effect when this happened
> and the fetchmail command line/config (removing any passwords or account
> details of course)? I want to understand better when this happens.

According to the stack trace, it came from the IPv6 side, which doesn't have
any NULL checking whatsoever at the moment (you only handled IPv4 correctly).
You should be a bit more careful next time when adding any BUG_ONs...

--
i.

2009-11-04 02:01:51

by Valdis Klētnieks

Subject: Re: 2.6.32-rc5-mmotm1101 - kernel BUG at net/ipv4/tcp_input.c:3707!

On Tue, 03 Nov 2009 23:34:19 +0200, Ilpo Järvinen said:

> On Tue, 3 Nov 2009, Gilad Ben-Yossef wrote:
>
> > Eric Dumazet wrote:
> >
> > > [email protected] wrote:
> > >
> > > > Seen right after I started 'fetchmail'. Reproducible - 3 out of 3.
> > > > I'll bisect this tonight if nobody jumps up and yells they know what it
> > > > is...
> > > >
> > Bah... this is most probably my fault. Sorry about that.
> >
> > Can you please try the patch in the next email?

Tried while at home, machine panic'ed. No netconsole here at the moment, sorry.

> > But also, can you please send me the route table in effect when this happened
> > and the fetchmail command line/config (removing any passwords or account
> > details of course)? I want to understand better when this happens.

% route -n -A inet
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
128.173.12.0    0.0.0.0         255.255.252.0   U     0      0        0 eth0
0.0.0.0         128.173.12.1    0.0.0.0         UG    0      0        0 eth0
% route -n -A inet6
Kernel IPv6 routing table
Destination                               Next Hop                  Flags Metric Ref   Use Iface
::1/128                                   ::                        U     0      687   1   lo
2001:468:c80:2103:215:c5ff:fec8:334e/128  ::                        U     0      53533 1   lo
2001:468:c80:2103::/64                    ::                        UA    256    316   0   eth0
fe80::215:c5ff:fec8:334e/128              ::                        U     0      176   1   lo
fe80::218:deff:fe9c:24e0/128              ::                        U     0      0     1   lo
fe80::/64                                 ::                        U     256    0     0   eth0
fe80::/64                                 ::                        U     256    0     0   wlan0
ff02::1/128                               ff02::1                   UC    0      451   0   eth0
ff00::/8                                  ::                        U     256    0     0   eth0
ff00::/8                                  ::                        U     256    0     0   wlan0
::/0                                      fe80::20f:35ff:fe3e:d41a  UGDA  1024   2082  1   eth0

Command line was just 'fetchmail'. Relevant .fetchmailrc:

set postmaster "valdis"
set bouncemail
set no spambounce
set properties ""
set daemon 240
poll imap.vt.edu with proto IMAP and options
user 'valdis' there with password 'redacted' is 'valdis' here ssl fetchsizelimit 0 smtpaddress turing-police.cc.vt.edu

(imap.vt.edu is 198.82.183.77 - so off the local subnet)

> According to the stacktrace, it came from ipv6 side which doesn't have any
> null checking what so ever atm (you only handled ipv4 correctly). ...You
> should be a bit more careful next time when adding any BUG_ONs...

Why was this blowing chunks in the IPv6 code when I was making an IPv4
connection, then? I just noticed that. My fetchmail can't make an IPv6 TCP
connection to our IMAP server because the server isn't v6-enabled yet. And
although I contact our DNS over IPv6, that's UDP, not TCP.



2009-11-04 02:34:11

by David Miller

Subject: Re: 2.6.32-rc5-mmotm1101 - kernel BUG at net/ipv4/tcp_input.c:3707!

From: [email protected]
Date: Tue, 03 Nov 2009 21:01:33 -0500

> Why was this blowing chunks in the IPv6 when I was making an IPv4 connection
> then? I just noticed that. My fetchmail can't make an IPv6 TCP connection to
> our IMAP server because the server isn't v6-enabled yet. And although I
> contact our DNS over IPv6, but that's UDP not TCP.

Many things IPV6 capable will go through the IPV4 compat
path of the IPV6 stack, depending upon how listening sockets
configure themselves etc.
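
One common way for IPv4 traffic to pass through IPv6 code is an AF_INET6 socket that is given an IPv4-mapped address (`::ffff:a.b.c.d`). Whether that is what happened in this report is not established by the thread; the sketch below only illustrates how such an address is recognised. The helper name `is_v4_mapped()` is hypothetical, while `inet_pton()` and `IN6_IS_ADDR_V4MAPPED()` are the standard socket API.

```c
#include <arpa/inet.h>
#include <netinet/in.h>

/*
 * Returns 1 if text is an IPv4-mapped IPv6 address, 0 if it is any other
 * valid IPv6 address, and -1 if it does not parse as IPv6 at all.
 */
static int is_v4_mapped(const char *text)
{
	struct in6_addr addr;

	if (inet_pton(AF_INET6, text, &addr) != 1)
		return -1;
	return IN6_IS_ADDR_V4MAPPED(&addr) ? 1 : 0;
}
```

For example, `is_v4_mapped("::ffff:198.82.183.77")` reports a mapped address, so a connection to it from an AF_INET6 socket would carry IPv4 packets on the wire.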

2009-11-04 06:27:23

by Gilad Ben-Yossef

Subject: Re: 2.6.32-rc5-mmotm1101 - kernel BUG at net/ipv4/tcp_input.c:3707!

Ilpo Järvinen wrote:

> On Tue, 3 Nov 2009, Gilad Ben-Yossef wrote:
>
>
>> Eric Dumazet wrote:
>>
>>
>>> [email protected] wrote:
>>>
>>>
>>>> Seen right after I started 'fetchmail'. Reproducible - 3 out of 3.
>>>> I'll bisect this tonight if nobody jumps up and yells they know what it
>>>> is...
>>>>
>>>>
>> Bah... this is most probably my fault. Sorry about that.
>>
>> Can you please try the patch in the next email?
>>
>> But also, can you please send me the route table in effect when this happened
>> and the fetchmail command line/config (removing any passwords or account
>> details of course)? I want to understand better when this happens.
>>
>
> According to the stacktrace, it came from ipv6 side which doesn't have any
> null checking what so ever atm (you only handled ipv4 correctly). ...You
> should be a bit more careful next time when adding any BUG_ONs...
>
I agree, but in my defense I should add that this was not just plain
carelessness: I believed that the dst_entry could not be NULL at that
location. That was obviously wrong. :-(

Gilad






2009-11-04 06:38:24

by Gilad Ben-Yossef

Subject: Re: 2.6.32-rc5-mmotm1101 - kernel BUG at net/ipv4/tcp_input.c:3707!

[email protected] wrote:

> On Tue, 03 Nov 2009 23:34:19 +0200, Ilpo Järvinen said:
>
>
>> On Tue, 3 Nov 2009, Gilad Ben-Yossef wrote:
>>
>>
>>> Eric Dumazet wrote:
>>>
>>>
>>>> [email protected] wrote:
>>>>
>>>>
>>>>> Seen right after I started 'fetchmail'. Reproducible - 3 out of 3.
>>>>> I'll bisect this tonight if nobody jumps up and yells they know what it
>>>>> is...
>>>>>
>>>>>
>>> Bah... this is most probably my fault. Sorry about that.
>>>
>>> Can you please try the patch in the next email?
>>>
>
> Tried while at home, machine panic'ed. No netconsole here at the moment, sorry.
>

OK, thanks. That is ... strange.

I didn't manage to reproduce this here with a simple IPv6 setup and netcat
as server and client, but I will keep trying. If there is any way to send me
the crash location, that would be a big help. Thanks.
>
>
> Why was this blowing chunks in the IPv6 when I was making an IPv4 connection
> then? I just noticed that. My fetchmail can't make an IPv6 TCP connection to
> our IMAP server because the server isn't v6-enabled yet. And although I
> contact our DNS over IPv6, but that's UDP not TCP.
>
>
I don't think the chunk blowing occurred due to the connection to the IMAP
server. That code deals with incoming SYNs. I guess it happened when
fetchmail tried to connect to the local mail daemon, and this should be
happening over the loopback interface...


Gilad



2009-11-04 13:27:53

by David Miller

Subject: Re: [PATCH 1/1] Use defaults when no route options are available

From: Gilad Ben-Yossef <[email protected]>
Date: Tue, 3 Nov 2009 21:21:25 +0200

> Trying to parse the option of a SYN packet that we have
> no route entry for should just use global wide defaults
> for route entry options.
>
> Signed-off-by: Gilad Ben-Yossef <[email protected]>

The tester has indicated that this doesn't solve things
for them. I suspect there is another dependency on 'dst'
not being NULL in another path somewhere.

So until this is fully resolved I'm holding off on applying
this patch.

2009-11-04 14:27:18

by Gilad Ben-Yossef

Subject: Re: [PATCH 1/1] Use defaults when no route options are available

David Miller wrote:

> From: Gilad Ben-Yossef <[email protected]>
> Date: Tue, 3 Nov 2009 21:21:25 +0200
>
>
>> Trying to parse the option of a SYN packet that we have
>> no route entry for should just use global wide defaults
>> for route entry options.
>>
>> Signed-off-by: Gilad Ben-Yossef <[email protected]>
>>
>
> The tester has indicated that this doesn't solve things
> for them. I suspect there is another dependency on 'dst'
> not being NULL in another path somewhere.
>
> So until this is fully resolved I'm holding off on applying
> this patch.
>
>
Yes, indeed.

I managed to replicate it here. Looking into it right now.

Thanks,
Gilad



2009-11-04 16:40:57

by Gilad Ben-Yossef

Subject: [PATCH testing] Do not call IPv4 specific func in tcp_check_req

Calling the IPv4-specific inet_csk_route_req in tcp_check_req
is a bad idea and crashes the machine on IPv6 connections, as
reported by Valdis Kletnieks.

Also, all we are really interested in is the timestamp
option in the header, so calling tcp_parse_options()
with the "estab" flag set to false is overkill, as
it tries to parse half a dozen other TCP options.

We know whether timestamp should be enabled or not
using data from request_sock.

Signed-off-by: Gilad Ben-Yossef <[email protected]>
---
net/ipv4/tcp_minisocks.c | 9 +++------
1 files changed, 3 insertions(+), 6 deletions(-)

diff --git a/net/ipv4/tcp_minisocks.c b/net/ipv4/tcp_minisocks.c
index 8bb560d..c816e50 100644
--- a/net/ipv4/tcp_minisocks.c
+++ b/net/ipv4/tcp_minisocks.c
@@ -500,11 +500,10 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb,
 	int paws_reject = 0;
 	struct tcp_options_received tmp_opt;
 	struct sock *child;
-	struct dst_entry *dst = inet_csk_route_req(sk, req);

-	tmp_opt.saw_tstamp = 0;
-	if (th->doff > (sizeof(struct tcphdr)>>2)) {
-		tcp_parse_options(skb, &tmp_opt, 0, dst);
+	if ((th->doff > (sizeof(struct tcphdr)>>2)) && (req->ts_recent)) {
+		tmp_opt.tstamp_ok = 1;
+		tcp_parse_options(skb, &tmp_opt, 1, NULL);

 		if (tmp_opt.saw_tstamp) {
 			tmp_opt.ts_recent = req->ts_recent;
@@ -517,8 +516,6 @@ struct sock *tcp_check_req(struct sock *sk, struct sk_buff *skb,
 		}
 	}

-	dst_release(dst);
-
 	/* Check for pure retransmitted SYN. */
 	if (TCP_SKB_CB(skb)->seq == tcp_rsk(req)->rcv_isn &&
 	    flg == TCP_FLAG_SYN &&
--
1.5.6.3
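
The commit message above notes that only the timestamp option matters at this point. As an illustration of what "looking only for the timestamp" means at the byte level, here is a userspace sketch that walks a TCP option block and extracts the timestamp option (kind 8, length 10). It is not the kernel's tcp_parse_options(); `find_tcp_timestamp()`, `read_be32()` and the `OPT_*` names are hypothetical and written for this example.

```c
#include <stddef.h>
#include <stdint.h>

#define OPT_EOL           0	/* end of option list */
#define OPT_NOP           1	/* one-byte padding */
#define OPT_TIMESTAMP     8	/* TCP timestamps option kind */
#define OPT_TIMESTAMP_LEN 10	/* kind + length + TSval + TSecr */

/* Read a 32-bit big-endian (network order) value. */
static uint32_t read_be32(const uint8_t *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * Scan the option bytes that follow the fixed TCP header.
 * Returns 1 and fills tsval/tsecr if a well-formed timestamp option is
 * found, 0 if it is absent or the option block is malformed.
 */
static int find_tcp_timestamp(const uint8_t *opts, size_t len,
			      uint32_t *tsval, uint32_t *tsecr)
{
	size_t i = 0;

	while (i < len) {
		uint8_t kind = opts[i];
		uint8_t optlen;

		if (kind == OPT_EOL)
			return 0;
		if (kind == OPT_NOP) {
			i++;
			continue;
		}
		if (i + 1 >= len)
			return 0;	/* truncated: no length byte */
		optlen = opts[i + 1];
		if (optlen < 2 || i + optlen > len)
			return 0;	/* malformed length */
		if (kind == OPT_TIMESTAMP && optlen == OPT_TIMESTAMP_LEN) {
			*tsval = read_be32(&opts[i + 2]);
			*tsecr = read_be32(&opts[i + 6]);
			return 1;
		}
		i += optlen;	/* skip options we do not care about */
	}
	return 0;
}
```

A typical timestamp-carrying segment pads with two NOPs first, so the bytes `01 01 08 0a` followed by two 32-bit values would be accepted, while a block holding only an MSS option would return 0.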

2009-11-04 16:43:26

by Gilad Ben-Yossef

Subject: Re: 2.6.32-rc5-mmotm1101 - kernel BUG at net/ipv4/tcp_input.c:3707!

[email protected] wrote:

> On Tue, 03 Nov 2009 23:34:19 +0200, Ilpo Järvinen said:
>
>
>> On Tue, 3 Nov 2009, Gilad Ben-Yossef wrote:
>>
>>
>>> Eric Dumazet wrote:
>>>
>>>
>>>> [email protected] wrote:
>>>>
>>>>
>>>>> Seen right after I started 'fetchmail'. Reproducible - 3 out of 3.
>>>>> I'll bisect this tonight if nobody jumps up and yells they know what it
>>>>> is...
>>>>>
>>>>>
>>> Bah... this is most probably my fault. Sorry about that.
>>>
>>> Can you please try the patch in the next email?
>>>
>
> Tried while at home, machine panic'ed. No netconsole here at the moment, sorry.
>

OK, I think I've got it.

Kindly try the next patch. It goes on top (in addition to) the previous
one. This should fix the crash.

There is still some small cruft left in the handling of the per-route TCP
options for IPv6, which means that the per-route options might get
ignored for incoming IPv6 connections right now. I will fix this if this
works.

Thanks!
Gilad



2009-11-05 02:33:58

by Valdis Klētnieks

Subject: Re: 2.6.32-rc5-mmotm1101 - kernel BUG at net/ipv4/tcp_input.c:3707!

On Wed, 04 Nov 2009 18:43:20 +0200, Gilad Ben-Yossef said:

> OK, I think I've got it.
>
> Kindly try the next patch. It goes on top (in addition to) the previous
> one. This should fix the crash.

OK, much better. Have been up for about 25 minutes now, and fetchmail has
pulled down e-mail several times, and no problems seen.

> There is still some small cruft in the handling of the per route TCP
> options for IPv6 left, which means that the per route options might get
> ignored for incoming IPv6 connections right now. I will fix this if this
> works.

Yell if you want something tested. ;)



2009-11-05 07:21:50

by David Miller

Subject: Re: [PATCH testing] Do not call IPv4 specific func in tcp_check_req

From: Gilad Ben-Yossef <[email protected]>
Date: Wed, 4 Nov 2009 18:40:54 +0200

> Calling IPv4 specific inet_csk_route_req in tcp_check_req
> is a bad idea and crashes machine on IPv6 connections, as reported
> by Valdis Kletnieks
>
> Also, all we are really interested in is the timestamp
> option in the header, so calling tcp_parse_options()
> with the "estab" set to false flag is an overkill as
> it tries to parse half a dozen other TCP options.
>
> We know whether timestamp should be enabled or not
> using data from request_sock.
>
> Signed-off-by: Gilad Ben-Yossef <[email protected]>

Applied, thanks.

I assume that your other patch which attempted to fix this isn't
necessary.

2009-11-05 07:22:25

by David Miller

Subject: Re: 2.6.32-rc5-mmotm1101 - kernel BUG at net/ipv4/tcp_input.c:3707!

From: Gilad Ben-Yossef <[email protected]>
Date: Wed, 04 Nov 2009 18:43:20 +0200

> Kindly try the next patch. It goes on top (in addition to) the
> previous one. This should fix the crash.

OK, now that I know both patches need to go in, I'll apply
the other one too.

Forget about my confusion in my other reply.

2009-11-05 07:25:06

by David Miller

Subject: Re: 2.6.32-rc5-mmotm1101 - kernel BUG at net/ipv4/tcp_input.c:3707!

From: [email protected]
Date: Wed, 04 Nov 2009 21:33:12 -0500

> On Wed, 04 Nov 2009 18:43:20 +0200, Gilad Ben-Yossef said:
>
>> OK, I think I've got it.
>>
>> Kindly try the next patch. It goes on top (in addition to) the previous
>> one. This should fix the crash.
>
> OK, much better. Have been up for about 25 minutes now, and fetchmail has
> pulled down e-mail several times, and no problems seen.

Thank you for testing.