2011-05-04 17:05:52

by TB

[permalink] [raw]
Subject: Divide error in bictcp_cong_avoid ?

We're having this issue sporadically on a few servers and this is the
backtrace we get from netconsole.


[28522.642419] divide error: 0000 [#1] SMP
[28522.642457] last sysfs file:
/sys/devices/pci0000:00/0000:00:1f.2/host2/target2:0:0/2:0:0:0/vendor
[28522.642504] CPU 0
[28522.642511] Modules linked in:
i2c_i801
i2c_core
evdev
button
[28522.642570]
[28522.642590] Pid: 0, comm: swapper Not tainted 2.6.38.5 #6

Supermicro X8DTH-i/6/iF/6F
/
X8DTH

[28522.642651] RIP: 0010:[<ffffffff8150b27b>] [<ffffffff8150b27b>]
bictcp_cong_avoid+0x21a/0x247
[28522.642708] RSP: 0018:ffff8800bf403a90 EFLAGS: 00010202
[28522.642735] RAX: 0000000000000010 RBX: ffff880352aa6400 RCX:
0000000000000000
[28522.642765] RDX: 0000000000000000 RSI: ffff880352aa67c0 RDI:
0000000000001607
[28522.642795] RBP: 000000007caa5a1b R08: 00000000000035c2 R09:
00000000000000e6
[28522.642825] R10: ffff88003d499c00 R11: ffff880109831b00 R12:
ffffffff817cecd0
[28522.642855] R13: 0000000000000004 R14: 000000000001001b R15:
0000000000000123
[28522.642886] FS: 0000000000000000(0000) GS:ffff8800bf400000(0000)
knlGS:0000000000000000
[28522.642932] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[28522.642959] CR2: 00007fb4c6ffd000 CR3: 000000042e4a1000 CR4:
00000000000006f0
[28522.642990] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[28522.643020] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[28522.643050] Process swapper (pid: 0, threadinfo ffffffff8176c000,
task ffffffff81777020)
[28522.643095] Stack:
[28522.643116] ffff880352aa6400
ffffffff817cecd0
0000000000000004
0000000000000406

[28522.643171] ffff880352aa6400
ffffffff814e3dc5
000000000000111c
ffff88022a9b3200

[28522.643226] 0000000000000000
7caa5a1b7caa4ec3
0000000000000000
0000000000000000

[28522.643281] Call Trace:
[28522.643303] <IRQ>

[28522.643330] [<ffffffff814e3dc5>] ? tcp_ack+0x18b5/0x1a89
[28522.643359] [<ffffffff814e45c2>] ? tcp_rcv_established+0xd1/0xa13
[28522.643389] [<ffffffff814ec60b>] ? tcp_v4_do_rcv+0x1b2/0x382
[28522.643418] [<ffffffff814c95d4>] ? nf_iterate+0x40/0x78
[28522.643446] [<ffffffff814ecc5f>] ? tcp_v4_rcv+0x484/0x797
[28522.643475] [<ffffffff814d11c7>] ? ip_local_deliver_finish+0xab/0x139
[28522.643505] [<ffffffff814ae2b3>] ? __netif_receive_skb+0x31c/0x349
[28522.643535] [<ffffffff814aec82>] ? netif_receive_skb+0x67/0x6d
[28522.643564] [<ffffffff814af1fb>] ? napi_gro_receive+0x9d/0xab
[28522.643592] [<ffffffff814aed57>] ? napi_skb_finish+0x1c/0x31
[28522.643623] [<ffffffff813e4248>] ? igb_poll+0x7d5/0xb2e
[28522.643653] [<ffffffff812b6b22>] ? blk_run_queue+0x23/0x37
[28522.643683] [<ffffffff813520d4>] ? scsi_run_queue+0x2ee/0x381
[28522.643712] [<ffffffff81353810>] ? scsi_io_completion+0x3e0/0x409
[28522.643741] [<ffffffff814af337>] ? net_rx_action+0xa7/0x212
[28522.643771] [<ffffffff8103b6c2>] ? __do_softirq+0xbe/0x184
[28522.643800] [<ffffffff8100364c>] ? call_softirq+0x1c/0x28
[28522.643828] [<ffffffff81005085>] ? do_softirq+0x31/0x63
[28522.643856] [<ffffffff8103b56c>] ? irq_exit+0x36/0x78
[28522.643883] [<ffffffff81004784>] ? do_IRQ+0x98/0xae
[28522.643912] [<ffffffff81562c13>] ? ret_from_intr+0x0/0xe
[28522.643938] <EOI>

[28522.643963] [<ffffffff81009a41>] ? mwait_idle+0xb9/0xf3
[28522.643991] [<ffffffff81001c6e>] ? cpu_idle+0x57/0x8d
[28522.644019] [<ffffffff81801c49>] ? start_kernel+0x34e/0x35a
[28522.644048] [<ffffffff81801398>] ? x86_64_start_kernel+0xf3/0xf9
[28522.644075] Code:
39 c9 76 18 44 29 c9 31 d2 44 89 c8 f7 f1 39 83 c0 03
00 00
76 06 89 83 c0 03 00 00 8b 83 c0 03 00 00 31 d2 c1 e0
04 0f
b7 4e 2c
f7> f1 ba 01 00 00 00 85 c0 0f 45 d0 89 93 c0 03 00
00 8b b3
c0

[28522.644338] RIP
[<ffffffff8150b27b>] bictcp_cong_avoid+0x21a/0x247
[28522.644371] RSP <ffff8800bf403a90>
[28522.644733] ---[ end trace 9db294ef7ff3a7b5 ]---
[28522.644800] Kernel panic - not syncing: Fatal exception in interrupt
[28522.644871] Pid: 0, comm: swapper Tainted: G D 2.6.38.5 #6
[28522.644942] Call Trace:
[28522.645012] <IRQ>
[<ffffffff81560690>] ? panic+0x9d/0x1a0
[28522.645131] [<ffffffff81562c13>] ? ret_from_intr+0x0/0xe
[28522.645200] [<ffffffff810365bb>] ? kmsg_dump+0x46/0xec
[28522.645268] [<ffffffff81006176>] ? oops_end+0x9f/0xac
[28522.645335] [<ffffffff810040d8>] ? do_divide_error+0x7f/0x89
[28522.645404] [<ffffffff8150b27b>] ? bictcp_cong_avoid+0x21a/0x247
[28522.645473] [<ffffffff814b057c>] ? dev_queue_xmit+0x4a4/0x4b2
[28522.645545] [<ffffffff814d5390>] ? ip_queue_xmit+0x2e9/0x32f
[28522.645614] [<ffffffff81003375>] ? divide_error+0x15/0x20
[28522.645685] [<ffffffff8150b27b>] ? bictcp_cong_avoid+0x21a/0x247
[28522.645754] [<ffffffff814e3dc5>] ? tcp_ack+0x18b5/0x1a89
[28522.645823] [<ffffffff814e45c2>] ? tcp_rcv_established+0xd1/0xa13
[28522.645892] [<ffffffff814ec60b>] ? tcp_v4_do_rcv+0x1b2/0x382
[28522.645961] [<ffffffff814c95d4>] ? nf_iterate+0x40/0x78
[28522.646029] [<ffffffff814ecc5f>] ? tcp_v4_rcv+0x484/0x797
[28522.646097] [<ffffffff814d11c7>] ? ip_local_deliver_finish+0xab/0x139
[28522.646167] [<ffffffff814ae2b3>] ? __netif_receive_skb+0x31c/0x349
[28522.646240] [<ffffffff814aec82>] ? netif_receive_skb+0x67/0x6d
[28522.646308] [<ffffffff814af1fb>] ? napi_gro_receive+0x9d/0xab
[28522.646377] [<ffffffff814aed57>] ? napi_skb_finish+0x1c/0x31
[28522.646445] [<ffffffff813e4248>] ? igb_poll+0x7d5/0xb2e
[28522.646513] [<ffffffff812b6b22>] ? blk_run_queue+0x23/0x37
[28522.646582] [<ffffffff813520d4>] ? scsi_run_queue+0x2ee/0x381
[28522.646651] [<ffffffff81353810>] ? scsi_io_completion+0x3e0/0x409
[28522.646721] [<ffffffff814af337>] ? net_rx_action+0xa7/0x212
[28522.646791] [<ffffffff8103b6c2>] ? __do_softirq+0xbe/0x184
[28522.646884] [<ffffffff8100364c>] ? call_softirq+0x1c/0x28
[28522.646953] [<ffffffff81005085>] ? do_softirq+0x31/0x63
[28522.647021] [<ffffffff8103b56c>] ? irq_exit+0x36/0x78
[28522.647089] [<ffffffff81004784>] ? do_IRQ+0x98/0xae
[28522.647164] [<ffffffff81562c13>] ? ret_from_intr+0x0/0xe
[28522.647239] <EOI>
[<ffffffff81009a41>] ? mwait_idle+0xb9/0xf3
[28522.647354] [<ffffffff81001c6e>] ? cpu_idle+0x57/0x8d
[28522.647422] [<ffffffff81801c49>] ? start_kernel+0x34e/0x35a
[28522.647491] [<ffffffff81801398>] ? x86_64_start_kernel+0xf3/0xf9


2011-05-04 17:49:45

by Randy Dunlap

[permalink] [raw]
Subject: Re: Divide error in bictcp_cong_avoid ?

[add cc to netdev]


On Wed, 04 May 2011 12:03:31 -0400 TB wrote:

> We're having this issue sporadically on a few servers and this is the
> backtrace we get from netconsole.
>
>
> [28522.642419] divide error: 0000 [#1] SMP
> [28522.642457] last sysfs file:
> /sys/devices/pci0000:00/0000:00:1f.2/host2/target2:0:0/2:0:0:0/vendor
> [28522.642504] CPU 0
> [28522.642511] Modules linked in:
> i2c_i801
> i2c_core
> evdev
> button
> [28522.642570]
> [28522.642590] Pid: 0, comm: swapper Not tainted 2.6.38.5 #6
>
> Supermicro X8DTH-i/6/iF/6F
> /
> X8DTH
>
> [28522.642651] RIP: 0010:[<ffffffff8150b27b>] [<ffffffff8150b27b>]
> bictcp_cong_avoid+0x21a/0x247
> [28522.642708] RSP: 0018:ffff8800bf403a90 EFLAGS: 00010202
> [28522.642735] RAX: 0000000000000010 RBX: ffff880352aa6400 RCX:
> 0000000000000000
> [28522.642765] RDX: 0000000000000000 RSI: ffff880352aa67c0 RDI:
> 0000000000001607
> [28522.642795] RBP: 000000007caa5a1b R08: 00000000000035c2 R09:
> 00000000000000e6
> [28522.642825] R10: ffff88003d499c00 R11: ffff880109831b00 R12:
> ffffffff817cecd0
> [28522.642855] R13: 0000000000000004 R14: 000000000001001b R15:
> 0000000000000123
> [28522.642886] FS: 0000000000000000(0000) GS:ffff8800bf400000(0000)
> knlGS:0000000000000000
> [28522.642932] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [28522.642959] CR2: 00007fb4c6ffd000 CR3: 000000042e4a1000 CR4:
> 00000000000006f0
> [28522.642990] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [28522.643020] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> 0000000000000400
> [28522.643050] Process swapper (pid: 0, threadinfo ffffffff8176c000,
> task ffffffff81777020)
> [28522.643095] Stack:
> [28522.643116] ffff880352aa6400
> ffffffff817cecd0
> 0000000000000004
> 0000000000000406
>
> [28522.643171] ffff880352aa6400
> ffffffff814e3dc5
> 000000000000111c
> ffff88022a9b3200
>
> [28522.643226] 0000000000000000
> 7caa5a1b7caa4ec3
> 0000000000000000
> 0000000000000000
>
> [28522.643281] Call Trace:
> [28522.643303] <IRQ>
>
> [28522.643330] [<ffffffff814e3dc5>] ? tcp_ack+0x18b5/0x1a89
> [28522.643359] [<ffffffff814e45c2>] ? tcp_rcv_established+0xd1/0xa13
> [28522.643389] [<ffffffff814ec60b>] ? tcp_v4_do_rcv+0x1b2/0x382
> [28522.643418] [<ffffffff814c95d4>] ? nf_iterate+0x40/0x78
> [28522.643446] [<ffffffff814ecc5f>] ? tcp_v4_rcv+0x484/0x797
> [28522.643475] [<ffffffff814d11c7>] ? ip_local_deliver_finish+0xab/0x139
> [28522.643505] [<ffffffff814ae2b3>] ? __netif_receive_skb+0x31c/0x349
> [28522.643535] [<ffffffff814aec82>] ? netif_receive_skb+0x67/0x6d
> [28522.643564] [<ffffffff814af1fb>] ? napi_gro_receive+0x9d/0xab
> [28522.643592] [<ffffffff814aed57>] ? napi_skb_finish+0x1c/0x31
> [28522.643623] [<ffffffff813e4248>] ? igb_poll+0x7d5/0xb2e
> [28522.643653] [<ffffffff812b6b22>] ? blk_run_queue+0x23/0x37
> [28522.643683] [<ffffffff813520d4>] ? scsi_run_queue+0x2ee/0x381
> [28522.643712] [<ffffffff81353810>] ? scsi_io_completion+0x3e0/0x409
> [28522.643741] [<ffffffff814af337>] ? net_rx_action+0xa7/0x212
> [28522.643771] [<ffffffff8103b6c2>] ? __do_softirq+0xbe/0x184
> [28522.643800] [<ffffffff8100364c>] ? call_softirq+0x1c/0x28
> [28522.643828] [<ffffffff81005085>] ? do_softirq+0x31/0x63
> [28522.643856] [<ffffffff8103b56c>] ? irq_exit+0x36/0x78
> [28522.643883] [<ffffffff81004784>] ? do_IRQ+0x98/0xae
> [28522.643912] [<ffffffff81562c13>] ? ret_from_intr+0x0/0xe
> [28522.643938] <EOI>
>
> [28522.643963] [<ffffffff81009a41>] ? mwait_idle+0xb9/0xf3
> [28522.643991] [<ffffffff81001c6e>] ? cpu_idle+0x57/0x8d
> [28522.644019] [<ffffffff81801c49>] ? start_kernel+0x34e/0x35a
> [28522.644048] [<ffffffff81801398>] ? x86_64_start_kernel+0xf3/0xf9
> [28522.644075] Code:
> 39 c9 76 18 44 29 c9 31 d2 44 89 c8 f7 f1 39 83 c0 03
> 00 00
> 76 06 89 83 c0 03 00 00 8b 83 c0 03 00 00 31 d2 c1 e0
> 04 0f
> b7 4e 2c
> f7> f1 ba 01 00 00 00 85 c0 0f 45 d0 89 93 c0 03 00
> 00 8b b3
> c0
>
> [28522.644338] RIP
> [<ffffffff8150b27b>] bictcp_cong_avoid+0x21a/0x247
> [28522.644371] RSP <ffff8800bf403a90>
> [28522.644733] ---[ end trace 9db294ef7ff3a7b5 ]---
> [28522.644800] Kernel panic - not syncing: Fatal exception in interrupt
> [28522.644871] Pid: 0, comm: swapper Tainted: G D 2.6.38.5 #6
> [28522.644942] Call Trace:
> [28522.645012] <IRQ>
> [<ffffffff81560690>] ? panic+0x9d/0x1a0
> [28522.645131] [<ffffffff81562c13>] ? ret_from_intr+0x0/0xe
> [28522.645200] [<ffffffff810365bb>] ? kmsg_dump+0x46/0xec
> [28522.645268] [<ffffffff81006176>] ? oops_end+0x9f/0xac
> [28522.645335] [<ffffffff810040d8>] ? do_divide_error+0x7f/0x89
> [28522.645404] [<ffffffff8150b27b>] ? bictcp_cong_avoid+0x21a/0x247
> [28522.645473] [<ffffffff814b057c>] ? dev_queue_xmit+0x4a4/0x4b2
> [28522.645545] [<ffffffff814d5390>] ? ip_queue_xmit+0x2e9/0x32f
> [28522.645614] [<ffffffff81003375>] ? divide_error+0x15/0x20
> [28522.645685] [<ffffffff8150b27b>] ? bictcp_cong_avoid+0x21a/0x247
> [28522.645754] [<ffffffff814e3dc5>] ? tcp_ack+0x18b5/0x1a89
> [28522.645823] [<ffffffff814e45c2>] ? tcp_rcv_established+0xd1/0xa13
> [28522.645892] [<ffffffff814ec60b>] ? tcp_v4_do_rcv+0x1b2/0x382
> [28522.645961] [<ffffffff814c95d4>] ? nf_iterate+0x40/0x78
> [28522.646029] [<ffffffff814ecc5f>] ? tcp_v4_rcv+0x484/0x797
> [28522.646097] [<ffffffff814d11c7>] ? ip_local_deliver_finish+0xab/0x139
> [28522.646167] [<ffffffff814ae2b3>] ? __netif_receive_skb+0x31c/0x349
> [28522.646240] [<ffffffff814aec82>] ? netif_receive_skb+0x67/0x6d
> [28522.646308] [<ffffffff814af1fb>] ? napi_gro_receive+0x9d/0xab
> [28522.646377] [<ffffffff814aed57>] ? napi_skb_finish+0x1c/0x31
> [28522.646445] [<ffffffff813e4248>] ? igb_poll+0x7d5/0xb2e
> [28522.646513] [<ffffffff812b6b22>] ? blk_run_queue+0x23/0x37
> [28522.646582] [<ffffffff813520d4>] ? scsi_run_queue+0x2ee/0x381
> [28522.646651] [<ffffffff81353810>] ? scsi_io_completion+0x3e0/0x409
> [28522.646721] [<ffffffff814af337>] ? net_rx_action+0xa7/0x212
> [28522.646791] [<ffffffff8103b6c2>] ? __do_softirq+0xbe/0x184
> [28522.646884] [<ffffffff8100364c>] ? call_softirq+0x1c/0x28
> [28522.646953] [<ffffffff81005085>] ? do_softirq+0x31/0x63
> [28522.647021] [<ffffffff8103b56c>] ? irq_exit+0x36/0x78
> [28522.647089] [<ffffffff81004784>] ? do_IRQ+0x98/0xae
> [28522.647164] [<ffffffff81562c13>] ? ret_from_intr+0x0/0xe
> [28522.647239] <EOI>
> [<ffffffff81009a41>] ? mwait_idle+0xb9/0xf3
> [28522.647354] [<ffffffff81001c6e>] ? cpu_idle+0x57/0x8d
> [28522.647422] [<ffffffff81801c49>] ? start_kernel+0x34e/0x35a
> [28522.647491] [<ffffffff81801398>] ? x86_64_start_kernel+0xf3/0xf9
> --


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

2011-05-04 18:33:55

by Stephen Hemminger

[permalink] [raw]
Subject: Re: Divide error in bictcp_cong_avoid ?

On Wed, 4 May 2011 10:49:40 -0700
Randy Dunlap <[email protected]> wrote:

> [add cc to netdev]
>
>
> On Wed, 04 May 2011 12:03:31 -0400 TB wrote:
>
> > We're having this issue sporadically on a few servers and this is the
> > backtrace we get from netconsole.
> >
> >
> > [28522.642419] divide error: 0000 [#1] SMP
> > [28522.642457] last sysfs file:
> > /sys/devices/pci0000:00/0000:00:1f.2/host2/target2:0:0/2:0:0:0/vendor
> > [28522.642504] CPU 0
> > [28522.642511] Modules linked in:
> > i2c_i801
> > i2c_core
> > evdev
> > button
> > [28522.642570]
> > [28522.642590] Pid: 0, comm: swapper Not tainted 2.6.38.5 #6
> >
> > Supermicro X8DTH-i/6/iF/6F
> > /
> > X8DTH
> >

What kernel version? I suspect they are running something really old
kernel like RHEL 5 since BIC has not been the default congestion control for
several years.

--

2011-05-04 18:35:24

by Randy Dunlap

[permalink] [raw]
Subject: Re: Divide error in bictcp_cong_avoid ?

On Wed, 4 May 2011 11:33:51 -0700 Stephen Hemminger wrote:

> On Wed, 4 May 2011 10:49:40 -0700
> Randy Dunlap <[email protected]> wrote:
>
> > [add cc to netdev]
> >
> >
> > On Wed, 04 May 2011 12:03:31 -0400 TB wrote:
> >
> > > We're having this issue sporadically on a few servers and this is the
> > > backtrace we get from netconsole.
> > >
> > >
> > > [28522.642419] divide error: 0000 [#1] SMP
> > > [28522.642457] last sysfs file:
> > > /sys/devices/pci0000:00/0000:00:1f.2/host2/target2:0:0/2:0:0:0/vendor
> > > [28522.642504] CPU 0
> > > [28522.642511] Modules linked in:
> > > i2c_i801
> > > i2c_core
> > > evdev
> > > button
> > > [28522.642570]
> > > [28522.642590] Pid: 0, comm: swapper Not tainted 2.6.38.5 #6 <<<<<<<<<<<<<<
> > >
> > > Supermicro X8DTH-i/6/iF/6F
> > > /
> > > X8DTH
> > >
>
> What kernel version? I suspect they are running something really old
> kernel like RHEL 5 since BIC has not been the default congestion control for
> several years.

Please see about 6 lines above.... at the "<<<<<<<<<<<".


---
~Randy
*** Remember to use Documentation/SubmitChecklist when testing your code ***

2011-05-04 18:57:23

by David Miller

[permalink] [raw]
Subject: Re: Divide error in bictcp_cong_avoid ?

From: Stephen Hemminger <[email protected]>
Date: Wed, 4 May 2011 11:33:51 -0700

> On Wed, 4 May 2011 10:49:40 -0700
> Randy Dunlap <[email protected]> wrote:
>
>> > [28522.642590] Pid: 0, comm: swapper Not tainted 2.6.38.5 #6
^^^^^^^^

>> >
>> > Supermicro X8DTH-i/6/iF/6F
>> > /
>> > X8DTH
>> >
>
> What kernel version?

It's right there in the dump Stephen, see above.

2011-05-04 19:03:43

by TB

[permalink] [raw]
Subject: Re: Divide error in bictcp_cong_avoid ?

On 11-05-04 02:35 PM, Randy Dunlap wrote:
> On Wed, 4 May 2011 11:33:51 -0700 Stephen Hemminger wrote:
>
>> On Wed, 4 May 2011 10:49:40 -0700
>> Randy Dunlap<[email protected]> wrote:
>>
>>> [add cc to netdev]
>>>
>>>
>>> On Wed, 04 May 2011 12:03:31 -0400 TB wrote:
>>>
>>>> We're having this issue sporadically on a few servers and this is the
>>>> backtrace we get from netconsole.
>>>>
>>>>
>>>> [28522.642419] divide error: 0000 [#1] SMP
>>>> [28522.642457] last sysfs file:
>>>> /sys/devices/pci0000:00/0000:00:1f.2/host2/target2:0:0/2:0:0:0/vendor
>>>> [28522.642504] CPU 0
>>>> [28522.642511] Modules linked in:
>>>> i2c_i801
>>>> i2c_core
>>>> evdev
>>>> button
>>>> [28522.642570]
>>>> [28522.642590] Pid: 0, comm: swapper Not tainted 2.6.38.5 #6<<<<<<<<<<<<<<
>>>>
>>>> Supermicro X8DTH-i/6/iF/6F
>>>> /
>>>> X8DTH
>>>>
>>
>> What kernel version? I suspect they are running something really old
>> kernel like RHEL 5 since BIC has not been the default congestion control for
>> several years.
>
> Please see about 6 lines above.... at the "<<<<<<<<<<<".

/proc/sys/net/ipv4/tcp_available_congestion_control
cubic reno

cat /proc/sys/net/ipv4/tcp_congestion_control
cubic

2011-05-04 19:31:27

by Valdis Klētnieks

[permalink] [raw]
Subject: Re: Divide error in bictcp_cong_avoid ?

On Wed, 04 May 2011 11:33:51 PDT, Stephen Hemminger said:

> What kernel version? I suspect they are running something really old
> kernel like RHEL 5 since BIC has not been the default congestion control for
> several years.

Pretty bleeding edge here, and BIC is *still* the default:

% uname -a
Linux turing-police.cc.vt.edu 2.6.39-rc5-mmotm0429 #1 SMP PREEMPT Sat Apr 30 00:36:49 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
% zgrep -i BIC /proc/config.gz
CONFIG_TCP_CONG_BIC=y
CONFIG_TCP_CONG_CUBIC=m
CONFIG_DEFAULT_BIC=y
CONFIG_DEFAULT_TCP_CONG="bic"

Why? Because BIC was working "well enough", so I never changed it by hand, and
"make oldconfig" never changed it automagically either. Now,


Attachments:
(No filename) (227.00 B)

2011-05-04 19:37:42

by Stephen Hemminger

[permalink] [raw]
Subject: Re: Divide error in bictcp_cong_avoid ?

On Wed, 04 May 2011 15:31:21 -0400
[email protected] wrote:

> On Wed, 04 May 2011 11:33:51 PDT, Stephen Hemminger said:
>
> > What kernel version? I suspect they are running something really old
> > kernel like RHEL 5 since BIC has not been the default congestion control for
> > several years.
>
> Pretty bleeding edge here, and BIC is *still* the default:
>
> % uname -a
> Linux turing-police.cc.vt.edu 2.6.39-rc5-mmotm0429 #1 SMP PREEMPT Sat Apr 30 00:36:49 EDT 2011 x86_64 x86_64 x86_64 GNU/Linux
> % zgrep -i BIC /proc/config.gz
> CONFIG_TCP_CONG_BIC=y
> CONFIG_TCP_CONG_CUBIC=m
> CONFIG_DEFAULT_BIC=y
> CONFIG_DEFAULT_TCP_CONG="bic"
>
> Why? Because BIC was working "well enough", so I never changed it by hand, and
> "make oldconfig" never changed it automagically either. Now,
>

The CUBIC code has the same function names so that is where the
problem is.

Don't use BIC, it is known to be unfair between flows.
The configuration process never changes your values, it only asks
about new ones.

--

2011-05-04 19:41:27

by David Miller

[permalink] [raw]
Subject: Re: Divide error in bictcp_cong_avoid ?

From: Stephen Hemminger <[email protected]>
Date: Wed, 4 May 2011 12:37:38 -0700

> Don't use BIC, it is known to be unfair between flows.

That doesn't fix the user's bug, nothing in the kernel should causes
crashes under normal circumstances.

Please stop with this deflecting on this report and instead try to
figure out why the illegal divide happens.

Thanks.

2011-05-04 20:01:50

by Valdis Klētnieks

[permalink] [raw]
Subject: Re: Divide error in bictcp_cong_avoid ?

On Wed, 04 May 2011 12:40:53 PDT, David Miller said:
> From: Stephen Hemminger <[email protected]>
> Date: Wed, 4 May 2011 12:37:38 -0700
>
> > Don't use BIC, it is known to be unfair between flows.
>
> That doesn't fix the user's bug, nothing in the kernel should causes
> crashes under normal circumstances.

And if it's a "don't use" thing, maybe it's time to deprecate it and/or
otherwise discourage its use? I know *my* config still has it mostly because it
*seems* to work, and nothing's suggested I should change it.



Attachments:
(No filename) (227.00 B)

2011-05-04 20:05:05

by Stephen Hemminger

[permalink] [raw]
Subject: [PATCH] tcp_cubic: limit delayed_ack ratio to prevent divide error

TCP Cubic keeps a metric that estimates the amount of delayed
acknowledgements to use in adjusting the window. If an abnormally
large number of packets are acknowledged at once, then the update
could wrap and reach zero. This kind of ACK could only
happen when there was a large window and huge number of
ACK's were lost.

This patch limits the value of delayed ack ratio. The choice of 32
is just a conservative value since normally it should be range of
1 to 4 packets.

Signed-off-by: Stephen Hemminger <[email protected]>

---
Patch against 2.6.39-rc5+


--- a/net/ipv4/tcp_cubic.c 2011-05-04 11:58:49.666027155 -0700
+++ b/net/ipv4/tcp_cubic.c 2011-05-04 12:52:34.716767304 -0700
@@ -93,6 +93,7 @@ struct bictcp {
u32 ack_cnt; /* number of acks */
u32 tcp_cwnd; /* estimated tcp cwnd */
#define ACK_RATIO_SHIFT 4
+#define ACK_RATIO_LIMIT (32u << ACK_RATIO_SHIFT)
u16 delayed_ack; /* estimate the ratio of Packets/ACKs << 4 */
u8 sample_cnt; /* number of samples to decide curr_rtt */
u8 found; /* the exit point is found? */
@@ -398,8 +399,12 @@ static void bictcp_acked(struct sock *sk
u32 delay;

if (icsk->icsk_ca_state == TCP_CA_Open) {
- cnt -= ca->delayed_ack >> ACK_RATIO_SHIFT;
- ca->delayed_ack += cnt;
+ u32 ratio = ca->delayed_ack;
+
+ ratio -= ca->delayed_ack >> ACK_RATIO_SHIFT;
+ ratio += cnt;
+
+ ca->delayed_ack = min(ratio, ACK_RATIO_LIMIT);
}

/* Some calls are for duplicates without timetamps */

2011-05-04 20:54:03

by Jesse Brandeburg

[permalink] [raw]
Subject: Re: [PATCH] tcp_cubic: limit delayed_ack ratio to prevent divide error



On Wed, 4 May 2011, Stephen Hemminger wrote:

> TCP Cubic keeps a metric that estimates the amount of delayed
> acknowledgements to use in adjusting the window. If an abnormally
> large number of packets are acknowledged at once, then the update
> could wrap and reach zero. This kind of ACK could only
> happen when there was a large window and huge number of
> ACK's were lost.
>
> This patch limits the value of delayed ack ratio. The choice of 32
> is just a conservative value since normally it should be range of
> 1 to 4 packets.
>
> Signed-off-by: Stephen Hemminger <[email protected]>

patch seems fine, but please credit the reporter ([email protected]) with
reporting the issue with logs, maybe even with Reported-by: and some kind
of reference to the panic message or the email thread in the text or
header?

2011-05-06 16:15:24

by TB

[permalink] [raw]
Subject: Re: [PATCH] tcp_cubic: limit delayed_ack ratio to prevent divide error

-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

On 11-05-04 04:53 PM, Brandeburg, Jesse wrote:
>
>
> On Wed, 4 May 2011, Stephen Hemminger wrote:
>
>> TCP Cubic keeps a metric that estimates the amount of delayed
>> acknowledgements to use in adjusting the window. If an abnormally
>> large number of packets are acknowledged at once, then the update
>> could wrap and reach zero. This kind of ACK could only
>> happen when there was a large window and huge number of
>> ACK's were lost.
>>
>> This patch limits the value of delayed ack ratio. The choice of 32
>> is just a conservative value since normally it should be range of
>> 1 to 4 packets.
>>
>> Signed-off-by: Stephen Hemminger <[email protected]>
>
> patch seems fine, but please credit the reporter ([email protected]) with
> reporting the issue with logs, maybe even with Reported-by: and some kind
> of reference to the panic message or the email thread in the text or
> header?

We're currently testing the patch on 6 production servers

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.10 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org/

iQEcBAEBAgAGBQJNxB6yAAoJENOh8x1aI8Ye4ocH/3+6gjWWppgOwql0J4XGGD5R
wJX+u8A+YK2V+GBvxFgQs/qNa3IB/nnWwELolflO80twq2JrOq1I6g2n1VJhHjX4
b5jyROMe2gPHRKESibi84gNIuoImq4bqM/S1u7xWzcikTh8FxCevYQXTNilIKOOf
siuOIypFY7AyqSPjhq5/+HpTrrOQa097PAcVAr8RBO7niyrxAE75ACTolGAKBfvQ
HlOYKmxBT8SbnZ7YJNINopPdtpqz3iaraKWUoT44Wuv8Q8jt0cqB7YJWl0RG/C3y
ABK50Qihl1p6M+LL9jjR2YwVFkjiLyN3fO8g2pjVfn4wh0afFCyWtitN0OFd/4I=
=Vy5E
-----END PGP SIGNATURE-----

2011-05-06 16:54:04

by Stephen Hemminger

[permalink] [raw]
Subject: Re: [PATCH] tcp_cubic: limit delayed_ack ratio to prevent divide error

On Fri, 06 May 2011 12:15:46 -0400
TB <[email protected]> wrote:

> -----BEGIN PGP SIGNED MESSAGE-----
> Hash: SHA1
>
> On 11-05-04 04:53 PM, Brandeburg, Jesse wrote:
> >
> >
> > On Wed, 4 May 2011, Stephen Hemminger wrote:
> >
> >> TCP Cubic keeps a metric that estimates the amount of delayed
> >> acknowledgements to use in adjusting the window. If an abnormally
> >> large number of packets are acknowledged at once, then the update
> >> could wrap and reach zero. This kind of ACK could only
> >> happen when there was a large window and huge number of
> >> ACK's were lost.
> >>
> >> This patch limits the value of delayed ack ratio. The choice of 32
> >> is just a conservative value since normally it should be range of
> >> 1 to 4 packets.
> >>
> >> Signed-off-by: Stephen Hemminger <[email protected]>
> >
> > patch seems fine, but please credit the reporter ([email protected]) with
> > reporting the issue with logs, maybe even with Reported-by: and some kind
> > of reference to the panic message or the email thread in the text or
> > header?
>
> We're currently testing the patch on 6 production servers

Thank you, is there some regularity to the failures previously?

2011-05-06 17:39:30

by TB

[permalink] [raw]
Subject: Re: [PATCH] tcp_cubic: limit delayed_ack ratio to prevent divide error

On 11-05-06 12:53 PM, Stephen Hemminger wrote:
> On Fri, 06 May 2011 12:15:46 -0400
> TB <[email protected]> wrote:
>
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> On 11-05-04 04:53 PM, Brandeburg, Jesse wrote:
>>>
>>>
>>> On Wed, 4 May 2011, Stephen Hemminger wrote:
>>>
>>>> TCP Cubic keeps a metric that estimates the amount of delayed
>>>> acknowledgements to use in adjusting the window. If an abnormally
>>>> large number of packets are acknowledged at once, then the update
>>>> could wrap and reach zero. This kind of ACK could only
>>>> happen when there was a large window and huge number of
>>>> ACK's were lost.
>>>>
>>>> This patch limits the value of delayed ack ratio. The choice of 32
>>>> is just a conservative value since normally it should be range of
>>>> 1 to 4 packets.
>>>>
>>>> Signed-off-by: Stephen Hemminger <[email protected]>
>>>
>>> patch seems fine, but please credit the reporter ([email protected]) with
>>> reporting the issue with logs, maybe even with Reported-by: and some kind
>>> of reference to the panic message or the email thread in the text or
>>> header?
>>
>> We're currently testing the patch on 6 production servers
>
> Thank you, is there some regularity to the failures previously?

Not really, there was more chance of it happening after a reboot and
during the night (when there is less traffic) for some weird reason.

As a workaround we switched most of the servers to reno

2011-05-08 22:52:51

by David Miller

[permalink] [raw]
Subject: Re: [PATCH] tcp_cubic: limit delayed_ack ratio to prevent divide error

From: Stephen Hemminger <[email protected]>
Date: Wed, 4 May 2011 13:04:56 -0700

> TCP Cubic keeps a metric that estimates the amount of delayed
> acknowledgements to use in adjusting the window. If an abnormally
> large number of packets are acknowledged at once, then the update
> could wrap and reach zero. This kind of ACK could only
> happen when there was a large window and huge number of
> ACK's were lost.
>
> This patch limits the value of delayed ack ratio. The choice of 32
> is just a conservative value since normally it should be range of
> 1 to 4 packets.
>
> Signed-off-by: Stephen Hemminger <[email protected]>

Applied, thanks Stephen.

2011-05-11 16:06:41

by TB

[permalink] [raw]
Subject: Re: [PATCH] tcp_cubic: limit delayed_ack ratio to prevent divide error

On 11-05-06 12:53 PM, Stephen Hemminger wrote:
> On Fri, 06 May 2011 12:15:46 -0400
> TB <[email protected]> wrote:
>
>> -----BEGIN PGP SIGNED MESSAGE-----
>> Hash: SHA1
>>
>> On 11-05-04 04:53 PM, Brandeburg, Jesse wrote:
>>>
>>>
>>> On Wed, 4 May 2011, Stephen Hemminger wrote:
>>>
>>>> TCP Cubic keeps a metric that estimates the amount of delayed
>>>> acknowledgements to use in adjusting the window. If an abnormally
>>>> large number of packets are acknowledged at once, then the update
>>>> could wrap and reach zero. This kind of ACK could only
>>>> happen when there was a large window and huge number of
>>>> ACK's were lost.
>>>>
>>>> This patch limits the value of delayed ack ratio. The choice of 32
>>>> is just a conservative value since normally it should be range of
>>>> 1 to 4 packets.
>>>>
>>>> Signed-off-by: Stephen Hemminger <[email protected]>
>>>
>>> patch seems fine, but please credit the reporter ([email protected]) with
>>> reporting the issue with logs, maybe even with Reported-by: and some kind
>>> of reference to the panic message or the email thread in the text or
>>> header?
>>
>> We're currently testing the patch on 6 production servers
>
> Thank you, is there some regularity to the failures previously?

This is now being tested on about 50 servers and we just had another
panic, on a server with 2.6.38.5 and this patch.

[405542.454073] ------------[ cut here ]------------
[405542.454109] kernel BUG at net/ipv4/tcp_output.c:1006!
[405542.454136] invalid opcode: 0000 [#1]

[405542.454166] last sysfs file:
/sys/devices/pci0000:00/0000:00:1f.2/host6/scsi_host/host6/proc_name
[405542.454213] CPU 0

[405542.454220] Modules linked in:
i2c_i801
evdev
i2c_core
button
[last unloaded: scsi_wait_scan]

[405542.454300]
[405542.454320] Pid: 0, comm: swapper Not tainted 2.6.38.5 #8

/

[405542.454379] RIP: 0010:[<ffffffff814e7ed2>]
[<ffffffff814e7ed2>] tcp_fragment+0x22/0x29a
[405542.454433] RSP: 0018:ffff8800bf403a30 EFLAGS: 00010202
[405542.454460] RAX: ffff88000cd35000 RBX: ffff88006b84f480 RCX:
0000000000000218
[405542.454504] RDX: 0000000000001708 RSI: ffff88006b84f480 RDI:
ffff880008d6b200
[405542.454548] RBP: 0000000000001540 R08: 0000000000000002 R09:
000000001027984a
[405542.454592] R10: ffff8800b915f428 R11: ffff880008d6b200 R12:
ffff88006b84f4a8
[405542.454636] R13: 0000000000001708 R14: 0000000000000000 R15:
ffff880008d6b200
[405542.454680] FS: 0000000000000000(0000) GS:ffff8800bf400000(0000)
knlGS:0000000000000000
[405542.454726] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[405542.454754] CR2: 00007f94055c7000 CR3: 000000083e0bd000 CR4:
00000000000006f0
[405542.454798] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
0000000000000000
[405542.454842] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
0000000000000400
[405542.454886] Process swapper (pid: 0, threadinfo ffffffff8176c000,
task ffffffff81777020)
[405542.454931] Stack:
[405542.454951] 0000000000000000
0000021808d6b798
00000002000005b4
ffff88006b84f480

[405542.455006] ffff880008d6b200
ffff88006b84f4a8
0000000000000015
0000000000000000

[405542.455061] ffff880008d6b300
ffffffff814df7a4
ffff8802a3965140
00000000000001a0

[405542.455115] Call Trace:
[405542.455137] <IRQ>

[405542.455162] [<ffffffff814df7a4>] ? tcp_mark_head_lost+0x13c/0x202
[405542.455192] [<ffffffff814e33a8>] ? tcp_ack+0xe98/0x1a89
[405542.455220] [<ffffffff814e42ca>] ? tcp_validate_incoming+0x69/0x290
[405542.455250] [<ffffffff814e4c9b>] ? tcp_rcv_established+0x7aa/0xa13
[405542.455281] [<ffffffff814ec60b>] ? tcp_v4_do_rcv+0x1b2/0x382
[405542.455310] [<ffffffff814c95d4>] ? nf_iterate+0x40/0x78
[405542.455338] [<ffffffff814ecc5f>] ? tcp_v4_rcv+0x484/0x797
[405542.455368] [<ffffffff814d11c7>] ? ip_local_deliver_finish+0xab/0x139
[405542.455398] [<ffffffff814ae2b3>] ? __netif_receive_skb+0x31c/0x349
[405542.455428] [<ffffffff814aec82>] ? netif_receive_skb+0x67/0x6d
[405542.455457] [<ffffffff814af1fb>] ? napi_gro_receive+0x9d/0xab
[405542.455485] [<ffffffff814aed57>] ? napi_skb_finish+0x1c/0x31
[405542.455516] [<ffffffff813e4248>] ? igb_poll+0x7d5/0xb2e
[405542.455544] [<ffffffff813e432f>] ? igb_poll+0x8bc/0xb2e
[405542.455572] [<ffffffff813e211a>] ? igb_msix_ring+0x6e/0x75
[405542.455602] [<ffffffff8106749c>] ? handle_IRQ_event+0x51/0x119
[405542.455631] [<ffffffff814af337>] ? net_rx_action+0xa7/0x212
[405542.455661] [<ffffffff8103b6c2>] ? __do_softirq+0xbe/0x184
[405542.455690] [<ffffffff8100364c>] ? call_softirq+0x1c/0x28
[405542.455719] [<ffffffff81005085>] ? do_softirq+0x31/0x63
[405542.455746] [<ffffffff8103b56c>] ? irq_exit+0x36/0x78
[405542.455773] [<ffffffff81004784>] ? do_IRQ+0x98/0xae
[405542.455802] [<ffffffff81562ed3>] ? ret_from_intr+0x0/0xe
[405542.455829] <EOI>

[405542.455860] [<ffffffff81009a41>] ? mwait_idle+0xb9/0xf3
[405542.455888] [<ffffffff81001c6e>] ? cpu_idle+0x57/0x8d
[405542.455921] [<ffffffff81801c49>] ? start_kernel+0x34e/0x35a
[405542.455950] [<ffffffff81801398>] ? x86_64_start_kernel+0xf3/0xf9
[405542.455977] Code:
f>

[405542.456239] RIP
[<ffffffff814e7ed2>] tcp_fragment+0x22/0x29a
[405542.456270] RSP <ffff8800bf403a30>
[405542.456543] ---[ end trace 231aaa222f893065 ]---
[405542.456600] Kernel panic - not syncing: Fatal exception in interrupt
[405542.456659] Pid: 0, comm: swapper Tainted: G D 2.6.38.5 #8
[405542.456719] Call Trace:
[405542.456770] <IRQ>
[<ffffffff81560960>] ? panic+0x9d/0x1a0
[405542.456863] [<ffffffff81562ed3>] ? ret_from_intr+0x0/0xe
[405542.456923] [<ffffffff810365bb>] ? kmsg_dump+0x46/0xec
[405542.456981] [<ffffffff81006176>] ? oops_end+0x9f/0xac
[405542.457039] [<ffffffff81003f83>] ? do_invalid_op+0x85/0x8f
[405542.457097] [<ffffffff814e7ed2>] ? tcp_fragment+0x22/0x29a
[405542.457156] [<ffffffff814e80a9>] ? tcp_fragment+0x1f9/0x29a
[405542.457216] [<ffffffff810033d5>] ? invalid_op+0x15/0x20
[405542.457276] [<ffffffff814e7ed2>] ? tcp_fragment+0x22/0x29a
[405542.457337] [<ffffffff814df7a4>] ? tcp_mark_head_lost+0x13c/0x202
[405542.457400] [<ffffffff814e33a8>] ? tcp_ack+0xe98/0x1a89
[405542.457461] [<ffffffff814e42ca>] ? tcp_validate_incoming+0x69/0x290
[405542.457524] [<ffffffff814e4c9b>] ? tcp_rcv_established+0x7aa/0xa13
[405542.457586] [<ffffffff814ec60b>] ? tcp_v4_do_rcv+0x1b2/0x382
[405542.457645] [<ffffffff814c95d4>] ? nf_iterate+0x40/0x78
[405542.457703] [<ffffffff814ecc5f>] ? tcp_v4_rcv+0x484/0x797
[405542.457761] [<ffffffff814d11c7>] ? ip_local_deliver_finish+0xab/0x139
[405542.457827] [<ffffffff814ae2b3>] ? __netif_receive_skb+0x31c/0x349
[405542.457894] [<ffffffff814aec82>] ? netif_receive_skb+0x67/0x6d
[405542.457953] [<ffffffff814af1fb>] ? napi_gro_receive+0x9d/0xab
[405542.458021] [<ffffffff814aed57>] ? napi_skb_finish+0x1c/0x31
[405542.458080] [<ffffffff813e4248>] ? igb_poll+0x7d5/0xb2e
[405542.458138] [<ffffffff813e432f>] ? igb_poll+0x8bc/0xb2e
[405542.458196] [<ffffffff813e211a>] ? igb_msix_ring+0x6e/0x75
[405542.458254] [<ffffffff8106749c>] ? handle_IRQ_event+0x51/0x119
[405542.458313] [<ffffffff814af337>] ? net_rx_action+0xa7/0x212
[405542.458371] [<ffffffff8103b6c2>] ? __do_softirq+0xbe/0x184
[405542.458430] [<ffffffff8100364c>] ? call_softirq+0x1c/0x28
[405542.458488] [<ffffffff81005085>] ? do_softirq+0x31/0x63
[405542.458545] [<ffffffff8103b56c>] ? irq_exit+0x36/0x78
[405542.458602] [<ffffffff81004784>] ? do_IRQ+0x98/0xae
[405542.458660] [<ffffffff81562ed3>] ? ret_from_intr+0x0/0xe
[405542.458717] <EOI>
[<ffffffff81009a41>] ? mwait_idle+0xb9/0xf3
[405542.458810] [<ffffffff81001c6e>] ? cpu_idle+0x57/0x8d
[405542.458867] [<ffffffff81801c49>] ? start_kernel+0x34e/0x35a
[405542.458926] [<ffffffff81801398>] ? x86_64_start_kernel+0xf3/0xf9

2011-05-11 15:45:56

by Stephen Hemminger

[permalink] [raw]
Subject: Re: [PATCH] tcp_cubic: limit delayed_ack ratio to prevent divide error

On Wed, 11 May 2011 10:49:01 -0400
TB <[email protected]> wrote:

> On 11-05-06 12:53 PM, Stephen Hemminger wrote:
> > On Fri, 06 May 2011 12:15:46 -0400
> > TB <[email protected]> wrote:
> >
> >> -----BEGIN PGP SIGNED MESSAGE-----
> >> Hash: SHA1
> >>
> >> On 11-05-04 04:53 PM, Brandeburg, Jesse wrote:
> >>>
> >>>
> >>> On Wed, 4 May 2011, Stephen Hemminger wrote:
> >>>
> >>>> TCP Cubic keeps a metric that estimates the amount of delayed
> >>>> acknowledgements to use in adjusting the window. If an abnormally
> >>>> large number of packets are acknowledged at once, then the update
> >>>> could wrap and reach zero. This kind of ACK could only
> >>>> happen when there was a large window and huge number of
> >>>> ACK's were lost.
> >>>>
> >>>> This patch limits the value of delayed ack ratio. The choice of 32
> >>>> is just a conservative value since normally it should be range of
> >>>> 1 to 4 packets.
> >>>>
> >>>> Signed-off-by: Stephen Hemminger <[email protected]>

> >>>
> >>> patch seems fine, but please credit the reporter ([email protected]) with
> >>> reporting the issue with logs, maybe even with Reported-by: and some kind
> >>> of reference to the panic message or the email thread in the text or
> >>> header?
> >>
> >> We're currently testing the patch on 6 production servers
> >
> > Thank you, is there some regularity to the failures previously?
>
> This is now being tested on about 50 servers and we just had another
> panic, on a server with 2.6.38.5 and this patch.
>
> [405542.454073] ------------[ cut here ]------------
> [405542.454109] kernel BUG at net/ipv4/tcp_output.c:1006!
> [405542.454136] invalid opcode: 0000 [#1]
>
> [405542.454166] last sysfs file:
> /sys/devices/pci0000:00/0000:00:1f.2/host6/scsi_host/host6/proc_name
> [405542.454213] CPU 0
>
> [405542.454220] Modules linked in:
> i2c_i801
> evdev
> i2c_core
> button
> [last unloaded: scsi_wait_scan]
>
> [405542.454300]
> [405542.454320] Pid: 0, comm: swapper Not tainted 2.6.38.5 #8
>
> /
>
> [405542.454379] RIP: 0010:[<ffffffff814e7ed2>]
> [<ffffffff814e7ed2>] tcp_fragment+0x22/0x29a
> [405542.454433] RSP: 0018:ffff8800bf403a30 EFLAGS: 00010202
> [405542.454460] RAX: ffff88000cd35000 RBX: ffff88006b84f480 RCX:
> 0000000000000218
> [405542.454504] RDX: 0000000000001708 RSI: ffff88006b84f480 RDI:
> ffff880008d6b200
> [405542.454548] RBP: 0000000000001540 R08: 0000000000000002 R09:
> 000000001027984a
> [405542.454592] R10: ffff8800b915f428 R11: ffff880008d6b200 R12:
> ffff88006b84f4a8
> [405542.454636] R13: 0000000000001708 R14: 0000000000000000 R15:
> ffff880008d6b200
> [405542.454680] FS: 0000000000000000(0000) GS:ffff8800bf400000(0000)
> knlGS:0000000000000000
> [405542.454726] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
> [405542.454754] CR2: 00007f94055c7000 CR3: 000000083e0bd000 CR4:
> 00000000000006f0
> [405542.454798] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
> 0000000000000000
> [405542.454842] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
> 0000000000000400
> [405542.454886] Process swapper (pid: 0, threadinfo ffffffff8176c000,
> task ffffffff81777020)
> [405542.454931] Stack:
> [405542.454951] 0000000000000000
> 0000021808d6b798
> 00000002000005b4
> ffff88006b84f480
>
> [405542.455006] ffff880008d6b200
> ffff88006b84f4a8
> 0000000000000015
> 0000000000000000
>
> [405542.455061] ffff880008d6b300
> ffffffff814df7a4
> ffff8802a3965140
> 00000000000001a0
>
> [405542.455115] Call Trace:
> [405542.455137] <IRQ>
>
> [405542.455162] [<ffffffff814df7a4>] ? tcp_mark_head_lost+0x13c/0x202
> [405542.455192] [<ffffffff814e33a8>] ? tcp_ack+0xe98/0x1a89
> [405542.455220] [<ffffffff814e42ca>] ? tcp_validate_incoming+0x69/0x290
> [405542.455250] [<ffffffff814e4c9b>] ? tcp_rcv_established+0x7aa/0xa13
> [405542.455281] [<ffffffff814ec60b>] ? tcp_v4_do_rcv+0x1b2/0x382
> [405542.455310] [<ffffffff814c95d4>] ? nf_iterate+0x40/0x78
> [405542.455338] [<ffffffff814ecc5f>] ? tcp_v4_rcv+0x484/0x797
> [405542.455368] [<ffffffff814d11c7>] ? ip_local_deliver_finish+0xab/0x139
> [405542.455398] [<ffffffff814ae2b3>] ? __netif_receive_skb+0x31c/0x349
> [405542.455428] [<ffffffff814aec82>] ? netif_receive_skb+0x67/0x6d
> [405542.455457] [<ffffffff814af1fb>] ? napi_gro_receive+0x9d/0xab
> [405542.455485] [<ffffffff814aed57>] ? napi_skb_finish+0x1c/0x31
> [405542.455516] [<ffffffff813e4248>] ? igb_poll+0x7d5/0xb2e
> [405542.455544] [<ffffffff813e432f>] ? igb_poll+0x8bc/0xb2e
> [405542.455572] [<ffffffff813e211a>] ? igb_msix_ring+0x6e/0x75
> [405542.455602] [<ffffffff8106749c>] ? handle_IRQ_event+0x51/0x119
> [405542.455631] [<ffffffff814af337>] ? net_rx_action+0xa7/0x212
> [405542.455661] [<ffffffff8103b6c2>] ? __do_softirq+0xbe/0x184
> [405542.455690] [<ffffffff8100364c>] ? call_softirq+0x1c/0x28
> [405542.455719] [<ffffffff81005085>] ? do_softirq+0x31/0x63
> [405542.455746] [<ffffffff8103b56c>] ? irq_exit+0x36/0x78
> [405542.455773] [<ffffffff81004784>] ? do_IRQ+0x98/0xae
> [405542.455802] [<ffffffff81562ed3>] ? ret_from_intr+0x0/0xe
> [405542.455829] <EOI>
>
> [405542.455860] [<ffffffff81009a41>] ? mwait_idle+0xb9/0xf3
> [405542.455888] [<ffffffff81001c6e>] ? cpu_idle+0x57/0x8d
> [405542.455921] [<ffffffff81801c49>] ? start_kernel+0x34e/0x35a
> [405542.455950] [<ffffffff81801398>] ? x86_64_start_kernel+0xf3/0xf9

This panic is different than the last one.
It is coming from TCP fragment code being
called with an invalid skb. If I read the registers correctly,
skb->len (R14) = 0 and len (EDX) = 1708; the check here is failing.

int tcp_fragment(struct sock *sk, struct sk_buff *skb, u32 len,
unsigned int mss_now)
{

BUG_ON(len > skb->len);


Are you running with large (or small) MTU? What netfilter rules, perhaps
the firewall rule altered the packet.
--

2011-05-11 15:35:46

by TB

[permalink] [raw]
Subject: Re: [PATCH] tcp_cubic: limit delayed_ack ratio to prevent divide error

On 11-05-11 11:22 AM, Stephen Hemminger wrote:
> On Wed, 11 May 2011 10:49:01 -0400
> TB <[email protected]> wrote:
>
>> On 11-05-06 12:53 PM, Stephen Hemminger wrote:
>>> On Fri, 06 May 2011 12:15:46 -0400
>>> TB <[email protected]> wrote:
>>>
>>>> -----BEGIN PGP SIGNED MESSAGE-----
>>>> Hash: SHA1
>>>>
>>>> On 11-05-04 04:53 PM, Brandeburg, Jesse wrote:
>>>>>
>>>>>
>>>>> On Wed, 4 May 2011, Stephen Hemminger wrote:
>>>>>
>>>>>> TCP Cubic keeps a metric that estimates the amount of delayed
>>>>>> acknowledgements to use in adjusting the window. If an abnormally
>>>>>> large number of packets are acknowledged at once, then the update
>>>>>> could wrap and reach zero. This kind of ACK could only
>>>>>> happen when there was a large window and huge number of
>>>>>> ACK's were lost.
>>>>>>
>>>>>> This patch limits the value of delayed ack ratio. The choice of 32
>>>>>> is just a conservative value since normally it should be range of
>>>>>> 1 to 4 packets.
>>>>>>
>>>>>> Signed-off-by: Stephen Hemminger <[email protected]>
>
>>>>>
>>>>> patch seems fine, but please credit the reporter ([email protected]) with
>>>>> reporting the issue with logs, maybe even with Reported-by: and some kind
>>>>> of reference to the panic message or the email thread in the text or
>>>>> header?
>>>>
>>>> We're currently testing the patch on 6 production servers
>>>
>>> Thank you, is there some regularity to the failures previously?
>>
>> This is now being tested on about 50 servers and we just had another
>> panic, on a server with 2.6.38.5 and this patch.
>>
>> [405542.454073] ------------[ cut here ]------------
>> [405542.454109] kernel BUG at net/ipv4/tcp_output.c:1006!
>> [405542.454136] invalid opcode: 0000 [#1]
>>
>> [405542.454166] last sysfs file:
>> /sys/devices/pci0000:00/0000:00:1f.2/host6/scsi_host/host6/proc_name
>> [405542.454213] CPU 0
>>
>> [405542.454220] Modules linked in:
>> i2c_i801
>> evdev
>> i2c_core
>> button
>> [last unloaded: scsi_wait_scan]
>>
>> [405542.454300]
>> [405542.454320] Pid: 0, comm: swapper Not tainted 2.6.38.5 #8
>>
>> /
>>
>> [405542.454379] RIP: 0010:[<ffffffff814e7ed2>]
>> [<ffffffff814e7ed2>] tcp_fragment+0x22/0x29a
>> [405542.454433] RSP: 0018:ffff8800bf403a30 EFLAGS: 00010202
>> [405542.454460] RAX: ffff88000cd35000 RBX: ffff88006b84f480 RCX:
>> 0000000000000218
>> [405542.454504] RDX: 0000000000001708 RSI: ffff88006b84f480 RDI:
>> ffff880008d6b200
>> [405542.454548] RBP: 0000000000001540 R08: 0000000000000002 R09:
>> 000000001027984a
>> [405542.454592] R10: ffff8800b915f428 R11: ffff880008d6b200 R12:
>> ffff88006b84f4a8
>> [405542.454636] R13: 0000000000001708 R14: 0000000000000000 R15:
>> ffff880008d6b200
>> [405542.454680] FS: 0000000000000000(0000) GS:ffff8800bf400000(0000)
>> knlGS:0000000000000000
>> [405542.454726] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
>> [405542.454754] CR2: 00007f94055c7000 CR3: 000000083e0bd000 CR4:
>> 00000000000006f0
>> [405542.454798] DR0: 0000000000000000 DR1: 0000000000000000 DR2:
>> 0000000000000000
>> [405542.454842] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7:
>> 0000000000000400
>> [405542.454886] Process swapper (pid: 0, threadinfo ffffffff8176c000,
>> task ffffffff81777020)
>> [405542.454931] Stack:
>> [405542.454951] 0000000000000000
>> 0000021808d6b798
>> 00000002000005b4
>> ffff88006b84f480
>>
>> [405542.455006] ffff880008d6b200
>> ffff88006b84f4a8
>> 0000000000000015
>> 0000000000000000
>>
>> [405542.455061] ffff880008d6b300
>> ffffffff814df7a4
>> ffff8802a3965140
>> 00000000000001a0
>>
>> [405542.455115] Call Trace:
>> [405542.455137] <IRQ>
>>
>> [405542.455162] [<ffffffff814df7a4>] ? tcp_mark_head_lost+0x13c/0x202
>> [405542.455192] [<ffffffff814e33a8>] ? tcp_ack+0xe98/0x1a89
>> [405542.455220] [<ffffffff814e42ca>] ? tcp_validate_incoming+0x69/0x290
>> [405542.455250] [<ffffffff814e4c9b>] ? tcp_rcv_established+0x7aa/0xa13
>> [405542.455281] [<ffffffff814ec60b>] ? tcp_v4_do_rcv+0x1b2/0x382
>> [405542.455310] [<ffffffff814c95d4>] ? nf_iterate+0x40/0x78
>> [405542.455338] [<ffffffff814ecc5f>] ? tcp_v4_rcv+0x484/0x797
>> [405542.455368] [<ffffffff814d11c7>] ? ip_local_deliver_finish+0xab/0x139
>> [405542.455398] [<ffffffff814ae2b3>] ? __netif_receive_skb+0x31c/0x349
>> [405542.455428] [<ffffffff814aec82>] ? netif_receive_skb+0x67/0x6d
>> [405542.455457] [<ffffffff814af1fb>] ? napi_gro_receive+0x9d/0xab
>> [405542.455485] [<ffffffff814aed57>] ? napi_skb_finish+0x1c/0x31
>> [405542.455516] [<ffffffff813e4248>] ? igb_poll+0x7d5/0xb2e
>> [405542.455544] [<ffffffff813e432f>] ? igb_poll+0x8bc/0xb2e
>> [405542.455572] [<ffffffff813e211a>] ? igb_msix_ring+0x6e/0x75
>> [405542.455602] [<ffffffff8106749c>] ? handle_IRQ_event+0x51/0x119
>> [405542.455631] [<ffffffff814af337>] ? net_rx_action+0xa7/0x212
>> [405542.455661] [<ffffffff8103b6c2>] ? __do_softirq+0xbe/0x184
>> [405542.455690] [<ffffffff8100364c>] ? call_softirq+0x1c/0x28
>> [405542.455719] [<ffffffff81005085>] ? do_softirq+0x31/0x63
>> [405542.455746] [<ffffffff8103b56c>] ? irq_exit+0x36/0x78
>> [405542.455773] [<ffffffff81004784>] ? do_IRQ+0x98/0xae
>> [405542.455802] [<ffffffff81562ed3>] ? ret_from_intr+0x0/0xe
>> [405542.455829] <EOI>
>>
>> [405542.455860] [<ffffffff81009a41>] ? mwait_idle+0xb9/0xf3
>> [405542.455888] [<ffffffff81001c6e>] ? cpu_idle+0x57/0x8d
>> [405542.455921] [<ffffffff81801c49>] ? start_kernel+0x34e/0x35a
>> [405542.455950] [<ffffffff81801398>] ? x86_64_start_kernel+0xf3/0xf9
>
> This panic is different than the last one.
> It is coming from TCP fragment code being
> called with an invalid skb. If I read the registers correctly,
> skb->len (R14) = 0 and len (EDX) = 1708; the check here is failing.
>
> int tcp_fragment(struct sock *sk, struct sk_buff *skb, u32 len,
> unsigned int mss_now)
> {
>
> BUG_ON(len > skb->len);
>
>
> Are you running with large (or small) MTU? What netfilter rules, perhaps
> the firewall rule altered the packet.


MTU 1500, No firewall rules (empty rules for filter, no mangle, no nat
modules)