This is the 2.6.38.5 kernel with the patch in
[PATCH] tcp_cubic: limit delayed_ack ratio to prevent divide error
We had this crash a few days ago but never got any response to our
subsequent emails. Another server crashed today with a network-related
backtrace, but netconsole did not work.
The network switch is a
HP ProCurve J9147A 2910al-48G Switch, revision W.14.49, ROM W.14.04
MTU is set at 1500
No firewall rules in the filter chain
No mangle or NAT loaded or compiled in the kernel.
Bonding is active across 2 igb gigabit Ethernet interfaces.
Network traffic was at 700 Mbit/s and 900 Mbit/s respectively in the two instances.
[405542.454073] ------------[ cut here ]------------
[405542.454109] kernel BUG at net/ipv4/tcp_output.c:1006!
[405542.454136] invalid opcode: 0000 [#1]
[405542.454166] last sysfs file: /sys/devices/pci0000:00/0000:00:1f.2/host6/scsi_host/host6/proc_name
[405542.454213] CPU 0
[405542.454220] Modules linked in: i2c_i801 evdev i2c_core button [last unloaded: scsi_wait_scan]
[405542.454300]
[405542.454320] Pid: 0, comm: swapper Not tainted 2.6.38.5 #8 /
[405542.454379] RIP: 0010:[<ffffffff814e7ed2>] [<ffffffff814e7ed2>] tcp_fragment+0x22/0x29a
[405542.454433] RSP: 0018:ffff8800bf403a30 EFLAGS: 00010202
[405542.454460] RAX: ffff88000cd35000 RBX: ffff88006b84f480 RCX: 0000000000000218
[405542.454504] RDX: 0000000000001708 RSI: ffff88006b84f480 RDI: ffff880008d6b200
[405542.454548] RBP: 0000000000001540 R08: 0000000000000002 R09: 000000001027984a
[405542.454592] R10: ffff8800b915f428 R11: ffff880008d6b200 R12: ffff88006b84f4a8
[405542.454636] R13: 0000000000001708 R14: 0000000000000000 R15: ffff880008d6b200
[405542.454680] FS: 0000000000000000(0000) GS:ffff8800bf400000(0000) knlGS:0000000000000000
[405542.454726] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[405542.454754] CR2: 00007f94055c7000 CR3: 000000083e0bd000 CR4: 00000000000006f0
[405542.454798] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[405542.454842] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[405542.454886] Process swapper (pid: 0, threadinfo ffffffff8176c000, task ffffffff81777020)
[405542.454931] Stack:
[405542.454951] 0000000000000000 0000021808d6b798 00000002000005b4 ffff88006b84f480
[405542.455006] ffff880008d6b200 ffff88006b84f4a8 0000000000000015 0000000000000000
[405542.455061] ffff880008d6b300 ffffffff814df7a4 ffff8802a3965140 00000000000001a0
[405542.455115] Call Trace:
[405542.455137] <IRQ>
[405542.455162] [<ffffffff814df7a4>] ? tcp_mark_head_lost+0x13c/0x202
[405542.455192] [<ffffffff814e33a8>] ? tcp_ack+0xe98/0x1a89
[405542.455220] [<ffffffff814e42ca>] ? tcp_validate_incoming+0x69/0x290
[405542.455250] [<ffffffff814e4c9b>] ? tcp_rcv_established+0x7aa/0xa13
[405542.455281] [<ffffffff814ec60b>] ? tcp_v4_do_rcv+0x1b2/0x382
[405542.455310] [<ffffffff814c95d4>] ? nf_iterate+0x40/0x78
[405542.455338] [<ffffffff814ecc5f>] ? tcp_v4_rcv+0x484/0x797
[405542.455368] [<ffffffff814d11c7>] ? ip_local_deliver_finish+0xab/0x139
[405542.455398] [<ffffffff814ae2b3>] ? __netif_receive_skb+0x31c/0x349
[405542.455428] [<ffffffff814aec82>] ? netif_receive_skb+0x67/0x6d
[405542.455457] [<ffffffff814af1fb>] ? napi_gro_receive+0x9d/0xab
[405542.455485] [<ffffffff814aed57>] ? napi_skb_finish+0x1c/0x31
[405542.455516] [<ffffffff813e4248>] ? igb_poll+0x7d5/0xb2e
[405542.455544] [<ffffffff813e432f>] ? igb_poll+0x8bc/0xb2e
[405542.455572] [<ffffffff813e211a>] ? igb_msix_ring+0x6e/0x75
[405542.455602] [<ffffffff8106749c>] ? handle_IRQ_event+0x51/0x119
[405542.455631] [<ffffffff814af337>] ? net_rx_action+0xa7/0x212
[405542.455661] [<ffffffff8103b6c2>] ? __do_softirq+0xbe/0x184
[405542.455690] [<ffffffff8100364c>] ? call_softirq+0x1c/0x28
[405542.455719] [<ffffffff81005085>] ? do_softirq+0x31/0x63
[405542.455746] [<ffffffff8103b56c>] ? irq_exit+0x36/0x78
[405542.455773] [<ffffffff81004784>] ? do_IRQ+0x98/0xae
[405542.455802] [<ffffffff81562ed3>] ? ret_from_intr+0x0/0xe
[405542.455829] <EOI>
[405542.455860] [<ffffffff81009a41>] ? mwait_idle+0xb9/0xf3
[405542.455888] [<ffffffff81001c6e>] ? cpu_idle+0x57/0x8d
[405542.455921] [<ffffffff81801c49>] ? start_kernel+0x34e/0x35a
[405542.455950] [<ffffffff81801398>] ? x86_64_start_kernel+0xf3/0xf9
[405542.455977] Code:
f>
[405542.456239] RIP [<ffffffff814e7ed2>] tcp_fragment+0x22/0x29a
[405542.456270] RSP <ffff8800bf403a30>
[405542.456543] ---[ end trace 231aaa222f893065 ]---
[405542.456600] Kernel panic - not syncing: Fatal exception in interrupt
[405542.456659] Pid: 0, comm: swapper Tainted: G D 2.6.38.5 #8
[405542.456719] Call Trace:
[405542.456770] <IRQ> [<ffffffff81560960>] ? panic+0x9d/0x1a0
[405542.456863] [<ffffffff81562ed3>] ? ret_from_intr+0x0/0xe
[405542.456923] [<ffffffff810365bb>] ? kmsg_dump+0x46/0xec
[405542.456981] [<ffffffff81006176>] ? oops_end+0x9f/0xac
[405542.457039] [<ffffffff81003f83>] ? do_invalid_op+0x85/0x8f
[405542.457097] [<ffffffff814e7ed2>] ? tcp_fragment+0x22/0x29a
[405542.457156] [<ffffffff814e80a9>] ? tcp_fragment+0x1f9/0x29a
[405542.457216] [<ffffffff810033d5>] ? invalid_op+0x15/0x20
[405542.457276] [<ffffffff814e7ed2>] ? tcp_fragment+0x22/0x29a
[405542.457337] [<ffffffff814df7a4>] ? tcp_mark_head_lost+0x13c/0x202
[405542.457400] [<ffffffff814e33a8>] ? tcp_ack+0xe98/0x1a89
[405542.457461] [<ffffffff814e42ca>] ? tcp_validate_incoming+0x69/0x290
[405542.457524] [<ffffffff814e4c9b>] ? tcp_rcv_established+0x7aa/0xa13
[405542.457586] [<ffffffff814ec60b>] ? tcp_v4_do_rcv+0x1b2/0x382
[405542.457645] [<ffffffff814c95d4>] ? nf_iterate+0x40/0x78
[405542.457703] [<ffffffff814ecc5f>] ? tcp_v4_rcv+0x484/0x797
[405542.457761] [<ffffffff814d11c7>] ? ip_local_deliver_finish+0xab/0x139
[405542.457827] [<ffffffff814ae2b3>] ? __netif_receive_skb+0x31c/0x349
[405542.457894] [<ffffffff814aec82>] ? netif_receive_skb+0x67/0x6d
[405542.457953] [<ffffffff814af1fb>] ? napi_gro_receive+0x9d/0xab
[405542.458021] [<ffffffff814aed57>] ? napi_skb_finish+0x1c/0x31
[405542.458080] [<ffffffff813e4248>] ? igb_poll+0x7d5/0xb2e
[405542.458138] [<ffffffff813e432f>] ? igb_poll+0x8bc/0xb2e
[405542.458196] [<ffffffff813e211a>] ? igb_msix_ring+0x6e/0x75
[405542.458254] [<ffffffff8106749c>] ? handle_IRQ_event+0x51/0x119
[405542.458313] [<ffffffff814af337>] ? net_rx_action+0xa7/0x212
[405542.458371] [<ffffffff8103b6c2>] ? __do_softirq+0xbe/0x184
[405542.458430] [<ffffffff8100364c>] ? call_softirq+0x1c/0x28
[405542.458488] [<ffffffff81005085>] ? do_softirq+0x31/0x63
[405542.458545] [<ffffffff8103b56c>] ? irq_exit+0x36/0x78
[405542.458602] [<ffffffff81004784>] ? do_IRQ+0x98/0xae
[405542.458660] [<ffffffff81562ed3>] ? ret_from_intr+0x0/0xe
[405542.458717] <EOI> [<ffffffff81009a41>] ? mwait_idle+0xb9/0xf3
[405542.458810] [<ffffffff81001c6e>] ? cpu_idle+0x57/0x8d
[405542.458867] [<ffffffff81801c49>] ? start_kernel+0x34e/0x35a
[405542.458926] [<ffffffff81801398>] ? x86_64_start_kernel+0xf3/0xf9
On 05/13/2011 10:11 AM, TB wrote:
> This is the 2.6.38.5 kernel with the patch in
> [PATCH] tcp_cubic: limit delayed_ack ratio to prevent divide error
>
> Had this crash a few days ago, but never got any response to subsequent
> emails. Another server crashed today with a network related backtrace,
> but netconsole did not work.
We've seen some odd behavior with the in-kernel igb driver:
not crashes, just strange performance issues, CRC errors
on the wire and the like. Intel's own (out-of-kernel) igb driver
seems to work fine for us.
If you have nothing else to try, you might try using the
out-of-kernel igb driver from Intel and see if that makes
any difference.
Thanks,
Ben
--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com
On Friday, 13 May 2011 at 13:11 -0400, TB wrote:
> This is the 2.6.38.5 kernel with the patch in
> [PATCH] tcp_cubic: limit delayed_ack ratio to prevent divide error
>
> Had this crash a few days ago, but never got any response to subsequent
> emails. Another server crashed today with a network related backtrace,
> but netconsole did not work.
I understand you are anxious, but sending this message to lkml will
hardly find more people to look at this problem.
You sent your messages to netdev two days ago; it's not as if it had
been one or two months.
Please send us the full disassembly of tcp_fragment (from the vmlinux file).
On 11-05-13 01:27 PM, Eric Dumazet wrote:
> On Friday, 13 May 2011 at 13:11 -0400, TB wrote:
>> This is the 2.6.38.5 kernel with the patch in
>> [PATCH] tcp_cubic: limit delayed_ack ratio to prevent divide error
>>
>
> Please send us full disassembly of tcp_fragment (from vmlinux file)
GCC is debian 4.3.2-1.1
AS 2.18.0.20080103
CPU is Intel Xeon E5620
Kernel CPU is set to MCORE2 (Core 2/newer Xeon)
ffffffff814e7eb0 <tcp_fragment>:
ffffffff814e7eb0: 41 57 push %r15
ffffffff814e7eb2: 49 89 ff mov %rdi,%r15
ffffffff814e7eb5: 41 56 push %r14
ffffffff814e7eb7: 41 55 push %r13
ffffffff814e7eb9: 41 89 d5 mov %edx,%r13d
ffffffff814e7ebc: 41 54 push %r12
ffffffff814e7ebe: 55 push %rbp
ffffffff814e7ebf: 53 push %rbx
ffffffff814e7ec0: 48 89 f3 mov %rsi,%rbx
ffffffff814e7ec3: 48 83 ec 18 sub $0x18,%rsp
ffffffff814e7ec7: 89 4c 24 0c mov %ecx,0xc(%rsp)
ffffffff814e7ecb: 8b 6e 68 mov 0x68(%rsi),%ebp
ffffffff814e7ece: 39 ea cmp %ebp,%edx
ffffffff814e7ed0: 76 04 jbe ffffffff814e7ed6 <tcp_fragment+0x26>
ffffffff814e7ed2: 0f 0b ud2a
ffffffff814e7ed4: eb fe jmp ffffffff814e7ed4 <tcp_fragment+0x24>
ffffffff814e7ed6: 44 8b 66 6c mov 0x6c(%rsi),%r12d
ffffffff814e7eda: f6 46 7c 02 testb $0x2,0x7c(%rsi)
ffffffff814e7ede: 74 33 je ffffffff814e7f13 <tcp_fragment+0x63>
ffffffff814e7ee0: 8b 86 b4 00 00 00 mov 0xb4(%rsi),%eax
ffffffff814e7ee6: 48 03 86 b8 00 00 00 add 0xb8(%rsi),%rax
ffffffff814e7eed: 8b 40 28 mov 0x28(%rax),%eax
ffffffff814e7ef0: 66 ff c8 dec %ax
ffffffff814e7ef3: 74 1e je ffffffff814e7f13 <tcp_fragment+0x63>
ffffffff814e7ef5: 45 85 e4 test %r12d,%r12d
ffffffff814e7ef8: 74 19 je ffffffff814e7f13 <tcp_fragment+0x63>
ffffffff814e7efa: 31 d2 xor %edx,%edx
ffffffff814e7efc: 31 f6 xor %esi,%esi
ffffffff814e7efe: b9 20 00 00 00 mov $0x20,%ecx
ffffffff814e7f03: 48 89 df mov %rbx,%rdi
ffffffff814e7f06: e8 68 fe fb ff callq ffffffff814a7d73 <pskb_expand_head>
ffffffff814e7f0b: 85 c0 test %eax,%eax
ffffffff814e7f0d: 0f 85 23 02 00 00 jne ffffffff814e8136 <tcp_fragment+0x286>
ffffffff814e7f13: 44 29 e5 sub %r12d,%ebp
ffffffff814e7f16: 45 31 f6 xor %r14d,%r14d
ffffffff814e7f19: 89 e8 mov %ebp,%eax
ffffffff814e7f1b: ba 20 00 00 00 mov $0x20,%edx
ffffffff814e7f20: 44 29 e8 sub %r13d,%eax
ffffffff814e7f23: 4c 89 ff mov %r15,%rdi
ffffffff814e7f26: 44 0f 49 f0 cmovns %eax,%r14d
ffffffff814e7f2a: 44 89 f6 mov %r14d,%esi
ffffffff814e7f2d: e8 82 51 ff ff callq ffffffff814dd0b4 <sk_stream_alloc_skb>
ffffffff814e7f32: 48 89 c5 mov %rax,%rbp
ffffffff814e7f35: 48 85 c0 test %rax,%rax
ffffffff814e7f38: 0f 84 f8 01 00 00 je ffffffff814e8136 <tcp_fragment+0x286>
ffffffff814e7f3e: 8b 80 c8 00 00 00 mov 0xc8(%rax),%eax
ffffffff814e7f44: 41 01 87 1c 01 00 00 add %eax,0x11c(%r15)
ffffffff814e7f4b: 49 8b 47 28 mov 0x28(%r15),%rax
ffffffff814e7f4f: 8b 95 c8 00 00 00 mov 0xc8(%rbp),%edx
ffffffff814e7f55: 48 83 b8 c8 00 00 00 cmpq $0x0,0xc8(%rax)
ffffffff814e7f5c: 00
ffffffff814e7f5d: 74 07 je ffffffff814e7f66 <tcp_fragment+0xb6>
ffffffff814e7f5f: 41 29 97 98 00 00 00 sub %edx,0x98(%r15)
ffffffff814e7f66: 8b 43 68 mov 0x68(%rbx),%eax
ffffffff814e7f69: 4c 8d 63 28 lea 0x28(%rbx),%r12
ffffffff814e7f6d: 44 29 e8 sub %r13d,%eax
ffffffff814e7f70: 44 89 ea mov %r13d,%edx
ffffffff814e7f73: 44 29 f0 sub %r14d,%eax
ffffffff814e7f76: 01 85 c8 00 00 00 add %eax,0xc8(%rbp)
ffffffff814e7f7c: 29 83 c8 00 00 00 sub %eax,0xc8(%rbx)
ffffffff814e7f82: 48 8d 45 28 lea 0x28(%rbp),%rax
ffffffff814e7f86: 48 89 44 24 10 mov %rax,0x10(%rsp)
ffffffff814e7f8b: 41 03 54 24 10 add 0x10(%r12),%edx
ffffffff814e7f90: 89 50 10 mov %edx,0x10(%rax)
ffffffff814e7f93: 41 8b 44 24 14 mov 0x14(%r12),%eax
ffffffff814e7f98: 48 8b 4c 24 10 mov 0x10(%rsp),%rcx
ffffffff814e7f9d: 89 41 14 mov %eax,0x14(%rcx)
ffffffff814e7fa0: 41 89 54 24 14 mov %edx,0x14(%r12)
ffffffff814e7fa5: 41 8a 54 24 1c mov 0x1c(%r12),%dl
ffffffff814e7faa: 88 d0 mov %dl,%al
ffffffff814e7fac: 83 e0 f6 and $0xfffffffffffffff6,%eax
ffffffff814e7faf: 41 88 44 24 1c mov %al,0x1c(%r12)
ffffffff814e7fb4: 88 51 1c mov %dl,0x1c(%rcx)
ffffffff814e7fb7: 41 8a 44 24 1d mov 0x1d(%r12),%al
ffffffff814e7fbc: 88 41 1d mov %al,0x1d(%rcx)
ffffffff814e7fbf: 8b 93 b4 00 00 00 mov 0xb4(%rbx),%edx
ffffffff814e7fc5: 48 8b 83 b8 00 00 00 mov 0xb8(%rbx),%rax
ffffffff814e7fcc: 66 83 3c 10 00 cmpw $0x0,(%rax,%rdx,1)
ffffffff814e7fd1: 75 6e jne ffffffff814e8041 <tcp_fragment+0x191>
ffffffff814e7fd3: 8a 43 7c mov 0x7c(%rbx),%al
ffffffff814e7fd6: 83 e0 0c and $0xc,%eax
ffffffff814e7fd9: 3c 0c cmp $0xc,%al
ffffffff814e7fdb: 74 64 je ffffffff814e8041 <tcp_fragment+0x191>
ffffffff814e7fdd: 44 89 f6 mov %r14d,%esi
ffffffff814e7fe0: 48 89 ef mov %rbp,%rdi
ffffffff814e7fe3: e8 da f7 fb ff callq ffffffff814a77c2 <skb_put>
ffffffff814e7fe8: 31 c9 xor %ecx,%ecx
ffffffff814e7fea: 48 89 c6 mov %rax,%rsi
ffffffff814e7fed: 44 89 ef mov %r13d,%edi
ffffffff814e7ff0: 44 89 f2 mov %r14d,%edx
ffffffff814e7ff3: 48 03 bb c0 00 00 00 add 0xc0(%rbx),%rdi
ffffffff814e7ffa: e8 91 4f 05 00 callq ffffffff8153cf90 <csum_partial_copy_nocheck>
ffffffff814e7fff: 44 89 ee mov %r13d,%esi
ffffffff814e8002: 89 45 74 mov %eax,0x74(%rbp)
ffffffff814e8005: 48 89 df mov %rbx,%rdi
ffffffff814e8008: e8 09 de fb ff callq ffffffff814a5e16 <skb_trim>
ffffffff814e800d: 8b 45 74 mov 0x74(%rbp),%eax
ffffffff814e8010: 8b 4b 74 mov 0x74(%rbx),%ecx
ffffffff814e8013: 41 80 e5 01 and $0x1,%r13b
ffffffff814e8017: 74 15 je ffffffff814e802e <tcp_fragment+0x17e>
ffffffff814e8019: 89 c2 mov %eax,%edx
ffffffff814e801b: c1 e8 08 shr $0x8,%eax
ffffffff814e801e: 81 e2 ff 00 ff 00 and $0xff00ff,%edx
ffffffff814e8024: 25 ff 00 ff 00 and $0xff00ff,%eax
ffffffff814e8029: c1 e2 08 shl $0x8,%edx
ffffffff814e802c: 01 d0 add %edx,%eax
ffffffff814e802e: f7 d0 not %eax
ffffffff814e8030: 89 c2 mov %eax,%edx
ffffffff814e8032: 01 ca add %ecx,%edx
ffffffff814e8034: 0f 92 c0 setb %al
ffffffff814e8037: 0f b6 c0 movzbl %al,%eax
ffffffff814e803a: 01 d0 add %edx,%eax
ffffffff814e803c: 89 43 74 mov %eax,0x74(%rbx)
ffffffff814e803f: eb 12 jmp ffffffff814e8053 <tcp_fragment+0x1a3>
ffffffff814e8041: 80 4b 7c 0c orb $0xc,0x7c(%rbx)
ffffffff814e8045: 44 89 ea mov %r13d,%edx
ffffffff814e8048: 48 89 ee mov %rbp,%rsi
ffffffff814e804b: 48 89 df mov %rbx,%rdi
ffffffff814e804e: e8 f8 f7 fb ff callq ffffffff814a784b <skb_split>
ffffffff814e8053: 8a 53 7c mov 0x7c(%rbx),%dl
ffffffff814e8056: 8a 45 7c mov 0x7c(%rbp),%al
ffffffff814e8059: 83 e2 0c and $0xc,%edx
ffffffff814e805c: 83 e0 f3 and $0xfffffffffffffff3,%eax
ffffffff814e805f: 48 89 de mov %rbx,%rsi
ffffffff814e8062: 09 d0 or %edx,%eax
ffffffff814e8064: 4c 89 ff mov %r15,%rdi
ffffffff814e8067: 88 45 7c mov %al,0x7c(%rbp)
ffffffff814e806a: 41 8b 44 24 18 mov 0x18(%r12),%eax
ffffffff814e806f: 48 8b 54 24 10 mov 0x10(%rsp),%rdx
ffffffff814e8074: 89 42 18 mov %eax,0x18(%rdx)
ffffffff814e8077: 48 8b 43 10 mov 0x10(%rbx),%rax
ffffffff814e807b: 8b 93 b4 00 00 00 mov 0xb4(%rbx),%edx
ffffffff814e8081: 48 89 45 10 mov %rax,0x10(%rbp)
ffffffff814e8085: 48 8b 83 b8 00 00 00 mov 0xb8(%rbx),%rax
ffffffff814e808c: 44 8b 64 10 04 mov 0x4(%rax,%rdx,1),%r12d
ffffffff814e8091: 8b 54 24 0c mov 0xc(%rsp),%edx
ffffffff814e8095: e8 3d dd ff ff callq ffffffff814e5dd7 <tcp_set_skb_tso_segs>
ffffffff814e809a: 8b 54 24 0c mov 0xc(%rsp),%edx
ffffffff814e809e: 48 89 ee mov %rbp,%rsi
ffffffff814e80a1: 4c 89 ff mov %r15,%rdi
ffffffff814e80a4: e8 2e dd ff ff callq ffffffff814e5dd7 <tcp_set_skb_tso_segs>
ffffffff814e80a9: 48 8b 4c 24 10 mov 0x10(%rsp),%rcx
ffffffff814e80ae: 8b 49 14 mov 0x14(%rcx),%ecx
ffffffff814e80b1: 41 39 8f 1c 04 00 00 cmp %ecx,0x41c(%r15)
ffffffff814e80b8: 78 39 js ffffffff814e80f3 <tcp_fragment+0x243>
ffffffff814e80ba: 8b 8b b4 00 00 00 mov 0xb4(%rbx),%ecx
ffffffff814e80c0: 41 0f b7 d4 movzwl %r12w,%edx
ffffffff814e80c4: 48 8b 83 b8 00 00 00 mov 0xb8(%rbx),%rax
ffffffff814e80cb: 0f b7 44 08 04 movzwl 0x4(%rax,%rcx,1),%eax
ffffffff814e80d0: 8b 8d b4 00 00 00 mov 0xb4(%rbp),%ecx
ffffffff814e80d6: 29 c2 sub %eax,%edx
ffffffff814e80d8: 48 8b 85 b8 00 00 00 mov 0xb8(%rbp),%rax
ffffffff814e80df: 0f b7 44 08 04 movzwl 0x4(%rax,%rcx,1),%eax
ffffffff814e80e4: 29 c2 sub %eax,%edx
ffffffff814e80e6: 74 0b je ffffffff814e80f3 <tcp_fragment+0x243>
ffffffff814e80e8: 48 89 de mov %rbx,%rsi
ffffffff814e80eb: 4c 89 ff mov %r15,%rdi
ffffffff814e80ee: e8 1a f4 ff ff callq ffffffff814e750d <tcp_adjust_pcount>
ffffffff814e80f3: 8a 45 7c mov 0x7c(%rbp),%al
ffffffff814e80f6: a8 10 test $0x10,%al
ffffffff814e80f8: 74 04 je ffffffff814e80fe <tcp_fragment+0x24e>
ffffffff814e80fa: 0f 0b ud2a
ffffffff814e80fc: eb fe jmp ffffffff814e80fc <tcp_fragment+0x24c>
ffffffff814e80fe: 83 c8 10 or $0x10,%eax
ffffffff814e8101: 88 45 7c mov %al,0x7c(%rbp)
ffffffff814e8104: 8b 85 b4 00 00 00 mov 0xb4(%rbp),%eax
ffffffff814e810a: 48 03 85 b8 00 00 00 add 0xb8(%rbp),%rax
ffffffff814e8111: f0 81 40 28 00 00 01 lock addl $0x10000,0x28(%rax)
ffffffff814e8118: 00
ffffffff814e8119: 48 8b 03 mov (%rbx),%rax
ffffffff814e811c: 48 89 5d 08 mov %rbx,0x8(%rbp)
ffffffff814e8120: 48 89 45 00 mov %rax,0x0(%rbp)
ffffffff814e8124: 48 89 68 08 mov %rbp,0x8(%rax)
ffffffff814e8128: 48 89 2b mov %rbp,(%rbx)
ffffffff814e812b: 31 c0 xor %eax,%eax
ffffffff814e812d: 41 ff 87 10 01 00 00 incl 0x110(%r15)
ffffffff814e8134: eb 05 jmp ffffffff814e813b <tcp_fragment+0x28b>
ffffffff814e8136: b8 f4 ff ff ff mov $0xfffffff4,%eax
ffffffff814e813b: 48 83 c4 18 add $0x18,%rsp
ffffffff814e813f: 5b pop %rbx
ffffffff814e8140: 5d pop %rbp
ffffffff814e8141: 41 5c pop %r12
ffffffff814e8143: 41 5d pop %r13
ffffffff814e8145: 41 5e pop %r14
ffffffff814e8147: 41 5f pop %r15
ffffffff814e8149: c3 retq
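To relate the oops line "RIP: ... tcp_fragment+0x22/0x29a" to the listing above, the offset is added to the function's start address. The following is only a small illustrative sketch: the helper name oops_addr and the two macros are made up for this note and are not kernel code. The start address and size are the values printed in this thread.

```c
#include <assert.h>
#include <stdint.h>

/* Start address and size of tcp_fragment, as printed in the oops
 * ("tcp_fragment+0x22/0x29a") and in the disassembly header. */
#define TCP_FRAGMENT_START UINT64_C(0xffffffff814e7eb0)
#define TCP_FRAGMENT_SIZE  UINT64_C(0x29a)

/* Hypothetical helper: turn "symbol+offset/size" from an oops line
 * into an absolute address that can be looked up in the listing. */
static uint64_t oops_addr(uint64_t sym_start, uint64_t offset,
                          uint64_t sym_size)
{
    assert(offset < sym_size);  /* offset must fall inside the function */
    return sym_start + offset;
}
```

Calling oops_addr(TCP_FRAGMENT_START, 0x22, TCP_FRAGMENT_SIZE) gives 0xffffffff814e7ed2, which is the first ud2a instruction in the listing, directly after the cmp/jbe pair.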
On Friday, 13 May 2011 at 15:30 -0400, TB wrote:
> On 11-05-13 01:27 PM, Eric Dumazet wrote:
> > On Friday, 13 May 2011 at 13:11 -0400, TB wrote:
> >> This is the 2.6.38.5 kernel with the patch in
> >> [PATCH] tcp_cubic: limit delayed_ack ratio to prevent divide error
> >>
> >
> > Please send us full disassembly of tcp_fragment (from vmlinux file)
>
>
> GCC is debian 4.3.2-1.1
> AS 2.18.0.20080103
>
> CPU is Intel Xeon E5620
> Kernel CPU is set to MCORE2 (Core 2/newer Xeon)
>
>
> ffffffff814e7eb0 <tcp_fragment>:
> ffffffff814e7eb0: 41 57 push %r15
> ffffffff814e7eb2: 49 89 ff mov %rdi,%r15
> ffffffff814e7eb5: 41 56 push %r14
> ffffffff814e7eb7: 41 55 push %r13
> ffffffff814e7eb9: 41 89 d5 mov %edx,%r13d
> ffffffff814e7ebc: 41 54 push %r12
> ffffffff814e7ebe: 55 push %rbp
> ffffffff814e7ebf: 53 push %rbx
> ffffffff814e7ec0: 48 89 f3 mov %rsi,%rbx
> ffffffff814e7ec3: 48 83 ec 18 sub $0x18,%rsp
> ffffffff814e7ec7: 89 4c 24 0c mov %ecx,0xc(%rsp)
> ffffffff814e7ecb: 8b 6e 68 mov 0x68(%rsi),%ebp
> ffffffff814e7ece: 39 ea cmp %ebp,%edx
> ffffffff814e7ed0: 76 04 jbe ffffffff814e7ed6 <tcp_fragment+0x26>
> ffffffff814e7ed2: 0f 0b ud2a
> ffffffff814e7ed4: eb fe jmp ffffffff814e7ed4 <tcp_fragment+0x24>
So skb->len = 0x1540 and len = 0x1708
I suspect we should push commit 2fceec13375e5d98 (tcp: len check is
unnecessarily devastating, change to WARN_ON) to stable if not already
done...
http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=2fceec13375e5d98
David, is this commit in your stable queue?
Thanks!
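The two values quoted in this message can be read straight from the register dump: the listing loads 0x68(%rsi) into %ebp (RBP = 0x1540, taken here as skb->len) and compares it with %edx (RDX = 0x1708, the len argument). The sketch below replays that comparison in plain C. The function and constant names are invented for illustration; only the two numbers come from the oops.

```c
#include <stdbool.h>
#include <stdint.h>

/* Values from the register dump: RBP held skb->len, RDX/R13 held len. */
enum { OOPS_SKB_LEN = 0x1540, OOPS_LEN = 0x1708 };

/* Mirrors "cmp %ebp,%edx; jbe ...": the jump over the ud2a trap is
 * taken only when len <= skb_len (unsigned compare). Returns true
 * when the trap would be reached instead. */
static bool len_check_trips(uint32_t len, uint32_t skb_len)
{
    return len > skb_len;
}
```

With the values from the oops, len_check_trips(OOPS_LEN, OOPS_SKB_LEN) is true: the requested split point lies 0x1c8 (456) bytes beyond the end of the skb, so the jbe is not taken and the ud2a executes.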
From: Eric Dumazet <[email protected]>
Date: Fri, 13 May 2011 21:47:38 +0200
> I suspect we should push commit 2fceec13375e5d98 (tcp: len check is
> unnecessarily devastating, change to WARN_ON) to stable if not already
> done...
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=2fceec13375e5d98
>
> David, is this commit in your stable queue ?
No, but now it is.
On 11-05-13 04:01 PM, David Miller wrote:
> From: Eric Dumazet <[email protected]>
> Date: Fri, 13 May 2011 21:47:38 +0200
>
>> I suspect we should push commit 2fceec13375e5d98 (tcp: len check is
>> unnecessarily devastating, change to WARN_ON) to stable if not already
>> done...
>>
>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=2fceec13375e5d98
>>
>> David, is this commit in your stable queue ?
>
> No, but now it is.
We've put this commit with the previous tcp_cubic patch on 60 of our
servers and we're waiting to see how it goes.
On Thursday, 19 May 2011 at 13:08 -0400, TB wrote:
> On 11-05-13 04:01 PM, David Miller wrote:
> > From: Eric Dumazet <[email protected]>
> > Date: Fri, 13 May 2011 21:47:38 +0200
> >
> >> I suspect we should push commit 2fceec13375e5d98 (tcp: len check is
> >> unnecessarily devastating, change to WARN_ON) to stable if not already
> >> done...
> >>
> >> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=2fceec13375e5d98
> >>
> >> David, is this commit in your stable queue ?
> >
> > No, but now it is.
>
> We've put this commit with the previous tcp_cubic patch on 60 of our
> servers and we're waiting to see how it goes.
Don't expect too much. It only lets the machine survive after logging
a warning, instead of halting ;)
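As a rough userspace model of what "survive after logging" means: the check still fires, but the caller gets an error back instead of the machine stopping. The thread only says the BUG_ON became a WARN_ON; the -EINVAL return below is my recollection of that commit and should be treated as an assumption. warn_on() and fragment_len_check() are stand-ins written for this note, not the kernel macro or the real tcp_fragment().

```c
#include <errno.h>
#include <stdio.h>

/* Simplified stand-in for the kernel's WARN_ON(): log the failed
 * condition and evaluate to it. The real macro also dumps a stack. */
static int warn_on(int cond, const char *expr)
{
    if (cond)
        fprintf(stderr, "WARNING: %s\n", expr);
    return cond;
}

/* Model of the length check after the change: warn and return an
 * error (assumed to be -EINVAL) rather than trapping. */
static int fragment_len_check(unsigned int len, unsigned int skb_len)
{
    if (warn_on(len > skb_len, "len > skb->len"))
        return -EINVAL;
    return 0;
}
```

With the values from the oops, fragment_len_check(0x1708, 0x1540) logs one warning and returns -EINVAL. Whatever produced the oversized len is still there, which is why the change does not fix the underlying problem.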
On 11-05-19 01:11 PM, Eric Dumazet wrote:
> On Thursday, 19 May 2011 at 13:08 -0400, TB wrote:
>> On 11-05-13 04:01 PM, David Miller wrote:
>>> From: Eric Dumazet <[email protected]>
>>> Date: Fri, 13 May 2011 21:47:38 +0200
>>>
>>>> I suspect we should push commit 2fceec13375e5d98 (tcp: len check is
>>>> unnecessarily devastating, change to WARN_ON) to stable if not already
>>>> done...
>>>>
>>>> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=commitdiff;h=2fceec13375e5d98
>>>>
>>>> David, is this commit in your stable queue ?
>>>
>>> No, but now it is.
>>
>> We've put this commit with the previous tcp_cubic patch on 60 of our
>> servers and we're waiting to see how it goes.
>
> Dont expect too much. It only permits to survive after logging messages,
> instead of halting machine ;)
That's all I'm expecting :)
However, 3 of our servers hung over the weekend with a blank screen and
no network, and there was nothing unusual in the netconsole output.
On Tuesday, 24 May 2011 at 12:09 -0400, TB wrote:
> That's all I'm expecting :)
>
> However we've got 3 servers that hung with a blank screen and no network
> over the weekend and nothing unusual in the net console
Oh well, it could be anything ...