2009-12-29 08:00:27

by Bruno Prémont

Subject: BNX2: Kernel crashes with 2.6.31 and 2.6.31.9

On a system that had been running 2.6.31 since last September I got two
crashes at night this December (cause unknown). Yesterday, after the second
crash, I updated the kernel to 2.6.31.9 and enabled netconsole in the hope
of getting some information about the cause of the crash.

Today the system crashed once again and all I got is the following
incomplete trace on the receiving side of netconsole:

[24701.841185] BUG: unable to handle kernel NULL pointer dereference at (null)
[24701.841188] IP: [<ffffffffa00610fc>] bnx2_poll_work+0x2c/0x12d0 [bnx2]
[24701.841197] PGD 16509067 PUD 4e776067 PMD 0
[24701.841199] Oops: 0000 [#1] SMP
[24701.841202] last sysfs file: /sys/kernel/uevent_seqnum
[24701.841204] CPU 0
[24701.841205] Modules linked in: ipmi_devintf squashfs ext2
zlib_inflate netconsole configfs loop dm_round_robin scsi_dh_rdac
dm_multipath scsi_dh dm_mod sg sr_mod cdrom ata_piix ipmi_si
ipmi_msghandler qla2xxx ahci bnx2 hpwdt uhci_hcd ehci_hcd libata
[24701.841218] Pid: 11273, comm: php-cgi Not tainted 2.6.31.9-x86_64 #1 ProLiant DL360 G5
[24701.841220] RIP: 0010:[<ffffffffa00610fc>] [<ffffffffa00610fc>] bnx2_poll_work+0x2c/0x12d0 [bnx2]


Running objdump on the bnx2.ko module I get the following:
000000000000a0d0 <bnx2_poll_work>:
a0d0: 41 57 push %r15
a0d2: 41 56 push %r14
a0d4: 41 55 push %r13
a0d6: 41 54 push %r12
a0d8: 55 push %rbp
a0d9: 53 push %rbx
a0da: 48 81 ec 28 01 00 00 sub $0x128,%rsp
a0e1: 48 89 7c 24 18 mov %rdi,0x18(%rsp)
a0e6: 48 89 74 24 10 mov %rsi,0x10(%rsp)
a0eb: 89 54 24 0c mov %edx,0xc(%rsp)
a0ef: 89 4c 24 08 mov %ecx,0x8(%rsp)
a0f3: 48 8b 54 24 10 mov 0x10(%rsp),%rdx
a0f8: 48 8b 42 70 mov 0x70(%rdx),%rax
a0fc: 0f b7 10 movzwl (%rax),%edx
a0ff: 31 c0 xor %eax,%eax
a101: 48 8b 4c 24 10 mov 0x10(%rsp),%rcx
a106: 80 fa ff cmp $0xff,%dl
a109: 0f 94 c0 sete %al
a10c: 01 c2 add %eax,%edx
a10e: 66 39 91 1a 02 00 00 cmp %dx,0x21a(%rcx)
a115: 0f 84 78 01 00 00 je a293 <bnx2_poll_work+0x1c3>
a11b: 48 8b 57 08 mov 0x8(%rdi),%rdx
a11f: 48 89 f8 mov %rdi,%rax
a122: 48 8b 9a 00 03 00 00 mov 0x300(%rdx),%rbx
a129: 48 83 c0 40 add $0x40,%rax
a12d: 48 29 c1 sub %rax,%rcx
a130: 48 89 c8 mov %rcx,%rax
a133: 48 c1 f8 06 sar $0x6,%rax
a137: 69 c0 39 8e e3 38 imul $0x38e38e39,%eax,%eax
a13d: 48 c1 e0 07 shl $0x7,%rax
a141: 48 01 d8 add %rbx,%rax
a144: 48 89 44 24 20 mov %rax,0x20(%rsp)
a149: 48 8b 7c 24 10 mov 0x10(%rsp),%rdi
a14e: 48 8b 47 70 mov 0x70(%rdi),%rax
a152: 44 0f b7 30 movzwl (%rax),%r14d
a156: 31 c0 xor %eax,%eax
a158: 0f b7 9f 18 02 00 00 movzwl 0x218(%rdi),%ebx
a15f: 41 80 fe ff cmp $0xff,%r14b
a163: 0f 94 c0 sete %al
a166: 45 31 ff xor %r15d,%r15d
a169: 41 01 c6 add %eax,%r14d
a16c: 66 44 39 f3 cmp %r14w,%bx
a170: 0f 84 ee 00 00 00 je a264 <bnx2_poll_work+0x194>
a176: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
a17d: 00 00 00
a180: 0f b6 cb movzbl %bl,%ecx
a183: 48 8b 44 24 10 mov 0x10(%rsp),%rax
a188: 44 0f b7 e1 movzwl %cx,%r12d
a18c: 49 c1 e4 04 shl $0x4,%r12
a190: 4c 03 a0 10 02 00 00 add 0x210(%rax),%r12
a197: 4d 8b 2c 24 mov (%r12),%r13
a19b: 66 41 83 7c 24 08 00 cmpw $0x0,0x8(%r12)
a1a2: 41 0f 18 8d bc 00 00 prefetcht0 0xbc(%r13)
a1a9: 00
...


The kernel was compiled on Gentoo (64-bit):
Linux version 2.6.31.9-x86_64 () (gcc version 4.3.4 (Gentoo 4.3.4 p1.0, pie-10.1.5) ) #1 SMP Mon Dec 28 15:49:16 CET 2009
The affected server (HP DL360 G5) is running OpenSuSE-11.1 with a
32-bit userspace.

Any idea if there is a recent patch that could fix this issue? At the
time of the crash the server was not particularly loaded and had around
200 packets/s of network traffic.

Regards,
Bruno


2009-12-29 09:05:52

by Benjamin Li

Subject: Re: BNX2: Kernel crashes with 2.6.31 and 2.6.31.9

Hi Bruno,

It looks like the NULL dereference is happening at a0fc.

a0f8: 48 8b 42 70 mov 0x70(%rdx),%rax
a0fc: 0f b7 10 movzwl (%rax),%edx
a0ff: 31 c0 xor %eax,%eax

The offset 0x70 (112 decimal) is the hw_tx_cons_ptr field in the
bnx2_napi structure (see the bnx2_napi structure dump below). These
lines come from the routine bnx2_get_hw_tx_cons(), which looks like it
was inlined by the compiler. More specifically, it looks like the
dereference of hw_tx_cons_ptr failed.

cons = *bnapi->hw_tx_cons_ptr;

http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=drivers/net/bnx2.c;h=06b901152d4487fa04164437cc179661b44657fe;hb=74fca6a42863ffacaf7ba6f1936a9f228950f657#l2761
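
The offset arithmetic can be checked outside the kernel. The following is a minimal userspace sketch, assuming an LP64 target (8-byte pointers); mock_napi_struct and mock_bnx2_napi are hypothetical stand-ins that reproduce only the leading field sizes from the pahole dump in this message, not the real driver definitions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in: only the size (96 bytes per pahole) matters. */
struct mock_napi_struct {
	unsigned char opaque[96];
};

/* Hypothetical mirror of the first fields of struct bnx2_napi. */
struct mock_bnx2_napi {
	struct mock_napi_struct napi;   /* offset   0, size 96 */
	void *bp;                       /* offset  96 (0x60)   */
	union {
		void *msi;
		void *msix;
	} status_blk;                   /* offset 104 (0x68)   */
	uint16_t *hw_tx_cons_ptr;       /* offset 112 (0x70)   */
	uint16_t *hw_rx_cons_ptr;       /* offset 120 (0x78)   */
};

/* Byte offset of the pointer loaded by "mov 0x70(%rdx),%rax". */
size_t tx_cons_ptr_offset(void)
{
	return offsetof(struct mock_bnx2_napi, hw_tx_cons_ptr);
}

/* Byte offset of the bp back-pointer, for comparison. */
size_t bp_offset(void)
{
	return offsetof(struct mock_bnx2_napi, bp);
}

/* Abort if the mock layout disagrees with the displacement in the oops. */
void check_faulting_offset(void)
{
	assert(tx_cons_ptr_offset() == 0x70);
}
```

With this layout the displacement 0x70 lands on hw_tx_cons_ptr, while bp sits at 0x60, so the faulting movzwl is reading through hw_tx_cons_ptr.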

To be sure this is the case, could you send the .config file you are
using, or the bnx2 kernel module built with the CFLAG '-g'? Then we can
verify exactly where in the code it is crashing.

Did you see anything suspicious in the system kernel logs? If you could
isolate the logs from when the machine booted to when it crashed and
send them to us, it would be very helpful.

Thanks again for your time.

-Ben


<--snip snip structure dump from pahole-->
struct bnx2_napi {
        struct napi_struct         napi;                 /*     0    96 */
        /* --- cacheline 1 boundary (64 bytes) was 32 bytes ago --- */
        struct bnx2 *              bp;                   /*    96     8 */
        union {
                struct status_block * msi;               /*           8 */
                struct status_block_msix * msix;         /*           8 */
        } status_blk;                                    /*   104     8 */
        u16 *                      hw_tx_cons_ptr;       /*   112     8 */
        u16 *                      hw_rx_cons_ptr;       /*   120     8 */
        /* --- cacheline 2 boundary (128 bytes) --- */
        u32                        last_status_idx;      /*   128     4 */
        u32                        int_num;              /*   132     4 */
        struct bnx2_rx_ring_info   rx_ring;              /*   136   360 */
        /* --- cacheline 7 boundary (448 bytes) was 48 bytes ago --- */
        struct bnx2_tx_ring_info   tx_ring;              /*   496    48 */
        /* --- cacheline 8 boundary (512 bytes) was 32 bytes ago --- */

        /* size: 576, cachelines: 9 */
        /* padding: 32 */
};
<--snip snip-->


2009-12-29 09:33:24

by Bruno Prémont

Subject: Re: BNX2: Kernel crashes with 2.6.31 and 2.6.31.9

Hi Benjamin,

On Tue, 29 Dec 2009 01:05:40 "Benjamin Li" <[email protected]> wrote:
> Hi Bruno,
>
> It looks like the NULL dereference is happening at a0fc.
>
> a0f8: 48 8b 42 70 mov 0x70(%rdx),%rax
> a0fc: 0f b7 10 movzwl (%rax),%edx
> a0ff: 31 c0 xor %eax,%eax

Thanks for confirming my guess.

> The offset 0x70 (112 decimal) is the hw_tx_cons_ptr field in the
> bnx2_napi structure (see the bnx2_napi structure dump below). These
> lines come from the routine bnx2_get_hw_tx_cons(), which looks like it
> was inlined by the compiler. More specifically, it looks like the
> dereference of hw_tx_cons_ptr failed.
>
> cons = *bnapi->hw_tx_cons_ptr;
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=drivers/net/bnx2.c;h=06b901152d4487fa04164437cc179661b44657fe;hb=74fca6a42863ffacaf7ba6f1936a9f228950f657#l2761
>
> To be sure this is the case, could you send the .config file you are
> using, or the bnx2 kernel module built with the CFLAG '-g'? Then we
> can verify exactly where in the code it is crashing.

See the attached .config. If needed I can recompile the module with
'-g', but the original instance does not contain debugging info.

> Did you see anything suspicious in the system kernel logs? If you
> could isolate the logs from when the machine booted to when it
> crashed and send them to us, it would be very helpful.

Unfortunately there is nothing suspicious in there; all I have is the
attached dmesg (with IP addresses and MAC addresses replaced by '*'s).

I've not appended the crash dump gathered via netconsole, which didn't
make it to the affected system's disk (see my previous mail for it).


Regards,
Bruno





Attachments:
(No filename) (8.12 kB)
dmesg (48.92 kB)
.config (50.16 kB)

2009-12-29 13:54:11

by Bruno Prémont

Subject: Re: BNX2: Kernel crashes with 2.6.31 and 2.6.31.9

On Tue, 29 Dec 2009 01:05:40 "Benjamin Li" <[email protected]> wrote:
> Hi Bruno,
>
> It looks like the NULL dereference is happening at a0fc.
>
> a0f8: 48 8b 42 70 mov 0x70(%rdx),%rax
> a0fc: 0f b7 10 movzwl (%rax),%edx
> a0ff: 31 c0 xor %eax,%eax
>
> The offset 0x70 (112 decimal) is the hw_tx_cons_ptr field in the
> bnx2_napi structure (see the bnx2_napi structure dump below). These
> lines come from the routine bnx2_get_hw_tx_cons(), which looks like it
> was inlined by the compiler. More specifically, it looks like the
> dereference of hw_tx_cons_ptr failed.
>
> cons = *bnapi->hw_tx_cons_ptr;
>
> http://git.kernel.org/?p=linux/kernel/git/torvalds/linux-2.6.git;a=blob;f=drivers/net/bnx2.c;h=06b901152d4487fa04164437cc179661b44657fe;hb=74fca6a42863ffacaf7ba6f1936a9f228950f657#l2761
>
> To be sure this is the case, could you send the .config file you are
> using, or the bnx2 kernel module built with the CFLAG '-g'? Then we
> can verify exactly where in the code it is crashing.
>
> Did you see anything suspicious in the system kernel logs? If you
> could isolate the logs from when the machine booted to when it
> crashed and send them to us, it would be very helpful.

It crashes every now and then (since netconsole was enabled it does not
survive 24 hours :( ), while or just after transmitting log messages with
netconsole. The messages being transmitted are generated by the
netfilter 'LOG' target.

Sample output as seen by netconsole recipient (1 packet per line, IP
addresses masked):

[ 2115.949606] (reject)output: IN= OUT=eth0
SRC=***.**.*.** DST=**.***.**.***
LEN=60 TOS=0x00 PREC=0x00 TTL=64 ID=29589
DF
PROTO=TCP
SPT=58991 DPT=80
WINDOW=5840
RES=0x00
SYN
URGP=0

[ 2115.949704] (reject)output: IN= OUT=eth0
SRC=***.**.*.** DST=**.***.**.***
[ 2115.949729] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 2115.949732] IP: [<ffffffffa00680fc>] bnx2_poll_work+0x2c/0x12d0 [bnx2]
[ 2115.949742] PGD 5b6f0067 PUD 59c04067 PMD 0
[ 2115.949744] Oops: 0000 [#1] SMP
[ 2115.949746] last sysfs file: /sys/kernel/uevent_seqnum
[ 2115.949749] CPU 3
[ 2115.949750] Modules linked in: dm_round_robin scsi_dh_rdac ipmi_devintf netconsole squashfs configfs zlib_inflate ext2 loop dm_multipath scsi_dh dm_mod sg sr_mod cdrom ata_piix hpwdt qla2xxx ipmi_si ahci bnx2 ipmi_msghandler libata uhci_hcd ehci_hcd
[ 2115.949764] Pid: 7926, comm: php-cgi Not tainted 2.6.31.9-x86_64 #1 ProLiant DL360 G5
[ 2115.949766] RIP: 0010:[<ffffffffa00680fc>] [<ffffffffa00680fc>] bnx2_poll_work+0x2c/0x12d0 [bnx2]

Looks like netpoll is triggering suicide on BNX2.

Any way to make the NULL pointer non-fatal would help a lot! (Is there
any sensible thing to do when bnapi->hw_tx_cons_ptr is NULL that would
allow the system to continue working without killing everything?)
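
One way to picture such a guard is the userspace sketch below. It is not the driver's code and not a proposed fix: mock_bnapi is a hypothetical flattened stand-in (in the driver the cached index lives in the TX ring structure), guarded_get_hw_tx_cons is an invented name, and MAX_TX_DESC_CNT is assumed to be 255 because the disassembly compares the low byte against 0xff. When the pointer is NULL the function returns the cached index, so the caller sees "no new completions" instead of faulting.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_TX_DESC_CNT 255 /* assumed from "cmp $0xff,%dl" in the objdump */

/* Hypothetical, flattened stand-in for the fields used by the TX check. */
struct mock_bnapi {
	uint16_t *hw_tx_cons_ptr; /* NULL on a vector that was never set up */
	uint16_t hw_tx_cons;      /* last consumer index the driver processed */
};

/*
 * Read the hardware TX consumer index, tolerating a NULL pointer.
 * With a NULL pointer the cached index is returned, so a comparison
 * against hw_tx_cons reports that there is no TX work to do.
 */
uint16_t guarded_get_hw_tx_cons(const struct mock_bnapi *bnapi)
{
	uint16_t cons;

	assert(bnapi != NULL);
	if (bnapi->hw_tx_cons_ptr == NULL)
		return bnapi->hw_tx_cons;

	/* The status block can change underneath us, hence volatile. */
	cons = *(volatile uint16_t *)bnapi->hw_tx_cons_ptr;

	/* Skip the index whose low byte is 0xff, as the disassembly does. */
	if ((cons & MAX_TX_DESC_CNT) == MAX_TX_DESC_CNT)
		cons++;
	return cons;
}
```

A guard like this would only hide the symptom on the TX side; hw_rx_cons_ptr on the same structure would presumably be NULL as well, so the question of why an unset vector gets polled at all remains.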


Regards,
Bruno

2009-12-30 05:08:18

by Benjamin Li

Subject: Re: BNX2: Kernel crashes with 2.6.31 and 2.6.31.9

Hi Bruno.

Could you try running with the attached patch? This debug patch is
built against the linux-2.6.31.9 kernel. I think the panic is occurring
right before a reset has occurred due to a TX timeout. To see if this is
happening, this patch will print hardware state information when a TX
timeout occurs. If you could run with this patch and send the logs when
the panic occurs, I would really appreciate it.

Thanks again.

-Ben



Attachments:
bnx2_ftq_state_dump.diff (5.56 kB)

2010-02-19 08:10:40

by Bruno Prémont

Subject: Re: BNX2: Kernel crashes with 2.6.31 and 2.6.31.9

Hi Benjamin,

On Tue, 29 Dec 2009 21:08:11 "Benjamin Li" wrote:
> Could you try running with the attached patch? This debug patch is
> built against the linux-2.6.31.9 kernel. I think the panic is
> occurring right before a reset has occurred due to a TX timeout. To
> see if this is happening, this patch will print hardware state
> information when a TX timeout occurs. If you could run with this
> patch and send the logs when the panic occurs, I would really
> appreciate it.
>
> Thanks again.
>
> -Ben

Sorry for replying so late, but I've been too busy with other
things.

Anyhow, I did some more testing yesterday and today and am now
able to reproduce the (or at least a) crash pretty easily.

The easiest way is to run netconsole and do
'echo t > /proc/sysrq-trigger' via SSH on an otherwise idle server
(from the local console nothing bad happens), but then I have no means
to communicate with the kernel (I guess it's deadlocked somewhere in
the printk code).

The slightly less easy way to trigger it is with a dummy module that
more or less simulates netconsole behavior but with dummy data (see
attached). I need to have some more traffic (TCP?) going on for the bug
to trigger, and I have to tell my module multiple times to push data.
This way the server is still accessible via VGA or serial console.

Attached are my 'netbomb.c' (which is a modified netconsole.c) and the
full kernel log. This time I am running a 2.6.33-rc8-git3 kernel, having
forward-ported your patch above (about half of it was already present).


This time I got the following trace:
[ 134.643292] BUG: unable to handle kernel NULL pointer dereference at (null)
[ 134.643304] IP: [<ffffffffa003edc2>] bnx2_poll_work+0x32/0x13d0 [bnx2]
[ 134.643314] PGD 2a972a067 PUD 2aa245067 PMD 0
[ 134.643319] Oops: 0000 [#1] SMP
[ 134.643323] last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:01:04.6/class
[ 134.643328] CPU 4
[ 134.643334] Pid: 3226, comm: cat Not tainted 2.6.33-rc8-git3-x86_64 #3 /ProLiant DL360 G5
[ 134.643339] RIP: 0010:[<ffffffffa003edc2>] [<ffffffffa003edc2>] bnx2_poll_work+0x32/0x13d0 [bnx2]
[ 134.643347] RSP: 0018:ffff8802a9643b38 EFLAGS: 00010092
[ 134.643351] RAX: 0000000000000000 RBX: ffff8802afab57c0 RCX: 0000000000000010
[ 134.643355] RDX: 0000000000000000 RSI: ffff8802afab57c0 RDI: ffff8802afab4580
[ 134.643359] RBP: ffff8802a9643cd8 R08: ffff8802af051000 R09: 0000000000000007
[ 134.643363] R10: 000000000000000e R11: 0000000000000000 R12: 0000000000000000
[ 134.643367] R13: 0000000000000010 R14: 0000000000000000 R15: ffff8802afab4580
[ 134.643371] FS: 0000000000000000(0000) GS:ffff880028300000(0063) knlGS:00000000f765f6c0
[ 134.643376] CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
[ 134.643380] CR2: 0000000000000000 CR3: 00000002a9606000 CR4: 00000000000006e0
[ 134.643384] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 134.643388] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 134.643392] Process cat (pid: 3226, threadinfo ffff8802a9642000, task ffff8802aa14bff0)
[ 134.643396] Stack:
[ 134.643398] 0000000000000070 0000000000000002 0000000000000010 ffff8802afab57c0
[ 134.643404] <0> ffff8802afab4580 0000000300000002 0000000000000000 0000000100000002
[ 134.643410] <0> 0000000000000000 0000000200025220 ffffffff81862e80 0000000000000001
[ 134.643418] Call Trace:
[ 134.643427] [<ffffffff8107e3fe>] ? __alloc_pages_nodemask+0xfe/0x660
[ 134.643433] [<ffffffff811b6de6>] ? msi_set_mask_bit+0x26/0xc0
[ 134.643438] [<ffffffff811b6e8b>] ? unmask_msi_irq+0xb/0x10
[ 134.643443] [<ffffffff8106db54>] ? default_enable+0x24/0x40
[ 134.643448] [<ffffffff8106d9b6>] ? check_irq_resend+0x26/0x70
[ 134.643453] [<ffffffff8106cc23>] ? __enable_irq+0x73/0x80
[ 134.643459] [<ffffffffa004019e>] bnx2_poll_msix+0x3e/0xd0 [bnx2]
[ 134.643465] [<ffffffff8135bcd1>] netpoll_poll+0xe1/0x3c0
[ 134.643470] [<ffffffff8135c168>] netpoll_send_skb+0x118/0x210
[ 134.643475] [<ffffffff8135c45b>] netpoll_send_udp+0x1fb/0x210
[ 134.643480] [<ffffffffa00981c5>] write_msg+0x95/0xd0 [netbomb]
[ 134.643485] [<ffffffffa0098255>] netbomb_write+0x55/0xa4 [netbomb]
[ 134.643492] [<ffffffff810f6581>] proc_reg_write+0x71/0xb0
[ 134.643498] [<ffffffff810ab6cb>] vfs_write+0xcb/0x180
[ 134.643503] [<ffffffff810ab870>] sys_write+0x50/0x90
[ 134.643509] [<ffffffff8102a1a4>] sysenter_dispatch+0x7/0x2b
[ 134.643513] Code: 56 41 55 41 54 53 48 81 ec 78 01 00 00 48 89 bd 80 fe ff ff 48 89 b5 78 fe ff ff 89 95 74 fe ff ff 89 8d 70 fe ff ff 48 8b 46 70 <0f> b7 10 31 c0 80 fa ff 0f 94 c0 01 c2 66 39 96 12 02 00 00 0f
[ 134.643551] RIP [<ffffffffa003edc2>] bnx2_poll_work+0x32/0x13d0 [bnx2]
[ 134.643557] RSP <ffff8802a9643b38>
[ 134.643559] CR2: 0000000000000000
[ 134.643563] ---[ end trace 48bdec67d6d7aadb ]---

Running objdump on a kernel compiled with debugging symbols, this matches:
000000000000ad90 <bnx2_poll_work>:
}
}

static int bnx2_poll_work(struct bnx2 *bp, struct bnx2_napi *bnapi,
int work_done, int budget)
{
ad90: 55 push %rbp
ad91: 48 89 e5 mov %rsp,%rbp
ad94: 41 57 push %r15
ad96: 41 56 push %r14
ad98: 41 55 push %r13
ad9a: 41 54 push %r12
ad9c: 53 push %rbx
ad9d: 48 81 ec 78 01 00 00 sub $0x178,%rsp
ada4: 48 89 bd 80 fe ff ff mov %rdi,-0x180(%rbp)
adab: 48 89 b5 78 fe ff ff mov %rsi,-0x188(%rbp)
adb2: 89 95 74 fe ff ff mov %edx,-0x18c(%rbp)
adb8: 89 8d 70 fe ff ff mov %ecx,-0x190(%rbp)
{
u16 cons;

/* Tell compiler that status block fields can change. */
barrier();
cons = *bnapi->hw_tx_cons_ptr;
adbe: 48 8b 46 70 mov 0x70(%rsi),%rax
adc2: 0f b7 10 movzwl (%rax),%edx
barrier();
if (unlikely((cons & MAX_TX_DESC_CNT) == MAX_TX_DESC_CNT))
cons++;
adc5: 31 c0 xor %eax,%eax
adc7: 80 fa ff cmp $0xff,%dl
adca: 0f 94 c0 sete %al
adcd: 01 c2 add %eax,%edx
int work_done, int budget)
{
struct bnx2_tx_ring_info *txr = &bnapi->tx_ring;
struct bnx2_rx_ring_info *rxr = &bnapi->rx_ring;

if (bnx2_get_hw_tx_cons(bnapi) != txr->hw_tx_cons)
adcf: 66 39 96 12 02 00 00 cmp %dx,0x212(%rsi)
add6: 0f 84 4f 03 00 00 je b12b <bnx2_poll_work+0x39b>


So, as already determined, bnapi->hw_tx_cons_ptr is NULL... but nothing
happens after that on the network side.

Regards,
Bruno


Attachments:
(No filename) (6.65 kB)
bnx2.dmesg (55.14 kB)
netbomb.c (9.25 kB)

2010-02-19 19:57:20

by Benjamin Li

Subject: Re: BNX2: Kernel crashes with 2.6.31 and 2.6.31.9

Hi Bruno,

No problem. Thanks for following up on this; I really appreciate all
your help.

From your logs it looks like the device came up using MSI, but the
MSI-X poll routine was being called:

[ 9.836673] bnx2: eth0: using MSI
...

[ 134.643459] [<ffffffffa004019e>] bnx2_poll_msix+0x3e/0xd0 [bnx2]
[ 134.643465] [<ffffffff8135bcd1>] netpoll_poll+0xe1/0x3c0

which is incorrect. If we are in MSI mode, the bnx2_poll() routine
should be used.

I think what is going on here is that during bnx2 driver
initialization the current bnx2 driver adds all possible NAPI structures
that map to all the hardware vectors (BNX2_MAX_MSIX_VEC=9) to the NAPI
list in the net_device structure, regardless of whether they are used
(see drivers/net/bnx2.c:bnx2_init_napi()). This can cause
uninitialized NAPI structures to be placed on the napi_list. Because
this device is in MSI mode, only 1 vector is initialized. The
problem is triggered when net/core/netpoll.c:poll_napi() is called,
because this routine runs through the entire napi_list
calling all the poll routines. In your particular case, it is calling
the poll routine on an uninitialized vector, causing the kernel panic.
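
The mechanism described above can be modeled in a few lines of userspace C. This is a toy sketch under stated assumptions, not the driver code and not the patch referred to in this message: mock_dev, mock_vec, register_all, register_used and count_unsafe_polls are hypothetical names, and the "poll list" is reduced to a flag per vector.

```c
#include <assert.h>
#include <stddef.h>

#define MOCK_MAX_VEC 9 /* mirrors BNX2_MAX_MSIX_VEC=9 from the text */

/* Hypothetical per-vector state. */
struct mock_vec {
	int registered;             /* is it on the device's poll list? */
	const unsigned short *cons; /* status-block pointer; NULL if unused */
};

/* Hypothetical device with one slot per possible hardware vector. */
struct mock_dev {
	struct mock_vec vec[MOCK_MAX_VEC];
	int irq_nvecs; /* vectors actually in use: 1 in MSI mode */
};

/* Behaviour described above: every possible vector is registered. */
void register_all(struct mock_dev *dev)
{
	assert(dev != NULL);
	for (int i = 0; i < MOCK_MAX_VEC; i++)
		dev->vec[i].registered = 1;
}

/* Alternative: register only the vectors that were initialized. */
void register_used(struct mock_dev *dev)
{
	assert(dev != NULL);
	for (int i = 0; i < MOCK_MAX_VEC; i++)
		dev->vec[i].registered = (i < dev->irq_nvecs);
}

/*
 * Count the registered entries that a netpoll-style walk over the
 * whole list would poll even though their pointer is NULL.
 */
int count_unsafe_polls(const struct mock_dev *dev)
{
	int unsafe = 0;

	assert(dev != NULL);
	for (int i = 0; i < MOCK_MAX_VEC; i++)
		if (dev->vec[i].registered && dev->vec[i].cons == NULL)
			unsafe++;
	return unsafe;
}
```

In this model a device in MSI mode (one initialized vector) has eight registered-but-uninitialized entries after register_all() and none after register_used(), which is the difference between a list walk that hits a NULL pointer and one that does not.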

Please try the patch below to see if it solves your problem. Note that
this has only been compile tested and tested against basic traffic runs.
Unfortunately, I could not reproduce the kernel panic with the
instructions below, so I could not verify the patch.

Thanks again for all your help in tracking this down.

-Ben

On Fri, 2010-02-19 at 00:10 -0800, Bruno Prémont wrote:
> Hi Benjamin,
>
> On Tue, 29 Dec 2009 21:08:11 "Benjamin Li" wrote:
> > Could you try running with the attached patch? This debug patch is
> > built against the linux-2.6.31.9 kernel. I think the panic is
> > occurring right before a reset has occurred due to a TX timeout. To
> > see if this is happening, this patch will print hardware state
> > information when a TX timeout occurs. If you could run with this
> > patch and send the logs when the panic occurs, I would really
> > appreciate it.
> >
> > Thanks again.
> >
> > -Ben
>
> Sorry for replying only this late but I've been too busy with other
> things.
>
> Anyhow, I've been doing some more testing yesterday and today and now
> am able to reproduce the/a crash pretty easily.
>
> The easiest way is to run netconsole and do
> 'echo t > /proc/sysrq-trigger' via SSH on an otherwise idle server
> (from the local console nothing bad happens), but then I have no means
> to communicate with the kernel (I guess it's deadlocked somewhere in
> the printk code).
>
> The slightly less easy way to trigger it is with a dummy module that
> kind of simulates netconsole behavior but with dummy data (see
> attached). I have to have some more traffic (TCP?) going on for the bug
> to trigger, and have to tell my module multiple times to push data. This
> way the server is still accessible via VGA or serial console.
>
> Attached are my 'netbomb.c' (which is a modified netconsole.c) and
> full kernel log. This time running a 2.6.33-rc8-git3 kernel, having
> forward-ported your patch above (e.g. half of it was already present)
>
>
> This time I got the following trace:
> [ 134.643292] BUG: unable to handle kernel NULL pointer dereference at (null)
> [ 134.643304] IP: [<ffffffffa003edc2>] bnx2_poll_work+0x32/0x13d0 [bnx2]
> [ 134.643314] PGD 2a972a067 PUD 2aa245067 PMD 0
> [ 134.643319] Oops: 0000 [#1] SMP
> [ 134.643323] last sysfs file: /sys/devices/pci0000:00/0000:00:1e.0/0000:01:04.6/class
> [ 134.643328] CPU 4
> [ 134.643334] Pid: 3226, comm: cat Not tainted 2.6.33-rc8-git3-x86_64 #3 /ProLiant DL360 G5
> [ 134.643339] RIP: 0010:[<ffffffffa003edc2>] [<ffffffffa003edc2>] bnx2_poll_work+0x32/0x13d0 [bnx2]
> [ 134.643347] RSP: 0018:ffff8802a9643b38 EFLAGS: 00010092
> [ 134.643351] RAX: 0000000000000000 RBX: ffff8802afab57c0 RCX: 0000000000000010
> [ 134.643355] RDX: 0000000000000000 RSI: ffff8802afab57c0 RDI: ffff8802afab4580
> [ 134.643359] RBP: ffff8802a9643cd8 R08: ffff8802af051000 R09: 0000000000000007
> [ 134.643363] R10: 000000000000000e R11: 0000000000000000 R12: 0000000000000000
> [ 134.643367] R13: 0000000000000010 R14: 0000000000000000 R15: ffff8802afab4580
> [ 134.643371] FS: 0000000000000000(0000) GS:ffff880028300000(0063) knlGS:00000000f765f6c0
> [ 134.643376] CS: 0010 DS: 002b ES: 002b CR0: 000000008005003b
> [ 134.643380] CR2: 0000000000000000 CR3: 00000002a9606000 CR4: 00000000000006e0
> [ 134.643384] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 134.643388] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
> [ 134.643392] Process cat (pid: 3226, threadinfo ffff8802a9642000, task ffff8802aa14bff0)
> [ 134.643396] Stack:
> [ 134.643398] 0000000000000070 0000000000000002 0000000000000010 ffff8802afab57c0
> [ 134.643404] <0> ffff8802afab4580 0000000300000002 0000000000000000 0000000100000002
> [ 134.643410] <0> 0000000000000000 0000000200025220 ffffffff81862e80 0000000000000001
> [ 134.643418] Call Trace:
> [ 134.643427] [<ffffffff8107e3fe>] ? __alloc_pages_nodemask+0xfe/0x660
> [ 134.643433] [<ffffffff811b6de6>] ? msi_set_mask_bit+0x26/0xc0
> [ 134.643438] [<ffffffff811b6e8b>] ? unmask_msi_irq+0xb/0x10
> [ 134.643443] [<ffffffff8106db54>] ? default_enable+0x24/0x40
> [ 134.643448] [<ffffffff8106d9b6>] ? check_irq_resend+0x26/0x70
> [ 134.643453] [<ffffffff8106cc23>] ? __enable_irq+0x73/0x80
> [ 134.643459] [<ffffffffa004019e>] bnx2_poll_msix+0x3e/0xd0 [bnx2]
> [ 134.643465] [<ffffffff8135bcd1>] netpoll_poll+0xe1/0x3c0
> [ 134.643470] [<ffffffff8135c168>] netpoll_send_skb+0x118/0x210
> [ 134.643475] [<ffffffff8135c45b>] netpoll_send_udp+0x1fb/0x210
> [ 134.643480] [<ffffffffa00981c5>] write_msg+0x95/0xd0 [netbomb]
> [ 134.643485] [<ffffffffa0098255>] netbomb_write+0x55/0xa4 [netbomb]
> [ 134.643492] [<ffffffff810f6581>] proc_reg_write+0x71/0xb0
> [ 134.643498] [<ffffffff810ab6cb>] vfs_write+0xcb/0x180
> [ 134.643503] [<ffffffff810ab870>] sys_write+0x50/0x90
> [ 134.643509] [<ffffffff8102a1a4>] sysenter_dispatch+0x7/0x2b
> [ 134.643513] Code: 56 41 55 41 54 53 48 81 ec 78 01 00 00 48 89 bd 80 fe ff ff 48 89 b5 78 fe ff ff 89 95 74 fe ff ff 89 8d 70 fe ff ff 48 8b 46 70 <0f> b7 10 31 c0 80 fa ff 0f 94 c0 01 c2 66 39 96 12 02 00 00 0f
> [ 134.643551] RIP [<ffffffffa003edc2>] bnx2_poll_work+0x32/0x13d0 [bnx2]
> [ 134.643557] RSP <ffff8802a9643b38>
> [ 134.643559] CR2: 0000000000000000
> [ 134.643563] ---[ end trace 48bdec67d6d7aadb ]---
>
> Running objdump on a kernel compiled with debugging symbols, this matches:
> 000000000000ad90 <bnx2_poll_work>:
> }
> }
>
> static int bnx2_poll_work(struct bnx2 *bp, struct bnx2_napi *bnapi,
> int work_done, int budget)
> {
> ad90: 55 push %rbp
> ad91: 48 89 e5 mov %rsp,%rbp
> ad94: 41 57 push %r15
> ad96: 41 56 push %r14
> ad98: 41 55 push %r13
> ad9a: 41 54 push %r12
> ad9c: 53 push %rbx
> ad9d: 48 81 ec 78 01 00 00 sub $0x178,%rsp
> ada4: 48 89 bd 80 fe ff ff mov %rdi,-0x180(%rbp)
> adab: 48 89 b5 78 fe ff ff mov %rsi,-0x188(%rbp)
> adb2: 89 95 74 fe ff ff mov %edx,-0x18c(%rbp)
> adb8: 89 8d 70 fe ff ff mov %ecx,-0x190(%rbp)
> {
> u16 cons;
>
> /* Tell compiler that status block fields can change. */
> barrier();
> cons = *bnapi->hw_tx_cons_ptr;
> adbe: 48 8b 46 70 mov 0x70(%rsi),%rax
> adc2: 0f b7 10 movzwl (%rax),%edx
> barrier();
> if (unlikely((cons & MAX_TX_DESC_CNT) == MAX_TX_DESC_CNT))
> cons++;
> adc5: 31 c0 xor %eax,%eax
> adc7: 80 fa ff cmp $0xff,%dl
> adca: 0f 94 c0 sete %al
> adcd: 01 c2 add %eax,%edx
> int work_done, int budget)
> {
> struct bnx2_tx_ring_info *txr = &bnapi->tx_ring;
> struct bnx2_rx_ring_info *rxr = &bnapi->rx_ring;
>
> if (bnx2_get_hw_tx_cons(bnapi) != txr->hw_tx_cons)
> adcf: 66 39 96 12 02 00 00 cmp %dx,0x212(%rsi)
> add6: 0f 84 4f 03 00 00 je b12b <bnx2_poll_work+0x39b>
>
>
> So as already determined bnapi->hw_tx_cons_ptr is NULL... but nothing is
> happening after that on network side.
>
> Regards,
> Bruno
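
For readers following the disassembly: %rsi holds the second argument
(bnapi) under the x86-64 calling convention, so `mov 0x70(%rsi),%rax` loads
bnapi->hw_tx_cons_ptr and `movzwl (%rax),%edx` is the 16-bit read that
faults, with RAX and CR2 both zero in the register dump. The sketch below
repeats the same arithmetic in user space. toy_get_hw_tx_cons() and
TOY_MAX_TX_DESC_CNT are hypothetical names, the 0xff value is taken from
the `cmp $0xff,%dl` instruction, and the NULL check is an addition for
illustration only; it is not part of the driver and not a proposed fix.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed from the disassembly above (cmp $0xff,%dl). */
#define TOY_MAX_TX_DESC_CNT 0xff

/*
 * Read the hardware TX consumer index through the pointer and skip the
 * last slot of a page, as the quoted source does.
 * Returns 0 and stores the index in *out, or -1 if the pointer is NULL
 * (the case that faults in the trace above).
 */
static int toy_get_hw_tx_cons(const volatile uint16_t *hw_tx_cons_ptr,
			      uint16_t *out)
{
	uint16_t cons;

	assert(out != NULL);
	if (hw_tx_cons_ptr == NULL)
		return -1;

	cons = *hw_tx_cons_ptr; /* the movzwl (%rax),%edx equivalent */
	if ((cons & TOY_MAX_TX_DESC_CNT) == TOY_MAX_TX_DESC_CNT)
		cons++;
	*out = cons;
	return 0;
}
```

Given an index of 0x00ff the function stores 0x0100, an index of 0x0012 is
returned unchanged, and a NULL pointer yields -1 instead of a fault.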


Attachments:
bnx2_add_only_used_vectors.c (1.03 kB)

2010-02-19 21:04:03

by Brian Haley

Subject: Re: BNX2: Kernel crashes with 2.6.31 and 2.6.31.9

Hi Ben,

Benjamin Li wrote:
> Hi Bruno,
>
> No problems. Thanks for following up with this problem, I really
> appreciate all your help.
>
> From your logs it looks like the device came up using MSI, but the
> MSI-X poll routine was being called:
>
> [ 9.836673] bnx2: eth0: using MSI
> ...
>
> [ 134.643459] [<ffffffffa004019e>] bnx2_poll_msix+0x3e/0xd0 [bnx2]
> [ 134.643465] [<ffffffff8135bcd1>] netpoll_poll+0xe1/0x3c0
>
> which is incorrect. If we are in MSI mode, the bnx2_poll() routine
> should be used.
>
> I think what is going on here is that during driver
> initialization the current bnx2 driver adds all possible NAPI structures
> that map to all the hardware vectors (BNX2_MAX_MSIX_VEC=9) to the NAPI
> list in the net_device structure regardless of whether they are used
> (see drivers/net/bnx2.c:bnx2_init_napi()). This can cause
> uninitialized NAPI structures to be placed on the napi_list. Because
> this device is in MSI mode, only 1 vector is initialized. Now, the
> problem is triggered when net/core/netpoll.c:poll_napi() is called.
> This is because this routine will run through the entire napi_list
> calling all the poll routines. In your particular case, it is calling
> the poll routine on an uninitialized vector causing the kernel panic.
...
> @@ -8201,7 +8204,7 @@ bnx2_init_napi(struct bnx2 *bp)
> {
> int i;
>
> - for (i = 0; i < BNX2_MAX_MSIX_VEC; i++) {
> + for (i = 0; i < bp->irq_nvecs; i++) {
> struct bnx2_napi *bnapi = &bp->bnx2_napi[i];
> int (*poll)(struct napi_struct *, int);

Would this same change need to be made in other places, like bnx2_init_chip()
or bnx2_clear_ring_states() ?

-Brian

2010-02-19 21:47:25

by Benjamin Li

Subject: Re: BNX2: Kernel crashes with 2.6.31 and 2.6.31.9

Hi Brian,

On Fri, 2010-02-19 at 13:03 -0800, Brian Haley wrote:
> Hi Ben,
>
> Benjamin Li wrote:
> > Hi Bruno,
> >
> > @@ -8201,7 +8204,7 @@ bnx2_init_napi(struct bnx2 *bp)
> > {
> > int i;
> >
> > - for (i = 0; i < BNX2_MAX_MSIX_VEC; i++) {
> > + for (i = 0; i < bp->irq_nvecs; i++) {
> > struct bnx2_napi *bnapi = &bp->bnx2_napi[i];
> > int (*poll)(struct napi_struct *, int);
>
> Would this same change need to be made in other places, like bnx2_init_chip()
> or bnx2_clear_ring_states() ?

The other locations in the bnx2.c driver are bnx2_init_chip(),
bnx2_clear_ring_states() and bnx2_alloc_mem(). With the current
implementation, the unused bnx2_napi structures are initialized but never
used, which should be ok. But we can clean this up to save some cycles.

The following are the areas in the code which iterate through all the
vectors.

bnx2_init_chip() - zero the last_status_idx field in the bnx2_napi
structure
bnx2_clear_ring_states() - zero the rings producer/consumer indexes
bnx2_alloc_mem() - initialize the consumer pointers
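
The difference between registering unused vectors and merely zeroing them
can be sketched in user space as follows. struct toy_vec and
toy_clear_states() are hypothetical; the fields only loosely mirror the
last_status_idx and producer/consumer indexes mentioned above. Passing the
number of vectors in use instead of the maximum touches fewer entries and
leaves the state of the active ones identical.

```c
#include <assert.h>
#include <string.h>

#define TOY_MAX_VEC 9 /* mirrors BNX2_MAX_MSIX_VEC=9 */

/* Hypothetical per-vector state, loosely modelled on the list above. */
struct toy_vec {
	unsigned short prod;
	unsigned short cons;
	unsigned int last_status_idx;
};

/* Zero the first `nvecs` entries and return how many were touched. */
static int toy_clear_states(struct toy_vec *v, int nvecs)
{
	assert(v != NULL);
	assert(nvecs >= 0 && nvecs <= TOY_MAX_VEC);
	for (int i = 0; i < nvecs; i++)
		memset(&v[i], 0, sizeof(v[i]));
	return nvecs;
}
```

Calling toy_clear_states(v, 1) resets only v[0], while
toy_clear_states(v, TOY_MAX_VEC) resets all nine entries; either way
nothing is dereferenced, which is why zeroing unused entries is harmless.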

Thanks again.

-Ben


2010-02-23 12:15:43

by Bruno Prémont

Subject: Re: BNX2: Kernel crashes with 2.6.31 and 2.6.31.9

Hi Benjamin,

On Fri, 19 February 2010 "Benjamin Li" wrote:
> From your logs it looks like the device came up using MSI, but the
> MSI-X poll routine was being called:
>
> [ 9.836673] bnx2: eth0: using MSI
> ...
>
> [ 134.643459] [<ffffffffa004019e>] bnx2_poll_msix+0x3e/0xd0 [bnx2]
> [ 134.643465] [<ffffffff8135bcd1>] netpoll_poll+0xe1/0x3c0
>
> which is incorrect. If we are in MSI mode, the bnx2_poll() routine
> should be used.
>
> I think what is going on here is that during driver
> initialization the current bnx2 driver adds all possible NAPI
> structures that map to all the hardware vectors (BNX2_MAX_MSIX_VEC=9)
> to the NAPI list in the net_device structure regardless of whether they
> are used (see drivers/net/bnx2.c:bnx2_init_napi()). This can
> cause uninitialized NAPI structures to be placed on the napi_list.
> Because this device is in MSI mode, only 1 vector is initialized.
> Now, the problem is triggered when net/core/netpoll.c:poll_napi() is
> called. This is because this routine will run through the entire
> napi_list calling all the poll routines. In your particular case, it
> is calling the poll routine on an uninitialized vector causing the
> kernel panic.
>
> Please try the patch below to see if it solves your problem. Note that
> it has only been compile tested and run against basic traffic.
> Unfortunately, I could not reproduce the kernel panic with the
> instructions below, so I could not verify the patch.
>
> Thanks again for all your help in tracking this down.

I applied the patch today and tried to reproduce with my test cases.

It seems harder to trigger now, but I am still able to
crash the box. I don't know if it's the same cause or not (it could also
be the tcp-retransmit ghost)...

This time I had to run a few parallel scp's (8Mb/s each) to the box and
'echo t > /proc/sysrq-trigger' multiple times via an ssh session for it to
happen. It didn't trigger with my netbomb, though I will try some more
and see.

I don't know if it's the same reason or not (hopefully something
reached disk, as the serial console is dead and pings are no longer
answered).
It's probably some printk/bug/warn that triggers in the network stack and
deadlocks with netconsole.

Regards,
Bruno