2009-11-08 22:33:12

by Philippe De Muyter

Subject: tulip : kernel BUG in tulip_up/tulip_resume

Hello,

I have just installed 2.6.31 (from opensuse 11.2) on a tulip-equipped
computer and I get the following error message from the kernel:

[ 2495.526390] ------------[ cut here ]------------
[ 2495.526390] kernel BUG at /usr/src/packages/BUILD/kernel-default-2.6.31.5/linux-2.6.31/include/linux/netdevice.h:439!
[ 2495.526390] invalid opcode: 0000 [#1] SMP
[ 2495.526390] last sysfs file: /sys/devices/pci0000:00/0000:00:07.1/host0/target0:0:0/0:0:0:0/block/sda/uevent
[ 2495.526390] Modules linked in: ohci_hcd raw1394 ohci1394 ieee1394 acpi_cpufreq speedstep_lib processor thermal_sys hwmon edd ipv6 af_packet fuse loop dm_mod rtc_cmos rtc_core rtc_lib apm pcspkr sg tulip uhci_hcd ehci_hcd reiserfs ata_piix ahci libata
[ 2495.526390]
[ 2495.526390] Pid: 339, comm: kapmd Not tainted (2.6.31.5-0.1-default #1)
[ 2495.526390] EIP: 0060:[<c3d6045d>] EFLAGS: 00010246 CPU: 0
[ 2495.526390] EIP is at tulip_up+0xa2d/0xa80 [tulip]
[ 2495.526390] EAX: 00000000 EBX: c1cd7000 ECX: c022af50 EDX: 0001ec00
[ 2495.526390] ESI: c1cd7340 EDI: 00000000 EBP: c20bde44 ESP: c20bddfc
[ 2495.526390] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
[ 2495.526390] Process kapmd (pid: 339, ti=c20bc000 task=c25a32c0 task.ti=c20bc000)
[ 2495.526390] Stack:
[ 2495.526390] 0000000b c20bde44 c02acf76 2caa9a94 c2481800 c20bde38 c043ca4a c20bde27
[ 2495.526390] <0> 069ee44b c20ec120 c087c020 0001ec00 c1cd7000 c3d5c720 2caa9a94 c1cd7000
[ 2495.526390] <0> c2480000 00000000 c20bde68 c3d60555 00000080 c1cd7000 c1cd7000 2caa9a94
[ 2495.526390] Call Trace:
[ 2495.526390] [<c3d60555>] tulip_resume+0xa5/0xd0 [tulip]
[ 2495.526390] [<c043dd65>] pci_legacy_resume+0x35/0x60
[ 2495.526390] [<c043df2f>] pci_pm_resume+0x7f/0xb0
[ 2495.526390] [<c04d82f2>] pm_op+0xd2/0x180
[ 2495.526390] [<c04d91ee>] device_resume+0x5e/0x1a0
[ 2495.526390] [<c04d93dd>] dpm_resume+0xad/0x140
[ 2495.526390] [<c04d948b>] dpm_resume_end+0x1b/0x40
[ 2495.526390] [<c3db1978>] check_events+0x148/0x240 [apm]
[ 2495.526390] [<c3db23a2>] apm_mainloop+0x82/0x130 [apm]
[ 2495.526390] [<c3db28fe>] apm+0x10e/0x3d0 [apm]
[ 2495.526390] [<c026bef4>] kthread+0x84/0x90
[ 2495.526390] [<c0204db7>] kernel_thread_helper+0x7/0x10
[ 2495.526390] Code: 45 e4 e8 37 ce 6c fc 8b 4d e8 89 5c 24 10 89 7c 24 0c 89 4c 24 04 89 44 24 08 c7 04 24 4c 4e d6 c3 e8 86 f7 89 fc e9 f4 f8 ff ff <0f> 0b eb fe 0f be 96 16 09 00 00 b9 01 00 00 00 8b 45 e8 e8 fb
[ 2495.526390] EIP: [<c3d6045d>] tulip_up+0xa2d/0xa80 [tulip] SS:ESP 0068:c20bddfc
[ 2495.534162] ---[ end trace 609ed25c95a75fa1 ]---

This comes from a BUG_ON in napi_enable in netdevice.h.

napi_enable itself is called by tulip_up as such :

#ifdef CONFIG_TULIP_NAPI
napi_enable(&tp->napi);
#endif

At first reading, a matching napi_disable is called in tulip_down.
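
For readers unfamiliar with the check, here is a minimal userspace sketch of
the invariant that the BUG_ON enforces. It assumes, as the 2.6.31 sources
appear to do, that napi_disable leaves a "scheduled" state bit set and that
napi_enable insists on finding that bit set before clearing it. All model_*
names are hypothetical and exist only for this illustration; the real code
uses atomic bit operations and napi_disable may sleep, which this model
ignores.

```c
#include <assert.h>
#include <stdbool.h>

/* Userspace model only; this is NOT the kernel implementation. */
struct model_napi {
	bool sched;	/* stands in for the NAPI_STATE_SCHED bit */
};

/* Assumption: a freshly registered context starts out "disabled". */
void model_napi_init(struct model_napi *n)
{
	n->sched = true;
}

/* Models napi_disable(): claims the bit so that polling cannot run. */
void model_napi_disable(struct model_napi *n)
{
	n->sched = true;
}

/*
 * Models napi_enable(): releases the bit.
 * Returns false where the real function would hit its BUG_ON, that is,
 * when the bit is not set because no disable preceded this call.
 */
bool model_napi_enable(struct model_napi *n)
{
	if (!n->sched)
		return false;
	n->sched = false;
	return true;
}

/* Walks through a balanced and an unbalanced call sequence. */
void model_demo(void)
{
	struct model_napi n;

	model_napi_init(&n);
	assert(model_napi_enable(&n));	/* first "up": fine */
	model_napi_disable(&n);		/* "down" during suspend */
	assert(model_napi_enable(&n));	/* "up" during resume: fine */
	assert(!model_napi_enable(&n));	/* second "up" without "down" */
}
```

In this model the check can only fail when an enable is not preceded by a
disable, so the trace above suggests that tulip_up ran on resume without a
matching tulip_down having run first.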

Does someone know what could be wrong and have a fix or should I look myself ?

Thanks in advance

Philippe


2009-11-21 19:10:29

by David Miller

Subject: Re: tulip : kernel BUG in tulip_up/tulip_resume

From: Philippe De Muyter <[email protected]>
Date: Sun, 8 Nov 2009 23:33:05 +0100

> I have just installed 2.6.31 (from opensuse 11.2) on a tulip-equipped
> computer and I get the following error message from the kernel:

I took a look at this bug.

I simply can't figure out how this is possible, as both suspend
and resume make sure to make the necessary napi_disable() and
napi_enable() calls, and therefore they should match up and not
trigger this BUG().

I'll try to study the code some more when I get a chance.

2009-11-29 00:17:15

by Jarek Poplawski

Subject: Re: tulip : kernel BUG in tulip_up/tulip_resume

Philippe De Muyter wrote, On 11/08/2009 11:33 PM:

> Hello,
>
> I have just installed 2.6.31 (from opensuse 11.2) on a tulip-equipped
> computer and I get the following error message from the kernel:
>
> [ 2495.526390] ------------[ cut here ]------------
> [ 2495.526390] kernel BUG at /usr/src/packages/BUILD/kernel-default-2.6.31.5/linux-2.6.31/include/linux/netdevice.h:439!
> [ 2495.526390] invalid opcode: 0000 [#1] SMP
...
> This comes from a BUG_ON in napi_enable in netdevice.h.
>
> napi_enable itself is called by tulip_up as such :
>
> #ifdef CONFIG_TULIP_NAPI
> napi_enable(&tp->napi);
> #endif
>
> At first reading, a matching napi_disable is called in tulip_down.
>
> Does someone know what could be wrong and have a fix or should I look myself ?

Don't know, guess only...

Jarek P.
---

drivers/net/tulip/tulip_core.c | 4 ++--
1 files changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/tulip/tulip_core.c b/drivers/net/tulip/tulip_core.c
index 0df983b..6b1456c 100644
--- a/drivers/net/tulip/tulip_core.c
+++ b/drivers/net/tulip/tulip_core.c
@@ -1787,11 +1787,11 @@ static int tulip_resume(struct pci_dev *pdev)
 		return retval;
 	}
 
-	netif_device_attach(dev);
-
 	if (netif_running(dev))
 		tulip_up(dev);
 
+	netif_device_attach(dev);
+
 	return 0;
 }

2009-11-29 11:37:20

by Jarek Poplawski

Subject: Re: tulip : kernel BUG in tulip_up/tulip_resume

On Sun, Nov 29, 2009 at 01:17:10AM +0100, Jarek Poplawski wrote:
> Philippe De Muyter wrote, On 11/08/2009 11:33 PM:
> > Does someone know what could be wrong and have a fix or should I look myself ?
>
> Don't know, guess only...

...And maybe a second guess btw. (to try together or separate).

Jarek P.
---

diff --git a/drivers/net/tulip/tulip_core.c b/drivers/net/tulip/tulip_core.c
index 6b2330e..fd32601 100644
--- a/drivers/net/tulip/tulip_core.c
+++ b/drivers/net/tulip/tulip_core.c
@@ -1749,9 +1749,9 @@ static int tulip_suspend (struct pci_dev *pdev, pm_message_t state)
 	if (!netif_running(dev))
 		goto save_state;
 
+	netif_device_detach(dev);
 	tulip_down(dev);
 
-	netif_device_detach(dev);
 	free_irq(dev->irq, dev);
 
 save_state:

2009-12-03 06:25:22

by David Miller

Subject: Re: tulip : kernel BUG in tulip_up/tulip_resume

From: Jarek Poplawski <[email protected]>
Date: Sun, 29 Nov 2009 12:36:59 +0100

> On Sun, Nov 29, 2009 at 01:17:10AM +0100, Jarek Poplawski wrote:
>> Philippe De Muyter wrote, On 11/08/2009 11:33 PM:
>> > Does someone know what could be wrong and have a fix or should I look myself ?
>>
>> Don't know, guess only...
>
> ...And maybe a second guess btw. (to try together or separate).

Philippe please test Jarek's patches.

Thank you.

2009-12-03 08:47:42

by Philippe De Muyter

Subject: Re: tulip : kernel BUG in tulip_up/tulip_resume

Hi Jarek and David,

On Wed, Dec 02, 2009 at 10:25:26PM -0800, David Miller wrote:
> From: Jarek Poplawski <[email protected]>
> Date: Sun, 29 Nov 2009 12:36:59 +0100
>
> > On Sun, Nov 29, 2009 at 01:17:10AM +0100, Jarek Poplawski wrote:
> >> Philippe De Muyter wrote, On 11/08/2009 11:33 PM:
> >> > Does someone know what could be wrong and have a fix or should I look myself ?
> >>
> >> Don't know, guess only...
> >
> > ...And maybe a second guess btw. (to try together or separate).
>
> Philippe please test Jarek's patches.

I am busy doing that. I had some problems trying to enable apm's debug
output. I will report back later.

Best regards

Philippe