2005-01-28 16:48:56

by Michael Gernoth

Subject: 2.4.29, e100 and a WOL packet causes keventd going mad

Hi,

we have about 70 P4 uniprocessor machines (some with Hyperthreading
capable CPUs) running linux 2.4.29, which are woken up on the weekdays
by sending a WOL packet to them. The machines all have an E100 NIC with
WOL enabled in the BIOS. The E100 driver is compiled into the kernel
and not loaded as a module.

If the machine which should be woken up is already running (because
someone switched it on by hand), the WOL packet causes keventd to go
mad and "use" 100% CPU:

PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2 root 15 0 0 0 0 R 99.9 0.0 140:50.94 keventd

This can be reproduced on any of the 70 machines by simply sending a WOL
packet to it, when it's already running... No entry is made in the
kernel log.

The dmesg of an affected machine can be found at:
http://wwwcip.informatik.uni-erlangen.de/~simigern/cip-dmesg
Our kernel-config is at:
http://wwwcip.informatik.uni-erlangen.de/~simigern/cip-generic-config
lspci -vvv is at:
http://wwwcip.informatik.uni-erlangen.de/~simigern/cip-lspci

We are using a kernel.org linux 2.4.29 kernel patched with the current
autofs patch and ACL support.

Regards,
Michael
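
For readers unfamiliar with the packet in question: a Wake-on-LAN "magic packet" payload is six 0xFF bytes followed by the target's MAC address repeated sixteen times. The sketch below only builds that 102-byte payload. The helper name wol_build_payload and the constants are made up for illustration, and this is not the tool used on the machines described above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define WOL_MAC_LEN      6
#define WOL_MAC_REPEATS  16
#define WOL_PAYLOAD_LEN  (6 + WOL_MAC_LEN * WOL_MAC_REPEATS)  /* 102 bytes */

/* Hypothetical helper: fill buf with a magic-packet payload for mac.
 * Returns the number of bytes written (always WOL_PAYLOAD_LEN). */
static size_t wol_build_payload(const uint8_t mac[WOL_MAC_LEN],
                                uint8_t buf[WOL_PAYLOAD_LEN])
{
    size_t off;
    int i;

    assert(mac != NULL && buf != NULL);

    /* Synchronisation stream: six bytes of 0xFF. */
    memset(buf, 0xFF, 6);
    off = 6;

    /* Target MAC address, repeated sixteen times. */
    for (i = 0; i < WOL_MAC_REPEATS; i++) {
        memcpy(buf + off, mac, WOL_MAC_LEN);
        off += WOL_MAC_LEN;
    }
    return off;
}
```

The payload is commonly sent as a UDP broadcast (port 9 is a frequent choice), but the NIC matches on the payload pattern itself, so the transport details are a convention rather than a requirement.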


2005-01-28 18:43:18

by Bukie Mabayoje

Subject: Re: 2.4.29, e100 and a WOL packet causes keventd going mad



Michael Gernoth wrote:

> Hi,
>
> we have about 70 P4 uniprocessor machines (some with Hyperthreading
> capable CPUs) running linux 2.4.29, which are woken up on the weekdays
> by sending a WOL packet to them. The machines all have a E100 nic with
> WOL enabled in the bios. The E100 driver is compiled into the kernel
> and not loaded as a module.
>
> If the machine which should be woken up is already running (because
> someone switched it on by hand), the WOL packet causes keventd to go
> mad and "use" 100% CPU:
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 2 root 15 0 0 0 0 R 99.9 0.0 140:50.94 keventd
>
> This can be reproduced on any of the 70 machines by simply sending a WOL
> packet to it, when it's already running... No entry is made in the
> kernel log.
>
> The dmesg of an affected machine can be found at:
> http://wwwcip.informatik.uni-erlangen.de/~simigern/cip-dmesg
> Our kernel-config is at:
> http://wwwcip.informatik.uni-erlangen.de/~simigern/cip-generic-config
> lspci -vvv is at:
> http://wwwcip.informatik.uni-erlangen.de/~simigern/cip-lspci
>
> We are using a kernel.org linux 2.4.29 kernel patched with the current
> autofs patch and ACL support.
>
> Regards,
> Michael
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

Do you know the official NIC product name, e.g. Pro/100B? I need to identify the LAN controller. The 557 (not sure if the 557 can do WOL), 558 and 559 differ in how they assert the PME# signal. Even the same chip differs between steppings.

I suspect that PME# is not being deasserted after the wake-up packet is received.
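
To make the PME# suspicion concrete: in the PCI power-management capability, the control/status register (PMCSR) has a PME_En bit (bit 8) and a sticky PME_Status bit (bit 15) that is cleared by writing 1 to it, and a function drives PME# while both are set. The sketch below is a user-space illustration only. The helper names are assumptions made for this example, and it does not touch hardware or reflect what any driver in this thread does.

```c
#include <stdint.h>

#define PMCSR_STATE_MASK  0x0003u  /* current power state, D0..D3 */
#define PMCSR_PME_EN      0x0100u  /* PME# generation enabled */
#define PMCSR_PME_STATUS  0x8000u  /* sticky status, write 1 to clear */

/* Nonzero when the function would be driving PME#: status set and enabled. */
static int pmcsr_pme_asserted(uint16_t pmcsr)
{
    return (pmcsr & PMCSR_PME_EN) && (pmcsr & PMCSR_PME_STATUS);
}

/* Value to write back so that PME# is released: PME_En is cleared and
 * a 1 is written to PME_Status to clear the sticky bit. */
static uint16_t pmcsr_quiesce_pme(uint16_t pmcsr)
{
    pmcsr &= (uint16_t)~PMCSR_PME_EN;
    pmcsr |= PMCSR_PME_STATUS;
    return pmcsr;
}
```

If PME_Status is never cleared while PME_En stays set, the line remains asserted, which is the kind of stuck condition suspected here.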

2005-01-28 18:59:26

by Michael Gernoth

Subject: Re: 2.4.29, e100 and a WOL packet causes keventd going mad

On Fri, Jan 28, 2005 at 10:53:51AM -0800, Bukie Mabayoje wrote:
> Do you know the official NIC product name e.g Pro/100B. I need to identify
> the LAN Controller. There are differences between 557 (not sure if 557 can
> do WOL), 558 and 559 how they ASSERT the PME# signal. Even the same chip have
> differences between steppings.

The chip is integrated on the motherboard. Its PCI ID is 8086:1039.
lspci says: Intel Corp. 82801BD PRO/100 VE (LOM) Ethernet Controller (rev 81)
If you want I can open up one of these machines tomorrow to look on the chip
directly.

Regards,
Michael

2005-01-28 19:59:28

by Bukie Mabayoje

Subject: Re: 2.4.29, e100 and a WOL packet causes keventd going mad



Michael Gernoth wrote:

> On Fri, Jan 28, 2005 at 10:53:51AM -0800, Bukie Mabayoje wrote:
> > Do you know the official NIC product name e.g Pro/100B. I need to identify
> > the LAN Controller. There are differences between 557 (not sure if 557 can
> > do WOL), 558 and 559 how they ASSERT the PME# signal. Even the same chip have
> > differences between steppings.
>
> The chip is integrated on the motherboard. Its PCI ID is 8086:1039.
> lspci says: Intel Corp. 82801BD PRO/100 VE (LOM) Ethernet Controller (rev 81)
> If you want I can open up one of these machines tomorrow to look on the chip
> directly.
>
> Regards,
> Michael

Thanks, I have enough information.

2005-01-29 23:59:16

by Bukie Mabayoje

Subject: Re: 2.4.29, e100 and a WOL packet causes keventd going mad



Michael Gernoth wrote:

> On Fri, Jan 28, 2005 at 10:53:51AM -0800, Bukie Mabayoje wrote:
> > Do you know the official NIC product name e.g Pro/100B. I need to identify
> > the LAN Controller. There are differences between 557 (not sure if 557 can
> > do WOL), 558 and 559 how they ASSERT the PME# signal. Even the same chip have
> > differences between steppings.
>
> The chip is integrated on the motherboard. Its PCI ID is 8086:1039.
> lspci says: Intel Corp. 82801BD PRO/100 VE (LOM) Ethernet Controller (rev 81)
> If you want I can open up one of these machines tomorrow to look on the chip
> directly.
>
> Regards,
> Michael

I can't find the datasheet for the 82801BD. 82801 parts are typically I/O Controller Hubs. I need to see how it drives the PCI interface signals.

I will try to reproduce it on a different chipset: basically, send a WOL packet to a live Linux system and see if keventd consumes excessive CPU time.

2005-01-30 01:03:21

by Marcelo Tosatti

Subject: Re: 2.4.29, e100 and a WOL packet causes keventd going mad

On Fri, Jan 28, 2005 at 05:48:11PM +0100, Michael Gernoth wrote:
> Hi,
>
> we have about 70 P4 uniprocessor machines (some with Hyperthreading
> capable CPUs) running linux 2.4.29, which are woken up on the weekdays
> by sending a WOL packet to them. The machines all have a E100 nic with
> WOL enabled in the bios. The E100 driver is compiled into the kernel
> and not loaded as a module.
>
> If the machine which should be woken up is already running (because
> someone switched it on by hand), the WOL packet causes keventd to go
> mad and "use" 100% CPU:
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 2 root 15 0 0 0 0 R 99.9 0.0 140:50.94 keventd

Probably a task event is rescheduling itself repeatedly? e100 does not seem
to schedule_task() events directly, so I wonder what is going on.

Can you boot a machine with profile=2, then send the WOL packet causing
keventd to go mad and run:

readprofile | sort -nr +2 | head -20

After a few minutes.

Ganesh, Scott, Jeff, any ideas?

> This can be reproduced on any of the 70 machines by simply sending a WOL
> packet to it, when it's already running... No entry is made in the
> kernel log.
>
> The dmesg of an affected machine can be found at:
> http://wwwcip.informatik.uni-erlangen.de/~simigern/cip-dmesg
> Our kernel-config is at:
> http://wwwcip.informatik.uni-erlangen.de/~simigern/cip-generic-config
> lspci -vvv is at:
> http://wwwcip.informatik.uni-erlangen.de/~simigern/cip-lspci
>
> We are using a kernel.org linux 2.4.29 kernel patched with the current
> autofs patch and ACL support.

2005-01-30 12:16:20

by Michael Gernoth

Subject: Re: 2.4.29, e100 and a WOL packet causes keventd going mad

On Sat, Jan 29, 2005 at 04:18:21PM -0200, Marcelo Tosatti wrote:
> Probably a task event is rescheduling itself repeatedly? e100 does not seem
> to schedule_task() events directly, so I wonder what is going on.
>
> Can you boot a machine with profile=2, then send the WOL packet causing
> keventd to go mad and run:
>
> readprofile | sort -nr +2 | head -20

faui07c:~# readprofile | sort -nr +2 | head -20
2352 acpi_ns_get_next_valid_node 130.6667
9577 acpi_os_read_port 121.2278
4431 acpi_os_signal_semaphore 100.7045
4639 default_idle 57.9875
4363 acpi_ns_get_next_node 57.4079
5957 acpi_ns_delete_namespace_by_owner 37.7025
2307 acpi_os_write_port 34.9545
4048 acpi_os_wait_semaphore 18.8279
1924 acpi_ut_acquire_mutex 16.8772
492 __rdtsc_delay 15.3750
526 acpi_os_get_thread_id 15.0286
1421 acpi_ut_release_mutex 13.1574
339 acpi_ns_get_parent_node 11.6897
1506 acpi_ut_release_to_cache 10.9130
1600 acpi_ut_acquire_from_cache 9.8160
909 acpi_ds_method_data_init 8.4953
327 acpi_ut_valid_acpi_name 6.9574
311 acpi_ns_search_node 4.7121
268 acpi_ps_get_opcode_info 4.0606
31 acpi_ut_delete_generic_state_cache 3.4444

The machine has an uptime of 10 minutes at that point, and the second
WOL packet was sent directly after the machine came up and I was able
to ssh into it.

Regards,
Michael
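
The listing above is easier to judge with a tally. As a rough sketch, the following sums the tick counts of all symbols sharing a prefix, so the acpi_ share can be compared with everything else. The helper name ticks_with_prefix is made up, and the code assumes the three-column "ticks symbol load" layout shown in the output above.

```c
#include <stdio.h>
#include <string.h>

/* Sum the tick counts (first column) of all readprofile lines whose
 * symbol (second column) starts with prefix. Lines that do not parse
 * as "<ticks> <symbol> ..." are skipped. */
static unsigned long ticks_with_prefix(const char *text, const char *prefix)
{
    unsigned long total = 0;
    size_t plen = strlen(prefix);
    const char *line = text;

    while (*line) {
        const char *eol = strchr(line, '\n');
        size_t len = eol ? (size_t)(eol - line) : strlen(line);
        char buf[256];
        char sym[128];
        unsigned long ticks;

        /* Copy one line so sscanf cannot run on into the next one. */
        if (len >= sizeof buf)
            len = sizeof buf - 1;
        memcpy(buf, line, len);
        buf[len] = '\0';

        if (sscanf(buf, "%lu %127s", &ticks, sym) == 2 &&
            strncmp(sym, prefix, plen) == 0)
            total += ticks;

        if (!eol)
            break;
        line = eol + 1;
    }
    return total;
}
```

Calling it once with "acpi_" and once with an empty prefix gives the ACPI tick count and the overall total for whatever text is passed in.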

2005-01-30 17:19:04

by David Härdeman

Subject: Re: 2.4.29, e100 and a WOL packet causes keventd going mad

Hi,

I experience the same problems as reported by Michael Gernoth when
sending a WOL packet to a computer with an e100 NIC which is already
powered on.

In my case, it's running kernel 2.6.8.1 and the NIC is identified by
lspci as:
0000:02:08.0 Ethernet controller: Intel Corp. 82562EZ 10/100 Ethernet
Controller (rev 02)
or numerically:
0000:02:08.0 0200: 8086:1050 (rev 02)

The symptom is that kacpid starts using all the CPU time it can, and a
shutdown takes 5 - 10 minutes after I've done this (in contrast to 20 -
30 seconds when the machine is healthy).

Also, if I do a "shutdown -h" on the machine after sending a WOL packet
when it's already powered up, it will shut down and immediately start up
again instead of powering off.

So, any suggestions on how to fix it?

Regards,
David

Please CC me on any replies.

2005-01-31 03:46:13

by Scott Feldman

Subject: Re: 2.4.29, e100 and a WOL packet causes keventd going mad

On Sun, 2005-01-30 at 09:18, David Härdeman wrote:
> I experience the same problems as reported by Michael Gernoth when
> sending a WOL-packet to computer with a e100 NIC which is already
> powered on.

I didn't look at the 2.4 case, but for 2.6, it seems e100 was enabling
PME wakeup during probe. PME shouldn't be enabled while the system is
up. I suspect the assertion of PME while the system is up is what's
causing problems. This patch moves PME wakeup enabling to either
suspend or shutdown.

David, would you give this patch a try? Make sure the system still
wakes from a magic packet if suspended or shut down, and doesn't cause
kacpid to go crazy if system is running. If it helps for 2.6, perhaps
someone can look into 2.4 to see if there is something similar going on
there.

-scott

--- linux-2.6.11-rc2/drivers/net/e100.c.orig 2005-01-30 19:13:56.850497376 -0800
+++ linux-2.6.11-rc2/drivers/net/e100.c 2005-01-30 19:26:41.154305536 -0800
@@ -1868,7 +1868,6 @@ static int e100_set_wol(struct net_devic
 	else
 		nic->flags &= ~wol_magic;

-	pci_enable_wake(nic->pdev, 0, nic->flags & (wol_magic | e100_asf(nic)));
 	e100_exec_cb(nic, NULL, e100_configure);

 	return 0;
@@ -2262,8 +2261,6 @@ static int __devinit e100_probe(struct p
 		(nic->eeprom[eeprom_id] & eeprom_id_wol))
 		nic->flags |= wol_magic;

-	pci_enable_wake(pdev, 0, nic->flags & (wol_magic | e100_asf(nic)));
-
 	strcpy(netdev->name, "eth%d");
 	if((err = register_netdev(netdev))) {
 		DPRINTK(PROBE, ERR, "Cannot register net device, aborting.\n");
@@ -2344,6 +2341,15 @@ static int e100_resume(struct pci_dev *p
 }
 #endif

+static void e100_shutdown(struct device *dev)
+{
+	struct pci_dev *pdev = container_of(dev, struct pci_dev, dev);
+	struct net_device *netdev = pci_get_drvdata(pdev);
+	struct nic *nic = netdev_priv(netdev);
+
+	pci_enable_wake(pdev, 0, nic->flags & (wol_magic | e100_asf(nic)));
+}
+
 static struct pci_driver e100_driver = {
 	.name = DRV_NAME,
 	.id_table = e100_id_table,
@@ -2353,6 +2359,9 @@ static struct pci_driver e100_driver = {
 	.suspend = e100_suspend,
 	.resume = e100_resume,
 #endif
+	.driver = {
+		.shutdown = e100_shutdown,
+	}
 };

 static int __init e100_init_module(void)



2005-01-31 03:56:29

by Nigel Cunningham

Subject: Re: 2.4.29, e100 and a WOL packet causes keventd going mad

Hi.

Do you also disable the WOL event when resuming?

Regards,

Nigel

On Mon, 2005-01-31 at 14:47, Scott Feldman wrote:
> On Sun, 2005-01-30 at 09:18, David Härdeman wrote:
> > I experience the same problems as reported by Michael Gernoth when
> > sending a WOL-packet to computer with a e100 NIC which is already
> > powered on.
>
> I didn't look at the 2.4 case, but for 2.6, it seems e100 was enabling
> PME wakeup during probe. PME shouldn't be enabled while the system is
> up. I suspect the assertion of PME while the system is up is what's
> causing problems. This patch moves PME wakeup enabling to either
> suspend or shutdown.
>
> David, would you give this patch a try? Make sure the system still
> wakes from a magic packet if suspended or shut down, and doesn't cause
> kacpid to go crazy if system is running. If it helps for 2.6, perhaps
> someone can look into 2.4 to see if there is something similar going on
> there.
>
> -scott
--
Nigel Cunningham
Software Engineer, Canberra, Australia
http://www.cyclades.com

Ph: +61 (2) 6292 8028 Mob: +61 (417) 100 574

2005-01-31 04:33:40

by Bukie Mabayoje

Subject: Re: 2.4.29, e100 and a WOL packet causes keventd going mad


Scott Feldman wrote:

> On Sun, 2005-01-30 at 09:18, David Härdeman wrote:
> > I experience the same problems as reported by Michael Gernoth when
> > sending a WOL-packet to computer with a e100 NIC which is already
> > powered on.
>
> I didn't look at the 2.4 case, but for 2.6, it seems e100 was enabling
> PME wakeup during probe. PME shouldn't be enabled while the system is
> up. I suspect the assertion of PME while the system is up is what's
> causing problems. This patch moves PME wakeup enabling to either
> suspend or shutdown.
>
> David, would you give this patch a try? Make sure the system still
> wakes from a magic packet if suspended or shut down, and doesn't cause
> kacpid to go crazy if system is running. If it helps for 2.6, perhaps
> someone can look into 2.4 to see if there is something similar going on

This issue was reported on 2.4.

>
> there.
>
> -scott

2005-01-31 04:58:46

by Scott Feldman

Subject: Re: 2.4.29, e100 and a WOL packet causes keventd going mad

On Sun, 2005-01-30 at 19:58, Nigel Cunningham wrote:
> Do you also disable the WOL event when resuming?

Good catch. How's this look?

--- linux-2.6.11-rc2/drivers/net/e100.c.orig 2005-01-30 19:13:56.850497376 -0800
+++ linux-2.6.11-rc2/drivers/net/e100.c 2005-01-30 20:53:22.630560952 -0800
@@ -1868,7 +1868,6 @@ static int e100_set_wol(struct net_devic
 	else
 		nic->flags &= ~wol_magic;

-	pci_enable_wake(nic->pdev, 0, nic->flags & (wol_magic | e100_asf(nic)));
 	e100_exec_cb(nic, NULL, e100_configure);

 	return 0;
@@ -2262,8 +2261,6 @@ static int __devinit e100_probe(struct p
 		(nic->eeprom[eeprom_id] & eeprom_id_wol))
 		nic->flags |= wol_magic;

-	pci_enable_wake(pdev, 0, nic->flags & (wol_magic | e100_asf(nic)));
-
 	strcpy(netdev->name, "eth%d");
 	if((err = register_netdev(netdev))) {
 		DPRINTK(PROBE, ERR, "Cannot register net device, aborting.\n");
@@ -2320,7 +2317,8 @@ static int e100_suspend(struct pci_dev *
 	netif_device_detach(netdev);

 	pci_save_state(pdev);
-	pci_enable_wake(pdev, state, nic->flags & (wol_magic | e100_asf(nic)));
+	pci_enable_wake(pdev, pci_choose_state(pdev, state),
+		nic->flags & (wol_magic | e100_asf(nic)));
 	pci_disable_device(pdev);
 	pci_set_power_state(pdev, pci_choose_state(pdev, state));

@@ -2333,6 +2331,7 @@ static int e100_resume(struct pci_dev *p
 	struct nic *nic = netdev_priv(netdev);

 	pci_set_power_state(pdev, PCI_D0);
+	pci_enable_wake(pdev, PCI_D0, 0);
 	pci_restore_state(pdev);
 	e100_hw_init(nic);

@@ -2344,6 +2343,15 @@ static int e100_resume(struct pci_dev *p
 }
 #endif

+static void e100_shutdown(struct device *dev)
+{
+	struct pci_dev *pdev = container_of(dev, struct pci_dev, dev);
+	struct net_device *netdev = pci_get_drvdata(pdev);
+	struct nic *nic = netdev_priv(netdev);
+
+	pci_enable_wake(pdev, PCI_D0, nic->flags & (wol_magic | e100_asf(nic)));
+}
+
 static struct pci_driver e100_driver = {
 	.name = DRV_NAME,
 	.id_table = e100_id_table,
@@ -2353,6 +2361,9 @@ static struct pci_driver e100_driver = {
 	.suspend = e100_suspend,
 	.resume = e100_resume,
 #endif
+	.driver = {
+		.shutdown = e100_shutdown,
+	}
 };

 static int __init e100_init_module(void)


2005-01-31 06:14:33

by Nigel Cunningham

Subject: Re: 2.4.29, e100 and a WOL packet causes keventd going mad

Hi.

On Mon, 2005-01-31 at 16:00, Scott Feldman wrote:
> On Sun, 2005-01-30 at 19:58, Nigel Cunningham wrote:
> > Do you also disable the WOL event when resuming?
>
> Good catch. How's this look?

I looked at it last week because I used it for an example of device
model drivers at the CELF conference. I got your intel address from the
top of the .c file, but IIRC it bounced. Providence :>

[...]

> @@ -2333,6 +2331,7 @@ static int e100_resume(struct pci_dev *p
> struct nic *nic = netdev_priv(netdev);
>
> pci_set_power_state(pdev, PCI_D0);
> + pci_enable_wake(pdev, PCI_D0, 0);
> pci_restore_state(pdev);
> e100_hw_init(nic);

Shouldn't this be disable_wake?

Regards,

Nigel
--
Nigel Cunningham
Software Engineer, Canberra, Australia
http://www.cyclades.com

Ph: +61 (2) 6292 8028 Mob: +61 (417) 100 574

2005-01-31 09:06:04

by Nigel Cunningham

Subject: Re: 2.4.29, e100 and a WOL packet causes keventd going mad

Hi again.

Ignore that :> I realised later that there's only one badly named
routine and my assumption that there was another called disable_.. was
wrong :>

Nigel

On Mon, 2005-01-31 at 17:14, Nigel Cunningham wrote:
> Hi.
>
> On Mon, 2005-01-31 at 16:00, Scott Feldman wrote:
> > On Sun, 2005-01-30 at 19:58, Nigel Cunningham wrote:
> > > Do you also disable the WOL event when resuming?
> >
> > Good catch. How's this look?
>
> I looked at it last week because I used it for an example of device
> model drivers at the CELF conference. I got your intel address from the
> top of the .c file, but IIRC it bounced. Providence :>
>
> [...]
>
> > @@ -2333,6 +2331,7 @@ static int e100_resume(struct pci_dev *p
> > struct nic *nic = netdev_priv(netdev);
> >
> > pci_set_power_state(pdev, PCI_D0);
> > + pci_enable_wake(pdev, PCI_D0, 0);
> > pci_restore_state(pdev);
> > e100_hw_init(nic);
>
> Shouldn't this be disable_wake?
>
> Regards,
>
> Nigel
--
Nigel Cunningham
Software Engineer, Canberra, Australia
http://www.cyclades.com

Ph: +61 (2) 6292 8028 Mob: +61 (417) 100 574

2005-01-31 18:25:15

by Marcelo Tosatti

Subject: Re: 2.4.29, e100 and a WOL packet causes keventd going mad

On Sun, Jan 30, 2005 at 08:23:47PM -0800, Bukie Mabayoje wrote:
>
> Scott Feldman wrote:
>
> > On Sun, 2005-01-30 at 09:18, David Härdeman wrote:
> > > I experience the same problems as reported by Michael Gernoth when
> > > sending a WOL-packet to computer with a e100 NIC which is already
> > > powered on.
> >
> > I didn't look at the 2.4 case, but for 2.6, it seems e100 was enabling
> > PME wakeup during probe. PME shouldn't be enabled while the system is
> > up. I suspect the assertion of PME while the system is up is what's
> > causing problems. This patch moves PME wakeup enabling to either
> > suspend or shutdown.
> >
> > David, would you give this patch a try? Make sure the system still
> > wakes from a magic packet if suspended or shut down, and doesn't cause
> > kacpid to go crazy if system is running. If it helps for 2.6, perhaps
> > someone can look into 2.4 to see if there is something similar going on
>
> This issue was reported on 2.4.

Can any of you guys test v2.6, please?

2005-01-31 19:30:19

by Jesse Brandeburg

Subject: RE: 2.4.29, e100 and a WOL packet causes keventd going mad

>+static void e100_shutdown(struct device *dev)
>+{
>+ struct pci_dev *pdev = container_of(dev, struct pci_dev, dev);
>+ struct net_device *netdev = pci_get_drvdata(pdev);
>+ struct nic *nic = netdev_priv(netdev);
>+
>+ pci_enable_wake(pdev, PCI_D0, nic->flags & (wol_magic | e100_asf(nic)));
>+}
>+

Separately, does anyone think that the OS should be handling the PME event on the bus (as it comes from the PIC as an interrupt, and can be masked at the PIC) with a default handler? The machines having the problem seem to be killed by an interrupt storm generated by the PME interrupt, just a guess.

Jesse

2005-01-31 20:30:53

by David Härdeman

Subject: Re: 2.4.29, e100 and a WOL packet causes keventd going mad

On Mon, Jan 31, 2005 at 01:24:31PM -0200, Marcelo Tosatti wrote:
>On Sun, Jan 30, 2005 at 08:23:47PM -0800, Bukie Mabayoje wrote:
>> Scott Feldman wrote:
>>> David, would you give this patch a try? Make sure the system still
>>> wakes from a magic packet if suspended or shut down, and doesn't cause
>>> kacpid to go crazy if system is running. If it helps for 2.6, perhaps
>>> someone can look into 2.4 to see if there is something similar going on
>>
>> This issue was reported on 2.4.
>
>Can any of you guys test v2.6, please?
>

I tried the second patch provided by Scott on a 2.6.10 kernel. I made
some minor tweaks to get it to apply (changed pci_choose_state() and
PCI_D0 back to the way they were in 2.6.10) and tested the results five
minutes ago.

It works great. I haven't tried suspending the machine because I have no
need for that functionality. I have, however, started the machine via WOL
(works), sent a WOL packet to the machine when powered on (nothing
happens, kacpid doesn't go wild, so that works too), and shut it down
(works without the machine spontaneously rebooting).

So everything seems to be fixed by the patch (save for suspending, which
I didn't test).

Thanks a lot, I hope the patch will be in the next stable 2.6 kernel.

Regards,
David

2005-01-31 20:42:05

by Bukie Mabayoje

Subject: Re: 2.4.29, e100 and a WOL packet causes keventd going mad

The issue is not the PME interrupt; the issue is that the device is going into a state that is not valid. A live system should never assert the PME# line. As long as this functionality is enabled on the chip, PME will be asserted.
To avoid this unwanted condition, the driver should disable PME on the chip on a live system, and re-enable it when the system goes into any power state that requires wake-up by the LAN.
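
The policy described here can be sketched as a small user-space model. Everything in it (struct toy_nic and the toy_* functions) is invented for illustration and is not the e100 or eepro100 code. It only shows the intended invariant: PME is off whenever the system is running, and armed only on the way into a low-power or off state, and only if wake-up was requested.

```c
#include <stdbool.h>

enum toy_power { TOY_D0, TOY_D3 };

struct toy_nic {
    bool wol_magic;       /* magic-packet wake was requested */
    bool pme_enabled;     /* stand-in for the device's PME enable bit */
    enum toy_power state; /* current (modelled) power state */
};

/* Probe: remember the WOL setting but leave PME off while running. */
static void toy_probe(struct toy_nic *nic, bool wol_magic)
{
    nic->wol_magic = wol_magic;
    nic->pme_enabled = false;
    nic->state = TOY_D0;
}

/* Suspend or shutdown: arm PME only if wake-up was requested. */
static void toy_power_down(struct toy_nic *nic)
{
    nic->pme_enabled = nic->wol_magic;
    nic->state = TOY_D3;
}

/* Resume: back to D0, and PME is switched off again. */
static void toy_resume(struct toy_nic *nic)
{
    nic->state = TOY_D0;
    nic->pme_enabled = false;
}

/* Would a magic packet raise PME# in the current state? */
static bool toy_magic_packet_raises_pme(const struct toy_nic *nic)
{
    return nic->pme_enabled;
}
```

In this model a magic packet received in TOY_D0 never raises PME#, which is the behaviour asked for above. The real driver has to achieve the same thing through the PCI power-management calls.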

"Brandeburg, Jesse" wrote:

> >+static void e100_shutdown(struct device *dev)
> >+{
> >+ struct pci_dev *pdev = container_of(dev, struct pci_dev, dev);
> >+ struct net_device *netdev = pci_get_drvdata(pdev);
> >+ struct nic *nic = netdev_priv(netdev);
> >+
> >+ pci_enable_wake(pdev, PCI_D0, nic->flags & (wol_magic | e100_asf(nic)));
> >+}
> >+
>
> Separately, does anyone think that the OS should be handling the PME event on the bus (as it comes from the PIC as an interrupt, and can be masked at the PIC) with a default handler? The machines having the problem seem to be killed by an interrupt storm generated by the PME interrupt, just a guess.
>
> Jesse

2005-01-31 21:00:26

by Bukie Mabayoje

Subject: Re: 2.4.29, e100 and a WOL packet causes keventd going mad



Marcelo Tosatti wrote:

> On Sun, Jan 30, 2005 at 08:23:47PM -0800, Bukie Mabayoje wrote:
> >
> > Scott Feldman wrote:
> >
> > > On Sun, 2005-01-30 at 09:18, David Härdeman wrote:
> > > > I experience the same problems as reported by Michael Gernoth when
> > > > sending a WOL-packet to computer with a e100 NIC which is already
> > > > powered on.
> > >
> > > I didn't look at the 2.4 case, but for 2.6, it seems e100 was enabling
> > > PME wakeup during probe. PME shouldn't be enabled while the system is
> > > up. I suspect the assertion of PME while the system is up is what's
> > > causing problems. This patch moves PME wakeup enabling to either
> > > suspend or shutdown.
> > >
> > > David, would you give this patch a try? Make sure the system still
> > > wakes from a magic packet if suspended or shut down, and doesn't cause
> > > kacpid to go crazy if system is running. If it helps for 2.6, perhaps
> > > someone can look into 2.4 to see if there is something similar going on
> >
> > This issue was reported on 2.4.
>
> Can any of you guys test v2.6, please?

I would be glad to test it now, but I can't; I am currently doing some work on 2.4. If no one has tested it in the next few days, I will validate it then.

By the way, does anyone have an idea how to get this functionality into the 2.4 eepro100 driver? The problem is that the eepro100 code works on non-WOL cards.
