2006-10-20 18:33:12

by Daniel Walker

[permalink] [raw]
Subject: [PATCH] e100_shutdown: netif_poll_disable hang

My machine annoyingly hangs while rebooting. I tracked it down
to e100-fix-reboot-f-with-netconsole-enabled.patch in 2.6.18-rc2-mm2

I review the changes and it seemed to be calling netif_poll_disable
one too many time. Once in e100_down(), and again in e100_shutdown().

The second one in e100_shutdown() caused the hang. So this patch
removes it.

Signed-Off-By: Daniel Walker <[email protected]>

---
drivers/net/e100.c | 1 -
1 files changed, 1 deletion(-)

Index: linux-2.6.18/drivers/net/e100.c
===================================================================
--- linux-2.6.18.orig/drivers/net/e100.c
+++ linux-2.6.18/drivers/net/e100.c
@@ -2709,7 +2709,6 @@ static void e100_shutdown(struct pci_dev
struct net_device *netdev = pci_get_drvdata(pdev);
struct nic *nic = netdev_priv(netdev);

- netif_poll_disable(nic->netdev);
del_timer_sync(&nic->watchdog);
netif_carrier_off(nic->netdev);

--


2006-10-20 20:53:38

by Kok, Auke

[permalink] [raw]
Subject: Re: [PATCH] e100_shutdown: netif_poll_disable hang

Daniel Walker wrote:
> My machine annoyingly hangs while rebooting. I tracked it down
> to e100-fix-reboot-f-with-netconsole-enabled.patch in 2.6.18-rc2-mm2
>
> I review the changes and it seemed to be calling netif_poll_disable
> one too many time. Once in e100_down(), and again in e100_shutdown().
>
> The second one in e100_shutdown() caused the hang. So this patch
> removes it.
>
> Signed-Off-By: Daniel Walker <[email protected]>
>
> ---
> drivers/net/e100.c | 1 -
> 1 files changed, 1 deletion(-)
>
> Index: linux-2.6.18/drivers/net/e100.c
> ===================================================================
> --- linux-2.6.18.orig/drivers/net/e100.c
> +++ linux-2.6.18/drivers/net/e100.c
> @@ -2709,7 +2709,6 @@ static void e100_shutdown(struct pci_dev
> struct net_device *netdev = pci_get_drvdata(pdev);
> struct nic *nic = netdev_priv(netdev);
>
> - netif_poll_disable(nic->netdev);
> del_timer_sync(&nic->watchdog);
> netif_carrier_off(nic->netdev);
>
> --

this won't help netconsole shutdown/reboot -f, probably locking it up again!

NAK

Also missing is the NAPI conditional, which I left out. Also missing is the same code
for suspend.

Here's a better approach, allowing both normal shutdown code path and reboot -f to
disable polling.

Cheers,

Auke
---

e100: disable polling only when up during suspend and shutdown

Signed-off-by: Auke Kok <auke-jan.h.kok>
Cc: Daniel Walker <[email protected]>
Cc: Andrew Morton <[email protected]>

---
drivers/net/e100.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

---
diff --git a/drivers/net/e100.c b/drivers/net/e100.c
index d4a2572..815eb29 100644
--- a/drivers/net/e100.c
+++ b/drivers/net/e100.c
@@ -2719,7 +2719,10 @@ static int e100_suspend(struct pci_dev *
struct net_device *netdev = pci_get_drvdata(pdev);
struct nic *nic = netdev_priv(netdev);

- netif_poll_disable(nic->netdev);
+#ifdef CONFIG_E100_NAPI
+ if (netif_running(netdev))
+ netif_poll_disable(nic->netdev);
+#endif
del_timer_sync(&nic->watchdog);
netif_carrier_off(nic->netdev);

@@ -2763,7 +2766,10 @@ static void e100_shutdown(struct pci_dev
struct net_device *netdev = pci_get_drvdata(pdev);
struct nic *nic = netdev_priv(netdev);

- netif_poll_disable(nic->netdev);
+#ifdef CONFIG_E100_NAPI
+ if (netif_running(netdev))
+ netif_poll_disable(nic->netdev);
+#endif
del_timer_sync(&nic->watchdog);
netif_carrier_off(nic->netdev);

2006-10-20 21:11:26

by Kok, Auke

[permalink] [raw]
Subject: Re: [PATCH] e100_shutdown: netif_poll_disable hang

Auke Kok wrote:
> Daniel Walker wrote:
>> My machine annoyingly hangs while rebooting. I tracked it down
>> to e100-fix-reboot-f-with-netconsole-enabled.patch in 2.6.18-rc2-mm2
>>
>> I review the changes and it seemed to be calling netif_poll_disable
>> one too many time. Once in e100_down(), and again in e100_shutdown().
>>
>> The second one in e100_shutdown() caused the hang. So this patch
>> removes it.

it doesn't even do harm to netif_poll_disable() twice as far as I can see, as it merely
calls test_and_set_bit(), which will instantly succeed on the first attempt if the bit
was already set.

did this change actually fix it for you? I'm wondering if the netif_carrier_off might
not be the culprit here...

Auke

2006-10-21 17:41:18

by Damien Wyart

[permalink] [raw]
Subject: Re: [PATCH] e100_shutdown: netif_poll_disable hang

> > > My machine annoyingly hangs while rebooting. I tracked it down to
> > > e100-fix-reboot-f-with-netconsole-enabled.patch in 2.6.18-rc2-mm2
> > > I review the changes and it seemed to be calling
> > > netif_poll_disable one too many time. Once in e100_down(), and
> > > again in e100_shutdown().
> > > The second one in e100_shutdown() caused the hang. So this patch
> > > removes it.

* Auke Kok <[email protected]> [061020 23:09]:
> it doesn't even do harm to netif_poll_disable() twice as far as I can
> see, as it merely calls test_and_set_bit(), which will instantly
> succeed on the first attempt if the bit was already set.

> did this change actually fix it for you? I'm wondering if the
> netif_carrier_off might not be the culprit here...

I can confirm the proposed original change of D. Walker fixed the
problem for me. I did not test the change you proposed as a followup.

--
Damien Wyart

2006-10-21 17:55:20

by Kok, Auke

[permalink] [raw]
Subject: Re: [PATCH] e100_shutdown: netif_poll_disable hang

Damien Wyart wrote:
>>>> My machine annoyingly hangs while rebooting. I tracked it down to
>>>> e100-fix-reboot-f-with-netconsole-enabled.patch in 2.6.18-rc2-mm2
>>>> I review the changes and it seemed to be calling
>>>> netif_poll_disable one too many time. Once in e100_down(), and
>>>> again in e100_shutdown().
>>>> The second one in e100_shutdown() caused the hang. So this patch
>>>> removes it.
>
> * Auke Kok <[email protected]> [061020 23:09]:
>> it doesn't even do harm to netif_poll_disable() twice as far as I can
>> see, as it merely calls test_and_set_bit(), which will instantly
>> succeed on the first attempt if the bit was already set.
>
>> did this change actually fix it for you? I'm wondering if the
>> netif_carrier_off might not be the culprit here...
>
> I can confirm the proposed original change of D. Walker fixed the
> problem for me. I did not test the change you proposed as a followup.

his change breaks something else (a reboot with netconsole, possibly suspend). Please
give the latest version I sent a try. Daniel confirmed me that it works, but it's always
nice to hear it from more people.

Thanks

Auke

2006-10-21 20:56:03

by Damien Wyart

[permalink] [raw]
Subject: Re: [PATCH] e100_shutdown: netif_poll_disable hang

* Auke Kok <[email protected]> [2006-10-21 10:54]: his change
> breaks something else (a reboot with netconsole, possibly suspend).
> Please give the latest version I sent a try. Daniel confirmed me that
> it works, but it's always nice to hear it from more people.

Yes, your version works here too.

--
Damien Wyart