2006-11-01 05:53:29

by Chris Wright

[permalink] [raw]
Subject: [PATCH 25/61] bcm43xx: fix watchdog timeouts.

-stable review patch. If anyone has any objections, please let us know.
------------------

From: Michael Buesch <[email protected]>

This fixes a netdev watchdog timeout problem.
The problem is caused by a needed netif_tx_disable
in the hardware calibration code and can be shown by the
following timegraph.

|---5secs - ~10 jiffies time---|---|OOPS
^ ^
last real TX periodic work stops netif

At OOPS, the following happens:
The watchdog timer triggers, because the timeout of 5secs
is over. The watchdog first checks for stopped TX.
_Usually_ TX is only stopped from the TX handler to indicate
a full TX queue. But this is different. We need to stop TX here,
regardless of the TX queue state. So the watchdog recognizes
the stopped device and assumes it is stopped due to full
TX queues (Which is a _wrong_ assumption in this case). It then
tests how far the last TX has been in the past. If it's more than
5secs (which is the case for low or no traffic), it will fire
a TX timeout.

Signed-off-by: Michael Buesch <[email protected]>
Signed-off-by: Larry Finger <[email protected]>
Signed-off-by: Greg Kroah-Hartman <[email protected]>
Signed-off-by: Chris Wright <[email protected]>

---
drivers/net/wireless/bcm43xx/bcm43xx_main.c | 8 ++++++++
1 file changed, 8 insertions(+)

--- linux-2.6.18.1.orig/drivers/net/wireless/bcm43xx/bcm43xx_main.c
+++ linux-2.6.18.1/drivers/net/wireless/bcm43xx/bcm43xx_main.c
@@ -3165,7 +3165,15 @@ static void bcm43xx_periodic_work_handle

badness = estimate_periodic_work_badness(bcm->periodic_state);
mutex_lock(&bcm->mutex);
+
+ /* We must fake a started transmission here, as we are going to
+ * disable TX. If we wouldn't fake a TX, it would be possible to
+ * trigger the netdev watchdog, if the last real TX is already
+ * some time on the past (slightly less than 5secs)
+ */
+ bcm->net_dev->trans_start = jiffies;
netif_tx_disable(bcm->net_dev);
+
spin_lock_irqsave(&bcm->irq_lock, flags);
if (badness > BADNESS_LIMIT) {
/* Periodic work will take a long time, so we want it to

--


2006-11-01 15:17:41

by John W. Linville

[permalink] [raw]
Subject: Re: [PATCH 25/61] bcm43xx: fix watchdog timeouts.

On Tue, Oct 31, 2006 at 09:34:05PM -0800, Chris Wright wrote:
> -stable review patch. If anyone has any objections, please let us know.
> ------------------
>
> From: Michael Buesch <[email protected]>
>
> This fixes a netdev watchdog timeout problem.
> The problem is caused by a needed netif_tx_disable
> in the hardware calibration code and can be shown by the
> following timegraph.

ACK
--
John W. Linville
[email protected]