2020-01-22 22:26:11

by Finn Thain

[permalink] [raw]
Subject: [PATCH net v3 12/12] net/sonic: Prevent tx watchdog timeout

Section 5.5.3.2 of the datasheet says,

If FIFO Underrun, Byte Count Mismatch, Excessive Collision, or
Excessive Deferral (if enabled) errors occur, transmission ceases.

In this situation, the chip asserts a TXER interrupt rather than TXDN.
But the handler for the TXDN is the only way that the transmit queue
gets restarted. Hence, an aborted transmission can result in a watchdog
timeout.

This problem can be reproduced on congested link, as that can result in
excessive transmitter collisions. Another way to reproduce this is with
a FIFO Underrun, which may be caused by DMA latency.

In event of a TXER interrupt, prevent a watchdog timeout by restarting
transmission.

Fixes: 1da177e4c3f4 ("Linux-2.6.12-rc2")
Tested-by: Stan Johnson <[email protected]>
Signed-off-by: Finn Thain <[email protected]>
---
drivers/net/ethernet/natsemi/sonic.c | 17 +++++++++++++----
1 file changed, 13 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/natsemi/sonic.c b/drivers/net/ethernet/natsemi/sonic.c
index 27b6f6585527..05e760444a92 100644
--- a/drivers/net/ethernet/natsemi/sonic.c
+++ b/drivers/net/ethernet/natsemi/sonic.c
@@ -415,10 +415,19 @@ static irqreturn_t sonic_interrupt(int irq, void *dev_id)
lp->stats.rx_missed_errors += 65536;

/* transmit error */
- if (status & SONIC_INT_TXER)
- if (SONIC_READ(SONIC_TCR) & SONIC_TCR_FU)
- netif_dbg(lp, tx_err, dev, "%s: tx fifo underrun\n",
- __func__);
+ if (status & SONIC_INT_TXER) {
+ u16 tcr = SONIC_READ(SONIC_TCR);
+
+ netif_dbg(lp, tx_err, dev, "%s: TXER intr, TCR %04x\n",
+ __func__, tcr);
+
+ if (tcr & (SONIC_TCR_EXD | SONIC_TCR_EXC |
+ SONIC_TCR_FU | SONIC_TCR_BCM)) {
+ /* Aborted transmission. Try again. */
+ netif_stop_queue(dev);
+ SONIC_WRITE(SONIC_CMD, SONIC_CR_TXP);
+ }
+ }

/* bus retry */
if (status & SONIC_INT_BR) {
--
2.24.1