2009-12-03 21:20:13

by Ha, Tristram

[permalink] [raw]
Subject: [PATCH 2.6.32 2/3] net: Fix ks8851 snl transmit problem

From: Tristram Ha <[email protected]>

This fixes a transmit problem of the ks8851 snl network driver. Under heavy TCP traffic the device will stop transmitting. Turning off the transmit interrupt avoids this issue. A new workqueue was implemented to replace the functionality of the transmit interrupt processing.

Signed-off-by: Tristram Ha <[email protected]>
---
diff -urpN linux-2.6.32.old/drivers/net/ks8851.c linux-2.6.32.new/drivers/net/ks8851.c
--- linux-2.6.32.old/drivers/net/ks8851.c 2009-11-03 11:37:49.000000000 -0800
+++ linux-2.6.32.new/drivers/net/ks8851.c 2009-12-02 15:31:39.000000000 -0800
@@ -111,11 +111,13 @@ struct ks8851_net {
struct mii_if_info mii;
struct ks8851_rxctrl rxctrl;

+ struct work_struct tx_check;
struct work_struct tx_work;
struct work_struct irq_work;
struct work_struct rxctrl_work;

struct sk_buff_head txq;
+ int tx_len;

struct spi_message spi_msg1;
struct spi_message spi_msg2;
@@ -573,19 +575,6 @@ static void ks8851_irq_work(struct work_
if (status & IRQ_RXPSI)
handled |= IRQ_RXPSI;

- if (status & IRQ_TXI) {
- handled |= IRQ_TXI;
-
- /* no lock here, tx queue should have been stopped */
-
- /* update our idea of how much tx space is available to the
- * system */
- ks->tx_space = ks8851_rdreg16(ks, KS_TXMIR);
-
- if (netif_msg_intr(ks))
- ks_dbg(ks, "%s: txspace %d\n", __func__, ks->tx_space);
- }
-
if (status & IRQ_RXI)
handled |= IRQ_RXI;

@@ -623,9 +612,6 @@ static void ks8851_irq_work(struct work_

mutex_unlock(&ks->lock);

- if (status & IRQ_TXI)
- netif_wake_queue(ks->netdev);
-
enable_irq(ks->netdev->irq);
}

@@ -703,6 +689,17 @@ static void ks8851_done_tx(struct ks8851
dev_kfree_skb(txb);
}

+static void ks8851_tx_check(struct work_struct *work)
+{
+ struct ks8851_net *ks = container_of(work, struct ks8851_net, tx_check);
+
+ ks->tx_space = ks8851_rdreg16(ks, KS_TXMIR);
+ if (ks->tx_space > ks->tx_len)
+ netif_wake_queue(ks->netdev);
+ else
+ schedule_work(&ks->tx_check);
+}
+
/**
* ks8851_tx_work - process tx packet(s)
* @work: The work strucutre what was scheduled.
@@ -814,7 +811,6 @@ static int ks8851_net_open(struct net_de
/* clear then enable interrupts */

#define STD_IRQ (IRQ_LCI | /* Link Change */ \
- IRQ_TXI | /* TX done */ \
IRQ_RXI | /* RX done */ \
IRQ_SPIBEI | /* SPI bus error */ \
IRQ_TXPSI | /* TX process stop */ \
@@ -830,6 +826,7 @@ static int ks8851_net_open(struct net_de
ks_dbg(ks, "network device %s up\n", dev->name);

mutex_unlock(&ks->lock);
+ ks8851_write_mac_addr(dev);
return 0;
}

@@ -854,6 +851,7 @@ static int ks8851_net_stop(struct net_de

/* stop any outstanding work */
flush_work(&ks->irq_work);
+ flush_work(&ks->tx_check);
flush_work(&ks->tx_work);
flush_work(&ks->rxctrl_work);

@@ -912,14 +910,16 @@ static netdev_tx_t ks8851_start_xmit(str

if (needed > ks->tx_space) {
netif_stop_queue(dev);
+ ks->tx_len = needed;
+ schedule_work(&ks->tx_check);
ret = NETDEV_TX_BUSY;
} else {
ks->tx_space -= needed;
skb_queue_tail(&ks->txq, skb);
+ schedule_work(&ks->tx_work);
}

spin_unlock(&ks->statelock);
- schedule_work(&ks->tx_work);

return ret;
}
@@ -1229,6 +1229,7 @@ static int __devinit ks8851_probe(struct
mutex_init(&ks->lock);
spin_lock_init(&ks->statelock);

+ INIT_WORK(&ks->tx_check, ks8851_tx_check);
INIT_WORK(&ks->tx_work, ks8851_tx_work);
INIT_WORK(&ks->irq_work, ks8851_irq_work);
INIT_WORK(&ks->rxctrl_work, ks8851_rxctrl_work);
@@ -1281,6 +1282,7 @@ static int __devinit ks8851_probe(struct

ks8851_read_selftest(ks);
ks8851_init_mac(ks);
+ ks->tx_space = ks8851_rdreg16(ks, KS_TXMIR);

ret = request_irq(spi->irq, ks8851_irq, IRQF_TRIGGER_LOW,
ndev->name, ks);


2009-12-04 00:03:13

by David Miller

[permalink] [raw]
Subject: Re: [PATCH 2.6.32 2/3] net: Fix ks8851 snl transmit problem

From: "Ha, Tristram" <[email protected]>
Date: Thu, 3 Dec 2009 13:17:48 -0800

> This fixes a transmit problem of the ks8851 snl network driver. Under heavy TCP traffic the device will stop transmitting. Turning off the transmit interrupt avoids this issue. A new workqueue was implemented to replace the functionality of the transmit interrupt processing.
>
> Signed-off-by: Tristram Ha <[email protected]>

Please describe why the TX interrupt facility cannot be used.
You probably have a race condition or similar, have you tried
to see if there is some issue like that?

This workqueue is going to make performance terrible, at best.

2009-12-04 01:01:16

by Ha, Tristram

[permalink] [raw]
Subject: RE: [PATCH 2.6.32 2/3] net: Fix ks8851 snl transmit problem

David Miller wrote:
> From: "Ha, Tristram" <[email protected]>
> Date: Thu, 3 Dec 2009 13:17:48 -0800
>
>> This fixes a transmit problem of the ks8851 snl network driver.
Under heavy TCP traffic the
>> device will stop transmitting. Turning off the transmit interrupt
avoids this issue. A new
>> workqueue was implemented to replace the functionality of the
transmit interrupt processing.
>>
>> Signed-off-by: Tristram Ha <[email protected]>
>
> Please describe why the TX interrupt facility cannot be used.
> You probably have a race condition or similar, have you tried to see
if there is some issue like
> that?
>
> This workqueue is going to make performance terrible, at best.

I am a software engineer at Micrel, Inc., which developed the KSZ8851
SNL chips. This transmit problem is a hardware issue as the same
problem is encountered in different drivers like WinCE. We have not
pinpointed the hardware bug. I do not think there is something wrong in
the driver. Turning off the transmit interrupt is just a workaround.

I know the new workqueue implementation is polling. But in my
experience, interrupting the system for each packet sent likely degrades
the system performance more. The workqueue is only called when the
transmit buffer is empty, which happens after 3 or 4 packets are sent
quickly in succession. The workqueue schedules itself again if the
transmit buffer is still not available, but in practice this case never
happens.

The KSZ8851 chips do not need special transmit interrupt handling
routine to cleanup the registers or socket buffers. It is better to
turn it off for better performance. Because the KSZ8851 SNL uses SPI,
its driver has restrictions of when to call SPI host to access
registers. That is why the original driver implementation uses a
workqueue to check the transmit buffer during transmit interrupt
handling.

2009-12-21 21:51:56

by Ha, Tristram

[permalink] [raw]
Subject: RE: [PATCH 2.6.32 2/3] net: Fix ks8851 snl transmit problem

Ha, Tristram wrote:
> David Miller wrote:
>>
>> Please describe why the TX interrupt facility cannot be used.
>> You probably have a race condition or similar, have you tried to see
>> if there is some issue like that?
>>
>> This workqueue is going to make performance terrible, at best.
>
> I am a software engineer at Micrel, Inc., which developed the KSZ8851
SNL chips. This transmit
> problem is a hardware issue as the same problem is encountered in
different drivers like WinCE.
> We have not pinpointed the hardware bug. I do not think there is
something wrong in the driver.
> Turning off the transmit interrupt is just a workaround.
>
> I know the new workqueue implementation is polling. But in my
experience, interrupting the
> system for each packet sent likely degrades the system performance
more. The workqueue is only
> called when the transmit buffer is empty, which happens after 3 or 4
packets are sent quickly in
> succession. The workqueue schedules itself again if the transmit
buffer is still not available,
> but in practice this case never happens.
>
> The KSZ8851 chips do not need special transmit interrupt handling
routine to cleanup the
> registers or socket buffers. It is better to turn it off for better
performance. Because the
> KSZ8851 SNL uses SPI, its driver has restrictions of when to call SPI
host to access registers.
> That is why the original driver implementation uses a workqueue to
check the transmit buffer
> during transmit interrupt handling.

Upon investigation, I found the ks8851 snl TCP transmit problem is not
related to the transmit side, which is fine, but to the receive side.
The "[PATCH 2.6.33] net: Fix ks8851 snl receive problem" patch I just
submitted should avoid the problem, so this patch is not required to
workaround the problem.

However, I still recommend this patch for performance consideration. I
did a nuttcp benchmark using the Beagle Zippy combo board.

The original code using transmit done interrupt:

TCP RX 9.6 Mbps TCP TX 10.9 Mbps

The new code with transmit interrupt enabled:

TCP RX 9.7 Mbps TCP TX 11.5 Mbps

The new code with transmit interrupt disabled:

TCP RX 9.8 Mbps TCP TX 11.8 Mbps

The improvement is not much, but as the ks8851 snl device is targeted on
embedded systems that may not have powerful CPUs, the performance gain
is important to those ks8851 snl users.