2002-11-20 18:46:23

by Robert Olsson

[permalink] [raw]
Subject: e1000 fixes (NAPI)



Two fixes the NAPI branch of e1000.


1) e1000_irq_disable was used to disable irqs which called synchronize_irq
which in turn caused a solid hang on SMP systems.

2) Interrupt acking patch previously sent by DaveM.

Cheers.
--ro



--- linux/drivers/net/e1000.vanilla/e1000.h 2002-11-13 12:54:49.000000000 +0100
+++ linux/drivers/net/e1000/e1000.h 2002-11-20 15:00:14.000000000 +0100
@@ -181,6 +181,10 @@
uint32_t tx_abs_int_delay;
int max_data_per_txd;

+#ifdef CONFIG_E1000_NAPI
+ uint32_t icr_pending;
+#endif
+
/* RX */
struct e1000_desc_ring rx_ring;
uint64_t hw_csum_err;
--- linux/drivers/net/e1000.vanilla/e1000_main.c 2002-11-13 12:54:49.000000000 +0100
+++ linux/drivers/net/e1000/e1000_main.c 2002-11-20 17:37:55.000000000 +0100
@@ -1896,15 +1896,29 @@
{
struct net_device *netdev = data;
struct e1000_adapter *adapter = netdev->priv;
+ uint32_t icr;
+ int i = E1000_MAX_INTR;
+

#ifdef CONFIG_E1000_NAPI
- if (netif_rx_schedule_prep(netdev)) {
- e1000_irq_disable(adapter);
+ icr = E1000_READ_REG(&adapter->hw, ICR);
+ if (icr && netif_rx_schedule_prep(netdev)) {
+
+ /* Disable interrupts */
+ atomic_inc(&adapter->irq_sem);
+ E1000_WRITE_REG(&adapter->hw, IMC, ~0);
+
+ /* E1000_WRITE_FLUSH(&adapter->hw);
+ * E1000_READ below syncs it --ro
+ */
+
__netif_rx_schedule(netdev);
+
+ adapter->icr_pending = icr;
+ while (i-- && (icr = E1000_READ_REG(&adapter->hw, ICR)) != 0)
+ adapter->icr_pending |= icr;
}
#else
- uint32_t icr;
- int i = E1000_MAX_INTR;

while(i && (icr = E1000_READ_REG(&adapter->hw, ICR))) {

@@ -1930,7 +1944,15 @@
int i = E1000_MAX_INTR;
int hasReceived = 0;

- while(i && (icr = E1000_READ_REG(&adapter->hw, ICR))) {
+ while(i) {
+ if (adapter->icr_pending) {
+ icr = adapter->icr_pending;
+ adapter->icr_pending = 0;
+ } else
+ icr = E1000_READ_REG(&adapter->hw, ICR);
+ if (!icr)
+ break;
+
if (icr & E1000_ICR_RXT0)
hasReceived = 1;



2002-11-20 22:34:40

by Jeff V. Merkey

[permalink] [raw]
Subject: Re: e1000 fixes (NAPI)



Need another fix. You need to reinstrument the tasklet schedule in the
fill_rx_ring instread of doing the whole thing from interrupt. When the
system is loaded at 100% saturation on gigbit with 300 byte packets or
smaller, the driver does not allow any processes to run, and you cannot
log in via ssh or any user space apps. This is severely busted.

The later versions of the driver > 4.3.15 all exhibit this behavior and
are extremely broken.

Jeff



On Wed, Nov 20, 2002 at 08:01:16PM +0100, Robert Olsson wrote:
>
>
> Two fixes the NAPI branch of e1000.
>
>
> 1) e1000_irq_disable was used to disable irqs which called synchronize_irq
> which in turn caused a solid hang on SMP systems.
>
> 2) Interrupt acking patch previously sent by DaveM.
>
> Cheers.
> --ro
>
>
>
> --- linux/drivers/net/e1000.vanilla/e1000.h 2002-11-13 12:54:49.000000000 +0100
> +++ linux/drivers/net/e1000/e1000.h 2002-11-20 15:00:14.000000000 +0100
> @@ -181,6 +181,10 @@
> uint32_t tx_abs_int_delay;
> int max_data_per_txd;
>
> +#ifdef CONFIG_E1000_NAPI
> + uint32_t icr_pending;
> +#endif
> +
> /* RX */
> struct e1000_desc_ring rx_ring;
> uint64_t hw_csum_err;
> --- linux/drivers/net/e1000.vanilla/e1000_main.c 2002-11-13 12:54:49.000000000 +0100
> +++ linux/drivers/net/e1000/e1000_main.c 2002-11-20 17:37:55.000000000 +0100
> @@ -1896,15 +1896,29 @@
> {
> struct net_device *netdev = data;
> struct e1000_adapter *adapter = netdev->priv;
> + uint32_t icr;
> + int i = E1000_MAX_INTR;
> +
>
> #ifdef CONFIG_E1000_NAPI
> - if (netif_rx_schedule_prep(netdev)) {
> - e1000_irq_disable(adapter);
> + icr = E1000_READ_REG(&adapter->hw, ICR);
> + if (icr && netif_rx_schedule_prep(netdev)) {
> +
> + /* Disable interrupts */
> + atomic_inc(&adapter->irq_sem);
> + E1000_WRITE_REG(&adapter->hw, IMC, ~0);
> +
> + /* E1000_WRITE_FLUSH(&adapter->hw);
> + * E1000_READ below syncs it --ro
> + */
> +
> __netif_rx_schedule(netdev);
> +
> + adapter->icr_pending = icr;
> + while (i-- && (icr = E1000_READ_REG(&adapter->hw, ICR)) != 0)
> + adapter->icr_pending |= icr;
> }
> #else
> - uint32_t icr;
> - int i = E1000_MAX_INTR;
>
> while(i && (icr = E1000_READ_REG(&adapter->hw, ICR))) {
>
> @@ -1930,7 +1944,15 @@
> int i = E1000_MAX_INTR;
> int hasReceived = 0;
>
> - while(i && (icr = E1000_READ_REG(&adapter->hw, ICR))) {
> + while(i) {
> + if (adapter->icr_pending) {
> + icr = adapter->icr_pending;
> + adapter->icr_pending = 0;
> + } else
> + icr = E1000_READ_REG(&adapter->hw, ICR);
> + if (!icr)
> + break;
> +
> if (icr & E1000_ICR_RXT0)
> hasReceived = 1;
>
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

2002-11-21 01:11:54

by Feldman, Scott

[permalink] [raw]
Subject: RE: e1000 fixes (NAPI)

> Need another fix. You need to reinstrument the tasklet
> schedule in the fill_rx_ring instread of doing the whole thing from
> interrupt. When the system is loaded at 100% saturation on gigbit
> with 300 byte packets or smaller, the driver does not allow any
> processes to run, and you cannot log in via ssh or any user space
> apps. This is severely busted.

That's one of the points of NAPI - to process high traffic Rx rates outside
of h/w interrupt context.

-scott

2002-11-21 10:28:26

by Robert Olsson

[permalink] [raw]
Subject: Re: e1000 fixes (NAPI)


Jeff V. Merkey writes:
>
> Need another fix. You need to reinstrument the tasklet schedule in the
> fill_rx_ring instread of doing the whole thing from interrupt. When the
> system is loaded at 100% saturation on gigbit with 300 byte packets or
> smaller, the driver does not allow any processes to run, and you cannot
> log in via ssh or any user space apps. This is severely busted.
>
> The later versions of the driver > 4.3.15 all exhibit this behavior and
> are extremely broken.

Where have you been? :-)

NAPI does RX processing in softirq. RX interrupts are just used to indicate
work. At high loads the consecutive RX polls gets run via ksoftirqd which
is under scheduler control also the RX softirq breakes for other work. This
makes the NAPI network stuff as very well behaved kernel citizen and also
gives network performance at any load.

More details is in the usenix paper;
http://www.cyberus.ca/~hadi/usenix-paper.tgz

Cheers.
--ro

2002-11-21 16:56:16

by Jeff V. Merkey

[permalink] [raw]
Subject: Re: e1000 fixes (NAPI)



I've been using the driver from Intel's website. :-) Where is this new extension so I can test it.

:-)

Jeff

P.S. I've been working heads down on new projects with Linux is where
I've been. Sounds like other folks have seen this and addressed it.



On Thu, Nov 21, 2002 at 11:43:11AM +0100, Robert Olsson wrote:
>
> Jeff V. Merkey writes:
> >
> > Need another fix. You need to reinstrument the tasklet schedule in the
> > fill_rx_ring instread of doing the whole thing from interrupt. When the
> > system is loaded at 100% saturation on gigbit with 300 byte packets or
> > smaller, the driver does not allow any processes to run, and you cannot
> > log in via ssh or any user space apps. This is severely busted.
> >
> > The later versions of the driver > 4.3.15 all exhibit this behavior and
> > are extremely broken.
>
> Where have you been? :-)
>
> NAPI does RX processing in softirq. RX interrupts are just used to indicate
> work. At high loads the consecutive RX polls gets run via ksoftirqd which
> is under scheduler control also the RX softirq breakes for other work. This
> makes the NAPI network stuff as very well behaved kernel citizen and also
> gives network performance at any load.
>
> More details is in the usenix paper;
> http://www.cyberus.ca/~hadi/usenix-paper.tgz
>
> Cheers.
> --ro
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

2002-11-21 17:01:14

by Jeff V. Merkey

[permalink] [raw]
Subject: Re: e1000 fixes (NAPI)



One other comment. Does NAPI handle the issue of refilling the RX ring
from interrupt? This is the source of the problem, not packet delvery, which
is handled from a softirq handler.

Jeff

On Thu, Nov 21, 2002 at 11:43:11AM +0100, Robert Olsson wrote:
>
> Jeff V. Merkey writes:
> >
> > Need another fix. You need to reinstrument the tasklet schedule in the
> > fill_rx_ring instread of doing the whole thing from interrupt. When the
> > system is loaded at 100% saturation on gigbit with 300 byte packets or
> > smaller, the driver does not allow any processes to run, and you cannot
> > log in via ssh or any user space apps. This is severely busted.
> >
> > The later versions of the driver > 4.3.15 all exhibit this behavior and
> > are extremely broken.
>
> Where have you been? :-)
>
> NAPI does RX processing in softirq. RX interrupts are just used to indicate
> work. At high loads the consecutive RX polls gets run via ksoftirqd which
> is under scheduler control also the RX softirq breakes for other work. This
> makes the NAPI network stuff as very well behaved kernel citizen and also
> gives network performance at any load.
>
> More details is in the usenix paper;
> http://www.cyberus.ca/~hadi/usenix-paper.tgz
>
> Cheers.
> --ro
>
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

2002-11-21 17:06:23

by Jeff Garzik

[permalink] [raw]
Subject: Re: e1000 fixes (NAPI)

Jeff V. Merkey wrote:

>
> One other comment. Does NAPI handle the issue of refilling the RX ring
> from interrupt? This is the source of the problem, not packet
> delvery, which
> is handled from a softirq handler.



NAPI poll does not happen in an interrupt. Doing things in interrupts
is the source of problems that NAPI is trying to solve.

Other than that, please read the code and NAPI paper... :)

2002-11-21 17:24:06

by Jeff Garzik

[permalink] [raw]
Subject: Re: e1000 fixes (NAPI)

Jeff V. Merkey wrote:

> >NAPI poll does not happen in an interrupt. Doing things in interrupts
> >is the source of problems that NAPI is trying to solve.
> >
> >Other than that, please read the code and NAPI paper... :)
>
>
>
>
> Where can I find it?



In the link Robert Ollson gave to you (paper), and the Linux kernel (code).

Jeff



2002-11-21 17:21:41

by Jeff V. Merkey

[permalink] [raw]
Subject: Re: e1000 fixes (NAPI)

On Thu, Nov 21, 2002 at 12:13:03PM -0500, Jeff Garzik wrote:
> Jeff V. Merkey wrote:
>
> >
> > One other comment. Does NAPI handle the issue of refilling the RX ring
> > from interrupt? This is the source of the problem, not packet
> > delvery, which
> > is handled from a softirq handler.
>
>
>
> NAPI poll does not happen in an interrupt. Doing things in interrupts
> is the source of problems that NAPI is trying to solve.
>
> Other than that, please read the code and NAPI paper... :)



Where can I find it?

Jeff

2002-11-21 20:54:45

by Jeff V. Merkey

[permalink] [raw]
Subject: Re: e1000 fixes (NAPI)


Thanks

jeff

On Thu, Nov 21, 2002 at 12:31:08PM -0500, Jeff Garzik wrote:
> Jeff V. Merkey wrote:
>
> > >NAPI poll does not happen in an interrupt. Doing things in interrupts
> > >is the source of problems that NAPI is trying to solve.
> > >
> > >Other than that, please read the code and NAPI paper... :)
> >
> >
> >
> >
> > Where can I find it?
>
>
>
> In the link Robert Ollson gave to you (paper), and the Linux kernel (code).
>
> Jeff
>
>

2002-11-22 10:06:19

by Robert Olsson

[permalink] [raw]
Subject: Re: e1000 fixes (NAPI)




Seems like FreeBSD is now getting on this train too. Someone sent me this link.
http://info.iet.unipi.it/~luigi/polling/

Cheers.

--ro


Jeff V. Merkey writes:
>
> Thanks
>
> jeff
>
> On Thu, Nov 21, 2002 at 12:31:08PM -0500, Jeff Garzik wrote:
> > Jeff V. Merkey wrote:
> >
> > > >NAPI poll does not happen in an interrupt. Doing things in interrupts
> > > >is the source of problems that NAPI is trying to solve.
> > > >
> > > >Other than that, please read the code and NAPI paper... :)
> > >
> > >
> > >
> > >
> > > Where can I find it?
> >
> >
> >
> > In the link Robert Ollson gave to you (paper), and the Linux kernel (code).
> >
> > Jeff
> >
> >

2002-11-22 17:56:02

by Jeff V. Merkey

[permalink] [raw]
Subject: Re: e1000 fixes (NAPI)


I'll check it out.

:-)

Jeff

On Fri, Nov 22, 2002 at 11:21:14AM +0100, Robert Olsson wrote:
>
>
>
> Seems like FreeBSD is now getting on this train too. Someone sent me this link.
> http://info.iet.unipi.it/~luigi/polling/
>
> Cheers.
>
> --ro
>
>
> Jeff V. Merkey writes:
> >
> > Thanks
> >
> > jeff
> >
> > On Thu, Nov 21, 2002 at 12:31:08PM -0500, Jeff Garzik wrote:
> > > Jeff V. Merkey wrote:
> > >
> > > > >NAPI poll does not happen in an interrupt. Doing things in interrupts
> > > > >is the source of problems that NAPI is trying to solve.
> > > > >
> > > > >Other than that, please read the code and NAPI paper... :)
> > > >
> > > >
> > > >
> > > >
> > > > Where can I find it?
> > >
> > >
> > >
> > > In the link Robert Ollson gave to you (paper), and the Linux kernel (code).
> > >
> > > Jeff
> > >
> > >
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/