2004-04-08 10:39:30

by Stelian Pop

[permalink] [raw]
Subject: [PATCH 2.6] netpoll: prevent transmission stall

While working on kgdb over netpoll I found out another bug in
netpoll.

This one is in netpoll_send_skb(), and it is triggered by
sending repeatedly more packets than the transmit buffers the
ethernet card has, while being in the netpoll_trap. (this is
rapidly triggered by kgdb when using a card with limited
transmit buffers, like my pcmcia NE2000).

One should poll() the ethernet card interrupt function in order
to get the transmit acks from the card and free the transmit
buffers in order to be able to transmit more packets.

This is why in the original code, netpoll_send_skb() calls
netpoll_poll() if dev->hard_start_xmit() fails. However, this
does not work because netpoll_poll() is called only if
netif_queue_stopped(). But the net queue functions are overridden
if netpoll_trap() is on, causing netif_queue_stopped() to never
return true.

The attached patch 'fixes' this, by forcing a netpoll_poll()
every time dev->hard_start_xmit() fails. Maybe there is a cleaner
way to do this by making the queue functions work even under
netpoll_trap, but it shouldn't really be necessary, I cannot
see how forcing the netpoll_poll() call could break anything.

Stelian.

--- net/core/netpoll.c.ORIG 2004-04-06 10:50:05.000000000 +0200
+++ net/core/netpoll.c 2004-04-08 10:41:56.484026496 +0200
@@ -163,21 +163,15 @@
spin_lock(&np->dev->xmit_lock);
np->dev->xmit_lock_owner = smp_processor_id();

- if (netif_queue_stopped(np->dev)) {
- np->dev->xmit_lock_owner = -1;
- spin_unlock(&np->dev->xmit_lock);
-
- netpoll_poll(np);
- goto repeat;
- }
-
status = np->dev->hard_start_xmit(skb, np->dev);
np->dev->xmit_lock_owner = -1;
spin_unlock(&np->dev->xmit_lock);

/* transmit busy */
- if(status)
+ if(status) {
+ netpoll_poll(np);
goto repeat;
+ }
}

void netpoll_send_udp(struct netpoll *np, const char *msg, int len)


--
Stelian Pop <[email protected]>


2004-04-08 14:06:48

by Matt Mackall

[permalink] [raw]
Subject: Re: [PATCH 2.6] netpoll: prevent transmission stall

On Thu, Apr 08, 2004 at 12:39:25PM +0200, Stelian Pop wrote:
> While working on kgdb over netpoll I found out another bug in
> netpoll.
>
> This one is in netpoll_send_skb(), and it is triggered by
> sending repeatedly more packets than the transmit buffers the
> ethernet card has, while being in the netpoll_trap. (this is
> rapidly triggered by kgdb when using a card with limited
> transmit buffers, like my pcmcia NE2000).
>
> One should poll() the ethernet card interrupt function in order
> to get the transmit acks from the card and free the transmit
> buffers in order to be able to transmit more packets.
>
> This is why in the original code, netpoll_send_skb() calls
> netpoll_poll() if dev->hard_start_xmit() fails. However, this
> does not work because netpoll_poll() is called only if
> netif_queue_stopped(). But the net queue functions are overridden
> if netpoll_trap() is on, causing netif_queue_stopped() to never
> return true.
>
> The attached patch 'fixes' this, by forcing a netpoll_poll()
> every time dev->hard_start_xmit() fails. Maybe there is a cleaner
> way to do this by making the queue functions work even under
> netpoll_trap, but it shouldn't really be necessary, I cannot
> see how forcing the netpoll_poll() call could break anything.

Ok, makes sense. I'll queue this up.

--
Matt Mackall : http://www.selenic.com : Linux development and consulting