2007-12-12 04:03:15

by Joonwoo Park

[permalink] [raw]
Subject: [PATCH 1/7] [NETDEV]: e1000 Fix possible causing oops of net_rx_action

[NETDEV]: e1000 Fix possible causing oops of net_rx_action
returning work_done == weight as true after calling netif_rx_complete will cause oops in net_rx_action.

Thanks
Joonwoo

Signed-off-by: Joonwoo Park <[email protected]>
---
drivers/net/e1000/e1000_main.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
index 4f37506..4dd61e3 100644
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/e1000/e1000_main.c
@@ -3949,6 +3949,8 @@ quit_polling:
e1000_set_itr(adapter);
netif_rx_complete(poll_dev, napi);
e1000_irq_enable(adapter);
+ if (unlikely(work_done == napi->weight))
+ return work_done - 1;
}

return work_done;
---


2007-12-13 10:19:24

by Joonwoo Park

[permalink] [raw]
Subject: RE: [PATCH 1/7] [NETDEV]: e1000 Fix possible causing oops of net_rx_action

2007/12/12, Joonwoo Park <[email protected]>:
> [NETDEV]: e1000 Fix possible causing oops of net_rx_action
> returning work_done == weight as true after calling netif_rx_complete will cause oops in net_rx_action.
>

I tried two types of patches for oops and ifconfig down hang for e1000 first.
Just blowing netif_running up is not best solution I think, it makes ifconfig down hang at least for e1000.
I would like to listen to the others suggestions courteously, please enlighten me :-)

The first:
- if !netif_running, stop receiving process, up to 64 (e1000) packets in the queue would be dropped.
---
drivers/net/e1000/e1000_main.c | 8 ++++----
1 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
index 4f37506..664312b 100644
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/e1000/e1000_main.c
@@ -3938,12 +3938,12 @@ e1000_clean(struct napi_struct *napi, int budget)
spin_unlock(&adapter->tx_queue_lock);
}

- adapter->clean_rx(adapter, &adapter->rx_ring[0],
- &work_done, budget);
+ if (likely(netif_running(poll_dev)))
+ adapter->clean_rx(adapter, &adapter->rx_ring[0],
+ &work_done, budget);

/* If no Tx and not enough Rx work done, exit the polling mode */
- if ((!tx_cleaned && (work_done == 0)) ||
- !netif_running(poll_dev)) {
+ if ((!tx_cleaned && (work_done == 0))) {
quit_polling:
if (likely(adapter->itr_setting & 3))
e1000_set_itr(adapter);
---

The second:
- if !netif_running, receive up to weight - 1 packets, one packets in the queue can be dropped.
---
drivers/net/e1000/e1000_main.c | 9 +++++----
1 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/net/e1000/e1000_main.c b/drivers/net/e1000/e1000_main.c
index 4f37506..8e53c5b 100644
--- a/drivers/net/e1000/e1000_main.c
+++ b/drivers/net/e1000/e1000_main.c
@@ -3919,7 +3919,7 @@ e1000_clean(struct napi_struct *napi, int budget)
{
struct e1000_adapter *adapter = container_of(napi, struct e1000_adapter, napi);
struct net_device *poll_dev = adapter->netdev;
- int tx_cleaned = 0, work_done = 0;
+ int tx_cleaned = 0, work_done = 0, running;

/* Must NOT use netdev_priv macro here. */
adapter = poll_dev->priv;
@@ -3938,12 +3938,13 @@ e1000_clean(struct napi_struct *napi, int budget)
spin_unlock(&adapter->tx_queue_lock);
}

+ running = netif_running(poll_dev);
+
adapter->clean_rx(adapter, &adapter->rx_ring[0],
- &work_done, budget);
+ &work_done, budget - !running);

/* If no Tx and not enough Rx work done, exit the polling mode */
- if ((!tx_cleaned && (work_done == 0)) ||
- !netif_running(poll_dev)) {
+ if ((!tx_cleaned && (work_done == 0)) || !running) {
quit_polling:
if (likely(adapter->itr_setting & 3))
e1000_set_itr(adapter);
---


Thanks,
Joonwoo

2007-12-13 13:34:26

by David Miller

[permalink] [raw]
Subject: Re: [PATCH 1/7] [NETDEV]: e1000 Fix possible causing oops of net_rx_action

From: "Joonwoo Park" <[email protected]>
Date: Thu, 13 Dec 2007 19:18:56 +0900

> Just blowing netif_running up is not best solution I think, it makes
> ifconfig down hang at least for e1000.

It hangs because the packet receive rate is so high that NAPI
poll never exits.

I think we need a cheap solution to something so obscure and
almost not worth explicitly even coding for. Really, if you
setup silly situations like that, you get what you asked for.

2007-12-14 02:41:42

by Joonwoo Park

[permalink] [raw]
Subject: Re: [PATCH 1/7] [NETDEV]: e1000 Fix possible causing oops of net_rx_action

2007/12/13, David Miller <[email protected]>:
> From: "Joonwoo Park" <[email protected]>
> Date: Thu, 13 Dec 2007 19:18:56 +0900
>
> > Just blowing netif_running up is not best solution I think, it makes
> > ifconfig down hang at least for e1000.
>
> It hangs because the packet receive rate is so high that NAPI
> poll never exits.

Certainly I'm aware it

>
> I think we need a cheap solution to something so obscure and
> almost not worth explicitly even coding for. Really, if you
> setup silly situations like that, you get what you asked for.
>

I can agree that we need good solution for that.
BUT I don't think I didn't setup *silly* situation. my customers who
are reporting this problem, running firewall on linux which is
forwarding packets with high rate.
I don't want to say 'don't ifconfig down, don't reboot, don't
shutdown' it would introduce problem on your such *sily* sitution'.
In addition, my laptop is just connected to another *linux* machine
which is generating 300mbps 64byte udp packets infinitely.

Joonwoo