2010-04-03 21:09:20

by Christian Lamparter

[permalink] [raw]
Subject: [RFT/RFC] p54pci: prevent stuck rx-ring on slow system

From: Quintin Pitts <[email protected]>

This patch fixes an old problem, which - under certain
circumstances - could cause the device to become
unresponsive.

most of p54pci's rx-ring management is implemented in just
two distinct standalone functions. p54p_check_rx_ring takes
care of processing incoming data, while p54p_refill_rx_ring
tries to replenish all depleted communication buffers.

This has always worked fine on my fast machine, but
now I know there is a hidden race...

The most likely candidate here is ring_control->device_idx.
Quintin Pitts had already analyzed the culprit and posted
a patch back in Oct 2009. But sadly, no one's picked up on this.
( https://patchwork.kernel.org/patch/53079/ [2 & 3] ).
This patch goes in the same way, except that we now prioritize
rx data processing, simply because tx routine can wait.

Reported-by: Sean Young <[email protected]>
Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=11386
Reported-by: Quintin Pitts <[email protected]>
Signed-off-by: Quintin Pitts <[email protected]>
Signed-off-by: Christian Lamparter <[email protected]>
---
Quintin, I've added the "From: ...", because this is from your post.
So, you've the last word, if you don't agree with what just I wrote.

NB: The comment above is just the best explanation I could
come up with. I'm not entirely sure, if I really hit the nail
right on this one, though. But don't get me wrong: the patch does
definitively help.

Regards,
Christian
---
diff --git a/drivers/net/wireless/p54/p54pci.c b/drivers/net/wireless/p54/p54pci.c
index ed4bdff..aa29663 100644
--- a/drivers/net/wireless/p54/p54pci.c
+++ b/drivers/net/wireless/p54/p54pci.c
@@ -131,7 +131,7 @@ static int p54p_upload_firmware(struct ieee80211_hw *dev)

static void p54p_refill_rx_ring(struct ieee80211_hw *dev,
int ring_index, struct p54p_desc *ring, u32 ring_limit,
- struct sk_buff **rx_buf)
+ struct sk_buff **rx_buf, u32 index)
{
struct p54p_priv *priv = dev->priv;
struct p54p_ring_control *ring_control = priv->ring_control;
@@ -139,7 +139,7 @@ static void p54p_refill_rx_ring(struct ieee80211_hw *dev,

idx = le32_to_cpu(ring_control->host_idx[ring_index]);
limit = idx;
- limit -= le32_to_cpu(ring_control->device_idx[ring_index]);
+ limit -= le32_to_cpu(index);
limit = ring_limit - limit;

i = idx % ring_limit;
@@ -231,7 +231,7 @@ static void p54p_check_rx_ring(struct ieee80211_hw *dev, u32 *index,
i %= ring_limit;
}

- p54p_refill_rx_ring(dev, ring_index, ring, ring_limit, rx_buf);
+ p54p_refill_rx_ring(dev, ring_index, ring, ring_limit, rx_buf, *index);
}

static void p54p_check_tx_ring(struct ieee80211_hw *dev, u32 *index,
@@ -276,14 +276,6 @@ static void p54p_tasklet(unsigned long dev_id)
struct p54p_priv *priv = dev->priv;
struct p54p_ring_control *ring_control = priv->ring_control;

- p54p_check_tx_ring(dev, &priv->tx_idx_mgmt, 3, ring_control->tx_mgmt,
- ARRAY_SIZE(ring_control->tx_mgmt),
- priv->tx_buf_mgmt);
-
- p54p_check_tx_ring(dev, &priv->tx_idx_data, 1, ring_control->tx_data,
- ARRAY_SIZE(ring_control->tx_data),
- priv->tx_buf_data);
-
p54p_check_rx_ring(dev, &priv->rx_idx_mgmt, 2, ring_control->rx_mgmt,
ARRAY_SIZE(ring_control->rx_mgmt), priv->rx_buf_mgmt);

@@ -292,6 +284,14 @@ static void p54p_tasklet(unsigned long dev_id)

wmb();
P54P_WRITE(dev_int, cpu_to_le32(ISL38XX_DEV_INT_UPDATE));
+
+ p54p_check_tx_ring(dev, &priv->tx_idx_mgmt, 3, ring_control->tx_mgmt,
+ ARRAY_SIZE(ring_control->tx_mgmt),
+ priv->tx_buf_mgmt);
+
+ p54p_check_tx_ring(dev, &priv->tx_idx_data, 1, ring_control->tx_data,
+ ARRAY_SIZE(ring_control->tx_data),
+ priv->tx_buf_data);
}

static irqreturn_t p54p_interrupt(int irq, void *dev_id)
@@ -444,10 +444,10 @@ static int p54p_open(struct ieee80211_hw *dev)
priv->rx_idx_mgmt = priv->tx_idx_mgmt = 0;

p54p_refill_rx_ring(dev, 0, priv->ring_control->rx_data,
- ARRAY_SIZE(priv->ring_control->rx_data), priv->rx_buf_data);
+ ARRAY_SIZE(priv->ring_control->rx_data), priv->rx_buf_data, 0);

p54p_refill_rx_ring(dev, 2, priv->ring_control->rx_mgmt,
- ARRAY_SIZE(priv->ring_control->rx_mgmt), priv->rx_buf_mgmt);
+ ARRAY_SIZE(priv->ring_control->rx_mgmt), priv->rx_buf_mgmt, 0);

P54P_WRITE(ring_control_base, cpu_to_le32(priv->ring_control_dma));
P54P_READ(ring_control_base);


2010-04-04 10:20:48

by Quintin Pitts

[permalink] [raw]
Subject: Re: [RFT/RFC] p54pci: prevent stuck rx-ring on slow system

On Sat Apr 03 2010 16:09:12 GMT-0500 (CDT), Christian Lamparter wrote:
> From: Quintin Pitts <[email protected]>
>
> This patch fixes an old problem, which - under certain
> circumstances - could cause the device to become
> unresponsive.
>
> most of p54pci's rx-ring management is implemented in just
> two distinct standalone functions. p54p_check_rx_ring takes
> care of processing incoming data, while p54p_refill_rx_ring
> tries to replenish all depleted communication buffers.
>
> This has always worked fine on my fast machine, but
> now I know there is a hidden race...
>
> The most likely candidate here is ring_control->device_idx.
> Quintin Pitts had already analyzed the culprit and posted
> a patch back in Oct 2009. But sadly, no one's picked up on this.
> ( https://patchwork.kernel.org/patch/53079/ [2 & 3] ).
> This patch goes in the same way, except that we now prioritize
> rx data processing, simply because tx routine can wait.
>
> Reported-by: Sean Young <[email protected]>
> Bugzilla: https://bugzilla.kernel.org/show_bug.cgi?id=11386
> Reported-by: Quintin Pitts <[email protected]>
> Signed-off-by: Quintin Pitts <[email protected]>
> Signed-off-by: Christian Lamparter <[email protected]>
> ---
> Quintin, I've added the "From: ...", because this is from your post.
> So, you've the last word, if you don't agree with what just I wrote.
>

I agree with what you wrote.

Thanks.

Quintin.