I found something that I believe might be an issue, and I have an
idea on how to correct this, but unfortunately, this doesn't solve
the issues I occasionally see with this driver. I'd still like to
share it, because I might be totally mistaken in my understanding.
With no firmware code or documentation at hand, it's not exactly clear
which assumption the firmware makes, but obviously, the driver and the
firmware share memory to exchange descriptors that either contain
control information or payload. The driver puts control descriptors
and payload descriptors in a ring buffer in an interleaved fashion.
When the driver wants to send an skb, it looks for a currently unused
control descriptor, and then fills it, together with its directly
chained payload descriptor. The descriptors are both marked valid and
then the firmware is instructed to start processing the ringbuffer.
In case the firmware is idle when wcn36xx_dxe_tx_frame() is entered,
this is all fine. However, when sending many packets in a short time
frame, there are cases when the firmware is still processing the ring
buffer (iow, ch->reg_ctrl & WCN36xx_DXE_CH_CTRL_EN != 0), and in this
case, writes to the shared memory area depict a data race. The local
spinlock of course doesn't help to prevent that. OTOH, it should be
completely fine to modify the descriptors while firmware is still
reading them, as the firmware should only pay attention to such that
are marked valid.
There's a problem with the latter presumption however which looks like
this in the driver code:
desc->fr_len = ctl->skb->len;
/* set dxe descriptor to VALID */
desc->ctrl = ch->ctrl_skb;
The CPU may very well reorder the two writes, even though the memory is
allocated as coherent DMA. When that happens, the firmware may see a
wrong length for a valid descriptor. A simple memory write barrier would
suffice to solve this, but then again, there are two descriptors
involved.
My attempt to fix that restructures the code a bit and makes the
payload descriptor valid first and then the control descriptor, both
strongly ordered through memory fences. This however assumes that the
firmware will ignore valid payload descriptors unless they have a
valid control descriptor preceding them, but that's really just
guessing.
Does that make sense? As I said, I can't really say this improves
anything, sadly, so I might be mistaken entirely. But I'll leave this
here for further discussion. Ideally, somebody with access to the
firmware sources could give an assessment whether this is an issue at
all or not.
Thanks,
Daniel
Daniel Mack (1):
wcn36xx: fix buffer commit logic on TX path
drivers/net/wireless/ath/wcn36xx/dxe.c | 75 +++++++++++++++++-----------------
1 file changed, 38 insertions(+), 37 deletions(-)
--
2.14.3
On 10 April 2018 at 20:42, Daniel Mack <[email protected]> wrote:
> I found something that I believe might be an issue, and I have an
> idea on how to correct this, but unfortunately, this doesn't solve
> the issues I occasionally see with this driver. I'd still like to
> share it, because I might be totally mistaken in my understanding.
>
Thanks for sharing you idea. Are you aware of the recent patch I sent
that Loic Poulain
wrote that fixes a race issue in access to wcn36xx_dxe_tx_frame() ?
Kalle just recently applied it to ath-next, I don't think it's
available yet upstream.
The patch was fixing something similar, perhaps it will solve the
issue you're experiencing.
> With no firmware code or documentation at hand, it's not exactly clear
> which assumption the firmware makes, but obviously, the driver and the
> firmware share memory to exchange descriptors that either contain
> control information or payload. The driver puts control descriptors
> and payload descriptors in a ring buffer in an interleaved fashion.
>
> When the driver wants to send an skb, it looks for a currently unused
> control descriptor, and then fills it, together with its directly
> chained payload descriptor. The descriptors are both marked valid and
> then the firmware is instructed to start processing the ringbuffer.
> In case the firmware is idle when wcn36xx_dxe_tx_frame() is entered,
> this is all fine. However, when sending many packets in a short time
> frame, there are cases when the firmware is still processing the ring
> buffer (iow, ch->reg_ctrl & WCN36xx_DXE_CH_CTRL_EN != 0), and in this
> case, writes to the shared memory area depict a data race. The local
> spinlock of course doesn't help to prevent that. OTOH, it should be
> completely fine to modify the descriptors while firmware is still
> reading them, as the firmware should only pay attention to such that
> are marked valid.
>
> There's a problem with the latter presumption however which looks like
> this in the driver code:
>
> desc->fr_len = ctl->skb->len;
> /* set dxe descriptor to VALID */
> desc->ctrl = ch->ctrl_skb;
>
The firmware won't start reading the descriptors until it receives an
interrupt from driver.
which in turn happen later in the function using: wcn36xx_dxe_write_register()
so I don't think reordering in this case make any problem.
> The CPU may very well reorder the two writes, even though the memory is
> allocated as coherent DMA. When that happens, the firmware may see a
> wrong length for a valid descriptor. A simple memory write barrier would
> suffice to solve this, but then again, there are two descriptors
> involved.
>
> My attempt to fix that restructures the code a bit and makes the
> payload descriptor valid first and then the control descriptor, both
> strongly ordered through memory fences. This however assumes that the
> firmware will ignore valid payload descriptors unless they have a
> valid control descriptor preceding them, but that's really just
> guessing.
>
> Does that make sense? As I said, I can't really say this improves
> anything, sadly, so I might be mistaken entirely. But I'll leave this
> here for further discussion. Ideally, somebody with access to the
> firmware sources could give an assessment whether this is an issue at
> all or not.
When I'm not sure regarding some implementation I consult with the
downstream pronto driver.
https://github.com/sultanqasim/qcom_wlan_prima
Did you look there if they actually placed wmb() ?
>
>
> Thanks,
> Daniel
>
>
> Daniel Mack (1):
> wcn36xx: fix buffer commit logic on TX path
>
> drivers/net/wireless/ath/wcn36xx/dxe.c | 75 +++++++++++++++++-----------------
> 1 file changed, 38 insertions(+), 37 deletions(-)
>
> --
> 2.14.3
>
Hi Daniel,
> /* Move the head of the ring to the next empty descriptor */
> - ch->head_blk_ctl = ctl->next;
> + ch->head_blk_ctl = ctl_skb->next;
> +
> + /* Commit all previous writes and set descriptors to VALID */
> + wmb();
Is this first memory barrier really needed? from what I understand, we
only need to ensure that the control descriptor is marked valid at the
end of the procedure and we don't really care about the paylod one.
> + desc_skb->ctrl = ch->ctrl_skb;
> + wmb();
> + desc_bd->ctrl = ch->ctrl_bd;
>
> /*
> * When connected and trying to send data frame chip can be in sleep
Otherwise, patch makes sense.
Regards,
Loic
When wcn36xx_dxe_tx_frame() is entered while the device is still processing
the queue asyncronously, we are racing against the firmware code with
updates to the buffer descriptors. Presumably, the firmware scans the ring
buffer that holds the descriptors and scans for a valid control descriptor,
and then assumes that the next descriptor contains the payload. If, however,
the control descriptor is marked valid, but the payload descriptor isn't,
the packet is not sent out.
Another issue with the current code is that is lacks memory barriers before
descriptors are marked valid. This is important because the CPU may reorder
writes to memory, even if it is allocated as coherent DMA area, and hence
the device may see incompletely written data.
To fix this, the code in wcn36xx_dxe_tx_frame() was restructured a bit so
that the payload descriptor is made valid before the control descriptor.
Memory barriers are added to ensure coherency of shared memory areas.
Signed-off-by: Daniel Mack <[email protected]>
---
drivers/net/wireless/ath/wcn36xx/dxe.c | 75 +++++++++++++++++-----------------
1 file changed, 38 insertions(+), 37 deletions(-)
diff --git a/drivers/net/wireless/ath/wcn36xx/dxe.c b/drivers/net/wireless/ath/wcn36xx/dxe.c
index 26e6c42f886a..cb93545a42ce 100644
--- a/drivers/net/wireless/ath/wcn36xx/dxe.c
+++ b/drivers/net/wireless/ath/wcn36xx/dxe.c
@@ -638,8 +638,8 @@ int wcn36xx_dxe_tx_frame(struct wcn36xx *wcn,
struct sk_buff *skb,
bool is_low)
{
- struct wcn36xx_dxe_ctl *ctl = NULL;
- struct wcn36xx_dxe_desc *desc = NULL;
+ struct wcn36xx_dxe_desc *desc_bd, *desc_skb;
+ struct wcn36xx_dxe_ctl *ctl_bd, *ctl_skb;
struct wcn36xx_dxe_ch *ch = NULL;
unsigned long flags;
int ret;
@@ -647,74 +647,75 @@ int wcn36xx_dxe_tx_frame(struct wcn36xx *wcn,
ch = is_low ? &wcn->dxe_tx_l_ch : &wcn->dxe_tx_h_ch;
spin_lock_irqsave(&ch->lock, flags);
- ctl = ch->head_blk_ctl;
+ ctl_bd = ch->head_blk_ctl;
+ ctl_skb = ctl_bd->next;
/*
* If skb is not null that means that we reached the tail of the ring
* hence ring is full. Stop queues to let mac80211 back off until ring
* has an empty slot again.
*/
- if (NULL != ctl->next->skb) {
+ if (NULL != ctl_skb->skb) {
ieee80211_stop_queues(wcn->hw);
wcn->queues_stopped = true;
spin_unlock_irqrestore(&ch->lock, flags);
return -EBUSY;
}
- ctl->skb = NULL;
- desc = ctl->desc;
+ if (unlikely(ctl_skb->bd_cpu_addr)) {
+ wcn36xx_err("bd_cpu_addr cannot be NULL for skb DXE\n");
+ ret = -EINVAL;
+ goto unlock;
+ }
+
+ desc_bd = ctl_bd->desc;
+ desc_skb = ctl_skb->desc;
+
+ ctl_bd->skb = NULL;
/* write buffer descriptor */
- memcpy(ctl->bd_cpu_addr, bd, sizeof(*bd));
+ memcpy(ctl_bd->bd_cpu_addr, bd, sizeof(*bd));
/* Set source address of the BD we send */
- desc->src_addr_l = ctl->bd_phy_addr;
-
- desc->dst_addr_l = ch->dxe_wq;
- desc->fr_len = sizeof(struct wcn36xx_tx_bd);
- desc->ctrl = ch->ctrl_bd;
+ desc_bd->src_addr_l = ctl_bd->bd_phy_addr;
+ desc_bd->dst_addr_l = ch->dxe_wq;
+ desc_bd->fr_len = sizeof(struct wcn36xx_tx_bd);
wcn36xx_dbg(WCN36XX_DBG_DXE, "DXE TX\n");
wcn36xx_dbg_dump(WCN36XX_DBG_DXE_DUMP, "DESC1 >>> ",
- (char *)desc, sizeof(*desc));
+ (char *)desc_bd, sizeof(*desc_bd));
wcn36xx_dbg_dump(WCN36XX_DBG_DXE_DUMP,
- "BD >>> ", (char *)ctl->bd_cpu_addr,
+ "BD >>> ", (char *)ctl_bd->bd_cpu_addr,
sizeof(struct wcn36xx_tx_bd));
- /* Set source address of the SKB we send */
- ctl = ctl->next;
- desc = ctl->desc;
- if (ctl->bd_cpu_addr) {
- wcn36xx_err("bd_cpu_addr cannot be NULL for skb DXE\n");
- ret = -EINVAL;
- goto unlock;
- }
-
- desc->src_addr_l = dma_map_single(wcn->dev,
- skb->data,
- skb->len,
- DMA_TO_DEVICE);
- if (dma_mapping_error(wcn->dev, desc->src_addr_l)) {
+ desc_skb->src_addr_l = dma_map_single(wcn->dev,
+ skb->data,
+ skb->len,
+ DMA_TO_DEVICE);
+ if (dma_mapping_error(wcn->dev, desc_skb->src_addr_l)) {
dev_err(wcn->dev, "unable to DMA map src_addr_l\n");
ret = -ENOMEM;
goto unlock;
}
- ctl->skb = skb;
- desc->dst_addr_l = ch->dxe_wq;
- desc->fr_len = ctl->skb->len;
-
- /* set dxe descriptor to VALID */
- desc->ctrl = ch->ctrl_skb;
+ ctl_skb->skb = skb;
+ desc_skb->dst_addr_l = ch->dxe_wq;
+ desc_skb->fr_len = ctl_skb->skb->len;
wcn36xx_dbg_dump(WCN36XX_DBG_DXE_DUMP, "DESC2 >>> ",
- (char *)desc, sizeof(*desc));
+ (char *)desc_skb, sizeof(*desc_skb));
wcn36xx_dbg_dump(WCN36XX_DBG_DXE_DUMP, "SKB >>> ",
- (char *)ctl->skb->data, ctl->skb->len);
+ (char *)ctl_skb->skb->data, ctl_skb->skb->len);
/* Move the head of the ring to the next empty descriptor */
- ch->head_blk_ctl = ctl->next;
+ ch->head_blk_ctl = ctl_skb->next;
+
+ /* Commit all previous writes and set descriptors to VALID */
+ wmb();
+ desc_skb->ctrl = ch->ctrl_skb;
+ wmb();
+ desc_bd->ctrl = ch->ctrl_bd;
/*
* When connected and trying to send data frame chip can be in sleep
--
2.14.3
Daniel Mack <[email protected]> writes:
> Hi Loic,
>
> On Wednesday, April 11, 2018 03:30 PM, Loic Poulain wrote:
>>> /* Move the head of the ring to the next empty descriptor */
>>> - ch->head_blk_ctl = ctl->next;
>>> + ch->head_blk_ctl = ctl_skb->next;
>>> +
>>> + /* Commit all previous writes and set descriptors to VALID */
>>> + wmb();
>>
>> Is this first memory barrier really needed? from what I understand, we
>> only need to ensure that the control descriptor is marked valid at the
>> end of the procedure and we don't really care about the paylod one.
>>
>>> + desc_skb->ctrl = ch->ctrl_skb;
>>> + wmb();
>>> + desc_bd->ctrl = ch->ctrl_bd;
>>>
>>> /*
>>> * When connected and trying to send data frame chip can be in sleep
>>
>> Otherwise, patch makes sense.
>
> This patch has somehow vanished from patchwork.
It's still in patchwork, just in state RFC:
https://patchwork.kernel.org/patch/10333553/
--
Kalle Valo
Hi Loic,
On Wednesday, April 11, 2018 03:30 PM, Loic Poulain wrote:
>> /* Move the head of the ring to the next empty descriptor */
>> - ch->head_blk_ctl = ctl->next;
>> + ch->head_blk_ctl = ctl_skb->next;
>> +
>> + /* Commit all previous writes and set descriptors to VALID */
>> + wmb();
>
> Is this first memory barrier really needed? from what I understand, we
> only need to ensure that the control descriptor is marked valid at the
> end of the procedure and we don't really care about the paylod one.
Well, without documentation or the firmware sources, that's just
guesswork at this point. My assumption is only based on the weird
comments and workarounds in the downstream driver.
I added the second barrier to ensure that no descriptor is ever marked
valid unless all other bits are definitely in sync.
Thanks,
Daniel
Hi Loic,
On Wednesday, April 11, 2018 03:30 PM, Loic Poulain wrote:
>> /* Move the head of the ring to the next empty descriptor */
>> - ch->head_blk_ctl = ctl->next;
>> + ch->head_blk_ctl = ctl_skb->next;
>> +
>> + /* Commit all previous writes and set descriptors to VALID */
>> + wmb();
>
> Is this first memory barrier really needed? from what I understand, we
> only need to ensure that the control descriptor is marked valid at the
> end of the procedure and we don't really care about the paylod one.
>
>> + desc_skb->ctrl = ch->ctrl_skb;
>> + wmb();
>> + desc_bd->ctrl = ch->ctrl_bd;
>>
>> /*
>> * When connected and trying to send data frame chip can be in sleep
>
> Otherwise, patch makes sense.
This patch has somehow vanished from patchwork. I am running it since I
posted it and I can't see it causing any regressions, so I'd like to
repost. Given that you seem to be in favor of it, can I have your
Reviewed-by?
Thanks,
Daniel
On 11 April 2018 at 15:37, Daniel Mack <[email protected]> wrote:
> Hi Loic,
>
> On Wednesday, April 11, 2018 03:30 PM, Loic Poulain wrote:
>>> /* Move the head of the ring to the next empty descriptor */
>>> - ch->head_blk_ctl = ctl->next;
>>> + ch->head_blk_ctl = ctl_skb->next;
>>> +
>>> + /* Commit all previous writes and set descriptors to VALID */
>>> + wmb();
>>
>> Is this first memory barrier really needed? from what I understand, we
>> only need to ensure that the control descriptor is marked valid at the
>> end of the procedure and we don't really care about the paylod one.
>
> Well, without documentation or the firmware sources, that's just
> guesswork at this point. My assumption is only based on the weird
> comments and workarounds in the downstream driver.
>
> I added the second barrier to ensure that no descriptor is ever marked
> valid unless all other bits are definitely in sync.
Fair enough!
Hi Ramon,
On Tuesday, April 10, 2018 08:11 PM, Ramon Fried wrote:
> On 10 April 2018 at 20:42, Daniel Mack <[email protected]> wrote:
>> I found something that I believe might be an issue, and I have an
>> idea on how to correct this, but unfortunately, this doesn't solve
>> the issues I occasionally see with this driver. I'd still like to
>> share it, because I might be totally mistaken in my understanding.
>
> Thanks for sharing you idea. Are you aware of the recent patch I sent
> that Loic Poulain
> wrote that fixes a race issue in access to wcn36xx_dxe_tx_frame() ?
> Kalle just recently applied it to ath-next, I don't think it's
> available yet upstream.
Yes, my patch builds upon yours.
> The patch was fixing something similar, perhaps it will solve the
> issue you're experiencing.
I'm not even sure what kind of effect I'm hunting, so it's hard to tell.
Your patch definitely addresses a data race too though.
>> There's a problem with the latter presumption however which looks like
>> this in the driver code:
>>
>> desc->fr_len = ctl->skb->len;
>> /* set dxe descriptor to VALID */
>> desc->ctrl = ch->ctrl_skb;
>
> The firmware won't start reading the descriptors until it receives an
> interrupt from driver.
> which in turn happen later in the function using: wcn36xx_dxe_write_register()
> so I don't think reordering in this case make any problem.
I understand, but as I said, there's definitely the problem that the
channel is already running when wcn36xx_dxe_tx_frame() is entered. Try
adding this at the beginning of the the function and then run iperf:
int reg;
wcn36xx_dxe_read_register(wcn, ch->reg_ctrl, ®);
WARN_ON(reg & WCN36xx_DXE_CH_CTRL_EN);
I fail to see how the firmware would determine which descriptors to look
at without any type of synchronization mechanism.
>> Does that make sense? As I said, I can't really say this improves
>> anything, sadly, so I might be mistaken entirely. But I'll leave this
>> here for further discussion. Ideally, somebody with access to the
>> firmware sources could give an assessment whether this is an issue at
>> all or not.
>
> When I'm not sure regarding some implementation I consult with the
> downstream pronto driver.
> https://github.com/sultanqasim/qcom_wlan_prima
>
> Did you look there if they actually placed wmb() ?
Yes, I've read some of that as well. They use wmb() before writing to
and rmb() after reading firmware registers. The equivalent upstream is
wcn36xx_dxe_[read,write}_register(), and they use writel() and readl()
which have the same barriers on arm64. So that's fine.
What's interesting though is that the downstream drivers sets the VLD
bit of the first descriptor of the chain *after* they set all the
others. There are also some confusing comments about that (/* First
frame not set VAL bit, why ??? */). Can you make sense of that?
Thanks,
Daniel