2022-11-15 15:29:39

by Zhang Changzhong

[permalink] [raw]
Subject: [PATCH net v2 0/3] three fixes for nixge driver

This patchset try to fix some bugs in nixge driver find by code review.
Compile tested only as I don't have the hardware to test them.

v1->v2:
- Split tx queue handling and error handling into two patches
- Add new patch to fix buffer descriptor overwriting issue

v1:
- Link: https://lore.kernel.org/lkml/[email protected]/

Zhang Changzhong (3):
net: nixge: fix potential memory leak in nixge_start_xmit()
net: nixge: avoid overwriting buffer descriptor
net: nixge: fix tx queue handling

drivers/net/ethernet/ni/nixge.c | 58 +++++++++++++++++++++++++++++------------
1 file changed, 41 insertions(+), 17 deletions(-)

--
2.9.5



2022-11-15 15:31:34

by Zhang Changzhong

[permalink] [raw]
Subject: [PATCH net v2 3/3] net: nixge: fix tx queue handling

Currently the driver check for available space at the beginning of
nixge_start_xmit(), and when there is not enough space for this packet,
it returns NETDEV_TX_OK, which casues packet loss and memory leak.

Instead the queue should be stopped after the packet is added to the BD
when there may not be enough space for next packet. In addition, the
queue should be wakeup only if there is enough space for a packet with
max frags.

Fixes: 492caffa8a1a ("net: ethernet: nixge: Add support for National Instruments XGE netdev")
Signed-off-by: Zhang Changzhong <[email protected]>
---
drivers/net/ethernet/ni/nixge.c | 54 +++++++++++++++++++++++++++++------------
1 file changed, 38 insertions(+), 16 deletions(-)

diff --git a/drivers/net/ethernet/ni/nixge.c b/drivers/net/ethernet/ni/nixge.c
index 91b7ebc..3776a03 100644
--- a/drivers/net/ethernet/ni/nixge.c
+++ b/drivers/net/ethernet/ni/nixge.c
@@ -457,6 +457,17 @@ static void nixge_tx_skb_unmap(struct nixge_priv *priv,
}
}

+static int nixge_check_tx_bd_space(struct nixge_priv *priv,
+ int num_frag)
+{
+ struct nixge_hw_dma_bd *cur_p;
+
+ cur_p = &priv->tx_bd_v[(priv->tx_bd_tail + num_frag) % TX_BD_NUM];
+ if (cur_p->status & XAXIDMA_BD_STS_ALL_MASK)
+ return NETDEV_TX_BUSY;
+ return 0;
+}
+
static void nixge_start_xmit_done(struct net_device *ndev)
{
struct nixge_priv *priv = netdev_priv(ndev);
@@ -488,19 +499,13 @@ static void nixge_start_xmit_done(struct net_device *ndev)
ndev->stats.tx_packets += packets;
ndev->stats.tx_bytes += size;

- if (packets)
- netif_wake_queue(ndev);
-}
-
-static int nixge_check_tx_bd_space(struct nixge_priv *priv,
- int num_frag)
-{
- struct nixge_hw_dma_bd *cur_p;
+ if (packets) {
+ /* Matches barrier in nixge_start_xmit */
+ smp_mb();

- cur_p = &priv->tx_bd_v[(priv->tx_bd_tail + num_frag) % TX_BD_NUM];
- if (cur_p->status & XAXIDMA_BD_STS_ALL_MASK)
- return NETDEV_TX_BUSY;
- return 0;
+ if (!nixge_check_tx_bd_space(priv, MAX_SKB_FRAGS + 1))
+ netif_wake_queue(ndev);
+ }
}

static netdev_tx_t nixge_start_xmit(struct sk_buff *skb,
@@ -518,10 +523,15 @@ static netdev_tx_t nixge_start_xmit(struct sk_buff *skb,
cur_p = &priv->tx_bd_v[priv->tx_bd_tail];
tx_skb = &priv->tx_skb[priv->tx_bd_tail];

- if (nixge_check_tx_bd_space(priv, num_frag + 1)) {
- if (!netif_queue_stopped(ndev))
- netif_stop_queue(ndev);
- return NETDEV_TX_OK;
+ if (unlikely(nixge_check_tx_bd_space(priv, num_frag + 1))) {
+ /* Should not happen as last start_xmit call should have
+ * checked for sufficient space and queue should only be
+ * woken when sufficient space is available.
+ */
+ netif_stop_queue(ndev);
+ if (net_ratelimit())
+ netdev_err(ndev, "BUG! TX Ring full when queue awake!\n");
+ return NETDEV_TX_BUSY;
}

cur_phys = dma_map_single(ndev->dev.parent, skb->data,
@@ -572,6 +582,18 @@ static netdev_tx_t nixge_start_xmit(struct sk_buff *skb,
++priv->tx_bd_tail;
priv->tx_bd_tail %= TX_BD_NUM;

+ /* Stop queue if next transmit may not have space */
+ if (nixge_check_tx_bd_space(priv, MAX_SKB_FRAGS + 1)) {
+ netif_stop_queue(ndev);
+
+ /* Matches barrier in nixge_start_xmit_done */
+ smp_mb();
+
+ /* Space might have just been freed - check again */
+ if (!nixge_check_tx_bd_space(priv, MAX_SKB_FRAGS + 1))
+ netif_wake_queue(ndev);
+ }
+
return NETDEV_TX_OK;
frag_err:
for (; ii > 0; ii--) {
--
2.9.5


2022-11-15 23:15:05

by Francois Romieu

[permalink] [raw]
Subject: Re: [PATCH net v2 3/3] net: nixge: fix tx queue handling

Zhang Changzhong <[email protected]> :
> Currently the driver check for available space at the beginning of
> nixge_start_xmit(), and when there is not enough space for this packet,
> it returns NETDEV_TX_OK, which casues packet loss and memory leak.
>
> Instead the queue should be stopped after the packet is added to the BD
> when there may not be enough space for next packet. In addition, the
> queue should be wakeup only if there is enough space for a packet with
> max frags.
>
> Fixes: 492caffa8a1a ("net: ethernet: nixge: Add support for National Instruments XGE netdev")
> Signed-off-by: Zhang Changzhong <[email protected]>
> ---
> drivers/net/ethernet/ni/nixge.c | 54 +++++++++++++++++++++++++++++------------
> 1 file changed, 38 insertions(+), 16 deletions(-)
>
> diff --git a/drivers/net/ethernet/ni/nixge.c b/drivers/net/ethernet/ni/nixge.c
> index 91b7ebc..3776a03 100644
> --- a/drivers/net/ethernet/ni/nixge.c
> +++ b/drivers/net/ethernet/ni/nixge.c
[...]
> static netdev_tx_t nixge_start_xmit(struct sk_buff *skb,
> @@ -518,10 +523,15 @@ static netdev_tx_t nixge_start_xmit(struct sk_buff *skb,
> cur_p = &priv->tx_bd_v[priv->tx_bd_tail];
> tx_skb = &priv->tx_skb[priv->tx_bd_tail];
>
> - if (nixge_check_tx_bd_space(priv, num_frag + 1)) {
> - if (!netif_queue_stopped(ndev))
> - netif_stop_queue(ndev);
> - return NETDEV_TX_OK;
> + if (unlikely(nixge_check_tx_bd_space(priv, num_frag + 1))) {
> + /* Should not happen as last start_xmit call should have
> + * checked for sufficient space and queue should only be
> + * woken when sufficient space is available.
> + */

Almost. IRQ triggering after nixge_start_xmit::netif_stop_queue and
before nixge_start_xmit::smp_mb may wrongly wake queue.

Call me timorous but I would feel more confortable if this code could
be tested on real hardware before being fed into -net.

--
Ueimor

2022-11-16 10:26:44

by Zhang Changzhong

[permalink] [raw]
Subject: Re: [PATCH net v2 3/3] net: nixge: fix tx queue handling

On 2022/11/16 7:04, Francois Romieu wrote:
> Zhang Changzhong <[email protected]> :
>> Currently the driver check for available space at the beginning of
>> nixge_start_xmit(), and when there is not enough space for this packet,
>> it returns NETDEV_TX_OK, which casues packet loss and memory leak.
>>
>> Instead the queue should be stopped after the packet is added to the BD
>> when there may not be enough space for next packet. In addition, the
>> queue should be wakeup only if there is enough space for a packet with
>> max frags.
>>
>> Fixes: 492caffa8a1a ("net: ethernet: nixge: Add support for National Instruments XGE netdev")
>> Signed-off-by: Zhang Changzhong <[email protected]>
>> ---
>> drivers/net/ethernet/ni/nixge.c | 54 +++++++++++++++++++++++++++++------------
>> 1 file changed, 38 insertions(+), 16 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/ni/nixge.c b/drivers/net/ethernet/ni/nixge.c
>> index 91b7ebc..3776a03 100644
>> --- a/drivers/net/ethernet/ni/nixge.c
>> +++ b/drivers/net/ethernet/ni/nixge.c
> [...]
>> static netdev_tx_t nixge_start_xmit(struct sk_buff *skb,
>> @@ -518,10 +523,15 @@ static netdev_tx_t nixge_start_xmit(struct sk_buff *skb,
>> cur_p = &priv->tx_bd_v[priv->tx_bd_tail];
>> tx_skb = &priv->tx_skb[priv->tx_bd_tail];
>>
>> - if (nixge_check_tx_bd_space(priv, num_frag + 1)) {
>> - if (!netif_queue_stopped(ndev))
>> - netif_stop_queue(ndev);
>> - return NETDEV_TX_OK;
>> + if (unlikely(nixge_check_tx_bd_space(priv, num_frag + 1))) {
>> + /* Should not happen as last start_xmit call should have
>> + * checked for sufficient space and queue should only be
>> + * woken when sufficient space is available.
>> + */
>
> Almost. IRQ triggering after nixge_start_xmit::netif_stop_queue and
> before nixge_start_xmit::smp_mb may wrongly wake queue.
>

I don't know what you mean by "wronly wake queue". The queue is woken
only when there is sufficient for next packet.

> Call me timorous but I would feel more confortable if this code could
> be tested on real hardware before being fed into -net.
>

I agree with you, hope someone can test and correct it.

Thanks,
Changzhong

2022-11-16 11:27:10

by Francois Romieu

[permalink] [raw]
Subject: Re: [PATCH net v2 3/3] net: nixge: fix tx queue handling

Zhang Changzhong <[email protected]> :
> On 2022/11/16 7:04, Francois Romieu wrote:
> > Zhang Changzhong <[email protected]> :
[...]
> >> diff --git a/drivers/net/ethernet/ni/nixge.c b/drivers/net/ethernet/ni/nixge.c
> >> index 91b7ebc..3776a03 100644
> >> --- a/drivers/net/ethernet/ni/nixge.c
> >> +++ b/drivers/net/ethernet/ni/nixge.c
> > [...]
> >> static netdev_tx_t nixge_start_xmit(struct sk_buff *skb,
> >> @@ -518,10 +523,15 @@ static netdev_tx_t nixge_start_xmit(struct sk_buff *skb,
> >> cur_p = &priv->tx_bd_v[priv->tx_bd_tail];
> >> tx_skb = &priv->tx_skb[priv->tx_bd_tail];
> >>
> >> - if (nixge_check_tx_bd_space(priv, num_frag + 1)) {
> >> - if (!netif_queue_stopped(ndev))
> >> - netif_stop_queue(ndev);
> >> - return NETDEV_TX_OK;
> >> + if (unlikely(nixge_check_tx_bd_space(priv, num_frag + 1))) {
> >> + /* Should not happen as last start_xmit call should have
> >> + * checked for sufficient space and queue should only be
> >> + * woken when sufficient space is available.
> >> + */
> >
> > Almost. IRQ triggering after nixge_start_xmit::netif_stop_queue and
> > before nixge_start_xmit::smp_mb may wrongly wake queue.
> >
>
> I don't know what you mean by "wronly wake queue". The queue is woken
> only when there is sufficient for next packet.

Between nixge_start_xmit::netif_stop_queue and nixge_start_xmit::smp_mb,
"next" packet is current packet in hard_start_xmit. However said current
packet may not be accounted for in the IRQ context transmit completion
handler.

[nixge_start_xmit]

++priv->tx_bd_tail;
priv->tx_bd_tail %= TX_BD_NUM;

/* Stop queue if next transmit may not have space */
if (nixge_check_tx_bd_space(priv, MAX_SKB_FRAGS + 1)) {
netif_stop_queue(ndev);

Which value does [nixge_start_xmit_done] read as priv->tx_bd_tail at
this point ? The value set a few lines above or some older value ?

/* Matches barrier in nixge_start_xmit_done */
smp_mb();

/* Space might have just been freed - check again */
if (!nixge_check_tx_bd_space(priv, MAX_SKB_FRAGS + 1))
netif_wake_queue(ndev);
}

--
Ueimor

2022-11-16 12:53:13

by Zhang Changzhong

[permalink] [raw]
Subject: Re: [PATCH net v2 3/3] net: nixge: fix tx queue handling

On 2022/11/16 18:27, Francois Romieu wrote:
> Zhang Changzhong <[email protected]> :
>> On 2022/11/16 7:04, Francois Romieu wrote:
>>> Zhang Changzhong <[email protected]> :
> [...]
>>>> diff --git a/drivers/net/ethernet/ni/nixge.c b/drivers/net/ethernet/ni/nixge.c
>>>> index 91b7ebc..3776a03 100644
>>>> --- a/drivers/net/ethernet/ni/nixge.c
>>>> +++ b/drivers/net/ethernet/ni/nixge.c
>>> [...]
>>>> static netdev_tx_t nixge_start_xmit(struct sk_buff *skb,
>>>> @@ -518,10 +523,15 @@ static netdev_tx_t nixge_start_xmit(struct sk_buff *skb,
>>>> cur_p = &priv->tx_bd_v[priv->tx_bd_tail];
>>>> tx_skb = &priv->tx_skb[priv->tx_bd_tail];
>>>>
>>>> - if (nixge_check_tx_bd_space(priv, num_frag + 1)) {
>>>> - if (!netif_queue_stopped(ndev))
>>>> - netif_stop_queue(ndev);
>>>> - return NETDEV_TX_OK;
>>>> + if (unlikely(nixge_check_tx_bd_space(priv, num_frag + 1))) {
>>>> + /* Should not happen as last start_xmit call should have
>>>> + * checked for sufficient space and queue should only be
>>>> + * woken when sufficient space is available.
>>>> + */
>>>
>>> Almost. IRQ triggering after nixge_start_xmit::netif_stop_queue and
>>> before nixge_start_xmit::smp_mb may wrongly wake queue.
>>>
>>
>> I don't know what you mean by "wronly wake queue". The queue is woken
>> only when there is sufficient for next packet.
>
> Between nixge_start_xmit::netif_stop_queue and nixge_start_xmit::smp_mb,
> "next" packet is current packet in hard_start_xmit. However said current
> packet may not be accounted for in the IRQ context transmit completion
> handler.
>
> [nixge_start_xmit]
>
> ++priv->tx_bd_tail;
> priv->tx_bd_tail %= TX_BD_NUM;
>
> /* Stop queue if next transmit may not have space */
> if (nixge_check_tx_bd_space(priv, MAX_SKB_FRAGS + 1)) {
> netif_stop_queue(ndev);
>
> Which value does [nixge_start_xmit_done] read as priv->tx_bd_tail at
> this point ? The value set a few lines above or some older value ?
>
> /* Matches barrier in nixge_start_xmit_done */
> smp_mb();
>
> /* Space might have just been freed - check again */
> if (!nixge_check_tx_bd_space(priv, MAX_SKB_FRAGS + 1))
> netif_wake_queue(ndev);
> }
>

Got it! Thanks a lot for your detailed explanation.

Best Regards,
Changzhong