2022-10-13 05:11:14

by Akihiko Odaki

Subject: [PATCH] e1000e: Fix TX dispatch condition

e1000_xmit_frame is expected to stop the queue and dispatch frames to
hardware if there is not sufficient space for the next frame in the
buffer, but sometimes it failed to do so because the estimated maximum
size of a frame was wrong. As a consequence, a later invocation of
e1000_xmit_frame failed with NETDEV_TX_BUSY, and the frame in the buffer
remained there forever, resulting in a watchdog failure.

This change fixes the estimated size by making it match the
condition for NETDEV_TX_BUSY. Apparently, the old estimation failed to
account for the following lines, which determine the space requirement
for not causing NETDEV_TX_BUSY:
> /* reserve a descriptor for the offload context */
> if ((mss) || (skb->ip_summed == CHECKSUM_PARTIAL))
> 	count++;
> count++;
>
> count += DIV_ROUND_UP(len, adapter->tx_fifo_limit);

This issue was found with http-stress02 test included in Linux Test
Project 20220930.

Signed-off-by: Akihiko Odaki <[email protected]>
---
drivers/net/ethernet/intel/e1000e/netdev.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
index 321f2a95ae3a..da113f5011e9 100644
--- a/drivers/net/ethernet/intel/e1000e/netdev.c
+++ b/drivers/net/ethernet/intel/e1000e/netdev.c
@@ -5936,9 +5936,9 @@ static netdev_tx_t e1000_xmit_frame(struct sk_buff *skb,
 		e1000_tx_queue(tx_ring, tx_flags, count);
 		/* Make sure there is space in the ring for the next send. */
 		e1000_maybe_stop_tx(tx_ring,
-				    (MAX_SKB_FRAGS *
+				    ((MAX_SKB_FRAGS + 1) *
 				     DIV_ROUND_UP(PAGE_SIZE,
-						  adapter->tx_fifo_limit) + 2));
+						  adapter->tx_fifo_limit) + 4));

 		if (!netdev_xmit_more() ||
 		    netif_xmit_stopped(netdev_get_tx_queue(netdev, 0))) {
--
2.37.3


2022-10-27 15:05:43

by G, GurucharanX

Subject: RE: [Intel-wired-lan] [PATCH] e1000e: Fix TX dispatch condition



> -----Original Message-----
> From: Intel-wired-lan <[email protected]> On Behalf Of
> Akihiko Odaki
> Sent: Thursday, October 13, 2022 10:31 AM
> Cc: [email protected]; [email protected]; Yuri Benditovich
> <[email protected]>; Eric Dumazet <[email protected]>;
> Jakub Kicinski <[email protected]>; Yan Vugenfirer <[email protected]>; intel-
> [email protected]; Paolo Abeni <[email protected]>; David S.
> Miller <[email protected]>
> Subject: [Intel-wired-lan] [PATCH] e1000e: Fix TX dispatch condition
>
> e1000_xmit_frame is expected to stop the queue and dispatch frames to
> hardware if there is not sufficient space for the next frame in the buffer, but
> sometimes it failed to do so because the estimated maximum size of a frame
> was wrong. As a consequence, a later invocation of e1000_xmit_frame
> failed with NETDEV_TX_BUSY, and the frame in the buffer remained there forever,
> resulting in a watchdog failure.
>
> This change fixes the estimated size by making it match the condition for
> NETDEV_TX_BUSY. Apparently, the old estimation failed to account for the
> following lines, which determine the space requirement for not causing
> NETDEV_TX_BUSY:
> > /* reserve a descriptor for the offload context */
> > if ((mss) || (skb->ip_summed == CHECKSUM_PARTIAL))
> > 	count++;
> > count++;
> >
> > count += DIV_ROUND_UP(len, adapter->tx_fifo_limit);
>
> This issue was found with http-stress02 test included in Linux Test Project
> 20220930.
>
> Signed-off-by: Akihiko Odaki <[email protected]>
> ---
> drivers/net/ethernet/intel/e1000e/netdev.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>

Tested-by: Gurucharan <[email protected]> (A Contingent worker at Intel)

2022-10-28 11:41:35

by Paul Menzel

Subject: Re: [Intel-wired-lan] [PATCH] e1000e: Fix TX dispatch condition

Dear Akihiko,


Thank you very much for the patch.

On 13.10.22 at 07:00, Akihiko Odaki wrote:
> e1000_xmit_frame is expected to stop the queue and dispatch frames to
> hardware if there is not sufficient space for the next frame in the
> buffer, but sometimes it failed to do so because the estimated maximum
> size of a frame was wrong. As a consequence, a later invocation of
> e1000_xmit_frame failed with NETDEV_TX_BUSY, and the frame in the buffer
> remained there forever, resulting in a watchdog failure.
>
> This change fixes the estimated size by making it match the
> condition for NETDEV_TX_BUSY. Apparently, the old estimation failed to
> account for the following lines, which determine the space requirement
> for not causing NETDEV_TX_BUSY:
>> /* reserve a descriptor for the offload context */
>> if ((mss) || (skb->ip_summed == CHECKSUM_PARTIAL))
>> 	count++;
>> count++;
>>
>> count += DIV_ROUND_UP(len, adapter->tx_fifo_limit);

I’d just use Markdown syntax here: indent the code by four spaces instead
of quoting it with >.

> This issue was found with http-stress02 test included in Linux Test
> Project 20220930.

So it was reproduced in QEMU? For convenience, it’d be great if you
added the QEMU command.

Also, do you know if this is a regression? If so, it’d be great if you
added the Fixes: tag.


Kind regards,

Paul


> Signed-off-by: Akihiko Odaki <[email protected]>
> ---
> drivers/net/ethernet/intel/e1000e/netdev.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/e1000e/netdev.c b/drivers/net/ethernet/intel/e1000e/netdev.c
> index 321f2a95ae3a..da113f5011e9 100644
> --- a/drivers/net/ethernet/intel/e1000e/netdev.c
> +++ b/drivers/net/ethernet/intel/e1000e/netdev.c
> @@ -5936,9 +5936,9 @@ static netdev_tx_t e1000_xmit_frame(struct sk_buff *skb,
> e1000_tx_queue(tx_ring, tx_flags, count);
> /* Make sure there is space in the ring for the next send. */
> e1000_maybe_stop_tx(tx_ring,
> - (MAX_SKB_FRAGS *
> + ((MAX_SKB_FRAGS + 1) *
> DIV_ROUND_UP(PAGE_SIZE,
> - adapter->tx_fifo_limit) + 2));
> + adapter->tx_fifo_limit) + 4));
>
> if (!netdev_xmit_more() ||
> netif_xmit_stopped(netdev_get_tx_queue(netdev, 0))) {

2022-10-30 11:44:03

by naamax.meir

Subject: Re: [Intel-wired-lan] [PATCH] e1000e: Fix TX dispatch condition

On 10/13/2022 08:00, Akihiko Odaki wrote:
> e1000_xmit_frame is expected to stop the queue and dispatch frames to
> hardware if there is not sufficient space for the next frame in the
> buffer, but sometimes it failed to do so because the estimated maximum
> size of a frame was wrong. As a consequence, a later invocation of
> e1000_xmit_frame failed with NETDEV_TX_BUSY, and the frame in the buffer
> remained there forever, resulting in a watchdog failure.
>
> This change fixes the estimated size by making it match the
> condition for NETDEV_TX_BUSY. Apparently, the old estimation failed to
> account for the following lines, which determine the space requirement
> for not causing NETDEV_TX_BUSY:
>> /* reserve a descriptor for the offload context */
>> if ((mss) || (skb->ip_summed == CHECKSUM_PARTIAL))
>> 	count++;
>> count++;
>>
>> count += DIV_ROUND_UP(len, adapter->tx_fifo_limit);
>
> This issue was found with http-stress02 test included in Linux Test
> Project 20220930.
>
> Signed-off-by: Akihiko Odaki <[email protected]>
> ---
> drivers/net/ethernet/intel/e1000e/netdev.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)

Tested-by: Naama Meir <[email protected]>

2022-11-09 11:31:52

by naamax.meir

Subject: Re: [Intel-wired-lan] [PATCH] e1000e: Fix TX dispatch condition

On 10/13/2022 08:00, Akihiko Odaki wrote:
> e1000_xmit_frame is expected to stop the queue and dispatch frames to
> hardware if there is not sufficient space for the next frame in the
> buffer, but sometimes it failed to do so because the estimated maximum
> size of a frame was wrong. As a consequence, a later invocation of
> e1000_xmit_frame failed with NETDEV_TX_BUSY, and the frame in the buffer
> remained there forever, resulting in a watchdog failure.
>
> This change fixes the estimated size by making it match the
> condition for NETDEV_TX_BUSY. Apparently, the old estimation failed to
> account for the following lines, which determine the space requirement
> for not causing NETDEV_TX_BUSY:
>> /* reserve a descriptor for the offload context */
>> if ((mss) || (skb->ip_summed == CHECKSUM_PARTIAL))
>> 	count++;
>> count++;
>>
>> count += DIV_ROUND_UP(len, adapter->tx_fifo_limit);
>
> This issue was found with http-stress02 test included in Linux Test
> Project 20220930.
>
> Signed-off-by: Akihiko Odaki <[email protected]>
> ---
> drivers/net/ethernet/intel/e1000e/netdev.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)

Tested-by: Naama Meir <[email protected]>