2017-06-06 07:25:22

by Niklas Cassel

[permalink] [raw]
Subject: [PATCH net] net: stmmac: fix completely hung TX when using TSO

stmmac_tso_allocator can fail to set the Last Descriptor bit
on a descriptor that actually was the last descriptor.

This happens when the buffer of the last descriptor ends
up having a size of exactly TSO_MAX_BUFF_SIZE.

When the IP eventually reaches the next last descriptor,
which actually has the bit set, the DMA will hang.

When the DMA hangs, we get a tx timeout, however,
since stmmac does not do a complete reset of the IP
in stmmac_tx_timeout, we end up in a state with
completely hung TX.

Signed-off-by: Niklas Cassel <[email protected]>
---
drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
index 68a188e74c54..440bea049a7f 100644
--- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
+++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
@@ -2723,7 +2723,7 @@ static void stmmac_tso_allocator(struct stmmac_priv *priv, unsigned int des,

priv->hw->desc->prepare_tso_tx_desc(desc, 0, buff_size,
0, 1,
- (last_segment) && (buff_size < TSO_MAX_BUFF_SIZE),
+ (last_segment) && (tmp_len <= TSO_MAX_BUFF_SIZE),
0, 0);

tmp_len -= TSO_MAX_BUFF_SIZE;
--
2.11.0


2017-06-06 08:00:46

by Peppe CAVALLARO

[permalink] [raw]
Subject: Re: [PATCH net] net: stmmac: fix completely hung TX when using TSO

Hi Niklas

I get the point and I acked the patch but Alex, please, can you confirm
that this issue has never seen on your boxes where the TSO has been
fully tested? The initial development (commit f748be531) introduces
the following:
(last_segment) && (buff_size < TSO_MAX_BUFF_SIZE),
...

On 6/6/2017 9:25 AM, Niklas Cassel wrote:
> stmmac_tso_allocator can fail to set the Last Descriptor bit
> on a descriptor that actually was the last descriptor.
>
> This happens when the buffer of the last descriptor ends
> up having a size of exactly TSO_MAX_BUFF_SIZE.
>
> When the IP eventually reaches the next last descriptor,
> which actually has the bit set, the DMA will hang.
>
> When the DMA hangs, we get a tx timeout, however,
> since stmmac does not do a complete reset of the IP
> in stmmac_tx_timeout, we end up in a state with
> completely hung TX.
>
> Signed-off-by: Niklas Cassel <[email protected]>

Acked-by: Giuseppe Cavallaro <[email protected]>

> ---
> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index 68a188e74c54..440bea049a7f 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -2723,7 +2723,7 @@ static void stmmac_tso_allocator(struct stmmac_priv *priv, unsigned int des,
>
> priv->hw->desc->prepare_tso_tx_desc(desc, 0, buff_size,
> 0, 1,
> - (last_segment) && (buff_size < TSO_MAX_BUFF_SIZE),
> + (last_segment) && (tmp_len <= TSO_MAX_BUFF_SIZE),
> 0, 0);
>
> tmp_len -= TSO_MAX_BUFF_SIZE;

Regards
Peppe


2017-06-06 09:10:29

by Alexandre Torgue

[permalink] [raw]
Subject: Re: [PATCH net] net: stmmac: fix completely hung TX when using TSO

Hi Guys,

On 06/06/2017 10:00 AM, Giuseppe CAVALLARO wrote:
> Hi Niklas
>
> I get the point and I acked the patch but Alex, please, can you confirm
> that this issue has never seen on your boxes where the TSO has been
> fully tested? The initial development (commit f748be531) introduces
> the following:
> (last_segment) && (buff_size < TSO_MAX_BUFF_SIZE),

I don't remember to have seen this kind of issue in the past but for
sure I agree with this patch.

Acked-by: Alexandre TORGUE <[email protected]>

> ...
>
> On 6/6/2017 9:25 AM, Niklas Cassel wrote:
>> stmmac_tso_allocator can fail to set the Last Descriptor bit
>> on a descriptor that actually was the last descriptor.
>>
>> This happens when the buffer of the last descriptor ends
>> up having a size of exactly TSO_MAX_BUFF_SIZE.
>>
>> When the IP eventually reaches the next last descriptor,
>> which actually has the bit set, the DMA will hang.
>>
>> When the DMA hangs, we get a tx timeout, however,
>> since stmmac does not do a complete reset of the IP
>> in stmmac_tx_timeout, we end up in a state with
>> completely hung TX.
>>
>> Signed-off-by: Niklas Cassel <[email protected]>
>
> Acked-by: Giuseppe Cavallaro <[email protected]>
>
>> ---
>> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> index 68a188e74c54..440bea049a7f 100644
>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> @@ -2723,7 +2723,7 @@ static void stmmac_tso_allocator(struct
>> stmmac_priv *priv, unsigned int des,
>> priv->hw->desc->prepare_tso_tx_desc(desc, 0, buff_size,
>> 0, 1,
>> - (last_segment) && (buff_size < TSO_MAX_BUFF_SIZE),
>> + (last_segment) && (tmp_len <= TSO_MAX_BUFF_SIZE),
>> 0, 0);
>> tmp_len -= TSO_MAX_BUFF_SIZE;
>
> Regards
> Peppe
>
>

2017-06-06 10:01:22

by Niklas Cassel

[permalink] [raw]
Subject: Re: [PATCH net] net: stmmac: fix completely hung TX when using TSO

On 06/06/2017 10:00 AM, Giuseppe CAVALLARO wrote:
> Hi Niklas

Hello Peppe, Alex

>
> I get the point and I acked the patch but Alex, please, can you confirm
> that this issue has never seen on your boxes where the TSO has been
> fully tested? The initial development (commit f748be531) introduces
> the following:
> (last_segment) && (buff_size < TSO_MAX_BUFF_SIZE),
> ...

Since you need a packet where the txbuff size of last descriptor
has the exact size of 16K-1, depending on your workload,
it is possible to run your system for a very long time without
triggering the bug.

We noticed the problem since a test in our test suite fetches
the syslog. The syslog usually has a size that is not constant,
and the size usually increases with time.
During some of the test runs, far from all of them, we would
eventually end up with a packet of the exact size that triggers
the bug.

It should be possible to trigger the bug more easily if you
know what buffer size to write to your TCP socket, together
with some manual TCP corking.


Regards,
Niklas

>
> On 6/6/2017 9:25 AM, Niklas Cassel wrote:
>> stmmac_tso_allocator can fail to set the Last Descriptor bit
>> on a descriptor that actually was the last descriptor.
>>
>> This happens when the buffer of the last descriptor ends
>> up having a size of exactly TSO_MAX_BUFF_SIZE.
>>
>> When the IP eventually reaches the next last descriptor,
>> which actually has the bit set, the DMA will hang.
>>
>> When the DMA hangs, we get a tx timeout, however,
>> since stmmac does not do a complete reset of the IP
>> in stmmac_tx_timeout, we end up in a state with
>> completely hung TX.
>>
>> Signed-off-by: Niklas Cassel <[email protected]>
>
> Acked-by: Giuseppe Cavallaro <[email protected]>
>
>> ---
>> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 2 +-
>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> index 68a188e74c54..440bea049a7f 100644
>> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
>> @@ -2723,7 +2723,7 @@ static void stmmac_tso_allocator(struct stmmac_priv *priv, unsigned int des,
>> priv->hw->desc->prepare_tso_tx_desc(desc, 0, buff_size,
>> 0, 1,
>> - (last_segment) && (buff_size < TSO_MAX_BUFF_SIZE),
>> + (last_segment) && (tmp_len <= TSO_MAX_BUFF_SIZE),
>> 0, 0);
>> tmp_len -= TSO_MAX_BUFF_SIZE;
>
> Regards
> Peppe
>
>

2017-06-06 14:31:28

by Florian Fainelli

[permalink] [raw]
Subject: Re: [PATCH net] net: stmmac: fix completely hung TX when using TSO



On 06/06/2017 12:25 AM, Niklas Cassel wrote:
> stmmac_tso_allocator can fail to set the Last Descriptor bit
> on a descriptor that actually was the last descriptor.
>
> This happens when the buffer of the last descriptor ends
> up having a size of exactly TSO_MAX_BUFF_SIZE.
>
> When the IP eventually reaches the next last descriptor,
> which actually has the bit set, the DMA will hang.
>
> When the DMA hangs, we get a tx timeout, however,
> since stmmac does not do a complete reset of the IP
> in stmmac_tx_timeout, we end up in a state with
> completely hung TX.
>
> Signed-off-by: Niklas Cassel <[email protected]>

This should have:

Fixes: f748be531d70 ("stmmac: support new GMAC4")

right?

> ---
> drivers/net/ethernet/stmicro/stmmac/stmmac_main.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> index 68a188e74c54..440bea049a7f 100644
> --- a/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> +++ b/drivers/net/ethernet/stmicro/stmmac/stmmac_main.c
> @@ -2723,7 +2723,7 @@ static void stmmac_tso_allocator(struct stmmac_priv *priv, unsigned int des,
>
> priv->hw->desc->prepare_tso_tx_desc(desc, 0, buff_size,
> 0, 1,
> - (last_segment) && (buff_size < TSO_MAX_BUFF_SIZE),
> + (last_segment) && (tmp_len <= TSO_MAX_BUFF_SIZE),
> 0, 0);
>
> tmp_len -= TSO_MAX_BUFF_SIZE;
>

--
Florian

2017-06-06 20:25:10

by David Miller

[permalink] [raw]
Subject: Re: [PATCH net] net: stmmac: fix completely hung TX when using TSO

From: Niklas Cassel <[email protected]>
Date: Tue, 6 Jun 2017 09:25:00 +0200

> stmmac_tso_allocator can fail to set the Last Descriptor bit
> on a descriptor that actually was the last descriptor.
>
> This happens when the buffer of the last descriptor ends
> up having a size of exactly TSO_MAX_BUFF_SIZE.
>
> When the IP eventually reaches the next last descriptor,
> which actually has the bit set, the DMA will hang.
>
> When the DMA hangs, we get a tx timeout, however,
> since stmmac does not do a complete reset of the IP
> in stmmac_tx_timeout, we end up in a state with
> completely hung TX.
>
> Signed-off-by: Niklas Cassel <[email protected]>

Applied and queued up for -stable, thank you.