2023-11-29 15:54:38

by ChunHao Lin

[permalink] [raw]
Subject: [PATCH net v2] r8169: fix rtl8125b PAUSE frames blasting when suspended

When FIFO reaches near full state, device will issue pause frame.
If pause slot is enabled(set to 1), in this time, device will issue
pause frame only once. But if pause slot is disabled(set to 0), device
will keep sending pause frames until FIFO reaches near empty state.

When pause slot is disabled, if there is no one to handle receive
packets, device FIFO will reach near full state and keep sending
pause frames. That will impact entire local area network.

This issue can be reproduced in Chromebox (not Chromebook) in
developer mode running a test image (and v5.10 kernel):
1) ping -f $CHROMEBOX (from workstation on same local network)
2) run "powerd_dbus_suspend" from command line on the $CHROMEBOX
3) ping $ROUTER (wait until ping fails from workstation)

Takes about ~20-30 seconds after step 2 for the local network to
stop working.

Fix this issue by enabling pause slot to only send pause frame once
when FIFO reaches near full state.

Fixes: f1bce4ad2f1c ("r8169: add support for RTL8125")
Reported-by: Grant Grundler <[email protected]>
Tested-by: Grant Grundler <[email protected]>
Cc: [email protected]
Signed-off-by: ChunHao Lin <[email protected]>
---
v2:
- update comment and title.
---
drivers/net/ethernet/realtek/r8169_main.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
index 62cabeeb842a..bb787a52bc75 100644
--- a/drivers/net/ethernet/realtek/r8169_main.c
+++ b/drivers/net/ethernet/realtek/r8169_main.c
@@ -196,6 +196,7 @@ enum rtl_registers {
/* No threshold before first PCI xfer */
#define RX_FIFO_THRESH (7 << RXCFG_FIFO_SHIFT)
#define RX_EARLY_OFF (1 << 11)
+#define RX_PAUSE_SLOT_ON (1 << 11) /* 8125b and later */
#define RXCFG_DMA_SHIFT 8
/* Unlimited maximum PCI burst. */
#define RX_DMA_BURST (7 << RXCFG_DMA_SHIFT)
@@ -2306,9 +2307,13 @@ static void rtl_init_rxcfg(struct rtl8169_private *tp)
case RTL_GIGA_MAC_VER_40 ... RTL_GIGA_MAC_VER_53:
RTL_W32(tp, RxConfig, RX128_INT_EN | RX_MULTI_EN | RX_DMA_BURST | RX_EARLY_OFF);
break;
- case RTL_GIGA_MAC_VER_61 ... RTL_GIGA_MAC_VER_63:
+ case RTL_GIGA_MAC_VER_61:
RTL_W32(tp, RxConfig, RX_FETCH_DFLT_8125 | RX_DMA_BURST);
break;
+ case RTL_GIGA_MAC_VER_63:
+ RTL_W32(tp, RxConfig, RX_FETCH_DFLT_8125 | RX_DMA_BURST |
+ RX_PAUSE_SLOT_ON);
+ break;
default:
RTL_W32(tp, RxConfig, RX128_INT_EN | RX_DMA_BURST);
break;
--
2.39.2


2023-11-29 23:07:02

by Jacob Keller

[permalink] [raw]
Subject: Re: [PATCH net v2] r8169: fix rtl8125b PAUSE frames blasting when suspended



On 11/29/2023 7:53 AM, ChunHao Lin wrote:
> When FIFO reaches near full state, device will issue pause frame.
> If pause slot is enabled(set to 1), in this time, device will issue
> pause frame only once. But if pause slot is disabled(set to 0), device
> will keep sending pause frames until FIFO reaches near empty state.
>
> When pause slot is disabled, if there is no one to handle receive
> packets, device FIFO will reach near full state and keep sending
> pause frames. That will impact entire local area network.
>
> This issue can be reproduced in Chromebox (not Chromebook) in
> developer mode running a test image (and v5.10 kernel):
> 1) ping -f $CHROMEBOX (from workstation on same local network)
> 2) run "powerd_dbus_suspend" from command line on the $CHROMEBOX
> 3) ping $ROUTER (wait until ping fails from workstation)
>
> Takes about ~20-30 seconds after step 2 for the local network to
> stop working.
>
> Fix this issue by enabling pause slot to only send pause frame once
> when FIFO reaches near full state.
>

Makes sense. Avoiding the spam is good. The naming is a bit confusing
but I guess that comes from realtek datasheet?

> Fixes: f1bce4ad2f1c ("r8169: add support for RTL8125")
> Reported-by: Grant Grundler <[email protected]>
> Tested-by: Grant Grundler <[email protected]>
> Cc: [email protected]
> Signed-off-by: ChunHao Lin <[email protected]>
> ---
> v2:
> - update comment and title.
> ---
> drivers/net/ethernet/realtek/r8169_main.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
> index 62cabeeb842a..bb787a52bc75 100644
> --- a/drivers/net/ethernet/realtek/r8169_main.c
> +++ b/drivers/net/ethernet/realtek/r8169_main.c
> @@ -196,6 +196,7 @@ enum rtl_registers {
> /* No threshold before first PCI xfer */
> #define RX_FIFO_THRESH (7 << RXCFG_FIFO_SHIFT)
> #define RX_EARLY_OFF (1 << 11)
> +#define RX_PAUSE_SLOT_ON (1 << 11) /* 8125b and later */

This confuses me though: RX_EARLY_OFF is (1 << 11) as well.. Is that
from a different set of devices? We're writing to the same register
RxConfig here I think in both cases?

Can you clarify if these are supposed to be the same bit?

> #define RXCFG_DMA_SHIFT 8
> /* Unlimited maximum PCI burst. */
> #define RX_DMA_BURST (7 << RXCFG_DMA_SHIFT)
> @@ -2306,9 +2307,13 @@ static void rtl_init_rxcfg(struct rtl8169_private *tp)
> case RTL_GIGA_MAC_VER_40 ... RTL_GIGA_MAC_VER_53:
> RTL_W32(tp, RxConfig, RX128_INT_EN | RX_MULTI_EN | RX_DMA_BURST | RX_EARLY_OFF);
> break;
> - case RTL_GIGA_MAC_VER_61 ... RTL_GIGA_MAC_VER_63:
> + case RTL_GIGA_MAC_VER_61:
> RTL_W32(tp, RxConfig, RX_FETCH_DFLT_8125 | RX_DMA_BURST);
> break;

I assume there isn't a VER_62 between these?

> + case RTL_GIGA_MAC_VER_63:
> + RTL_W32(tp, RxConfig, RX_FETCH_DFLT_8125 | RX_DMA_BURST |
> + RX_PAUSE_SLOT_ON);

We add RX_PAUSE_SLOT_ON now for RTL_GIGA_MAC_VER_63 in addition. Makes
sense.
> + break;
> default:
> RTL_W32(tp, RxConfig, RX128_INT_EN | RX_DMA_BURST);
> break;

2023-11-29 23:42:03

by Grant Grundler

[permalink] [raw]
Subject: Re: [PATCH net v2] r8169: fix rtl8125b PAUSE frames blasting when suspended

On Wed, Nov 29, 2023 at 3:05 PM Jacob Keller <[email protected]> wrote:
> On 11/29/2023 7:53 AM, ChunHao Lin wrote:
> > When FIFO reaches near full state, device will issue pause frame.
> > If pause slot is enabled(set to 1), in this time, device will issue
> > pause frame only once. But if pause slot is disabled(set to 0), device
> > will keep sending pause frames until FIFO reaches near empty state.
> >
> > When pause slot is disabled, if there is no one to handle receive
> > packets, device FIFO will reach near full state and keep sending
> > pause frames. That will impact entire local area network.
> >
> > This issue can be reproduced in Chromebox (not Chromebook) in
> > developer mode running a test image (and v5.10 kernel):
> > 1) ping -f $CHROMEBOX (from workstation on same local network)
> > 2) run "powerd_dbus_suspend" from command line on the $CHROMEBOX
> > 3) ping $ROUTER (wait until ping fails from workstation)
> >
> > Takes about ~20-30 seconds after step 2 for the local network to
> > stop working.
> >
> > Fix this issue by enabling pause slot to only send pause frame once
> > when FIFO reaches near full state.
> >
>
> Makes sense. Avoiding the spam is good. The naming is a bit confusing
> but I guess that comes from realtek datasheet?

I don't know. It doesn't matter to me what it's called since I don't
have access to the data sheet anyway. :/

> > Fixes: f1bce4ad2f1c ("r8169: add support for RTL8125")
> > Reported-by: Grant Grundler <[email protected]>
> > Tested-by: Grant Grundler <[email protected]>
> > Cc: [email protected]
> > Signed-off-by: ChunHao Lin <[email protected]>
> > ---
> > v2:
> > - update comment and title.
> > ---
> > drivers/net/ethernet/realtek/r8169_main.c | 7 ++++++-
> > 1 file changed, 6 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
> > index 62cabeeb842a..bb787a52bc75 100644
> > --- a/drivers/net/ethernet/realtek/r8169_main.c
> > +++ b/drivers/net/ethernet/realtek/r8169_main.c
> > @@ -196,6 +196,7 @@ enum rtl_registers {
> > /* No threshold before first PCI xfer */
> > #define RX_FIFO_THRESH (7 << RXCFG_FIFO_SHIFT)
> > #define RX_EARLY_OFF (1 << 11)
> > +#define RX_PAUSE_SLOT_ON (1 << 11) /* 8125b and later */
>
> This confuses me though: RX_EARLY_OFF is (1 << 11) as well.. Is that
> from a different set of devices?

Yes, for a different HW version of the device.

> We're writing to the same register
> RxConfig here I think in both cases?

Yes. But to different versions of the HW which use this bit
differently. Ergo the comment about "8125b and later".

> Can you clarify if these are supposed to be the same bit?

Yes, they are the same bit - but different versions of HW use BIT(11)
differently.

>
> > #define RXCFG_DMA_SHIFT 8
> > /* Unlimited maximum PCI burst. */
> > #define RX_DMA_BURST (7 << RXCFG_DMA_SHIFT)
> > @@ -2306,9 +2307,13 @@ static void rtl_init_rxcfg(struct rtl8169_private *tp)
> > case RTL_GIGA_MAC_VER_40 ... RTL_GIGA_MAC_VER_53:
> > RTL_W32(tp, RxConfig, RX128_INT_EN | RX_MULTI_EN | RX_DMA_BURST | RX_EARLY_OFF);
> > break;
> > - case RTL_GIGA_MAC_VER_61 ... RTL_GIGA_MAC_VER_63:
> > + case RTL_GIGA_MAC_VER_61:
> > RTL_W32(tp, RxConfig, RX_FETCH_DFLT_8125 | RX_DMA_BURST);
> > break;
>
> I assume there isn't a VER_62 between these?

Correct. My clue is this code near the top of this file:

149 [RTL_GIGA_MAC_VER_61] = {"RTL8125A", FIRMWARE_8125A_3},
150 /* reserve 62 for CFG_METHOD_4 in the vendor driver */
151 [RTL_GIGA_MAC_VER_63] = {"RTL8125B", FIRMWARE_8125B_2},

>
> > + case RTL_GIGA_MAC_VER_63:
> > + RTL_W32(tp, RxConfig, RX_FETCH_DFLT_8125 | RX_DMA_BURST |
> > + RX_PAUSE_SLOT_ON);
>
> We add RX_PAUSE_SLOT_ON now for RTL_GIGA_MAC_VER_63 in addition. Makes
> sense.

Exactly.

thanks for reviewing!

cheers,
grant

> > + break;
> > default:
> > RTL_W32(tp, RxConfig, RX128_INT_EN | RX_DMA_BURST);
> > break;

2023-11-30 22:25:22

by Jacob Keller

[permalink] [raw]
Subject: Re: [PATCH net v2] r8169: fix rtl8125b PAUSE frames blasting when suspended



On 11/29/2023 3:40 PM, Grant Grundler wrote:
> On Wed, Nov 29, 2023 at 3:05 PM Jacob Keller <[email protected]> wrote:
>> On 11/29/2023 7:53 AM, ChunHao Lin wrote:
>>> When FIFO reaches near full state, device will issue pause frame.
>>> If pause slot is enabled(set to 1), in this time, device will issue
>>> pause frame only once. But if pause slot is disabled(set to 0), device
>>> will keep sending pause frames until FIFO reaches near empty state.
>>>
>>> When pause slot is disabled, if there is no one to handle receive
>>> packets, device FIFO will reach near full state and keep sending
>>> pause frames. That will impact entire local area network.
>>>
>>> This issue can be reproduced in Chromebox (not Chromebook) in
>>> developer mode running a test image (and v5.10 kernel):
>>> 1) ping -f $CHROMEBOX (from workstation on same local network)
>>> 2) run "powerd_dbus_suspend" from command line on the $CHROMEBOX
>>> 3) ping $ROUTER (wait until ping fails from workstation)
>>>
>>> Takes about ~20-30 seconds after step 2 for the local network to
>>> stop working.
>>>
>>> Fix this issue by enabling pause slot to only send pause frame once
>>> when FIFO reaches near full state.
>>>
>>
>> Makes sense. Avoiding the spam is good. The naming is a bit confusing
>> but I guess that comes from realtek datasheet?
>
> I don't know. It doesn't matter to me what it's called since I don't
> have access to the data sheet anyway. :/
>

The name is fine, i just found it a bit hard to parse since its
effectively "PAUSE_SLOT_ON" makes us *not* send pause frames forever.

I think its fine as-is, since this is referring to the use of the pause
slot in hardware.

>>> Fixes: f1bce4ad2f1c ("r8169: add support for RTL8125")
>>> Reported-by: Grant Grundler <[email protected]>
>>> Tested-by: Grant Grundler <[email protected]>
>>> Cc: [email protected]
>>> Signed-off-by: ChunHao Lin <[email protected]>
>>> ---
>>> v2:
>>> - update comment and title.
>>> ---
>>> drivers/net/ethernet/realtek/r8169_main.c | 7 ++++++-
>>> 1 file changed, 6 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/net/ethernet/realtek/r8169_main.c b/drivers/net/ethernet/realtek/r8169_main.c
>>> index 62cabeeb842a..bb787a52bc75 100644
>>> --- a/drivers/net/ethernet/realtek/r8169_main.c
>>> +++ b/drivers/net/ethernet/realtek/r8169_main.c
>>> @@ -196,6 +196,7 @@ enum rtl_registers {
>>> /* No threshold before first PCI xfer */
>>> #define RX_FIFO_THRESH (7 << RXCFG_FIFO_SHIFT)
>>> #define RX_EARLY_OFF (1 << 11)
>>> +#define RX_PAUSE_SLOT_ON (1 << 11) /* 8125b and later */
>>
>> This confuses me though: RX_EARLY_OFF is (1 << 11) as well.. Is that
>> from a different set of devices?
>
> Yes, for a different HW version of the device.
>

Great.

>> We're writing to the same register
>> RxConfig here I think in both cases?
>
> Yes. But to different versions of the HW which use this bit
> differently. Ergo the comment about "8125b and later".
>
>> Can you clarify if these are supposed to be the same bit?
>
> Yes, they are the same bit - but different versions of HW use BIT(11)
> differently.

Thanks for the clarification!

>
>>
>>> #define RXCFG_DMA_SHIFT 8
>>> /* Unlimited maximum PCI burst. */
>>> #define RX_DMA_BURST (7 << RXCFG_DMA_SHIFT)
>>> @@ -2306,9 +2307,13 @@ static void rtl_init_rxcfg(struct rtl8169_private *tp)
>>> case RTL_GIGA_MAC_VER_40 ... RTL_GIGA_MAC_VER_53:
>>> RTL_W32(tp, RxConfig, RX128_INT_EN | RX_MULTI_EN | RX_DMA_BURST | RX_EARLY_OFF);
>>> break;
>>> - case RTL_GIGA_MAC_VER_61 ... RTL_GIGA_MAC_VER_63:
>>> + case RTL_GIGA_MAC_VER_61:
>>> RTL_W32(tp, RxConfig, RX_FETCH_DFLT_8125 | RX_DMA_BURST);
>>> break;
>>
>> I assume there isn't a VER_62 between these?
>
> Correct. My clue is this code near the top of this file:
>
> 149 [RTL_GIGA_MAC_VER_61] = {"RTL8125A", FIRMWARE_8125A_3},
> 150 /* reserve 62 for CFG_METHOD_4 in the vendor driver */
> 151 [RTL_GIGA_MAC_VER_63] = {"RTL8125B", FIRMWARE_8125B_2},
>
>>
>>> + case RTL_GIGA_MAC_VER_63:
>>> + RTL_W32(tp, RxConfig, RX_FETCH_DFLT_8125 | RX_DMA_BURST |
>>> + RX_PAUSE_SLOT_ON);
>>
>> We add RX_PAUSE_SLOT_ON now for RTL_GIGA_MAC_VER_63 in addition. Makes
>> sense.
>
> Exactly.
>
> thanks for reviewing!
>

Great.

For the record:

Reviewed-by: Jacob Keller <[email protected]>

2023-12-02 04:14:57

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH net v2] r8169: fix rtl8125b PAUSE frames blasting when suspended

On Wed, 29 Nov 2023 23:53:50 +0800 ChunHao Lin wrote:
> When FIFO reaches near full state, device will issue pause frame.
> If pause slot is enabled(set to 1), in this time, device will issue
> pause frame only once. But if pause slot is disabled(set to 0), device
> will keep sending pause frames until FIFO reaches near empty state.

Heiner, looks good?
--
pw-bot: needs-ack

2023-12-02 08:31:27

by Heiner Kallweit

[permalink] [raw]
Subject: Re: [PATCH net v2] r8169: fix rtl8125b PAUSE frames blasting when suspended

On 29.11.2023 16:53, ChunHao Lin wrote:
> When FIFO reaches near full state, device will issue pause frame.
> If pause slot is enabled(set to 1), in this time, device will issue
> pause frame only once. But if pause slot is disabled(set to 0), device
> will keep sending pause frames until FIFO reaches near empty state.
>
> When pause slot is disabled, if there is no one to handle receive
> packets, device FIFO will reach near full state and keep sending
> pause frames. That will impact entire local area network.
>
> This issue can be reproduced in Chromebox (not Chromebook) in
> developer mode running a test image (and v5.10 kernel):
> 1) ping -f $CHROMEBOX (from workstation on same local network)
> 2) run "powerd_dbus_suspend" from command line on the $CHROMEBOX
> 3) ping $ROUTER (wait until ping fails from workstation)
>
> Takes about ~20-30 seconds after step 2 for the local network to
> stop working.
>
> Fix this issue by enabling pause slot to only send pause frame once
> when FIFO reaches near full state.
>
> Fixes: f1bce4ad2f1c ("r8169: add support for RTL8125")
> Reported-by: Grant Grundler <[email protected]>
> Tested-by: Grant Grundler <[email protected]>
> Cc: [email protected]
> Signed-off-by: ChunHao Lin <[email protected]>
> ---
> v2:
> - update comment and title.
> ---

Reviewed-by: Heiner Kallweit <[email protected]>


2023-12-02 19:31:10

by patchwork-bot+netdevbpf

[permalink] [raw]
Subject: Re: [PATCH net v2] r8169: fix rtl8125b PAUSE frames blasting when suspended

Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <[email protected]>:

On Wed, 29 Nov 2023 23:53:50 +0800 you wrote:
> When FIFO reaches near full state, device will issue pause frame.
> If pause slot is enabled(set to 1), in this time, device will issue
> pause frame only once. But if pause slot is disabled(set to 0), device
> will keep sending pause frames until FIFO reaches near empty state.
>
> When pause slot is disabled, if there is no one to handle receive
> packets, device FIFO will reach near full state and keep sending
> pause frames. That will impact entire local area network.
>
> [...]

Here is the summary with links:
- [net,v2] r8169: fix rtl8125b PAUSE frames blasting when suspended
https://git.kernel.org/netdev/net/c/4b0768b6556a

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html