2017-08-30 19:54:00

by Tim Harvey

Subject: DSA mv88e6xxx RX frame errors and TCP/IP RX failure

Greetings,

I'm seeing RX frame errors when using the mv88e6xxx DSA driver on
4.13-rc7. The board I'm using is a GW5904 [1] which has an IMX6 FEC
MAC (eth0) connected via RGMII to a MV88E6176 with its downstream
P0/P1/P2/P3 to front panel RJ45's (lan1-lan4).

What I see is the following:
- bring up eth0/lan1
- DHCP IPv4 on lan1
- iperf client to a server on the network connected to lan1 shows
  ~150mbps TX without any errors/overruns/frame errors, but 10 or so
  dropped
- iperf server with a 100mbps TCP client test shows
- iperf server will hang when connected to from an iperf client on the
  lan1 network, and I see frame errors from ifconfig:

root@xenial:/# ifconfig lan1
lan1 Link encap:Ethernet HWaddr 00:D0:12:41:F3:E7
inet addr:172.24.22.125 Bcast:172.24.255.255 Mask:255.240.0.0
inet6 addr: fe80::2d0:12ff:fe41:f3e7/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:148 errors:0 dropped:30 overruns:0 frame:0
TX packets:15 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:8780 (8.5 KiB) TX bytes:1762 (1.7 KiB)

root@xenial:/# ifconfig eth0
eth0 Link encap:Ethernet HWaddr 00:D0:12:41:F3:E7
inet6 addr: fe80::2d0:12ff:fe41:f3e7/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:386 errors:19 dropped:39 overruns:0 frame:57
TX packets:24 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:39484 (38.5 KiB) TX bytes:2880 (2.8 KiB)

It doesn't appear that this is a new issue, as it also exists on
older kernels.

Note that the IMX6 has an erratum (ERR004512) [2] that limits the
theoretical max performance of the FEC to 470mbps (total TX+RX), and if
the TX and RX peak data rate is higher than ~400mbps there is a risk of
ENET RX FIFO overrun, but I don't think this is the issue here. I would
assume it is the cause of the relatively low throughput of ~150mbps TX,
though.

Best Regards,

Tim

[1] - https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/arm/boot/dts/imx6qdl-gw5904.dtsi
[2] - http://cache.nxp.com/docs/en/errata/IMX6DQCE.pdf - ERR004512


2017-08-30 22:06:34

by Andrew Lunn

Subject: Re: DSA mv88e6xxx RX frame errors and TCP/IP RX failure

On Wed, Aug 30, 2017 at 12:53:56PM -0700, Tim Harvey wrote:
> Greetings,
>
> I'm seeing RX frame errors when using the mv88e6xxx DSA driver on
> 4.13-rc7. The board I'm using is a GW5904 [1] which has an IMX6 FEC
> MAC (eth0) connected via RGMII to a MV88E6176 with its downstream
> P0/P1/P2/P3 to front panel RJ45's (lan1-lan4).

Hi Tim

Can you confirm the counter is this one:

/* Report late collisions as a frame error. */
if (status & (BD_ENET_RX_NO | BD_ENET_RX_CL))
ndev->stats.rx_frame_errors++;

I don't see anywhere else frame errors are counted, but it would be
good to prove we are looking in the right place.

Andrew

2017-08-31 00:22:46

by Tim Harvey

Subject: Re: DSA mv88e6xxx RX frame errors and TCP/IP RX failure

On Wed, Aug 30, 2017 at 3:06 PM, Andrew Lunn <[email protected]> wrote:
> On Wed, Aug 30, 2017 at 12:53:56PM -0700, Tim Harvey wrote:
>> Greetings,
>>
>> I'm seeing RX frame errors when using the mv88e6xxx DSA driver on
>> 4.13-rc7. The board I'm using is a GW5904 [1] which has an IMX6 FEC
>> MAC (eth0) connected via RGMII to a MV88E6176 with its downstream
>> P0/P1/P2/P3 to front panel RJ45's (lan1-lan4).
>
> Hi Tim
>
> Can you confirm the counter is this one:
>
> /* Report late collisions as a frame error. */
> if (status & (BD_ENET_RX_NO | BD_ENET_RX_CL))
> ndev->stats.rx_frame_errors++;
>
> I don't see anywhere else frame errors are counted, but it would be
> good to prove we are looking in the right place.
>

Andrew,

(adding IMX FEC driver maintainer to CC)

Yes, that's one of them being hit. It looks like ifconfig reports
'frame' as the accumulation of a few stats so here are some more
specifics from /sys/class/net/eth0/statistics:

root@xenial:/sys/devices/soc0/soc/2100000.aips-bus/2188000.ethernet/net/eth0/statistics#
for i in `ls rx_*`; do echo $i:$(cat $i); done
rx_bytes:103229
rx_compressed:0
rx_crc_errors:22
rx_dropped:0
rx_errors:22
rx_fifo_errors:0
rx_frame_errors:22
rx_length_errors:22
rx_missed_errors:0
rx_nohandler:0
rx_over_errors:0
rx_packets:1174
root@xenial:/sys/devices/soc0/soc/2100000.aips-bus/2188000.ethernet/net/eth0/statistics#
ifconfig eth0
eth0 Link encap:Ethernet HWaddr 00:D0:12:41:F3:E7
inet6 addr: fe80::2d0:12ff:fe41:f3e7/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:1207 errors:22 dropped:0 overruns:0 frame:66
TX packets:42 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:106009 (103.5 KiB) TX bytes:4604 (4.4 KiB)
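
(Illustrative note: the arithmetic behind "frame:66" can be checked with a small Python sketch. It assumes the 'frame' column that ifconfig reads from /proc/net/dev is the sum of rx_length_errors, rx_over_errors, rx_crc_errors and rx_frame_errors; that field list is an assumption about the kernel, not something stated in this thread. The function name is hypothetical.)

```python
# Counters copied from the sysfs listing above.
rx_stats = {
    "rx_crc_errors": 22,
    "rx_frame_errors": 22,
    "rx_length_errors": 22,
    "rx_over_errors": 0,
}

# ASSUMPTION: these four counters feed the 'frame' column of /proc/net/dev.
FRAME_FIELDS = (
    "rx_length_errors",
    "rx_over_errors",
    "rx_crc_errors",
    "rx_frame_errors",
)


def proc_net_dev_frame(stats):
    """Return the aggregated 'frame' value for a dict of rx counters."""
    return sum(stats.get(name, 0) for name in FRAME_FIELDS)


print(proc_net_dev_frame(rx_stats))  # 22 + 0 + 22 + 22 = 66
```

Under that assumption the sum matches the frame:66 shown by ifconfig, while errors:22 counts each bad frame only once.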

Instrumenting the fec driver, I see the following getting hit:

status & BD_ENET_RX_LG /* rx_length_errors: Frame too long */
status & BD_ENET_RX_CR /* rx_crc_errors: CRC Error */
status & BD_ENET_RX_CL /* rx_frame_errors: Collision? */

Is this a frame size issue where the MV88E6176 is sending frames down
that exceed the MTU because of headers added?
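
(Illustrative note: a simplified Python model of how the three status bits above map onto the three counters. The bit values are assumptions recalled from the FEC driver header and should be checked against fec.h. The real driver tests more bits than this sketch does; counters_for_status is a hypothetical helper.)

```python
# ASSUMPTION: bit values as defined in the FEC driver header (fec.h).
BD_ENET_RX_LG = 0x0020  # frame too long
BD_ENET_RX_NO = 0x0010  # non-octet-aligned frame
BD_ENET_RX_CR = 0x0004  # CRC error
BD_ENET_RX_CL = 0x0001  # late collision


def counters_for_status(status):
    """Return the rx_* counters a descriptor status word would bump.

    Simplified: only the bits mentioned in this thread are modelled.
    """
    hit = []
    if status & BD_ENET_RX_LG:
        hit.append("rx_length_errors")
    if status & BD_ENET_RX_CR:
        hit.append("rx_crc_errors")
    if status & (BD_ENET_RX_NO | BD_ENET_RX_CL):
        hit.append("rx_frame_errors")
    return hit


# One frame flagged LG + CR + CL bumps all three counters at once.
print(counters_for_status(BD_ENET_RX_LG | BD_ENET_RX_CR | BD_ENET_RX_CL))
```

If each rejected frame carries all three bits, the three counters would advance together, which would be consistent with all of them reading 22.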

Tim

2017-08-31 00:38:42

by Ilia Mirkin

Subject: Re: DSA mv88e6xxx RX frame errors and TCP/IP RX failure

On Wed, Aug 30, 2017 at 8:22 PM, Tim Harvey <[email protected]> wrote:
> On Wed, Aug 30, 2017 at 3:06 PM, Andrew Lunn <[email protected]> wrote:
>> On Wed, Aug 30, 2017 at 12:53:56PM -0700, Tim Harvey wrote:
>>> Greetings,
>>>
>>> I'm seeing RX frame errors when using the mv88e6xxx DSA driver on
>>> 4.13-rc7. The board I'm using is a GW5904 [1] which has an IMX6 FEC
>>> MAC (eth0) connected via RGMII to a MV88E6176 with its downstream
>>> P0/P1/P2/P3 to front panel RJ45's (lan1-lan4).
>>
>> Hi Tim
>>
>> Can you confirm the counter is this one:
>>
>> /* Report late collisions as a frame error. */
>> if (status & (BD_ENET_RX_NO | BD_ENET_RX_CL))
>> ndev->stats.rx_frame_errors++;
>>
>> I don't see anywhere else frame errors are counted, but it would be
>> good to prove we are looking in the right place.
>>
>
> Andrew,
>
> (adding IMX FEC driver maintainer to CC)
>
> Yes, that's one of them being hit. It looks like ifconfig reports
> 'frame' as the accumulation of a few stats so here are some more
> specifics from /sys/class/net/eth0/statistics:
>
> root@xenial:/sys/devices/soc0/soc/2100000.aips-bus/2188000.ethernet/net/eth0/statistics#
> for i in `ls rx_*`; do echo $i:$(cat $i); done
> rx_bytes:103229
> rx_compressed:0
> rx_crc_errors:22
> rx_dropped:0
> rx_errors:22
> rx_fifo_errors:0
> rx_frame_errors:22
> rx_length_errors:22
> rx_missed_errors:0
> rx_nohandler:0
> rx_over_errors:0
> rx_packets:1174
> root@xenial:/sys/devices/soc0/soc/2100000.aips-bus/2188000.ethernet/net/eth0/statistics#
> ifconfig eth0
> eth0 Link encap:Ethernet HWaddr 00:D0:12:41:F3:E7
> inet6 addr: fe80::2d0:12ff:fe41:f3e7/64 Scope:Link
> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
> RX packets:1207 errors:22 dropped:0 overruns:0 frame:66
> TX packets:42 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:1000
> RX bytes:106009 (103.5 KiB) TX bytes:4604 (4.4 KiB)
>
> Instrumenting fec driver I see the following getting hit:
>
> status & BD_ENET_RX_LG /* rx_length_errors: Frame too long */
> status & BD_ENET_RX_CR /* rx_crc_errors: CRC Error */
> status & BD_ENET_RX_CL /* rx_frame_errors: Collision? */
>
> Is this a frame size issue where the MV88E6176 is sending frames down
> that exceed the MTU because of headers added?

Not sure if this is relevant to you, but
https://github.com/laanwj/linux-freedreno-a2xx/commit/076b6542fa27499072ec6c3a7941c8b3c79ba1fd
was necessary to fix some MTU issues on an i.MX51. Not sure if it's
upstream yet or not.

Cheers,

-ilia

2017-08-31 01:46:28

by Andrew Lunn

Subject: Re: DSA mv88e6xxx RX frame errors and TCP/IP RX failure

> > /* Report late collisions as a frame error. */
> > if (status & (BD_ENET_RX_NO | BD_ENET_RX_CL))
> > ndev->stats.rx_frame_errors++;
> >
> > I don't see anywhere else frame errors are counted, but it would be
> > good to prove we are looking in the right place.
> >
>
> Andrew,
>
> (adding IMX FEC driver maintainer to CC)
>
> Yes, that's one of them being hit. It looks like ifconfig reports
> 'frame' as the accumulation of a few stats so here are some more
> specifics from /sys/class/net/eth0/statistics:
>
> root@xenial:/sys/devices/soc0/soc/2100000.aips-bus/2188000.ethernet/net/eth0/statistics#
> for i in `ls rx_*`; do echo $i:$(cat $i); done
> rx_bytes:103229
> rx_compressed:0
> rx_crc_errors:22
> rx_dropped:0
> rx_errors:22
> rx_fifo_errors:0
> rx_frame_errors:22
> rx_length_errors:22
> rx_missed_errors:0
> rx_nohandler:0
> rx_over_errors:0
> rx_packets:1174
> root@xenial:/sys/devices/soc0/soc/2100000.aips-bus/2188000.ethernet/net/eth0/statistics#
> ifconfig eth0
> eth0 Link encap:Ethernet HWaddr 00:D0:12:41:F3:E7
> inet6 addr: fe80::2d0:12ff:fe41:f3e7/64 Scope:Link
> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
> RX packets:1207 errors:22 dropped:0 overruns:0 frame:66
> TX packets:42 errors:0 dropped:0 overruns:0 carrier:0
> collisions:0 txqueuelen:1000
> RX bytes:106009 (103.5 KiB) TX bytes:4604 (4.4 KiB)
>
> Instrumenting fec driver I see the following getting hit:
>
> status & BD_ENET_RX_LG /* rx_length_errors: Frame too long */
> status & BD_ENET_RX_CR /* rx_crc_errors: CRC Error */
> status & BD_ENET_RX_CL /* rx_frame_errors: Collision? */
>
> Is this a frame size issue where the MV88E6176 is sending frames down
> that exceed the MTU because of headers added?

I did fix an issue recently with that. See

commit fbbeefdd21049fcf9437c809da3828b210577f36
Author: Andrew Lunn <[email protected]>
Date: Sun Jul 30 19:36:05 2017 +0200

net: fec: Allow reception of frames bigger than 1522 bytes

The FEC Receive Control Register has a 14 bit field indicating the
longest frame that may be received. It is being set to 1522. Frames
longer than this are discarded, but counted as being in error.

When using DSA, frames from the switch has an additional header,
either 4 or 8 bytes if a Marvell switch is used. Thus a full MTU frame
of 1522 bytes received by the switch on a port becomes 1530 bytes when
passed to the host via the FEC interface.

Change the maximum receive size to 2048 - 64, where 64 is the maximum
rx_alignment applied on the receive buffer for AVB capable FEC
cores. Use this value also for the maximum receive buffer size. The
driver is already allocating a receive SKB of 2048 bytes, so this
change should not have any significant effects.

Tested on imx51, imx6, vf610.

Signed-off-by: Andrew Lunn <[email protected]>
Signed-off-by: David S. Miller <[email protected]>
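
(Illustrative note: the numbers in the commit message can be worked through in a short Python sketch. The constant and function names are hypothetical; only the values 1522, 2048, 64 and the 4/8 byte tag sizes come from the commit message.)

```python
OLD_MAX_FL = 1522             # old receive limit from the commit message
RX_BUFFER_SIZE = 2048         # receive SKB size already allocated by the driver
MAX_RX_ALIGN = 64             # maximum rx_alignment on AVB-capable FEC cores
NEW_MAX_FL = RX_BUFFER_SIZE - MAX_RX_ALIGN   # 1984


def host_frame_len(port_frame_len, tag_len):
    """Length of a frame on the CPU port after the switch adds its tag."""
    return port_frame_len + tag_len


def accepted(frame_len, max_fl):
    """Frames longer than the limit are discarded and counted as errors."""
    return frame_len <= max_fl


for tag_len in (4, 8):
    length = host_frame_len(1522, tag_len)
    print(tag_len, length, accepted(length, OLD_MAX_FL), accepted(length, NEW_MAX_FL))
```

With either tag size a full-sized 1522-byte frame exceeds the old limit and fits comfortably under the new one.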


However, this was more of an all/nothing problem. All frames with the
full MTU were getting dropped, whereas I think you are only seeing a
few dropped?

Anyway, try cherry picking that patch and see if it helps.

Andrew

2017-08-31 09:21:00

by Maxim Uvarov

Subject: Re: DSA mv88e6xxx RX frame errors and TCP/IP RX failure

Check with ping -s 1500 that packets are passed to the CPU. mv88e6xxx
adds an additional DSA tag to the frame, so small packets can pass
while big ones are rejected.

Also, ethtool -S dsaethdevX shows more detailed stats for Marvell chips.

Or the opposite: lower the MTU on the CPU to 1400 and check that ping
works. From the description of the patch above, it has to work.
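
(Illustrative note: a Python sketch of the frame sizes these ping tests would produce. It assumes IPv4 without options, an untagged Ethernet header, a receive limit that counts the FCS, and the 4 or 8 byte tag sizes from the commit message; all names are hypothetical.)

```python
IPV4_HEADER = 20   # ASSUMPTION: no IP options
ICMP_HEADER = 8
ETH_HEADER = 14    # destination MAC + source MAC + EtherType
ETH_FCS = 4


def max_ping_payload(mtu):
    """Largest `ping -s` value that still fits in one unfragmented packet."""
    return mtu - IPV4_HEADER - ICMP_HEADER


def wire_frame_len(mtu, tag_len=0):
    """Length of a full-MTU frame as the host MAC sees it, including FCS."""
    return ETH_HEADER + mtu + ETH_FCS + tag_len


print(max_ping_payload(1500))    # 1472
print(wire_frame_len(1500, 8))   # 1526, above a 1522-byte limit
print(wire_frame_len(1400, 8))   # 1426, well below it
```

Note that ping -s 1500 is larger than one 1500-byte MTU packet can carry, so it is fragmented; the first fragment is still a full-MTU frame, so the test exercises the same path.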

Maxim.

2017-08-31 4:46 GMT+03:00 Andrew Lunn <[email protected]>:
>> > /* Report late collisions as a frame error. */
>> > if (status & (BD_ENET_RX_NO | BD_ENET_RX_CL))
>> > ndev->stats.rx_frame_errors++;
>> >
>> > I don't see anywhere else frame errors are counted, but it would be
>> > good to prove we are looking in the right place.
>> >
>>
>> Andrew,
>>
>> (adding IMX FEC driver maintainer to CC)
>>
>> Yes, that's one of them being hit. It looks like ifconfig reports
>> 'frame' as the accumulation of a few stats so here are some more
>> specifics from /sys/class/net/eth0/statistics:
>>
>> root@xenial:/sys/devices/soc0/soc/2100000.aips-bus/2188000.ethernet/net/eth0/statistics#
>> for i in `ls rx_*`; do echo $i:$(cat $i); done
>> rx_bytes:103229
>> rx_compressed:0
>> rx_crc_errors:22
>> rx_dropped:0
>> rx_errors:22
>> rx_fifo_errors:0
>> rx_frame_errors:22
>> rx_length_errors:22
>> rx_missed_errors:0
>> rx_nohandler:0
>> rx_over_errors:0
>> rx_packets:1174
>> root@xenial:/sys/devices/soc0/soc/2100000.aips-bus/2188000.ethernet/net/eth0/statistics#
>> ifconfig eth0
>> eth0 Link encap:Ethernet HWaddr 00:D0:12:41:F3:E7
>> inet6 addr: fe80::2d0:12ff:fe41:f3e7/64 Scope:Link
>> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
>> RX packets:1207 errors:22 dropped:0 overruns:0 frame:66
>> TX packets:42 errors:0 dropped:0 overruns:0 carrier:0
>> collisions:0 txqueuelen:1000
>> RX bytes:106009 (103.5 KiB) TX bytes:4604 (4.4 KiB)
>>
>> Instrumenting fec driver I see the following getting hit:
>>
>> status & BD_ENET_RX_LG /* rx_length_errors: Frame too long */
>> status & BD_ENET_RX_CR /* rx_crc_errors: CRC Error */
>> status & BD_ENET_RX_CL /* rx_frame_errors: Collision? */
>>
>> Is this a frame size issue where the MV88E6176 is sending frames down
>> that exceed the MTU because of headers added?
>
> I did fix an issue recently with that. See
>
> commit fbbeefdd21049fcf9437c809da3828b210577f36
> Author: Andrew Lunn <[email protected]>
> Date: Sun Jul 30 19:36:05 2017 +0200
>
> net: fec: Allow reception of frames bigger than 1522 bytes
>
> The FEC Receive Control Register has a 14 bit field indicating the
> longest frame that may be received. It is being set to 1522. Frames
> longer than this are discarded, but counted as being in error.
>
> When using DSA, frames from the switch has an additional header,
> either 4 or 8 bytes if a Marvell switch is used. Thus a full MTU frame
> of 1522 bytes received by the switch on a port becomes 1530 bytes when
> passed to the host via the FEC interface.
>
> Change the maximum receive size to 2048 - 64, where 64 is the maximum
> rx_alignment applied on the receive buffer for AVB capable FEC
> cores. Use this value also for the maximum receive buffer size. The
> driver is already allocating a receive SKB of 2048 bytes, so this
> change should not have any significant effects.
>
> Tested on imx51, imx6, vf610.
>
> Signed-off-by: Andrew Lunn <[email protected]>
> Signed-off-by: David S. Miller <[email protected]>
>
>
> However, this was more of an all/nothing problem. All frames with the
> full MTU were getting dropped, whereas I think you are only seeing a
> few dropped?
>
> Anyway, try cherry picking that patch and see if it helps.
>
> Andrew



--
Best regards,
Maxim Uvarov

2017-08-31 15:37:58

by Tim Harvey

Subject: Re: DSA mv88e6xxx RX frame errors and TCP/IP RX failure

On Wed, Aug 30, 2017 at 6:46 PM, Andrew Lunn <[email protected]> wrote:
>> > /* Report late collisions as a frame error. */
>> > if (status & (BD_ENET_RX_NO | BD_ENET_RX_CL))
>> > ndev->stats.rx_frame_errors++;
>> >
>> > I don't see anywhere else frame errors are counted, but it would be
>> > good to prove we are looking in the right place.
>> >
>>
>> Andrew,
>>
>> (adding IMX FEC driver maintainer to CC)
>>
>> Yes, that's one of them being hit. It looks like ifconfig reports
>> 'frame' as the accumulation of a few stats so here are some more
>> specifics from /sys/class/net/eth0/statistics:
>>
>> root@xenial:/sys/devices/soc0/soc/2100000.aips-bus/2188000.ethernet/net/eth0/statistics#
>> for i in `ls rx_*`; do echo $i:$(cat $i); done
>> rx_bytes:103229
>> rx_compressed:0
>> rx_crc_errors:22
>> rx_dropped:0
>> rx_errors:22
>> rx_fifo_errors:0
>> rx_frame_errors:22
>> rx_length_errors:22
>> rx_missed_errors:0
>> rx_nohandler:0
>> rx_over_errors:0
>> rx_packets:1174
>> root@xenial:/sys/devices/soc0/soc/2100000.aips-bus/2188000.ethernet/net/eth0/statistics#
>> ifconfig eth0
>> eth0 Link encap:Ethernet HWaddr 00:D0:12:41:F3:E7
>> inet6 addr: fe80::2d0:12ff:fe41:f3e7/64 Scope:Link
>> UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
>> RX packets:1207 errors:22 dropped:0 overruns:0 frame:66
>> TX packets:42 errors:0 dropped:0 overruns:0 carrier:0
>> collisions:0 txqueuelen:1000
>> RX bytes:106009 (103.5 KiB) TX bytes:4604 (4.4 KiB)
>>
>> Instrumenting fec driver I see the following getting hit:
>>
>> status & BD_ENET_RX_LG /* rx_length_errors: Frame too long */
>> status & BD_ENET_RX_CR /* rx_crc_errors: CRC Error */
>> status & BD_ENET_RX_CL /* rx_frame_errors: Collision? */
>>
>> Is this a frame size issue where the MV88E6176 is sending frames down
>> that exceed the MTU because of headers added?
>
> I did fix an issue recently with that. See
>
> commit fbbeefdd21049fcf9437c809da3828b210577f36
> Author: Andrew Lunn <[email protected]>
> Date: Sun Jul 30 19:36:05 2017 +0200
>
> net: fec: Allow reception of frames bigger than 1522 bytes
>
> The FEC Receive Control Register has a 14 bit field indicating the
> longest frame that may be received. It is being set to 1522. Frames
> longer than this are discarded, but counted as being in error.
>
> When using DSA, frames from the switch has an additional header,
> either 4 or 8 bytes if a Marvell switch is used. Thus a full MTU frame
> of 1522 bytes received by the switch on a port becomes 1530 bytes when
> passed to the host via the FEC interface.
>
> Change the maximum receive size to 2048 - 64, where 64 is the maximum
> rx_alignment applied on the receive buffer for AVB capable FEC
> cores. Use this value also for the maximum receive buffer size. The
> driver is already allocating a receive SKB of 2048 bytes, so this
> change should not have any significant effects.
>
> Tested on imx51, imx6, vf610.
>
> Signed-off-by: Andrew Lunn <[email protected]>
> Signed-off-by: David S. Miller <[email protected]>
>
>
> However, this was more of an all/nothing problem. All frames with the
> full MTU were getting dropped, whereas I think you are only seeing a
> few dropped?
>

Andrew,

Indeed this does resolve the issue. I see a burst of FIFO overruns
initially when receiving an iperf bandwidth test but that would be
caused by the IMX6 errata and should be mitigated via pause frames.
After that short burst I see no other errors and iperf works fine.

Should we get this patch in the linux-stable tree for the 4.9 kernel?

Thanks!

Tim

2017-08-31 17:41:50

by Andrew Lunn

Subject: Re: DSA mv88e6xxx RX frame errors and TCP/IP RX failure

> > I did fix an issue recently with that. See
> >
> > commit fbbeefdd21049fcf9437c809da3828b210577f36
> > Author: Andrew Lunn <[email protected]>
> > Date: Sun Jul 30 19:36:05 2017 +0200
> >
> > net: fec: Allow reception of frames bigger than 1522 bytes
> >
> > The FEC Receive Control Register has a 14 bit field indicating the
> > longest frame that may be received. It is being set to 1522. Frames
> > longer than this are discarded, but counted as being in error.
> >
> > When using DSA, frames from the switch has an additional header,
> > either 4 or 8 bytes if a Marvell switch is used. Thus a full MTU frame
> > of 1522 bytes received by the switch on a port becomes 1530 bytes when
> > passed to the host via the FEC interface.
> >
> > Change the maximum receive size to 2048 - 64, where 64 is the maximum
> > rx_alignment applied on the receive buffer for AVB capable FEC
> > cores. Use this value also for the maximum receive buffer size. The
> > driver is already allocating a receive SKB of 2048 bytes, so this
> > change should not have any significant effects.
> >
> > Tested on imx51, imx6, vf610.
> >
> > Signed-off-by: Andrew Lunn <[email protected]>
> > Signed-off-by: David S. Miller <[email protected]>
> >
> >
> > However, this was more of an all/nothing problem. All frames with the
> > full MTU were getting dropped, whereas I think you are only seeing a
> > few dropped?
> >
>
> Andrew,
>
> Indeed this does resolve the issue. I see a burst of FIFO overruns
> initially when receiving an iperf bandwidth test but that would be
> caused by the IMX6 errata and should be mitigated via pause frames.
> After that short burst I see no other errors and iperf works fine.
>
> Should we get this patch in the linux-stable tree for the 4.9 kernel?

Hi David

Please could you add

fbbeefdd2104 ("net: fec: Allow reception of frames bigger than 1522 bytes")

to stable.

Thanks
Andrew

2017-08-31 17:44:38

by David Miller

Subject: Re: DSA mv88e6xxx RX frame errors and TCP/IP RX failure

From: Andrew Lunn <[email protected]>
Date: Thu, 31 Aug 2017 19:41:41 +0200

> Please could you add
>
> fbbeefdd2104 ("net: fec: Allow reception of frames bigger than 1522 bytes")
>
> to stable.

Queued up, thanks.