Hi everyone,
on a vanilla 6.5.2 kernel I observed the following behaviour on an
i.MX8MP-EVK:
root@<redacted>:~# ethtool -s eth1 autoneg on speed 100 duplex full
root@<redacted>:~# ethtool --show-eee eth1
EEE settings for eth1:
	EEE status: enabled - inactive
	Tx LPI: disabled
	Supported EEE link modes: 100baseT/Full
	                          1000baseT/Full
	Advertised EEE link modes: 100baseT/Full
	Link partner advertised EEE link modes: Not reported
root@<redacted>:~# ip link add link eth1 name eqos.5 type vlan id 5
RTNETLINK answers: Device or resource busy
root@<redacted>:~# dmesg | tail -n 1
[ 819.085069] imx-dwmac 30bf0000.ethernet eth1: Timeout accessing MAC_VLAN_Tag_Filter
root@<redacted>:~# ip link show dev eqos.5@eth1
Device "eqos.5@eth1" does not exist.
root@<redacted>:~# ethtool --set-eee eth1 eee off
root@<redacted>:~# ethtool --show-eee eth1
EEE settings for eth1:
	EEE status: disabled
	Tx LPI: disabled
	Supported EEE link modes: 100baseT/Full
	                          1000baseT/Full
	Advertised EEE link modes: Not reported
	Link partner advertised EEE link modes: Not reported
root@<redacted>:~# ip link add link eth1 name eqos.5 type vlan id 5
root@<redacted>:~# ip link show dev eqos.5
5: eqos.5@eth1: <BROADCAST,MULTICAST> mtu 1500 qdisc noop state DOWN mode DEFAULT group default qlen 1000
    link/ether 00:04:9f:07:9c:42 brd ff:ff:ff:ff:ff:ff
The same holds for removing VLANs when EEE is enabled:
(after reboot)
root@<redacted>:~# ethtool --set-eee eth1 eee off
root@<redacted>:~# ip link add link eth1 name eqos.5 type vlan id 5
root@<redacted>:~# ethtool --set-eee eth1 eee on
root@<redacted>:~# ip link del link eth1 name eqos.5 type vlan id 5
root@beluga-1311a8001168e9dc:~# dmesg | tail -n2
[ 240.918085] imx-dwmac 30bf0000.ethernet eth1: Timeout accessing MAC_VLAN_Tag_Filter
[ 240.925827] imx-dwmac 30bf0000.ethernet eth1: failed to kill vid 0081/5
This is even more concerning, because no error is reported to userspace;
there is only a netdev_err print in the kernel log.
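To make the silent failure visible from a script, something along these lines
could be used. It is only a sketch: the function name vlan_del_checked and the
DMESG/IP override variables are made up for illustration, and it assumes that
the netdev_err line has already been logged when ip returns and that the
kernel ring buffer has not wrapped in between.

```shell
#!/bin/sh
# Hypothetical helper: delete a VLAN device and report the silent failure.
# DMESG and IP can be overridden, e.g. to dry-run the function with stubs.
DMESG="${DMESG:-dmesg}"
IP="${IP:-ip}"

# Usage: vlan_del_checked <vlan-device>
# Returns 0 on success, 1 if ip itself failed, 2 if the kernel logged
# a new "failed to kill vid" message during the deletion.
vlan_del_checked() {
    name="$1"

    # Count matching log lines before and after the deletion.
    # grep -c prints 0 but exits non-zero when nothing matches, hence "|| true".
    before=$($DMESG | grep -c 'failed to kill vid' || true)
    $IP link del "$name" || return 1
    after=$($DMESG | grep -c 'failed to kill vid' || true)

    if [ "$after" -gt "$before" ]; then
        echo "warning: kernel failed to remove the VLAN filter for $name" >&2
        return 2
    fi
    return 0
}
```

It would be called as "vlan_del_checked eqos.5" and the return code checked
by the caller.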
In my debugging session I found that this behaviour depends only on whether
EEE is enabled or disabled.

On 1 Gbps links, the eee-broken-1000t property is set for the ethphy node,
which is why the behaviour usually does not occur at 1 Gbps (which is probably
the most common use case).
Maybe someone on this list has more insight into the inner workings of the
dwmac/stmmac/eqos and could point out how to fix this issue. I'd be happy to
send patches to fix it. Also, maybe someone has other implementations at hand
and can check whether this can be reproduced there.
Do you consider disabling EEE while setting up the VLAN a valid workaround, or
should we rather add a warning when the timeout occurs while EEE is still
enabled?
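For reference, the userspace variant of the workaround could be sketched as
the following shell function. The name vlan_add_eee_off and the ETHTOOL/IP
override variables are hypothetical, and it assumes that matching
"EEE status: enabled" in the ethtool output is enough to tell whether EEE was
on. It mirrors the manual sequence shown above and is not meant as a proper
fix.

```shell
#!/bin/sh
# Hypothetical helper: turn EEE off around a VLAN creation, then restore it.
# ETHTOOL and IP can be overridden, e.g. to dry-run the function with stubs.
ETHTOOL="${ETHTOOL:-ethtool}"
IP="${IP:-ip}"

# Usage: vlan_add_eee_off <parent-if> <vlan-name> <vid>
vlan_add_eee_off() {
    parent="$1"
    name="$2"
    vid="$3"

    # Remember whether EEE was enabled so it can be restored afterwards.
    eee_was_on=0
    if $ETHTOOL --show-eee "$parent" | grep -q 'EEE status: enabled'; then
        eee_was_on=1
        $ETHTOOL --set-eee "$parent" eee off || return 1
    fi

    # Create the VLAN while EEE is off and keep the result for the caller.
    rc=0
    $IP link add link "$parent" name "$name" type vlan id "$vid" || rc=$?

    # Restore the previous EEE setting regardless of the outcome.
    if [ "$eee_was_on" -eq 1 ]; then
        $ETHTOOL --set-eee "$parent" eee on
    fi
    return "$rc"
}
```

It would be called as "vlan_add_eee_off eth1 eqos.5 5". Note that this only
covers creation; deleting the VLAN later with EEE enabled would still hit the
timeout shown above.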
Best regards
Johannes
--
Pengutronix e.K. | Johannes Zink |
Steuerwalder Str. 21 | https://www.pengutronix.de/ |
31137 Hildesheim, Germany | Phone: +49-5121-206917-0 |
Amtsgericht Hildesheim, HRA 2686| Fax: +49-5121-206917-5555 |