Frank reports that in a mt7530 setup where some ports are standalone and
some are in a VLAN-aware bridge, 8021q uppers of the standalone ports
lose their VLAN tag on xmit, as seen by the link partner.
This seems to occur because once the other ports join the VLAN-aware
bridge, mt7530_port_vlan_filtering() also calls
mt7530_port_set_vlan_aware(ds, cpu_dp->index), and this affects the way
that the switch processes the traffic of the standalone port.
Relevant is the PVC_EG_TAG bit. The MT7530 documentation says about it:
EG_TAG: Incoming Port Egress Tag VLAN Attribution
0: disabled (system default)
1: consistent (keep the original ingress tag attribute)
My interpretation is that this setting applies on the ingress port, and
"disabled" is basically the normal behavior, where the egress tag format
of the packet (tagged or untagged) is decided by the VLAN table
(MT7530_VLAN_EGRESS_UNTAG or MT7530_VLAN_EGRESS_TAG).
But there is also an option of overriding the system default behavior,
and for the egress tagging format of packets to be decided not by the
VLAN table, but simply by copying the ingress tag format (if ingress was
tagged, egress is tagged; if ingress was untagged, egress is untagged;
aka "consistent). This is useful in 2 scenarios:
- VLAN-unaware bridge ports will always encounter a miss in the VLAN
table. They should forward a packet as-is, though. So we use
"consistent" there. See commit e045124e9399 ("net: dsa: mt7530: fix
tagged frames pass-through in VLAN-unaware mode").
- Traffic injected from the CPU port. The operating system is in god
mode; if it wants a packet to exit as VLAN-tagged, it sends it as
VLAN-tagged. Otherwise it sends it as VLAN-untagged*.
*This is true only if we don't consider the bridge TX forwarding offload
feature, which mt7530 doesn't support.
So for now, make the CPU port always stay in "consistent" mode to allow
software VLANs to be forwarded to their egress ports with the VLAN tag
intact, and not stripped.
Link: https://lore.kernel.org/netdev/trinity-e6294d28-636c-4c40-bb8b-b523521b00be-1674233135062@3c-app-gmx-bs36/
Fixes: e045124e9399 ("net: dsa: mt7530: fix tagged frames pass-through in VLAN-unaware mode")
Reported-by: Frank Wunderlich <[email protected]>
Tested-by: Frank Wunderlich <[email protected]>
Signed-off-by: Vladimir Oltean <[email protected]>
---
drivers/net/dsa/mt7530.c | 26 +++++++++++++++++++-------
1 file changed, 19 insertions(+), 7 deletions(-)
diff --git a/drivers/net/dsa/mt7530.c b/drivers/net/dsa/mt7530.c
index 616b21c90d05..3a15015bc409 100644
--- a/drivers/net/dsa/mt7530.c
+++ b/drivers/net/dsa/mt7530.c
@@ -1302,14 +1302,26 @@ mt7530_port_set_vlan_aware(struct dsa_switch *ds, int port)
if (!priv->ports[port].pvid)
mt7530_rmw(priv, MT7530_PVC_P(port), ACC_FRM_MASK,
MT7530_VLAN_ACC_TAGGED);
- }
- /* Set the port as a user port which is to be able to recognize VID
- * from incoming packets before fetching entry within the VLAN table.
- */
- mt7530_rmw(priv, MT7530_PVC_P(port), VLAN_ATTR_MASK | PVC_EG_TAG_MASK,
- VLAN_ATTR(MT7530_VLAN_USER) |
- PVC_EG_TAG(MT7530_VLAN_EG_DISABLED));
+ /* Set the port as a user port which is to be able to recognize
+ * VID from incoming packets before fetching entry within the
+ * VLAN table.
+ */
+ mt7530_rmw(priv, MT7530_PVC_P(port),
+ VLAN_ATTR_MASK | PVC_EG_TAG_MASK,
+ VLAN_ATTR(MT7530_VLAN_USER) |
+ PVC_EG_TAG(MT7530_VLAN_EG_DISABLED));
+ } else {
+ /* Also set CPU ports to the "user" VLAN port attribute, to
+ * allow VLAN classification, but keep the EG_TAG attribute as
+ * "consistent" (i.o.w. don't change its value) for packets
+ * received by the switch from the CPU, so that tagged packets
+ * are forwarded to user ports as tagged, and untagged as
+ * untagged.
+ */
+ mt7530_rmw(priv, MT7530_PVC_P(port), VLAN_ATTR_MASK,
+ VLAN_ATTR(MT7530_VLAN_USER));
+ }
}
static void
--
2.34.1
On 5.02.2023 17:07, Vladimir Oltean wrote:
> Frank reports that in a mt7530 setup where some ports are standalone and
> some are in a VLAN-aware bridge, 8021q uppers of the standalone ports
> lose their VLAN tag on xmit, as seen by the link partner.
>
> This seems to occur because once the other ports join the VLAN-aware
> bridge, mt7530_port_vlan_filtering() also calls
> mt7530_port_set_vlan_aware(ds, cpu_dp->index), and this affects the way
> that the switch processes the traffic of the standalone port.
>
> Relevant is the PVC_EG_TAG bit. The MT7530 documentation says about it:
>
> EG_TAG: Incoming Port Egress Tag VLAN Attribution
> 0: disabled (system default)
> 1: consistent (keep the original ingress tag attribute)
>
> My interpretation is that this setting applies on the ingress port, and
> "disabled" is basically the normal behavior, where the egress tag format
> of the packet (tagged or untagged) is decided by the VLAN table
> (MT7530_VLAN_EGRESS_UNTAG or MT7530_VLAN_EGRESS_TAG).
>
> But there is also an option of overriding the system default behavior,
> and for the egress tagging format of packets to be decided not by the
> VLAN table, but simply by copying the ingress tag format (if ingress was
> tagged, egress is tagged; if ingress was untagged, egress is untagged;
> aka "consistent). This is useful in 2 scenarios:
>
> - VLAN-unaware bridge ports will always encounter a miss in the VLAN
> table. They should forward a packet as-is, though. So we use
> "consistent" there. See commit e045124e9399 ("net: dsa: mt7530: fix
> tagged frames pass-through in VLAN-unaware mode").
>
> - Traffic injected from the CPU port. The operating system is in god
> mode; if it wants a packet to exit as VLAN-tagged, it sends it as
> VLAN-tagged. Otherwise it sends it as VLAN-untagged*.
>
> *This is true only if we don't consider the bridge TX forwarding offload
> feature, which mt7530 doesn't support.
>
> So for now, make the CPU port always stay in "consistent" mode to allow
> software VLANs to be forwarded to their egress ports with the VLAN tag
> intact, and not stripped.
>
> Link: https://lore.kernel.org/netdev/trinity-e6294d28-636c-4c40-bb8b-b523521b00be-1674233135062@3c-app-gmx-bs36/
> Fixes: e045124e9399 ("net: dsa: mt7530: fix tagged frames pass-through in VLAN-unaware mode")
> Reported-by: Frank Wunderlich <[email protected]>
> Tested-by: Frank Wunderlich <[email protected]>
> Signed-off-by: Vladimir Oltean <[email protected]>
Tested on MT7621AT and MT7623NI boards with MT7530 switch. Both had this
issue and this patch fixes it.
Tested-by: Arınç ÜNAL <[email protected]>
Unrelated to this, as in it existed before this patch, port@0 hasn't
been working at all on my MT7621AT Unielec U7621-06 board and MT7623NI
Bananapi BPI-R2.
Packets are sent out from master eth1 fine, the computer receives them.
Frames are received on eth1 but nothing shows on the DSA slave interface
of port@0. Sounds like malformed frames are received on eth1.
Cheers.
Arınç
Hi Arınç,
On Sun, Feb 05, 2023 at 10:25:27PM +0300, Arınç ÜNAL wrote:
> Unrelated to this, as in it existed before this patch, port@0 hasn't been
> working at all on my MT7621AT Unielec U7621-06 board and MT7623NI Bananapi
> BPI-R2.
>
> Packets are sent out from master eth1 fine, the computer receives them.
> Frames are received on eth1 but nothing shows on the DSA slave interface of
> port@0. Sounds like malformed frames are received on eth1.
I need to ask, how do the packets look like on the RX path of the DSA
master, as seen by tcpdump -i eth1 -e -n -Q in -XX? If they aren't
received, can you post consecutive outputs from ethtool -S eth1 | grep -v ': 0',
to see what (error) counter increments?
Hey Vladimir,
On 5.02.2023 23:39, Vladimir Oltean wrote:
> Hi Arınç,
>
> On Sun, Feb 05, 2023 at 10:25:27PM +0300, Arınç ÜNAL wrote:
>> Unrelated to this, as in it existed before this patch, port@0 hasn't been
>> working at all on my MT7621AT Unielec U7621-06 board and MT7623NI Bananapi
>> BPI-R2.
>>
>> Packets are sent out from master eth1 fine, the computer receives them.
>> Frames are received on eth1 but nothing shows on the DSA slave interface of
>> port@0. Sounds like malformed frames are received on eth1.
>
> I need to ask, how do the packets look like on the RX path of the DSA
> master, as seen by tcpdump -i eth1 -e -n -Q in -XX? If they aren't
> received, can you post consecutive outputs from ethtool -S eth1 | grep -v ': 0',
> to see what (error) counter increments?
I appreciate your effort on this. I've put it in the attachments to
avoid column limit on the Thunderbird mail client. Ping runs on the
device. Packet capture on the other side is attached.
Arınç
On Mon, Feb 06, 2023 at 02:02:48AM +0300, Arınç ÜNAL wrote:
> # ethtool -S eth1 | grep -v ': 0'
> NIC statistics:
> tx_bytes: 6272
> tx_packets: 81
> rx_bytes: 9089
> rx_packets: 136
> p05_TxUnicast: 52
> p05_TxMulticast: 3
> p05_TxBroadcast: 81
> p05_TxPktSz65To127: 136
> p05_TxBytes: 9633
> p05_RxFiltering: 11
> p05_RxUnicast: 11
> p05_RxMulticast: 26
> p05_RxBroadcast: 44
> p05_RxPktSz64: 47
> p05_RxPktSz65To127: 34
> p05_RxBytes: 6272
> # ethtool -S eth1 | grep -v ': 0'
> NIC statistics:
> tx_bytes: 6784
> tx_packets: 89
> rx_bytes: 9601
> rx_packets: 144
> p05_TxUnicast: 60
> p05_TxMulticast: 3
> p05_TxBroadcast: 81
> p05_TxPktSz65To127: 144
> p05_TxBytes: 10177
> p05_RxFiltering: 11
> p05_RxUnicast: 11
> p05_RxMulticast: 26
> p05_RxBroadcast: 52
> p05_RxPktSz64: 55
> p05_RxPktSz65To127: 34
> p05_RxBytes: 6784
> # ethtool -S eth1 | grep -v ': 0'
> NIC statistics:
> tx_bytes: 7424
> tx_packets: 99
> rx_bytes: 10241
> rx_packets: 154
> p05_TxUnicast: 70
> p05_TxMulticast: 3
> p05_TxBroadcast: 81
> p05_TxPktSz65To127: 154
> p05_TxBytes: 10857
> p05_RxFiltering: 11
> p05_RxUnicast: 11
> p05_RxMulticast: 26
> p05_RxBroadcast: 62
> p05_RxPktSz64: 65
> p05_RxPktSz65To127: 34
> p05_RxBytes: 7424
I see no signs of packet loss on the DSA master or the CPU port.
However my analysis of the packets shows:
> # tcpdump -i eth1 -e -n -Q in -XX
> tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
> listening on eth1, link-type NULL (BSD loopback), snapshot length 262144 bytes
> 03:50:38.645568 AF Unknown (2459068999), length 60:
> 0x0000: 9292 6a47 1ac0 e0d5 5ea4 edcc 0806 0001 ..jG....^.......
^ ^ ^
| | |
| | ETH_P_ARP
| MAC SA:
| e0:d5:5e:a4:ed:cc
MAC DA:
92:92:6a:47:1a:c0
> 0x0010: 0800 0604 0002 e0d5 5ea4 edcc c0a8 0202 ........^.......
> 0x0020: 9292 6a47 1ac0 c0a8 0201 0000 0000 0000 ..jG............
> 0x0030: 0000 0000 0000 0000 0000 0000 ............
So you have no tag_mtk header in the EtherType position where it's
supposed to be. This means you must be making use of the hardware DSA
untagging feature that Felix Fietkau added.
Let's do some debugging. I'd like to know 2 things, in this order.
First, whether DSA sees the accelerated header (stripped by hardware, as
opposed to being present in the packet):
diff --git a/net/dsa/tag.c b/net/dsa/tag.c
index b2fba1a003ce..e64628cf7fc1 100644
--- a/net/dsa/tag.c
+++ b/net/dsa/tag.c
@@ -75,12 +75,17 @@ static int dsa_switch_rcv(struct sk_buff *skb, struct net_device *dev,
if (!skb_has_extensions(skb))
skb->slow_gro = 0;
+ netdev_err(dev, "%s: skb %px metadata dst contains port id %d attached\n",
+ __func__, skb, port);
+
skb->dev = dsa_master_find_slave(dev, 0, port);
if (likely(skb->dev)) {
dsa_default_offload_fwd_mark(skb);
nskb = skb;
}
} else {
+ netdev_err(dev, "%s: there is no metadata dst attached to skb 0x%px\n",
+ __func__, skb);
nskb = cpu_dp->rcv(skb, dev);
}
And second, which is what does the DSA master actually see, and put in
the skb metadata dst field:
diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index f1cb1efc94cf..e7ff569959b4 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -2077,11 +2077,23 @@ static int mtk_poll_rx(struct napi_struct *napi, int budget,
if (skb_vlan_tag_present(skb) && netdev_uses_dsa(netdev)) {
unsigned int port = ntohs(skb->vlan_proto) & GENMASK(2, 0);
+ netdev_err(netdev, "%s: skb->vlan_proto 0x%x port %d\n", __func__,
+ ntohs(skb->vlan_proto), port);
+
if (port < ARRAY_SIZE(eth->dsa_meta) &&
- eth->dsa_meta[port])
+ eth->dsa_meta[port]) {
+ netdev_err(netdev, "%s: attaching metadata dst with port %d to skb 0x%px\n",
+ __func__, port, skb);
skb_dst_set_noref(skb, ð->dsa_meta[port]->dst);
+ } else {
+ netdev_err(netdev, "%s: not attaching any metadata dst to skb 0x%px\n",
+ __func__, skb);
+ }
__vlan_hwaccel_clear_tag(skb);
+ } else if (netdev_uses_dsa(netdev)) {
+ netdev_err(netdev, "%s: received skb 0x%px without VLAN/DSA tag present\n",
+ __func__, skb);
}
skb_record_rx_queue(skb, 0);
Be warned that there may be a considerable amount of output to the console,
so it would be best if you used a single switch port with small amounts
of traffic.
Am 6. Februar 2023 00:50:53 MEZ schrieb Vladimir Oltean <[email protected]>:
>On Mon, Feb 06, 2023 at 02:02:48AM +0300, Arınç ÜNAL wrote:
>> # ethtool -S eth1 | grep -v ': 0'
>> NIC statistics:
>> tx_bytes: 6272
>> tx_packets: 81
>> rx_bytes: 9089
>> rx_packets: 136
>> p05_TxUnicast: 52
>> p05_TxMulticast: 3
>> p05_TxBroadcast: 81
>> p05_TxPktSz65To127: 136
>> p05_TxBytes: 9633
>> p05_RxFiltering: 11
>> p05_RxUnicast: 11
>> p05_RxMulticast: 26
>> p05_RxBroadcast: 44
>> p05_RxPktSz64: 47
>> p05_RxPktSz65To127: 34
>> p05_RxBytes: 6272
>> # ethtool -S eth1 | grep -v ': 0'
>> NIC statistics:
>> tx_bytes: 6784
>> tx_packets: 89
>> rx_bytes: 9601
>> rx_packets: 144
>> p05_TxUnicast: 60
>> p05_TxMulticast: 3
>> p05_TxBroadcast: 81
>> p05_TxPktSz65To127: 144
>> p05_TxBytes: 10177
>> p05_RxFiltering: 11
>> p05_RxUnicast: 11
>> p05_RxMulticast: 26
>> p05_RxBroadcast: 52
>> p05_RxPktSz64: 55
>> p05_RxPktSz65To127: 34
>> p05_RxBytes: 6784
>> # ethtool -S eth1 | grep -v ': 0'
>> NIC statistics:
>> tx_bytes: 7424
>> tx_packets: 99
>> rx_bytes: 10241
>> rx_packets: 154
>> p05_TxUnicast: 70
>> p05_TxMulticast: 3
>> p05_TxBroadcast: 81
>> p05_TxPktSz65To127: 154
>> p05_TxBytes: 10857
>> p05_RxFiltering: 11
>> p05_RxUnicast: 11
>> p05_RxMulticast: 26
>> p05_RxBroadcast: 62
>> p05_RxPktSz64: 65
>> p05_RxPktSz65To127: 34
>> p05_RxBytes: 7424
>
>I see no signs of packet loss on the DSA master or the CPU port.
>However my analysis of the packets shows:
>
>> # tcpdump -i eth1 -e -n -Q in -XX
>> tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
>> listening on eth1, link-type NULL (BSD loopback), snapshot length 262144 bytes
>> 03:50:38.645568 AF Unknown (2459068999), length 60:
>> 0x0000: 9292 6a47 1ac0 e0d5 5ea4 edcc 0806 0001 ..jG....^.......
> ^ ^ ^
> | | |
> | | ETH_P_ARP
> | MAC SA:
> | e0:d5:5e:a4:ed:cc
> MAC DA:
> 92:92:6a:47:1a:c0
>
>> 0x0010: 0800 0604 0002 e0d5 5ea4 edcc c0a8 0202 ........^.......
>> 0x0020: 9292 6a47 1ac0 c0a8 0201 0000 0000 0000 ..jG............
>> 0x0030: 0000 0000 0000 0000 0000 0000 ............
>
>So you have no tag_mtk header in the EtherType position where it's
>supposed to be. This means you must be making use of the hardware DSA
>untagging feature that Felix Fietkau added.
>
>Let's do some debugging. I'd like to know 2 things, in this order.
>First, whether DSA sees the accelerated header (stripped by hardware, as
>opposed to being present in the packet):
>
>diff --git a/net/dsa/tag.c b/net/dsa/tag.c
>index b2fba1a003ce..e64628cf7fc1 100644
>--- a/net/dsa/tag.c
>+++ b/net/dsa/tag.c
>@@ -75,12 +75,17 @@ static int dsa_switch_rcv(struct sk_buff *skb, struct net_device *dev,
> if (!skb_has_extensions(skb))
> skb->slow_gro = 0;
>
>+ netdev_err(dev, "%s: skb %px metadata dst contains port id %d attached\n",
>+ __func__, skb, port);
>+
> skb->dev = dsa_master_find_slave(dev, 0, port);
> if (likely(skb->dev)) {
> dsa_default_offload_fwd_mark(skb);
> nskb = skb;
> }
> } else {
>+ netdev_err(dev, "%s: there is no metadata dst attached to skb 0x%px\n",
>+ __func__, skb);
> nskb = cpu_dp->rcv(skb, dev);
> }
>
>
>And second, which is what does the DSA master actually see, and put in
>the skb metadata dst field:
>
>diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
>index f1cb1efc94cf..e7ff569959b4 100644
>--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
>+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
>@@ -2077,11 +2077,23 @@ static int mtk_poll_rx(struct napi_struct *napi, int budget,
> if (skb_vlan_tag_present(skb) && netdev_uses_dsa(netdev)) {
> unsigned int port = ntohs(skb->vlan_proto) & GENMASK(2, 0);
>
>+ netdev_err(netdev, "%s: skb->vlan_proto 0x%x port %d\n", __func__,
>+ ntohs(skb->vlan_proto), port);
>+
> if (port < ARRAY_SIZE(eth->dsa_meta) &&
>- eth->dsa_meta[port])
>+ eth->dsa_meta[port]) {
>+ netdev_err(netdev, "%s: attaching metadata dst with port %d to skb 0x%px\n",
>+ __func__, port, skb);
> skb_dst_set_noref(skb, ð->dsa_meta[port]->dst);
>+ } else {
>+ netdev_err(netdev, "%s: not attaching any metadata dst to skb 0x%px\n",
>+ __func__, skb);
>+ }
>
> __vlan_hwaccel_clear_tag(skb);
>+ } else if (netdev_uses_dsa(netdev)) {
>+ netdev_err(netdev, "%s: received skb 0x%px without VLAN/DSA tag present\n",
>+ __func__, skb);
> }
>
> skb_record_rx_queue(skb, 0);
>
>Be warned that there may be a considerable amount of output to the console,
>so it would be best if you used a single switch port with small amounts
>of traffic.
Arınç have you tested with or without this series?
https://patchwork.kernel.org/project/linux-mediatek/list/?series=707666
Maybe try the opposite.
Have not see it in net-next yet.
regards Frank
On 6.02.2023 10:35, Frank Wunderlich wrote:
> Am 6. Februar 2023 00:50:53 MEZ schrieb Vladimir Oltean <[email protected]>:
>> On Mon, Feb 06, 2023 at 02:02:48AM +0300, Arınç ÜNAL wrote:
>>> # ethtool -S eth1 | grep -v ': 0'
>>> NIC statistics:
>>> tx_bytes: 6272
>>> tx_packets: 81
>>> rx_bytes: 9089
>>> rx_packets: 136
>>> p05_TxUnicast: 52
>>> p05_TxMulticast: 3
>>> p05_TxBroadcast: 81
>>> p05_TxPktSz65To127: 136
>>> p05_TxBytes: 9633
>>> p05_RxFiltering: 11
>>> p05_RxUnicast: 11
>>> p05_RxMulticast: 26
>>> p05_RxBroadcast: 44
>>> p05_RxPktSz64: 47
>>> p05_RxPktSz65To127: 34
>>> p05_RxBytes: 6272
>>> # ethtool -S eth1 | grep -v ': 0'
>>> NIC statistics:
>>> tx_bytes: 6784
>>> tx_packets: 89
>>> rx_bytes: 9601
>>> rx_packets: 144
>>> p05_TxUnicast: 60
>>> p05_TxMulticast: 3
>>> p05_TxBroadcast: 81
>>> p05_TxPktSz65To127: 144
>>> p05_TxBytes: 10177
>>> p05_RxFiltering: 11
>>> p05_RxUnicast: 11
>>> p05_RxMulticast: 26
>>> p05_RxBroadcast: 52
>>> p05_RxPktSz64: 55
>>> p05_RxPktSz65To127: 34
>>> p05_RxBytes: 6784
>>> # ethtool -S eth1 | grep -v ': 0'
>>> NIC statistics:
>>> tx_bytes: 7424
>>> tx_packets: 99
>>> rx_bytes: 10241
>>> rx_packets: 154
>>> p05_TxUnicast: 70
>>> p05_TxMulticast: 3
>>> p05_TxBroadcast: 81
>>> p05_TxPktSz65To127: 154
>>> p05_TxBytes: 10857
>>> p05_RxFiltering: 11
>>> p05_RxUnicast: 11
>>> p05_RxMulticast: 26
>>> p05_RxBroadcast: 62
>>> p05_RxPktSz64: 65
>>> p05_RxPktSz65To127: 34
>>> p05_RxBytes: 7424
>>
>> I see no signs of packet loss on the DSA master or the CPU port.
>> However my analysis of the packets shows:
>>
>>> # tcpdump -i eth1 -e -n -Q in -XX
>>> tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
>>> listening on eth1, link-type NULL (BSD loopback), snapshot length 262144 bytes
>>> 03:50:38.645568 AF Unknown (2459068999), length 60:
>>> 0x0000: 9292 6a47 1ac0 e0d5 5ea4 edcc 0806 0001 ..jG....^.......
>> ^ ^ ^
>> | | |
>> | | ETH_P_ARP
>> | MAC SA:
>> | e0:d5:5e:a4:ed:cc
>> MAC DA:
>> 92:92:6a:47:1a:c0
>>
>>> 0x0010: 0800 0604 0002 e0d5 5ea4 edcc c0a8 0202 ........^.......
>>> 0x0020: 9292 6a47 1ac0 c0a8 0201 0000 0000 0000 ..jG............
>>> 0x0030: 0000 0000 0000 0000 0000 0000 ............
>>
>> So you have no tag_mtk header in the EtherType position where it's
>> supposed to be. This means you must be making use of the hardware DSA
>> untagging feature that Felix Fietkau added.
>>
>> Let's do some debugging. I'd like to know 2 things, in this order.
>> First, whether DSA sees the accelerated header (stripped by hardware, as
>> opposed to being present in the packet):
>>
>> diff --git a/net/dsa/tag.c b/net/dsa/tag.c
>> index b2fba1a003ce..e64628cf7fc1 100644
>> --- a/net/dsa/tag.c
>> +++ b/net/dsa/tag.c
>> @@ -75,12 +75,17 @@ static int dsa_switch_rcv(struct sk_buff *skb, struct net_device *dev,
>> if (!skb_has_extensions(skb))
>> skb->slow_gro = 0;
>>
>> + netdev_err(dev, "%s: skb %px metadata dst contains port id %d attached\n",
>> + __func__, skb, port);
>> +
>> skb->dev = dsa_master_find_slave(dev, 0, port);
>> if (likely(skb->dev)) {
>> dsa_default_offload_fwd_mark(skb);
>> nskb = skb;
>> }
>> } else {
>> + netdev_err(dev, "%s: there is no metadata dst attached to skb 0x%px\n",
>> + __func__, skb);
>> nskb = cpu_dp->rcv(skb, dev);
>> }
>>
>>
>> And second, which is what does the DSA master actually see, and put in
>> the skb metadata dst field:
>>
>> diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
>> index f1cb1efc94cf..e7ff569959b4 100644
>> --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
>> +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
>> @@ -2077,11 +2077,23 @@ static int mtk_poll_rx(struct napi_struct *napi, int budget,
>> if (skb_vlan_tag_present(skb) && netdev_uses_dsa(netdev)) {
>> unsigned int port = ntohs(skb->vlan_proto) & GENMASK(2, 0);
>>
>> + netdev_err(netdev, "%s: skb->vlan_proto 0x%x port %d\n", __func__,
>> + ntohs(skb->vlan_proto), port);
>> +
>> if (port < ARRAY_SIZE(eth->dsa_meta) &&
>> - eth->dsa_meta[port])
>> + eth->dsa_meta[port]) {
>> + netdev_err(netdev, "%s: attaching metadata dst with port %d to skb 0x%px\n",
>> + __func__, port, skb);
>> skb_dst_set_noref(skb, ð->dsa_meta[port]->dst);
>> + } else {
>> + netdev_err(netdev, "%s: not attaching any metadata dst to skb 0x%px\n",
>> + __func__, skb);
>> + }
>>
>> __vlan_hwaccel_clear_tag(skb);
>> + } else if (netdev_uses_dsa(netdev)) {
>> + netdev_err(netdev, "%s: received skb 0x%px without VLAN/DSA tag present\n",
>> + __func__, skb);
>> }
>>
>> skb_record_rx_queue(skb, 0);
>>
>> Be warned that there may be a considerable amount of output to the console,
>> so it would be best if you used a single switch port with small amounts
>> of traffic.
>
> Arınç have you tested with or without this series?
>
> https://patchwork.kernel.org/project/linux-mediatek/list/?series=707666
I'm trying this without it, this is the tree I'm testing.
https://github.com/arinc9/linux/commits/test-for-richard
Arınç
Finally I got time. It's been a seismically active day where I'm from.
On 6.02.2023 02:50, Vladimir Oltean wrote:
> On Mon, Feb 06, 2023 at 02:02:48AM +0300, Arınç ÜNAL wrote:
>> # ethtool -S eth1 | grep -v ': 0'
>> NIC statistics:
>> tx_bytes: 6272
>> tx_packets: 81
>> rx_bytes: 9089
>> rx_packets: 136
>> p05_TxUnicast: 52
>> p05_TxMulticast: 3
>> p05_TxBroadcast: 81
>> p05_TxPktSz65To127: 136
>> p05_TxBytes: 9633
>> p05_RxFiltering: 11
>> p05_RxUnicast: 11
>> p05_RxMulticast: 26
>> p05_RxBroadcast: 44
>> p05_RxPktSz64: 47
>> p05_RxPktSz65To127: 34
>> p05_RxBytes: 6272
>> # ethtool -S eth1 | grep -v ': 0'
>> NIC statistics:
>> tx_bytes: 6784
>> tx_packets: 89
>> rx_bytes: 9601
>> rx_packets: 144
>> p05_TxUnicast: 60
>> p05_TxMulticast: 3
>> p05_TxBroadcast: 81
>> p05_TxPktSz65To127: 144
>> p05_TxBytes: 10177
>> p05_RxFiltering: 11
>> p05_RxUnicast: 11
>> p05_RxMulticast: 26
>> p05_RxBroadcast: 52
>> p05_RxPktSz64: 55
>> p05_RxPktSz65To127: 34
>> p05_RxBytes: 6784
>> # ethtool -S eth1 | grep -v ': 0'
>> NIC statistics:
>> tx_bytes: 7424
>> tx_packets: 99
>> rx_bytes: 10241
>> rx_packets: 154
>> p05_TxUnicast: 70
>> p05_TxMulticast: 3
>> p05_TxBroadcast: 81
>> p05_TxPktSz65To127: 154
>> p05_TxBytes: 10857
>> p05_RxFiltering: 11
>> p05_RxUnicast: 11
>> p05_RxMulticast: 26
>> p05_RxBroadcast: 62
>> p05_RxPktSz64: 65
>> p05_RxPktSz65To127: 34
>> p05_RxBytes: 7424
>
> I see no signs of packet loss on the DSA master or the CPU port.
> However my analysis of the packets shows:
>
>> # tcpdump -i eth1 -e -n -Q in -XX
>> tcpdump: verbose output suppressed, use -v[v]... for full protocol decode
>> listening on eth1, link-type NULL (BSD loopback), snapshot length 262144 bytes
>> 03:50:38.645568 AF Unknown (2459068999), length 60:
>> 0x0000: 9292 6a47 1ac0 e0d5 5ea4 edcc 0806 0001 ..jG....^.......
> ^ ^ ^
> | | |
> | | ETH_P_ARP
> | MAC SA:
> | e0:d5:5e:a4:ed:cc
> MAC DA:
> 92:92:6a:47:1a:c0
>
>> 0x0010: 0800 0604 0002 e0d5 5ea4 edcc c0a8 0202 ........^.......
>> 0x0020: 9292 6a47 1ac0 c0a8 0201 0000 0000 0000 ..jG............
>> 0x0030: 0000 0000 0000 0000 0000 0000 ............
>
> So you have no tag_mtk header in the EtherType position where it's
> supposed to be. This means you must be making use of the hardware DSA
> untagging feature that Felix Fietkau added.
>
> Let's do some debugging. I'd like to know 2 things, in this order.
> First, whether DSA sees the accelerated header (stripped by hardware, as
> opposed to being present in the packet):
>
> diff --git a/net/dsa/tag.c b/net/dsa/tag.c
> index b2fba1a003ce..e64628cf7fc1 100644
> --- a/net/dsa/tag.c
> +++ b/net/dsa/tag.c
> @@ -75,12 +75,17 @@ static int dsa_switch_rcv(struct sk_buff *skb, struct net_device *dev,
> if (!skb_has_extensions(skb))
> skb->slow_gro = 0;
>
> + netdev_err(dev, "%s: skb %px metadata dst contains port id %d attached\n",
> + __func__, skb, port);
> +
> skb->dev = dsa_master_find_slave(dev, 0, port);
> if (likely(skb->dev)) {
> dsa_default_offload_fwd_mark(skb);
> nskb = skb;
> }
> } else {
> + netdev_err(dev, "%s: there is no metadata dst attached to skb 0x%px\n",
> + __func__, skb);
> nskb = cpu_dp->rcv(skb, dev);
> }
>
# ping 192.168.2.2
PING 192.168.2.2[ 39.508013] mtk_soc_eth 1b100000.ethernet eth1:
dsa_switch_rcv: there is no metadata dst attached to skb 0xc2dfecc0
(192.168.2.2): 56 data bytes
[ 40.558253] mtk_soc_eth 1b100000.ethernet eth1: dsa_switch_rcv: there
is no metadata dst attached to skb 0xc2dfed80
^C
--- 192.168.2.2 ping statistics ---
2 packets transmitted, 0 packets received, 100% packet loss
# [ 41.598312] mtk_soc_eth 1b100000.ethernet eth1: dsa_switch_rcv:
there is no metadata dst attached to skb 0xc2dfee40
[ 55.432363] mtk_soc_eth 1b100000.ethernet eth1: dsa_switch_rcv: there
is no metadata dst attached to skb 0xc2dfef00
[ 56.442233] mtk_soc_eth 1b100000.ethernet eth1: dsa_switch_rcv: there
is no metadata dst attached to skb 0xc2dfef00
[ 57.466253] mtk_soc_eth 1b100000.ethernet eth1: dsa_switch_rcv: there
is no metadata dst attached to skb 0xc2dfef00
[ 60.538211] mtk_soc_eth 1b100000.ethernet eth1: dsa_switch_rcv: there
is no metadata dst attached to skb 0xc2dfef00
[ 61.562191] mtk_soc_eth 1b100000.ethernet eth1: dsa_switch_rcv: there
is no metadata dst attached to skb 0xc2dfec00
[ 62.586190] mtk_soc_eth 1b100000.ethernet eth1: dsa_switch_rcv: there
is no metadata dst attached to skb 0xc2dfeb40
On a working port:
[ 113.278462] mt7530 mdio-bus:1f wan: Link is Down
[ 113.283214] br0: port 1(wan) entered disabled state
[ 115.438955] mt7530 mdio-bus:1f lan0: Link is Up - 1Gbps/Full - flow
control off
[ 115.446332] br0: port 2(lan0) entered blocking state
[ 115.451346] br0: port 2(lan0) entered forwarding state
[ 118.007199] mtk_soc_eth 1b100000.ethernet eth1: dsa_switch_rcv: skb
c2dfeb40 metadata dst contains port id 1 attached
[ 118.018209] mtk_soc_eth 1b100000.ethernet eth1: dsa_switch_rcv: skb
c2dfeb40 metadata dst contains port id 1 attached
[ 119.009252] mtk_soc_eth 1b100000.ethernet eth1: dsa_switch_rcv: skb
c2dfed80 metadata dst contains port id 1 attached
[ 120.010470] mtk_soc_eth 1b100000.ethernet eth1: dsa_switch_rcv: skb
c2dfed80 metadata dst contains port id 1 attached
[ 123.038246] mtk_soc_eth 1b100000.ethernet eth1: dsa_switch_rcv: skb
c2dfe900 metadata dst contains port id 1 attached
>
> And second, which is what does the DSA master actually see, and put in
> the skb metadata dst field:
>
> diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> index f1cb1efc94cf..e7ff569959b4 100644
> --- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> +++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
> @@ -2077,11 +2077,23 @@ static int mtk_poll_rx(struct napi_struct *napi, int budget,
> if (skb_vlan_tag_present(skb) && netdev_uses_dsa(netdev)) {
> unsigned int port = ntohs(skb->vlan_proto) & GENMASK(2, 0);
>
> + netdev_err(netdev, "%s: skb->vlan_proto 0x%x port %d\n", __func__,
> + ntohs(skb->vlan_proto), port);
> +
> if (port < ARRAY_SIZE(eth->dsa_meta) &&
> - eth->dsa_meta[port])
> + eth->dsa_meta[port]) {
> + netdev_err(netdev, "%s: attaching metadata dst with port %d to skb 0x%px\n",
> + __func__, port, skb);
> skb_dst_set_noref(skb, ð->dsa_meta[port]->dst);
> + } else {
> + netdev_err(netdev, "%s: not attaching any metadata dst to skb 0x%px\n",
> + __func__, skb);
> + }
>
> __vlan_hwaccel_clear_tag(skb);
> + } else if (netdev_uses_dsa(netdev)) {
> + netdev_err(netdev, "%s: received skb 0x%px without VLAN/DSA tag present\n",
> + __func__, skb);
> }
>
> skb_record_rx_queue(skb, 0);
>
> Be warned that there may be a considerable amount of output to the console,
> so it would be best if you used a single switch port with small amounts
> of traffic.
# ping 192.168.2.2
PING 192.168.2.2[ 22.674182] mtk_soc_eth 1b100000.ethernet eth1:
mtk_poll_rx: received skb 0xc2d67840 without VLAN/DSA tag present
(192.168.2.2): 56 data bytes
[ 23.678336] mtk_soc_eth 1b100000.ethernet eth1: mtk_poll_rx: received
skb 0xc2d67840 without VLAN/DSA tag present
[ 24.718355] mtk_soc_eth 1b100000.ethernet eth1: mtk_poll_rx: received
skb 0xc2d67840 without VLAN/DSA tag present
^C
--- 192.168.2.2 ping statistics ---
4 packets transmitted, 0 packets received, 100% packet loss
# [ 28.757693] mtk_soc_eth 1b100000.ethernet eth1: mtk_poll_rx:
received skb 0xc2d67840 without VLAN/DSA tag present
[ 29.758347] mtk_soc_eth 1b100000.ethernet eth1: mtk_poll_rx: received
skb 0xc2d67840 without VLAN/DSA tag present
[ 30.782404] mtk_soc_eth 1b100000.ethernet eth1: mtk_poll_rx: received
skb 0xc2d67840 without VLAN/DSA tag present
[ 33.854281] mtk_soc_eth 1b100000.ethernet eth1: mtk_poll_rx: received
skb 0xc2d67840 without VLAN/DSA tag present
On a working port:
[ 48.798419] mt7530 mdio-bus:1f wan: Link is Down
[ 48.803166] br0: port 1(wan) entered disabled state
[ 50.958903] mt7530 mdio-bus:1f lan0: Link is Up - 1Gbps/Full - flow
control off
[ 50.966282] br0: port 2(lan0) entered blocking state
[ 50.971300] br0: port 2(lan0) entered forwarding state
[ 54.261846] mtk_soc_eth 1b100000.ethernet eth1: mtk_poll_rx:
skb->vlan_proto 0x1 port 1
[ 54.269905] mtk_soc_eth 1b100000.ethernet eth1: mtk_poll_rx:
attaching metadata dst with port 1 to skb 0xc2d67840
[ 54.280412] mtk_soc_eth 1b100000.ethernet eth1: mtk_poll_rx:
skb->vlan_proto 0x1 port 1
[ 54.288460] mtk_soc_eth 1b100000.ethernet eth1: mtk_poll_rx:
attaching metadata dst with port 1 to skb 0xc2d67840
[ 55.263241] mtk_soc_eth 1b100000.ethernet eth1: mtk_poll_rx:
skb->vlan_proto 0x1 port 1
[ 55.271292] mtk_soc_eth 1b100000.ethernet eth1: mtk_poll_rx:
attaching metadata dst with port 1 to skb 0xc2d67840
[ 59.358317] mtk_soc_eth 1b100000.ethernet eth1: mtk_poll_rx:
skb->vlan_proto 0x1 port 1
[ 59.366361] mtk_soc_eth 1b100000.ethernet eth1: mtk_poll_rx:
attaching metadata dst with port 1 to skb 0xc2d67a80
Arınç
Hi Arınç,
On Mon, Feb 06, 2023 at 07:41:06PM +0300, Arınç ÜNAL wrote:
> Finally I got time. It's been a seismically active day where I'm from.
My deepest condolences to those who experienced tragedies after today's
earthquakes. A lot of people in neighboring countries are horrified
thinking when this will happen to them. Hopefully you aren't living in
Gaziantep or nearby cities.
> # ping 192.168.2.2
> PING 192.168.2.2
> [ 39.508013] mtk_soc_eth 1b100000.ethernet eth1: dsa_switch_rcv: there is no metadata dst attached to skb 0xc2dfecc0
>
> # ping 192.168.2.2
> PING 192.168.2.2
> [ 22.674182] mtk_soc_eth 1b100000.ethernet eth1: mtk_poll_rx: received skb 0xc2d67840 without VLAN/DSA tag present
Thank you so much for testing. Would you mind cleaning everything up and
testing with this patch instead (formatted on top of net-next)?
Even if you need to adapt to your tree, hopefully you get the idea from
the commit message.
From 218025fd0c33a06865e4202c5170bfc17e26cc75 Mon Sep 17 00:00:00 2001
From: Vladimir Oltean <[email protected]>
Date: Mon, 6 Feb 2023 19:03:53 +0200
Subject: [PATCH] net: ethernet: mtk_eth_soc: fix DSA TX tag hwaccel for switch
port 0
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit
Arınç reports that on his MT7621AT Unielec U7621-06 board and MT7623NI
Bananapi BPI-R2, packets received by the CPU over mt7530 switch port 0
(of which this driver acts as the DSA master) are not processed
correctly by software. More precisely, they arrive without a DSA tag
(in packet or in the hwaccel area - skb_metadata_dst()), so DSA cannot
demux them towards the switch's interface for port 0. Traffic from other
ports receives a skb_metadata_dst() with the correct port and is demuxed
properly.
Looking at mtk_poll_rx(), it becomes apparent that this driver uses the
skb vlan hwaccel area:
union {
u32 vlan_all;
struct {
__be16 vlan_proto;
__u16 vlan_tci;
};
};
as a temporary storage for the VLAN hwaccel tag, or the DSA hwaccel tag.
If this is a DSA master it's a DSA hwaccel tag, and finally clears up
the skb VLAN hwaccel header.
I'm guessing that the problem is the (mis)use of API.
skb_vlan_tag_present() looks like this:
#define skb_vlan_tag_present(__skb) (!!(__skb)->vlan_all)
So if both vlan_proto and vlan_tci are zeroes, skb_vlan_tag_present()
returns precisely false. I don't know for sure what is the format of the
DSA hwaccel tag, but I surely know that lowermost 3 bits of vlan_proto
are 0 when receiving from port 0:
unsigned int port = vlan_proto & GENMASK(2, 0);
If the RX descriptor has no other bits set to non-zero values in
RX_DMA_VTAG, then the call to __vlan_hwaccel_put_tag() will not, in
fact, make the subsequent skb_vlan_tag_present() return true, because
it's implemented like this:
static inline void __vlan_hwaccel_put_tag(struct sk_buff *skb,
__be16 vlan_proto, u16 vlan_tci)
{
skb->vlan_proto = vlan_proto;
skb->vlan_tci = vlan_tci;
}
What we need to do to fix this problem (assuming this is the problem) is
to stop using skb->vlan_all as temporary storage for driver affairs, and
just create some local variables that serve the same purpose, but
hopefully better. Instead of calling skb_vlan_tag_present(), let's look
at a boolean has_hwaccel_tag which we set to true when the RX DMA
descriptors have something. Disambiguate based on netdev_uses_dsa()
whether this is a VLAN or DSA hwaccel tag, and only call
__vlan_hwaccel_put_tag() if we're certain it's a VLAN tag.
Link: https://lore.kernel.org/netdev/[email protected]/
Fixes: 2d7605a72906 ("net: ethernet: mtk_eth_soc: enable hardware DSA untagging")
Reported-by: Arınç ÜNAL <[email protected]>
Signed-off-by: Vladimir Oltean <[email protected]>
---
drivers/net/ethernet/mediatek/mtk_eth_soc.c | 24 ++++++++++++---------
1 file changed, 14 insertions(+), 10 deletions(-)
diff --git a/drivers/net/ethernet/mediatek/mtk_eth_soc.c b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
index f1cb1efc94cf..64b575fbe317 100644
--- a/drivers/net/ethernet/mediatek/mtk_eth_soc.c
+++ b/drivers/net/ethernet/mediatek/mtk_eth_soc.c
@@ -1921,7 +1921,9 @@ static int mtk_poll_rx(struct napi_struct *napi, int budget,
while (done < budget) {
unsigned int pktlen, *rxdcsum;
+ bool has_hwaccel_tag = false;
struct net_device *netdev;
+ u16 vlan_proto, vlan_tci;
dma_addr_t dma_addr;
u32 hash, reason;
int mac = 0;
@@ -2061,27 +2063,29 @@ static int mtk_poll_rx(struct napi_struct *napi, int budget,
if (netdev->features & NETIF_F_HW_VLAN_CTAG_RX) {
if (MTK_HAS_CAPS(eth->soc->caps, MTK_NETSYS_V2)) {
- if (trxd.rxd3 & RX_DMA_VTAG_V2)
- __vlan_hwaccel_put_tag(skb,
- htons(RX_DMA_VPID(trxd.rxd4)),
- RX_DMA_VID(trxd.rxd4));
+ if (trxd.rxd3 & RX_DMA_VTAG_V2) {
+ vlan_proto = RX_DMA_VPID(trxd.rxd4);
+ vlan_tci = RX_DMA_VID(trxd.rxd4);
+ has_hwaccel_tag = true;
+ }
} else if (trxd.rxd2 & RX_DMA_VTAG) {
- __vlan_hwaccel_put_tag(skb, htons(RX_DMA_VPID(trxd.rxd3)),
- RX_DMA_VID(trxd.rxd3));
+ vlan_proto = RX_DMA_VPID(trxd.rxd3);
+ vlan_tci = RX_DMA_VID(trxd.rxd3);
+ has_hwaccel_tag = true;
}
}
/* When using VLAN untagging in combination with DSA, the
* hardware treats the MTK special tag as a VLAN and untags it.
*/
- if (skb_vlan_tag_present(skb) && netdev_uses_dsa(netdev)) {
- unsigned int port = ntohs(skb->vlan_proto) & GENMASK(2, 0);
+ if (has_hwaccel_tag && netdev_uses_dsa(netdev)) {
+ unsigned int port = vlan_proto & GENMASK(2, 0);
if (port < ARRAY_SIZE(eth->dsa_meta) &&
eth->dsa_meta[port])
skb_dst_set_noref(skb, ð->dsa_meta[port]->dst);
-
- __vlan_hwaccel_clear_tag(skb);
+ } else if (has_hwaccel_tag) {
+ __vlan_hwaccel_put_tag(skb, htons(vlan_proto), vlan_tci);
}
skb_record_rx_queue(skb, 0);
--
2.34.1
On 2/5/23 06:07, Vladimir Oltean wrote:
> Frank reports that in a mt7530 setup where some ports are standalone and
> some are in a VLAN-aware bridge, 8021q uppers of the standalone ports
> lose their VLAN tag on xmit, as seen by the link partner.
>
> This seems to occur because once the other ports join the VLAN-aware
> bridge, mt7530_port_vlan_filtering() also calls
> mt7530_port_set_vlan_aware(ds, cpu_dp->index), and this affects the way
> that the switch processes the traffic of the standalone port.
>
> Relevant is the PVC_EG_TAG bit. The MT7530 documentation says about it:
>
> EG_TAG: Incoming Port Egress Tag VLAN Attribution
> 0: disabled (system default)
> 1: consistent (keep the original ingress tag attribute)
>
> My interpretation is that this setting applies on the ingress port, and
> "disabled" is basically the normal behavior, where the egress tag format
> of the packet (tagged or untagged) is decided by the VLAN table
> (MT7530_VLAN_EGRESS_UNTAG or MT7530_VLAN_EGRESS_TAG).
>
> But there is also an option of overriding the system default behavior,
> and for the egress tagging format of packets to be decided not by the
> VLAN table, but simply by copying the ingress tag format (if ingress was
> tagged, egress is tagged; if ingress was untagged, egress is untagged;
> aka "consistent). This is useful in 2 scenarios:
>
> - VLAN-unaware bridge ports will always encounter a miss in the VLAN
> table. They should forward a packet as-is, though. So we use
> "consistent" there. See commit e045124e9399 ("net: dsa: mt7530: fix
> tagged frames pass-through in VLAN-unaware mode").
>
> - Traffic injected from the CPU port. The operating system is in god
> mode; if it wants a packet to exit as VLAN-tagged, it sends it as
> VLAN-tagged. Otherwise it sends it as VLAN-untagged*.
>
> *This is true only if we don't consider the bridge TX forwarding offload
> feature, which mt7530 doesn't support.
>
> So for now, make the CPU port always stay in "consistent" mode to allow
> software VLANs to be forwarded to their egress ports with the VLAN tag
> intact, and not stripped.
>
> Link: https://lore.kernel.org/netdev/trinity-e6294d28-636c-4c40-bb8b-b523521b00be-1674233135062@3c-app-gmx-bs36/
> Fixes: e045124e9399 ("net: dsa: mt7530: fix tagged frames pass-through in VLAN-unaware mode")
> Reported-by: Frank Wunderlich <[email protected]>
> Tested-by: Frank Wunderlich <[email protected]>
> Signed-off-by: Vladimir Oltean <[email protected]>
Reviewed-by: Florian Fainelli <[email protected]>
--
Florian
On 6.02.2023 20:46, Vladimir Oltean wrote:
> Hi Arınç,
>
> On Mon, Feb 06, 2023 at 07:41:06PM +0300, Arınç ÜNAL wrote:
>> Finally I got time. It's been a seismically active day where I'm from.
>
> My deepest condolences to those who experienced tragedies after today's
> earthquakes. A lot of people in neighboring countries are horrified
> thinking when this will happen to them. Hopefully you aren't living in
> Gaziantep or nearby cities.
Thank you for asking, we're all fine as a family in İzmir. This region
is on a fault line as well so the same could happen here too like it did
in 2020. Thankfully our apartment is strong.
>
>> # ping 192.168.2.2
>> PING 192.168.2.2
>> [ 39.508013] mtk_soc_eth 1b100000.ethernet eth1: dsa_switch_rcv: there is no metadata dst attached to skb 0xc2dfecc0
>>
>> # ping 192.168.2.2
>> PING 192.168.2.2
>> [ 22.674182] mtk_soc_eth 1b100000.ethernet eth1: mtk_poll_rx: received skb 0xc2d67840 without VLAN/DSA tag present
>
> Thank you so much for testing. Would you mind cleaning everything up and
> testing with this patch instead (formatted on top of net-next)?
> Even if you need to adapt to your tree, hopefully you get the idea from
> the commit message.
Applies cleanly and fixes the issue! You can add my tested-by tag.
Thanks a lot for the ridiculously fast fix Vladimir!
Arınç
One last thing. Specific to MT7530 switch in MT7621 SoC, if port@5 is
the only CPU port defined on the devicetree, frames sent from DSA master
appears malformed on the user port. Packet capture on computer connected
to the user port is attached.
The ARP frames on the pcapng file are received on the DSA master, I
captured them with tcpdump, and put it in the attachments. Then I start
pinging from the DSA master and the malformed frames appear on the
pcapng file.
It'd be great if you could take a look this final issue.
Arınç
On Mon, Feb 06, 2023 at 10:41:47PM +0300, Arınç ÜNAL wrote:
> One last thing. Specific to MT7530 switch in MT7621 SoC, if port@5 is the
> only CPU port defined on the devicetree, frames sent from DSA master appears
> malformed on the user port. Packet capture on computer connected to the user
> port is attached.
>
> The ARP frames on the pcapng file are received on the DSA master, I captured
> them with tcpdump, and put it in the attachments. Then I start pinging from
> the DSA master and the malformed frames appear on the pcapng file.
>
> It'd be great if you could take a look this final issue.
What phy-mode does port@5 use when it doesn't work? What about the DSA master?
On 06/02/2023 23:33, Vladimir Oltean wrote:
> On Mon, Feb 06, 2023 at 10:41:47PM +0300, Arınç ÜNAL wrote:
>> One last thing. Specific to MT7530 switch in MT7621 SoC, if port@5 is the
>> only CPU port defined on the devicetree, frames sent from DSA master appears
>> malformed on the user port. Packet capture on computer connected to the user
>> port is attached.
>>
>> The ARP frames on the pcapng file are received on the DSA master, I captured
>> them with tcpdump, and put it in the attachments. Then I start pinging from
>> the DSA master and the malformed frames appear on the pcapng file.
>>
>> It'd be great if you could take a look this final issue.
>
> What phy-mode does port@5 use when it doesn't work? What about the DSA master?
It's rgmii on port@5 and gmac1.
Arınç
On Mon, Feb 06, 2023 at 11:35:50PM +0300, Arınç ÜNAL wrote:
> On 06/02/2023 23:33, Vladimir Oltean wrote:
> > On Mon, Feb 06, 2023 at 10:41:47PM +0300, Arınç ÜNAL wrote:
> > > One last thing. Specific to MT7530 switch in MT7621 SoC, if port@5 is the
> > > only CPU port defined on the devicetree, frames sent from DSA master appears
> > > malformed on the user port. Packet capture on computer connected to the user
> > > port is attached.
> > >
> > > The ARP frames on the pcapng file are received on the DSA master, I captured
> > > them with tcpdump, and put it in the attachments. Then I start pinging from
> > > the DSA master and the malformed frames appear on the pcapng file.
> > >
> > > It'd be great if you could take a look this final issue.
> >
> > What phy-mode does port@5 use when it doesn't work? What about the DSA master?
>
> It's rgmii on port@5 and gmac1.
What kind of RGMII? Plain "rgmii" on both ends, with no internal delays?
With RGMII, somebody must add a skew to the clock signal relative to the
data signals, so that setup/hold times at the other end are not violated.
Either the transmitter or the receiver can add RGMII delays in each
direction of communication, but someone must do it, and no more than one
entity should do it.
So my question would be: could you retry after replacing phy-mode = "rgmii"
with phy-mode = "rgmii-id" on port@5? And if that doesn't change anything
(unlikely but possible), also try "rgmii-txid" and "rgmii-rxid" in port@5?
Don't change the phy-mode on gmac1.
On 6.02.2023 23:42, Vladimir Oltean wrote:
> On Mon, Feb 06, 2023 at 11:35:50PM +0300, Arınç ÜNAL wrote:
>> On 06/02/2023 23:33, Vladimir Oltean wrote:
>>> On Mon, Feb 06, 2023 at 10:41:47PM +0300, Arınç ÜNAL wrote:
>>>> One last thing. Specific to MT7530 switch in MT7621 SoC, if port@5 is the
>>>> only CPU port defined on the devicetree, frames sent from DSA master appears
>>>> malformed on the user port. Packet capture on computer connected to the user
>>>> port is attached.
>>>>
>>>> The ARP frames on the pcapng file are received on the DSA master, I captured
>>>> them with tcpdump, and put it in the attachments. Then I start pinging from
>>>> the DSA master and the malformed frames appear on the pcapng file.
>>>>
>>>> It'd be great if you could take a look this final issue.
>>>
>>> What phy-mode does port@5 use when it doesn't work? What about the DSA master?
>>
>> It's rgmii on port@5 and gmac1.
>
> What kind of RGMII? Plain "rgmii" on both ends, with no internal delays?
> With RGMII, somebody must add a skew to the clock signal relative to the
> data signals, so that setup/hold times at the other end are not violated.
> Either the transmitter or the receiver can add RGMII delays in each
> direction of communication, but someone must do it, and no more than one
> entity should do it.
>
> So my question would be: could you retry after replacing phy-mode = "rgmii"
> with phy-mode = "rgmii-id" on port@5? And if that doesn't change anything
> (unlikely but possible), also try "rgmii-txid" and "rgmii-rxid" in port@5?
> Don't change the phy-mode on gmac1.
Still getting malformed frames. Packet capture for each phy-mode is
attached. Made sure the phy-mode with:
# cat /proc/device-tree/ethernet@1e100000/mac@1/phy-mode
rgmii
# cat
/proc/device-tree/ethernet@1e100000/mdio-bus/switch@1f/ports/port@5/phy-mode
rgmii-id
# cat
/proc/device-tree/ethernet@1e100000/mdio-bus/switch@1f/ports/port@5/phy-mode
rgmii-txid
# cat
/proc/device-tree/ethernet@1e100000/mdio-bus/switch@1f/ports/port@5/phy-mode
rgmii-rxid
Arınç
Hi,
On Mon, 2023-02-06 at 19:46 +0200, Vladimir Oltean wrote:
> On Mon, Feb 06, 2023 at 07:41:06PM +0300, Arınç ÜNAL wrote:
> > Finally I got time. It's been a seismically active day where I'm from.
>
> My deepest condolences to those who experienced tragedies after today's
> earthquakes. A lot of people in neighboring countries are horrified
> thinking when this will happen to them. Hopefully you aren't living in
> Gaziantep or nearby cities.
>
> > # ping 192.168.2.2
> > PING 192.168.2.2
> > [ 39.508013] mtk_soc_eth 1b100000.ethernet eth1: dsa_switch_rcv: there is no metadata dst attached to skb 0xc2dfecc0
> >
> > # ping 192.168.2.2
> > PING 192.168.2.2
> > [ 22.674182] mtk_soc_eth 1b100000.ethernet eth1: mtk_poll_rx: received skb 0xc2d67840 without VLAN/DSA tag present
>
> Thank you so much for testing. Would you mind cleaning everything up and
> testing with this patch instead (formatted on top of net-next)?
> Even if you need to adapt to your tree, hopefully you get the idea from
> the commit message.
>
> From 218025fd0c33a06865e4202c5170bfc17e26cc75 Mon Sep 17 00:00:00 2001
> From: Vladimir Oltean <[email protected]>
> Date: Mon, 6 Feb 2023 19:03:53 +0200
> Subject: [PATCH] net: ethernet: mtk_eth_soc: fix DSA TX tag hwaccel for switch
> port 0
> MIME-Version: 1.0
> Content-Type: text/plain; charset=UTF-8
> Content-Transfer-Encoding: 8bit
>
> Arınç reports that on his MT7621AT Unielec U7621-06 board and MT7623NI
> Bananapi BPI-R2, packets received by the CPU over mt7530 switch port 0
> (of which this driver acts as the DSA master) are not processed
> correctly by software. More precisely, they arrive without a DSA tag
> (in packet or in the hwaccel area - skb_metadata_dst()), so DSA cannot
> demux them towards the switch's interface for port 0. Traffic from other
> ports receives a skb_metadata_dst() with the correct port and is demuxed
> properly.
>
> Looking at mtk_poll_rx(), it becomes apparent that this driver uses the
> skb vlan hwaccel area:
>
> union {
> u32 vlan_all;
> struct {
> __be16 vlan_proto;
> __u16 vlan_tci;
> };
> };
>
> as a temporary storage for the VLAN hwaccel tag, or the DSA hwaccel tag.
> If this is a DSA master it's a DSA hwaccel tag, and finally clears up
> the skb VLAN hwaccel header.
>
> I'm guessing that the problem is the (mis)use of API.
> skb_vlan_tag_present() looks like this:
>
> #define skb_vlan_tag_present(__skb) (!!(__skb)->vlan_all)
>
> So if both vlan_proto and vlan_tci are zeroes, skb_vlan_tag_present()
> returns precisely false. I don't know for sure what is the format of the
> DSA hwaccel tag, but I surely know that lowermost 3 bits of vlan_proto
> are 0 when receiving from port 0:
>
> unsigned int port = vlan_proto & GENMASK(2, 0);
>
> If the RX descriptor has no other bits set to non-zero values in
> RX_DMA_VTAG, then the call to __vlan_hwaccel_put_tag() will not, in
> fact, make the subsequent skb_vlan_tag_present() return true, because
> it's implemented like this:
>
> static inline void __vlan_hwaccel_put_tag(struct sk_buff *skb,
> __be16 vlan_proto, u16 vlan_tci)
> {
> skb->vlan_proto = vlan_proto;
> skb->vlan_tci = vlan_tci;
> }
>
> What we need to do to fix this problem (assuming this is the problem) is
> to stop using skb->vlan_all as temporary storage for driver affairs, and
> just create some local variables that serve the same purpose, but
> hopefully better. Instead of calling skb_vlan_tag_present(), let's look
> at a boolean has_hwaccel_tag which we set to true when the RX DMA
> descriptors have something. Disambiguate based on netdev_uses_dsa()
> whether this is a VLAN or DSA hwaccel tag, and only call
> __vlan_hwaccel_put_tag() if we're certain it's a VLAN tag.
>
> Link: https://lore.kernel.org/netdev/[email protected]/
> Fixes: 2d7605a72906 ("net: ethernet: mtk_eth_soc: enable hardware DSA untagging")
> Reported-by: Arınç ÜNAL <[email protected]>
> Signed-off-by: Vladimir Oltean <[email protected]>
Thank you Vladimir for the quick turn-around!
For future case, please avoid replying with new patches - tag area
included - to existing patch/thread, as it confuses tag propagation,
thanks!
Paolo
Hello:
This patch was applied to netdev/net.git (master)
by Paolo Abeni <[email protected]>:
On Sun, 5 Feb 2023 16:07:13 +0200 you wrote:
> Frank reports that in a mt7530 setup where some ports are standalone and
> some are in a VLAN-aware bridge, 8021q uppers of the standalone ports
> lose their VLAN tag on xmit, as seen by the link partner.
>
> This seems to occur because once the other ports join the VLAN-aware
> bridge, mt7530_port_vlan_filtering() also calls
> mt7530_port_set_vlan_aware(ds, cpu_dp->index), and this affects the way
> that the switch processes the traffic of the standalone port.
>
> [...]
Here is the summary with links:
- [net] net: dsa: mt7530: don't change PVC_EG_TAG when CPU port becomes VLAN-aware
https://git.kernel.org/netdev/net/c/0b6d6425103a
You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html
Hi Paolo,
On Tue, Feb 07, 2023 at 11:56:13AM +0100, Paolo Abeni wrote:
> Thank you Vladimir for the quick turn-around!
>
> For future case, please avoid replying with new patches - tag area
> included - to existing patch/thread, as it confuses tag propagation,
> thanks!
Ah, yes, I see (and thanks for fixing it up).
Although I need to ask, since I think I made legitimate use of the tools
given to me. What should I have done instead? Post an RFC patch (even
though I didn't know whether it worked or not) in a thread separate to
the debugging session? I didn't want to diverge from the thread reporting
the issue. Maybe we should have started a new thread, decoupled from the
patch?
On Tue, 2023-02-07 at 14:39 +0200, Vladimir Oltean wrote:
> On Tue, Feb 07, 2023 at 11:56:13AM +0100, Paolo Abeni wrote:
> > Thank you Vladimir for the quick turn-around!
> >
> > For future case, please avoid replying with new patches - tag area
> > included - to existing patch/thread, as it confuses tag propagation,
> > thanks!
>
> Ah, yes, I see (and thanks for fixing it up).
>
> Although I need to ask, since I think I made legitimate use of the tools
> given to me. What should I have done instead? Post an RFC patch (even
> though I didn't know whether it worked or not) in a thread separate to
> the debugging session? I didn't want to diverge from the thread reporting
> the issue. Maybe we should have started a new thread, decoupled from the
> patch?
Here what specifically confused the bot were the additional tags
present in the debug patch. One possible alternative would have been
posting - in the same thread - the code of the tentative patch without
the formal commit message/tag area.
That option is quite convenient toome, as writing the changelog takes
me a measurable amount of time and I could spend that effort only when
the patch is finalize/tested.
Please let me know if the above makes sense to you.
Cheers,
Paolo
On Tue, Feb 07, 2023 at 07:07:14PM +0100, Paolo Abeni wrote:
> On Tue, 2023-02-07 at 14:39 +0200, Vladimir Oltean wrote:
> > On Tue, Feb 07, 2023 at 11:56:13AM +0100, Paolo Abeni wrote:
> > > Thank you Vladimir for the quick turn-around!
> > >
> > > For future case, please avoid replying with new patches - tag area
> > > included - to existing patch/thread, as it confuses tag propagation,
> > > thanks!
> >
> > Ah, yes, I see (and thanks for fixing it up).
> >
> > Although I need to ask, since I think I made legitimate use of the tools
> > given to me. What should I have done instead? Post an RFC patch (even
> > though I didn't know whether it worked or not) in a thread separate to
> > the debugging session? I didn't want to diverge from the thread reporting
> > the issue. Maybe we should have started a new thread, decoupled from the
> > patch?
>
> Here what specifically confused the bot were the additional tags
> present in the debug patch. One possible alternative would have been
> posting - in the same thread - the code of the tentative patch without
> the formal commit message/tag area.
>
> That option is quite convenient toome, as writing the changelog takes
> me a measurable amount of time and I could spend that effort only when
> the patch is finalize/tested.
>
> Please let me know if the above makes sense to you.
I think even the Signed-off-by would confuse the patchwork bot, right?
I would have to send just the diff portion, and send the full patch as
an email attachment.
In any case, I'll pay attention to this next time.
Hi,
Can this be applied to next too?
It looks like discussion about different issues in mt7530 driver prevents it picking it up.
regards Frank
On Sat, Feb 11, 2023 at 07:04:14PM +0100, Frank Wunderlich wrote:
> Hi,
>
> Can this be applied to next too?
>
> It looks like discussion about different issues in mt7530 driver prevents it picking it up.
> regards Frank
https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=0b6d6425103a676e2b6a81f3fd35d7ea4f9b90ec
> On 11 Feb 2023, at 19:31, Vladimir Oltean <[email protected]> wrote:
>
> On Sat, Feb 11, 2023 at 07:04:14PM +0100, Frank Wunderlich wrote:
>> Hi,
>>
>> Can this be applied to next too?
>>
>> It looks like discussion about different issues in mt7530 driver prevents it picking it up.
>> regards Frank
>
> https://git.kernel.org/pub/scm/linux/kernel/git/netdev/net-next.git/commit/?id=0b6d6425103a676e2b6a81f3fd35d7ea4f9b90ec
Sorry, that patch is wrong. In the shared tree with Arinc I have a different version. One without any need to set / change anything on the CPU port.
Let me check with Arinc before I send that one.
Richard