2022-08-09 06:36:32

by Sun Shouxin

[permalink] [raw]
Subject: [PATCH v2] net:bonding:support balance-alb interface with vlan to bridge

In my test, balance-alb bonding with two slaves eth0 and eth1,
and then Bond0.150 is created with vlan id attached bond0.
After adding bond0.150 into one linux bridge, I noted that Bond0,
bond0.150 and bridge were assigned to the same MAC as eth0.
Once bond0.150 receives a packet whose dest IP is bridge's
and dest MAC is eth1's, the linux bridge will not match
eth1's MAC entry in FDB, and not handle it as expected.
The patch fix the issue, and diagram as below:

eth1(mac:eth1_mac)--bond0(balance-alb,mac:eth0_mac)--eth0(mac:eth0_mac)
|
bond0.150(mac:eth0_mac)
|
bridge(ip:br_ip, mac:eth0_mac)--other port

Suggested-by: Hu Yadi <[email protected]>
Signed-off-by: Sun Shouxin <[email protected]>
---

changelog:
v1->v2:
-declare variabls in reverse xmas tree order
-delete {}
-add explanation in commit message
---
drivers/net/bonding/bond_alb.c | 7 +++++++
1 file changed, 7 insertions(+)

diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index 007d43e46dcb..60cb9a0225aa 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -653,6 +653,7 @@ static struct slave *rlb_choose_channel(struct sk_buff *skb,
static struct slave *rlb_arp_xmit(struct sk_buff *skb, struct bonding *bond)
{
struct slave *tx_slave = NULL;
+ struct net_device *dev;
struct arp_pkt *arp;

if (!pskb_network_may_pull(skb, sizeof(*arp)))
@@ -665,6 +666,12 @@ static struct slave *rlb_arp_xmit(struct sk_buff *skb, struct bonding *bond)
if (!bond_slave_has_mac_rx(bond, arp->mac_src))
return NULL;

+ dev = ip_dev_find(dev_net(bond->dev), arp->ip_src);
+ if (dev) {
+ if (netif_is_bridge_master(dev))
+ return NULL;
+ }
+
if (arp->op_code == htons(ARPOP_REPLY)) {
/* the arp must be sent on the selected rx channel */
tx_slave = rlb_choose_channel(skb, bond, arp);
--
2.27.0


2022-08-10 12:54:26

by patchwork-bot+netdevbpf

[permalink] [raw]
Subject: Re: [PATCH v2] net:bonding:support balance-alb interface with vlan to bridge

Hello:

This patch was applied to netdev/net.git (master)
by David S. Miller <[email protected]>:

On Mon, 8 Aug 2022 23:21:03 -0700 you wrote:
> In my test, balance-alb bonding with two slaves eth0 and eth1,
> and then Bond0.150 is created with vlan id attached bond0.
> After adding bond0.150 into one linux bridge, I noted that Bond0,
> bond0.150 and bridge were assigned to the same MAC as eth0.
> Once bond0.150 receives a packet whose dest IP is bridge's
> and dest MAC is eth1's, the linux bridge will not match
> eth1's MAC entry in FDB, and not handle it as expected.
> The patch fix the issue, and diagram as below:
>
> [...]

Here is the summary with links:
- [v2] net:bonding:support balance-alb interface with vlan to bridge
https://git.kernel.org/netdev/net/c/d5410ac7b0ba

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html


2022-08-10 15:02:56

by Jay Vosburgh

[permalink] [raw]
Subject: Re: [PATCH v2] net:bonding:support balance-alb interface with vlan to bridge

Sun Shouxin <[email protected]> wrote:

>In my test, balance-alb bonding with two slaves eth0 and eth1,
>and then Bond0.150 is created with vlan id attached bond0.
>After adding bond0.150 into one linux bridge, I noted that Bond0,
>bond0.150 and bridge were assigned to the same MAC as eth0.
>Once bond0.150 receives a packet whose dest IP is bridge's
>and dest MAC is eth1's, the linux bridge will not match
>eth1's MAC entry in FDB, and not handle it as expected.
>The patch fix the issue, and diagram as below:
>
>eth1(mac:eth1_mac)--bond0(balance-alb,mac:eth0_mac)--eth0(mac:eth0_mac)
> |
> bond0.150(mac:eth0_mac)
> |
> bridge(ip:br_ip, mac:eth0_mac)--other port
>
>Suggested-by: Hu Yadi <[email protected]>
>Signed-off-by: Sun Shouxin <[email protected]>

As Nik suggested, please add some additional explanation here.
You can cut and paste my description from the original discussion if
you'd like.

>---
>
>changelog:
>v1->v2:
> -declare variabls in reverse xmas tree order
> -delete {}
> -add explanation in commit message
>---
> drivers/net/bonding/bond_alb.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
>diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
>index 007d43e46dcb..60cb9a0225aa 100644
>--- a/drivers/net/bonding/bond_alb.c
>+++ b/drivers/net/bonding/bond_alb.c
>@@ -653,6 +653,7 @@ static struct slave *rlb_choose_channel(struct sk_buff *skb,
> static struct slave *rlb_arp_xmit(struct sk_buff *skb, struct bonding *bond)
> {
> struct slave *tx_slave = NULL;
>+ struct net_device *dev;
> struct arp_pkt *arp;
>
> if (!pskb_network_may_pull(skb, sizeof(*arp)))
>@@ -665,6 +666,12 @@ static struct slave *rlb_arp_xmit(struct sk_buff *skb, struct bonding *bond)
> if (!bond_slave_has_mac_rx(bond, arp->mac_src))
> return NULL;
>
>+ dev = ip_dev_find(dev_net(bond->dev), arp->ip_src);
>+ if (dev) {
>+ if (netif_is_bridge_master(dev))
>+ return NULL;

Stylistically, the "if dev" and "if netif_is_bridge_master"
could be one line, e.g., "if dev && netif_is_bridge_master".

Functionally, ip_dev_find acquires a reference to dev, and this
code will need to release (dev_put) that reference.

I'm also wondering if testing bond->dev for netif_if_bridge_port
before ip_dev_find would help here (as an optimization); I think so, for
the case where the bond is directly in the bridge without a VLAN in the
middle.

-J


>+ }
>+
> if (arp->op_code == htons(ARPOP_REPLY)) {
> /* the arp must be sent on the selected rx channel */
> tx_slave = rlb_choose_channel(skb, bond, arp);
>--
>2.27.0
>

---
-Jay Vosburgh, [email protected]

2022-08-10 21:18:11

by Jay Vosburgh

[permalink] [raw]
Subject: Re: [PATCH v2] net:bonding:support balance-alb interface with vlan to bridge

[email protected] wrote:

>Hello:
>
>This patch was applied to netdev/net.git (master)
>by David S. Miller <[email protected]>:
>
>On Mon, 8 Aug 2022 23:21:03 -0700 you wrote:
>> In my test, balance-alb bonding with two slaves eth0 and eth1,
>> and then Bond0.150 is created with vlan id attached bond0.
>> After adding bond0.150 into one linux bridge, I noted that Bond0,
>> bond0.150 and bridge were assigned to the same MAC as eth0.
>> Once bond0.150 receives a packet whose dest IP is bridge's
>> and dest MAC is eth1's, the linux bridge will not match
>> eth1's MAC entry in FDB, and not handle it as expected.
>> The patch fix the issue, and diagram as below:
>>
>> [...]
>
>Here is the summary with links:
> - [v2] net:bonding:support balance-alb interface with vlan to bridge
> https://git.kernel.org/netdev/net/c/d5410ac7b0ba
>
>You are awesome, thank you!

There looks to be a reference count leak in the existing patch
(ip_dev_find acquires a reference that is not released). I.e., it needs
something like:

diff --git a/drivers/net/bonding/bond_alb.c b/drivers/net/bonding/bond_alb.c
index 60cb9a0225aa..b9dbad3a8af8 100644
--- a/drivers/net/bonding/bond_alb.c
+++ b/drivers/net/bonding/bond_alb.c
@@ -668,8 +668,11 @@ static struct slave *rlb_arp_xmit(struct sk_buff *skb, struct bonding *bond)

dev = ip_dev_find(dev_net(bond->dev), arp->ip_src);
if (dev) {
- if (netif_is_bridge_master(dev))
+ if (netif_is_bridge_master(dev)) {
+ dev_put(dev);
return NULL;
+ }
+ dev_put(dev);
}

if (arp->op_code == htons(ARPOP_REPLY)) {

I haven't tested this, but it seems correct. Comments?

I'll create a formal submission here in a bit.

-J

---
-Jay Vosburgh, [email protected]