From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Vladimir Oltean,
 Jakub Kicinski, Sasha Levin
Subject: [PATCH 5.15 356/846] net: mscc: ocelot: fix incorrect balancing with down LAG ports
Date: Mon, 24 Jan 2022 19:37:53 +0100
Message-Id: <20220124184113.221978902@linuxfoundation.org>
In-Reply-To: <20220124184100.867127425@linuxfoundation.org>
References: <20220124184100.867127425@linuxfoundation.org>

From: Vladimir Oltean

[ Upstream commit a14e6b69f393d651913edcbe4ec0dec27b8b4b40 ]

Assuming the test setup described here:
https://patchwork.kernel.org/project/netdevbpf/cover/20210205130240.4072854-1-vladimir.oltean@nxp.com/
(swp1 and swp2 are in bond0, and bond0 is in a bridge with swp0)

it can be seen that when swp1 goes down (on either board A or B), then
traffic that should go through that port isn't forwarded anywhere. A
dump of the PGID table shows the following:

PGID_DST[0] = ports 0
PGID_DST[1] = ports 1
PGID_DST[2] = ports 2
PGID_DST[3] = ports 3
PGID_DST[4] = ports 4
PGID_DST[5] = ports 5
PGID_DST[6] = no ports

PGID_AGGR[0] = ports 0, 1, 2, 3, 4, 5
PGID_AGGR[1] = ports 0, 1, 2, 3, 4, 5
PGID_AGGR[2] = ports 0, 1, 2, 3, 4, 5
PGID_AGGR[3] = ports 0, 1, 2, 3, 4, 5
PGID_AGGR[4] = ports 0, 1, 2, 3, 4, 5
PGID_AGGR[5] = ports 0, 1, 2, 3, 4, 5
PGID_AGGR[6] = ports 0, 1, 2, 3, 4, 5
PGID_AGGR[7] = ports 0, 1, 2, 3, 4, 5
PGID_AGGR[8] = ports 0, 1, 2, 3, 4, 5
PGID_AGGR[9] = ports 0, 1, 2, 3, 4, 5
PGID_AGGR[10] = ports 0, 1, 2, 3, 4, 5
PGID_AGGR[11] = ports 0, 1, 2, 3, 4, 5
PGID_AGGR[12] = ports 0, 1, 2, 3, 4, 5
PGID_AGGR[13] = ports 0, 1, 2, 3, 4, 5
PGID_AGGR[14] = ports 0, 1, 2, 3, 4, 5
PGID_AGGR[15] = ports 0, 1, 2, 3, 4, 5

PGID_SRC[0] = ports 1, 2
PGID_SRC[1] = ports 0
PGID_SRC[2] = ports 0
PGID_SRC[3] = no ports
PGID_SRC[4] = no ports
PGID_SRC[5] = no ports
PGID_SRC[6] = ports 0, 1, 2, 3, 4, 5

Whereas a "good" PGID configuration for that setup should have looked
like this:

PGID_DST[0] = ports 0
PGID_DST[1] = ports 1, 2
PGID_DST[2] = ports 1, 2
PGID_DST[3] = ports 3
PGID_DST[4] = ports 4
PGID_DST[5] = ports 5
PGID_DST[6] = no ports

PGID_AGGR[0] = ports 0, 2, 3, 4, 5
PGID_AGGR[1] = ports 0, 2, 3, 4, 5
PGID_AGGR[2] = ports 0, 2, 3, 4, 5
PGID_AGGR[3] = ports 0, 2, 3, 4, 5
PGID_AGGR[4] = ports 0, 2, 3, 4, 5
PGID_AGGR[5] = ports 0, 2, 3, 4, 5
PGID_AGGR[6] = ports 0, 2, 3, 4, 5
PGID_AGGR[7] = ports 0, 2, 3, 4, 5
PGID_AGGR[8] = ports 0, 2, 3, 4, 5
PGID_AGGR[9] = ports 0, 2, 3, 4, 5
PGID_AGGR[10] = ports 0, 2, 3, 4, 5
PGID_AGGR[11] = ports 0, 2, 3, 4, 5
PGID_AGGR[12] = ports 0, 2, 3, 4, 5
PGID_AGGR[13] = ports 0, 2, 3, 4, 5
PGID_AGGR[14] = ports 0, 2, 3, 4, 5
PGID_AGGR[15] = ports 0, 2, 3, 4, 5

PGID_SRC[0] = ports 1, 2
PGID_SRC[1] = ports 0
PGID_SRC[2] = ports 0
PGID_SRC[3] = no ports
PGID_SRC[4] = no ports
PGID_SRC[5] = no ports
PGID_SRC[6] = ports 0, 1, 2, 3, 4, 5

In other words, in the "bad" configuration, the attempt is to remove the
inactive swp1 from the destination ports via PGID_DST. But when a MAC
table entry is learned, it is learned towards PGID_DST 1, because that
is the logical port id of the LAG itself (it is equal to the lowest
numbered member port). So when swp1 becomes inactive, if we set
PGID_DST[1] to contain just swp1 and not swp2, the packet will not have
any chance to reach the destination via swp2.

The "correct" way to remove swp1 as a destination is via PGID_AGGR
(remove swp1 from the aggregation port groups for all aggregation
codes). This means that PGID_DST[1] and PGID_DST[2] must still contain
both swp1 and swp2. This makes the MAC table still treat packets
destined towards the single-port LAG as "multicast", and the inactive
ports are removed via the aggregation code tables.

The change presented here is a design one: the ocelot_get_bond_mask()
function used to take an "only_active_ports" argument. We don't need
that. The only call site that specifies only_active_ports=true,
ocelot_set_aggr_pgids(), must retrieve the entire bonding mask, because
it must program that into PGID_DST.
Additionally, it must also clear the inactive ports from the bond mask
here, which it can't do if bond_mask just contains the active ports:

	ac = ocelot_read_rix(ocelot, ANA_PGID_PGID, i);
	ac &= ~bond_mask;  <---- here
	/* Don't do division by zero if there was no active
	 * port. Just make all aggregation codes zero.
	 */
	if (num_active_ports)
		ac |= BIT(aggr_idx[i % num_active_ports]);
	ocelot_write_rix(ocelot, ac, ANA_PGID_PGID, i);

So it becomes the responsibility of ocelot_set_aggr_pgids() to take
ocelot_port->lag_tx_active into consideration when populating the
aggr_idx array.

Fixes: 23ca3b727ee6 ("net: mscc: ocelot: rebalance LAGs on link up/down events")
Signed-off-by: Vladimir Oltean
Link: https://lore.kernel.org/r/20220107164332.402133-1-vladimir.oltean@nxp.com
Signed-off-by: Jakub Kicinski
Signed-off-by: Sasha Levin
---
 drivers/net/ethernet/mscc/ocelot.c | 26 +++++++++++---------------
 1 file changed, 11 insertions(+), 15 deletions(-)

diff --git a/drivers/net/ethernet/mscc/ocelot.c b/drivers/net/ethernet/mscc/ocelot.c
index 00b5e6860bf69..0ee34a0c7683e 100644
--- a/drivers/net/ethernet/mscc/ocelot.c
+++ b/drivers/net/ethernet/mscc/ocelot.c
@@ -1302,8 +1302,7 @@ int ocelot_get_ts_info(struct ocelot *ocelot, int port,
 }
 EXPORT_SYMBOL(ocelot_get_ts_info);
 
-static u32 ocelot_get_bond_mask(struct ocelot *ocelot, struct net_device *bond,
-				bool only_active_ports)
+static u32 ocelot_get_bond_mask(struct ocelot *ocelot, struct net_device *bond)
 {
 	u32 mask = 0;
 	int port;
@@ -1314,12 +1313,8 @@ static u32 ocelot_get_bond_mask(struct ocelot *ocelot, struct net_device *bond,
 		if (!ocelot_port)
 			continue;
 
-		if (ocelot_port->bond == bond) {
-			if (only_active_ports && !ocelot_port->lag_tx_active)
-				continue;
-
+		if (ocelot_port->bond == bond)
 			mask |= BIT(port);
-		}
 	}
 
 	return mask;
@@ -1406,10 +1401,8 @@ void ocelot_apply_bridge_fwd_mask(struct ocelot *ocelot)
 			mask = ocelot_get_bridge_fwd_mask(ocelot, port, bridge);
 			mask |= cpu_fwd_mask;
 			mask &= ~BIT(port);
-			if (bond) {
-				mask &= ~ocelot_get_bond_mask(ocelot, bond,
-							      false);
-			}
+			if (bond)
+				mask &= ~ocelot_get_bond_mask(ocelot, bond);
 		} else {
 			/* Standalone ports forward only to DSA tag_8021q CPU
 			 * ports (if those exist), or to the hardware CPU port
@@ -1727,13 +1720,17 @@ static void ocelot_set_aggr_pgids(struct ocelot *ocelot)
 		if (!bond || (visited & BIT(lag)))
 			continue;
 
-		bond_mask = ocelot_get_bond_mask(ocelot, bond, true);
+		bond_mask = ocelot_get_bond_mask(ocelot, bond);
 
 		for_each_set_bit(port, &bond_mask, ocelot->num_phys_ports) {
+			struct ocelot_port *ocelot_port = ocelot->ports[port];
+
 			// Destination mask
 			ocelot_write_rix(ocelot, bond_mask,
 					 ANA_PGID_PGID, port);
-			aggr_idx[num_active_ports++] = port;
+
+			if (ocelot_port->lag_tx_active)
+				aggr_idx[num_active_ports++] = port;
 		}
 
 		for_each_aggr_pgid(ocelot, i) {
@@ -1782,8 +1779,7 @@ static void ocelot_setup_logical_port_ids(struct ocelot *ocelot)
 
 		bond = ocelot_port->bond;
 		if (bond) {
-			int lag = __ffs(ocelot_get_bond_mask(ocelot, bond,
-							     false));
+			int lag = __ffs(ocelot_get_bond_mask(ocelot, bond));
 
 			ocelot_rmw_gix(ocelot,
 				       ANA_PORT_PORT_CFG_PORTID_VAL(lag),
-- 
2.34.1
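
For readers who want to check the PGID reasoning from the commit message
without hardware, here is a small userspace sketch. It is not driver code.
It assumes a simplified model in which a frame may only egress on ports
that are set in the destination, aggregation and source masks at the same
time. The names fwd_mask() and build_aggr_pgids(), and their parameters,
are hypothetical helpers made up for this illustration; only the masking
logic inside the aggregation loop follows the snippet quoted above.

```c
#include <assert.h>
#include <stdint.h>

#define BIT(n)		(1u << (n))
#define NUM_AGGR_CODES	16

/* Assumed simplified model: a frame can egress only on ports present in
 * the destination, aggregation and source masks at the same time.
 */
uint32_t fwd_mask(uint32_t pgid_dst, uint32_t pgid_aggr, uint32_t pgid_src)
{
	return pgid_dst & pgid_aggr & pgid_src;
}

/* Hypothetical helper: fill the aggregation masks for one bond. Every
 * bond member is cleared first, then one active member is added back
 * per aggregation code. All mask arguments are bitmasks of port numbers.
 */
void build_aggr_pgids(uint32_t all_ports, uint32_t bond_mask,
		      uint32_t active_mask, int num_ports,
		      uint32_t aggr[NUM_AGGR_CODES])
{
	int aggr_idx[32];
	int num_active_ports = 0;
	int port, i;

	assert(num_ports >= 0 && num_ports <= 32);

	/* Only bond members that can transmit get an aggregation slot */
	for (port = 0; port < num_ports; port++) {
		if ((bond_mask & BIT(port)) && (active_mask & BIT(port)))
			aggr_idx[num_active_ports++] = port;
	}

	for (i = 0; i < NUM_AGGR_CODES; i++) {
		uint32_t ac = all_ports & ~bond_mask;

		/* Avoid dividing by zero when no member is active */
		if (num_active_ports)
			ac |= BIT(aggr_idx[i % num_active_ports]);

		aggr[i] = ac;
	}
}
```

With ports 0-5, a bond of ports 1 and 2, and only port 2 active, every
aggregation mask comes out as 0x3d (ports 0, 2, 3, 4, 5), matching the
"good" dump. Under the assumed model, a frame from swp0 learned towards
the LAG then resolves to fwd_mask(0x6, 0x3d, 0x6) = 0x4, that is swp2,
whereas the "bad" tables give fwd_mask(0x2, 0x3f, 0x6) = 0x2, the down
swp1.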