From: Tobias Waldekranz
To: Nikolay Aleksandrov, davem@davemloft.net, kuba@kernel.org
Cc: Andrew Lunn, Vivien Didelot, Florian Fainelli, Vladimir Oltean,
    Jiri Pirko, Ivan Vecera, Roopa Prabhu, Russell King, Petr Machata,
    Ido Schimmel, Matt Johnston, Cooper Lees,
    linux-kernel@vger.kernel.org, netdev@vger.kernel.org,
    bridge@lists.linux-foundation.org
Subject: Re: [PATCH v5 net-next 00/15] net: bridge: Multiple Spanning Trees
In-Reply-To: <610eb6cc-4df4-f0fc-462a-b33145334a12@blackwall.org>
References: <20220316150857.2442916-1-tobias@waldekranz.com>
    <610eb6cc-4df4-f0fc-462a-b33145334a12@blackwall.org>
Date: Thu, 17 Mar 2022 10:50:21 +0100
Message-ID: <87tubwkiw2.fsf@waldekranz.com>

On Thu, Mar 17, 2022 at 11:00, Nikolay Aleksandrov wrote:
> On 16/03/2022 17:08, Tobias Waldekranz wrote:
>> The bridge has had per-VLAN STP support for a while now, since:
>>
>> https://lore.kernel.org/netdev/20200124114022.10883-1-nikolay@cumulusnetworks.com/
>>
>> The current implementation has some problems:
>>
>> - The mapping from VLAN to STP state is fixed as 1:1, i.e. each VLAN
>>   is managed independently. This is awkward from an MSTP (802.1Q-2018,
>>   Clause 13.5) point of view, where the model is that multiple VLANs
>>   are grouped into MST instances.
>>
>>   Because of the way that the standard is written, presumably, this is
>>   also reflected in hardware implementations. It is not uncommon for a
>>   switch to support the full 4k range of VIDs, but that the pool of
>>   MST instances is much smaller. Some examples:
>>
>>   Marvell LinkStreet (mv88e6xxx): 4k VLANs, but only 64 MSTIs
>>   Marvell Prestera: 4k VLANs, but only 128 MSTIs
>>   Microchip SparX-5i: 4k VLANs, but only 128 MSTIs
>>
>> - By default, the feature is enabled, and there is no way to disable
>>   it. This makes it hard to add offloading in a backwards compatible
>>   way, since any underlying switchdevs have no way to refuse the
>>   function if the hardware does not support it
>>
>> - The port-global STP state has precedence over per-VLAN states. In
>>   MSTP, as far as I understand it, all VLANs will use the common
>>   spanning tree (CST) by default - through traffic engineering you can
>>   then optimize your network to group subsets of VLANs to use
>>   different trees (MSTI). To my understanding, the way this is
>>   typically managed in silicon is roughly:
>>
>>   Incoming packet:
>>   .----.----.--------------.----.-------------
>>   | DA | SA | 802.1Q VID=X | ET | Payload ...
>>   '----'----'--------------'----'-------------
>>                     |
>>                     '->|\     .----------------------------.
>>                        | +--> | VID | Members | ... | MSTI |
>>                PVID -->|/     |-----|---------|-----|------|
>>                               |   1 | 0001001 | ... |    0 |
>>                               |   2 | 0001010 | ... |   10 |
>>                               |   3 | 0001100 | ... |   10 |
>>                               '----------------------------'
>>                                                          |
>>                            .-----------------------------'
>>                            |  .------------------------.
>>                            '->| MSTI | Fwding | Lrning |
>>                               |------|--------|--------|
>>                               |    0 | 111110 | 111110 |
>>                               |   10 | 110111 | 110111 |
>>                               '------------------------'
>>
>>   What this is trying to show is that the STP state (whether MSTP is
>>   used, or ye olde STP) is always accessed via the VLAN table. If STP
>>   is running, all MSTI pointers in that table will reference the same
>>   index in the STP table - if MSTP is running, some VLANs may point
>>   to other trees (like in this example).
>>
>>   The fact that in the Linux bridge, the global state (think: index 0
>>   in most hardware implementations) is supposed to override the
>>   per-VLAN state, is very awkward to offload. In effect, this means
>>   that when the global state changes to blocking, drivers will have to
>>   iterate over all MSTIs in use, and alter them all to match. This
>>   also means that you have to cache whether the hardware state is
>>   currently tracking the global state or the per-VLAN state. In the
>>   first case, you also have to cache the per-VLAN state so that you
>>   can restore it if the global state transitions back to forwarding.
>>
>> This series adds a new mst_enable bridge setting (as suggested by Nik)
>> that can only be changed when no VLANs are configured on the
>> bridge. Enabling this mode has the following effect:
>>
>> - The port-global STP state is used to represent the CST (Common
>>   Spanning Tree) (1/15)
>>
>> - Ingress STP filtering is deferred until the frame's VLAN has been
>>   resolved (1/15)
>>
>> - The preexisting per-VLAN states can no longer be controlled directly
>>   (1/15).
>>   They are instead placed under the MST module's control,
>>   which is managed using a new netlink interface (described in 3/15)
>>
>> - VLANs can be mapped to MSTIs in an arbitrary M:N fashion, using a
>>   new global VLAN option (2/15)
>>
>> Switchdev notifications are added so that a driver can track:
>> - MST enabled state
>> - VID to MSTI mappings
>> - MST port states
>>
>> An offloading implementation is also provided for mv88e6xxx.
>>
>> A proposal for the corresponding iproute2 interface is available here:
>>
>> https://github.com/wkz/iproute2/tree/mst
>>
>
> Hi Tobias,
> One major missing thing is the selftests for this new feature. Do you
> have a plan to upstream them?

100% agree. I have an internal test that I plan to adapt to run as a
kselftest. There's a bootstrapping problem here though. I can't send the
iproute2 series until the kernel support is merged - and until I know
how the iproute2 support ends up looking I can't add a kselftest.

Ideally, tools/iproute2 would be a thing in the kernel. Then you could
send the entire implementation as one series. I'm sure that's probably
been discussed many times already, but my Google-fu fails me.
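
For anyone following along, the two-stage lookup from the quoted diagram
(VID -> MSTI via the VLAN table, then MSTI -> port state) can be modeled
roughly as below. This is only a Python sketch of the idea, not bridge or
driver code. The table contents are copied from the diagram; the names
(VLAN_TABLE, MSTI_TABLE, resolve_vid, may_forward), the PVID of 1 and the
"bit N is port N" convention are assumptions made for illustration.

```python
# Illustrative model of the lookup in the diagram; not kernel code.
# All names below are hypothetical.

# VLAN table: VID -> (member port mask, MSTI). Values from the diagram.
VLAN_TABLE = {
    1: (0b0001001, 0),
    2: (0b0001010, 10),
    3: (0b0001100, 10),
}

# MSTI table: MSTI -> (forwarding port mask, learning port mask).
MSTI_TABLE = {
    0: (0b111110, 0b111110),
    10: (0b110111, 0b110111),
}

PVID = 1  # assumed port VLAN ID, used for untagged frames


def resolve_vid(tag_vid):
    """Pick the frame's VID: the 802.1Q tag if present, else the PVID."""
    return PVID if tag_vid is None else tag_vid


def may_forward(port, tag_vid=None):
    """Return True if `port` is a VLAN member and forwarding in its MSTI.

    Assumption: bit N of each mask corresponds to port N.
    """
    vid = resolve_vid(tag_vid)
    if vid not in VLAN_TABLE:
        return False
    members, msti = VLAN_TABLE[vid]
    forwarding, _learning = MSTI_TABLE[msti]
    bit = 1 << port
    return bool(members & bit) and bool(forwarding & bit)
```

With these tables, port 3 forwards in VLAN 1 (MSTI 0) but is blocked in
both VLAN 2 and VLAN 3, since they share MSTI 10. The STP state is only
ever reached through the VLAN table, which is the point of the diagram.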