Message-ID: <65f72950-8cfa-132d-f455-06213dae4327@blackwall.org>
Date: Thu, 17 Mar 2022 11:56:27 +0200
Subject: Re: [PATCH v5 net-next 00/15] net: bridge: Multiple Spanning Trees
To: Tobias Waldekranz, davem@davemloft.net, kuba@kernel.org
Cc: Andrew Lunn, Vivien Didelot, Florian Fainelli, Vladimir Oltean,
 Jiri Pirko, Ivan Vecera, Roopa Prabhu, Russell King, Petr Machata,
 Ido Schimmel, Matt Johnston, Cooper Lees, linux-kernel@vger.kernel.org,
 netdev@vger.kernel.org, bridge@lists.linux-foundation.org
References: <20220316150857.2442916-1-tobias@waldekranz.com>
 <610eb6cc-4df4-f0fc-462a-b33145334a12@blackwall.org>
 <87tubwkiw2.fsf@waldekranz.com>
From: Nikolay Aleksandrov
In-Reply-To:
 <87tubwkiw2.fsf@waldekranz.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 17/03/2022 11:50, Tobias Waldekranz wrote:
> On Thu, Mar 17, 2022 at 11:00, Nikolay Aleksandrov wrote:
>> On 16/03/2022 17:08, Tobias Waldekranz wrote:
>>> The bridge has had per-VLAN STP support for a while now, since:
>>>
>>> https://lore.kernel.org/netdev/20200124114022.10883-1-nikolay@cumulusnetworks.com/
>>>
>>> The current implementation has some problems:
>>>
>>> - The mapping from VLAN to STP state is fixed as 1:1, i.e. each VLAN
>>>   is managed independently. This is awkward from an MSTP (802.1Q-2018,
>>>   Clause 13.5) point of view, where the model is that multiple VLANs
>>>   are grouped into MST instances.
>>>
>>>   Because of the way that the standard is written, presumably, this is
>>>   also reflected in hardware implementations. It is not uncommon for a
>>>   switch to support the full 4k range of VIDs, but for the pool of
>>>   MST instances to be much smaller. Some examples:
>>>
>>>   Marvell LinkStreet (mv88e6xxx): 4k VLANs, but only 64 MSTIs
>>>   Marvell Prestera: 4k VLANs, but only 128 MSTIs
>>>   Microchip SparX-5i: 4k VLANs, but only 128 MSTIs
>>>
>>> - By default, the feature is enabled, and there is no way to disable
>>>   it. This makes it hard to add offloading in a backwards compatible
>>>   way, since any underlying switchdevs have no way to refuse the
>>>   function if the hardware does not support it.
>>>
>>> - The port-global STP state has precedence over per-VLAN states.
>>>   In MSTP, as far as I understand it, all VLANs will use the common
>>>   spanning tree (CST) by default - through traffic engineering you can
>>>   then optimize your network to group subsets of VLANs to use
>>>   different trees (MSTI). To my understanding, the way this is
>>>   typically managed in silicon is roughly:
>>>
>>>    Incoming packet:
>>>    .----.----.--------------.----.-------------
>>>    | DA | SA | 802.1Q VID=X | ET | Payload ...
>>>    '----'----'--------------'----'-------------
>>>                     |
>>>                     '->|\     .----------------------------.
>>>                        | +--> | VID | Members | ... | MSTI |
>>>                PVID -->|/     |-----|---------|-----|------|
>>>                               |   1 | 0001001 | ... |    0 |
>>>                               |   2 | 0001010 | ... |   10 |
>>>                               |   3 | 0001100 | ... |   10 |
>>>                               '----------------------------'
>>>                                                         |
>>>                           .-----------------------------'
>>>                           |  .------------------------.
>>>                           '->| MSTI | Fwding | Lrning |
>>>                              |------|--------|--------|
>>>                              |    0 | 111110 | 111110 |
>>>                              |   10 | 110111 | 110111 |
>>>                              '------------------------'
>>>
>>>   What this is trying to show is that the STP state (whether MSTP is
>>>   used, or ye olde STP) is always accessed via the VLAN table. If STP
>>>   is running, all MSTI pointers in that table will reference the same
>>>   index in the STP table - if MSTP is running, some VLANs may point
>>>   to other trees (like in this example).
>>>
>>>   The fact that in the Linux bridge, the global state (think: index 0
>>>   in most hardware implementations) is supposed to override the
>>>   per-VLAN state, is very awkward to offload. In effect, this means
>>>   that when the global state changes to blocking, drivers will have to
>>>   iterate over all MSTIs in use, and alter them all to match. This
>>>   also means that you have to cache whether the hardware state is
>>>   currently tracking the global state or the per-VLAN state. In the
>>>   first case, you also have to cache the per-VLAN state so that you
>>>   can restore it if the global state transitions back to forwarding.
>>>
>>> This series adds a new mst_enable bridge setting (as suggested by Nik)
>>> that can only be changed when no VLANs are configured on the
>>> bridge. Enabling this mode has the following effect:
>>>
>>> - The port-global STP state is used to represent the CST (Common
>>>   Spanning Tree) (1/15)
>>>
>>> - Ingress STP filtering is deferred until the frame's VLAN has been
>>>   resolved (1/15)
>>>
>>> - The preexisting per-VLAN states can no longer be controlled directly
>>>   (1/15). They are instead placed under the MST module's control,
>>>   which is managed using a new netlink interface (described in 3/15)
>>>
>>> - VLANs can be mapped to MSTIs in an arbitrary M:N fashion, using a
>>>   new global VLAN option (2/15)
>>>
>>> Switchdev notifications are added so that a driver can track:
>>> - MST enabled state
>>> - VID to MSTI mappings
>>> - MST port states
>>>
>>> An offloading implementation is provided for mv88e6xxx.
>>>
>>> A proposal for the corresponding iproute2 interface is available here:
>>>
>>> https://github.com/wkz/iproute2/tree/mst
>>>
>>
>> Hi Tobias,
>> One major missing thing is the selftests for this new feature. Do you
>> have a plan to upstream them?
>
> 100% agree. I have an internal test that I plan to adapt to run as a
> kselftest. There's a bootstrapping problem here though. I can't send the
> iproute2 series until the kernel support is merged - and until I know
> how the iproute2 support ends up looking I can't add a kselftest.
>

That's ok. Some people choose to send the iproute2 patches with the set; others
send them separately and add selftests after those are accepted (that's my
personal preference, for the same reasons as above). Personally I don't mind
either way as long as the tests end up materializing. :)
Just in case you've missed it - most of the bridge tests reside in
tools/testing/selftests/net/forwarding.

> Ideally, tools/iproute2 would be a thing in the kernel.
> Then you could send the entire implementation as one series. I'm sure
> that's probably been discussed many times already, but my Google-fu
> fails me.

Cheers,
 Nik