From: Tobias Waldekranz
To: Vladimir Oltean, Marek Behun
Cc: Ansuel Smith, netdev@vger.kernel.org, "David S. Miller",
    Jakub Kicinski, Andrew Lunn, Vivien Didelot, Florian Fainelli,
    Alexei Starovoitov, Daniel Borkmann, Andrii Nakryiko, Eric Dumazet,
    Wei Wang, Cong Wang, Taehee Yoo, Björn Töpel, zhang kai,
    Weilong Chen, Roopa Prabhu, Di Zhu, Francis Laniel,
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC net-next 0/3] Multi-CPU DSA support
In-Reply-To: <20210411185017.3xf7kxzzq2vefpwu@skbuf>
References: <20210410133454.4768-1-ansuelsmth@gmail.com>
    <20210411200135.35fb5985@thinkpad>
    <20210411185017.3xf7kxzzq2vefpwu@skbuf>
Date: Mon, 12 Apr 2021 14:46:11 +0200
Message-ID: <878s5nllgs.fsf@waldekranz.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sun, Apr 11, 2021 at 21:50, Vladimir Oltean wrote:
> On Sun, Apr 11, 2021 at 08:01:35PM +0200, Marek Behun wrote:
>> On Sat, 10 Apr 2021 15:34:46 +0200
>> Ansuel Smith wrote:
>>
>> > Hi,
>> > this is a respin of the Marek series in hope that this time we can
>> > finally make some progress with dsa supporting multi-cpu port.
>> >
>> > This implementation is similar to the Marek series but with some tweaks.
>> > This adds support for multiple-cpu port but leaves the driver the
>> > decision of the type of logic to use about assigning a CPU port to the
>> > various ports. The driver can also provide no preference and the CPU port
>> > is decided in a round-robin way.
>>
>> In the last couple of months I have been giving some thought to this
>> problem, and came up with one important thing: if there are multiple
>> upstream ports, it would make a lot of sense to dynamically reallocate
>> them to each user port, based on which user port is actually used, and
>> at what speed.
>>
>> For example on Turris Omnia we have 2 CPU ports and 5 user ports. All
>> ports support at most 1 Gbps. Round-robin would assign:
>>   CPU port 0 - Port 0
>>   CPU port 1 - Port 1
>>   CPU port 0 - Port 2
>>   CPU port 1 - Port 3
>>   CPU port 0 - Port 4
>>
>> Now suppose that the user plugs ethernet cables only into ports 0 and 2,
>> with 1, 3 and 4 free:
>>   CPU port 0 - Port 0 (plugged)
>>   CPU port 1 - Port 1 (free)
>>   CPU port 0 - Port 2 (plugged)
>>   CPU port 1 - Port 3 (free)
>>   CPU port 0 - Port 4 (free)
>>
>> We end up in a situation where ports 0 and 2 share 1 Gbps bandwidth to
>> CPU, and the second CPU port is not used at all.
>>
>> A mechanism for automatic reassignment of CPU ports would be ideal here.
>>
>> What do you guys think?
>
> The reason why I don't think this is such a great idea is because the
> CPU port assignment is a major reconfiguration step which should at the
> very least be done while the network is down, to avoid races with the
> data path (something which this series does not appear to handle).
> And if you allow the static user-port-to-CPU-port assignment to change
> every time a link goes up/down, I don't think you really want to force
> the network down through the entire switch basically.
>
> So I'd be tempted to say 'tough luck' if all your ports are not up, and
> the ones that are are assigned statically to the same CPU port. It's a
> compromise between flexibility and simplicity, and I would go for
> simplicity here. That's the most you can achieve with static assignment,
> just put the CPU ports in a LAG if you want better dynamic load balancing
> (for details read on below).

I agree. Unless you only have a few really wideband flows, a LAG will
typically do a great job with balancing. This will happen without the
user having to do any configuration at all. It would also perform well
in "router-on-a-stick" setups where the incoming and outgoing port is
the same.

...

> But there is something which is even more interesting about Felix with
> the ocelot-8021q tagger. Since Marek posted his RFC and until Ansuel
> posted the follow-up, things have happened, and now both Felix and the
> Marvell driver support LAG offload via the bonding and/or team drivers.
> At least for Felix, when using the ocelot-8021q tagger, it should be
> possible to put the two CPU ports in a hardware LAG, and the two DSA
> masters in a software LAG, and let the bond/team upper of the DSA
> masters be the CPU port.
>
> I would like us to keep the door open for both alternatives, and to have
> a way to switch between static user-to-CPU port assignment, and LAG.
> I think that if there are multiple 'ethernet = ' phandles present in the
> device tree, DSA should populate a list of valid DSA masters, and then
> call into the driver to allow it to select which master it prefers for
> each user port. This is similar to what Ansuel added with 'port_get_preferred_cpu',
> except that I chose "DSA master" and not "CPU port" for a specific reason.
> For LAG, the DSA master would be bond0.

I do not see why we would go through the trouble of creating a
user-visible bond/team for this. As you detail below, it would mean
jumping through a lot of hoops. I am not sure there is that much we can
use from those drivers.

- We know that the CPU ports are statically up, so there is no "active
  transmit set" to manage, it always consists of all ports.

- The LAG members are statically known at boot time via the DT, so we do
  not need (or want, in fact) any management of that from userspace.

We could just let the drivers set up the LAG internally, and then do the
load-balancing in dsa_slave_xmit or provide a generic helper that the
taggers could use.
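
To make the difference between the two schemes concrete, here is a rough
userspace simulation in Python (not kernel code). It contrasts the static
round-robin mapping from Marek's Turris Omnia example with a per-flow hash
over a fixed set of CPU ports. The names round_robin_assign and lag_pick,
the CRC32 hash and the (source MAC, destination MAC) key are all
assumptions made for illustration; none of them come from the series or
from any driver, and a real helper might hash different fields.

```python
import zlib


def round_robin_assign(user_ports, cpu_ports):
    """Statically map each user port to a CPU port in round-robin order."""
    return {port: cpu_ports[i % len(cpu_ports)]
            for i, port in enumerate(user_ports)}


def lag_pick(cpu_ports, src_mac, dst_mac):
    """Pick a CPU port for one flow by hashing its MAC pair.

    The member set is fixed (all CPU ports are assumed to be up), so the
    only decision is which member carries this flow. CRC32 is a stand-in
    for whatever hash a real implementation would use.
    """
    key = bytes.fromhex(src_mac.replace(":", "")) + \
        bytes.fromhex(dst_mac.replace(":", ""))
    return cpu_ports[zlib.crc32(key) % len(cpu_ports)]


if __name__ == "__main__":
    cpu_ports = [0, 1]
    user_ports = [0, 1, 2, 3, 4]

    # Static case: only user ports 0 and 2 have a cable plugged in.
    mapping = round_robin_assign(user_ports, cpu_ports)
    plugged = [0, 2]
    print("static:", {p: mapping[p] for p in plugged})

    # Hashed case: flows are spread per flow, not per user port.
    src = "02:00:00:00:00:01"
    counts = {c: 0 for c in cpu_ports}
    for i in range(64):
        dst = f"02:00:00:00:00:{i:02x}"
        counts[lag_pick(cpu_ports, src, dst)] += 1
    print("hashed flows per CPU port:", counts)
```

In the static case both plugged user ports land on CPU port 0, exactly as
in the table quoted above. In the hashed case the choice is made per flow,
so the spread depends on how many distinct flows there are, which is why a
handful of wideband flows can still end up on the same CPU port.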