Date: Mon, 12 Apr 2021 23:50:54 +0200
From: Marek Behun
To: Tobias Waldekranz
Cc: Vladimir Oltean, Ansuel Smith, netdev@vger.kernel.org,
    "David S. Miller", Jakub Kicinski, Andrew Lunn, Vivien Didelot,
    Florian Fainelli, Alexei Starovoitov, Daniel Borkmann,
    Andrii Nakryiko, Eric Dumazet, Wei Wang, Cong Wang, Taehee Yoo,
    Björn Töpel, zhang kai, Weilong Chen, Roopa Prabhu, Di Zhu,
    Francis Laniel, linux-kernel@vger.kernel.org
Subject: Re: [PATCH RFC net-next 0/3] Multi-CPU DSA support
Message-ID: <20210412235054.73754df9@thinkpad>
In-Reply-To: <8735vvkxju.fsf@waldekranz.com>
References: <20210410133454.4768-1-ansuelsmth@gmail.com>
    <20210411200135.35fb5985@thinkpad>
    <20210411185017.3xf7kxzzq2vefpwu@skbuf>
    <878s5nllgs.fsf@waldekranz.com>
    <20210412213045.4277a598@thinkpad>
    <8735vvkxju.fsf@waldekranz.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 12 Apr 2021 23:22:45 +0200
Tobias Waldekranz wrote:

> On Mon, Apr 12, 2021 at 21:30, Marek Behun wrote:
> > On Mon, 12 Apr 2021 14:46:11 +0200
> > Tobias Waldekranz wrote:
> >
> >> I agree. Unless you only have a few really wideband flows, a LAG will
> >> typically do a great job with balancing. This will happen without the
> >> user having to do any configuration at all. It would also perform well
> >> in "router-on-a-stick"-setups where the incoming and outgoing port is
> >> the same.
> >
> > TLDR: The problem with LAGs how they are currently implemented is that
> > for Turris Omnia, basically in 1/16 of configurations the traffic would
> > go via one CPU port anyway.
> >
> > One potential problem that I see with using LAGs for aggregating CPU
> > ports on mv88e6xxx is how these switches determine the port for a
> > packet: only the src and dst MAC address is used for the hash that
> > chooses the port.
> >
> > The most common scenario for Turris Omnia, for example, where we have 2
> > CPU ports and 5 user ports, is that into these 5 user ports the user
> > plugs 5 simple devices (no switches, so only one peer MAC address per
> > port). So we have only 5 pairs of src + dst MAC addresses. If we simply
> > fill the LAG table as it is done now, then there is a 2 * 0.5^5 = 1/16
> > chance that all packets would go through one CPU port.
> >
> > In order to have real load balancing in this scenario, we would either
> > have to recompute the LAG mask table depending on the MAC addresses, or
> > rewrite the LAG mask table somewhat randomly periodically. (This could
> > be in theory offloaded onto the Z80 internal CPU for some of the
> > switches of the mv88e6xxx family, but not for Omnia.)
>
> I thought that the option to associate each port netdev with a DSA
> master would only be used on transmit. Are you saying that there is a
> way to configure an mv88e6xxx chip to steer packets to different CPU
> ports depending on the incoming port?
>
> The reason that the traffic is directed towards the CPU is that some
> kind of entry in the ATU says so, and the destination of that entry will
> either be a port vector or a LAG. Of those two, only the LAG will offer
> any kind of balancing. What am I missing?

Via port vectors you can "load balance" by ports only, i.e. input port X
-> transmit via CPU port Y.

When using LAGs, you are load balancing via hash(src MAC | dst MAC)
only. This is better in some ways. But what I am saying is that if the
LAG mask table is static, as it is now implemented in the mv88e6xxx
code, then in many scenarios there is a significant probability of no
load balancing at all.
For Turris Omnia, for example, there is a 6.25% probability that the
switch chip will send all traffic to the CPU via one CPU port. This is
because the switch chooses the LAG port only from the hash of the
dst + src MAC addresses, and in the common case there are only 5 such
pairs. (By the 1/16 = 6.25% probability I mean that for roughly 1 in 16
customers, the switch would use only one port when sending data to the
CPU.)

The round-robin solution is therefore better in this case.
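
For anyone who wants to check the 1/16 figure, here is a minimal sketch
of the arithmetic. It assumes each src + dst MAC pair is hashed to one
of the CPU ports independently and uniformly; that is a simplification
for illustration, not a description of the actual mv88e6xxx hash. The
function names are made up for this example.

```python
from fractions import Fraction
from itertools import product


def single_port_probability(flows: int, cpu_ports: int) -> Fraction:
    """Chance that every flow lands on the same CPU port.

    Assumption: each flow's hash selects a port independently and
    uniformly at random (an idealised model of the LAG mask lookup).
    """
    # Pick which port receives everything (cpu_ports choices), then
    # require all flows to hash to it: cpu_ports * (1/cpu_ports)^flows.
    return cpu_ports * Fraction(1, cpu_ports) ** flows


def single_port_probability_bruteforce(flows: int, cpu_ports: int) -> Fraction:
    """Same quantity, by enumerating every flow -> port assignment."""
    assignments = list(product(range(cpu_ports), repeat=flows))
    # An assignment is "collapsed" when only one distinct port is used.
    collapsed = sum(1 for a in assignments if len(set(a)) == 1)
    return Fraction(collapsed, len(assignments))


if __name__ == "__main__":
    # Turris Omnia case from above: 5 user ports with one peer each,
    # 2 CPU ports.
    p = single_port_probability(flows=5, cpu_ports=2)
    assert p == single_port_probability_bruteforce(5, 2)
    print(p, float(p))  # 1/16 0.0625
```

Under the same assumption, the probability falls quickly as the number
of distinct MAC pairs grows (10 pairs on 2 CPU ports gives 1/512), which
is why the problem mostly affects small setups like the one described.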