X-Mailing-List: linux-kernel@vger.kernel.org
References: <20240423094117.93206-1-nbd@nbd.name> <63abfa26-d990-46c3-8982-3eaf7b8f8ee5@nbd.name>
In-Reply-To: <63abfa26-d990-46c3-8982-3eaf7b8f8ee5@nbd.name>
From: Eric Dumazet
Date: Tue, 23 Apr 2024 13:17:40 +0200
Subject: Re: [RFC] net: add TCP fraglist GRO support
To: Felix Fietkau
Cc: netdev@vger.kernel.org, "David S.
Miller", Jakub Kicinski, Paolo Abeni, David Ahern, linux-kernel@vger.kernel.org

On Tue, Apr 23, 2024 at 12:25 PM Felix Fietkau wrote:
>
> On 23.04.24 12:15, Eric Dumazet wrote:
> > On Tue, Apr 23, 2024 at 11:41 AM Felix Fietkau wrote:
> >>
> >> When forwarding TCP after GRO, software segmentation is very expensive,
> >> especially when the checksum needs to be recalculated.
> >> One case where that's currently unavoidable is when routing packets over
> >> PPPoE. Performance improves significantly when using fraglist GRO
> >> implemented in the same way as for UDP.
> >>
> >> Here's a measurement of running 2 TCP streams through a MediaTek MT7622
> >> device (2-core Cortex-A53), which runs NAT with flow offload enabled from
> >> one ethernet port to PPPoE on another ethernet port + cake qdisc set to
> >> 1Gbps.
> >>
> >> rx-gro-list off: 630 Mbit/s, CPU 35% idle
> >> rx-gro-list on: 770 Mbit/s, CPU 40% idle
> >
> > Hi Felix
> >
> > changelog is a bit terse, and patch complex.
> >
> > Could you elaborate why this issue
> > seems to be related to a specific driver ?
> >
> > I think we should push hard to not use frag_list in drivers :/
> >
> > And GRO itself could avoid building frag_list skbs
> > in hosts where forwarding is enabled.
> >
> > (Note that we also can increase MAX_SKB_FRAGS to 45 these days)
>
> The issue is not related to a specific driver at all. Here's how traffic
> flows: TCP packets are received on the SoC ethernet driver, the network
> stack performs regular GRO. The packet gets forwarded by flow offloading
> until it reaches the PPPoE device. PPPoE does not support GSO packets,
> so the packets need to be segmented again.
> This is *very* expensive, since data needs to be copied and checksummed.

gso segmentation does not copy the payload, unless the device has no
SG capability.

I guess something should be done about that, regardless of your GRO work,
since most ethernet devices support SG these days.

Some drivers use header split for RX, so forwarding to PPPoE
would require a linearization anyway, if SG is not properly handled.

>
> So in my patch, I changed the code to build fraglist GRO instead of
> regular GRO packets, whenever there is no local socket to receive the
> packets. This makes segmenting very cheap, since the original skbs are
> preserved on the trip through the stack. The only cost is an extra
> socket lookup whenever NETIF_F_FRAGLIST_GRO is enabled.

A socket lookup in multi-net-namespace world is not going to work generically,
but I get the idea now.
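
To make the copy-cost argument in this exchange concrete, here is a minimal sketch in Python. It is a toy model, not kernel code: `Skb`, `gro_regular`, `gro_fraglist` and `segment` are hypothetical names, and the cost model (no payload copy when the device has SG, a full payload copy when it does not, no copy for fraglist packets) is an assumption that simply restates the claims made above. Checksumming cost is not modelled.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class Skb:
    """Toy stand-in for a socket buffer; not the kernel's struct sk_buff."""
    payload: bytes
    frag_list: List["Skb"] = field(default_factory=list)


def gro_regular(segments: List[bytes]) -> Skb:
    # Regular GRO (as modelled here): payloads are merged into one packet.
    return Skb(payload=b"".join(segments))


def gro_fraglist(segments: List[bytes]) -> Skb:
    # Fraglist GRO (as modelled here): the original skbs are kept and
    # chained off the first one.
    skbs = [Skb(payload=s) for s in segments]
    head = skbs[0]
    head.frag_list = skbs[1:]
    return head


def segment(skb: Skb, mss: int, device_has_sg: bool) -> Tuple[List[Skb], int]:
    """Return (segments, payload_bytes_copied) under the toy cost model."""
    if skb.frag_list:
        # Fraglist case: unlink the chained skbs; nothing is copied.
        chained = skb.frag_list
        skb.frag_list = []
        return [skb] + chained, 0
    # Python slicing copies bytes; the counter below models the assumed
    # kernel-side cost, not what Python does internally.
    chunks = [skb.payload[i:i + mss] for i in range(0, len(skb.payload), mss)]
    # Assumption: with scatter-gather the segments reference the existing
    # payload; without it every payload byte has to be copied.
    copied = 0 if device_has_sg else len(skb.payload)
    return [Skb(payload=c) for c in chunks], copied
```

In this model, the only case that pays a per-byte copy is a regular GRO packet sent to a device without SG, which is the situation the reply singles out. The fraglist path avoids it by never merging the payload in the first place.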