References: <20230808031913.46965-1-huangjie.albert@bytedance.com> <87v8dpbv5r.fsf@toke.dk>
In-Reply-To: <87v8dpbv5r.fsf@toke.dk>
From: 黄杰
Date: Wed, 9 Aug 2023 15:13:47 +0800
Subject: Re: Re: [RFC v3 Optimizing veth xsk performance 0/9]
To: Toke Høiland-Jørgensen
Cc: "David S.
 Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni,
 Alexei Starovoitov, Daniel Borkmann, Jesper Dangaard Brouer,
 John Fastabend, Björn Töpel, Magnus Karlsson, Maciej Fijalkowski,
 Jonathan Lemon, Pavel Begunkov, Yunsheng Lin, Kees Cook, Richard Gobert,
 "open list:NETWORKING DRIVERS", open list,
 "open list:XDP (eXpress Data Path)"
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 8 Aug 2023 at 20:01, Toke Høiland-Jørgensen wrote:
>
> Albert Huang writes:
>
> > AF_XDP is a kernel bypass technology that can greatly improve performance.
> > However, for virtual devices like veth, even with the use of AF_XDP sockets,
> > there are still many additional software paths that consume CPU resources.
> > This patch series focuses on optimizing the performance of AF_XDP sockets
> > for veth virtual devices. Patches 1 to 4 mainly involve preparatory work.
> > Patch 5 introduces tx queue and tx napi for packet transmission, while
> > patch 8 primarily implements batch sending for IPv4 UDP packets, and patch 9
> > adds support for the AF_XDP tx need_wakeup feature. These optimizations
> > significantly reduce the software path and support checksum offload.
> >
> > I tested those features with
> > A typical topology is shown below:
> > client(send):                                        server:(recv)
> > veth<-->veth-peer                                    veth1-peer<--->veth1
> >   1       |                                                  |   7
> >           |2                                                6|
> >           |                                                  |
> >         bridge<------->eth0(mlnx5)- switch -eth1(mlnx5)<--->bridge1
> >                   3                    4                 5
> >              (machine1)                              (machine2)
>
> I definitely applaud the effort to improve the performance of af_xdp
> over veth, this is something we have flagged as in need of improvement
> as well.
>
> However, looking through your patch series, I am less sure that the
> approach you're taking here is the right one.
>
> AFAIU (speaking about the TX side here), the main difference between
> AF_XDP ZC and the regular transmit mode is that in the regular TX mode
> the stack will allocate an skb to hold the frame and push that down the
> stack. Whereas in ZC mode, there's a driver NDO that gets called
> directly, bypassing the skb allocation entirely.
>
> In this series, you're implementing the ZC mode for veth, but the driver
> code ends up allocating an skb anyway. Which seems to be a bit of a
> weird midpoint between the two modes, and adds a lot of complexity to
> the driver that (at least conceptually) is mostly just a
> reimplementation of what the stack does in non-ZC mode (allocate an skb
> and push it through the stack).
>
> So my question is, why not optimise the non-zc path in the stack instead
> of implementing the zc logic for veth? It seems to me that it would be
> quite feasible to apply the same optimisations (bulking, and even GRO)
> to that path and achieve the same benefits, without having to add all
> this complexity to the veth driver?
>
> -Toke
>

Thanks! This is a really good idea. You've reminded me of something I
overlooked. I will look into implementing the approach you've proposed
and test the resulting performance improvement.

Albert
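
For readers following the bulking suggestion above, here is a minimal
user-space sketch in Python. It is only a toy model, not kernel code: the
names TxStats, send_per_packet and send_bulked are hypothetical, and the
batch size of 32 is an arbitrary assumption. It counts how often a simulated
transmit path is entered when frames are pushed one at a time versus in
batches; in both variants one buffer is still allocated per frame, as in the
non-ZC mode described in the thread.

```python
from dataclasses import dataclass


@dataclass
class TxStats:
    """Counters for the toy model (hypothetical, not a kernel structure)."""
    buffer_allocs: int = 0  # one simulated skb-like allocation per frame
    path_entries: int = 0   # times the simulated transmit path is entered


def send_per_packet(frames, stats):
    """Allocate a buffer for each frame and enter the transmit path per frame."""
    for _frame in frames:
        stats.buffer_allocs += 1
        stats.path_entries += 1


def send_bulked(frames, stats, batch_size=32):
    """Allocate a buffer per frame, but enter the transmit path once per batch."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(frames), batch_size):
        batch = frames[start:start + batch_size]
        stats.buffer_allocs += len(batch)
        stats.path_entries += 1


if __name__ == "__main__":
    frames = [bytes(64) for _ in range(100)]  # 100 dummy 64-byte frames

    per_packet = TxStats()
    send_per_packet(frames, per_packet)

    bulked = TxStats()
    send_bulked(frames, bulked, batch_size=32)

    print("per-packet:", per_packet)
    print("bulked:    ", bulked)
```

The model says nothing about real throughput. It only shows that bulking
leaves the per-frame allocation in place while amortising the per-call
overhead across a batch.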