Date: Sun, 18 Apr 2021 12:04:51 +0000
To: Magnus Karlsson
From: Alexander Lobakin
Cc: Alexander Lobakin, Xuan Zhuo, Alexei Starovoitov, Daniel Borkmann, Björn Töpel, Magnus Karlsson, Jonathan Lemon, "David S. Miller", Jakub Kicinski, Jesper Dangaard Brouer, John Fastabend, Andrii Nakryiko, Martin KaFai Lau, Song Liu, Yonghong Song, KP Singh, Network Development, bpf, open list
Reply-To: Alexander Lobakin
Subject: Re: [PATCH v2 bpf-next 0/2] xsk: introduce generic almost-zerocopy xmit
Message-ID: <20210418120431.6945-1-alobakin@pm.me>
References: <1618278328.0085247-1-xuanzhuo@linux.alibaba.com>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Magnus Karlsson
Date: Tue, 13 Apr 2021 09:14:02 +0200

Hi!

I've finally got a kinda comfy setup after moving to another country
and can finally continue working on patches and stuff.

> On Tue, Apr 13, 2021 at 3:49 AM Xuan Zhuo wrote:
> >
> > On Mon, 12 Apr 2021 16:13:12 +0200, Magnus Karlsson wrote:
> > > On Wed, Mar 31, 2021 at 2:27 PM Alexander Lobakin wrote:
> > > >
> > > > This series is based on the exceptional generic zerocopy xmit logics
> > > > initially introduced by Xuan Zhuo. It extends it the way that it
> > > > could cover all the sane drivers, not only the ones that are capable
> > > > of xmitting skbs with no linear space.
> > > >
> > > > The first patch is a random while-we-are-here improvement over
> > > > full-copy path, and the second is the main course. See the individual
> > > > commit messages for the details.
> > > >
> > > > The original (full-zerocopy) path is still here and still generally
> > > > faster, but for now it seems like virtio_net will remain the only
> > > > user of it, at least for a considerable period of time.
> > > >
> > > > From v1 [0]:
> > > >  - don't add a whole SMP_CACHE_BYTES because of only two bytes
> > > >    (NET_IP_ALIGN);
> > > >  - switch to zerocopy if the frame is 129 bytes or longer, not 128.
> > > >    128 still fit to kmalloc-512, while a zerocopy skb is always
> > > >    kmalloc-1024 -> can potentially be slower on this frame size.
> > > >
> > > > [0] https://lore.kernel.org/netdev/20210330231528.546284-1-alobakin@pm.me
> > > >
> > > > Alexander Lobakin (2):
> > > >   xsk: speed-up generic full-copy xmit
> > >
> > > I took both your patches for a spin on my machine and for the first
> > > one I do see a small but consistent drop in performance. I thought it
> > > would go the other way, but it does not so let us put this one on the
> > > shelf for now.

This is kinda strange as the solution is pretty straightforward.
But sure, if the performance dropped after this one, it should
not be considered for taking.
I might have a look at it later.

> > > >   xsk: introduce generic almost-zerocopy xmit
> > >
> > > This one wreaked havoc on my machine ;-). The performance dropped with
> > > 75% for packets larger than 128 bytes when the new scheme kicks in.
> > > Checking with perf top, it seems that we spend much more time
> > > executing the sendmsg syscall. Analyzing some more:
> > >
> > > $ sudo bpftrace -e 'kprobe:__sys_sendto { @calls = @calls + 1; }
> > > interval:s:1 {printf("calls/sec: %d\n", @calls); @calls = 0;}'
> > > Attaching 2 probes...
> > > calls/sec: 1539509 with your patch compared to
> > >
> > > calls/sec: 105796 without your patch
> > >
> > > The application spends a lot of more time trying to get the kernel to
> > > send new packets, but the kernel replies with "have not completed the
> > > outstanding ones, so come back later" = EAGAIN. Seems like the
> > > transmission takes longer when the skbs have fragments, but I have not
> > > examined this any further. Did you get a speed-up?
> >
> > Regarding this solution, I actually tested it on my mlx5 network card, but the
> > performance was severely degraded, so I did not continue this solution later. I
> > guess it might have something to do with the physical network card. We can try
> > other network cards.
>
> I tried it on a third card and got a 40% degradation, so let us scrap
> this idea. It should stay optional as it is today as the (software)
> drivers that benefit from this can turn it on explicitly.

Thank you guys a lot for the testing!

I think the main reason is the DMA mapping of one additional frag
(14 bytes of MAC header, which is excessive). It can take a lot of
CPU cycles, especially when the device is behind an IOMMU, and seems
like memcpying is faster here.

Moreover, if Xuan tested it as one of the steps towards his
full-zerocopy and found it to be a bad idea, this should not
go further.
So I'm burying this.

> > links: https://www.spinics.net/lists/netdev/msg710918.html
> >
> > Thanks.
> >
> > >
> > > > net/xdp/xsk.c | 32 ++++++++++++++++++++++----------
> > > > 1 file changed, 22 insertions(+), 10 deletions(-)
> > > >
> > > > --
> > > > Well, this is untested. I currently don't have an access to my setup
> > > > and is bound by moving to another country, but as I don't know for
> > > > sure at the moment when I'll get back to work on the kernel next time,
> > > > I found it worthy to publish this now -- if any further changes will
> > > > be required when I already will be out-of-sight, maybe someone could
> > > > carry on to make a another revision and so on (I'm still here for any
> > > > questions, comments, reviews and improvements till the end of this
> > > > week).
> > > > But this *should* work with all the sane drivers. If a particular
> > > > one won't handle this, it's likely ill. Any tests are highly
> > > > appreciated. Thanks!
> > > > --
> > > > 2.31.1

Thanks,
Al
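
For a sense of scale, the two bpftrace readings quoted in this thread can be compared directly. This is only a minimal Python sketch of that arithmetic: the two calls/sec figures are copied from Magnus's output, the variable names are made up for illustration, and it says nothing about why the rate went up (the EAGAIN explanation is his reading, not something the snippet checks).

```python
# Figures copied from the bpftrace output quoted in the thread (calls/sec).
calls_with_patch = 1_539_509
calls_without_patch = 105_796

# How many times more often the application enters sendto() with the patch.
ratio = calls_with_patch / calls_without_patch
print(f"sendto() rate with patch: {ratio:.2f}x the baseline")
```

Read together with the reported throughput drop, a much higher syscall rate means the application is mostly spinning on retries rather than sending more packets.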