From: Magnus Karlsson
Date: Wed, 12 Apr 2023 15:55:38 +0200
Subject: Re: [PATCH bpf-next v3 1/3] xsk: Support UMEM chunk_size > PAGE_SIZE
To: Toke Høiland-Jørgensen
Cc: Kal Cutter Conley, Maciej Fijalkowski, Björn Töpel, Magnus Karlsson,
    Jonathan Lemon, "David S. Miller", Eric Dumazet, Jakub Kicinski,
    Paolo Abeni, Jonathan Corbet, Alexei Starovoitov, Daniel Borkmann,
    Jesper Dangaard Brouer, John Fastabend, netdev@vger.kernel.org,
    bpf@vger.kernel.org, linux-doc@vger.kernel.org,
    linux-kernel@vger.kernel.org
References: <20230406130205.49996-1-kal.conley@dectris.com>
    <20230406130205.49996-2-kal.conley@dectris.com>
    <87sfdckgaa.fsf@toke.dk> <875ya12phx.fsf@toke.dk>
In-Reply-To: <875ya12phx.fsf@toke.dk>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 12 Apr 2023 at 15:40, Toke Høiland-Jørgensen wrote:
>
> Kal Cutter Conley writes:
>
> >> > > Add core AF_XDP support for chunk sizes larger than PAGE_SIZE. This
> >> > > enables sending/receiving jumbo ethernet frames up to the theoretical
> >> > > maximum of 64 KiB. For chunk sizes > PAGE_SIZE, the UMEM is required
> >> > > to consist of HugeTLB VMAs (and be hugepage aligned). Initially, only
> >> > > SKB mode is usable pending future driver work.
> >> >
> >> > Hmm, interesting. So how does this interact with XDP multibuf?
> >>
> >> To me it currently does not interact with mbuf in any way as it is enabled
> >> only for skb mode which linearizes the skb from what i see.
> >>
> >> I'd like to hear more about Kal's use case - Kal do you use AF_XDP in SKB
> >> mode on your side?
> >
> > Our use-case is to receive jumbo Ethernet frames up to 9000 bytes with
> > AF_XDP in zero-copy mode. This patchset is a step in this direction.
> > At the very least, it lets you test out the feature in SKB mode
> > pending future driver support. Currently, XDP multi-buffer does not
> > support AF_XDP at all. It could support it in theory, but I think it
> > would need some UAPI design work and a bit of implementation work.
> >
> > Also, I think that the approach taken in this patchset has some
> > advantages over XDP multi-buffer:
> > (1) It should be possible to achieve higher performance
> >     (a) because the packet data is kept together
> >     (b) because you need to acquire and validate fewer descriptors
> >         and touch the queue pointers less often.
> > (2) It is a nicer user-space API.
> >     (a) Since the packet data is all available in one linear
> >         buffer. This may even be a requirement to avoid an extra copy
> >         if the data must be handed off contiguously to other code.
> >
> > The disadvantage of this patchset is requiring the user to allocate
> > HugeTLB pages which is an extra complication.
> >
> > I am not sure if this patchset would need to interact with XDP
> > multi-buffer at all directly. Does anyone have anything to add here?
>
> Well, I'm mostly concerned with having two different operation and
> configuration modes for the same thing. We'll probably need to support
> multibuf for AF_XDP anyway for the non-ZC path, which means we'll need
> to create a UAPI for that in any case. And having two APIs is just going
> to be more complexity to handle at both the documentation and
> maintenance level.

One does not replace the other. We need them both, unfortunately.
Multi-buff is great for, e.g., stitching together different headers
with the same data: each packet points to a different buffer for its
header but to the same piece of data. This will never be solved with
Kal's approach. We just need multi-buffer support for this.

BTW, we are close to posting multi-buff support for AF_XDP. Just hang
in there a little while longer while the last glitches are fixed. We
have to stage it in two patch sets as it will be too long otherwise.
The first one will only contain improvements to the xsk selftests
framework so that multi-buffer tests can be supported. The second one
will be the core code and the actual multi-buffer tests.
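
To make the header-stitching case above concrete, here is a minimal
sketch. It assumes a toy model in which a packet is simply an ordered
list of (address, length) descriptors into one UMEM; the names Desc and
gather are hypothetical and do not correspond to any kernel or libxdp
API.

```python
from dataclasses import dataclass
from typing import Sequence


@dataclass(frozen=True)
class Desc:
    """Hypothetical descriptor: one byte range inside the UMEM."""
    addr: int
    length: int


def gather(umem: bytes, frags: Sequence[Desc]) -> bytes:
    """Concatenate the fragments of one packet in order."""
    return b"".join(umem[d.addr:d.addr + d.length] for d in frags)


# Toy UMEM: two distinct headers followed by one shared payload.
hdr_a = b"HDR-A|"
hdr_b = b"HDR-B|"
payload = b"shared payload"
umem = hdr_a + hdr_b + payload

d_hdr_a = Desc(0, len(hdr_a))
d_hdr_b = Desc(len(hdr_a), len(hdr_b))
d_payload = Desc(len(hdr_a) + len(hdr_b), len(payload))

# Each packet has its own header fragment but reuses the same payload
# fragment, so the payload is stored only once in the UMEM.
pkt_a = [d_hdr_a, d_payload]
pkt_b = [d_hdr_b, d_payload]

print(gather(umem, pkt_a))
print(gather(umem, pkt_b))
```

With one linear chunk per packet, the payload would instead have to be
copied behind each header, which is why the two approaches cover
different cases.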
As for what Kal's patches are good for, please see below.

> It *might* be worth it to do this if the performance benefit is really
> compelling, but, well, you'd need to implement both and compare directly
> to know that for sure :)

The performance benefit is compelling. As I wrote in reply to one of
Kal's posts, there are users out there who state that this feature (in
zero-copy mode, nota bene) is a must for them to be able to use AF_XDP
instead of DPDK-style user-mode drivers. They have really tough latency
requirements.

> -Toke
>
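
For a rough feel of the descriptor-count argument quoted earlier in the
thread, the following sketch counts how many chunks a frame would span
for a given chunk size. It assumes a PAGE_SIZE of 4096 bytes and
ignores headroom; the 16384-byte chunk size is only an example value,
and descs_per_frame is a hypothetical helper.

```python
PAGE_SIZE = 4096  # assumed; the real value depends on the architecture


def descs_per_frame(frame_len: int, chunk_size: int) -> int:
    """Number of chunks (descriptors) needed to hold one frame.

    Simplified: headroom and metadata are not accounted for.
    """
    if frame_len <= 0 or chunk_size <= 0:
        raise ValueError("frame_len and chunk_size must be positive")
    return (frame_len + chunk_size - 1) // chunk_size  # ceiling division


# 9000-byte jumbo frame, as in the use case described in the thread.
print(descs_per_frame(9000, PAGE_SIZE))  # page-sized chunks, multi-buffer
print(descs_per_frame(9000, 16384))      # one chunk larger than a page
```

Under these assumptions a 9000-byte frame spans three page-sized chunks
but fits in a single 16 KiB chunk.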