From: Song Liu
Date: Sun, 10 Dec 2023 17:40:52 -0800
Subject: Re: [RFC PATCH v2 1/7] bpf: Introduce BPF_PROG_TYPE_VNET_HASH
To: Akihiko Odaki
Cc: Alexei Starovoitov, Jason Wang, Alexei Starovoitov, Daniel Borkmann,
    Andrii Nakryiko, Martin KaFai Lau, Yonghong Song, John Fastabend,
    KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Jonathan Corbet,
    Willem de Bruijn, "David S. Miller", Eric Dumazet, Jakub Kicinski,
    Paolo Abeni, "Michael S. Tsirkin", Xuan Zhuo, Mykola Lysenko,
    Shuah Khan, bpf, "open list:DOCUMENTATION", LKML, Network Development,
    kvm@vger.kernel.org, virtualization@lists.linux-foundation.org,
    "open list:KERNEL SELFTEST FRAMEWORK", Yuri Benditovich,
    Andrew Melnychenko
References: <20231015141644.260646-1-akihiko.odaki@daynix.com>
    <20231015141644.260646-2-akihiko.odaki@daynix.com>
    <2594bb24-74dc-4785-b46d-e1bffcc3e7ed@daynix.com>
    <9a4853ad-5ef4-4b15-a49e-9edb5ae4468e@daynix.com>
    <6253fb6b-9a53-484a-9be5-8facd46c051e@daynix.com>
    <664003d3-aadb-4938-80f6-67fab1c9dcdd@daynix.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Dec 9, 2023 at 11:03 PM Akihiko Odaki wrote:
>
> On 2023/11/22 14:36, Akihiko Odaki wrote:
> > On 2023/11/22 14:25, Song Liu wrote:
[...]
>
> Now the discussion is stale again so let me summarize the discussion:
>
> A tuntap device can have an eBPF steering program to let the userspace
> decide which tuntap queue should be used for each packet. QEMU uses this
> feature to implement the RSS algorithm for virtio-net emulation. Now,
> the virtio specification has a new feature to report hash values
> calculated with the RSS algorithm. The goal of this RFC is to report
> such hash values from the eBPF steering program to the userspace.
>
> There are currently three ideas to implement the proposal:
>
> 1. Abandon eBPF steering program and implement RSS in the kernel.
>
> It is possible to implement the RSS algorithm in the kernel as it's
> strictly defined in the specification. However, there are proposals for
> relevant virtio specification changes, and abandoning the eBPF steering
> program will lose the ability to implement those changes in the
> userspace.
> There are concerns that this will lead to more UAPI changes in the
> end.
>
> 2. Add BPF kfuncs.
>
> Adding BPF kfuncs is *the* standard way to add BPF interfaces. hid-bpf
> is a good reference for this.
>
> The problem with BPF kfuncs is that kfuncs are not considered as stable
> as UAPI. In my understanding, it is not problematic for things like
> hid-bpf because programs using those kfuncs affect the entire system
> state and are expected to be centrally managed. Such BPF programs can be
> updated along with the kernel in a manner similar to kernel modules.
>
> The use case of tuntap steering/hash reporting is somewhat different
> though; the eBPF program is more like a part of an application (QEMU or
> potentially other VMM) and thus needs to be portable. For example, a
> user may expect a Debian container with QEMU installed to work on Fedora.
>
> BPF kfuncs do still provide some level of stability, but there is no
> documentation that tells how stable they are. The worst case scenario I
> can imagine is that a future legitimate BPF change breaks QEMU, letting
> the "no regressions" rule force the change to be reverted. Some
> assurance that that kind of scenario will not happen is necessary in my
> opinion.

I don't think we can provide stability guarantees before seeing something
being used in the field. How do we know it will be useful forever? If, a
couple of years later, there is only one person using it somewhere in the
world, why should we keep supporting it? If there are millions of virtual
machines using it, why would you worry about it being removed?

>
> 3. Add BPF program type derived from the conventional steering program type
>
> In principle, it's just to add a feature to report four more bytes to
> the conventional steering program. However, BPF program types are frozen
> for feature additions and the proposed change will break the feature freeze.
>
> So what's next?
> I'm inclined toward option 3 due to its minimal ABI/API
> change, but I'm also fine with option 2 if it is possible to guarantee
> the ABI/API stability necessary to run pre-built QEMUs on future kernel
> versions, e.g., by explicitly stating the stability of kfuncs. If no
> objection arises, I'll resend this series with the RFC prefix dropped
> for upstream inclusion. If it's decided to go for option 1 or 2, I'll
> post a new version of the series implementing the idea.

Probably a dumb question, but does this RFC fall into option 3? If that's
the case, I seriously don't think it's gonna happen.

I would recommend you give option 2 a try and share the code. This is
probably the best way to move the discussion forward.

Thanks,
Song
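
For readers unfamiliar with the hash under discussion: the RSS algorithm
that the virtio specification relies on is a Toeplitz hash over selected
packet header fields. The following is a minimal userspace sketch in C. It
is not code from this series or from QEMU's steering program. The names
toeplitz_hash and rss_key are hypothetical, and the key is assumed to be
the sample key from Microsoft's RSS hash verification documentation; a
real device uses whatever key the guest configures.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumption: sample 40-byte key from Microsoft's RSS verification docs. */
static const uint8_t rss_key[40] = {
    0x6d, 0x5a, 0x56, 0xda, 0x25, 0x5b, 0x0e, 0xc2,
    0x41, 0x67, 0x25, 0x3d, 0x43, 0xa3, 0x8f, 0xb0,
    0xd0, 0xca, 0x2b, 0xcb, 0xae, 0x7b, 0x30, 0xb4,
    0x77, 0xcb, 0x2d, 0xa3, 0x80, 0x30, 0xf2, 0x0c,
    0x6a, 0x42, 0xb7, 0x3b, 0xbe, 0xac, 0x01, 0xfa,
};

/*
 * Toeplitz hash: for every set bit i of the input (most significant bit
 * first), XOR in the 32-bit window of the key that starts at key bit i.
 * Hypothetical helper; requires key_len >= 4. Key bits past the end of
 * the key are treated as zero.
 */
static uint32_t toeplitz_hash(const uint8_t *key, size_t key_len,
                              const uint8_t *data, size_t data_len)
{
    uint32_t hash = 0;
    /* Window of key bits currently aligned with the input bit. */
    uint32_t window = ((uint32_t)key[0] << 24) | ((uint32_t)key[1] << 16) |
                      ((uint32_t)key[2] << 8) | (uint32_t)key[3];
    size_t next = 4; /* index of the next key byte to shift in */

    for (size_t i = 0; i < data_len; i++) {
        uint8_t kbyte = next < key_len ? key[next++] : 0;

        for (int bit = 7; bit >= 0; bit--) {
            if (data[i] & (1u << bit))
                hash ^= window;
            /* Slide the window one bit further along the key. */
            window = (window << 1) | ((uint32_t)(kbyte >> bit) & 1u);
        }
    }
    return hash;
}
```

The input is the concatenation of the header fields selected for the
packet type, in network byte order. Whichever option is chosen, the
steering program would additionally have to hand the resulting 32-bit
value (the "four more bytes" in option 3) back to tuntap so that it can be
reported to the guest.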