References: <20231015141644.260646-1-akihiko.odaki@daynix.com>
 <20231015141644.260646-2-akihiko.odaki@daynix.com>
 <2594bb24-74dc-4785-b46d-e1bffcc3e7ed@daynix.com>
 <9a4853ad-5ef4-4b15-a49e-9edb5ae4468e@daynix.com>
 <6253fb6b-9a53-484a-9be5-8facd46c051e@daynix.com>
In-Reply-To: <6253fb6b-9a53-484a-9be5-8facd46c051e@daynix.com>
From: Song Liu
Date: Sat, 18 Nov 2023 08:08:22 -0800
Subject: Re: [RFC PATCH v2 1/7] bpf: Introduce BPF_PROG_TYPE_VNET_HASH
To: Akihiko Odaki
Cc: Alexei Starovoitov, Jason Wang, Alexei Starovoitov, Daniel Borkmann,
 Andrii Nakryiko, Martin KaFai Lau, Yonghong Song, John Fastabend,
 KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Jonathan Corbet,
 Willem de Bruijn, "David S. Miller", Eric Dumazet, Jakub Kicinski,
 Paolo Abeni, "Michael S. Tsirkin", Xuan Zhuo, Mykola Lysenko,
 Shuah Khan, bpf, "open list:DOCUMENTATION", LKML, Network Development,
 kvm@vger.kernel.org, virtualization@lists.linux-foundation.org,
 "open list:KERNEL SELFTEST FRAMEWORK", Yuri Benditovich,
 Andrew Melnychenko
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

A few rookie questions below.

On Sat, Nov 18, 2023 at 2:39 AM Akihiko Odaki wrote:
>
> On 2023/10/18 4:19, Akihiko Odaki wrote:
> > On 2023/10/18 4:03, Alexei Starovoitov wrote:
[...]
> >
> > I would also appreciate if you have some documentation or link to
> > relevant discussions on the mailing list. That will avoid having same
> > discussion you may already have done in the past.
>
> Hi,
>
> The discussion has been stuck for a month, but I'd still like to
> continue figuring out the way best for the whole kernel to implement
> this feature. I summarize the current situation and question that needs
> to be answered before push this forward:
>
> The goal of this RFC is to allow to report hash values calculated with
> eBPF steering program. It's essentially just to report 4 bytes from the
> kernel to the userspace.

AFAICT, the proposed design is to have BPF generate some data (namely a
hash, but it could be anything AFAICT) and consume it from user space.
Instead of updating __sk_buff, can we have user space fetch the
data/hash from a BPF map?
If this is an option, I guess we can implement the same feature with
BPF tracing programs?

>
> Unfortunately, however, it is not acceptable for the BPF subsystem
> because the "stable" BPF is completely fixed these days. The
> "unstable/kfunc" BPF is an alternative, but the eBPF program will be
> shipped with a portable userspace program (QEMU)[1] so the lack of
> interface stability is not tolerable.

BPF kfuncs are as stable as exported symbols. Is exported-symbol-like
stability enough for the use case? (I would assume yes.)

>
> Another option is to hardcode the algorithm that was conventionally
> implemented with eBPF steering program in the kernel[2]. It is possible
> because the algorithm strictly follows the virtio-net specification[3].
> However, there are proposals to add different algorithms to the
> specification[4], and hardcoding the algorithm to the kernel will
> require to add more UAPIs and code each time such a specification change
> happens, which is not good for tuntap.

The requirement looks similar to hid-bpf. Could you explain why that
model is not enough? HID also requires some stability AFAICT.

Thanks,
Song

>
> In short, the proposed feature requires to make either of three compromises:
>
> 1. Compromise on the BPF side: Relax the "stable" BPF feature freeze
> once and allow eBPF steering program to report 4 more bytes to the kernel.
>
> 2. Compromise on the tuntap side: Implement the algorithm to the kernel,
> and abandon the capability to update the algorithm without changing the
> kernel.
>
> IMHO, I think it's better to make a compromise on the BPF side (option
> 1). We should minimize the total UAPI changes in the whole kernel, and
> option 1 is much superior in that sense.
>
> Yet I have to note that such a compromise on the BPF side can risk the
> "stable" BPF feature freeze fragile and let other people complain like
> "you allowed to change stable BPF for this, why do you reject [some
> other request to change stable BPF]?"
> It is bad for BPF maintainers. (I
> can imagine that introducing and maintaining widely different BPF
> interfaces is too much burden.) And, of course, this requires an
> approval from BPF maintainers.
>
> So I'd like to ask you that which of these compromises you think worse.
> Please also tell me if you have another idea.
>
> Regards,
> Akihiko Odaki
>
> [1] https://qemu.readthedocs.io/en/v8.1.0/devel/ebpf_rss.html
> [2]
> https://lore.kernel.org/all/20231008052101.144422-1-akihiko.odaki@daynix.com/
> [3]
> https://docs.oasis-open.org/virtio/virtio/v1.2/csd01/virtio-v1.2-csd01.html#x1-2400003
> [4]
> https://lore.kernel.org/all/CACGkMEuBbGKssxNv5AfpaPpWQfk2BHR83rM5AHXN-YVMf2NvpQ@mail.gmail.com/
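
For readers unfamiliar with the "4 bytes" being discussed: the hash that
the virtio-net specification[3] uses for receive steering is, as far as I
understand it, a 32-bit Toeplitz hash over selected packet fields. Below is
a minimal sketch of that calculation, assuming an IPv4/TCP flow hashed over
source address, destination address, source port and destination port. The
function name toeplitz_hash, the key EXAMPLE_KEY (an arbitrary 40-byte
pattern, not a key from any real device), and the addresses and ports are
all made up for illustration. This is not the code from the QEMU eBPF
program[1] or from the patch series.

```python
import ipaddress
import struct

# Hypothetical 40-byte key; a real key is supplied by the guest driver.
EXAMPLE_KEY = bytes(range(1, 41))


def toeplitz_hash(key: bytes, data: bytes) -> int:
    """Return the 32-bit Toeplitz hash of data under key.

    For every set bit of data (most significant bit first), XOR in the
    32-bit window of the key that starts at the same bit offset.
    """
    if len(key) < len(data) + 4:
        raise ValueError("key must be at least len(data) + 4 bytes long")
    key_bits = int.from_bytes(key, "big")
    key_len_bits = len(key) * 8
    result = 0
    for i in range(len(data) * 8):
        if data[i // 8] & (0x80 >> (i % 8)):
            # Select the 32-bit window of the key starting at bit offset i.
            shift = key_len_bits - 32 - i
            result ^= (key_bits >> shift) & 0xFFFFFFFF
    return result


# Illustrative flow (documentation addresses, arbitrary ports).
flow = (
    ipaddress.IPv4Address("192.0.2.1").packed        # source address
    + ipaddress.IPv4Address("198.51.100.2").packed   # destination address
    + struct.pack("!HH", 12345, 80)                  # source port, dest port
)
hash_value = toeplitz_hash(EXAMPLE_KEY, flow)
print(f"hash = {hash_value:#010x}")
```

Whichever mechanism ends up carrying the value to user space (a new
__sk_buff field, a BPF map, or a kfunc), the payload is a single 32-bit
result of this kind.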