Date: Sun, 19 Nov 2023 17:03:25 +0900
Subject: Re: [RFC PATCH v2 1/7] bpf: Introduce BPF_PROG_TYPE_VNET_HASH
From: Akihiko Odaki
To: Song Liu
Cc: Alexei Starovoitov, Jason Wang, Alexei Starovoitov, Daniel Borkmann,
    Andrii Nakryiko, Martin KaFai Lau, Yonghong Song, John Fastabend,
    KP Singh, Stanislav Fomichev, Hao Luo, Jiri Olsa, Jonathan Corbet,
    Willem de Bruijn, "David S. Miller", Eric Dumazet, Jakub Kicinski,
    Paolo Abeni, "Michael S. Tsirkin", Xuan Zhuo, Mykola Lysenko,
    Shuah Khan, bpf, "open list:DOCUMENTATION", LKML,
    Network Development, kvm@vger.kernel.org,
    virtualization@lists.linux-foundation.org,
    "open list:KERNEL SELFTEST FRAMEWORK", Yuri Benditovich,
    Andrew Melnychenko
References: <20231015141644.260646-1-akihiko.odaki@daynix.com>
    <20231015141644.260646-2-akihiko.odaki@daynix.com>
    <2594bb24-74dc-4785-b46d-e1bffcc3e7ed@daynix.com>
    <9a4853ad-5ef4-4b15-a49e-9edb5ae4468e@daynix.com>
    <6253fb6b-9a53-484a-9be5-8facd46c051e@daynix.com>
List-ID: linux-kernel@vger.kernel.org

On 2023/11/19 1:08, Song Liu wrote:
> Hi,
>
> A few rookie questions below.

Thanks for the questions.

>
> On Sat, Nov 18, 2023 at 2:39 AM Akihiko Odaki wrote:
>>
>> On 2023/10/18 4:19, Akihiko Odaki wrote:
>>> On 2023/10/18 4:03, Alexei Starovoitov wrote:
> [...]
>>>
>>> I would also appreciate if you have some documentation or link to
>>> relevant discussions on the mailing list. That will avoid having same
>>> discussion you may already have done in the past.
>>
>> Hi,
>>
>> The discussion has been stuck for a month, but I'd still like to
>> continue figuring out the way best for the whole kernel to implement
>> this feature.
>> I summarize the current situation and question that needs
>> to be answered before push this forward:
>>
>> The goal of this RFC is to allow to report hash values calculated with
>> eBPF steering program. It's essentially just to report 4 bytes from the
>> kernel to the userspace.
>
> AFAICT, the proposed design is to have BPF generate some data
> (namely hash, but could be anything afaict) and consume it from
> user space. Instead of updating __sk_buff, can we have the user
> space to fetch the data/hash from a bpf map? If this is an option,
> I guess we can implement the same feature with BPF tracing
> programs?

Unfortunately, no. Communication with userspace can happen in two
different ways:
- usual socket read/write
- vhost for direct interaction with a KVM guest

A BPF map may be a valid option for socket read/write, but it is not
for vhost. In-kernel vhost could fetch the hash from a BPF map, but I
guess that is not a standard way for kernel code to interact with a
BPF program.

>
>>
>> Unfortunately, however, it is not acceptable for the BPF subsystem
>> because the "stable" BPF is completely fixed these days. The
>> "unstable/kfunc" BPF is an alternative, but the eBPF program will be
>> shipped with a portable userspace program (QEMU)[1] so the lack of
>> interface stability is not tolerable.
>
> bpf kfuncs are as stable as exported symbols. Is exported symbols
> like stability enough for the use case? (I would assume yes.)
>
>>
>> Another option is to hardcode the algorithm that was conventionally
>> implemented with eBPF steering program in the kernel[2]. It is possible
>> because the algorithm strictly follows the virtio-net specification[3].
>> However, there are proposals to add different algorithms to the
>> specification[4], and hardcoding the algorithm to the kernel will
>> require to add more UAPIs and code each time such a specification change
>> happens, which is not good for tuntap.
>
> The requirement looks similar to hid-bpf. Could you explain why that
> model is not enough? HID also requires some stability AFAICT.

I have little knowledge of hid-bpf, but I assume it is more like a
"safe" kernel module; in my understanding, it affects the system state
and is intended to be loaded by some kind of system daemon. It is fine
for such a BPF program to have the same lifecycle as the kernel;
whenever the kernel is updated, the distributor can recompile the BPF
program with the new kernel headers and ship it along with the kernel,
just like a kernel module.

In contrast, our intended use case is more like a normal application.
For example, a user may download a container and run QEMU (including
the BPF program) installed in the container. As such, it is desirable
for the ABI to be stable across kernel releases, but that is not
guaranteed for kfuncs. Such a use case is already covered by the eBPF
steering program, so I want to maintain it if possible.

Regards,
Akihiko Odaki