Received: by 2002:a05:7412:6592:b0:d7:7d3a:4fe2 with SMTP id m18csp1325290rdg; Fri, 11 Aug 2023 18:51:44 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFtTJUHFbtMqYnco801B6Cw0KOsTu4uXVSkIXx7SpAsP48N7Oex5ZsPl7MTYFYQRp0KHuMZ X-Received: by 2002:aa7:d98e:0:b0:523:df1:ba15 with SMTP id u14-20020aa7d98e000000b005230df1ba15mr2831200eds.21.1691805104301; Fri, 11 Aug 2023 18:51:44 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1691805104; cv=none; d=google.com; s=arc-20160816; b=rf2MZ/caYRsYG56zNV/VcKleOEpkbeIYsuC1iY3Tc/jatQFXE9eL2G8UgHJcDoWDKc T3ia+cDFydy8KT0zjY2hD7The/PFgxwcHzJJpbJoJRGEDUfvJXcgjEV+FUgfH3WAJUMp FgyNHmrBpaUBuXZ2UCfm002WPi2QQXW48xxtM4IdRcO0pEfc9GtyRo+RGGUYyD0nenSs WCJ1upSR58PcO26oVaUKspuoW+pGibruyBtE4cLTqr9Xje6Xh/1cl2cNBOKcCMJ2POyp sIndVew+Zn3bTqEabrMD6O/XpYqnJtPO1jHsynhj+CIaZLKNFeHr66kGTzQX2Vnq+R9c OnUA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to:from :references:cc:to:content-language:subject:mime-version:date :dkim-signature:message-id; bh=XOAmUrlmN46mc8kLRjXg/L06bahnJsC47ASbtgox2n0=; fh=5NwuGznZaI05UTzzqDvtUHNEbNN9Gr26Q4NWepkgb2k=; b=jvXKB7dTlc73BjqDDf8Ocvg3oDUmz5Oehh8I5vSCuih3x40YSBJmARm3CO30f/jV4/ OcJ/KV83DpuAjozFf7zKiKcm24dSk9WFWNHTiQg3ZcnZb3acXJ55qZqrEw0hjxph6v8r Wv774hgIYhOaEU4GNRDPMeJse3x8DMJcvd4RScV/u0gUToC3d/8Hi6eaKtaK3RybjqUn vZ3JRqhdgklGAVoWrT1+eOcXD36F66pLvHN+TI5h/fq5DsLuHPQ7UhrnIwprvflHYz0r 2rnQ661zoDBDojp0NhOtmkAr9JrklWWN3ZwMjvpZgTg3C5d8NN6jF6LKoLhFwiAur4NQ AmbQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linux.dev header.s=key1 header.b=SNHoCg5d; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.dev Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id x8-20020aa7d6c8000000b00522d912cd61si4139010edr.175.2023.08.11.18.51.20; Fri, 11 Aug 2023 18:51:44 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@linux.dev header.s=key1 header.b=SNHoCg5d; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=linux.dev Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236803AbjHKXeZ (ORCPT + 99 others); Fri, 11 Aug 2023 19:34:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33194 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231685AbjHKXeU (ORCPT ); Fri, 11 Aug 2023 19:34:20 -0400 Received: from out-79.mta1.migadu.com (out-79.mta1.migadu.com [95.215.58.79]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0A41710E6 for ; Fri, 11 Aug 2023 16:34:17 -0700 (PDT) Message-ID: DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linux.dev; s=key1; t=1691796855; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=XOAmUrlmN46mc8kLRjXg/L06bahnJsC47ASbtgox2n0=; b=SNHoCg5d13wmYFB3Vv9vgXrj8JvkuLaWsHbv6s1y5JjHbtBbtZKKPWBjqBxloswco9LOjr 7IndQQykzTcQQpn3uC1QjRHqlrTXkvucO5lPLo8m7b3mZ2jQMN1VzjNtU+WIrX7wbM+fQB 09xhUlEP0LxhAOA/EtAWr2ZHGMgogiM= Date: Fri, 11 Aug 2023 16:34:10 -0700 MIME-Version: 1.0 Subject: Re: [PATCH bpf-next] bpf: Support default .validate() and .update() behavior for struct_ops links Content-Language: en-US To: Kui-Feng Lee Cc: bpf@vger.kernel.org, ast@kernel.org, daniel@iogearbox.net, andrii@kernel.org, song@kernel.org, yhs@fb.com, john.fastabend@gmail.com, kpsingh@kernel.org, haoluo@google.com, jolsa@kernel.org, linux-kernel@vger.kernel.org, kernel-team@meta.com, tj@kernel.org, clm@meta.com, thinker.li@gmail.com, Stanislav Fomichev , David Vernet References: <20230810220456.521517-1-void@manifault.com> <20230810230141.GA529552@maniforge> <20230811201914.GD542801@maniforge> <03f9f9be-620d-a44d-d6a3-8b9084344db5@gmail.com> X-Report-Abuse: Please report any abuse attempt to abuse@migadu.com and include these headers. From: Martin KaFai Lau In-Reply-To: <03f9f9be-620d-a44d-d6a3-8b9084344db5@gmail.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-Migadu-Flow: FLOW_OUT X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,SPF_HELO_NONE,SPF_PASS, URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 8/11/23 4:12 PM, Kui-Feng Lee wrote: > > > On 8/11/23 15:49, Martin KaFai Lau wrote: >> On 8/11/23 1:19 PM, David Vernet wrote: >>> On Fri, Aug 11, 2023 at 10:35:03AM -0700, Martin KaFai Lau wrote: >>>> On 8/10/23 4:15 PM, Stanislav Fomichev wrote: >>>>> On 08/10, David Vernet wrote: >>>>>> On Thu, Aug 10, 2023 at 03:46:18PM -0700, Stanislav Fomichev wrote: >>>>>>> On 08/10, David Vernet wrote: >>>>>>>> Currently, if a struct_ops map is loaded with BPF_F_LINK, it must also >>>>>>>> define the .validate() and .update() callbacks in its corresponding >>>>>>>> struct bpf_struct_ops in the kernel. Enabling struct_ops link is useful >>>>>>>> in its own right to ensure that the map is unloaded if an application >>>>>>>> crashes. For example, with sched_ext, we want to automatically unload >>>>>>>> the host-wide scheduler if the application crashes. We would likely >>>>>>>> never support updating elements of a sched_ext struct_ops map, so we'd >>>>>>>> have to implement these callbacks showing that they _can't_ support >>>>>>>> element updates just to benefit from the basic lifetime management of >>>>>>>> struct_ops links. >>>>>>>> >>>>>>>> Let's enable struct_ops maps to work with BPF_F_LINK even if they >>>>>>>> haven't defined these callbacks, by assuming that a struct_ops map >>>>>>>> element cannot be updated by default. >>>>>>> >>>>>>> Any reason this is not part of sched_ext series? As you mention, >>>>>>> we don't seem to have such users in the three? >>>>>> >>>>>> Hi Stanislav, >>>>>> >>>>>> The sched_ext series [0] implements these callbacks. See >>>>>> bpf_scx_update() and bpf_scx_validate(). >>>>>> >>>>>> [0]: https://lore.kernel.org/all/20230711011412.100319-13-tj@kernel.org/ >>>>>> >>>>>> We could add this into that series and remove those callbacks, but this >>>>>> patch is fixing a UX / API issue with struct_ops links that's not really >>>>>> relevant to sched_ext. I don't think there's any reason to couple >>>>>> updating struct_ops map elements with allowing the kernel to manage the >>>>>> lifetime of struct_ops maps -- just because we only have 1 (non-test) >>>> >>>> Agree the link-update does not necessarily couple with link-creation, so >>>> removing 'link' update function enforcement is ok. The intention was to >>>> avoid the struct_ops link inconsistent experience (one struct_ops link >>>> support update and another struct_ops link does not) because consistency was >>>> one of the reason for the true kernel backed link support that Kui-Feng did. >>>> tcp-cc is the only one for now in struct_ops and it can support update, so >>>> the enforcement is here. I can see Stan's point that removing it now looks >>>> immature before a struct_ops landed in the kernel showing it does not make >>>> sense or very hard to support 'link' update. However, the scx patch set has >>>> shown this point, so I think it is good enough. >>> >>> Sorry for sending v2 of the patch a bit prematurely. Should have let you >>> weigh in first. >>> >>>> For 'validate', it is not related a 'link' update. It is for the struct_ops >>>> 'map' update. If the loaded struct_ops map is invalid, it will end up having >>>> a useless struct_ops map and no link can be created from it. I can see some >>> >>> To be honest I'm actually not sure I understand why .validate() is only >>> called for when BPF_F_LINK is specified. Is it because it could break >> >> Regardless '.validate' must be enforced or not, the ->validate() should be >> called for the non BPF_F_LINK case also during map update. This should be fixed. > > For the case of the TCP congestion control, its validation function is > called by the implementations of ->validate() and ->reg(). I mean it > expects ->reg() to do validation as well. Right, for tcp-cc, the reg is doing the validation because it is how the kernel tcp-cc module is done. For newer subsystem supporting struct_ops, it should expect the validation is done in the .validate alone.