Date: Fri, 15 Jul 2022 15:59:53 -0400
From: Steven Rostedt
To: Song Liu
Cc: Song Liu, Networking, bpf, lkml, Alexei Starovoitov, Daniel Borkmann,
 Andrii Nakryiko, Kernel Team, "jolsa@kernel.org", "mhiramat@kernel.org"
Subject: Re: [PATCH v2 bpf-next 3/5] ftrace: introduce FTRACE_OPS_FL_SHARE_IPMODIFY
Message-ID: <20220715155953.4fb692e2@gandalf.local.home>
In-Reply-To: <0EB34157-8BCA-47FC-B78F-AA8FE45A1707@fb.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, 15 Jul 2022 19:49:00 +0000
Song Liu wrote:

> >
> > What about if we release the lock when doing the callback?
>
> We can probably unlock ftrace_lock here. But we may break locking order
> with direct mutex (see below).

You're talking about the multi registering case, right?

>
> >
> > Then we just need to make sure things are the same after reacquiring the
> > lock, and if they are different, we release the lock again and do the
> > callback with the new update. Wash, rinse, repeat, until the state is the
> > same before and after the callback with locks acquired?
>
> Personally, I would like to avoid wash-rinse-repeat here.

But it's common to do. Keeps your hair cleaner that way ;-)

>
> >
> > This is a common way to handle callbacks that need to do something that
> > takes the lock held before doing a callback.
> >
> > The reason I say this, is because the more we can keep the accounting
> > inside of ftrace the better.
> >
> > Wouldn't this need to be done anyway if BPF was first and live kernel
> > patching needed the update? An -EAGAIN would not suffice.
>
> prepare_direct_functions_for_ipmodify handles BPF-first-livepatch-later
> case. The benefit of prepare_direct_functions_for_ipmodify() is that it
> holds direct_mutex before ftrace_lock, and keeps holding it if necessary.
> This is enough to make sure we don't need the wash-rinse-repeat.
>
> OTOH, if we wait until __ftrace_hash_update_ipmodify(), we already hold
> ftrace_lock, but not direct_mutex. To make changes to bpf trampoline, we
> have to unlock ftrace_lock and lock direct_mutex to avoid deadlock.
> However, this means we will need the wash-rinse-repeat.
>
>
> For livepatch-first-BPF-later case, we can probably handle this in
> __ftrace_hash_update_ipmodify(), since we hold both direct_mutex and
> ftrace_lock. We can unlock ftrace_lock and update the BPF trampoline.
> It is safe against changes to direct ops, because we are still holding
> direct_mutex. But, is this safe against another IPMODIFY ops? I am not
> sure yet... Also, this is pretty weird because, we are updating a
> direct trampoline before we finish registering it for the first time.
> IOW, we are calling modify_ftrace_direct_multi_nolock for the same
> trampoline before register_ftrace_direct_multi() returns.
>
> The approach in v2 propagates the -EAGAIN to BPF side, so these are two
> independent calls of register_ftrace_direct_multi(). This does require
> some protocol between ftrace core and its user, but I still think this
> is a cleaner approach.

The issue I have with this approach is it couples BPF and ftrace a bit too
much.

But there is a way with my approach you can still do your approach. That is,
have ops_func() return zero if everything is fine, and otherwise return a
negative value. Then have the register function fail and return whatever
value gets returned by the ops_func().

Then have the bpf ops_func() check (does this direct caller handle
IPMODIFY? if yes, return 0, else return -EAGAIN).
Then the registering of ftrace fails with your -EAGAIN, and then you can
change the direct trampoline to handle IPMODIFY and try again. This time
when ops_func() is called, it sees that the direct trampoline can handle
the IPMODIFY and returns 0.

Basically, it's a way to still implement my suggestion, but let BPF decide
to use -EAGAIN to try again. And then BPF and ftrace don't need to have
these special flags to change the behavior of each other.

-- Steve
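
For illustration only, here is a rough userspace sketch of the ops_func()
protocol described above. It is not kernel code: struct sim_ops, struct
sim_tramp, sim_register(), sim_bpf_ops_func() and sim_bpf_attach() are all
made-up names, and the "ipmodify_present" argument is an assumed stand-in
for ftrace noticing a conflicting IPMODIFY ops. The only point is that the
core passes the callback's return value through untouched, and the choice
of -EAGAIN lives entirely on the user's side.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for a direct-call user such as a BPF trampoline. */
struct sim_tramp {
	bool handles_ipmodify;	/* can it coexist with an IPMODIFY ops? */
};

/* Hypothetical stand-in for ftrace_ops; NOT the kernel structure. */
struct sim_ops {
	int (*ops_func)(struct sim_ops *ops);	/* 0 if fine, negative otherwise */
	struct sim_tramp *tramp;
	bool registered;
};

/* User-side callback: the only place that knows about -EAGAIN. */
static int sim_bpf_ops_func(struct sim_ops *ops)
{
	return ops->tramp->handles_ipmodify ? 0 : -EAGAIN;
}

/*
 * Core side: ask the callback when an IPMODIFY user is present and
 * return whatever negative value it reports, without interpreting it.
 */
static int sim_register(struct sim_ops *ops, bool ipmodify_present)
{
	if (ipmodify_present && ops->ops_func) {
		int ret = ops->ops_func(ops);

		if (ret < 0)
			return ret;
	}
	ops->registered = true;
	return 0;
}

/* User side: on -EAGAIN, update the trampoline and register again. */
static int sim_bpf_attach(struct sim_ops *ops, bool ipmodify_present)
{
	int ret = sim_register(ops, ipmodify_present);

	if (ret == -EAGAIN) {
		/* Stands in for "change the direct trampoline to handle IPMODIFY". */
		ops->tramp->handles_ipmodify = true;
		ret = sim_register(ops, ipmodify_present);
	}
	return ret;
}
```

With a trampoline that starts out unable to handle IPMODIFY, a bare
sim_register() call fails with -EAGAIN and leaves the ops unregistered,
while sim_bpf_attach() succeeds on its second attempt. Nothing in
sim_register() is specific to BPF or to -EAGAIN.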