Received: by 2002:a05:6a10:6d10:0:0:0:0 with SMTP id gq16csp903176pxb; Fri, 22 Apr 2022 13:54:48 -0700 (PDT) X-Google-Smtp-Source: ABdhPJzR2yVCfd/SeBNm/fn+O8E63ElpN9/AF4z/JZlILmxtTGcU1qMvfOHVSLnviR9XIQq/KaDB X-Received: by 2002:a62:d152:0:b0:50c:e62e:55a4 with SMTP id t18-20020a62d152000000b0050ce62e55a4mr6661716pfl.72.1650660887893; Fri, 22 Apr 2022 13:54:47 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1650660887; cv=none; d=google.com; s=arc-20160816; b=ID7rD51bZ4r9dWSeo+LteTY2B2bkp4opVjpRk6oRdnCujt1oZSzC8RljUgwqbRyyy8 Zfp9g7gASh1iDqi26TDy9y3UWFARyOkiNcpP0snydCG9vaG4GGq2IIm/mSkzcOabMS4B r+oJ1HFUgcnidIAzwDVutwO7pMt9NQQ4PJwey3eh4ILuIEnMTnkv56LFeMhD8rptNicL yQD6oxBKo6J1yUtUO1AVBAnAvDaNVrsjdhZjyjZ+rJ0p+Z9fnCyWvg2IsziVdbzKk1Gv mFzDhI99LvhL8kPC+CMvwT15gRkKG7If3zUhWDi76DrMWYN1tCgcTD6ndWCe+GopgXeQ bVgA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:subject:cc:to:from:date; bh=RTZfNgEBbOY/0YsLzb+Oj32vJ08Xo43ssePlfikhZsc=; b=eOnE8HHUzYi8mQwTodl7g+JZHtuTgYVZGI05iqjADRVhMuOUioG3D3uh3wNJ8PIj3B 0Sidan23iVcD5/N5KvBbpg8YPsLfRQH/QxomiwogggD6ccc67KG43Z4j1TVmb/xf3bLh mGXMVv5QVgr/RiXYONzn94YqJR93SEd1kzLU8KmlbyDu04fLw/sIitImFCT5s0o2gvmc sIZOUY45UqkuS5OFiFkaPUUT2l+qEVQtv4sQjNyJOY7jg0gcqBlWRkHgWj9fAHxu407q BhyCrrIPnVk/Tyi0nTH6j/FJLhItm9xs0p+xLBzNSa8/V2v6qMq47P6zFqLS+q1daFIy 8+Pw== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net. [2620:137:e000::1:18]) by mx.google.com with ESMTPS id u4-20020a17090282c400b00156d899e1c8si8369555plz.615.2022.04.22.13.54.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 22 Apr 2022 13:54:47 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) client-ip=2620:137:e000::1:18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 9957D2C6E34; Fri, 22 Apr 2022 12:49:05 -0700 (PDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1390286AbiDUPpz (ORCPT + 99 others); Thu, 21 Apr 2022 11:45:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:42630 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1390347AbiDUPpg (ORCPT ); Thu, 21 Apr 2022 11:45:36 -0400 Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B09BE49250 for ; Thu, 21 Apr 2022 08:42:23 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id 4D6E5CE2420 for ; Thu, 21 Apr 2022 15:42:06 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3CE24C385A5; Thu, 21 Apr 2022 15:42:03 +0000 (UTC) Date: Thu, 21 Apr 2022 11:42:01 -0400 From: Steven Rostedt To: Mark Rutland Cc: Wang ShaoBo , cj.chengjian@huawei.com, huawei.libin@huawei.com, xiexiuqi@huawei.com, liwei391@huawei.com, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org, catalin.marinas@arm.com, will@kernel.org, zengshun.wu@outlook.com Subject: Re: [RFC PATCH -next v2 3/4] arm64/ftrace: support dynamically allocated trampolines Message-ID: <20220421114201.21228eeb@gandalf.local.home> In-Reply-To: References: <20220316100132.244849-1-bobo.shaobowang@huawei.com> <20220316100132.244849-4-bobo.shaobowang@huawei.com> <20220421100639.03c0d123@gandalf.local.home> X-Mailer: Claws Mail 3.17.8 (GTK+ 2.24.33; x86_64-pc-linux-gnu) MIME-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,RDNS_NONE, SPF_HELO_NONE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, 21 Apr 2022 16:14:13 +0100 Mark Rutland wrote: > > Let's say you have 10 ftrace_ops registered (with bpf and kprobes this can > > be quite common). But each of these ftrace_ops traces a function (or > > functions) that are not being traced by the other ftrace_ops. That is, each > > ftrace_ops has its own unique function(s) that they are tracing. One could > > be tracing schedule, the other could be tracing ksoftirqd_should_run > > (whatever). > > Ok, so that's when messing around with bpf or kprobes, and not generally > when using plain old ftrace functionality under /sys/kernel/tracing/ > (unless that's concurrent with one of the former, as per your other > reply) ? It's any user of the ftrace infrastructure, which includes kprobes, bpf, perf, function tracing, function graph tracing, and also affects instances. > > > Without this change, because the arch does not support dynamically > > allocated trampolines, it means that all these ftrace_ops will be > > registered to the same trampoline. That means, for every function that is > > traced, it will loop through all 10 of theses ftrace_ops and check their > > hashes to see if their callback should be called or not. > > Sure; I can see how that can be quite expensive. > > What I'm trying to figure out is who this matters to and when, since the > implementation is going to come with a bunch of subtle/fractal > complexities, and likely a substantial overhead too when enabling or > disabling tracing of a patch-site. I'd like to understand the trade-offs > better. > > > With dynamically allocated trampolines, each ftrace_ops will have their own > > trampoline, and that trampoline will be called directly if the function > > is only being traced by the one ftrace_ops. This is much more efficient. > > > > If a function is traced by more than one ftrace_ops, then it falls back to > > the loop. > > I see -- so the dynamic trampoline is just to get the ops? Or is that > doing additional things? It's to get both the ftrace_ops (as that's one of the parameters) as well as to call the callback directly. Not sure if arm is affected by spectre, but the "loop" function is filled with indirect function calls, where as the dynamic trampolines call the callback directly. Instead of: bl ftrace_caller ftrace_caller: [..] bl ftrace_ops_list_func [..] void ftrace_ops_list_func(...) { __do_for_each_ftrace_ops(op, ftrace_ops_list) { if (ftrace_ops_test(op, ip)) // test the hash to see if it // should trace this // function. op->func(...); } } It does: bl dyanmic_tramp dynamic_tramp: [..] bl func // call the op->func directly! Much more efficient! > > There might be a middle-ground here where we patch the ftrace_ops > pointer into a literal pool at the patch-site, which would allow us to > handle this atomically, and would avoid the issues with out-of-range > trampolines. Have an example of what you are suggesting? -- Steve