Date: Thu, 21 Apr 2022 16:14:13 +0100
From: Mark Rutland
To: Steven Rostedt
Cc: Wang ShaoBo, cj.chengjian@huawei.com, huawei.libin@huawei.com,
    xiexiuqi@huawei.com, liwei391@huawei.com, linux-kernel@vger.kernel.org,
    linux-arm-kernel@lists.infradead.org, catalin.marinas@arm.com,
    will@kernel.org, zengshun.wu@outlook.com
Subject: Re: [RFC PATCH -next v2 3/4] arm64/ftrace: support dynamically allocated trampolines
References: <20220316100132.244849-1-bobo.shaobowang@huawei.com>
    <20220316100132.244849-4-bobo.shaobowang@huawei.com>
    <20220421100639.03c0d123@gandalf.local.home>
In-Reply-To: <20220421100639.03c0d123@gandalf.local.home>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Apr 21, 2022 at 10:06:39AM -0400, Steven Rostedt wrote:
> On Thu, 21 Apr 2022 14:10:04 +0100
> Mark Rutland wrote:
>
> > On Wed, Mar 16, 2022 at 06:01:31PM +0800, Wang ShaoBo wrote:
> > > From: Cheng Jian
> > >
> > > When tracing multiple functions customly, a list function is called
> > > in ftrace_(regs)_caller, which makes all the other traced functions
> > > recheck the hash of the ftrace_ops when tracing happend, apparently
> > > it is inefficient.
> >
> > ... and when does that actually matter? Who does this and why?
>
> I don't think it was explained properly. What dynamically allocated
> trampolines give you is this.

Thanks for the explanation, btw!

> Let's say you have 10 ftrace_ops registered (with bpf and kprobes this can
> be quite common). But each of these ftrace_ops traces a function (or
> functions) that are not being traced by the other ftrace_ops. That is, each
> ftrace_ops has its own unique function(s) that they are tracing. One could
> be tracing schedule, the other could be tracing ksoftirqd_should_run
> (whatever).

Ok, so that's when messing around with bpf or kprobes, and not generally
when using plain old ftrace functionality under /sys/kernel/tracing/
(unless that's concurrent with one of the former, as per your other
reply)?

> Without this change, because the arch does not support dynamically
> allocated trampolines, it means that all these ftrace_ops will be
> registered to the same trampoline. That means, for every function that is
> traced, it will loop through all 10 of theses ftrace_ops and check their
> hashes to see if their callback should be called or not.

Sure; I can see how that can be quite expensive.

What I'm trying to figure out is who this matters to and when, since the
implementation is going to come with a bunch of subtle/fractal
complexities, and likely a substantial overhead too when enabling or
disabling tracing of a patch-site. I'd like to understand the trade-offs
better.

> With dynamically allocated trampolines, each ftrace_ops will have their own
> trampoline, and that trampoline will be called directly if the function
> is only being traced by the one ftrace_ops. This is much more efficient.
>
> If a function is traced by more than one ftrace_ops, then it falls back to
> the loop.

I see -- so the dynamic trampoline is just to get the ops? Or is that
doing additional things?

There might be a middle-ground here where we patch the ftrace_ops
pointer into a literal pool at the patch-site, which would allow us to
handle this atomically, and would avoid the issues with out-of-range
trampolines.

Thanks,
Mark.
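
For illustration only: the cost difference discussed above can be modelled
with a small userspace C sketch. This is not kernel code and not taken from
the patch. Every name in it (toy_ops, toy_list_dispatch,
toy_direct_dispatch, toy_run) is hypothetical, and the linear filter scan
merely stands in for the per-ops hash lookup. It assumes the scenario from
the quoted text: 10 ops, each tracing one unique function.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NR_OPS 10

/* Hypothetical stand-in for an ftrace_ops: a callback plus a filter. */
struct toy_ops {
	void (*func)(uintptr_t ip, struct toy_ops *ops);
	const uintptr_t *filter;	/* addresses this ops wants to trace */
	size_t nr_filter;
	unsigned long hits;		/* how many times func was invoked */
};

/* Counts filter lookups, standing in for the per-ops hash checks. */
static unsigned long toy_filter_checks;

static bool toy_ops_match(const struct toy_ops *ops, uintptr_t ip)
{
	toy_filter_checks++;
	for (size_t i = 0; i < ops->nr_filter; i++)
		if (ops->filter[i] == ip)
			return true;
	return false;
}

static void toy_callback(uintptr_t ip, struct toy_ops *ops)
{
	(void)ip;
	ops->hits++;
}

/* Shared-trampoline model: each traced call walks every registered ops. */
static void toy_list_dispatch(struct toy_ops *ops, size_t nr_ops, uintptr_t ip)
{
	for (size_t i = 0; i < nr_ops; i++)
		if (toy_ops_match(&ops[i], ip))
			ops[i].func(ip, &ops[i]);
}

/* Per-ops trampoline model: the call site already knows its single ops. */
static void toy_direct_dispatch(struct toy_ops *ops, uintptr_t ip)
{
	ops->func(ip, ops);
}

/* Dispatch one traced call and return the number of filter checks made. */
unsigned long toy_run(bool direct)
{
	uintptr_t addrs[TOY_NR_OPS];
	struct toy_ops ops[TOY_NR_OPS];

	for (size_t i = 0; i < TOY_NR_OPS; i++) {
		addrs[i] = 0x1000 + 4 * i;	/* one unique "function" per ops */
		ops[i] = (struct toy_ops){
			.func = toy_callback,
			.filter = &addrs[i],
			.nr_filter = 1,
		};
	}

	toy_filter_checks = 0;
	if (direct)
		toy_direct_dispatch(&ops[3], addrs[3]);
	else
		toy_list_dispatch(ops, TOY_NR_OPS, addrs[3]);

	assert(ops[3].hits == 1);	/* only the matching callback ran */
	return toy_filter_checks;
}
```

In this model, toy_run(false) performs one filter check per registered ops
for a single traced call, while toy_run(true) performs none. It says nothing
about the cost of creating or patching the trampolines themselves, which is
the other side of the trade-off raised above.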