Date: Tue, 22 Oct 2019 09:44:55 -0400
From: Steven Rostedt
To: Alexei Starovoitov
Cc: Peter Zijlstra, Daniel Bristot de Oliveira, LKML, X86 ML,
 Nadav Amit, Andy Lutomirski, Dave Hansen, Song Liu, Masami Hiramatsu
Subject: Re: [PATCH 3/3] x86/ftrace: Use text_poke()
Message-ID: <20191022094455.6a0a1a27@gandalf.local.home>
In-Reply-To: <20191022071956.07e21543@gandalf.local.home>
References: <20191002182106.GC4643@worktop.programming.kicks-ass.net>
 <20191003181045.7fb1a5b3@gandalf.local.home>
 <20191004112237.GA19463@hirez.programming.kicks-ass.net>
 <20191004094228.5a5774fe@gandalf.local.home>
 <20191021204310.3c26f730@oasis.local.home>
 <20191021231630.49805757@oasis.local.home>
 <20191021231904.4b968dc1@oasis.local.home>
 <20191022040532.fvpxcs74i4mn4rc6@ast-mbp.dhcp.thefacebook.com>
 <20191022071956.07e21543@gandalf.local.home>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 22 Oct 2019 07:19:56 -0400
Steven Rostedt wrote:

> > I'm not touching dyn_ftrace.
> > Actually calling my stuff ftrace+bpf is probably not correct either.
> > I'm reusing code patching of nop into call that ftrace does. That's it.
> > Turned out I cannot use 99% of ftrace facilities.
> > ftrace_caller, ftrace_call, ftrace_ops_list_func and the whole ftrace api
> > with ip, parent_ip and pt_regs cannot be used for this part of the work.
> > bpf prog needs to access raw function arguments. To achieve that I'm
>
> You can do that today with the ftrace facility, just like live patching
> does. You register a ftrace_ops with the flag FTRACE_OPS_FL_IPMODIFY,
> and your func will set the regs->ip to your bpf handler. When the
> ftrace_ops->func returns, instead of going back to the called
> function, it can jump to your bpf_handler. You can create a shadow stack
> (like function graph tracer does) to save the return address for where
> your bpf handler needs to return to. As your bpf_handler needs raw
> access to the parameters, it may not even need the shadow stack because
> it should know the function it is reading the parameters from.

To show just how easy this is, I wrote up a quick hack that hijacks the
wake_up_process() function and adds a trace_printk() to see what was
woken up. My output from the trace is this:

          <idle>-0     [007] ..s1    68.517276: my_wake_up: We are waking up rcu_preempt:10
           <...>-1240  [001] ....    68.517727: my_wake_up: We are waking up kthreadd:2
           <...>-1240  [001] d..1    68.517973: my_wake_up: We are waking up kworker/1:0:17
            bash-1188  [003] d..2    68.519020: my_wake_up: We are waking up kworker/u16:3:140
            bash-1188  [003] d..2    68.519138: my_wake_up: We are waking up kworker/u16:3:140
            sshd-1187  [005] d.s2    68.519295: my_wake_up: We are waking up kworker/5:2:517
          <idle>-0     [007] ..s1    68.522293: my_wake_up: We are waking up rcu_preempt:10
          <idle>-0     [007] ..s1    68.526309: my_wake_up: We are waking up rcu_preempt:10

I added the code to the trace-events-sample.c sample module, and got the
above when I loaded that module (modprobe trace-events-sample).

It's mostly non-arch-specific (that is, you can do this with any arch
that supports the IPMODIFY flag). The only part that would need
arch-specific code is the regs->ip compare. The pip check can also be
done in a less "hacky" way. But this shows you how easily this can be
done today. Not sure what is missing that you need.

Here's the patch:

diff --git a/samples/trace_events/trace-events-sample.c b/samples/trace_events/trace-events-sample.c
index 1a72b7d95cdc..526a6098c811 100644
--- a/samples/trace_events/trace-events-sample.c
+++ b/samples/trace_events/trace-events-sample.c
@@ -11,6 +11,41 @@
 #define CREATE_TRACE_POINTS
 #include "trace-events-sample.h"
 
+#include <linux/ftrace.h>
+
+int wake_up_process(struct task_struct *p);
+
+int x;
+
+static int my_wake_up(struct task_struct *p)
+{
+	int ret;
+
+	trace_printk("We are waking up %s:%d\n", p->comm, p->pid);
+	ret = wake_up_process(p);
+	/* Force not having a tail call */
+	if (!x)
+		return ret;
+	return 0;
+}
+
+static void my_hijack_func(unsigned long ip, unsigned long pip,
+			   struct ftrace_ops *ops, struct pt_regs *regs)
+{
+	unsigned long this_func = (unsigned long)my_wake_up;
+
+	if (pip >= this_func && pip <= this_func + 0x10000)
+		return;
+
+	regs->ip = (unsigned long)my_wake_up;
+}
+
+static struct ftrace_ops my_ops = {
+	.func = my_hijack_func,
+	.flags = FTRACE_OPS_FL_IPMODIFY | FTRACE_OPS_FL_RECURSION_SAFE |
+					  FTRACE_OPS_FL_SAVE_REGS,
+};
+
 static const char *random_strings[] = {
 	"Mother Goose",
 	"Snoopy",
@@ -115,6 +150,11 @@ void foo_bar_unreg(void)
 
 static int __init trace_event_init(void)
 {
+	int ret;
+
+	ret = ftrace_set_filter_ip(&my_ops, (unsigned long)wake_up_process, 0, 0);
+	if (!ret)
+		register_ftrace_function(&my_ops);
 	simple_tsk = kthread_run(simple_thread, NULL, "event-sample");
 	if (IS_ERR(simple_tsk))
 		return -1;
@@ -124,6 +164,7 @@ static int __init trace_event_init(void)
 
 static void __exit trace_event_exit(void)
 {
+	unregister_ftrace_function(&my_ops);
 	kthread_stop(simple_tsk);
 	mutex_lock(&thread_mutex);
 	if (simple_tsk_fn)

-- Steve
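
The pip check in my_hijack_func() guards against redirecting the call that my_wake_up() itself makes to wake_up_process(), and it does so by assuming the replacement is no larger than 0x10000 bytes. What follows is a minimal userspace sketch of the same decision using an explicit function size instead of that constant. It is not kernel code and not part of the patch: struct fake_regs, struct func_range, addr_in_range() and hijack() are hypothetical names made up for illustration, and the size is assumed to be known to the caller.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for the one pt_regs field the callback rewrites. */
struct fake_regs {
	uintptr_t ip;
};

/* Hypothetical description of the replacement function's text range. */
struct func_range {
	uintptr_t start;
	uintptr_t size;
};

/*
 * True if addr lies in [start, start + size). The unsigned subtraction
 * wraps for addr < start, so a single compare covers both bounds.
 */
static bool addr_in_range(const struct func_range *r, uintptr_t addr)
{
	return addr - r->start < r->size;
}

/*
 * Model of my_hijack_func(): leave the call alone when it comes from
 * inside the replacement (so the replacement can still reach the real
 * function), otherwise redirect execution to the replacement.
 */
static void hijack(uintptr_t parent_ip, struct fake_regs *regs,
		   const struct func_range *replacement)
{
	assert(regs && replacement);

	if (addr_in_range(replacement, parent_ip))
		return;

	regs->ip = replacement->start;
}
```

With a range of { .start = 0x1000, .size = 0x80 }, a parent_ip of 0x1010 leaves regs->ip untouched, while 0x0fff or 0x1080 sets it to 0x1000. In the kernel, one candidate for obtaining the real size might be kallsyms_lookup_size_offset(), but that is an assumption here and not something the patch does.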