Received: by 2002:a25:8b91:0:0:0:0:0 with SMTP id j17csp7420517ybl; Tue, 24 Dec 2019 02:11:16 -0800 (PST) X-Google-Smtp-Source: APXvYqznkpO2NsDUHR6XXVOj6BKrkg+OWDRzg9wRDicOlaavpIHI2xhrYFMuRyARa9cIl29Mxzp0 X-Received: by 2002:a9d:7851:: with SMTP id c17mr15946055otm.58.1577182276330; Tue, 24 Dec 2019 02:11:16 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1577182276; cv=none; d=google.com; s=arc-20160816; b=BCe25HRmkOYUgSbFGm45zSi/FPGxxPUvHxFE2U77mw+xqhujvw987FY1llMwRJUSDX kcPMDLqRyNQx1BIl+bweF3tSgcUDfhTJiILKA07zDRDU4hk83asxz/3rN0IqkEgj9rnY V1Bx9osICGxsBIUmhwjCm48dSHSMgaBzYPF7gI1PoJeF/RVcko3b9vmNLIdY1OBYFnxS q1vB0q+ki0Jk2+L9pFzOhiR0RUIZUNgYXTXkVGVAv73TNRPHV3jlN75HpUi2jroMnnOT cjgmHnpuwifWn7VLdPtKU41enTAWwwGNItNrP6IHXcYPAIjADreEADrmoGzEfGk1RHcj O/aA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :references:in-reply-to:message-id:subject:cc:to:from:date :dkim-signature; bh=7bJQNxEahriQxW2R2sZe4x8q88VP55bQvumqy5d1WOk=; b=p2Xe1k5RWcc0tjd5fRI6Q2kJEYKQLb/Uhx0+a84DF0mX/zlXREeilbyItAvGdF9883 U0acfcVWj29FaFtLQEQxcbAK5DCoXASOSbFLqYOoK1902oP+6T0F+Wt6sd3ZT8JfaWC7 N2EqGwXvDkEKr7WUEACR6KdocoqG79TDEKLYtMLmM5TraLXWUN732XVW0TGTeMd9uA65 QuzGn/BLYox0eMJGGY6RxcD+gvFyT+HyjhU8S+CiZMdCT+pNCm436lTsGtLCbljuCDeh eBqxZMzhBmnzrUvNv6GLBqOZEgboLECjt+FlE+EchTaoaIMMdO6qI5WuI3+21JWYxdgl MQKA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=X8ogjSxm; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id a15si8139310otq.83.2019.12.24.02.11.04; Tue, 24 Dec 2019 02:11:16 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel.org header.s=default header.b=X8ogjSxm; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726216AbfLXKJX (ORCPT + 99 others); Tue, 24 Dec 2019 05:09:23 -0500 Received: from mail.kernel.org ([198.145.29.99]:46324 "EHLO mail.kernel.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726084AbfLXKJX (ORCPT ); Tue, 24 Dec 2019 05:09:23 -0500 Received: from devnote2 (NE2965lan1.rev.em-net.ne.jp [210.141.244.193]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPSA id 4EAF72071A; Tue, 24 Dec 2019 10:09:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=default; t=1577182161; bh=OcYvU6R9hPJgmJg3UwAfVCin1mEXqh7FP8N9hsHQNeU=; h=Date:From:To:Cc:Subject:In-Reply-To:References:From; b=X8ogjSxmBHca5nsL5BInqifxvj7nOhyh902O2b0NIdrXPscb+r1ItHnf9hJoLzZAv L9q7c04rXqWTvPYBsAVWDiyIXedtoYN+67rSjcq52yqNI50HL88+uCnLSaNFE3+I5Y ztzJ8syVplerh7dv0AfnWbgdi+rci75L2/USw358= Date: Tue, 24 Dec 2019 19:09:16 +0900 From: Masami Hiramatsu To: Jisheng Zhang Cc: "catalin.marinas@arm.com" , "will@kernel.org" , "naveen.n.rao@linux.vnet.ibm.com" , "mark.rutland@arm.com" , "corbet@lwn.net" , "rostedt@goodmis.org" , "linux-doc@vger.kernel.org" , "linux-kernel@vger.kernel.org" , "linux-arm-kernel@lists.infradead.org" Subject: Re: [PATCH v6] arm64: implement KPROBES_ON_FTRACE Message-Id: <20191224190916.2e47478445fb179e88f60cc3@kernel.org> In-Reply-To: <20191223153300.30281a93@xhacker.debian> References: <20191218140622.57bbaca5@xhacker.debian> <20191218222550.51f0b681de7bbab7e49b09a9@kernel.org> <20191223153300.30281a93@xhacker.debian> X-Mailer: Sylpheed 3.5.1 (GTK+ 2.24.32; x86_64-pc-linux-gnu) Mime-Version: 1.0 Content-Type: text/plain; charset=US-ASCII Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Jisheng, On Mon, 23 Dec 2019 07:47:24 +0000 Jisheng Zhang wrote: > Hi Masami, > > On Wed, 18 Dec 2019 22:25:50 +0900 Masami Hiramatsu wrote: > > > > > > > > On Wed, 18 Dec 2019 06:21:35 +0000 > > Jisheng Zhang wrote: > > > > > KPROBES_ON_FTRACE avoids much of the overhead with regular kprobes as it > > > eliminates the need for a trap, as well as the need to emulate or > > > single-step instructions. > > > > > > Tested on berlin arm64 platform. > > > > > > ~ # mount -t debugfs debugfs /sys/kernel/debug/ > > > ~ # cd /sys/kernel/debug/ > > > /sys/kernel/debug # echo 'p _do_fork' > tracing/kprobe_events > > > > > > before the patch: > > > > > > /sys/kernel/debug # cat kprobes/list > > > ffffff801009fe28 k _do_fork+0x0 [DISABLED] > > > > > > after the patch: > > > > > > /sys/kernel/debug # cat kprobes/list > > > ffffff801009ff54 k _do_fork+0x4 [DISABLED][FTRACE] > > > > BTW, it seems this automatically changes the offset without > > user's intention or any warnings. How would you manage if the user > > pass a new probe on _do_fork+0x4? > > In current implementation, two probes at the same address _do_fork+0x4 OK, that is my point. > > IOW, it is still the question who really wants to probe on > > the _do_fork+"0", if kprobes modifies it automatically, > > no one can do that anymore. > > This can be happen if the user want to record LR or SP value > > at the function call for debug. If kprobe always modifies it, > > we will lose the way to do it. > > arm64's DYNAMIC_FTRACE_WITH_REGS implementation makes use of GCC > -fpatchable-function-entry=2 option to insert two nops. When the function > is traced, the first nop will be modified to the LR saver, then the > second nop to "bl ", commit 3b23e4991fb6(" > arm64: implement ftrace with regs") explains the mechanism. So both of the instruction at func+0 and func+4 are replaced. > So on arm64(in fact any arch makes use of -fpatchable-function-entry will > behave similarly), when DYNAMIC_FTRACE_WITH_REGS is enabled, the ftrace location > is always on the first 4 bytes offset. > > I think when users want to register a kprobe on _do_fork+0, what he really want > is to kprobe on the patched(by -fpatchable-function-entry)_do_fork+4 OK, in this case, kprobe should treat the first 2 instructions as a single virtual instruction. This means, - kprobes can probe func+0, but not func+4 if the function is ftraced. (-EILSEQ must be returned) - both debugfs/kprobes/list and tracefs/kprobe_events should show the probed address as func+0. Not func+4. Then, user can use kprobes as if there is one big (8-byte) instruction at the entry of the function. Since probing on func+4 is rejected and the call-site LR/SP is restored in ftrace, there should be no contradiction. It should work as if we put a breakpoint on the func + 0. > > PS: per my understanding, powerpc's kprobes_on_ftrace also introduces an > extra automatic offset due to its implementation. Uh, that is also ugly.... must be fixed. > > > > Could you remove below function at this moment? > > > > > +kprobe_opcode_t *kprobe_lookup_name(const char *name, unsigned int offset) > > > +{ > > > + unsigned long addr = kallsyms_lookup_name(name); > > > + > > > + if (addr && !offset) { > > > + unsigned long faddr; > > > + /* > > > + * with -fpatchable-function-entry=2, the first 4 bytes is the > > > + * LR saver, then the actual call insn. So ftrace location is > > > + * always on the first 4 bytes offset. > > > + */ > > > + faddr = ftrace_location_range(addr, > > > + addr + AARCH64_INSN_SIZE); > > > + if (faddr) > > > + return (kprobe_opcode_t *)faddr; > > > + } > > > + return (kprobe_opcode_t *)addr; > > > +} > > > + > > > +bool arch_kprobe_on_func_entry(unsigned long offset) > > > +{ > > > + return offset <= AARCH64_INSN_SIZE; > > > +} > > > > > > Without this automatic change, we still can change the offset > > in upper layer. > > If remove the two functions, kprobe on _do_fork can't ride on > ftrace infrastructure, but kprobe on _do_fork+4 can. I'm not sure > whether this is reasonable. Every kprobe users on arm64 will need to > remember to pass an extra offset +4 to make use of kprobe_on_ftrace, could > we hide the "+4"? Yes, OK, as I said above, please hide +4. We will see the virtual "call" instruction (= "mov x9, lr; br ") at the entry of ftraced function. :) Thank you, -- Masami Hiramatsu