Hi Masami,
On Wed, 18 Dec 2019 22:25:50 +0900 Masami Hiramatsu wrote:
>
>
> On Wed, 18 Dec 2019 06:21:35 +0000
> Jisheng Zhang <[email protected]> wrote:
>
> > KPROBES_ON_FTRACE avoids much of the overhead of regular kprobes as it
> > eliminates the need for a trap, as well as the need to emulate or
> > single-step instructions.
> >
> > Tested on berlin arm64 platform.
> >
> > ~ # mount -t debugfs debugfs /sys/kernel/debug/
> > ~ # cd /sys/kernel/debug/
> > /sys/kernel/debug # echo 'p _do_fork' > tracing/kprobe_events
> >
> > before the patch:
> >
> > /sys/kernel/debug # cat kprobes/list
> > ffffff801009fe28 k _do_fork+0x0 [DISABLED]
> >
> > after the patch:
> >
> > /sys/kernel/debug # cat kprobes/list
> > ffffff801009ff54 k _do_fork+0x4 [DISABLED][FTRACE]
>
> BTW, it seems this automatically changes the offset without the
> user's intention or any warning. How would you manage it if the user
> passes a new probe on _do_fork+0x4?
In the current implementation, the two probes would end up at the same
address, _do_fork+0x4.
>
> IOW, the question remains: who really wants to probe _do_fork+0?
> If kprobes modifies the offset automatically, no one can do that
> anymore.
> This can happen if the user wants to record the LR or SP value
> at the function call for debugging. If kprobes always modifies it,
> we will lose the way to do that.
arm64's DYNAMIC_FTRACE_WITH_REGS implementation makes use of the GCC
-fpatchable-function-entry=2 option to insert two nops. When the function
is traced, the first nop is patched into the LR saver and the second
into "bl <ftrace-entry>"; commit 3b23e4991fb6 ("arm64: implement ftrace
with regs") explains the mechanism.
So on arm64 (in fact, any arch that makes use of -fpatchable-function-entry
will behave similarly), when DYNAMIC_FTRACE_WITH_REGS is enabled, the
ftrace location is always at a 4-byte offset from the function entry.
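For illustration, a sketch of the function entry as compiled (the
comments show what each nop becomes once the function is traced):

<_do_fork>:
	nop		// patched to: mov x9, x30  (the LR saver)
	nop		// patched to: bl <ftrace-entry>
	...		// function body; ftrace location == _do_fork + 4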
I think when users want to register a kprobe on _do_fork+0, what they
really want is to probe the patched (by -fpatchable-function-entry)
_do_fork+4.
PS: per my understanding, powerpc's kprobes_on_ftrace also introduces an
extra automatic offset due to its implementation.
>
> Could you remove the functions below for now?
>
> > +kprobe_opcode_t *kprobe_lookup_name(const char *name, unsigned int offset)
> > +{
> > + unsigned long addr = kallsyms_lookup_name(name);
> > +
> > + if (addr && !offset) {
> > + unsigned long faddr;
> > + /*
> > + * with -fpatchable-function-entry=2, the first 4 bytes is the
> > + * LR saver, then the actual call insn. So ftrace location is
> > + * always on the first 4 bytes offset.
> > + */
> > + faddr = ftrace_location_range(addr,
> > + addr + AARCH64_INSN_SIZE);
> > + if (faddr)
> > + return (kprobe_opcode_t *)faddr;
> > + }
> > + return (kprobe_opcode_t *)addr;
> > +}
> > +
> > +bool arch_kprobe_on_func_entry(unsigned long offset)
> > +{
> > + return offset <= AARCH64_INSN_SIZE;
> > +}
>
>
> Without this automatic change, we can still change the offset
> in the upper layer.
If we remove the two functions, a kprobe on _do_fork can't ride on the
ftrace infrastructure, but a kprobe on _do_fork+4 can. I'm not sure
whether this is reasonable. Every kprobe user on arm64 would need to
remember to pass an extra +4 offset to make use of KPROBES_ON_FTRACE.
Could we hide the "+4"?
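For example, without hiding it, every module-based user would need
something like this minimal sketch (error handling omitted; the
.offset = 4 line is the extra burden):

#include <linux/module.h>
#include <linux/kprobes.h>

static struct kprobe kp = {
	.symbol_name	= "_do_fork",
	.offset		= 4,	/* the extra +4 every arm64 user must remember */
};

static int __init kp_init(void)
{
	/* rides on the ftrace infrastructure only because of .offset = 4 */
	return register_kprobe(&kp);
}

static void __exit kp_exit(void)
{
	unregister_kprobe(&kp);
}

module_init(kp_init);
module_exit(kp_exit);
MODULE_LICENSE("GPL");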
Thanks
Hi Jisheng,
On Mon, 23 Dec 2019 07:47:24 +0000
Jisheng Zhang <[email protected]> wrote:
> Hi Masami,
>
> On Wed, 18 Dec 2019 22:25:50 +0900 Masami Hiramatsu wrote:
>
>
> >
> >
> > On Wed, 18 Dec 2019 06:21:35 +0000
> > Jisheng Zhang <[email protected]> wrote:
> >
> > > KPROBES_ON_FTRACE avoids much of the overhead of regular kprobes as it
> > > eliminates the need for a trap, as well as the need to emulate or
> > > single-step instructions.
> > >
> > > Tested on berlin arm64 platform.
> > >
> > > ~ # mount -t debugfs debugfs /sys/kernel/debug/
> > > ~ # cd /sys/kernel/debug/
> > > /sys/kernel/debug # echo 'p _do_fork' > tracing/kprobe_events
> > >
> > > before the patch:
> > >
> > > /sys/kernel/debug # cat kprobes/list
> > > ffffff801009fe28 k _do_fork+0x0 [DISABLED]
> > >
> > > after the patch:
> > >
> > > /sys/kernel/debug # cat kprobes/list
> > > ffffff801009ff54 k _do_fork+0x4 [DISABLED][FTRACE]
> >
> > BTW, it seems this automatically changes the offset without the
> > user's intention or any warning. How would you manage it if the user
> > passes a new probe on _do_fork+0x4?
>
> In the current implementation, the two probes would end up at the same
> address, _do_fork+0x4.
OK, that is my point.
> > IOW, the question remains: who really wants to probe _do_fork+0?
> > If kprobes modifies the offset automatically, no one can do that
> > anymore.
> > This can happen if the user wants to record the LR or SP value
> > at the function call for debugging. If kprobes always modifies it,
> > we will lose the way to do that.
>
> arm64's DYNAMIC_FTRACE_WITH_REGS implementation makes use of the GCC
> -fpatchable-function-entry=2 option to insert two nops. When the function
> is traced, the first nop is patched into the LR saver and the second
> into "bl <ftrace-entry>"; commit 3b23e4991fb6 ("arm64: implement ftrace
> with regs") explains the mechanism.
So both of the instructions, at func+0 and func+4, are replaced.
> So on arm64 (in fact, any arch that makes use of -fpatchable-function-entry
> will behave similarly), when DYNAMIC_FTRACE_WITH_REGS is enabled, the
> ftrace location is always at a 4-byte offset from the function entry.
>
> I think when users want to register a kprobe on _do_fork+0, what they
> really want is to probe the patched (by -fpatchable-function-entry)
> _do_fork+4.
OK, in this case, kprobes should treat the first 2 instructions as a
single virtual instruction. This means:
- kprobes can probe func+0, but not func+4, if the function is ftraced
  (-EILSEQ must be returned).
- both debugfs/kprobes/list and tracefs/kprobe_events should show the
  probed address as func+0, not func+4.
Then, the user can use kprobes as if there were one big (8-byte)
instruction at the entry of the function. Since probing on func+4 is
rejected and the call-site LR/SP is restored in ftrace, there should be
no contradiction. It should work as if we put a breakpoint on func+0.
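A hypothetical sketch of such a check on the arch side (the helper name
is illustrative, not an existing kernel function):

/*
 * Hypothetical: treat func+0 .. func+8 as one virtual instruction when
 * the function entry is an ftrace site (ftrace location == func + 4).
 */
static int check_ftrace_entry_probe(unsigned long func, unsigned long offset)
{
	unsigned long faddr = ftrace_location_range(func,
						    func + AARCH64_INSN_SIZE);

	if (!faddr)
		return 0;		/* not ftraced, nothing special */
	if (offset == AARCH64_INSN_SIZE)
		return -EILSEQ;		/* probing func+4 directly is rejected */
	return 0;			/* a probe at func+0 goes via ftrace */
}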
>
> PS: per my understanding, powerpc's kprobes_on_ftrace also introduces an
> extra automatic offset due to its implementation.
Uh, that is also ugly... it must be fixed.
> >
> > Could you remove the functions below for now?
> >
> > > +kprobe_opcode_t *kprobe_lookup_name(const char *name, unsigned int offset)
> > > +{
> > > + unsigned long addr = kallsyms_lookup_name(name);
> > > +
> > > + if (addr && !offset) {
> > > + unsigned long faddr;
> > > + /*
> > > + * with -fpatchable-function-entry=2, the first 4 bytes is the
> > > + * LR saver, then the actual call insn. So ftrace location is
> > > + * always on the first 4 bytes offset.
> > > + */
> > > + faddr = ftrace_location_range(addr,
> > > + addr + AARCH64_INSN_SIZE);
> > > + if (faddr)
> > > + return (kprobe_opcode_t *)faddr;
> > > + }
> > > + return (kprobe_opcode_t *)addr;
> > > +}
> > > +
> > > +bool arch_kprobe_on_func_entry(unsigned long offset)
> > > +{
> > > + return offset <= AARCH64_INSN_SIZE;
> > > +}
> >
> >
> > Without this automatic change, we can still change the offset
> > in the upper layer.
>
> If we remove the two functions, a kprobe on _do_fork can't ride on the
> ftrace infrastructure, but a kprobe on _do_fork+4 can. I'm not sure
> whether this is reasonable. Every kprobe user on arm64 would need to
> remember to pass an extra +4 offset to make use of KPROBES_ON_FTRACE.
> Could we hide the "+4"?
Yes, OK, as I said above, please hide the +4. We will see the virtual
"call" instruction (= "mov x9, lr; bl <ftrace-entry>") at the entry of
the ftraced function. :)
Thank you,
--
Masami Hiramatsu <[email protected]>
Hi,
On Tue, 24 Dec 2019 19:09:16 +0900 Masami Hiramatsu wrote:
>
> Hi Jisheng,
>
> On Mon, 23 Dec 2019 07:47:24 +0000
> Jisheng Zhang <[email protected]> wrote:
>
> > Hi Masami,
> >
> > On Wed, 18 Dec 2019 22:25:50 +0900 Masami Hiramatsu wrote:
> >
> >
> > >
> > >
> > > On Wed, 18 Dec 2019 06:21:35 +0000
> > > Jisheng Zhang <[email protected]> wrote:
> > >
> > > > KPROBES_ON_FTRACE avoids much of the overhead of regular kprobes as it
> > > > eliminates the need for a trap, as well as the need to emulate or
> > > > single-step instructions.
> > > >
> > > > Tested on berlin arm64 platform.
> > > >
> > > > ~ # mount -t debugfs debugfs /sys/kernel/debug/
> > > > ~ # cd /sys/kernel/debug/
> > > > /sys/kernel/debug # echo 'p _do_fork' > tracing/kprobe_events
> > > >
> > > > before the patch:
> > > >
> > > > /sys/kernel/debug # cat kprobes/list
> > > > ffffff801009fe28 k _do_fork+0x0 [DISABLED]
> > > >
> > > > after the patch:
> > > >
> > > > /sys/kernel/debug # cat kprobes/list
> > > > ffffff801009ff54 k _do_fork+0x4 [DISABLED][FTRACE]
> > >
> > > BTW, it seems this automatically changes the offset without the
> > > user's intention or any warning. How would you manage it if the user
> > > passes a new probe on _do_fork+0x4?
> >
> > In the current implementation, the two probes would end up at the same
> > address, _do_fork+0x4.
>
> OK, that is my point.
>
> > > IOW, the question remains: who really wants to probe _do_fork+0?
> > > If kprobes modifies the offset automatically, no one can do that
> > > anymore.
> > > This can happen if the user wants to record the LR or SP value
> > > at the function call for debugging. If kprobes always modifies it,
> > > we will lose the way to do that.
> >
> > arm64's DYNAMIC_FTRACE_WITH_REGS implementation makes use of the GCC
> > -fpatchable-function-entry=2 option to insert two nops. When the function
> > is traced, the first nop is patched into the LR saver and the second
> > into "bl <ftrace-entry>"; commit 3b23e4991fb6 ("arm64: implement ftrace
> > with regs") explains the mechanism.
>
> So both of the instructions, at func+0 and func+4, are replaced.
>
> > So on arm64 (in fact, any arch that makes use of -fpatchable-function-entry
> > will behave similarly), when DYNAMIC_FTRACE_WITH_REGS is enabled, the
> > ftrace location is always at a 4-byte offset from the function entry.
> >
> > I think when users want to register a kprobe on _do_fork+0, what they
> > really want is to probe the patched (by -fpatchable-function-entry)
> > _do_fork+4.
>
> OK, in this case, kprobes should treat the first 2 instructions as a
> single virtual instruction. This means:
>
> - kprobes can probe func+0, but not func+4, if the function is ftraced
>   (-EILSEQ must be returned).
> - both debugfs/kprobes/list and tracefs/kprobe_events should show the
>   probed address as func+0, not func+4.
>
> Then, the user can use kprobes as if there were one big (8-byte)
> instruction at the entry of the function. Since probing on func+4 is
> rejected and the call-site LR/SP is restored in ftrace, there should be
> no contradiction. It should work as if we put a breakpoint on func+0.
In https://lkml.org/lkml/2019/6/18/648, Naveen tried to allow probing on
any ftrace address. Is that acceptable on arm64 as well? I will post
patches for this purpose.
Thanks for your review.