2009-04-02 17:24:16

by Masami Hiramatsu

Subject: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer

Hi,

Here is version 4 of the kprobe-based event tracer patches for x86.

This version supports only x86(-32/-64) (If someone is interested in
porting this to other architectures, he just needs to port
kprobes/kretprobes and ptrace enhancement[PATCH 2/6]).

I added an x86 insn decoder in this version. It might be better
integrated with KVM's decoder, and the kprobes x86 code should be
rewritten to use it.


This can be applied on the linux-2.6-tip tree.

This patchset includes the following changes:
- Fix kernel_trap_sp() on x86 according to systemtap runtime. [1/6]
- Add arch-dep register and stack fetching functions [2/6]
- Add x86 instruction decoder [3/6]
- Check insertion point safety in kprobe [4/6]
- Add kprobe-tracer plugin [5/6]
- Support fetching various status (register/stack/memory/etc.) [6/6]

Done items:
- Add kernel_trap_sp() and fetch_*() on other archs.
- Support name-based register fetching (ax, bx, and so on)
- Support indirect memory fetch from registers etc.
- Check insertion point safety by using instruction decoder.

Future items:
- .init function tracing support.
- Support primitive types(long, ulong, int, uint, etc) for args.


kprobe-based event tracer
---------------------------

This tracer is similar to the events tracer, which is based on the Tracepoint
infrastructure. Instead of Tracepoint, this tracer is based on kprobes (kprobe
and kretprobe). It can probe anywhere kprobes can probe (that is, any function
body except for __kprobes functions).

Unlike the function tracer, this tracer can probe instructions inside of
kernel functions. It allows you to check which instruction has been executed.

Unlike the Tracepoint based events tracer, this tracer can add new probe points
on the fly.

Like the events tracer, this tracer doesn't need to be activated via
current_tracer; instead, just set probe points via
/debug/tracing/kprobe_probes.

Synopsis of kprobe_probes:
p SYMBOL[+offs|-offs]|MEMADDR [FETCHARGS] : set a probe
r SYMBOL[+0] [FETCHARGS] : set a return probe

FETCHARGS:
%REG : Fetch register REG
sN : Fetch Nth entry of stack (N >= 0)
@ADDR : Fetch memory at ADDR (ADDR should be in kernel)
@SYM[+|-offs] : Fetch memory at SYM +|- offs (SYM should be a data symbol)
aN : Fetch function argument. (N >= 0)(*)
rv : Fetch return value.(**)
ra : Fetch return address.(**)
+|-offs(FETCHARG) : fetch memory at FETCHARG +|- offs address.(***)

(*) aN may not be correct for asmlinkage functions or in the middle of a
function body.
(**) only for return probes.
(***) this is useful for fetching a field of a data structure.
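
As a rough illustration (not code from the patch), the FETCHARGS forms listed
above could be told apart as follows. The enum and function names are
hypothetical, and the check is deliberately shallow: it only looks at the
leading characters and does not validate register names, symbols, or nested
arguments.

```c
#include <assert.h>
#include <ctype.h>
#include <string.h>

/* Kinds of FETCHARG named in the synopsis. */
enum fetch_kind {
	FETCH_INVALID,
	FETCH_REG,	/* %REG */
	FETCH_STACK,	/* sN */
	FETCH_MEM,	/* @ADDR or @SYM[+|-offs] */
	FETCH_ARG,	/* aN */
	FETCH_RETVAL,	/* rv */
	FETCH_RETADDR,	/* ra */
	FETCH_INDIRECT	/* +|-offs(FETCHARG) */
};

/* Return 1 if s is a non-empty string of decimal digits. */
static int all_digits(const char *s)
{
	if (*s == '\0')
		return 0;
	for (; *s; s++)
		if (!isdigit((unsigned char)*s))
			return 0;
	return 1;
}

/* Classify one FETCHARG token (hypothetical helper, shallow check only). */
static enum fetch_kind classify_fetcharg(const char *arg)
{
	size_t len;

	assert(arg != NULL);
	len = strlen(arg);

	if (strcmp(arg, "rv") == 0)
		return FETCH_RETVAL;
	if (strcmp(arg, "ra") == 0)
		return FETCH_RETADDR;

	switch (arg[0]) {
	case '%':
		return arg[1] ? FETCH_REG : FETCH_INVALID;
	case '@':
		return arg[1] ? FETCH_MEM : FETCH_INVALID;
	case 's':
		return all_digits(arg + 1) ? FETCH_STACK : FETCH_INVALID;
	case 'a':
		return all_digits(arg + 1) ? FETCH_ARG : FETCH_INVALID;
	case '+':
	case '-':
		/* Shortest form is sign, digit, '(', one char, ')'. */
		if (len >= 5 && strchr(arg, '(') && arg[len - 1] == ')')
			return FETCH_INDIRECT;
		return FETCH_INVALID;
	default:
		return FETCH_INVALID;
	}
}
```

With this sketch, "+8(%bx)" is classified as an indirect fetch, while an
unknown token such as "rp" is rejected.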

E.g.
echo p do_sys_open a0 a1 a2 a3 > /debug/tracing/kprobe_probes

This sets a kprobe at the top of the do_sys_open() function, recording its
1st to 4th arguments.

echo r do_sys_open rv ra >> /debug/tracing/kprobe_probes

This sets a kretprobe on the return point of the do_sys_open() function,
recording the return value and the return address.

echo > /debug/tracing/kprobe_probes

This clears all probe points. You can see the traced information via
/debug/tracing/trace.

cat /debug/tracing/trace
# tracer: nop
#
# TASK-PID CPU# TIMESTAMP FUNCTION
# | | | | |
<...>-2376 [001] 262.389131: do_sys_open: @do_sys_open+0 0xffffff9c 0x98db83e 0x8880 0x0
<...>-2376 [001] 262.391166: sys_open: <-do_sys_open+0 0x5 0xc06e8ebb
<...>-2376 [001] 264.384876: do_sys_open: @do_sys_open+0 0xffffff9c 0x98db83e 0x8880 0x0
<...>-2376 [001] 264.386880: sys_open: <-do_sys_open+0 0x5 0xc06e8ebb
<...>-2084 [001] 265.380330: do_sys_open: @do_sys_open+0 0xffffff9c 0x804be3e 0x0 0x1b6
<...>-2084 [001] 265.380399: sys_open: <-do_sys_open+0 0x3 0xc06e8ebb

@SYMBOL means that the kernel hit a probe, and <-SYMBOL means that the kernel
returned from SYMBOL (e.g. "sys_open: <-do_sys_open+0" means the kernel
returned from do_sys_open to sys_open).
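
For anyone post-processing this output, a small sketch of telling the two
event kinds apart is shown here. It assumes the line layout of the sample
above (an "@" or "<-" marker following ": "); the enum and function names are
made up for illustration.

```c
#include <assert.h>
#include <string.h>

enum trace_event { TRACE_OTHER, TRACE_PROBE_HIT, TRACE_RETURN };

/* Classify one line of /debug/tracing/trace by its marker. */
static enum trace_event classify_trace_line(const char *line)
{
	assert(line != NULL);

	if (line[0] == '#')
		return TRACE_OTHER;	/* header comment */
	if (strstr(line, ": @"))
		return TRACE_PROBE_HIT;	/* kprobe hit */
	if (strstr(line, ": <-"))
		return TRACE_RETURN;	/* kretprobe: returned from SYMBOL */
	return TRACE_OTHER;
}
```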


 Documentation/ftrace.txt      |   70 ++++
 arch/x86/include/asm/insn.h   |  130 +++++++
 arch/x86/include/asm/ptrace.h |   70 ++++-
 arch/x86/kernel/kprobes.c     |   51 +++
 arch/x86/kernel/ptrace.c      |   59 +++
 arch/x86/lib/Makefile         |    1 +
 arch/x86/lib/insn.c           |  627 ++++++++++++++++++++++++++++++++
 kernel/trace/Kconfig          |    9 +
 kernel/trace/Makefile         |    1 +
 kernel/trace/trace_kprobe.c   |  789 +++++++++++++++++++++++++++++++++++++++++
 10 files changed, 1805 insertions(+), 2 deletions(-)

Thank you,


--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: [email protected]



2009-04-03 11:27:45

by Ingo Molnar

Subject: Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer


* Masami Hiramatsu <[email protected]> wrote:

> Hi,
>
> Here are the patches of kprobe-based event tracer for x86, version 4.
>
> This version supports only x86(-32/-64) (If someone is interested in
> porting this to other architectures, he just needs to port
> kprobes/kretprobes and ptrace enhancement[PATCH 2/6]).
>
> I added x86 insn decoder on this version. It might be better
> integrated with KVM's decoder, and kprobes x86 code should be
> rewritten with it.
>
>
> This can be applied on the linux-2.6-tip tree.
>
> This patchset includes following changes:
> - Fix kernel_trap_sp() on x86 according to systemtap runtime. [1/6]
> - Add arch-dep register and stack fetching functions [2/6]
> - Add x86 instruction decoder [3/6]
> - Check insertion point safety in kprobe [4/6]
> - Add kprobe-tracer plugin [5/6]
> - Support fetching various status (register/stack/memory/etc.) [6/6]

ok, the structure and concept looks quite good now, really nice!

I'm wondering about something i suggested many moons ago: to look
into the KVM decoder+emulator (arch/x86/kvm/x86_emulate.c).

I remember there were some issues with that (one problem being that
the KVM decoder is a special-purpose thing covering specific range
of execution environments - not a near-full integer-ops decoder like
the one we are aiming for here) - are there any other fundamental
problems beyond 'it has to be done' ?

Conceptually we want just a single piece of decoder logic in
arch/x86/. If the KVM folks are cool with it we could factor out the
KVM one into arch/x86/lib/. But ... if there are compelling reasons
to leave the KVM one alone in its limited environment we can do that
too.

Avi, Peter, what's your take on this?

Ingo

2009-04-03 11:30:30

by Andi Kleen

Subject: Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer

> I'm wondering about something i suggested many moons ago: to look
> into the KVM decoder+emulator (arch/x86/kvm/x86_emulate.c).

Hi Ingo,
Me and Masami just discussed this a few emails ago in this thread:)

-Andi
--
[email protected] -- Speaking for myself only.

2009-04-03 11:51:24

by Avi Kivity

Subject: Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer

Ingo Molnar wrote:
> ok, the structure and concept looks quite good now, really nice!
>
> I'm wondering about something i suggested many moons ago: to look
> into the KVM decoder+emulator (arch/x86/kvm/x86_emulate.c).
>
> I remember there were some issues with that (one problem being that
> the KVM decoder is a special-purpose thing covering specific range
> of execution environments - not a near-full integer-ops decoder like
> the one we are aiming for here) - are there any other fundamental
> problems beyond 'it has to be done' ?
>
> Conceptually we want just a single piece of decoder logic in
> arch/x86/. If the KVM folks are cool with it we could factor out the
> KVM one into arch/x86/lib/. But ... if there are compelling reasons
> to leave the KVM one alone in its limited environment we can do that
> too.
>

kvm has three requirements not needed by kprobes:
- it wants to execute instructions, not just decode them, including
generating faults where appropriate
- it is performance critical
- it needs to support 16-bit, 32-bit, and 64-bit instructions simultaneously

If an arch/x86/ decoder/emulator gives me these I'll gladly switch to
it. x86_emulate.c is high on my list of most disliked code.

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

2009-04-03 12:12:56

by Ingo Molnar

Subject: Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer


* Avi Kivity <[email protected]> wrote:

> Ingo Molnar wrote:
>> ok, the structure and concept looks quite good now, really nice!
>>
>> I'm wondering about something i suggested many moons ago: to look into
>> the KVM decoder+emulator (arch/x86/kvm/x86_emulate.c).
>>
>> I remember there were some issues with that (one problem being
>> that the KVM decoder is a special-purpose thing covering specific
>> range of execution environments - not a near-full integer-ops
>> decoder like the one we are aiming for here) - are there any
>> other fundamental problems beyond 'it has to be done' ?
>>
>> Conceptually we want just a single piece of decoder logic in
>> arch/x86/. If the KVM folks are cool with it we could factor out
>> the KVM one into arch/x86/lib/. But ... if there are compelling
>> reasons to leave the KVM one alone in its limited environment we
>> can do that too.
>
> kvm has three requirements not needed by kprobes:
> - it wants to execute instructions, not just decode them, including
> generating faults where appropriate
> - it is performance critical
> - it needs to support 16-bit, 32-bit, and 64-bit instructions simultaneously
>
> If an arch/x86/ decoder/emulator gives me these I'll gladly switch
> to it. x86_emulate.c is high on my list of most disliked code.

Well, this has to be driven from the KVM side as the kprobes use
will only be for decoding so if it's modified from the kprobes side
the KVM-only functionality might regress.

So ... we can do the library decoder for kprobes purposes, and
someone versed in the KVM emulator can then combine the two.

Ingo

2009-04-03 12:18:00

by Avi Kivity

Subject: Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer

Ingo Molnar wrote:
>> kvm has three requirements not needed by kprobes:
>> - it wants to execute instructions, not just decode them, including
>> generating faults where appropriate
>> - it is performance critical
>> - it needs to support 16-bit, 32-bit, and 64-bit instructions simultaneously
>>
>> If an arch/x86/ decoder/emulator gives me these I'll gladly switch
>> to it. x86_emulate.c is high on my list of most disliked code.
>>
>
> Well, this has to be driven from the KVM side as the kprobes use
> will only be for decoding so if it's modified from the kprobes side
> the KVM-only functionality might regress.
>
> So ... we can do the library decoder for kprobes purposes, and
> someone versed in the KVM emulator can then combine the two.
>

Problem is, anyone versed in the kvm emulator will want to run as far
away from this work as possible.

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

2009-04-03 12:24:07

by Andi Kleen

Subject: Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer

> So ... we can do the library decoder for kprobes purposes, and
> someone versed in the KVM emulator can then combine the two.

The KVM (or rather Xen, that is where it comes from) decoder is already
a "library decoder". That is it does nearly everything
through callbacks, and if you don't want some functionality
you can nop the callbacks. Nearly because some
direct KVM references have crept in recently (e.g. to vcpus),
but those could be probably removed again without too much effort.
There are not many of them.
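
To make the callback idea concrete, here is a minimal sketch. The struct and
function names are hypothetical and are not the real x86_emulate.c interface;
the point is only that a user who does not want a piece of functionality can
plug in a no-op, or a callback that merely records the access.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical callback table, for illustration only. */
struct emu_ops {
	/* Read len bytes at addr into buf; return 0 on success. */
	int (*read_mem)(unsigned long addr, void *buf, size_t len, void *ctx);
	/* Write len bytes from buf to addr; return 0 on success. */
	int (*write_mem)(unsigned long addr, const void *buf, size_t len,
			 void *ctx);
};

/* No-op callbacks for a user that does not want memory access. */
static int nop_read(unsigned long addr, void *buf, size_t len, void *ctx)
{
	(void)addr; (void)ctx;
	memset(buf, 0, len);	/* pretend every read returns zeroes */
	return 0;
}

static int nop_write(unsigned long addr, const void *buf, size_t len,
		     void *ctx)
{
	(void)addr; (void)buf; (void)len; (void)ctx;
	return 0;		/* silently discard the store */
}

static const struct emu_ops nop_ops = { nop_read, nop_write };

/* A callback that records the access instead of performing it. */
struct access_log {
	unsigned long addr;
	size_t len;
	int count;
};

static int log_write(unsigned long addr, const void *buf, size_t len,
		     void *ctx)
{
	struct access_log *log = ctx;

	(void)buf;
	log->addr = addr;
	log->len = len;
	log->count++;
	return 0;
}

/* Toy "instruction": store a value through whatever callbacks are set. */
static int emulate_store32(const struct emu_ops *ops, unsigned long addr,
			   unsigned int val, void *ctx)
{
	assert(ops != NULL && ops->write_mem != NULL);
	return ops->write_mem(addr, &val, sizeof(val), ctx);
}
```

Passing nop_ops drops the store entirely; passing a table with log_write lets
the caller see which address and size the toy instruction touched.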

Also doing another interpreter is a lot of work and a lot of testing,
so basing it on something that is already well tested is probably
a good idea.

-/dev/null/Andi
--
[email protected] -- Speaking for myself only.

2009-04-03 12:28:43

by Ingo Molnar

Subject: Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer


* Avi Kivity <[email protected]> wrote:

> Ingo Molnar wrote:
>>> kvm has three requirements not needed by kprobes:
>>> - it wants to execute instructions, not just decode them, including
>>> generating faults where appropriate
>>> - it is performance critical
>>> - it needs to support 16-bit, 32-bit, and 64-bit instructions simultaneously
>>>
>>> If an arch/x86/ decoder/emulator gives me these I'll gladly switch
>>> to it. x86_emulate.c is high on my list of most disliked code.
>>>
>>
>> Well, this has to be driven from the KVM side as the kprobes use
>> will only be for decoding so if it's modified from the kprobes
>> side the KVM-only functionality might regress.
>>
>> So ... we can do the library decoder for kprobes purposes, and
>> someone versed in the KVM emulator can then combine the two.
>
> Problem is, anyone versed in the kvm emulator will want to run as
> far away from this work as possible.

Are you suggesting that the KVM emulator should never have been
merged in the first place? ;-)

Anyway, we'll make sure the kprobes/library decoder is as clean as
possible - so it ought to be hackable and extensible without the
risk of permanent brain damage. Mmiotrace and kmemcheck have decoding
smarts too, and i think the sw-breakpoint injection code of KGDB
could use it as well - so there's broader utility in all this.

Ingo

2009-04-03 12:34:31

by Avi Kivity

Subject: Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer

Ingo Molnar wrote:
>> Problem is, anyone versed in the kvm emulator will want to run as
>> far away from this work as possible.
>>
>
> Are you suggesting that the KVM emulator should never have been
> merged in the first place? ;-)
>

Truth always comes out eventually.

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

2009-04-03 13:17:25

by Vegard Nossum

Subject: Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer

2009/4/3 Ingo Molnar <[email protected]>:
>
> * Avi Kivity <[email protected]> wrote:
>
>> Ingo Molnar wrote:
>>>> kvm has three requirements not needed by kprobes:
>>>> - it wants to execute instructions, not just decode them, including
>>>>   generating faults where appropriate
>>>> - it is performance critical
>>>> - it needs to support 16-bit, 32-bit, and 64-bit instructions simultaneously
>>>>
>>>> If an arch/x86/ decoder/emulator gives me these I'll gladly switch
>>>> to it.  x86_emulate.c is high on my list of most disliked code.
>>>>
>>>
>>> Well, this has to be driven from the KVM side as the kprobes use
>>> will only be for decoding so if it's modified from the kprobes
>>> side the KVM-only functionality might regress.
>>>
>>> So ... we can do the library decoder for kprobes purposes, and
>>> someone versed in the KVM emulator can then combine the two.
>>
>> Problem is, anyone versed in the kvm emulator will want to run as
>> far away from this work as possible.
>
> Are you suggesting that the KVM emulator should never have been
> merged in the first place? ;-)
>
> Anyway, we'll make sure the kprobes/library decoder is as clean as
> possible - so it ought to be hackable and extensible without the
> risk of permanent brain damage. Mmiotrace and kmemcheck has decoding
> smarts too, and i think the sw-breakpoint injection code of KGDB
> could use it as well - so there's broader utility in all this.

(Sorry in advance for jumping in -- my post may be irrelevant)

For the record, kmemcheck requirements for an instruction decoder are these:

For any instruction with memory operands, we need to know which are
the operands (so for movl %eax, (%ebx) we need to combine the
instruction with a struct pt_regs to get the actual address
dereferenced, i.e. the contents of %ebx), and their sizes (for movzbl,
the source operand is 8 bits, destination operand is 32 bits). For
things like movsb, we need to be able to get both %esi and %edi.

mmiotrace additionally needs to know what the actual values
read/written were, for instructions that read/write to memory (again,
combined with a struct pt_regs).

Maybe this doesn't really say much, since this is what a generic
instruction decoder would be able to do anyway. But kmemcheck and
mmiotrace both have very special-purpose decoders. I don't really know
what other decoders look like, but what I would wish for is this: Some
macros for iterating the operands, where each operand has a type (e.g.
input (for reads), output (for writes), target (for jumps), immediate
address, immediate value, etc.), a size (in bits), and a way to
evaluate the operand. So eval(op, regs) for op=%eax, it will return
regs->eax; for op=4(%eax), it will return regs->eax + 4; for op=4 it
will return 4, etc.
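
A minimal sketch of what such an eval() could look like follows. The types
are invented for illustration: struct toy_regs stands in for struct pt_regs
and carries only two registers, and the operand descriptor is an assumption,
not something an existing decoder provides.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-in for struct pt_regs; only two registers, for illustration. */
struct toy_regs {
	unsigned long ax;
	unsigned long bx;
};

enum op_kind {
	OP_REG,		/* %eax    -> value of the register */
	OP_MEM,		/* 4(%eax) -> register + displacement (the address) */
	OP_IMM		/* 4       -> the immediate itself */
};

struct operand {
	enum op_kind kind;
	size_t reg_off;		/* offsetof() into struct toy_regs (REG/MEM) */
	long disp;		/* displacement for MEM, value for IMM */
	unsigned int bits;	/* operand size in bits */
};

/* Fetch the register stored at byte offset off in the register set. */
static unsigned long read_reg(const struct toy_regs *regs, size_t off)
{
	return *(const unsigned long *)((const char *)regs + off);
}

/* Evaluate an operand against a register snapshot. */
static unsigned long eval_operand(const struct operand *op,
				  const struct toy_regs *regs)
{
	assert(op != NULL && regs != NULL);

	switch (op->kind) {
	case OP_REG:
		return read_reg(regs, op->reg_off);
	case OP_MEM:
		return read_reg(regs, op->reg_off) + (unsigned long)op->disp;
	case OP_IMM:
		return (unsigned long)op->disp;
	}
	return 0;
}
```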

Both kmemcheck and mmiotrace could gain SMP support with instruction
emulation, though it is strictly not necessary. In that case, though,
we would not want to emulate fault handling, etc. (i.e. the fault
should always be generated by the CPU itself).

Please do put me on Cc for future discussions, though.


Vegard

--
"The animistic metaphor of the bug that maliciously sneaked in while
the programmer was not looking is intellectually dishonest as it
disguises that the error is the programmer's own creation."
-- E. W. Dijkstra, EWD1036

2009-04-03 13:41:03

by Avi Kivity

Subject: Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer

Vegard Nossum wrote:
> For the record, kmemcheck requirements for an instruction decoder are these:
>
> For any instruction with memory operands, we need to know which are
> the operands (so for movl %eax, (%ebx) we need to combine the
> instruction with a struct pt_regs to get the actual address
> dereferenced, i.e. the contents of %ebx), and their sizes (for movzbl,
> the source operand is 8 bits, destination operand is 32 bits). For
> things like movsb, we need to be able to get both %esi and %edi.
>
>

The kvm emulator does all of this.

> mmiotrace additionally needs to know what the actual values
> read/written were, for instructions that read/write to memory (again,
> combined with a struct pt_regs).
>

And this.

> Maybe this doesn't really say much, since this is what a generic
> instruction decoder would be able to do anyway. But kmemcheck and
> mmiotrace both have very special-purpose decoders. I don't really know
> what other decoders look like, but what I would wish for is this: Some
> macros for iterating the operands, where each operand has a type (e.g.
> input (for reads), output (for writes), target (for jumps), immediate
> address, immediate value, etc.), a size (in bits), and a way to
> evaluate the operand. So eval(op, regs) for op=%eax, it will return
> regs->eax; for op=4(%eax), it will return regs->eax + 4; for op=4 it
> will return 4, etc.
>

You can do something like this by executing the instruction and
observing what memory it touches through the callbacks.

--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

2009-04-03 13:53:38

by Masami Hiramatsu

Subject: Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer

Vegard Nossum wrote:
> 2009/4/3 Ingo Molnar <[email protected]>:
>> * Avi Kivity <[email protected]> wrote:
>>
>>> Ingo Molnar wrote:
>>>>> kvm has three requirements not needed by kprobes:
>>>>> - it wants to execute instructions, not just decode them, including
>>>>> generating faults where appropriate
>>>>> - it is performance critical
>>>>> - it needs to support 16-bit, 32-bit, and 64-bit instructions simultaneously
>>>>>
>>>>> If an arch/x86/ decoder/emulator gives me these I'll gladly switch
>>>>> to it. x86_emulate.c is high on my list of most disliked code.
>>>>>
>>>> Well, this has to be driven from the KVM side as the kprobes use
>>>> will only be for decoding so if it's modified from the kprobes
>>>> side the KVM-only functionality might regress.
>>>>
>>>> So ... we can do the library decoder for kprobes purposes, and
>>>> someone versed in the KVM emulator can then combine the two.
>>> Problem is, anyone versed in the kvm emulator will want to run as
>>> far away from this work as possible.
>> Are you suggesting that the KVM emulator should never have been
>> merged in the first place? ;-)
>>
>> Anyway, we'll make sure the kprobes/library decoder is as clean as
>> possible - so it ought to be hackable and extensible without the
>> risk of permanent brain damage. Mmiotrace and kmemcheck has decoding
>> smarts too, and i think the sw-breakpoint injection code of KGDB
>> could use it as well - so there's broader utility in all this.
>
> (Sorry in advance for jumping in -- my post may be irrelevant)

Thank you for clarifying your needs :-)

> For the record, kmemcheck requirements for an instruction decoder are these:
>
> For any instruction with memory operands, we need to know which are
> the operands (so for movl %eax, (%ebx) we need to combine the
> instruction with a struct pt_regs to get the actual address
> dereferenced, i.e. the contents of %ebx), and their sizes (for movzbl,
> the source operand is 8 bits, destination operand is 32 bits). For
> things like movsb, we need to be able to get both %esi and %edi.

The new decoder can give you the value of mod/rm (insn.modrm), the operand
size (insn.opnd_bytes), and the immediate size (insn.immediate.nbytes).
To find out which register is used, you can decode modrm with the MODRM_*()
macros.
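
As an illustration of the field layout (a sketch, not the macros from the
patch), the TOY_MODRM_*() names below are hypothetical. The helper assumes
32-bit addressing with no REX prefix and only handles the simple
base-register forms.

```c
#include <assert.h>
#include <stddef.h>

/* ModRM layout: mod = bits 7-6, reg = bits 5-3, rm = bits 2-0. */
#define TOY_MODRM_MOD(b)	(((b) >> 6) & 0x3)
#define TOY_MODRM_REG(b)	(((b) >> 3) & 0x7)
#define TOY_MODRM_RM(b)		((b) & 0x7)

/* 32-bit register names indexed by the 3-bit register number. */
static const char *const reg32_names[8] = {
	"eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"
};

/* Name of the register selected by the reg field. */
static const char *modrm_reg_name(unsigned char modrm)
{
	unsigned int reg = TOY_MODRM_REG(modrm);

	assert(reg < 8);
	return reg32_names[reg];
}

/*
 * Name of the base register when r/m addresses memory through a single
 * register. Returns NULL for register-direct operands (mod == 3),
 * SIB forms (rm == 4) and the disp32-only form (mod == 0, rm == 5).
 */
static const char *modrm_base_reg(unsigned char modrm)
{
	unsigned int mod = TOY_MODRM_MOD(modrm);
	unsigned int rm = TOY_MODRM_RM(modrm);

	if (mod == 3 || rm == 4 || (mod == 0 && rm == 5))
		return NULL;
	return reg32_names[rm];
}
```

For example, movl %eax, (%ebx) is encoded as 89 03: the ModRM byte 0x03 has
mod = 0, reg = 0 (eax) and rm = 3 (ebx), so the address to check comes from
%ebx in the saved registers.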

> mmiotrace additionally needs to know what the actual values
> read/written were, for instructions that read/write to memory (again,
> combined with a struct pt_regs).

The decoder doesn't use any locks/shared memory, so you can
use it in interrupt context, with pt_regs.

> Maybe this doesn't really say much, since this is what a generic
> instruction decoder would be able to do anyway. But kmemcheck and
> mmiotrace both have very special-purpose decoders. I don't really know
> what other decoders look like, but what I would wish for is this: Some
> macros for iterating the operands, where each operand has a type (e.g.
> input (for reads), output (for writes), target (for jumps), immediate
> address, immediate value, etc.), a size (in bits), and a way to
> evaluate the operand. So eval(op, regs) for op=%eax, it will return
> regs->eax; for op=4(%eax), it will return regs->eax + 4; for op=4 it
> will return 4, etc.

Hmm, it's an interesting idea. I think operand classifying can be done by
evaluating opcode and mod/rm.

> Both kmemcheck and mmiotrace could gain SMP support with instruction
> emulation, though it is strictly not necessary. In that case, though,
> we would not want to emulate fault handling, etc. (i.e. the fault
> should always be generated by the CPU itself).
>
> Please do put me on Cc for future discussions, though.

Of course, thank you!

--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: [email protected]

2009-04-03 14:22:14

by Masami Hiramatsu

Subject: Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer

Avi Kivity wrote:
> Ingo Molnar wrote:
>> ok, the structure and concept looks quite good now, really nice!
>>
>> I'm wondering about something i suggested many moons ago: to look into
>> the KVM decoder+emulator (arch/x86/kvm/x86_emulate.c).
>>
>> I remember there were some issues with that (one problem being that
>> the KVM decoder is a special-purpose thing covering specific range of
>> execution environments - not a near-full integer-ops decoder like the
>> one we are aiming for here) - are there any other fundamental problems
>> beyond 'it has to be done' ?
>>
>> Conceptually we want just a single piece of decoder logic in
>> arch/x86/. If the KVM folks are cool with it we could factor out the
>> KVM one into arch/x86/lib/. But ... if there are compelling reasons to
>> leave the KVM one alone in its limited environment we can do that too.
>>
>
> kvm has three requirements not needed by kprobes:
> - it wants to execute instructions, not just decode them, including
> generating faults where appropriate
> - it is performance critical
> - it needs to support 16-bit, 32-bit, and 64-bit instructions
> simultaneously

Hmm, I'd like to know whether kvm actually aims to emulate all kinds of
instructions. If so, I might find some bugs in x86_emulate.c.
However, I don't know all the bugs. To find all of them, we would have to
port x86_emulate.c to user space, decode binaries with it, and
compare its output with another decoder, as Jim did with insn.c.

https://www.redhat.com/archives/utrace-devel/2009-March/msg00031.html


Thank you,

--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: [email protected]

2009-04-03 14:24:24

by Ingo Molnar

Subject: Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer


* Masami Hiramatsu <[email protected]> wrote:

> Hmm, I'd like to know actually kvm aims to emulate all kinds of
> instructions. If so, I might find some bugs in x86_emulate.c.
> However, I don't know all bugs. To find all of them, we have to
> port x86_emulate.c to user-space, decode binaries with it, and
> compare its output with another decoder, as Jim had done with
> insn.c.
>
> https://www.redhat.com/archives/utrace-devel/2009-March/msg00031.html

btw., i'd suggest we put a build time check for this into the kernel
version as well. For example, disassemble vmlinux via objdump, run
it through your decoder as well, and compare the results. Put it under a
CONFIG_DEBUG_X86_DECODER_TEST kind of (default-off) build-time
self-test.

This would ensure that the kernel we are running is fully supported
by the decoder - even as GCC/GAS starts using new instructions, etc.
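
A rough sketch of the comparison step is given here. It assumes the usual
"objdump -d" line layout (address, colon, tab, hex bytes, tab, mnemonic), and
the decoder is passed in as a function pointer because its real interface is
not assumed; toy_len_one() is a deliberately trivial placeholder. Long
instructions that objdump wraps onto a continuation line are not handled.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Value of one hex digit; the caller guarantees isxdigit(c). */
static int hexval(char c)
{
	if (c >= '0' && c <= '9')
		return c - '0';
	return tolower((unsigned char)c) - 'a' + 10;
}

/*
 * Extract the instruction bytes from one "objdump -d" line.
 * Returns the number of bytes stored in out, or 0 if the line has no
 * byte column (headers, labels, blank lines).
 */
static size_t objdump_line_bytes(const char *line, unsigned char *out,
				 size_t max)
{
	const char *p;
	size_t n = 0;

	assert(line != NULL && out != NULL);

	p = strchr(line, ':');
	if (!p || p[1] != '\t')
		return 0;
	p += 2;

	/* Consume "xx " pairs until the tab that precedes the mnemonic. */
	while (n < max && isxdigit((unsigned char)p[0]) &&
	       isxdigit((unsigned char)p[1])) {
		out[n++] = (unsigned char)((hexval(p[0]) << 4) | hexval(p[1]));
		p += 2;
		if (*p == ' ')
			p++;
	}
	return n;
}

/* Hypothetical decoder hook: length of the instruction starting at buf. */
typedef size_t (*insn_len_fn)(const unsigned char *buf, size_t avail);

/* Placeholder decoder that claims every instruction is one byte long. */
static size_t toy_len_one(const unsigned char *buf, size_t avail)
{
	(void)buf; (void)avail;
	return 1;
}

/* Return 1 if the decoder agrees with objdump about this line's length. */
static int line_matches(const char *line, insn_len_fn decode_len)
{
	unsigned char buf[16];	/* x86 instructions are at most 15 bytes */
	size_t n = objdump_line_bytes(line, buf, sizeof(buf));

	if (n == 0)
		return 1;	/* nothing to check on this line */
	return decode_len(buf, n) == n;
}
```

A build-time test would feed every disassembly line through line_matches()
with the real decoder and fail the build on the first mismatch.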

How does this sound to you?

Ingo

2009-04-03 14:31:22

by Avi Kivity

Subject: Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer

Masami Hiramatsu wrote:
> Hmm, I'd like to know actually kvm aims to emulate all kinds of
> instructions.

We're less interested in fpu/sse. The interesting instructions are
those used for page table management, mmio, and real mode execution.

> If so, I might find some bugs in x86_emulate.c.
> However, I don't know all bugs. To find all of them, we have to
> port x86_emulate.c to user-space, decode binaries with it, and
> compare its output with another decoder, as Jim had done with insn.c.
>
>

That would be very useful.


--
I have a truly marvellous patch that fixes the bug which this
signature is too narrow to contain.

2009-04-03 16:56:20

by Masami Hiramatsu

Subject: Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer

Ingo Molnar wrote:
> * Masami Hiramatsu <[email protected]> wrote:
>
>> Hmm, I'd like to know actually kvm aims to emulate all kinds of
>> instructions. If so, I might find some bugs in x86_emulate.c.
>> However, I don't know all bugs. To find all of them, we have to
>> port x86_emulate.c to user-space, decode binaries with it, and
>> compare its output with another decoder, as Jim had done with
>> insn.c.
>>
>> https://www.redhat.com/archives/utrace-devel/2009-March/msg00031.html
>
> btw., i'd suggest we put a build time check for this into the kernel
> version as well. For example to decode the vmlinux via objdump, run
> it through your decoder as well and compare the results. Put under a
> CONFIG_DEBUG_X86_DECODER_TEST kind of (default-off) build-time
> self-test.
>
> This would ensure that the kernel we are running is fully supported
> by the decoder - even as GCC/GAS starts using new instructions, etc.
>
> How does this sound to you?

Thanks! That is a good idea.
Jim, do you think you could port your script into the kernel tree?

Thank you,

--
Masami Hiramatsu

Software Engineer
Hitachi Computer Products (America) Inc.
Software Solutions Division

e-mail: [email protected]

2009-04-03 18:01:20

by Jim Keniston

Subject: Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer

On Fri, 2009-04-03 at 12:55 -0400, Masami Hiramatsu wrote:
> Ingo Molnar wrote:
> > * Masami Hiramatsu <[email protected]> wrote:
> >
> >> Hmm, I'd like to know actually kvm aims to emulate all kinds of
> >> instructions. If so, I might find some bugs in x86_emulate.c.
> >> However, I don't know all bugs. To find all of them, we have to
> >> port x86_emulate.c to user-space, decode binaries with it, and
> >> compare its output with another decoder, as Jim had done with
> >> insn.c.
> >>
> >> https://www.redhat.com/archives/utrace-devel/2009-March/msg00031.html
> >
> > btw., i'd suggest we put a build time check for this into the kernel
> > version as well. For example to decode the vmlinux via objdump, run
> > it through your decoder as well and compare the results. Put under a
> > CONFIG_DEBUG_X86_DECODER_TEST kind of (default-off) build-time
> > self-test.
> >
> > This would ensure that the kernel we are running is fully supported
> > by the decoder - even as GCC/GAS starts using new instructions, etc.
> >
> > How does this sound to you?
>
> Thanks! That is a good idea.
> Jim, would you think you can port your script into kernel tree?
...

I'd be happy to do what's needed to make it happen, and maintain it in
the face of x86 changes. The script itself is practically nothing (~100
lines of awk and C), but what I don't know about the kernel build is a
lot, so I'd need some help from a kernel-build expert.

Jim

2009-04-05 19:37:41

by Pekka Paalanen

Subject: Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer

On Fri, 03 Apr 2009 09:52:09 -0400
Masami Hiramatsu <[email protected]> wrote:

> Vegard Nossum wrote:
> > 2009/4/3 Ingo Molnar <[email protected]>:
> >> * Avi Kivity <[email protected]> wrote:
> >>
> >>> Ingo Molnar wrote:
> >>>>> kvm has three requirements not needed by kprobes:
> >>>>> - it wants to execute instructions, not just decode them, including
> >>>>> generating faults where appropriate
> >>>>> - it is performance critical
> >>>>> - it needs to support 16-bit, 32-bit, and 64-bit instructions simultaneously
> >>>>>
> >>>>> If an arch/x86/ decoder/emulator gives me these I'll gladly switch
> >>>>> to it. x86_emulate.c is high on my list of most disliked code.
> >>>>>
> >>>> Well, this has to be driven from the KVM side as the kprobes use
> >>>> will only be for decoding so if it's modified from the kprobes
> >>>> side the KVM-only functionality might regress.
> >>>>
> >>>> So ... we can do the library decoder for kprobes purposes, and
> >>>> someone versed in the KVM emulator can then combine the two.
> >>> Problem is, anyone versed in the kvm emulator will want to run as
> >>> far away from this work as possible.
> >> Are you suggesting that the KVM emulator should never have been
> >> merged in the first place? ;-)
> >>
> >> Anyway, we'll make sure the kprobes/library decoder is as clean as
> >> possible - so it ought to be hackable and extensible without the
> >> risk of permanent brain damage. Mmiotrace and kmemcheck has decoding
> >> smarts too, and i think the sw-breakpoint injection code of KGDB
> >> could use it as well - so there's broader utility in all this.
> >
> > (Sorry in advance for jumping in -- my post may be irrelevant)
>
> Thank you for clarify your needs :-)
>
> > For the record, kmemcheck requirements for an instruction decoder are these:
> >
> > For any instruction with memory operands, we need to know which are
> > the operands (so for movl %eax, (%ebx) we need to combine the
> > instruction with a struct pt_regs to get the actual address
> > dereferenced, i.e. the contents of %ebx), and their sizes (for movzbl,
> > the source operand is 8 bits, destination operand is 32 bits). For
> > things like movsb, we need to be able to get both %esi and %edi.
>
> New decoder can give you the value of mod/rm(insn.modrm), operand size
> (insn.opnd_bytes), and immediate size (insn.immediate.nbytes)
> To get which register is used, you can decode modrm with MODRM_*()
> macros.
>
> > mmiotrace additionally needs to know what the actual values
> > read/written were, for instructions that read/write to memory (again,
> > combined with a struct pt_regs).
>
> The decoder doesn't use any locks/shared memory, so you can
> use it in interrupt context, with pt_regs.
>
> > Maybe this doesn't really say much, since this is what a generic
> > instruction decoder would be able to do anyway. But kmemcheck and
> > mmiotrace both have very special-purpose decoders. I don't really know
> > what other decoders look like, but what I would wish for is this: Some
> > macros for iterating the operands, where each operand has a type (e.g.
> > input (for reads), output (for writes), target (for jumps), immediate
> > address, immediate value, etc.), a size (in bits), and a way to
> > evaluate the operand. So eval(op, regs) for op=%eax, it will return
> > regs->eax; for op=4(%eax), it will return regs->eax + 4; for op=4 it
> > will return 4, etc.
>
> Hmm, it's an interesting idea. I think operand classifying can be done by
> evaluating opcode and mod/rm.
>
> > Both kmemcheck and mmiotrace could gain SMP support with instruction
> > emulation, though it is strictly not necessary. In that case, though,
> > we would not want to emulate fault handling, etc. (i.e. the fault
> > should always be generated by the CPU itself).
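
The operand interface wished for in the quoted text, where eval(op, regs) yields regs->eax for %eax, regs->eax + 4 for 4(%eax), and 4 for an immediate 4, can be sketched roughly as below. This is only an illustration under stated assumptions: `struct regs_sketch`, `struct operand_sketch`, `reg_value()` and `eval_operand()` are hypothetical names, not part of the posted decoder or of struct pt_regs.

```c
#include <stdint.h>

/* Hypothetical 32-bit register file standing in for struct pt_regs. */
struct regs_sketch {
	uint32_t eax, ecx, edx, ebx, esp, ebp, esi, edi;
};

/* Operand classes named in the thread, reduced to three for brevity. */
enum op_kind { OP_REG, OP_MEM, OP_IMM };

/* Hypothetical operand descriptor that a decoder could fill in. */
struct operand_sketch {
	enum op_kind kind;
	unsigned reg;        /* register number (OP_REG) or base register (OP_MEM) */
	int32_t disp;        /* displacement (OP_MEM) or immediate value (OP_IMM) */
	unsigned size_bits;  /* operand size in bits */
};

/* Map an x86 register number (0 = eax ... 7 = edi) to its saved value. */
static uint32_t reg_value(const struct regs_sketch *r, unsigned idx)
{
	switch (idx & 7) {
	case 0: return r->eax;
	case 1: return r->ecx;
	case 2: return r->edx;
	case 3: return r->ebx;
	case 4: return r->esp;
	case 5: return r->ebp;
	case 6: return r->esi;
	default: return r->edi;
	}
}

/*
 * Evaluate an operand against saved registers: a register yields its
 * contents, a memory operand yields the effective address, and an
 * immediate yields its value.
 */
static uint32_t eval_operand(const struct operand_sketch *op,
			     const struct regs_sketch *r)
{
	switch (op->kind) {
	case OP_REG:
		return reg_value(r, op->reg);
	case OP_MEM:
		return reg_value(r, op->reg) + (uint32_t)op->disp;
	case OP_IMM:
		return (uint32_t)op->disp;
	}
	return 0;
}
```

A real implementation would also have to cover SIB addressing, segment overrides and 64-bit registers, all of which this sketch leaves out.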

Not just emulation but address diversion, i.e. modifying the operation
(not the text) before executing it. Mmiotrace could do something like
this:
1. a blob calls ioremap
2. mmiotrace maps the MMIO area privately
3. the blob receives a dummy map from ioremap, that will generate
a page fault
4. the blob accesses the dummy map and raises a page fault
5. pf handler detects the dummy map
6. mmiotrace pf handler emulates the instruction and replaces the
dummy address with the real MMIO address.
7. mmiotrace records the operation and the datum
8. go to step 4, or whatever

This means mmiotrace would not have to fiddle with the page
tables and page presence bits like it does now. As said, this
would make mmiotrace SMP-proof, and also eliminate the die notifier
(used for the instruction single stepping trap).
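
The address replacement in step 6 amounts to a range check and an offset translation. Here is a minimal sketch of that one step, assuming a single traced mapping; `struct divert_map` and `divert_address()` are hypothetical names, not existing mmiotrace code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical record of one traced ioremap() region. */
struct divert_map {
	uintptr_t dummy_base;  /* range handed to the driver; always faults */
	uintptr_t real_base;   /* private mapping of the real MMIO area */
	size_t len;            /* length of both ranges in bytes */
};

/*
 * Translate a faulting dummy address into the matching real address.
 * Returns 0 when the address does not belong to this mapping, so the
 * page fault handler can pass the fault on.
 */
static uintptr_t divert_address(const struct divert_map *m, uintptr_t addr)
{
	if (addr < m->dummy_base || addr - m->dummy_base >= m->len)
		return 0;
	return m->real_base + (addr - m->dummy_base);
}
```

The emulated instruction would then perform its access at the returned address, and the tracer would log the dummy address, the access width and the datum.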

IMO a big step from a hack to a tool. Getting rid of the custom
instruction parser in mmiotrace would be a good step in itself.

Avi Kivity noted that the KVM emulator does almost everything. Does
it also allow address diversion?

I haven't looked at the KVM emulator since something like 2.6.25 or
so, and I probably don't have time to work with it anyway, but
I am very interested to hear how things evolve.


Thanks.

--
Pekka Paalanen
http://www.iki.fi/pq/

2009-04-06 07:55:07

by Avi Kivity

Subject: Re: [PATCH -tip 0/6 V4] tracing: kprobe-based event tracer

Pekka Paalanen wrote:
> Not just emulation but address diversion, i.e. modifying the operation
> (not the text) before executing it. Mmiotrace could do something like
> this:
> 1. a blob calls ioremap
> 2. mmiotrace maps the MMIO area privately
> 3. the blob receives a dummy map from ioremap, that will generate
> a page fault
> 4. the blob accesses the dummy map and raises a page fault
> 5. pf handler detects the dummy map
> 6. mmiotrace pf handler emulates the instruction and replaces the
> dummy address with the real MMIO address.
> 7. mmiotrace records the operation and the datum
> 8. go to step 4, or whatever
>
> This means mmiotrace would not have to fiddle with the page
> tables and page presence bits like it does now. As said, this
> would make mmiotrace SMP-proof, and also eliminate the die notifier
> (used for the instruction single stepping trap).
>
> IMO a big step from a hack to a tool. Getting rid of the custom
> instruction parser in mmiotrace would be a good step in itself.
>
> Avi Kivity noted that the KVM emulator does almost everything. Does
> it also allow address diversion?
>

Operand access is by means of a callback, so yes. In KVM's use, the
callback accesses guest memory, so it modifies the addresses before
reading or writing.
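
To illustrate the idea of callback-based operand access, here is a small sketch in which the emulation core never dereferences an operand address itself, so the callback is free to redirect it. All names (`struct mem_ops_sketch`, `struct backing_sketch`, `diverted_read()`, `emulate_load32()`) are hypothetical; the real KVM emulator's callback table looks different.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical callback table passed to the emulation core. */
struct mem_ops_sketch {
	int (*read)(void *ctx, uintptr_t addr, void *val, unsigned bytes);
	void *ctx;
};

/* Hypothetical backing store: a dummy range redirected into a buffer. */
struct backing_sketch {
	uintptr_t dummy_base;  /* address the emulated instruction uses */
	uint8_t *real;         /* where the data actually lives */
	size_t len;
};

/* Read callback that diverts the address before touching memory. */
static int diverted_read(void *ctx, uintptr_t addr, void *val, unsigned bytes)
{
	struct backing_sketch *b = ctx;
	uintptr_t off;

	if (addr < b->dummy_base)
		return -1;
	off = addr - b->dummy_base;
	if (off > b->len || bytes > b->len - off)
		return -1;
	memcpy(val, b->real + off, bytes);
	return 0;
}

/* Emulation core: a 32-bit load goes through the callback only. */
static int emulate_load32(const struct mem_ops_sketch *ops, uintptr_t addr,
			  uint32_t *dst)
{
	return ops->read(ops->ctx, addr, dst, sizeof(*dst));
}
```

With this split, a tracer could supply its own read and write callbacks and record each access there, without changing the emulation core.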

--
error compiling committee.c: too many arguments to function