2022-05-14 00:10:40

by Adrian Hunter

Subject: [PATCH 0/6] perf intel-pt: Add support for tracing KVM test programs

Hi

A common case for KVM test programs is that the guest object code can be
found in the hypervisor process (i.e. the test program running on the
host). Add support for that.

For more details, refer to the 3rd patch "perf tools: Add guest_code
support".

For an example, see the last patch "perf intel-pt: Add guest_code support"

For more information about Perf tools support for Intel® Processor Trace,
refer to:

https://perf.wiki.kernel.org/index.php/Perf_tools_support_for_Intel%C2%AE_Processor_Trace


Adrian Hunter (6):
perf tools: Add machine to machines back pointer
perf tools: Factor out thread__set_guest_comm()
perf tools: Add guest_code support
perf script: Add guest_code support
perf kvm report: Add guest_code support
perf intel-pt: Add guest_code support

tools/perf/Documentation/perf-intel-pt.txt | 67 ++++++++++++++++++++++++
tools/perf/Documentation/perf-kvm.txt | 3 ++
tools/perf/Documentation/perf-script.txt | 4 ++
tools/perf/builtin-kvm.c | 2 +
tools/perf/builtin-script.c | 5 +-
tools/perf/util/event.c | 7 ++-
tools/perf/util/intel-pt.c | 20 ++++++-
tools/perf/util/machine.c | 84 ++++++++++++++++++++++++++++--
tools/perf/util/machine.h | 4 ++
tools/perf/util/session.c | 7 +++
tools/perf/util/symbol_conf.h | 3 +-
11 files changed, 197 insertions(+), 9 deletions(-)


Regards
Adrian


2022-05-14 00:27:38

by Adrian Hunter

Subject: [PATCH 4/6] perf script: Add guest_code support

Add an option to indicate that guest code can be found in the hypervisor
process.

Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/Documentation/perf-script.txt | 4 ++++
tools/perf/builtin-script.c | 5 ++++-
2 files changed, 8 insertions(+), 1 deletion(-)

diff --git a/tools/perf/Documentation/perf-script.txt b/tools/perf/Documentation/perf-script.txt
index 2012a8e6c90b..1a557ff8f210 100644
--- a/tools/perf/Documentation/perf-script.txt
+++ b/tools/perf/Documentation/perf-script.txt
@@ -499,6 +499,10 @@ include::itrace.txt[]
The known limitations include exception handing such as
setjmp/longjmp will have calls/returns not match.

+--guest-code::
+ Indicate that guest code can be found in the hypervisor process,
+ which is a common case for KVM test programs.
+
SEE ALSO
--------
linkperf:perf-record[1], linkperf:perf-script-perl[1],
diff --git a/tools/perf/builtin-script.c b/tools/perf/builtin-script.c
index cf5eab5431b4..96a2106a3dac 100644
--- a/tools/perf/builtin-script.c
+++ b/tools/perf/builtin-script.c
@@ -3884,6 +3884,8 @@ int cmd_script(int argc, const char **argv)
"file", "file saving guest os /proc/kallsyms"),
OPT_STRING(0, "guestmodules", &symbol_conf.default_guest_modules,
"file", "file saving guest os /proc/modules"),
+ OPT_BOOLEAN(0, "guest-code", &symbol_conf.guest_code,
+ "Guest code can be found in hypervisor process"),
OPT_BOOLEAN('\0', "stitch-lbr", &script.stitch_lbr,
"Enable LBR callgraph stitching approach"),
OPTS_EVSWITCH(&script.evswitch),
@@ -3909,7 +3911,8 @@ int cmd_script(int argc, const char **argv)
if (symbol_conf.guestmount ||
symbol_conf.default_guest_vmlinux_name ||
symbol_conf.default_guest_kallsyms ||
- symbol_conf.default_guest_modules) {
+ symbol_conf.default_guest_modules ||
+ symbol_conf.guest_code) {
/*
* Enable guest sample processing.
*/
--
2.25.1


2022-05-14 00:46:23

by Adrian Hunter

Subject: [PATCH 6/6] perf intel-pt: Add guest_code support

A common case for KVM test programs is that the guest object code can be
found in the hypervisor process (i.e. the test program running on the
host). To support that, a new option "--guest-code" has been added in
previous patches.

In this patch, add support also to Intel PT.

In particular, ensure guest_code thread is set up before attempting to
walk object code or synthesize samples.

Example:

# perf record --kcore -e intel_pt/cyc/ -- tools/testing/selftests/kselftest_install/kvm/tsc_msrs_test
[ perf record: Woken up 1 times to write data ]
[ perf record: Captured and wrote 0.280 MB perf.data ]
# perf script --guest-code --itrace=bep --ns -F-period,+addr,+flags
[SNIP]
tsc_msrs_test 18436 [007] 10897.962087733: branches: call ffffffffc13b2ff5 __vmx_vcpu_run+0x15 (vmlinux) => ffffffffc13b2f50 vmx_update_host_rsp+0x0 (vmlinux)
tsc_msrs_test 18436 [007] 10897.962087733: branches: return ffffffffc13b2f5d vmx_update_host_rsp+0xd (vmlinux) => ffffffffc13b2ffa __vmx_vcpu_run+0x1a (vmlinux)
tsc_msrs_test 18436 [007] 10897.962087733: branches: call ffffffffc13b303b __vmx_vcpu_run+0x5b (vmlinux) => ffffffffc13b2f80 vmx_vmenter+0x0 (vmlinux)
tsc_msrs_test 18436 [007] 10897.962087836: branches: vmentry ffffffffc13b2f82 vmx_vmenter+0x2 (vmlinux) => 0 [unknown] ([unknown])
[guest/18436] 18436 [007] 10897.962087836: branches: vmentry 0 [unknown] ([unknown]) => 402c81 guest_code+0x131 (/home/ahunter/git/work/tools/testing/selftests/kselftest_install/kvm/tsc_msrs_test)
[guest/18436] 18436 [007] 10897.962087836: branches: call 402c81 guest_code+0x131 (/home/ahunter/git/work/tools/testing/selftests/kselftest_install/kvm/tsc_msrs_test) => 40dba0 ucall+0x0 (/home/ahunter/git/work/tools/testing/selftests/kselftest_install/kvm/tsc_msrs_test)
[guest/18436] 18436 [007] 10897.962088248: branches: vmexit 40dba0 ucall+0x0 (/home/ahunter/git/work/tools/testing/selftests/kselftest_install/kvm/tsc_msrs_test) => 0 [unknown] ([unknown])
tsc_msrs_test 18436 [007] 10897.962088248: branches: vmexit 0 [unknown] ([unknown]) => ffffffffc13b2fa0 vmx_vmexit+0x0 (vmlinux)
tsc_msrs_test 18436 [007] 10897.962088248: branches: jmp ffffffffc13b2fa0 vmx_vmexit+0x0 (vmlinux) => ffffffffc13b2fd2 vmx_vmexit+0x32 (vmlinux)
tsc_msrs_test 18436 [007] 10897.962088256: branches: return ffffffffc13b2fd2 vmx_vmexit+0x32 (vmlinux) => ffffffffc13b3040 __vmx_vcpu_run+0x60 (vmlinux)
tsc_msrs_test 18436 [007] 10897.962088270: branches: return ffffffffc13b30b6 __vmx_vcpu_run+0xd6 (vmlinux) => ffffffffc13b2f2e vmx_vcpu_enter_exit+0x4e (vmlinux)
[SNIP]
tsc_msrs_test 18436 [007] 10897.962089321: branches: call ffffffffc13b2ff5 __vmx_vcpu_run+0x15 (vmlinux) => ffffffffc13b2f50 vmx_update_host_rsp+0x0 (vmlinux)
tsc_msrs_test 18436 [007] 10897.962089321: branches: return ffffffffc13b2f5d vmx_update_host_rsp+0xd (vmlinux) => ffffffffc13b2ffa __vmx_vcpu_run+0x1a (vmlinux)
tsc_msrs_test 18436 [007] 10897.962089321: branches: call ffffffffc13b303b __vmx_vcpu_run+0x5b (vmlinux) => ffffffffc13b2f80 vmx_vmenter+0x0 (vmlinux)
tsc_msrs_test 18436 [007] 10897.962089424: branches: vmentry ffffffffc13b2f82 vmx_vmenter+0x2 (vmlinux) => 0 [unknown] ([unknown])
[guest/18436] 18436 [007] 10897.962089424: branches: vmentry 0 [unknown] ([unknown]) => 40dba0 ucall+0x0 (/home/ahunter/git/work/tools/testing/selftests/kselftest_install/kvm/tsc_msrs_test)
[guest/18436] 18436 [007] 10897.962089701: branches: jmp 40dc1b ucall+0x7b (/home/ahunter/git/work/tools/testing/selftests/kselftest_install/kvm/tsc_msrs_test) => 40dc39 ucall+0x99 (/home/ahunter/git/work/tools/testing/selftests/kselftest_install/kvm/tsc_msrs_test)
[guest/18436] 18436 [007] 10897.962089701: branches: jcc 40dc3c ucall+0x9c (/home/ahunter/git/work/tools/testing/selftests/kselftest_install/kvm/tsc_msrs_test) => 40dc20 ucall+0x80 (/home/ahunter/git/work/tools/testing/selftests/kselftest_install/kvm/tsc_msrs_test)
[guest/18436] 18436 [007] 10897.962089701: branches: jcc 40dc3c ucall+0x9c (/home/ahunter/git/work/tools/testing/selftests/kselftest_install/kvm/tsc_msrs_test) => 40dc20 ucall+0x80 (/home/ahunter/git/work/tools/testing/selftests/kselftest_install/kvm/tsc_msrs_test)
[guest/18436] 18436 [007] 10897.962089701: branches: jcc 40dc37 ucall+0x97 (/home/ahunter/git/work/tools/testing/selftests/kselftest_install/kvm/tsc_msrs_test) => 40dc50 ucall+0xb0 (/home/ahunter/git/work/tools/testing/selftests/kselftest_install/kvm/tsc_msrs_test)
[guest/18436] 18436 [007] 10897.962089878: branches: vmexit 40dc55 ucall+0xb5 (/home/ahunter/git/work/tools/testing/selftests/kselftest_install/kvm/tsc_msrs_test) => 0 [unknown] ([unknown])
tsc_msrs_test 18436 [007] 10897.962089878: branches: vmexit 0 [unknown] ([unknown]) => ffffffffc13b2fa0 vmx_vmexit+0x0 (vmlinux)
tsc_msrs_test 18436 [007] 10897.962089878: branches: jmp ffffffffc13b2fa0 vmx_vmexit+0x0 (vmlinux) => ffffffffc13b2fd2 vmx_vmexit+0x32 (vmlinux)
tsc_msrs_test 18436 [007] 10897.962089887: branches: return ffffffffc13b2fd2 vmx_vmexit+0x32 (vmlinux) => ffffffffc13b3040 __vmx_vcpu_run+0x60 (vmlinux)
tsc_msrs_test 18436 [007] 10897.962089901: branches: return ffffffffc13b30b6 __vmx_vcpu_run+0xd6 (vmlinux) => ffffffffc13b2f2e vmx_vcpu_enter_exit+0x4e (vmlinux)
[SNIP]

# perf kvm --guest-code --guest --host report -i perf.data --stdio | head -20

# To display the perf.data header info, please use --header/--header-only options.
#
#
# Total Lost Samples: 0
#
# Samples: 12 of event 'instructions'
# Event count (approx.): 2274583
#
# Children Self Command Shared Object Symbol
# ........ ........ ............. .................... ...........................................
#
54.70% 0.00% tsc_msrs_test [kernel.vmlinux] [k] entry_SYSCALL_64_after_hwframe
|
---entry_SYSCALL_64_after_hwframe
do_syscall_64
|
|--29.44%--syscall_exit_to_user_mode
| exit_to_user_mode_prepare
| task_work_run
| __fput

For more information about Perf tools support for Intel® Processor Trace,
refer to:

https://perf.wiki.kernel.org/index.php/Perf_tools_support_for_Intel%C2%AE_Processor_Trace

Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/Documentation/perf-intel-pt.txt | 67 ++++++++++++++++++++++
tools/perf/util/intel-pt.c | 20 ++++++-
2 files changed, 85 insertions(+), 2 deletions(-)

diff --git a/tools/perf/Documentation/perf-intel-pt.txt b/tools/perf/Documentation/perf-intel-pt.txt
index 92532d0d3618..14cc61b254db 100644
--- a/tools/perf/Documentation/perf-intel-pt.txt
+++ b/tools/perf/Documentation/perf-intel-pt.txt
@@ -1398,6 +1398,73 @@ There were none.
:17006 17006 [001] 11500.262869216: ffffffff8220116e error_entry+0xe ([guest.kernel.kallsyms]) pushq %rax


+Tracing Virtual Machines - Guest Code
+-------------------------------------
+
+A common case for KVM test programs is that the guest object code can be
+found in the hypervisor process (i.e. the test program running on the
+host). To support that, option "--guest-code" has been added to perf script
+and perf kvm report.
+
+Here is an example tracing a test program from the kernel's KVM selftests:
+
+ # perf record --kcore -e intel_pt/cyc/ -- tools/testing/selftests/kselftest_install/kvm/tsc_msrs_test
+ [ perf record: Woken up 1 times to write data ]
+ [ perf record: Captured and wrote 0.280 MB perf.data ]
+ # perf script --guest-code --itrace=bep --ns -F-period,+addr,+flags
+ [SNIP]
+ tsc_msrs_test 18436 [007] 10897.962087733: branches: call ffffffffc13b2ff5 __vmx_vcpu_run+0x15 (vmlinux) => ffffffffc13b2f50 vmx_update_host_rsp+0x0 (vmlinux)
+ tsc_msrs_test 18436 [007] 10897.962087733: branches: return ffffffffc13b2f5d vmx_update_host_rsp+0xd (vmlinux) => ffffffffc13b2ffa __vmx_vcpu_run+0x1a (vmlinux)
+ tsc_msrs_test 18436 [007] 10897.962087733: branches: call ffffffffc13b303b __vmx_vcpu_run+0x5b (vmlinux) => ffffffffc13b2f80 vmx_vmenter+0x0 (vmlinux)
+ tsc_msrs_test 18436 [007] 10897.962087836: branches: vmentry ffffffffc13b2f82 vmx_vmenter+0x2 (vmlinux) => 0 [unknown] ([unknown])
+ [guest/18436] 18436 [007] 10897.962087836: branches: vmentry 0 [unknown] ([unknown]) => 402c81 guest_code+0x131 (/home/user/git/work/tools/testing/selftests/kselftest_install/kvm/tsc_msrs_test)
+ [guest/18436] 18436 [007] 10897.962087836: branches: call 402c81 guest_code+0x131 (/home/user/git/work/tools/testing/selftests/kselftest_install/kvm/tsc_msrs_test) => 40dba0 ucall+0x0 (/home/user/git/work/tools/testing/selftests/kselftest_install/kvm/tsc_msrs_test)
+ [guest/18436] 18436 [007] 10897.962088248: branches: vmexit 40dba0 ucall+0x0 (/home/user/git/work/tools/testing/selftests/kselftest_install/kvm/tsc_msrs_test) => 0 [unknown] ([unknown])
+ tsc_msrs_test 18436 [007] 10897.962088248: branches: vmexit 0 [unknown] ([unknown]) => ffffffffc13b2fa0 vmx_vmexit+0x0 (vmlinux)
+ tsc_msrs_test 18436 [007] 10897.962088248: branches: jmp ffffffffc13b2fa0 vmx_vmexit+0x0 (vmlinux) => ffffffffc13b2fd2 vmx_vmexit+0x32 (vmlinux)
+ tsc_msrs_test 18436 [007] 10897.962088256: branches: return ffffffffc13b2fd2 vmx_vmexit+0x32 (vmlinux) => ffffffffc13b3040 __vmx_vcpu_run+0x60 (vmlinux)
+ tsc_msrs_test 18436 [007] 10897.962088270: branches: return ffffffffc13b30b6 __vmx_vcpu_run+0xd6 (vmlinux) => ffffffffc13b2f2e vmx_vcpu_enter_exit+0x4e (vmlinux)
+ [SNIP]
+ tsc_msrs_test 18436 [007] 10897.962089321: branches: call ffffffffc13b2ff5 __vmx_vcpu_run+0x15 (vmlinux) => ffffffffc13b2f50 vmx_update_host_rsp+0x0 (vmlinux)
+ tsc_msrs_test 18436 [007] 10897.962089321: branches: return ffffffffc13b2f5d vmx_update_host_rsp+0xd (vmlinux) => ffffffffc13b2ffa __vmx_vcpu_run+0x1a (vmlinux)
+ tsc_msrs_test 18436 [007] 10897.962089321: branches: call ffffffffc13b303b __vmx_vcpu_run+0x5b (vmlinux) => ffffffffc13b2f80 vmx_vmenter+0x0 (vmlinux)
+ tsc_msrs_test 18436 [007] 10897.962089424: branches: vmentry ffffffffc13b2f82 vmx_vmenter+0x2 (vmlinux) => 0 [unknown] ([unknown])
+ [guest/18436] 18436 [007] 10897.962089424: branches: vmentry 0 [unknown] ([unknown]) => 40dba0 ucall+0x0 (/home/user/git/work/tools/testing/selftests/kselftest_install/kvm/tsc_msrs_test)
+ [guest/18436] 18436 [007] 10897.962089701: branches: jmp 40dc1b ucall+0x7b (/home/user/git/work/tools/testing/selftests/kselftest_install/kvm/tsc_msrs_test) => 40dc39 ucall+0x99 (/home/user/git/work/tools/testing/selftests/kselftest_install/kvm/tsc_msrs_test)
+ [guest/18436] 18436 [007] 10897.962089701: branches: jcc 40dc3c ucall+0x9c (/home/user/git/work/tools/testing/selftests/kselftest_install/kvm/tsc_msrs_test) => 40dc20 ucall+0x80 (/home/user/git/work/tools/testing/selftests/kselftest_install/kvm/tsc_msrs_test)
+ [guest/18436] 18436 [007] 10897.962089701: branches: jcc 40dc3c ucall+0x9c (/home/user/git/work/tools/testing/selftests/kselftest_install/kvm/tsc_msrs_test) => 40dc20 ucall+0x80 (/home/user/git/work/tools/testing/selftests/kselftest_install/kvm/tsc_msrs_test)
+ [guest/18436] 18436 [007] 10897.962089701: branches: jcc 40dc37 ucall+0x97 (/home/user/git/work/tools/testing/selftests/kselftest_install/kvm/tsc_msrs_test) => 40dc50 ucall+0xb0 (/home/user/git/work/tools/testing/selftests/kselftest_install/kvm/tsc_msrs_test)
+ [guest/18436] 18436 [007] 10897.962089878: branches: vmexit 40dc55 ucall+0xb5 (/home/user/git/work/tools/testing/selftests/kselftest_install/kvm/tsc_msrs_test) => 0 [unknown] ([unknown])
+ tsc_msrs_test 18436 [007] 10897.962089878: branches: vmexit 0 [unknown] ([unknown]) => ffffffffc13b2fa0 vmx_vmexit+0x0 (vmlinux)
+ tsc_msrs_test 18436 [007] 10897.962089878: branches: jmp ffffffffc13b2fa0 vmx_vmexit+0x0 (vmlinux) => ffffffffc13b2fd2 vmx_vmexit+0x32 (vmlinux)
+ tsc_msrs_test 18436 [007] 10897.962089887: branches: return ffffffffc13b2fd2 vmx_vmexit+0x32 (vmlinux) => ffffffffc13b3040 __vmx_vcpu_run+0x60 (vmlinux)
+ tsc_msrs_test 18436 [007] 10897.962089901: branches: return ffffffffc13b30b6 __vmx_vcpu_run+0xd6 (vmlinux) => ffffffffc13b2f2e vmx_vcpu_enter_exit+0x4e (vmlinux)
+ [SNIP]
+
+ # perf kvm --guest-code --guest --host report -i perf.data --stdio | head -20
+
+ # To display the perf.data header info, please use --header/--header-only options.
+ #
+ #
+ # Total Lost Samples: 0
+ #
+ # Samples: 12 of event 'instructions'
+ # Event count (approx.): 2274583
+ #
+ # Children Self Command Shared Object Symbol
+ # ........ ........ ............. .................... ...........................................
+ #
+ 54.70% 0.00% tsc_msrs_test [kernel.vmlinux] [k] entry_SYSCALL_64_after_hwframe
+ |
+ ---entry_SYSCALL_64_after_hwframe
+ do_syscall_64
+ |
+ |--29.44%--syscall_exit_to_user_mode
+ | exit_to_user_mode_prepare
+ | task_work_run
+ | __fput
+
+
Event Trace
-----------

diff --git a/tools/perf/util/intel-pt.c b/tools/perf/util/intel-pt.c
index ec43d364d0de..66f23006cfff 100644
--- a/tools/perf/util/intel-pt.c
+++ b/tools/perf/util/intel-pt.c
@@ -192,6 +192,7 @@ struct intel_pt_queue {
pid_t next_tid;
struct thread *thread;
struct machine *guest_machine;
+ struct thread *guest_thread;
struct thread *unknown_guest_thread;
pid_t guest_machine_pid;
bool exclude_kernel;
@@ -688,6 +689,11 @@ static int intel_pt_get_guest(struct intel_pt_queue *ptq)
ptq->guest_machine = NULL;
thread__zput(ptq->unknown_guest_thread);

+ if (symbol_conf.guest_code) {
+ thread__zput(ptq->guest_thread);
+ ptq->guest_thread = machines__findnew_guest_code(machines, pid);
+ }
+
machine = machines__find_guest(machines, pid);
if (!machine)
return -1;
@@ -729,11 +735,16 @@ static int intel_pt_walk_next_insn(struct intel_pt_insn *intel_pt_insn,
cpumode = intel_pt_nr_cpumode(ptq, *ip, nr);

if (nr) {
- if (cpumode != PERF_RECORD_MISC_GUEST_KERNEL ||
+ if ((!symbol_conf.guest_code && cpumode != PERF_RECORD_MISC_GUEST_KERNEL) ||
intel_pt_get_guest(ptq))
return -EINVAL;
machine = ptq->guest_machine;
- thread = ptq->unknown_guest_thread;
+ thread = ptq->guest_thread;
+ if (!thread) {
+ if (cpumode != PERF_RECORD_MISC_GUEST_KERNEL)
+ return -EINVAL;
+ thread = ptq->unknown_guest_thread;
+ }
} else {
thread = ptq->thread;
if (!thread) {
@@ -1300,6 +1311,7 @@ static void intel_pt_free_queue(void *priv)
if (!ptq)
return;
thread__zput(ptq->thread);
+ thread__zput(ptq->guest_thread);
thread__zput(ptq->unknown_guest_thread);
intel_pt_decoder_free(ptq->decoder);
zfree(&ptq->event_buf);
@@ -2372,6 +2384,10 @@ static int intel_pt_sample(struct intel_pt_queue *ptq)
ptq->sample_ipc = ptq->state->flags & INTEL_PT_SAMPLE_IPC;
}

+ /* Ensure guest code maps are set up */
+ if (symbol_conf.guest_code && (state->from_nr || state->to_nr))
+ intel_pt_get_guest(ptq);
+
/*
* Do PEBS first to allow for the possibility that the PEBS timestamp
* precedes the current timestamp.
--
2.25.1
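
The thread selection that this patch adds to intel_pt_walk_next_insn() can be
modelled as a small pure function. This is only an illustrative sketch, not
code from the patch: the enum, the name choose_guest_thread() and its boolean
parameters are hypothetical stand-ins for symbol_conf.guest_code, the
PERF_RECORD_MISC_GUEST_KERNEL cpumode check, the result of
intel_pt_get_guest() and whether ptq->guest_thread is set.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical outcome of the guest branch in intel_pt_walk_next_insn(). */
enum guest_thread_choice {
	GUEST_THREAD_REJECT,		/* the patch returns -EINVAL */
	GUEST_THREAD_GUEST_CODE,	/* use ptq->guest_thread */
	GUEST_THREAD_UNKNOWN,		/* use ptq->unknown_guest_thread */
};

/*
 * Model of the decision only. In the real code intel_pt_get_guest() is not
 * called at all when the first condition already fails; here its result is
 * simply passed in as get_guest_failed.
 */
enum guest_thread_choice choose_guest_thread(bool guest_code_opt,
					     bool guest_kernel_mode,
					     bool get_guest_failed,
					     bool have_guest_thread)
{
	/* Without --guest-code only guest kernel mode is decodable. */
	if ((!guest_code_opt && !guest_kernel_mode) || get_guest_failed)
		return GUEST_THREAD_REJECT;

	/* Prefer the guest-code thread, whose maps were copied from the host. */
	if (have_guest_thread)
		return GUEST_THREAD_GUEST_CODE;

	/* Fall back to the unknown guest thread, but only for guest kernel. */
	if (!guest_kernel_mode)
		return GUEST_THREAD_REJECT;

	return GUEST_THREAD_UNKNOWN;
}

/* Example: with --guest-code and a guest-code thread, user-mode guest code is accepted. */
void choose_guest_thread_example(void)
{
	assert(choose_guest_thread(true, false, false, true) ==
	       GUEST_THREAD_GUEST_CODE);
}
```

In this model, behaviour without --guest-code is unchanged: guest kernel mode
still resolves to the unknown guest thread, and anything else is rejected.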


2022-05-14 01:00:06

by Andi Kleen

Subject: Re: [PATCH 6/6] perf intel-pt: Add guest_code support

Adrian Hunter <[email protected]> writes:
>>
>> I'm still not fully sure how it exactly finds the code on the host,
>> how is the code transferred?
>
> I don't know. From a quick look at the code in
> tools/testing/selftests/kvm/lib/kvm_util.c it seems to be using
> KVM_SET_USER_MEMORY_REGION IOCTL.

Okay so it assumes that the pages with code on the guest are still intact: that is,
you cannot quit the traced program, or at least not do something that would
fill it with other data. Is that correct?

It sounds like with that restriction it's more useful for kernel traces.

-Andi

2022-05-14 01:16:40

by Adrian Hunter

Subject: [PATCH 1/6] perf tools: Add machine to machines back pointer

When dealing with guest machines, it can be necessary to get a reference
to the host machine. Add a machines pointer to struct machine to make that
possible.

Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/util/machine.c | 2 ++
tools/perf/util/machine.h | 2 ++
2 files changed, 4 insertions(+)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index 95391236f5f6..e96f6ea4fd82 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -299,6 +299,8 @@ struct machine *machines__add(struct machines *machines, pid_t pid,
rb_link_node(&machine->rb_node, parent, p);
rb_insert_color_cached(&machine->rb_node, &machines->guests, leftmost);

+ machine->machines = machines;
+
return machine;
}

diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index 0023165422aa..0d113771e8c8 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -18,6 +18,7 @@ struct symbol;
struct target;
struct thread;
union perf_event;
+struct machines;

/* Native host kernel uses -1 as pid index in machine */
#define HOST_KERNEL_ID (-1)
@@ -59,6 +60,7 @@ struct machine {
void *priv;
u64 db_id;
};
+ struct machines *machines;
bool trampolines_mapped;
};

--
2.25.1


2022-05-14 01:28:28

by Adrian Hunter

Subject: Re: [PATCH 6/6] perf intel-pt: Add guest_code support

On 13/05/22 17:46, Andi Kleen wrote:
> Adrian Hunter <[email protected]> writes:
>
>> A common case for KVM test programs is that the guest object code can be
>> found in the hypervisor process (i.e. the test program running on the
>> host). To support that, a new option "--guest-code" has been added in
>> previous patches.
>>
>> In this patch, add support also to Intel PT.
>>
>> In particular, ensure guest_code thread is set up before attempting to
>> walk object code or synthesize samples.
>
> Can you make it clear in the documentation which parts run on the host
> and which parts on the guest?

That is up to the test program. All the host thread maps are
copied, so perf tools doesn't have to know.

>
> I'm still not fully sure how it exactly finds the code on the host,
> how is the code transferred?

I don't know. From a quick look at the code in
tools/testing/selftests/kvm/lib/kvm_util.c it seems to be using
KVM_SET_USER_MEMORY_REGION IOCTL.

>
> Other than that more support for this use case is very useful.
>
> -Andi


2022-05-14 02:09:07

by Andi Kleen

Subject: Re: [PATCH 6/6] perf intel-pt: Add guest_code support

Adrian Hunter <[email protected]> writes:

> A common case for KVM test programs is that the guest object code can be
> found in the hypervisor process (i.e. the test program running on the
> host). To support that, a new option "--guest-code" has been added in
> previous patches.
>
> In this patch, add support also to Intel PT.
>
> In particular, ensure guest_code thread is set up before attempting to
> walk object code or synthesize samples.

Can you make it clear in the documentation which parts run on the host
and which parts on the guest?

I'm still not fully sure how it exactly finds the code on the host,
how is the code transferred?

Other than that more support for this use case is very useful.

-Andi

2022-05-14 03:10:14

by Adrian Hunter

Subject: [PATCH 5/6] perf kvm report: Add guest_code support

Add an option to indicate that guest code can be found in the hypervisor
process.

Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/Documentation/perf-kvm.txt | 3 +++
tools/perf/builtin-kvm.c | 2 ++
2 files changed, 5 insertions(+)

diff --git a/tools/perf/Documentation/perf-kvm.txt b/tools/perf/Documentation/perf-kvm.txt
index cf95baef7b61..83c742adf86e 100644
--- a/tools/perf/Documentation/perf-kvm.txt
+++ b/tools/perf/Documentation/perf-kvm.txt
@@ -94,6 +94,9 @@ OPTIONS
kernel module information. Users copy it out from guest os.
--guestvmlinux=<path>::
Guest os kernel vmlinux.
+--guest-code::
+ Indicate that guest code can be found in the hypervisor process,
+ which is a common case for KVM test programs.
-v::
--verbose::
Be more verbose (show counter open errors, etc).
diff --git a/tools/perf/builtin-kvm.c b/tools/perf/builtin-kvm.c
index 2fa687f73e5e..3696ae97f149 100644
--- a/tools/perf/builtin-kvm.c
+++ b/tools/perf/builtin-kvm.c
@@ -1603,6 +1603,8 @@ int cmd_kvm(int argc, const char **argv)
"file", "file saving guest os /proc/kallsyms"),
OPT_STRING(0, "guestmodules", &symbol_conf.default_guest_modules,
"file", "file saving guest os /proc/modules"),
+ OPT_BOOLEAN(0, "guest-code", &symbol_conf.guest_code,
+ "Guest code can be found in hypervisor process"),
OPT_INCR('v', "verbose", &verbose,
"be more verbose (show counter open errors, etc)"),
OPT_END()
--
2.25.1


2022-05-14 03:16:09

by Adrian Hunter

Subject: [PATCH 2/6] perf tools: Factor out thread__set_guest_comm()

Factor out thread__set_guest_comm() so it can be reused.

Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/util/machine.c | 12 +++++++++---
1 file changed, 9 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index e96f6ea4fd82..e67b5a7670f3 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -84,6 +84,14 @@ static int machine__set_mmap_name(struct machine *machine)
return machine->mmap_name ? 0 : -ENOMEM;
}

+static void thread__set_guest_comm(struct thread *thread, pid_t pid)
+{
+ char comm[64];
+
+ snprintf(comm, sizeof(comm), "[guest/%d]", pid);
+ thread__set_comm(thread, comm, 0);
+}
+
int machine__init(struct machine *machine, const char *root_dir, pid_t pid)
{
int err = -ENOMEM;
@@ -119,13 +127,11 @@ int machine__init(struct machine *machine, const char *root_dir, pid_t pid)
if (pid != HOST_KERNEL_ID) {
struct thread *thread = machine__findnew_thread(machine, -1,
pid);
- char comm[64];

if (thread == NULL)
goto out;

- snprintf(comm, sizeof(comm), "[guest/%d]", pid);
- thread__set_comm(thread, comm, 0);
+ thread__set_guest_comm(thread, pid);
thread__put(thread);
}

--
2.25.1
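
The "[guest/<pid>]" name built by thread__set_guest_comm() is what appears in
the command column of the perf script output shown in patch 6/6 (for example
"[guest/18436]"). A minimal standalone sketch of just the formatting step
follows. The helper names format_guest_comm() and guest_comm_matches_trace()
are hypothetical, and attaching the name to a perf thread via
thread__set_comm() is left out.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/*
 * Hypothetical helper: write "[guest/<pid>]" into comm. It mirrors only the
 * snprintf() step of the patch, with an explicit cast for the %d format.
 */
void format_guest_comm(char *comm, size_t size, pid_t pid)
{
	assert(comm && size > 0);
	snprintf(comm, size, "[guest/%d]", (int)pid);
}

/* Returns 1 if the name for pid 18436 matches the one in the example trace. */
int guest_comm_matches_trace(void)
{
	char comm[64];	/* same buffer size as in the patch */

	format_guest_comm(comm, sizeof(comm), 18436);
	return strcmp(comm, "[guest/18436]") == 0;
}
```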


2022-05-14 04:04:00

by Adrian Hunter

Subject: [PATCH 3/6] perf tools: Add guest_code support

A common case for KVM test programs is that the guest object code can be
found in the hypervisor process (i.e. the test program running on the
host). To support that, copy the host thread's maps to the guest thread's
maps. Note, we do not discover the guest until we encounter a guest event,
which works well because it is not until then that we know that the host
thread's maps have been set up.

Typically the main function for the guest object code is called
"guest_code", hence the name chosen for this feature.

This is primarily aimed at supporting Intel PT, or similar, where trace
data can be recorded for a guest. Refer to the final patch in this series
"perf intel-pt: Add guest_code support" for an example.

Signed-off-by: Adrian Hunter <[email protected]>
---
tools/perf/util/event.c | 7 +++-
tools/perf/util/machine.c | 70 +++++++++++++++++++++++++++++++++++
tools/perf/util/machine.h | 2 +
tools/perf/util/session.c | 7 ++++
tools/perf/util/symbol_conf.h | 3 +-
5 files changed, 86 insertions(+), 3 deletions(-)

diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
index 6439c888ae38..0476bb3a4188 100644
--- a/tools/perf/util/event.c
+++ b/tools/perf/util/event.c
@@ -683,9 +683,12 @@ static bool check_address_range(struct intlist *addr_list, int addr_range,
int machine__resolve(struct machine *machine, struct addr_location *al,
struct perf_sample *sample)
{
- struct thread *thread = machine__findnew_thread(machine, sample->pid,
- sample->tid);
+ struct thread *thread;

+ if (symbol_conf.guest_code && !machine__is_host(machine))
+ thread = machine__findnew_guest_code(machine, sample->pid);
+ else
+ thread = machine__findnew_thread(machine, sample->pid, sample->tid);
if (thread == NULL)
return -1;

diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
index e67b5a7670f3..ae2e1fb422e2 100644
--- a/tools/perf/util/machine.c
+++ b/tools/perf/util/machine.c
@@ -392,6 +392,76 @@ struct machine *machines__find_guest(struct machines *machines, pid_t pid)
return machine;
}

+/*
+ * A common case for KVM test programs is that the guest object code can be
+ * found in the hypervisor process (i.e. the test program running on the host).
+ * To support that, copy the host thread's maps to the guest thread's maps.
+ * Note, we do not discover the guest until we encounter a guest event,
+ * which works well because it is not until then that we know that the host
+ * thread's maps have been set up.
+ */
+static struct thread *findnew_guest_code(struct machine *machine,
+ struct machine *host_machine,
+ pid_t pid)
+{
+ struct thread *host_thread;
+ struct thread *thread;
+ int err;
+
+ if (!machine)
+ return NULL;
+
+ thread = machine__findnew_thread(machine, -1, pid);
+ if (!thread)
+ return NULL;
+
+ /* Assume maps are set up if there are any */
+ if (thread->maps->nr_maps)
+ return thread;
+
+ host_thread = machine__find_thread(host_machine, -1, pid);
+ if (!host_thread)
+ goto out_err;
+
+ thread__set_guest_comm(thread, pid);
+
+ /*
+ * Guest code can be found in hypervisor process at the same address
+ * so copy host maps.
+ */
+ err = maps__clone(thread, host_thread->maps);
+ thread__put(host_thread);
+ if (err)
+ goto out_err;
+
+ return thread;
+
+out_err:
+ thread__zput(thread);
+ return NULL;
+}
+
+struct thread *machines__findnew_guest_code(struct machines *machines, pid_t pid)
+{
+ struct machine *host_machine = machines__find(machines, HOST_KERNEL_ID);
+ struct machine *machine = machines__findnew(machines, pid);
+
+ return findnew_guest_code(machine, host_machine, pid);
+}
+
+struct thread *machine__findnew_guest_code(struct machine *machine, pid_t pid)
+{
+ struct machines *machines = machine->machines;
+ struct machine *host_machine;
+
+ if (!machines)
+ return NULL;
+
+ host_machine = machines__find(machines, HOST_KERNEL_ID);
+
+ return findnew_guest_code(machine, host_machine, pid);
+}
+
void machines__process_guests(struct machines *machines,
machine__process_t process, void *data)
{
diff --git a/tools/perf/util/machine.h b/tools/perf/util/machine.h
index 0d113771e8c8..01a5fca643b7 100644
--- a/tools/perf/util/machine.h
+++ b/tools/perf/util/machine.h
@@ -168,6 +168,8 @@ struct machine *machines__find_host(struct machines *machines);
struct machine *machines__find(struct machines *machines, pid_t pid);
struct machine *machines__findnew(struct machines *machines, pid_t pid);
struct machine *machines__find_guest(struct machines *machines, pid_t pid);
+struct thread *machines__findnew_guest_code(struct machines *machines, pid_t pid);
+struct thread *machine__findnew_guest_code(struct machine *machine, pid_t pid);

void machines__set_id_hdr_size(struct machines *machines, u16 id_hdr_size);
void machines__set_comm_exec(struct machines *machines, bool comm_exec);
diff --git a/tools/perf/util/session.c b/tools/perf/util/session.c
index f9a320694b85..6577e1227bd5 100644
--- a/tools/perf/util/session.c
+++ b/tools/perf/util/session.c
@@ -1410,6 +1410,13 @@ static struct machine *machines__find_for_cpumode(struct machines *machines,
else
pid = sample->pid;

+ /*
+ * Guest code machine is created as needed and does not use
+ * DEFAULT_GUEST_KERNEL_ID.
+ */
+ if (symbol_conf.guest_code)
+ return machines__findnew(machines, pid);
+
return machines__find_guest(machines, pid);
}

diff --git a/tools/perf/util/symbol_conf.h b/tools/perf/util/symbol_conf.h
index a70b3ec09dac..bc3d046fbb63 100644
--- a/tools/perf/util/symbol_conf.h
+++ b/tools/perf/util/symbol_conf.h
@@ -43,7 +43,8 @@ struct symbol_conf {
report_individual_block,
inline_name,
disable_add2line_warn,
- buildid_mmap2;
+ buildid_mmap2,
+ guest_code;
const char *vmlinux_name,
*kallsyms_name,
*source_prefix,
--
2.25.1


2022-05-15 10:32:29

by Adrian Hunter

Subject: Re: [PATCH 6/6] perf intel-pt: Add guest_code support

On 13/05/22 21:13, Andi Kleen wrote:
> Adrian Hunter <[email protected]> writes:
>>>
>>> I'm still not fully sure how it exactly finds the code on the host,
>>> how is the code transferred?
>>
>> I don't know. From a quick look at the code in
>> tools/testing/selftests/kvm/lib/kvm_util.c it seems to be using
>> KVM_SET_USER_MEMORY_REGION IOCTL.
>
> Okay so it assumes that the pages with code on the guest are still intact: that is,
> you cannot quit the traced program, or at least not do something that would
> fill it with other data. Is that correct?
>
> It sounds like with that restriction it's more useful for kernel traces.

These patches are to support tracing of test programs that
*are* the hypervisor, creating, populating, and destroying the VM.
The VM is not running an OS.

Like:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/tools/testing/selftests/kvm/x86_64/tsc_msrs_test.c

guest_code() gets mapped into the VM at the same virtual address as on the host,
so the Intel PT decoder, which wants to walk the object code, can find the
object code in the host program.
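
To illustrate why copying the host thread's maps is enough, here is a toy
model of the idea in plain C. It is a sketch under stated assumptions, not
perf code: struct toy_map, struct toy_maps and the toy_*() functions are
hypothetical simplifications of perf's maps, and the 0x400000-0x500000 range
is an assumed mapping for the test binary. Only the addresses 402c81 and
40dba0 come from the example trace in patch 6/6.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified stand-in for a perf map: one address range. */
struct toy_map {
	uint64_t start;
	uint64_t end;		/* exclusive */
	const char *dso;	/* name of the backing object */
};

#define TOY_MAX_MAPS 8

/* Hypothetical stand-in for a thread's set of maps. */
struct toy_maps {
	size_t nr_maps;
	struct toy_map map[TOY_MAX_MAPS];
};

/* Copy every host map to the guest thread, keeping addresses unchanged. */
int toy_maps_clone(struct toy_maps *guest, const struct toy_maps *host)
{
	assert(guest && host);
	if (host->nr_maps > TOY_MAX_MAPS)
		return -1;
	memcpy(guest->map, host->map, host->nr_maps * sizeof(host->map[0]));
	guest->nr_maps = host->nr_maps;
	return 0;
}

/* Return the object containing addr, or NULL if no map covers it. */
const char *toy_maps_find_dso(const struct toy_maps *maps, uint64_t addr)
{
	size_t i;

	for (i = 0; i < maps->nr_maps; i++) {
		if (addr >= maps->map[i].start && addr < maps->map[i].end)
			return maps->map[i].dso;
	}
	return NULL;
}

/* Build a one-map host, clone it to the guest, and look up a guest IP. */
const char *toy_demo(uint64_t guest_ip)
{
	struct toy_maps host = {
		.nr_maps = 1,
		.map = { { 0x400000, 0x500000, "tsc_msrs_test" } }, /* assumed range */
	};
	struct toy_maps guest = { 0 };

	if (toy_maps_clone(&guest, &host))
		return NULL;
	return toy_maps_find_dso(&guest, guest_ip);
}
```

Because the guest address equals the host address, no translation step is
needed in this model: a lookup on the cloned maps finds the same object that
a host lookup would.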

2022-05-17 09:59:38

by Namhyung Kim

Subject: Re: [PATCH 3/6] perf tools: Add guest_code support

Hi Adrian,

On Fri, May 13, 2022 at 2:03 AM Adrian Hunter <[email protected]> wrote:
>
> A common case for KVM test programs is that the guest object code can be
> found in the hypervisor process (i.e. the test program running on the
> host). To support that, copy the host thread's maps to the guest thread's
> maps. Note, we do not discover the guest until we encounter a guest event,
> which works well because it is not until then that we know that the host
> thread's maps have been set up.
>
> Typically the main function for the guest object code is called
> "guest_code", hence the name chosen for this feature.

Ok, so that's just a convention and there's no hard-coded
support for the "guest_code" function in this code, right?

>
> This is primarily aimed at supporting Intel PT, or similar, where trace
> data can be recorded for a guest. Refer to the final patch in this series
> "perf intel-pt: Add guest_code support" for an example.
>
> Signed-off-by: Adrian Hunter <[email protected]>
> ---
> tools/perf/util/event.c | 7 +++-
> tools/perf/util/machine.c | 70 +++++++++++++++++++++++++++++++++++
> tools/perf/util/machine.h | 2 +
> tools/perf/util/session.c | 7 ++++
> tools/perf/util/symbol_conf.h | 3 +-
> 5 files changed, 86 insertions(+), 3 deletions(-)
>
> diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
> index 6439c888ae38..0476bb3a4188 100644
> --- a/tools/perf/util/event.c
> +++ b/tools/perf/util/event.c
> @@ -683,9 +683,12 @@ static bool check_address_range(struct intlist *addr_list, int addr_range,
> int machine__resolve(struct machine *machine, struct addr_location *al,
> struct perf_sample *sample)
> {
> - struct thread *thread = machine__findnew_thread(machine, sample->pid,
> - sample->tid);
> + struct thread *thread;
>
> + if (symbol_conf.guest_code && !machine__is_host(machine))
> + thread = machine__findnew_guest_code(machine, sample->pid);
> + else
> + thread = machine__findnew_thread(machine, sample->pid, sample->tid);
> if (thread == NULL)
> return -1;
>
> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
> index e67b5a7670f3..ae2e1fb422e2 100644
> --- a/tools/perf/util/machine.c
> +++ b/tools/perf/util/machine.c
> @@ -392,6 +392,76 @@ struct machine *machines__find_guest(struct machines *machines, pid_t pid)
> return machine;
> }
>
> +/*
> + * A common case for KVM test programs is that the guest object code can be
> + * found in the hypervisor process (i.e. the test program running on the host).
> + * To support that, copy the host thread's maps to the guest thread's maps.
> + * Note, we do not discover the guest until we encounter a guest event,
> + * which works well because it is not until then that we know that the host
> + * thread's maps have been set up.
> + */
> +static struct thread *findnew_guest_code(struct machine *machine,

But this function returns a thread and IIUC that's the task which
does the host-to-guest transition. Then why not just call it
findnew__hypervisor()?

Thanks,
Namhyung


> +					 struct machine *host_machine,
> +					 pid_t pid)
> +{
> +	struct thread *host_thread;
> +	struct thread *thread;
> +	int err;
> +
> +	if (!machine)
> +		return NULL;
> +
> +	thread = machine__findnew_thread(machine, -1, pid);
> +	if (!thread)
> +		return NULL;
> +
> +	/* Assume maps are set up if there are any */
> +	if (thread->maps->nr_maps)
> +		return thread;
> +
> +	host_thread = machine__find_thread(host_machine, -1, pid);
> +	if (!host_thread)
> +		goto out_err;
> +
> +	thread__set_guest_comm(thread, pid);
> +
> +	/*
> +	 * Guest code can be found in hypervisor process at the same address
> +	 * so copy host maps.
> +	 */
> +	err = maps__clone(thread, host_thread->maps);
> +	thread__put(host_thread);
> +	if (err)
> +		goto out_err;
> +
> +	return thread;
> +
> +out_err:
> +	thread__zput(thread);
> +	return NULL;
> +}
> +

2022-05-17 18:48:14

by Adrian Hunter

Subject: Re: [PATCH 3/6] perf tools: Add guest_code support

On 17/05/22 06:13, Namhyung Kim wrote:
> Hi Adrian,
>
> On Fri, May 13, 2022 at 2:03 AM Adrian Hunter <[email protected]> wrote:
>>
>> A common case for KVM test programs is that the guest object code can be
>> found in the hypervisor process (i.e. the test program running on the
>> host). To support that, copy the host thread's maps to the guest thread's
>> maps. Note, we do not discover the guest until we encounter a guest event,
>> which works well because it is not until then that we know that the host
>> thread's maps have been set up.
>>
>> Typically the main function for the guest object code is called
>> "guest_code", hence the name chosen for this feature.
>
> Ok, so that's just a convention and there's no hard-coded
> support for the "guest_code" function in this code, right?

That is correct.

>
>>
>> This is primarily aimed at supporting Intel PT, or similar, where trace
>> data can be recorded for a guest. Refer to the final patch in this series
>> "perf intel-pt: Add guest_code support" for an example.
>>
>> Signed-off-by: Adrian Hunter <[email protected]>
>> ---
>> tools/perf/util/event.c | 7 +++-
>> tools/perf/util/machine.c | 70 +++++++++++++++++++++++++++++++++++
>> tools/perf/util/machine.h | 2 +
>> tools/perf/util/session.c | 7 ++++
>> tools/perf/util/symbol_conf.h | 3 +-
>> 5 files changed, 86 insertions(+), 3 deletions(-)
>>
>> diff --git a/tools/perf/util/event.c b/tools/perf/util/event.c
>> index 6439c888ae38..0476bb3a4188 100644
>> --- a/tools/perf/util/event.c
>> +++ b/tools/perf/util/event.c
>> @@ -683,9 +683,12 @@ static bool check_address_range(struct intlist *addr_list, int addr_range,
>> int machine__resolve(struct machine *machine, struct addr_location *al,
>> struct perf_sample *sample)
>> {
>> - struct thread *thread = machine__findnew_thread(machine, sample->pid,
>> - sample->tid);
>> + struct thread *thread;
>>
>> + if (symbol_conf.guest_code && !machine__is_host(machine))
>> + thread = machine__findnew_guest_code(machine, sample->pid);
>> + else
>> + thread = machine__findnew_thread(machine, sample->pid, sample->tid);
>> if (thread == NULL)
>> return -1;
>>
>> diff --git a/tools/perf/util/machine.c b/tools/perf/util/machine.c
>> index e67b5a7670f3..ae2e1fb422e2 100644
>> --- a/tools/perf/util/machine.c
>> +++ b/tools/perf/util/machine.c
>> @@ -392,6 +392,76 @@ struct machine *machines__find_guest(struct machines *machines, pid_t pid)
>> return machine;
>> }
>>
>> +/*
>> + * A common case for KVM test programs is that the guest object code can be
>> + * found in the hypervisor process (i.e. the test program running on the host).
>> + * To support that, copy the host thread's maps to the guest thread's maps.
>> + * Note, we do not discover the guest until we encounter a guest event,
>> + * which works well because it is not until then that we know that the host
>> + * thread's maps have been set up.
>> + */
>> +static struct thread *findnew_guest_code(struct machine *machine,
>
> But this function returns a thread and IIUC that's the task which
> does the host-to-guest transition. Then why not just call it
> findnew__hypervisor()?

The thread returned is in the guest machine. While the code comes
from the hypervisor, it is in the guest VM when it runs.

From the Intel PT point of view, this function allows finding the guest
object code by setting up the guest thread and its maps.

I will try to improve on the explanation in V2.
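
In the meantime, the control flow can be sketched roughly as follows. This is a simplified model under stated assumptions: the sketch_* types and functions are hypothetical, use a fixed-size table instead of perf's maps, and leave out the reference counting and comm handling that the real patch does.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_MAX_MAPS 16

/* Hypothetical, simplified address range. */
struct sketch_map {
	uint64_t start;
	uint64_t end;
};

/* Hypothetical, simplified thread: a fixed-size table of maps. */
struct sketch_thread {
	size_t nr_maps;
	struct sketch_map maps[SKETCH_MAX_MAPS];
};

/* Copy all of src's maps into dst. Returns 0 on success, -1 if they do not fit. */
static int sketch_maps_clone(struct sketch_thread *dst,
			     const struct sketch_thread *src)
{
	if (src->nr_maps > SKETCH_MAX_MAPS)
		return -1;

	memcpy(dst->maps, src->maps, src->nr_maps * sizeof(src->maps[0]));
	dst->nr_maps = src->nr_maps;
	return 0;
}

/*
 * Prepare the guest thread for object code lookup. The thread belongs to
 * the guest machine, but its maps are copied from the host (hypervisor)
 * thread the first time it is needed.
 */
static struct sketch_thread *sketch_setup_guest_thread(struct sketch_thread *guest,
						       const struct sketch_thread *host)
{
	if (!guest)
		return NULL;

	/* Assume maps are set up if there are any. */
	if (guest->nr_maps)
		return guest;

	/* Without a host thread there is nothing to copy from. */
	if (!host)
		return NULL;

	if (sketch_maps_clone(guest, host))
		return NULL;

	return guest;
}
```

The point of the model is that the returned thread is always the guest one. The host thread is only consulted once, as the source of the maps, and later calls return the guest thread without copying again.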

>
> Thanks,
> Namhyung
>
>
>> +					 struct machine *host_machine,
>> +					 pid_t pid)
>> +{
>> +	struct thread *host_thread;
>> +	struct thread *thread;
>> +	int err;
>> +
>> +	if (!machine)
>> +		return NULL;
>> +
>> +	thread = machine__findnew_thread(machine, -1, pid);
>> +	if (!thread)
>> +		return NULL;
>> +
>> +	/* Assume maps are set up if there are any */
>> +	if (thread->maps->nr_maps)
>> +		return thread;
>> +
>> +	host_thread = machine__find_thread(host_machine, -1, pid);
>> +	if (!host_thread)
>> +		goto out_err;
>> +
>> +	thread__set_guest_comm(thread, pid);
>> +
>> +	/*
>> +	 * Guest code can be found in hypervisor process at the same address
>> +	 * so copy host maps.
>> +	 */
>> +	err = maps__clone(thread, host_thread->maps);
>> +	thread__put(host_thread);
>> +	if (err)
>> +		goto out_err;
>> +
>> +	return thread;
>> +
>> +out_err:
>> +	thread__zput(thread);
>> +	return NULL;
>> +}
>> +