2022-09-21 17:51:51

by Xiaoyao Li

Subject: [RFC PATCH v2 0/3] KVM: VMX: Fix VM entry failure on PT_MODE_HOST_GUEST while host is using PT

There is a bug in KVM that reliably triggers a vm-entry failure on platforms
supporting PT_MODE_HOST_GUEST mode when following the steps below:

1. #modprobe -r kvm_intel
2. #modprobe kvm_intel pt_mode=1
3. start a VM with QEMU
4. on host: #perf record -e intel_pt//

The vm-entry fails because it violates the following requirement stated
in Intel SDM 26.2.1.1 VM-Execution Control Fields:

If the logical processor is operating with Intel PT enabled (if
IA32_RTIT_CTL.TraceEn = 1) at the time of VM entry, the "load
IA32_RTIT_CTL" VM-entry control must be 0.

In PT_MODE_HOST_GUEST mode, the "load IA32_RTIT_CTL" VM-entry control is
always set. Thus KVM needs to ensure IA32_RTIT_CTL.TraceEn is 0 before
vm-entry. Currently KVM manually does WRMSR(IA32_RTIT_CTL) to clear the
TraceEn bit. However, this doesn't work every time, since there is a
possibility that IA32_RTIT_CTL.TraceEn is re-enabled by the PT PMI handler
before vm-entry.
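
To make the race concrete, here is a minimal userspace sketch (not kernel
code). The toy_* names and the struct are made up for illustration; only the
TraceEn bit position and the SDM rule quoted above come from the hardware
definition. It models the plain WRMSR approach and lets a PT PMI land between
the MSR write and vm-entry.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_RTIT_CTL_TRACEEN (1ULL << 0) /* IA32_RTIT_CTL.TraceEn is bit 0 */

/* Toy stand-in for one logical CPU; not a kernel structure. */
struct toy_cpu {
	uint64_t rtit_ctl;      /* models IA32_RTIT_CTL */
	bool host_event_active; /* perf still considers the host PT event running */
};

/* Models the PT PMI handler restarting the trace for a still-active event. */
static void toy_pt_pmi(struct toy_cpu *cpu)
{
	if (cpu->host_event_active)
		cpu->rtit_ctl |= TOY_RTIT_CTL_TRACEEN;
}

/* The quoted SDM rule: "load IA32_RTIT_CTL" requires TraceEn == 0. */
static bool toy_vm_entry_ok(const struct toy_cpu *cpu, bool load_rtit_ctl)
{
	return !(load_rtit_ctl && (cpu->rtit_ctl & TOY_RTIT_CTL_TRACEEN));
}

/*
 * Models the current approach: clear TraceEn with a plain MSR write, then
 * enter the guest. perf is not told, so the event stays "active" and a PMI
 * in the window sets TraceEn again.
 */
static bool toy_enter_after_wrmsr(struct toy_cpu *cpu, bool pmi_in_window)
{
	assert(cpu);
	cpu->rtit_ctl &= ~TOY_RTIT_CTL_TRACEEN;
	if (pmi_in_window)
		toy_pt_pmi(cpu);
	return toy_vm_entry_ok(cpu, true);
}
```

With host tracing active, toy_enter_after_wrmsr(&cpu, false) passes the check,
while toy_enter_after_wrmsr(&cpu, true) fails it, which is the failure
described above.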

This series tries to fix the issue by exposing and calling the perf driver
API to stop the host PT event (if any) before vm-entry and resume it after
vm-exit. Stopping the event through the perf API prevents the PT PMI handler
from re-enabling PT.
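
A rough userspace model of that ordering is sketched below. It is not the
code from this series: toy_event_disable_local() and toy_event_enable_local()
are hypothetical stand-ins for perf_event_disable_local() and
perf_event_enable_local(), and the structs are invented for the example. The
point it illustrates is that once the event itself is marked stopped, a PMI
arriving before vm-entry has nothing to restart.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define TOY_RTIT_CTL_TRACEEN (1ULL << 0) /* IA32_RTIT_CTL.TraceEn is bit 0 */

/* Hypothetical model of a host PT perf event. */
struct toy_pt_event {
	bool active;
};

struct toy_cpu {
	uint64_t rtit_ctl;               /* models IA32_RTIT_CTL */
	struct toy_pt_event *host_event; /* NULL when the host is not tracing */
};

/* Stand-in for perf_event_disable_local(): stop the event and the trace. */
static void toy_event_disable_local(struct toy_cpu *cpu)
{
	assert(cpu->host_event);
	cpu->host_event->active = false;
	cpu->rtit_ctl &= ~TOY_RTIT_CTL_TRACEEN;
}

/* Stand-in for perf_event_enable_local(): restart the event and the trace. */
static void toy_event_enable_local(struct toy_cpu *cpu)
{
	assert(cpu->host_event);
	cpu->host_event->active = true;
	cpu->rtit_ctl |= TOY_RTIT_CTL_TRACEEN;
}

/* Models the PT PMI handler: it only restarts an event that is active. */
static void toy_pt_pmi(struct toy_cpu *cpu)
{
	if (cpu->host_event && cpu->host_event->active)
		cpu->rtit_ctl |= TOY_RTIT_CTL_TRACEEN;
}

/* Returns whether the vm-entry check passed; resumes host PT after "exit". */
static bool toy_run_guest(struct toy_cpu *cpu, bool pmi_in_window)
{
	bool stopped = false;
	bool entry_ok;

	if (cpu->host_event && cpu->host_event->active) {
		toy_event_disable_local(cpu);
		stopped = true;
	}
	if (pmi_in_window)
		toy_pt_pmi(cpu); /* harmless: the event is no longer active */
	entry_ok = !(cpu->rtit_ctl & TOY_RTIT_CTL_TRACEEN);
	/* ... guest runs, then vm-exit ... */
	if (stopped)
		toy_event_enable_local(cpu);
	return entry_ok;
}
```

In this model toy_run_guest() passes the entry check with or without a PMI in
the window, and host tracing is active again once it returns.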

In addition, drop KVM's save/restore of the host PT MSRs, because resuming
the PT event after vm-exit doesn't rely on the previous values of those MSRs.

Changes in v2:
- Export perf_event_{en,dis}able_local() and pt_get_curr_event() for KVM to
stop/resume PT event; (Suggested-by Wang, Wei W <[email protected]>)
- Drop the save/restore of host PT MSRs.

v1: https://lore.kernel.org/all/[email protected]/

Xiaoyao Li (3):
  perf/core: Expose perf_event_{en,dis}able_local()
  perf/x86/intel/pt: Introduce and export pt_get_curr_event()
  KVM: VMX: Stop/resume host PT before/after VMX transition when
    PT_MODE_HOST_GUEST

arch/x86/events/intel/pt.c | 8 ++++++++
arch/x86/include/asm/perf_event.h | 2 ++
arch/x86/kvm/vmx/vmx.c | 31 +++++++++++++------------------
arch/x86/kvm/vmx/vmx.h | 2 +-
include/linux/perf_event.h | 1 +
kernel/events/core.c | 7 +++++++
6 files changed, 32 insertions(+), 19 deletions(-)

--
2.27.0


2022-09-21 18:00:06

by Xiaoyao Li

Subject: [RFC PATCH v2 1/3] perf/core: Expose perf_event_{en,dis}able_local()

KVM needs them to disable/enable an Intel PT perf event before
vm-entry/after vm-exit.

Signed-off-by: Xiaoyao Li <[email protected]>
---
include/linux/perf_event.h | 1 +
kernel/events/core.c | 7 +++++++
2 files changed, 8 insertions(+)

diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
index ee8b9ecdc03b..fc5f3952d6a2 100644
--- a/include/linux/perf_event.h
+++ b/include/linux/perf_event.h
@@ -1472,6 +1472,7 @@ extern int perf_swevent_get_recursion_context(void);
extern void perf_swevent_put_recursion_context(int rctx);
extern u64 perf_swevent_set_period(struct perf_event *event);
extern void perf_event_enable(struct perf_event *event);
+extern void perf_event_enable_local(struct perf_event *event);
extern void perf_event_disable(struct perf_event *event);
extern void perf_event_disable_local(struct perf_event *event);
extern void perf_event_disable_inatomic(struct perf_event *event);
diff --git a/kernel/events/core.c b/kernel/events/core.c
index 2621fd24ad26..8324bb99c6bf 100644
--- a/kernel/events/core.c
+++ b/kernel/events/core.c
@@ -2446,6 +2446,7 @@ void perf_event_disable_local(struct perf_event *event)
{
event_function_local(event, __perf_event_disable, NULL);
}
+EXPORT_SYMBOL_GPL(perf_event_disable_local);

/*
* Strictly speaking kernel users cannot create groups and therefore this
@@ -2984,6 +2985,12 @@ static void _perf_event_enable(struct perf_event *event)
event_function_call(event, __perf_event_enable, NULL);
}

+void perf_event_enable_local(struct perf_event *event)
+{
+ event_function_local(event, __perf_event_enable, NULL);
+}
+EXPORT_SYMBOL_GPL(perf_event_enable_local);
+
/*
* See perf_event_disable();
*/
--
2.27.0

2022-09-22 12:39:00

by Wang, Wei W

Subject: RE: [RFC PATCH v2 1/3] perf/core: Expose perf_event_{en,dis}able_local()

On Thursday, September 22, 2022 12:45 AM, Li, Xiaoyao wrote:
> KVM needs them to disable/enable an Intel PT perf event before vm-entry/after
> vm-exit.

I would explain more in the commit message here:

Export perf_event_disable_local and perf_event_enable_local for perf users to disable
and enable perf events on the current local CPU. One usage is in PT virtualization
by KVM:
- before VMEnter to guest, KVM calls perf_event_disable_local to disable the host PT event
running on the current CPU, and this reuses the PT driver to save the related h/w states.
- after VMExit to host, KVM calls perf_event_enable_local to resume the host PT event on
the current CPU by having the PT driver load the previously saved states into h/w.
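
As a small illustration of the save/load point (a userspace toy, not the PT
driver): the struct fields and toy_* functions below are hypothetical and only
model the idea that the driver keeps enough state in the event to reprogram
the hardware, so the caller does not need to snapshot the registers itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Toy subset of PT hardware state; field names are illustrative only. */
struct toy_pt_hw {
	bool trace_en;
	uint64_t output_base;
	uint64_t output_offset;
};

/* Hypothetical driver-side event: what is needed to restart tracing. */
struct toy_pt_event {
	uint64_t buf_base;
	uint64_t buf_offset;
};

/* Models the disable path: stop tracing, record how far the trace got. */
static void toy_pt_stop(struct toy_pt_hw *hw, struct toy_pt_event *ev)
{
	assert(hw->trace_en);
	hw->trace_en = false;
	ev->buf_offset = hw->output_offset;
}

/* Models the enable path: reprogram hardware from the event, then start. */
static void toy_pt_start(struct toy_pt_hw *hw, const struct toy_pt_event *ev)
{
	hw->output_base = ev->buf_base;
	hw->output_offset = ev->buf_offset;
	hw->trace_en = true;
}

/* Models a guest run that leaves its own values in the PT registers. */
static void toy_guest_runs(struct toy_pt_hw *hw)
{
	hw->output_base = 0xdead0000;
	hw->output_offset = 0x40;
	hw->trace_en = false; /* host tracing stays off until it is resumed */
}
```

Calling toy_pt_stop(), then toy_guest_runs(), then toy_pt_start() leaves the
toy hardware with the host's base and offset again, even though nothing
outside the event saved them.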

>
> Signed-off-by: Xiaoyao Li <[email protected]>
> ---
> include/linux/perf_event.h | 1 +
> kernel/events/core.c | 7 +++++++
> 2 files changed, 8 insertions(+)
>
> diff --git a/include/linux/perf_event.h b/include/linux/perf_event.h
> index ee8b9ecdc03b..fc5f3952d6a2 100644
> --- a/include/linux/perf_event.h
> +++ b/include/linux/perf_event.h
> @@ -1472,6 +1472,7 @@ extern int perf_swevent_get_recursion_context(void);
> extern void perf_swevent_put_recursion_context(int rctx);
> extern u64 perf_swevent_set_period(struct perf_event *event);
> extern void perf_event_enable(struct perf_event *event);
> +extern void perf_event_enable_local(struct perf_event *event);
> extern void perf_event_disable(struct perf_event *event);
> extern void perf_event_disable_local(struct perf_event *event);
> extern void perf_event_disable_inatomic(struct perf_event *event);
> diff --git a/kernel/events/core.c b/kernel/events/core.c
> index 2621fd24ad26..8324bb99c6bf 100644
> --- a/kernel/events/core.c
> +++ b/kernel/events/core.c
> @@ -2446,6 +2446,7 @@ void perf_event_disable_local(struct perf_event *event)
> {
> event_function_local(event, __perf_event_disable, NULL);
> }
> +EXPORT_SYMBOL_GPL(perf_event_disable_local);
>
> /*
> * Strictly speaking kernel users cannot create groups and therefore this
> @@ -2984,6 +2985,12 @@ static void _perf_event_enable(struct perf_event *event)
> event_function_call(event, __perf_event_enable, NULL);
> }
>
> +void perf_event_enable_local(struct perf_event *event)
> +{
> + event_function_local(event, __perf_event_enable, NULL);
> +}
> +EXPORT_SYMBOL_GPL(perf_event_enable_local);
> +
> /*
> * See perf_event_disable();
> */
> --
> 2.27.0