2020-07-31 07:46:50

by Like Xu

Subject: [PATCH 0/6] Guest Architectural LBR Enabling

Hi All (especially developers who use perf in guest),

Please help review the successor patches to enable Arch LBR on KVM.
(The prerequisite v13 LBR patchset [2] is still waiting for more
attention from reviewers and the maintainer.)

LBR (Last Branch Records) enables recording of software path history
by logging taken branches and other control flows within architectural
registers. Intel CPUs have had model-specific LBRs for quite some time,
and Arch LBR now evolves them into an architectural feature.

Architectural Last Branch Records (Arch LBR) is already published
in the 319433-040 release of the Intel® Architecture Instruction
Set Extensions and Future Features Programming Reference [0].

The main advantages for the Arch LBR users are [1]:
- Faster context switching due to XSAVES support and faster reset of
LBR MSRs via the new DEPTH MSR
- Faster LBR read for a non-PEBS event due to XSAVES support, which
lowers the overhead of the NMI handler. (For a PEBS event, the LBR
information is recorded in the PEBS records. There is no impact on
the PEBS event.)
- The Linux kernel can support the LBR features without knowing the model
number of the current CPU.

Kernel 5.9 will enable Arch LBR on the host via tip/perf/core,
so this patchset enables it on KVM as well.

Before applying this patchset with 'git am', you may need to merge the
latest tip/perf/core branch and the legacy LBR enabling patches
"[PATCH v13 00/10] Guest Last Branch Recording Enabling" [2],
or just wait for the above patches to be merged upstream.

[0] https://software.intel.com/content/www/us/en/develop/download/
intel-architecture-instruction-set-extensions-and-future-features-programming-reference.html
[1] https://lore.kernel.org/lkml/[email protected]/
[2] https://lore.kernel.org/kvm/[email protected]/

Please check each commit for more details and feel free to comment.

Like Xu (6):
KVM: vmx/pmu: Add VMCS field check before exposing LBR_FMT
perf/x86/lbr: Unify LBR_INFO registers exposure check condition
KVM: vmx/pmu: Add MSR_ARCH_LBR_DEPTH emulation for Arch LBR
KVM: vmx/pmu: Add MSR_ARCH_LBR_CTL emulation for Arch LBR
KVM: vmx/pmu: Add Arch LBR emulation and its VMCS field
KVM: x86: Expose Architectural LBR CPUID and its XSAVES bit

arch/x86/events/intel/lbr.c | 4 +-
arch/x86/include/asm/vmx.h | 4 ++
arch/x86/kvm/cpuid.c | 19 +++++++++
arch/x86/kvm/vmx/capabilities.h | 16 ++++++-
arch/x86/kvm/vmx/pmu_intel.c | 74 +++++++++++++++++++++++++++++++--
arch/x86/kvm/vmx/vmx.c | 16 ++++++-
arch/x86/kvm/vmx/vmx.h | 3 ++
arch/x86/kvm/x86.c | 6 +++
8 files changed, 133 insertions(+), 9 deletions(-)

--
2.21.3


2020-07-31 07:46:59

by Like Xu

Subject: [PATCH 2/6] perf/x86/lbr: Unify LBR_INFO registers exposure check condition

Architectural LBR has IA32_LBR_x_INFO instead of LBR_FORMAT_INFO_x
to hold metadata for the operation, including mispredict, TSX, and
elapsed cycle time information.

Since x86_pmu.lbr_info is assigned in intel_pmu_?_lbr_init_?(), it is
safe to expose LBR_INFO in x86_perf_get_lbr() based on its value,
instead of relying solely on the 'lbr_format == LBR_FORMAT_INFO' check.

Cc: Kan Liang <[email protected]>
Cc: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Like Xu <[email protected]>
---
arch/x86/events/intel/lbr.c | 4 +---
1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/arch/x86/events/intel/lbr.c b/arch/x86/events/intel/lbr.c
index 63f58bdf556c..8d816f580342 100644
--- a/arch/x86/events/intel/lbr.c
+++ b/arch/x86/events/intel/lbr.c
@@ -1832,12 +1832,10 @@ void __init intel_pmu_arch_lbr_init(void)
*/
int x86_perf_get_lbr(struct x86_pmu_lbr *lbr)
{
- int lbr_fmt = x86_pmu.intel_cap.lbr_format;
-
lbr->nr = x86_pmu.lbr_nr;
lbr->from = x86_pmu.lbr_from;
lbr->to = x86_pmu.lbr_to;
- lbr->info = (lbr_fmt == LBR_FORMAT_INFO) ? x86_pmu.lbr_info : 0;
+ lbr->info = x86_pmu.lbr_info;

return 0;
}
--
2.21.3

2020-07-31 07:47:10

by Like Xu

Subject: [PATCH 6/6] KVM: x86: Expose Architectural LBR CPUID and its XSAVES bit

If CPUID.(EAX=07H, ECX=0):EDX[19] is exposed as 1, KVM supports Arch
LBR and CPUID leaf 01CH indicates the details of the Arch LBR capabilities.
As a first step, KVM only exposes the host's current LBR depth to the
guest, which is likely to be the maximum value supported on the host.

If KVM supports XSAVES, CPUID.(EAX=0DH, ECX=1):EDX:ECX[bit 15] is also
exposed as 1, which indicates support for saving and restoring the Arch
LBR configuration state. When available, guest software operating at
CPL=0 can use XSAVES/XRSTORS to manage the Arch LBR supervisor state
component for its own purposes once IA32_XSS[bit 15] is set.
XSAVE support for Arch LBR is enumerated in CPUID.(EAX=0DH, ECX=0FH).

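As a rough guest-side illustration (not part of this series), CPL=0
code could include the Arch LBR state component (bit 15) in an XSAVES
image along the lines below; the buffer size, the local MSR_IA32_XSS /
XFEATURE_MASK_ARCH_LBR defines and the inline-asm form are assumptions
made only for this sketch:

#include <linux/types.h>
#include <linux/compiler.h>	/* __aligned() */
#include <asm/msr.h>		/* rdmsrl()/wrmsrl() */

#define MSR_IA32_XSS			0x00000da0
#define XFEATURE_MASK_ARCH_LBR		(1ULL << 15)

/* XSAVES area (legacy region + header + components); size is a placeholder */
static u8 lbr_xsave_buf[4096] __aligned(64);

static void xsaves_arch_lbr(void)
{
	u64 xss;

	/* Opt in to the Arch LBR supervisor state component */
	rdmsrl(MSR_IA32_XSS, xss);
	wrmsrl(MSR_IA32_XSS, xss | XFEATURE_MASK_ARCH_LBR);

	/* The requested-feature bitmap in EDX:EAX selects only component 15 */
	asm volatile("xsaves64 %0"
		     : "+m" (lbr_xsave_buf)
		     : "a" ((u32)XFEATURE_MASK_ARCH_LBR),
		       "d" ((u32)(XFEATURE_MASK_ARCH_LBR >> 32))
		     : "memory");
}
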
Signed-off-by: Like Xu <[email protected]>
---
arch/x86/kvm/cpuid.c | 19 +++++++++++++++++++
arch/x86/kvm/vmx/vmx.c | 2 ++
arch/x86/kvm/x86.c | 6 ++++++
3 files changed, 27 insertions(+)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 7d92854082a1..578ef0719182 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -713,6 +713,25 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
entry->edx = 0;
}
break;
+ /* Architectural LBR */
+ case 0x1c:
+ {
+ u64 lbr_depth_mask = 0;
+
+ if (!kvm_cpu_cap_has(X86_FEATURE_ARCH_LBR)) {
+ entry->eax = entry->ebx = entry->ecx = entry->edx = 0;
+ break;
+ }
+
+ lbr_depth_mask = 1UL << (fls(entry->eax & 0xff) - 1);
+ /*
+ * KVM only exposes the maximum supported depth,
+ * which is also the value used by the host Arch LBR driver.
+ */
+ entry->eax &= ~0xff;
+ entry->eax |= lbr_depth_mask;
+ break;
+ }
/* Intel PT */
case 0x14:
if (!kvm_cpu_cap_has(X86_FEATURE_INTEL_PT)) {
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 3843aebf4efb..98bad0dbfdf1 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7409,6 +7409,8 @@ static __init void vmx_set_cpu_caps(void)
kvm_cpu_cap_check_and_set(X86_FEATURE_INVPCID);
if (vmx_pt_mode_is_host_guest())
kvm_cpu_cap_check_and_set(X86_FEATURE_INTEL_PT);
+ if (cpu_has_vmx_arch_lbr())
+ kvm_cpu_cap_check_and_set(X86_FEATURE_ARCH_LBR);

if (vmx_umip_emulated())
kvm_cpu_cap_set(X86_FEATURE_UMIP);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8a58d0355a99..4da3f9ee96a6 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -184,6 +184,8 @@ static struct kvm_shared_msrs __percpu *shared_msrs;
| XFEATURE_MASK_BNDCSR | XFEATURE_MASK_AVX512 \
| XFEATURE_MASK_PKRU)

+#define KVM_SUPPORTED_XSS_ARCH_LBR (1ULL << 15)
+
u64 __read_mostly host_efer;
EXPORT_SYMBOL_GPL(host_efer);

@@ -9803,6 +9805,10 @@ int kvm_arch_hardware_setup(void *opaque)

if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
supported_xss = 0;
+ else {
+ if (kvm_cpu_cap_has(X86_FEATURE_ARCH_LBR))
+ supported_xss |= KVM_SUPPORTED_XSS_ARCH_LBR;
+ }

#define __kvm_cpu_cap_has(UNUSED_, f) kvm_cpu_cap_has(f)
cr4_reserved_bits = __cr4_reserved_bits(__kvm_cpu_cap_has, UNUSED_);
--
2.21.3

2020-07-31 07:48:00

by Like Xu

Subject: [PATCH 3/6] KVM: vmx/pmu: Add MSR_ARCH_LBR_DEPTH emulation for Arch LBR

The number of Arch LBR entries available for recording operations
is dictated by the value in MSR_ARCH_LBR_DEPTH.DEPTH. The supported
LBR depth values can be found in CPUID.(EAX=01CH, ECX=0):EAX[7:0];
for each bit n set in this field, the MSR_ARCH_LBR_DEPTH.DEPTH
value 8*(n+1) is supported (e.g. bit 7 set means a depth of 64
is supported).

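A minimal decoding sketch (illustrative only, not code from this
series) showing how this CPUID field maps to the valid DEPTH values,
using the kernel's cpuid_count() and pr_info() helpers:

#include <linux/printk.h>
#include <asm/processor.h>	/* cpuid_count() */

static void arch_lbr_dump_supported_depths(void)
{
	unsigned int eax, ebx, ecx, edx;
	int n;

	/* CPUID.(EAX=01CH, ECX=0):EAX[7:0] is the supported-depth bitmap */
	cpuid_count(0x1c, 0, &eax, &ebx, &ecx, &edx);

	for (n = 0; n < 8; n++) {
		if (eax & (1u << n))
			pr_info("Arch LBR: depth %d supported\n", 8 * (n + 1));
	}
}
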
On a software write to MSR_ARCH_LBR_DEPTH, all LBR entries are reset
to 0. Emulate this reset behavior by introducing lbr_desc->arch_lbr_reset
and syncing it to the host MSR_ARCH_LBR_DEPTH MSR when the guest LBR
event is ACTIVE and the LBR record MSRs are passed through to the guest.

Signed-off-by: Like Xu <[email protected]>
---
arch/x86/kvm/vmx/pmu_intel.c | 44 ++++++++++++++++++++++++++++++++++++
arch/x86/kvm/vmx/vmx.h | 3 +++
2 files changed, 47 insertions(+)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index d61a30d3a6ed..8021fbdbd618 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -231,6 +231,9 @@ static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
ret = pmu->version > 1;
break;
+ case MSR_ARCH_LBR_DEPTH:
+ ret = guest_cpuid_has(vcpu, X86_FEATURE_ARCH_LBR);
+ break;
default:
ret = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0) ||
get_gp_pmc(pmu, msr, MSR_P6_EVNTSEL0) ||
@@ -261,6 +264,7 @@ static inline void intel_pmu_release_guest_lbr_event(struct kvm_vcpu *vcpu)
if (lbr_desc->event) {
perf_event_release_kernel(lbr_desc->event);
lbr_desc->event = NULL;
+ lbr_desc->arch_lbr_reset = false;
vcpu_to_pmu(vcpu)->event_count--;
}
}
@@ -356,11 +360,27 @@ static bool intel_pmu_handle_lbr_msrs_access(struct kvm_vcpu *vcpu,
return true;
}

+/*
+ * Check if the requested depth value is supported
+ * based on bits [7:0] of the guest CPUID.1C.EAX.
+ */
+static bool arch_lbr_depth_is_valid(struct kvm_vcpu *vcpu, u64 depth)
+{
+ struct kvm_cpuid_entry2 *best;
+
+ best = kvm_find_cpuid_entry(vcpu, 0x1c, 0);
+ if (depth && depth <= 64 && !(depth % 8) && best)
+ return (best->eax & 0xff) & (1ULL << (depth / 8 - 1));
+
+ return false;
+}
+
static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
{
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
struct kvm_pmc *pmc;
u32 msr = msr_info->index;
+ struct lbr_desc *lbr_desc = vcpu_to_lbr_desc(vcpu);

switch (msr) {
case MSR_CORE_PERF_FIXED_CTR_CTRL:
@@ -375,6 +395,9 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
msr_info->data = pmu->global_ovf_ctrl;
return 0;
+ case MSR_ARCH_LBR_DEPTH:
+ msr_info->data = lbr_desc->records.nr;
+ return 0;
default:
if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
(pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
@@ -403,6 +426,7 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
struct kvm_pmc *pmc;
u32 msr = msr_info->index;
u64 data = msr_info->data;
+ struct lbr_desc *lbr_desc = vcpu_to_lbr_desc(vcpu);

switch (msr) {
case MSR_CORE_PERF_FIXED_CTR_CTRL:
@@ -435,6 +459,13 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
return 0;
}
break;
+ case MSR_ARCH_LBR_DEPTH:
+ if (!arch_lbr_depth_is_valid(vcpu, data))
+ return 1;
+ lbr_desc->records.nr = data;
+ lbr_desc->arch_lbr_reset = true;
+ __set_bit(INTEL_GUEST_LBR_INUSE, pmu->pmc_in_use);
+ return 0;
default:
if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
(pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
@@ -484,6 +515,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
pmu->counter_bitmask[KVM_PMC_FIXED] = 0;
pmu->version = 0;
pmu->reserved_bits = 0xffffffff00200000ull;
+ lbr_desc->arch_lbr_reset = false;

entry = kvm_find_cpuid_entry(vcpu, 0xa, 0);
if (!entry)
@@ -567,6 +599,7 @@ static void intel_pmu_init(struct kvm_vcpu *vcpu)
lbr_desc->records.nr = 0;
lbr_desc->event = NULL;
lbr_desc->already_passthrough = false;
+ lbr_desc->arch_lbr_reset = false;
}

static void intel_pmu_reset(struct kvm_vcpu *vcpu)
@@ -623,6 +656,14 @@ static void intel_pmu_deliver_pmi(struct kvm_vcpu *vcpu)
intel_pmu_legacy_freezing_lbrs_on_pmi(vcpu);
}

+static void intel_pmu_arch_lbr_reset(struct kvm_vcpu *vcpu)
+{
+ struct lbr_desc *lbr_desc = vcpu_to_lbr_desc(vcpu);
+
+ wrmsrl(MSR_ARCH_LBR_DEPTH, lbr_desc->records.nr);
+ lbr_desc->arch_lbr_reset = false;
+}
+
static void vmx_update_intercept_for_lbr_msrs(struct kvm_vcpu *vcpu, bool set)
{
unsigned long *msr_bitmap = to_vmx(vcpu)->vmcs01.msr_bitmap;
@@ -658,6 +699,9 @@ static inline void vmx_enable_lbr_msrs_passthrough(struct kvm_vcpu *vcpu)
{
struct lbr_desc *lbr_desc = vcpu_to_lbr_desc(vcpu);

+ if (unlikely(lbr_desc->arch_lbr_reset))
+ intel_pmu_arch_lbr_reset(vcpu);
+
if (lbr_desc->already_passthrough)
return;

diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index f95d61942a1c..5c02463993ca 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -115,6 +115,9 @@ struct lbr_desc {

/* A flag to reduce the overhead of LBR pass-through or cancellation. */
bool already_passthrough;
+
+ /* Reset all LBR entries on a guest write to MSR_ARCH_LBR_DEPTH */
+ bool arch_lbr_reset;
};

/*
--
2.21.3

2020-07-31 07:48:13

by Like Xu

Subject: [PATCH 4/6] KVM: vmx/pmu: Add MSR_ARCH_LBR_CTL emulation for Arch LBR

Arch LBRs are enabled by setting MSR_ARCH_LBR_CTL.LBREn to 1. On
processors that support Arch LBR, MSR_IA32_DEBUGCTLMSR[bit 0] has
no meaning. It can be written to 0 or 1, but reads will always return 0.

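For context, a guest-side sketch of enabling the feature (the CTL bit
names below follow the Arch LBR spec and the MSR indices come from the
host Arch LBR series; both are assumptions for this sketch, not
definitions added here):

#include <linux/bits.h>		/* BIT() */
#include <asm/msr.h>		/* wrmsrl() */

#define ARCH_LBR_CTL_LBREN	BIT(0)	/* enable branch recording */
#define ARCH_LBR_CTL_OS		BIT(1)	/* record CPL == 0 branches */
#define ARCH_LBR_CTL_USR	BIT(2)	/* record CPL > 0 branches */

static void guest_enable_arch_lbr(u64 depth)
{
	/* Writing the depth also resets all LBR entries to 0 */
	wrmsrl(MSR_ARCH_LBR_DEPTH, depth);
	wrmsrl(MSR_ARCH_LBR_CTL,
	       ARCH_LBR_CTL_LBREN | ARCH_LBR_CTL_OS | ARCH_LBR_CTL_USR);
}
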
A new guest state field named "Guest IA32_LBR_CTL" is added to enhance
guest LBR usage, and the guest value of MSR_ARCH_LBR_CTL is written to
this field on all VM exits.

Signed-off-by: Like Xu <[email protected]>
---
arch/x86/include/asm/vmx.h | 2 ++
arch/x86/kvm/vmx/pmu_intel.c | 13 +++++++++++++
arch/x86/kvm/vmx/vmx.c | 8 ++++++++
3 files changed, 23 insertions(+)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index cd7de4b401fe..27f53c81a17f 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -243,6 +243,8 @@ enum vmcs_field {
GUEST_BNDCFGS_HIGH = 0x00002813,
GUEST_IA32_RTIT_CTL = 0x00002814,
GUEST_IA32_RTIT_CTL_HIGH = 0x00002815,
+ GUEST_IA32_LBR_CTL = 0x00002816,
+ GUEST_IA32_LBR_CTL_HIGH = 0x00002817,
HOST_IA32_PAT = 0x00002c00,
HOST_IA32_PAT_HIGH = 0x00002c01,
HOST_IA32_EFER = 0x00002c02,
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 8021fbdbd618..289c267732bd 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -19,6 +19,7 @@
#include "pmu.h"

#define MSR_PMC_FULL_WIDTH_BIT (MSR_IA32_PMC0 - MSR_IA32_PERFCTR0)
+#define MSR_ARCH_LBR_CTL_MASK 0x7f000f

static struct kvm_event_hw_type_mapping intel_arch_events[] = {
/* Index must match CPUID 0x0A.EBX bit vector */
@@ -232,6 +233,7 @@ static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
ret = pmu->version > 1;
break;
case MSR_ARCH_LBR_DEPTH:
+ case MSR_ARCH_LBR_CTL:
ret = guest_cpuid_has(vcpu, X86_FEATURE_ARCH_LBR);
break;
default:
@@ -398,6 +400,9 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
case MSR_ARCH_LBR_DEPTH:
msr_info->data = lbr_desc->records.nr;
return 0;
+ case MSR_ARCH_LBR_CTL:
+ msr_info->data = vmcs_read64(GUEST_IA32_LBR_CTL);
+ return 0;
default:
if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
(pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
@@ -466,6 +471,14 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
lbr_desc->arch_lbr_reset = true;
__set_bit(INTEL_GUEST_LBR_INUSE, pmu->pmc_in_use);
return 0;
+ case MSR_ARCH_LBR_CTL:
+ if (data & ~MSR_ARCH_LBR_CTL_MASK)
+ return 1;
+ vmcs_write64(GUEST_IA32_LBR_CTL, data);
+ if (intel_pmu_lbr_is_enabled(vcpu) && !lbr_desc->event &&
+ (data & DEBUGCTLMSR_LBR))
+ intel_pmu_create_guest_lbr_event(vcpu);
+ return 0;
default:
if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
(pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 6d8b1d98ae1d..9f42553b1e11 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2054,6 +2054,14 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
__func__, data);
data &= ~DEBUGCTLMSR_BTF;
}
+
+ /*
+ * For Arch LBR, IA32_DEBUGCTL[bit 0] has no meaning.
+ * It can be written to 0 or 1, but reads will always return 0.
+ */
+ if (guest_cpuid_has(vcpu, X86_FEATURE_ARCH_LBR))
+ data &= ~DEBUGCTLMSR_LBR;
+
vmcs_write64(GUEST_IA32_DEBUGCTL, data);
if (intel_pmu_lbr_is_enabled(vcpu) && !to_vmx(vcpu)->lbr_desc.event &&
(data & DEBUGCTLMSR_LBR))
--
2.21.3

2020-07-31 07:50:12

by Like Xu

Subject: [PATCH 5/6] KVM: vmx/pmu: Add Arch LBR emulation and its VMCS field

When bit 21 is set in vmentry_ctrl, VM entry will write the value from
the "Guest IA32_LBR_CTL" guest state field to IA32_LBR_CTL. When bit 26
is set in vmexit_ctrl, VM exit will clear IA32_LBR_CTL after the value
has been saved to the "Guest IA32_LBR_CTL" guest state field.

To enable guest Arch LBR, KVM should set both the "Load Guest
IA32_LBR_CTL" entry control and the "Clear IA32_LBR_CTL" exit control.
If these two conditions cannot be met, vmx_get_perf_capabilities()
will clear the LBR_FMT bits.

If Arch LBR is exposed by KVM, the guest could set X86_FEATURE_ARCH_LBR
to enable guest LBR, which is equivalent to the legacy LBR_FMT setting.
The Arch LBR feature bypasses the host/guest x86_model check, and the
record MSRs can still be passed through to the guest as usual and work
like the legacy LBR.

Signed-off-by: Like Xu <[email protected]>
---
arch/x86/include/asm/vmx.h | 2 ++
arch/x86/kvm/vmx/capabilities.h | 9 ++++++++-
arch/x86/kvm/vmx/pmu_intel.c | 17 ++++++++++++++---
arch/x86/kvm/vmx/vmx.c | 6 ++++--
4 files changed, 28 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 27f53c81a17f..2e4b89a55c53 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -94,6 +94,7 @@
#define VM_EXIT_CLEAR_BNDCFGS 0x00800000
#define VM_EXIT_PT_CONCEAL_PIP 0x01000000
#define VM_EXIT_CLEAR_IA32_RTIT_CTL 0x02000000
+#define VM_EXIT_CLEAR_IA32_LBR_CTL 0x04000000

#define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR 0x00036dff

@@ -107,6 +108,7 @@
#define VM_ENTRY_LOAD_BNDCFGS 0x00010000
#define VM_ENTRY_PT_CONCEAL_PIP 0x00020000
#define VM_ENTRY_LOAD_IA32_RTIT_CTL 0x00040000
+#define VM_ENTRY_LOAD_IA32_LBR_CTL 0x00200000

#define VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR 0x000011ff

diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index f5f0586f4cd7..d1f6bba243c4 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -378,6 +378,13 @@ static inline bool cpu_has_vmx_lbr(void)
(vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_DEBUG_CONTROLS);
}

+static inline bool cpu_has_vmx_arch_lbr(void)
+{
+ return boot_cpu_has(X86_FEATURE_ARCH_LBR) &&
+ (vmcs_config.vmexit_ctrl & VM_EXIT_CLEAR_IA32_LBR_CTL) &&
+ (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_LBR_CTL);
+}
+
static inline u64 vmx_get_perf_capabilities(void)
{
/*
@@ -389,7 +396,7 @@ static inline u64 vmx_get_perf_capabilities(void)
if (boot_cpu_has(X86_FEATURE_PDCM))
rdmsrl(MSR_IA32_PERF_CAPABILITIES, perf_cap);

- if (cpu_has_vmx_lbr())
+ if (cpu_has_vmx_lbr() || cpu_has_vmx_arch_lbr())
perf_cap |= perf_cap & PMU_CAP_LBR_FMT;

return perf_cap;
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 289c267732bd..daf838dc1689 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -177,12 +177,17 @@ bool intel_pmu_lbr_is_compatible(struct kvm_vcpu *vcpu)
if (pmu->version < 2)
return false;

+ if (kvm_cpu_cap_has(X86_FEATURE_ARCH_LBR) !=
+ guest_cpuid_has(vcpu, X86_FEATURE_ARCH_LBR))
+ return false;
+
/*
* As a first step, a guest could only enable LBR feature if its
* cpu model is the same as the host because the LBR registers
* would be pass-through to the guest and they're model specific.
*/
- if (boot_cpu_data.x86_model != guest_cpuid_model(vcpu))
+ if (!guest_cpuid_has(vcpu, X86_FEATURE_ARCH_LBR) &&
+ boot_cpu_data.x86_model != guest_cpuid_model(vcpu))
return false;

return !x86_perf_get_lbr(lbr);
@@ -210,8 +215,11 @@ static bool intel_pmu_is_valid_lbr_msr(struct kvm_vcpu *vcpu, u32 index)
if (!intel_pmu_lbr_is_enabled(vcpu))
return ret;

- ret = (index == MSR_LBR_SELECT) || (index == MSR_LBR_TOS) ||
- (index >= records->from && index < records->from + records->nr) ||
+ if (!guest_cpuid_has(vcpu, X86_FEATURE_ARCH_LBR))
+ ret = (index == MSR_LBR_SELECT) || (index == MSR_LBR_TOS);
+
+ if (!ret)
+ ret = (index >= records->from && index < records->from + records->nr) ||
(index >= records->to && index < records->to + records->nr);

if (!ret && records->info)
@@ -693,6 +701,9 @@ static void vmx_update_intercept_for_lbr_msrs(struct kvm_vcpu *vcpu, bool set)
lbr->info + i, MSR_TYPE_RW, set);
}

+ if (guest_cpuid_has(vcpu, X86_FEATURE_ARCH_LBR))
+ return;
+
vmx_set_intercept_for_msr(msr_bitmap, MSR_LBR_SELECT, MSR_TYPE_RW, set);
vmx_set_intercept_for_msr(msr_bitmap, MSR_LBR_TOS, MSR_TYPE_RW, set);
}
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 9f42553b1e11..3843aebf4efb 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2550,7 +2550,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
VM_EXIT_LOAD_IA32_EFER |
VM_EXIT_CLEAR_BNDCFGS |
VM_EXIT_PT_CONCEAL_PIP |
- VM_EXIT_CLEAR_IA32_RTIT_CTL;
+ VM_EXIT_CLEAR_IA32_RTIT_CTL |
+ VM_EXIT_CLEAR_IA32_LBR_CTL;
if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_EXIT_CTLS,
&_vmexit_control) < 0)
return -EIO;
@@ -2574,7 +2575,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
VM_ENTRY_LOAD_IA32_EFER |
VM_ENTRY_LOAD_BNDCFGS |
VM_ENTRY_PT_CONCEAL_PIP |
- VM_ENTRY_LOAD_IA32_RTIT_CTL;
+ VM_ENTRY_LOAD_IA32_RTIT_CTL |
+ VM_ENTRY_LOAD_IA32_LBR_CTL;
if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_ENTRY_CTLS,
&_vmentry_control) < 0)
return -EIO;
--
2.21.3

2020-07-31 07:50:17

by Like Xu

Subject: [PATCH 1/6] KVM: vmx/pmu: Add VMCS field check before exposing LBR_FMT

Before the guest LBR_FMT is exposed by KVM, KVM needs the
GUEST_IA32_DEBUGCTL guest state field and VMX support for saving and
loading MSR_IA32_DEBUGCTLMSR on VM exit/entry.

Fixes: f93d622139de ("KVM: vmx/pmu: Expose LBR_FMT in the MSR_IA32_PERF_CAPABILITIES")
Signed-off-by: Like Xu <[email protected]>
---
arch/x86/kvm/vmx/capabilities.h | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index 5829f4e9a7e0..f5f0586f4cd7 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -372,6 +372,12 @@ static inline bool vmx_pt_mode_is_host_guest(void)
return pt_mode == PT_MODE_HOST_GUEST;
}

+static inline bool cpu_has_vmx_lbr(void)
+{
+ return (vmcs_config.vmexit_ctrl & VM_EXIT_SAVE_DEBUG_CONTROLS) &&
+ (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_DEBUG_CONTROLS);
+}
+
static inline u64 vmx_get_perf_capabilities(void)
{
/*
@@ -383,7 +389,8 @@ static inline u64 vmx_get_perf_capabilities(void)
if (boot_cpu_has(X86_FEATURE_PDCM))
rdmsrl(MSR_IA32_PERF_CAPABILITIES, perf_cap);

- perf_cap |= perf_cap & PMU_CAP_LBR_FMT;
+ if (cpu_has_vmx_lbr())
+ perf_cap |= perf_cap & PMU_CAP_LBR_FMT;

return perf_cap;
}
--
2.21.3