2021-02-03 14:09:51

by Like Xu

[permalink] [raw]
Subject: [PATCH v2 0/4] KVM: x86/pmu: Guest Architectural LBR Enabling

Hi geniuses,

Please help review the new version of Arch LBR enabling on
KVM based on the latest kvm/queue tree.

The Architectural Last Branch Records (LBRs) is publiced
in the 319433-040 release of Intel Architecture Instruction
Set Extensions and Future Features Programming Reference[0].

The main advantages for the Arch LBR users are [1]:
- Faster context switching due to XSAVES support and faster reset of
LBR MSRs via the new DEPTH MSR
- Faster LBR read for a non-PEBS event due to XSAVES support, which
lowers the overhead of the NMI handler. (For a PEBS event, the LBR
information is recorded in the PEBS records. There is no impact on
the PEBS event.)
- Linux kernel can support the LBR features without knowing the model
number of the current CPU.

Please check more details in each commit and feel free to comment.

[0] https://software.intel.com/content/www/us/en/develop/download/
intel-architecture-instruction-set-extensions-and-future-features-programming-reference.html
[1] https://lore.kernel.org/lkml/[email protected]/

---
v1->v2 Changelog:
- rebased on the latest kvm/queue tree;
- refine some comments for guest usage;

Previous:
https://lore.kernel.org/kvm/[email protected]/

Like Xu (4):
KVM: vmx/pmu: Add MSR_ARCH_LBR_DEPTH emulation for Arch LBR
KVM: vmx/pmu: Add MSR_ARCH_LBR_CTL emulation for Arch LBR
KVM: vmx/pmu: Add Arch LBR emulation and its VMCS field
KVM: x86: Expose Architectural LBR CPUID and its XSAVES bit

arch/x86/include/asm/vmx.h | 4 ++
arch/x86/kvm/cpuid.c | 23 ++++++++++
arch/x86/kvm/vmx/capabilities.h | 25 +++++++----
arch/x86/kvm/vmx/pmu_intel.c | 74 +++++++++++++++++++++++++++++++--
arch/x86/kvm/vmx/vmx.c | 15 ++++++-
arch/x86/kvm/vmx/vmx.h | 3 ++
arch/x86/kvm/x86.c | 10 ++++-
7 files changed, 140 insertions(+), 14 deletions(-)

--
2.29.2


2021-02-03 14:09:53

by Like Xu

[permalink] [raw]
Subject: [PATCH v2 4/4] KVM: x86: Expose Architectural LBR CPUID and its XSAVES bit

If CPUID.(EAX=07H, ECX=0):EDX[19] is exposed to 1, the KVM supports Arch
LBRs and CPUID leaf 01CH indicates details of the Arch LBRs capabilities.
As the first step, KVM only exposes the current LBR depth on the host for
guest, which is likely to be the maximum supported value on the host.

If KVM supports XSAVES, the CPUID.(EAX=0DH, ECX=1):EDX:ECX[bit 15]
is also exposed to 1, which means the availability of support for Arch
LBR configuration state save and restore. When available, guest software
operating at CPL=0 can use XSAVES/XRSTORS manage supervisor state
component Arch LBR for own purposes once IA32_XSS [bit 15] is set.
XSAVE support for Arch LBRs is enumerated in CPUID.(EAX=0DH, ECX=0FH).

Signed-off-by: Like Xu <[email protected]>
---
arch/x86/kvm/cpuid.c | 23 +++++++++++++++++++++++
arch/x86/kvm/vmx/vmx.c | 2 ++
arch/x86/kvm/x86.c | 10 +++++++++-
3 files changed, 34 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 944f518ca91b..900149eec42d 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -778,6 +778,29 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
entry->edx = 0;
}
break;
+ /* Architectural LBR */
+ case 0x1c:
+ {
+ u64 lbr_depth_mask = 0;
+
+ if (!kvm_cpu_cap_has(X86_FEATURE_ARCH_LBR)) {
+ entry->eax = entry->ebx = entry->ecx = entry->edx = 0;
+ break;
+ }
+
+ /*
+ * KVM only exposes the maximum supported depth,
+ * which is also the fixed value used on the host.
+ *
+ * KVM doesn't allow VMM user sapce to adjust depth
+ * per guest, because the guest LBR emulation depends
+ * on the implementation of the host LBR driver.
+ */
+ lbr_depth_mask = 1UL << fls(entry->eax & 0xff);
+ entry->eax &= ~0xff;
+ entry->eax |= lbr_depth_mask;
+ break;
+ }
/* Intel PT */
case 0x14:
if (!kvm_cpu_cap_has(X86_FEATURE_INTEL_PT)) {
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 9ddf0a14d75c..c22175d9564e 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7498,6 +7498,8 @@ static __init void vmx_set_cpu_caps(void)
kvm_cpu_cap_check_and_set(X86_FEATURE_INVPCID);
if (vmx_pt_mode_is_host_guest())
kvm_cpu_cap_check_and_set(X86_FEATURE_INTEL_PT);
+ if (cpu_has_vmx_arch_lbr())
+ kvm_cpu_cap_check_and_set(X86_FEATURE_ARCH_LBR);

if (vmx_umip_emulated())
kvm_cpu_cap_set(X86_FEATURE_UMIP);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 667d0042d0b7..107f2e72f526 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10385,8 +10385,16 @@ int kvm_arch_hardware_setup(void *opaque)

if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
supported_xss = 0;
- else
+ else {
supported_xss &= host_xss;
+ /*
+ * The host doesn't always set ARCH_LBR bit to hoss_xss since this
+ * Arch_LBR component is used on demand in the Arch LBR driver.
+ * Check e649b3f0188f "Support dynamic supervisor feature for LBR".
+ */
+ if (kvm_cpu_cap_has(X86_FEATURE_ARCH_LBR))
+ supported_xss |= XFEATURE_MASK_LBR;
+ }

/* Update CET features now that supported_xss is finalized. */
if (!kvm_cet_supported()) {
--
2.29.2

2021-02-03 14:10:15

by Like Xu

[permalink] [raw]
Subject: [PATCH v2 1/4] KVM: vmx/pmu: Add MSR_ARCH_LBR_DEPTH emulation for Arch LBR

The number of Arch LBR entries available for recording operations
is dictated by the value in MSR_ARCH_LBR_DEPTH.DEPTH. The supported
LBR depth values can be found in CPUID.(EAX=01CH, ECX=0):EAX[7:0]
and for each bit n set in this field, the MSR_ARCH_LBR_DEPTH.DEPTH
value 8*(n+1) is supported.

On a software write to MSR_ARCH_LBR_DEPTH, all LBR entries are reset
to 0. Emulate the reset behavior by introducing lbr_desc->arch_lbr_reset
and sync it to the host MSR_ARCH_LBR_DEPTH msr when the guest LBR
event is ACTIVE and the LBR records msrs are pass-through to the guest.

Signed-off-by: Like Xu <[email protected]>
---
arch/x86/kvm/vmx/pmu_intel.c | 43 ++++++++++++++++++++++++++++++++++++
arch/x86/kvm/vmx/vmx.h | 3 +++
2 files changed, 46 insertions(+)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index d1df618cb7de..b550c4a6ce33 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -220,6 +220,9 @@ static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
ret = pmu->version > 1;
break;
+ case MSR_ARCH_LBR_DEPTH:
+ ret = guest_cpuid_has(vcpu, X86_FEATURE_ARCH_LBR);
+ break;
default:
ret = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0) ||
get_gp_pmc(pmu, msr, MSR_P6_EVNTSEL0) ||
@@ -250,6 +253,7 @@ static inline void intel_pmu_release_guest_lbr_event(struct kvm_vcpu *vcpu)
if (lbr_desc->event) {
perf_event_release_kernel(lbr_desc->event);
lbr_desc->event = NULL;
+ lbr_desc->arch_lbr_reset = false;
vcpu_to_pmu(vcpu)->event_count--;
}
}
@@ -348,10 +352,26 @@ static bool intel_pmu_handle_lbr_msrs_access(struct kvm_vcpu *vcpu,
return true;
}

+/*
+ * Check if the requested depth values is supported
+ * based on the bits [0:7] of the guest cpuid.1c.eax.
+ */
+static bool arch_lbr_depth_is_valid(struct kvm_vcpu *vcpu, u64 depth)
+{
+ struct kvm_cpuid_entry2 *best;
+
+ best = kvm_find_cpuid_entry(vcpu, 0x1c, 0);
+ if (depth && best)
+ return (best->eax & 0xff) & (1ULL << (depth / 8 - 1));
+
+ return false;
+}
+
static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
{
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
struct kvm_pmc *pmc;
+ struct lbr_desc *lbr_desc = vcpu_to_lbr_desc(vcpu);
u32 msr = msr_info->index;

switch (msr) {
@@ -367,6 +387,9 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
msr_info->data = pmu->global_ovf_ctrl;
return 0;
+ case MSR_ARCH_LBR_DEPTH:
+ msr_info->data = lbr_desc->records.nr;
+ return 0;
default:
if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
(pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
@@ -393,6 +416,7 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
{
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
struct kvm_pmc *pmc;
+ struct lbr_desc *lbr_desc = vcpu_to_lbr_desc(vcpu);
u32 msr = msr_info->index;
u64 data = msr_info->data;

@@ -427,6 +451,13 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
return 0;
}
break;
+ case MSR_ARCH_LBR_DEPTH:
+ if (!arch_lbr_depth_is_valid(vcpu, data))
+ return 1;
+ lbr_desc->records.nr = data;
+ lbr_desc->arch_lbr_reset = true;
+ __set_bit(INTEL_PMC_IDX_FIXED_VLBR, pmu->pmc_in_use);
+ return 0;
default:
if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
(pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
@@ -566,6 +597,7 @@ static void intel_pmu_init(struct kvm_vcpu *vcpu)
lbr_desc->records.nr = 0;
lbr_desc->event = NULL;
lbr_desc->msr_passthrough = false;
+ lbr_desc->arch_lbr_reset = false;
}

static void intel_pmu_reset(struct kvm_vcpu *vcpu)
@@ -623,6 +655,14 @@ static void intel_pmu_deliver_pmi(struct kvm_vcpu *vcpu)
intel_pmu_legacy_freezing_lbrs_on_pmi(vcpu);
}

+static void intel_pmu_arch_lbr_reset(struct kvm_vcpu *vcpu)
+{
+ struct lbr_desc *lbr_desc = vcpu_to_lbr_desc(vcpu);
+
+ wrmsrl(MSR_ARCH_LBR_DEPTH, lbr_desc->records.nr);
+ lbr_desc->arch_lbr_reset = false;
+}
+
static void vmx_update_intercept_for_lbr_msrs(struct kvm_vcpu *vcpu, bool set)
{
struct x86_pmu_lbr *lbr = vcpu_to_lbr_records(vcpu);
@@ -654,6 +694,9 @@ static inline void vmx_enable_lbr_msrs_passthrough(struct kvm_vcpu *vcpu)
{
struct lbr_desc *lbr_desc = vcpu_to_lbr_desc(vcpu);

+ if (unlikely(lbr_desc->arch_lbr_reset))
+ intel_pmu_arch_lbr_reset(vcpu);
+
if (lbr_desc->msr_passthrough)
return;

diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 12c53d05a902..ad8abe203ba5 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -116,6 +116,9 @@ struct lbr_desc {

/* True if LBRs are marked as not intercepted in the MSR bitmap */
bool msr_passthrough;
+
+ /* Reset all LBR entries on a guest write to MSR_ARCH_LBR_DEPTH */
+ bool arch_lbr_reset;
};

/*
--
2.29.2

2021-02-03 14:10:50

by Like Xu

[permalink] [raw]
Subject: [PATCH v2 2/4] KVM: vmx/pmu: Add MSR_ARCH_LBR_CTL emulation for Arch LBR

Arch LBRs are enabled by setting MSR_ARCH_LBR_CTL.LBREn to 1. On
processors that support Arch LBR, MSR_IA32_DEBUGCTLMSR[bit 0] has
no meaning. It can be written to 0 or 1, but reads will always return 0.

A new guest state field named "Guest IA32_LBR_CTL" has been added to
enhance guest LBR usage and the guest value of MSR_ARCH_LBR_CTL is
written to this field on all VM exits.

Signed-off-by: Like Xu <[email protected]>
---
arch/x86/include/asm/vmx.h | 2 ++
arch/x86/kvm/vmx/pmu_intel.c | 14 ++++++++++++++
arch/x86/kvm/vmx/vmx.c | 7 +++++++
3 files changed, 23 insertions(+)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index 1b387713eddd..c099c3d17612 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -247,6 +247,8 @@ enum vmcs_field {
GUEST_BNDCFGS_HIGH = 0x00002813,
GUEST_IA32_RTIT_CTL = 0x00002814,
GUEST_IA32_RTIT_CTL_HIGH = 0x00002815,
+ GUEST_IA32_LBR_CTL = 0x00002816,
+ GUEST_IA32_LBR_CTL_HIGH = 0x00002817,
HOST_IA32_PAT = 0x00002c00,
HOST_IA32_PAT_HIGH = 0x00002c01,
HOST_IA32_EFER = 0x00002c02,
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index b550c4a6ce33..a00d89c93eb7 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -19,6 +19,7 @@
#include "pmu.h"

#define MSR_PMC_FULL_WIDTH_BIT (MSR_IA32_PMC0 - MSR_IA32_PERFCTR0)
+#define ARCH_LBR_CTL_MASK 0x7f000e

static struct kvm_event_hw_type_mapping intel_arch_events[] = {
/* Index must match CPUID 0x0A.EBX bit vector */
@@ -221,6 +222,7 @@ static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
ret = pmu->version > 1;
break;
case MSR_ARCH_LBR_DEPTH:
+ case MSR_ARCH_LBR_CTL:
ret = guest_cpuid_has(vcpu, X86_FEATURE_ARCH_LBR);
break;
default:
@@ -390,6 +392,9 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
case MSR_ARCH_LBR_DEPTH:
msr_info->data = lbr_desc->records.nr;
return 0;
+ case MSR_ARCH_LBR_CTL:
+ msr_info->data = vmcs_read64(GUEST_IA32_LBR_CTL);
+ return 0;
default:
if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
(pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
@@ -458,6 +463,15 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
lbr_desc->arch_lbr_reset = true;
__set_bit(INTEL_PMC_IDX_FIXED_VLBR, pmu->pmc_in_use);
return 0;
+ case MSR_ARCH_LBR_CTL:
+ if (!(data & ARCH_LBR_CTL_MASK)) {
+ vmcs_write64(GUEST_IA32_LBR_CTL, data);
+ if (intel_pmu_lbr_is_enabled(vcpu) && !lbr_desc->event &&
+ (data & DEBUGCTLMSR_LBR))
+ intel_pmu_create_guest_lbr_event(vcpu);
+ return 0;
+ }
+ break;
default:
if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0)) ||
(pmc = get_gp_pmc(pmu, msr, MSR_IA32_PMC0))) {
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index beb5a912014d..edecf2961924 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2109,6 +2109,13 @@ static int vmx_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
VM_EXIT_SAVE_DEBUG_CONTROLS)
get_vmcs12(vcpu)->guest_ia32_debugctl = data;

+ /*
+ * For Arch LBR, IA32_DEBUGCTL[bit 0] has no meaning.
+ * It can be written to 0 or 1, but reads will always return 0.
+ */
+ if (guest_cpuid_has(vcpu, X86_FEATURE_ARCH_LBR))
+ data &= ~DEBUGCTLMSR_LBR;
+
vmcs_write64(GUEST_IA32_DEBUGCTL, data);
if (intel_pmu_lbr_is_enabled(vcpu) && !to_vmx(vcpu)->lbr_desc.event &&
(data & DEBUGCTLMSR_LBR))
--
2.29.2

2021-02-03 14:11:16

by Like Xu

[permalink] [raw]
Subject: [PATCH v2 3/4] KVM: vmx/pmu: Add Arch LBR emulation and its VMCS field

When set bit 21 in vmentry_ctrl, VM entry will write the value from the
"Guest IA32_LBR_CTL" guest state field to IA32_LBR_CTL. When set bit 26
in vmexit_ctrl, VM exit will clear IA32_LBR_CTL after the value has been
saved to the "Guest IA32_LBR_CTL" guest state field.

To enable guest Arch LBR, KVM should set both the "Load Guest IA32_LBR_CTL"
entry control and the "Clear IA32_LBR_CTL" exit control. If these two
conditions cannot be met, the vmx_get_perf_capabilities() will clear
the LBR_FMT bits.

If Arch LBR is exposed on KVM, the guest could set X86_FEATURE_ARCH_LBR
to enable guest LBR, which is equivalent to the legacy LBR_FMT setting.
The Arch LBR feature could bypass the host/guest x86_model check and
the records msrs can still be pass-through to guest as usual and work
like the legacy LBR.

Signed-off-by: Like Xu <[email protected]>
---
arch/x86/include/asm/vmx.h | 2 ++
arch/x86/kvm/vmx/capabilities.h | 25 +++++++++++++++++--------
arch/x86/kvm/vmx/pmu_intel.c | 17 ++++++++++++++---
arch/x86/kvm/vmx/vmx.c | 6 ++++--
4 files changed, 37 insertions(+), 13 deletions(-)

diff --git a/arch/x86/include/asm/vmx.h b/arch/x86/include/asm/vmx.h
index c099c3d17612..755179c0a5da 100644
--- a/arch/x86/include/asm/vmx.h
+++ b/arch/x86/include/asm/vmx.h
@@ -95,6 +95,7 @@
#define VM_EXIT_CLEAR_BNDCFGS 0x00800000
#define VM_EXIT_PT_CONCEAL_PIP 0x01000000
#define VM_EXIT_CLEAR_IA32_RTIT_CTL 0x02000000
+#define VM_EXIT_CLEAR_IA32_LBR_CTL 0x04000000
#define VM_EXIT_LOAD_CET_STATE 0x10000000

#define VM_EXIT_ALWAYSON_WITHOUT_TRUE_MSR 0x00036dff
@@ -110,6 +111,7 @@
#define VM_ENTRY_PT_CONCEAL_PIP 0x00020000
#define VM_ENTRY_LOAD_IA32_RTIT_CTL 0x00040000
#define VM_ENTRY_LOAD_CET_STATE 0x00100000
+#define VM_ENTRY_LOAD_IA32_LBR_CTL 0x00200000

#define VM_ENTRY_ALWAYSON_WITHOUT_TRUE_MSR 0x000011ff

diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index 473c55c824b1..d84af64314fc 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -383,20 +383,29 @@ static inline bool vmx_pt_mode_is_host_guest(void)
return pt_mode == PT_MODE_HOST_GUEST;
}

-static inline u64 vmx_get_perf_capabilities(void)
+static inline bool cpu_has_vmx_arch_lbr(void)
{
- u64 perf_cap = 0;
-
- if (boot_cpu_has(X86_FEATURE_PDCM))
- rdmsrl(MSR_IA32_PERF_CAPABILITIES, perf_cap);
-
- perf_cap &= PMU_CAP_LBR_FMT;
+ return (vmcs_config.vmexit_ctrl & VM_EXIT_CLEAR_IA32_LBR_CTL) &&
+ (vmcs_config.vmentry_ctrl & VM_ENTRY_LOAD_IA32_LBR_CTL);
+}

+static inline u64 vmx_get_perf_capabilities(void)
+{
/*
* Since counters are virtualized, KVM would support full
* width counting unconditionally, even if the host lacks it.
*/
- return PMU_CAP_FW_WRITES | perf_cap;
+ u64 perf_cap = PMU_CAP_FW_WRITES;
+ u64 host_perf_cap = 0;
+
+ if (boot_cpu_has(X86_FEATURE_PDCM))
+ rdmsrl(MSR_IA32_PERF_CAPABILITIES, host_perf_cap);
+
+ perf_cap |= host_perf_cap & PMU_CAP_LBR_FMT;
+ if (boot_cpu_has(X86_FEATURE_ARCH_LBR) && !cpu_has_vmx_arch_lbr())
+ perf_cap &= ~PMU_CAP_LBR_FMT;
+
+ return perf_cap;
}

static inline u64 vmx_supported_debugctl(void)
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index a00d89c93eb7..7f20a8e75306 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -176,12 +176,17 @@ static inline struct kvm_pmc *get_fw_gp_pmc(struct kvm_pmu *pmu, u32 msr)

bool intel_pmu_lbr_is_compatible(struct kvm_vcpu *vcpu)
{
+ if (kvm_cpu_cap_has(X86_FEATURE_ARCH_LBR) !=
+ guest_cpuid_has(vcpu, X86_FEATURE_ARCH_LBR))
+ return false;
+
/*
* As a first step, a guest could only enable LBR feature if its
* cpu model is the same as the host because the LBR registers
* would be pass-through to the guest and they're model specific.
*/
- return boot_cpu_data.x86_model == guest_cpuid_model(vcpu);
+ return !guest_cpuid_has(vcpu, X86_FEATURE_ARCH_LBR) &&
+ boot_cpu_data.x86_model == guest_cpuid_model(vcpu);
}

bool intel_pmu_lbr_is_enabled(struct kvm_vcpu *vcpu)
@@ -199,8 +204,11 @@ static bool intel_pmu_is_valid_lbr_msr(struct kvm_vcpu *vcpu, u32 index)
if (!intel_pmu_lbr_is_enabled(vcpu))
return ret;

- ret = (index == MSR_LBR_SELECT) || (index == MSR_LBR_TOS) ||
- (index >= records->from && index < records->from + records->nr) ||
+ if (!guest_cpuid_has(vcpu, X86_FEATURE_ARCH_LBR))
+ ret = (index == MSR_LBR_SELECT) || (index == MSR_LBR_TOS);
+
+ if (!ret)
+ ret = (index >= records->from && index < records->from + records->nr) ||
(index >= records->to && index < records->to + records->nr);

if (!ret && records->info)
@@ -689,6 +697,9 @@ static void vmx_update_intercept_for_lbr_msrs(struct kvm_vcpu *vcpu, bool set)
vmx_set_intercept_for_msr(vcpu, lbr->info + i, MSR_TYPE_RW, set);
}

+ if (guest_cpuid_has(vcpu, X86_FEATURE_ARCH_LBR))
+ return;
+
vmx_set_intercept_for_msr(vcpu, MSR_LBR_SELECT, MSR_TYPE_RW, set);
vmx_set_intercept_for_msr(vcpu, MSR_LBR_TOS, MSR_TYPE_RW, set);
}
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index edecf2961924..9ddf0a14d75c 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -2632,7 +2632,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
VM_EXIT_CLEAR_BNDCFGS |
VM_EXIT_PT_CONCEAL_PIP |
VM_EXIT_CLEAR_IA32_RTIT_CTL |
- VM_EXIT_LOAD_CET_STATE;
+ VM_EXIT_LOAD_CET_STATE |
+ VM_EXIT_CLEAR_IA32_LBR_CTL;
if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_EXIT_CTLS,
&_vmexit_control) < 0)
return -EIO;
@@ -2657,7 +2658,8 @@ static __init int setup_vmcs_config(struct vmcs_config *vmcs_conf,
VM_ENTRY_LOAD_BNDCFGS |
VM_ENTRY_PT_CONCEAL_PIP |
VM_ENTRY_LOAD_IA32_RTIT_CTL |
- VM_ENTRY_LOAD_CET_STATE;
+ VM_ENTRY_LOAD_CET_STATE |
+ VM_ENTRY_LOAD_IA32_LBR_CTL;
if (adjust_vmx_controls(min, opt, MSR_IA32_VMX_ENTRY_CTLS,
&_vmentry_control) < 0)
return -EIO;
--
2.29.2

2021-02-03 14:41:21

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] KVM: x86: Expose Architectural LBR CPUID and its XSAVES bit

On 03/02/21 14:57, Like Xu wrote:
> If CPUID.(EAX=07H, ECX=0):EDX[19] is exposed to 1, the KVM supports Arch
> LBRs and CPUID leaf 01CH indicates details of the Arch LBRs capabilities.
> As the first step, KVM only exposes the current LBR depth on the host for
> guest, which is likely to be the maximum supported value on the host.
>
> If KVM supports XSAVES, the CPUID.(EAX=0DH, ECX=1):EDX:ECX[bit 15]
> is also exposed to 1, which means the availability of support for Arch
> LBR configuration state save and restore. When available, guest software
> operating at CPL=0 can use XSAVES/XRSTORS manage supervisor state
> component Arch LBR for own purposes once IA32_XSS [bit 15] is set.
> XSAVE support for Arch LBRs is enumerated in CPUID.(EAX=0DH, ECX=0FH).
>
> Signed-off-by: Like Xu <[email protected]>
> ---
> arch/x86/kvm/cpuid.c | 23 +++++++++++++++++++++++
> arch/x86/kvm/vmx/vmx.c | 2 ++
> arch/x86/kvm/x86.c | 10 +++++++++-
> 3 files changed, 34 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index 944f518ca91b..900149eec42d 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -778,6 +778,29 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
> entry->edx = 0;
> }
> break;
> + /* Architectural LBR */
> + case 0x1c:
> + {
> + u64 lbr_depth_mask = 0;
> +
> + if (!kvm_cpu_cap_has(X86_FEATURE_ARCH_LBR)) {
> + entry->eax = entry->ebx = entry->ecx = entry->edx = 0;
> + break;
> + }
> +
> + /*
> + * KVM only exposes the maximum supported depth,
> + * which is also the fixed value used on the host.
> + *
> + * KVM doesn't allow VMM user sapce to adjust depth
> + * per guest, because the guest LBR emulation depends
> + * on the implementation of the host LBR driver.
> + */
> + lbr_depth_mask = 1UL << fls(entry->eax & 0xff);
> + entry->eax &= ~0xff;
> + entry->eax |= lbr_depth_mask;
> + break;
> + }
> /* Intel PT */
> case 0x14:
> if (!kvm_cpu_cap_has(X86_FEATURE_INTEL_PT)) {
> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
> index 9ddf0a14d75c..c22175d9564e 100644
> --- a/arch/x86/kvm/vmx/vmx.c
> +++ b/arch/x86/kvm/vmx/vmx.c
> @@ -7498,6 +7498,8 @@ static __init void vmx_set_cpu_caps(void)
> kvm_cpu_cap_check_and_set(X86_FEATURE_INVPCID);
> if (vmx_pt_mode_is_host_guest())
> kvm_cpu_cap_check_and_set(X86_FEATURE_INTEL_PT);
> + if (cpu_has_vmx_arch_lbr())
> + kvm_cpu_cap_check_and_set(X86_FEATURE_ARCH_LBR);
>
> if (vmx_umip_emulated())
> kvm_cpu_cap_set(X86_FEATURE_UMIP);
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 667d0042d0b7..107f2e72f526 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -10385,8 +10385,16 @@ int kvm_arch_hardware_setup(void *opaque)
>
> if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
> supported_xss = 0;
> - else
> + else {
> supported_xss &= host_xss;
> + /*
> + * The host doesn't always set ARCH_LBR bit to hoss_xss since this
> + * Arch_LBR component is used on demand in the Arch LBR driver.
> + * Check e649b3f0188f "Support dynamic supervisor feature for LBR".
> + */
> + if (kvm_cpu_cap_has(X86_FEATURE_ARCH_LBR))
> + supported_xss |= XFEATURE_MASK_LBR;
> + }
>
> /* Update CET features now that supported_xss is finalized. */
> if (!kvm_cet_supported()) {
>

This requires some of the XSS patches that Weijang posted for CET, right?

Also, who takes care of saving/restoring the MSRs, if the host has not
added XFEATURE_MASK_LBR to MSR_IA32_XSS?

Thanks,

Paolo

2021-02-04 01:02:45

by Xu, Like

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] KVM: x86: Expose Architectural LBR CPUID and its XSAVES bit

On 2021/2/3 22:37, Paolo Bonzini wrote:
> On 03/02/21 14:57, Like Xu wrote:
>> If CPUID.(EAX=07H, ECX=0):EDX[19] is exposed to 1, the KVM supports Arch
>> LBRs and CPUID leaf 01CH indicates details of the Arch LBRs capabilities.
>> As the first step, KVM only exposes the current LBR depth on the host for
>> guest, which is likely to be the maximum supported value on the host.
>>
>> If KVM supports XSAVES, the CPUID.(EAX=0DH, ECX=1):EDX:ECX[bit 15]
>> is also exposed to 1, which means the availability of support for Arch
>> LBR configuration state save and restore. When available, guest software
>> operating at CPL=0 can use XSAVES/XRSTORS manage supervisor state
>> component Arch LBR for own purposes once IA32_XSS [bit 15] is set.
>> XSAVE support for Arch LBRs is enumerated in CPUID.(EAX=0DH, ECX=0FH).
>>
>> Signed-off-by: Like Xu <[email protected]>
>> ---
>>   arch/x86/kvm/cpuid.c   | 23 +++++++++++++++++++++++
>>   arch/x86/kvm/vmx/vmx.c |  2 ++
>>   arch/x86/kvm/x86.c     | 10 +++++++++-
>>   3 files changed, 34 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>> index 944f518ca91b..900149eec42d 100644
>> --- a/arch/x86/kvm/cpuid.c
>> +++ b/arch/x86/kvm/cpuid.c
>> @@ -778,6 +778,29 @@ static inline int __do_cpuid_func(struct
>> kvm_cpuid_array *array, u32 function)
>>               entry->edx = 0;
>>           }
>>           break;
>> +    /* Architectural LBR */
>> +    case 0x1c:
>> +    {
>> +        u64 lbr_depth_mask = 0;
>> +
>> +        if (!kvm_cpu_cap_has(X86_FEATURE_ARCH_LBR)) {
>> +            entry->eax = entry->ebx = entry->ecx = entry->edx = 0;
>> +            break;
>> +        }
>> +
>> +        /*
>> +         * KVM only exposes the maximum supported depth,
>> +         * which is also the fixed value used on the host.
>> +         *
>> +         * KVM doesn't allow VMM user sapce to adjust depth
>> +         * per guest, because the guest LBR emulation depends
>> +         * on the implementation of the host LBR driver.
>> +         */
>> +        lbr_depth_mask = 1UL << fls(entry->eax & 0xff);
>> +        entry->eax &= ~0xff;
>> +        entry->eax |= lbr_depth_mask;
>> +        break;
>> +    }
>>       /* Intel PT */
>>       case 0x14:
>>           if (!kvm_cpu_cap_has(X86_FEATURE_INTEL_PT)) {
>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>> index 9ddf0a14d75c..c22175d9564e 100644
>> --- a/arch/x86/kvm/vmx/vmx.c
>> +++ b/arch/x86/kvm/vmx/vmx.c
>> @@ -7498,6 +7498,8 @@ static __init void vmx_set_cpu_caps(void)
>>           kvm_cpu_cap_check_and_set(X86_FEATURE_INVPCID);
>>       if (vmx_pt_mode_is_host_guest())
>>           kvm_cpu_cap_check_and_set(X86_FEATURE_INTEL_PT);
>> +    if (cpu_has_vmx_arch_lbr())
>> +        kvm_cpu_cap_check_and_set(X86_FEATURE_ARCH_LBR);
>>         if (vmx_umip_emulated())
>>           kvm_cpu_cap_set(X86_FEATURE_UMIP);
>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>> index 667d0042d0b7..107f2e72f526 100644
>> --- a/arch/x86/kvm/x86.c
>> +++ b/arch/x86/kvm/x86.c
>> @@ -10385,8 +10385,16 @@ int kvm_arch_hardware_setup(void *opaque)
>>         if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
>>           supported_xss = 0;
>> -    else
>> +    else {
>>           supported_xss &= host_xss;
>> +        /*
>> +         * The host doesn't always set ARCH_LBR bit to hoss_xss since this
>> +         * Arch_LBR component is used on demand in the Arch LBR driver.
>> +         * Check e649b3f0188f "Support dynamic supervisor feature for
>> LBR".
>> +         */
>> +        if (kvm_cpu_cap_has(X86_FEATURE_ARCH_LBR))
>> +            supported_xss |= XFEATURE_MASK_LBR;
>> +    }
>>         /* Update CET features now that supported_xss is finalized. */
>>       if (!kvm_cet_supported()) {
>>
>
> This requires some of the XSS patches that Weijang posted for CET, right?

Yes, at least we need three of them for Arch LBR:

3009dfd6d61f KVM: x86: Load guest fpu state when accessing MSRs managed by
XSAVES
d39b0a16ad1f KVM: x86: Refresh CPUID on writes to MSR_IA32_XSS
e98bf65e51c9 KVM: x86: Report XSS as an MSR to be saved if there are
supported features

>
> Also, who takes care of saving/restoring the MSRs, if the host has not
> added XFEATURE_MASK_LBR to MSR_IA32_XSS?

I may not understand your concern on this. Let me try to explain:

The guest Arch LBR driver will save the origin host_xss and
mark the LBR bit only in the XSS and then save/restore MSRs
in the extra specified guest memory, and restore the origin host_xss.

On the host side, the same thing happens to vcpu thread
due to the help of guest LBR event created by the vPMU
and the hardware LBR MSRs are saved/restored in a exclusive way.

>
> Thanks,
>
> Paolo
>

2021-02-05 08:22:24

by Xu, Like

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] KVM: x86: Expose Architectural LBR CPUID and its XSAVES bit

Hi Paolo,

I am wondering if it is acceptable for you to
review the minor Architecture LBR patch set without XSAVES for v5.12 ?

As far as I know, the guest Arch LBR  can still work without XSAVES support.

---
thx,likexu

On 2021/2/4 8:59, Xu, Like wrote:
> On 2021/2/3 22:37, Paolo Bonzini wrote:
>> On 03/02/21 14:57, Like Xu wrote:
>>> If CPUID.(EAX=07H, ECX=0):EDX[19] is exposed to 1, the KVM supports Arch
>>> LBRs and CPUID leaf 01CH indicates details of the Arch LBRs capabilities.
>>> As the first step, KVM only exposes the current LBR depth on the host for
>>> guest, which is likely to be the maximum supported value on the host.
>>>
>>> If KVM supports XSAVES, the CPUID.(EAX=0DH, ECX=1):EDX:ECX[bit 15]
>>> is also exposed to 1, which means the availability of support for Arch
>>> LBR configuration state save and restore. When available, guest software
>>> operating at CPL=0 can use XSAVES/XRSTORS manage supervisor state
>>> component Arch LBR for own purposes once IA32_XSS [bit 15] is set.
>>> XSAVE support for Arch LBRs is enumerated in CPUID.(EAX=0DH, ECX=0FH).
>>>
>>> Signed-off-by: Like Xu <[email protected]>
>>> ---
>>>   arch/x86/kvm/cpuid.c   | 23 +++++++++++++++++++++++
>>>   arch/x86/kvm/vmx/vmx.c |  2 ++
>>>   arch/x86/kvm/x86.c     | 10 +++++++++-
>>>   3 files changed, 34 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
>>> index 944f518ca91b..900149eec42d 100644
>>> --- a/arch/x86/kvm/cpuid.c
>>> +++ b/arch/x86/kvm/cpuid.c
>>> @@ -778,6 +778,29 @@ static inline int __do_cpuid_func(struct
>>> kvm_cpuid_array *array, u32 function)
>>>               entry->edx = 0;
>>>           }
>>>           break;
>>> +    /* Architectural LBR */
>>> +    case 0x1c:
>>> +    {
>>> +        u64 lbr_depth_mask = 0;
>>> +
>>> +        if (!kvm_cpu_cap_has(X86_FEATURE_ARCH_LBR)) {
>>> +            entry->eax = entry->ebx = entry->ecx = entry->edx = 0;
>>> +            break;
>>> +        }
>>> +
>>> +        /*
>>> +         * KVM only exposes the maximum supported depth,
>>> +         * which is also the fixed value used on the host.
>>> +         *
>>> +         * KVM doesn't allow VMM user sapce to adjust depth
>>> +         * per guest, because the guest LBR emulation depends
>>> +         * on the implementation of the host LBR driver.
>>> +         */
>>> +        lbr_depth_mask = 1UL << fls(entry->eax & 0xff);
>>> +        entry->eax &= ~0xff;
>>> +        entry->eax |= lbr_depth_mask;
>>> +        break;
>>> +    }
>>>       /* Intel PT */
>>>       case 0x14:
>>>           if (!kvm_cpu_cap_has(X86_FEATURE_INTEL_PT)) {
>>> diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
>>> index 9ddf0a14d75c..c22175d9564e 100644
>>> --- a/arch/x86/kvm/vmx/vmx.c
>>> +++ b/arch/x86/kvm/vmx/vmx.c
>>> @@ -7498,6 +7498,8 @@ static __init void vmx_set_cpu_caps(void)
>>>           kvm_cpu_cap_check_and_set(X86_FEATURE_INVPCID);
>>>       if (vmx_pt_mode_is_host_guest())
>>>           kvm_cpu_cap_check_and_set(X86_FEATURE_INTEL_PT);
>>> +    if (cpu_has_vmx_arch_lbr())
>>> +        kvm_cpu_cap_check_and_set(X86_FEATURE_ARCH_LBR);
>>>         if (vmx_umip_emulated())
>>>           kvm_cpu_cap_set(X86_FEATURE_UMIP);
>>> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
>>> index 667d0042d0b7..107f2e72f526 100644
>>> --- a/arch/x86/kvm/x86.c
>>> +++ b/arch/x86/kvm/x86.c
>>> @@ -10385,8 +10385,16 @@ int kvm_arch_hardware_setup(void *opaque)
>>>         if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
>>>           supported_xss = 0;
>>> -    else
>>> +    else {
>>>           supported_xss &= host_xss;
>>> +        /*
>>> +         * The host doesn't always set ARCH_LBR bit to hoss_xss since
>>> this
>>> +         * Arch_LBR component is used on demand in the Arch LBR driver.
>>> +         * Check e649b3f0188f "Support dynamic supervisor feature for
>>> LBR".
>>> +         */
>>> +        if (kvm_cpu_cap_has(X86_FEATURE_ARCH_LBR))
>>> +            supported_xss |= XFEATURE_MASK_LBR;
>>> +    }
>>>         /* Update CET features now that supported_xss is finalized. */
>>>       if (!kvm_cet_supported()) {
>>>
>>
>> This requires some of the XSS patches that Weijang posted for CET, right?
>
> Yes, at least we need three of them for Arch LBR:
>
> 3009dfd6d61f KVM: x86: Load guest fpu state when accessing MSRs managed
> by XSAVES
> d39b0a16ad1f KVM: x86: Refresh CPUID on writes to MSR_IA32_XSS
> e98bf65e51c9 KVM: x86: Report XSS as an MSR to be saved if there are
> supported features
>
>>
>> Also, who takes care of saving/restoring the MSRs, if the host has not
>> added XFEATURE_MASK_LBR to MSR_IA32_XSS?
>
> I may not understand your concern on this. Let me try to explain:
>
> The guest Arch LBR driver will save the origin host_xss and
> mark the LBR bit only in the XSS and then save/restore MSRs
> in the extra specified guest memory, and restore the origin host_xss.
>
> On the host side, the same thing happens to vcpu thread
> due to the help of guest LBR event created by the vPMU
> and the hardware LBR MSRs are saved/restored in a exclusive way.
>
>>
>> Thanks,
>>
>> Paolo
>>
>

2021-02-05 11:09:48

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] KVM: x86: Expose Architectural LBR CPUID and its XSAVES bit

On 05/02/21 09:16, Xu, Like wrote:
> Hi Paolo,
>
> I am wondering if it is acceptable for you to
> review the minor Architecture LBR patch set without XSAVES for v5.12 ?
>
> As far as I know, the guest Arch LBR  can still work without XSAVES
> support.

I dopn't think it can work. You could have two guests on the same
physical CPU and the MSRs would be corrupted if the guests write to the
MSR but they do not enable the LBRs.

Paolo

2021-02-07 01:04:13

by Xu, Like

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] KVM: x86: Expose Architectural LBR CPUID and its XSAVES bit

On 2021/2/5 19:00, Paolo Bonzini wrote:
> On 05/02/21 09:16, Xu, Like wrote:
>> Hi Paolo,
>>
>> I am wondering if it is acceptable for you to
>> review the minor Architecture LBR patch set without XSAVES for v5.12 ?
>>
>> As far as I know, the guest Arch LBR  can still work without XSAVES
>> support.
>
> I dopn't think it can work.  You could have two guests on the same
> physical CPU and the MSRs would be corrupted if the guests write to the
> MSR but they do not enable the LBRs.
>
> Paolo
>
Neither Arch LBR nor the old version of LBR have this corruption issue,
and we will not use XSAVES for at least LBR MSRs in the VMX transaction.

This is because we have reused the LBR save/restore swicth support from the
host perf mechanism in the legacy LBR support, which will save/restore the LBR
MSRs of the vcpu (thread) when the vcpu is sched in/out.

Therefore, if we have two guests on the same physical CPU, the usage of LBR
MSRs
is isolated, and it's also true when we use LBR to trace the hypervisor on
the host.
The same thing happens on the platforms which supports Arch LBR.

I propose that we don't support using XSAVES to save/restore Arch LRB *in
the guest*
(just like the guest Intel PT), but use the traditional RD/WRMSR, which
still works
like the legacy LBR.

Since we already have legacy LBR support, we can add a small amount of
effort (just
two more MSRs emulation and related CPUID exposure) to support Arch LBR w/o
XSAVES.

I estimate that there are many issues we need to address when we supporting
guests
to use xsaves instructions. As a rational choice, we could enable the basic
Arch LBR.

Paolo and Sean, what do you think ?

---
thx, likexu

2021-02-08 10:44:47

by Paolo Bonzini

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] KVM: x86: Expose Architectural LBR CPUID and its XSAVES bit

On 07/02/21 02:02, Xu, Like wrote:
> On 2021/2/5 19:00, Paolo Bonzini wrote:
>> On 05/02/21 09:16, Xu, Like wrote:
>>> Hi Paolo,
>>>
>>> I am wondering if it is acceptable for you to
>>> review the minor Architecture LBR patch set without XSAVES for v5.12 ?
>>>
>>> As far as I know, the guest Arch LBR  can still work without XSAVES
>>> support.
>>
>> I dopn't think it can work.  You could have two guests on the same
>> physical CPU and the MSRs would be corrupted if the guests write to
>> the MSR but they do not enable the LBRs.
>>
>> Paolo
>>
> Neither Arch LBR nor the old version of LBR have this corruption issue,
> and we will not use XSAVES for at least LBR MSRs in the VMX transaction.
>
> This is because we have reused the LBR save/restore swicth support from the
> host perf mechanism in the legacy LBR support, which will save/restore
> the LBR
> MSRs of the vcpu (thread) when the vcpu is sched in/out.
>
> Therefore, if we have two guests on the same physical CPU, the usage of
> LBR MSRs
> is isolated, and it's also true when we use LBR to trace the hypervisor
> on the host.
> The same thing happens on the platforms which supports Arch LBR.
>
> I propose that we don't support using XSAVES to save/restore Arch LRB
> *in the guest*
> (just like the guest Intel PT), but use the traditional RD/WRMSR, which
> still works
> like the legacy LBR.

Ok, this makes sense. I'll review the patches more carefully, looking
at 5.13 for the target.

Paolo

> Since we already have legacy LBR support, we can add a small amount of
> effort (just
> two more MSRs emulation and related CPUID exposure) to support Arch LBR
> w/o XSAVES.
>
> I estimate that there are many issues we need to address when we
> supporting guests
> to use xsaves instructions. As a rational choice, we could enable the
> basic Arch LBR.
>
> Paolo and Sean, what do you think ?
>
> ---
> thx, likexu
>

2021-03-02 18:47:22

by Like Xu

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] KVM: vmx/pmu: Add MSR_ARCH_LBR_DEPTH emulation for Arch LBR

On 2021/3/2 6:34, Sean Christopherson wrote:
> On Wed, Feb 03, 2021, Like Xu wrote:
>> @@ -348,10 +352,26 @@ static bool intel_pmu_handle_lbr_msrs_access(struct kvm_vcpu *vcpu,
>> return true;
>> }
>>
>> +/*
>> + * Check if the requested depth values is supported
>> + * based on the bits [0:7] of the guest cpuid.1c.eax.
>> + */
>> +static bool arch_lbr_depth_is_valid(struct kvm_vcpu *vcpu, u64 depth)
>> +{
>> + struct kvm_cpuid_entry2 *best;
>> +
>> + best = kvm_find_cpuid_entry(vcpu, 0x1c, 0);
>> + if (depth && best)
>
>> + return (best->eax & 0xff) & (1ULL << (depth / 8 - 1));
>
> I believe this will genereate undefined behavior if depth > 64. Or if depth < 8.
> And I believe this check also needs to enforce that depth is a multiple of 8.
>
> For each bit n set in this field, the IA32_LBR_DEPTH.DEPTH value 8*(n+1) is
> supported.
>
> Thus it's impossible for 0-7, 9-15, etc... to be legal depths.

Thank you! How about:

best = kvm_find_cpuid_entry(vcpu, 0x1c, 0);
if (best && depth && !(depth % 8))
return (best->eax & 0xff) & (1ULL << (depth / 8 - 1));

return false;

>
>
>> +
>> + return false;
>> +}
>> +
>

2021-03-04 05:27:43

by Sean Christopherson

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] KVM: vmx/pmu: Add MSR_ARCH_LBR_DEPTH emulation for Arch LBR

On Wed, Feb 03, 2021, Like Xu wrote:
> @@ -348,10 +352,26 @@ static bool intel_pmu_handle_lbr_msrs_access(struct kvm_vcpu *vcpu,
> return true;
> }
>
> +/*
> + * Check if the requested depth values is supported
> + * based on the bits [0:7] of the guest cpuid.1c.eax.
> + */
> +static bool arch_lbr_depth_is_valid(struct kvm_vcpu *vcpu, u64 depth)
> +{
> + struct kvm_cpuid_entry2 *best;
> +
> + best = kvm_find_cpuid_entry(vcpu, 0x1c, 0);
> + if (depth && best)

> + return (best->eax & 0xff) & (1ULL << (depth / 8 - 1));

I believe this will genereate undefined behavior if depth > 64. Or if depth < 8.
And I believe this check also needs to enforce that depth is a multiple of 8.

For each bit n set in this field, the IA32_LBR_DEPTH.DEPTH value 8*(n+1) is
supported.

Thus it's impossible for 0-7, 9-15, etc... to be legal depths.


> +
> + return false;
> +}
> +

2021-04-14 11:22:38

by Xu, Like

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] KVM: x86: Expose Architectural LBR CPUID and its XSAVES bit

Hi Paolo,

Do we have a chance to make Arch LBR into the mainline in the upcoming
merger window?
https://lore.kernel.org/kvm/[email protected]/

Thanks,
Like Xu

On 2021/2/8 18:31, Paolo Bonzini wrote:
> Ok, this makes sense.  I'll review the patches more carefully, looking at
> 5.13 for the target.
>
> Paolo