2020-03-05 10:00:19

by Luwei Kang

Subject: [PATCH v1 00/11] PEBS virtualization enabling via DS

The Processor Event-Based Sampling (PEBS) facility supported on mainstream
Intel platforms can provide the architectural state of the instruction
executed right after the instruction that caused the event. This patch set
enables the PEBS feature via DS in KVM for Icelake server.
Although PEBS via DS has supported the EPT-violations feature since
Skylake server, we still have to pin the DS area to avoid losing PEBS
records due to some issues.

BTW:
The PEBS virtualization via Intel PT patch set v1 has been posted, and its
later versions will be based on this patch set.
https://lkml.kernel.org/r/[email protected]/

Testing:
The guest can use PEBS feature like native. e.g.

# perf record -e instructions:ppp ./br_instr a

perf report on guest:
# Samples: 2K of event 'instructions:ppp', # Event count (approx.): 1473377250
# Overhead  Command   Shared Object      Symbol
    57.74%  br_instr  br_instr           [.] lfsr_cond
    41.40%  br_instr  br_instr           [.] cmp_end
     0.21%  br_instr  [kernel.kallsyms]  [k] __lock_acquire

perf report on host:
# Samples: 2K of event 'instructions:ppp', # Event count (approx.): 1462721386
# Overhead  Command   Shared Object     Symbol
    57.90%  br_instr  br_instr          [.] lfsr_cond
    41.95%  br_instr  br_instr          [.] cmp_end
     0.05%  br_instr  [kernel.vmlinux]  [k] lock_acquire


Kan Liang (4):
perf/x86/core: Support KVM to assign a dedicated counter for guest
PEBS
perf/x86/ds: Handle guest PEBS events overflow and inject fake PMI
perf/x86: Expose a function to disable auto-reload
KVM: x86/pmu: Decouple event enablement from event creation

Like Xu (1):
KVM: x86/pmu: Add support to reprogram PEBS event for guest counters

Luwei Kang (6):
KVM: x86/pmu: Implement is_pebs_via_ds_supported pmu ops
KVM: x86/pmu: Expose CPUIDs feature bits PDCM, DS, DTES64
KVM: x86/pmu: PEBS MSRs emulation
KVM: x86/pmu: Expose PEBS feature to guest
KVM: x86/pmu: Introduce the mask value for fixed counter
KVM: x86/pmu: Adaptive PEBS virtualization enabling

arch/x86/events/intel/core.c | 74 +++++++++++++++++++++-
arch/x86/events/intel/ds.c | 59 ++++++++++++++++++
arch/x86/events/perf_event.h | 1 +
arch/x86/include/asm/kvm_host.h | 12 ++++
arch/x86/include/asm/msr-index.h | 4 ++
arch/x86/include/asm/perf_event.h | 2 +
arch/x86/kvm/cpuid.c | 9 ++-
arch/x86/kvm/pmu.c | 71 ++++++++++++++++++++-
arch/x86/kvm/pmu.h | 2 +
arch/x86/kvm/svm.c | 12 ++++
arch/x86/kvm/vmx/capabilities.h | 17 +++++
arch/x86/kvm/vmx/pmu_intel.c | 128 +++++++++++++++++++++++++++++++++++++-
arch/x86/kvm/vmx/vmx.c | 6 +-
arch/x86/kvm/vmx/vmx.h | 4 ++
arch/x86/kvm/x86.c | 19 +++++-
include/linux/perf_event.h | 2 +
kernel/events/core.c | 1 +
17 files changed, 414 insertions(+), 9 deletions(-)

--
1.8.3.1


2020-03-05 10:00:28

by Luwei Kang

Subject: [PATCH v1 06/11] KVM: x86/pmu: Implement is_pebs_via_ds_supported pmu ops

PEBS virtualization via DS in a KVM guest is only supported on Icelake
server. This patch introduces a new PMU op, is_pebs_via_ds_supported, to
check whether the PEBS feature can be supported for a KVM guest.
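
Later patches in this series guard PEBS-related CPUID bits and MSRs on this
op. The intended usage on the common-code side looks roughly like this
(sketch; the pointer may be NULL, e.g. on AMD, so it is checked first):

        if (kvm_x86_ops->pmu_ops->is_pebs_via_ds_supported &&
            kvm_x86_ops->pmu_ops->is_pebs_via_ds_supported())
                /* PEBS via DS can be advertised to the guest */;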

Originally-by: Andi Kleen <[email protected]>
Signed-off-by: Luwei Kang <[email protected]>
Co-developed-by: Kan Liang <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
Co-developed-by: Like Xu <[email protected]>
Signed-off-by: Like Xu <[email protected]>
---
arch/x86/kvm/pmu.h | 1 +
arch/x86/kvm/vmx/pmu_intel.c | 18 ++++++++++++++++++
2 files changed, 19 insertions(+)

diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 1333298..476780b 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -32,6 +32,7 @@ struct kvm_pmu_ops {
struct kvm_pmc *(*msr_idx_to_pmc)(struct kvm_vcpu *vcpu, u32 msr);
int (*is_valid_rdpmc_ecx)(struct kvm_vcpu *vcpu, unsigned int idx);
bool (*is_valid_msr)(struct kvm_vcpu *vcpu, u32 msr);
+ bool (*is_pebs_via_ds_supported)(void);
int (*get_msr)(struct kvm_vcpu *vcpu, u32 msr, u64 *data);
int (*set_msr)(struct kvm_vcpu *vcpu, struct msr_data *msr_info);
void (*refresh)(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index ebadc33..a67bd34 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -12,6 +12,7 @@
#include <linux/kvm_host.h>
#include <linux/perf_event.h>
#include <asm/perf_event.h>
+#include <asm/intel-family.h>
#include "x86.h"
#include "cpuid.h"
#include "lapic.h"
@@ -172,6 +173,22 @@ static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
return ret;
}

+static bool intel_is_pebs_via_ds_supported(void)
+{
+ if (!boot_cpu_has(X86_FEATURE_PEBS))
+ return false;
+
+ switch (boot_cpu_data.x86_model) {
+ case INTEL_FAM6_ICELAKE_D:
+ case INTEL_FAM6_ICELAKE_X:
+ break;
+ default:
+ return false;
+ }
+
+ return true;
+}
+
static struct kvm_pmc *intel_msr_idx_to_pmc(struct kvm_vcpu *vcpu, u32 msr)
{
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
@@ -401,6 +418,7 @@ struct kvm_pmu_ops intel_pmu_ops = {
.msr_idx_to_pmc = intel_msr_idx_to_pmc,
.is_valid_rdpmc_ecx = intel_is_valid_rdpmc_ecx,
.is_valid_msr = intel_is_valid_msr,
+ .is_pebs_via_ds_supported = intel_is_pebs_via_ds_supported,
.get_msr = intel_pmu_get_msr,
.set_msr = intel_pmu_set_msr,
.refresh = intel_pmu_refresh,
--
1.8.3.1

2020-03-05 10:00:44

by Luwei Kang

Subject: [PATCH v1 02/11] perf/x86/ds: Handle guest PEBS events overflow and inject fake PMI

From: Kan Liang <[email protected]>

With PEBS virtualization, the PEBS record gets delivered to the guest,
but the host still sees the PMI. This would normally result in a spurious
PEBS PMI that is ignored. But we need to inject the PMI into the guest,
so that the guest PMI handler can handle the PEBS record.

Check for this case in the perf PEBS handler. If a guest PEBS counter
overflowed, a fake event will be triggered. The fake event results in
calling the KVM PMI callback, which injects the PMI into the guest.

No matter how many PEBS counters have overflowed, triggering one fake
event is enough. The guest handler then retrieves the correct information,
including the guest state, from its own PEBS records.
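
For reference, the fake event ends up invoking the overflow callback that
KVM attached when it created the host perf event for the guest counter.
Below is a simplified sketch of what that callback conceptually does; the
real handler (kvm_perf_overflow_intr() in arch/x86/kvm/pmu.c) has extra
logic for NMI context and counter reprogramming:

        static void guest_pebs_pmi_sketch(struct perf_event *perf_event,
                                          struct perf_sample_data *data,
                                          struct pt_regs *regs)
        {
                struct kvm_pmc *pmc = perf_event->overflow_handler_context;
                struct kvm_pmu *pmu = pmc_to_pmu(pmc);

                /* Latch the overflow and request a PMI injection into the guest. */
                __set_bit(pmc->idx, (unsigned long *)&pmu->global_status);
                kvm_make_request(KVM_REQ_PMI, pmc->vcpu);
        }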

Originally-by: Andi Kleen <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/intel/ds.c | 59 ++++++++++++++++++++++++++++++++++++++++++++++
1 file changed, 59 insertions(+)

diff --git a/arch/x86/events/intel/ds.c b/arch/x86/events/intel/ds.c
index dc43cc1..6722f39 100644
--- a/arch/x86/events/intel/ds.c
+++ b/arch/x86/events/intel/ds.c
@@ -1721,6 +1721,62 @@ void intel_pmu_auto_reload_read(struct perf_event *event)
return 0;
}

+/*
+ * We may be running with virtualized PEBS, so the PEBS record
+ * was logged into the guest's DS and is invisible to the host.
+ *
+ * For guest-dedicated counters we always have to check whether the
+ * counters have overflowed, because the PEBS thresholds
+ * are not reported in PERF_GLOBAL_STATUS.
+ *
+ * In this case we just trigger a fake event for KVM to forward
+ * to the guest as PMI. The guest will then see the real PEBS
+ * record and read the counter values.
+ *
+ * The contents of the event do not matter.
+ */
+static int intel_pmu_handle_guest_pebs(struct cpu_hw_events *cpuc,
+ struct pt_regs *iregs,
+ struct debug_store *ds)
+{
+ struct perf_sample_data data;
+ struct perf_event *event;
+ int bit;
+
+ /*
+ * Ideally, we should check the guest DS to see whether any
+ * guest-dedicated PEBS counters have overflowed.
+ *
+ * However, retrieving the guest DS in the host brings high overhead.
+ * The host and guest cannot have pending PEBS events simultaneously,
+ * so we check the host DS instead.
+ *
+ * If the PEBS interrupt threshold on the host is not exceeded in an NMI,
+ * a guest-dedicated counter must have overflowed.
+ */
+ if (!cpuc->intel_ctrl_guest_dedicated_mask || !in_nmi() ||
+ (ds->pebs_interrupt_threshold <= ds->pebs_index))
+ return 0;
+
+ for_each_set_bit(bit,
+ (unsigned long *)&cpuc->intel_ctrl_guest_dedicated_mask,
+ INTEL_PMC_IDX_FIXED + x86_pmu.num_counters_fixed) {
+
+ event = cpuc->events[bit];
+ if (!event->attr.precise_ip)
+ continue;
+
+ perf_sample_data_init(&data, 0, event->hw.last_period);
+ if (perf_event_overflow(event, &data, iregs))
+ x86_pmu_stop(event, 0);
+
+ /* Injecting one fake event is enough. */
+ return 1;
+ }
+
+ return 0;
+}
+
static void __intel_pmu_pebs_event(struct perf_event *event,
struct pt_regs *iregs,
void *base, void *top,
@@ -1954,6 +2010,9 @@ static void intel_pmu_drain_pebs_icl(struct pt_regs *iregs)
if (!x86_pmu.pebs_active)
return;

+ if (intel_pmu_handle_guest_pebs(cpuc, iregs, ds))
+ return;
+
base = (struct pebs_basic *)(unsigned long)ds->pebs_buffer_base;
top = (struct pebs_basic *)(unsigned long)ds->pebs_index;

--
1.8.3.1

2020-03-05 10:00:54

by Luwei Kang

Subject: [PATCH v1 03/11] perf/x86: Expose a function to disable auto-reload

From: Kan Liang <[email protected]>

KVM needs to disable PEBS auto-reload for guest-owned events to avoid
an unnecessary drain_pebs() on the host.

Expose a function to disable the auto-reload mechanism by clearing the
related flag. The function has to be invoked before the event is enabled;
otherwise it has no effect.
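
The expected call sequence on the KVM side is roughly the following
(sketch only; the actual KVM usage is added by a later patch in this
series, and assumes the event was set up with attr.disabled = 1):

        event = perf_event_create_kernel_counter(&attr, -1, current,
                                                 kvm_perf_overflow_intr, pmc);
        if (IS_ERR(event))
                return PTR_ERR(event);

        /* Must be called while the event is still disabled. */
        perf_x86_pmu_unset_auto_reload(event);
        perf_event_enable(event);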

Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/events/intel/core.c | 14 ++++++++++++++
arch/x86/include/asm/perf_event.h | 2 ++
2 files changed, 16 insertions(+)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index ef95076..cd17601 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -3299,6 +3299,20 @@ static int core_pmu_hw_config(struct perf_event *event)
return intel_pmu_bts_config(event);
}

+/*
+ * Disable the PEBS auto-reload mechanism by clearing the flag.
+ * The function has to be invoked before the event is enabled,
+ * otherwise it has no effect.
+ *
+ * Currently, it's used by KVM to disable PEBS auto-reload
+ * for guest-owned events.
+ */
+void perf_x86_pmu_unset_auto_reload(struct perf_event *event)
+{
+ event->hw.flags &= ~PERF_X86_EVENT_AUTO_RELOAD;
+}
+EXPORT_SYMBOL_GPL(perf_x86_pmu_unset_auto_reload);
+
static int intel_pmu_hw_config(struct perf_event *event)
{
int ret = x86_pmu_hw_config(event);
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 29964b0..6179234 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -325,6 +325,7 @@ struct perf_guest_switch_msr {
extern void perf_get_x86_pmu_capability(struct x86_pmu_capability *cap);
extern void perf_check_microcode(void);
extern int x86_perf_rdpmc_index(struct perf_event *event);
+extern void perf_x86_pmu_unset_auto_reload(struct perf_event *event);
#else
static inline void perf_get_x86_pmu_capability(struct x86_pmu_capability *cap)
{
@@ -333,6 +334,7 @@ static inline void perf_get_x86_pmu_capability(struct x86_pmu_capability *cap)

static inline void perf_events_lapic_init(void) { }
static inline void perf_check_microcode(void) { }
+static inline void perf_x86_pmu_unset_auto_reload(struct perf_event *event) { }
#endif

#if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_INTEL)
--
1.8.3.1

2020-03-05 10:01:11

by Luwei Kang

Subject: [PATCH v1 10/11] KVM: x86/pmu: Introduce the mask value for fixed counter

The mask value of the fixed counter control register should be adjusted
dynamically according to the number of fixed counters. This patch
introduces a variable that holds the reserved bits of the fixed counter
control register.
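
For example (illustrative only), with 4 fixed counters the refresh code
below builds the writable bits as 0xb | 0xb << 4 | 0xb << 8 | 0xb << 12 =
0xbbbb, so fixed_ctr_ctrl_mask becomes ~0xbbbb = 0xffffffffffff4444. A
write that sets any bit in that mask (an AnyThread bit or anything beyond
the last counter's nibble) is rejected, which matches the previously
hard-coded 0xfffffffffffff444 check for 3 fixed counters.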

Signed-off-by: Luwei Kang <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/vmx/pmu_intel.c | 7 ++++++-
2 files changed, 7 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 35d230e..6f82fb7 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -471,6 +471,7 @@ struct kvm_pmu {
unsigned nr_arch_fixed_counters;
unsigned available_event_types;
u64 fixed_ctr_ctrl;
+ u64 fixed_ctr_ctrl_mask;
u64 global_ctrl;
u64 global_status;
u64 global_ovf_ctrl;
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 8161488..578b830 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -277,7 +277,7 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
case MSR_CORE_PERF_FIXED_CTR_CTRL:
if (pmu->fixed_ctr_ctrl == data)
return 0;
- if (!(data & 0xfffffffffffff444ull)) {
+ if (!(data & pmu->fixed_ctr_ctrl_mask)) {
reprogram_fixed_counters(pmu, data);
return 0;
}
@@ -346,9 +346,11 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
struct kvm_cpuid_entry2 *entry;
union cpuid10_eax eax;
union cpuid10_edx edx;
+ int i;

pmu->nr_arch_gp_counters = 0;
pmu->nr_arch_fixed_counters = 0;
+ pmu->fixed_ctr_ctrl_mask = 0;
pmu->counter_bitmask[KVM_PMC_GP] = 0;
pmu->counter_bitmask[KVM_PMC_FIXED] = 0;
pmu->version = 0;
@@ -383,6 +385,9 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
((u64)1 << edx.split.bit_width_fixed) - 1;
}

+ for (i = 0; i < pmu->nr_arch_fixed_counters; i++)
+ pmu->fixed_ctr_ctrl_mask |= (0xbull << (i * 4));
+ pmu->fixed_ctr_ctrl_mask = ~pmu->fixed_ctr_ctrl_mask;
pmu->global_ctrl = ((1ull << pmu->nr_arch_gp_counters) - 1) |
(((1ull << pmu->nr_arch_fixed_counters) - 1) << INTEL_PMC_IDX_FIXED);
pmu->global_ctrl_mask = ~pmu->global_ctrl;
--
1.8.3.1

2020-03-05 10:01:13

by Luwei Kang

Subject: [PATCH v1 07/11] KVM: x86/pmu: Expose CPUIDs feature bits PDCM, DS, DTES64

The CPUID feature bits PDCM, DS and DTES64 are required for the PEBS
feature. This patch exposes the PDCM, DS and DTES64 CPUID bits to the
guest when PEBS is supported in KVM.
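
A guest can verify the exposed bits via CPUID leaf 1 (PDCM = ECX[15],
DTES64 = ECX[2], DS = EDX[21]); a minimal user-space sketch:

        #include <cpuid.h>
        #include <stdio.h>

        int main(void)
        {
                unsigned int eax, ebx, ecx, edx;

                __get_cpuid(1, &eax, &ebx, &ecx, &edx);
                printf("PDCM:%u DTES64:%u DS:%u\n",
                       !!(ecx & (1u << 15)), !!(ecx & (1u << 2)),
                       !!(edx & (1u << 21)));
                return 0;
        }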

Originally-by: Andi Kleen <[email protected]>
Signed-off-by: Luwei Kang <[email protected]>
Co-developed-by: Kan Liang <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 2 ++
arch/x86/kvm/cpuid.c | 9 ++++++---
arch/x86/kvm/svm.c | 12 ++++++++++++
arch/x86/kvm/vmx/capabilities.h | 17 +++++++++++++++++
arch/x86/kvm/vmx/vmx.c | 2 ++
5 files changed, 39 insertions(+), 3 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 83abb49..033d9f9 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1180,6 +1180,8 @@ struct kvm_x86_ops {
bool (*umip_emulated)(void);
bool (*pt_supported)(void);
bool (*pku_supported)(void);
+ bool (*pdcm_supported)(void);
+ bool (*dtes64_supported)(void);

int (*check_nested_events)(struct kvm_vcpu *vcpu, bool external_intr);
void (*request_immediate_exit)(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index b1c4694..92dabf3 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -446,6 +446,9 @@ static inline int __do_cpuid_func(struct kvm_cpuid_entry2 *entry, u32 function,
unsigned f_rdtscp = kvm_x86_ops->rdtscp_supported() ? F(RDTSCP) : 0;
unsigned f_xsaves = kvm_x86_ops->xsaves_supported() ? F(XSAVES) : 0;
unsigned f_intel_pt = kvm_x86_ops->pt_supported() ? F(INTEL_PT) : 0;
+ unsigned int f_pdcm = kvm_x86_ops->pdcm_supported() ? F(PDCM) : 0;
+ unsigned int f_ds = kvm_x86_ops->dtes64_supported() ? F(DS) : 0;
+ unsigned int f_dtes64 = kvm_x86_ops->dtes64_supported() ? F(DTES64) : 0;

/* cpuid 1.edx */
const u32 kvm_cpuid_1_edx_x86_features =
@@ -454,7 +457,7 @@ static inline int __do_cpuid_func(struct kvm_cpuid_entry2 *entry, u32 function,
F(CX8) | F(APIC) | 0 /* Reserved */ | F(SEP) |
F(MTRR) | F(PGE) | F(MCA) | F(CMOV) |
F(PAT) | F(PSE36) | 0 /* PSN */ | F(CLFLUSH) |
- 0 /* Reserved, DS, ACPI */ | F(MMX) |
+ 0 /* Reserved */ | f_ds | 0 /* ACPI */ | F(MMX) |
F(FXSR) | F(XMM) | F(XMM2) | F(SELFSNOOP) |
0 /* HTT, TM, Reserved, PBE */;
/* cpuid 0x80000001.edx */
@@ -471,10 +474,10 @@ static inline int __do_cpuid_func(struct kvm_cpuid_entry2 *entry, u32 function,
const u32 kvm_cpuid_1_ecx_x86_features =
/* NOTE: MONITOR (and MWAIT) are emulated as NOP,
* but *not* advertised to guests via CPUID ! */
- F(XMM3) | F(PCLMULQDQ) | 0 /* DTES64, MONITOR */ |
+ F(XMM3) | F(PCLMULQDQ) | f_dtes64 | 0 /* MONITOR */ |
0 /* DS-CPL, VMX, SMX, EST */ |
0 /* TM2 */ | F(SSSE3) | 0 /* CNXT-ID */ | 0 /* Reserved */ |
- F(FMA) | F(CX16) | 0 /* xTPR Update, PDCM */ |
+ F(FMA) | F(CX16) | 0 /* xTPR Update */ | f_pdcm |
F(PCID) | 0 /* Reserved, DCA */ | F(XMM4_1) |
F(XMM4_2) | F(X2APIC) | F(MOVBE) | F(POPCNT) |
0 /* Reserved*/ | F(AES) | F(XSAVE) | 0 /* OSXSAVE */ | F(AVX) |
diff --git a/arch/x86/kvm/svm.c b/arch/x86/kvm/svm.c
index 24c0b2b..984ab6c 100644
--- a/arch/x86/kvm/svm.c
+++ b/arch/x86/kvm/svm.c
@@ -6123,6 +6123,16 @@ static bool svm_pku_supported(void)
return false;
}

+static bool svm_pdcm_supported(void)
+{
+ return false;
+}
+
+static bool svm_dtes64_supported(void)
+{
+ return false;
+}
+
#define PRE_EX(exit) { .exit_code = (exit), \
.stage = X86_ICPT_PRE_EXCEPT, }
#define POST_EX(exit) { .exit_code = (exit), \
@@ -7485,6 +7495,8 @@ static void svm_pre_update_apicv_exec_ctrl(struct kvm *kvm, bool activate)
.umip_emulated = svm_umip_emulated,
.pt_supported = svm_pt_supported,
.pku_supported = svm_pku_supported,
+ .pdcm_supported = svm_pdcm_supported,
+ .dtes64_supported = svm_dtes64_supported,

.set_supported_cpuid = svm_set_supported_cpuid,

diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index f486e26..9e352b5 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -5,6 +5,7 @@
#include <asm/vmx.h>

#include "lapic.h"
+#include "pmu.h"

extern bool __read_mostly enable_vpid;
extern bool __read_mostly flexpriority_enabled;
@@ -151,6 +152,22 @@ static inline bool vmx_pku_supported(void)
return boot_cpu_has(X86_FEATURE_PKU);
}

+static inline bool vmx_pdcm_supported(void)
+{
+ if (kvm_x86_ops->pmu_ops->is_pebs_via_ds_supported)
+ return kvm_x86_ops->pmu_ops->is_pebs_via_ds_supported();
+
+ return false;
+}
+
+static inline bool vmx_dtes64_supported(void)
+{
+ if (kvm_x86_ops->pmu_ops->is_pebs_via_ds_supported)
+ return kvm_x86_ops->pmu_ops->is_pebs_via_ds_supported();
+
+ return false;
+}
+
static inline bool cpu_has_vmx_rdtscp(void)
{
return vmcs_config.cpu_based_2nd_exec_ctrl &
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index 40b1e61..cef7089 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -7951,6 +7951,8 @@ static bool vmx_check_apicv_inhibit_reasons(ulong bit)
.umip_emulated = vmx_umip_emulated,
.pt_supported = vmx_pt_supported,
.pku_supported = vmx_pku_supported,
+ .pdcm_supported = vmx_pdcm_supported,
+ .dtes64_supported = vmx_dtes64_supported,

.request_immediate_exit = vmx_request_immediate_exit,

--
1.8.3.1

2020-03-05 10:01:45

by Luwei Kang

Subject: [PATCH v1 08/11] KVM: x86/pmu: PEBS MSRs emulation

This patch implements emulation of the PEBS MSRs in KVM, including
IA32_PEBS_ENABLE and IA32_DS_AREA.

The IA32_DS_AREA register is added to the MSR-load list when PEBS is
enabled in the KVM guest, so that the guest's DS value is loaded into the
real hardware before VM-entry, and it is removed when PEBS is disabled.
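
From the guest's point of view the programming model is unchanged from
bare metal; the guest DS/perf code simply writes the MSRs as usual
(sketch, with ds_mgmt_area and pebs_enable_bits as placeholder names):

        /* Guest side: point DS_AREA at the DS management area, then enable PEBS. */
        wrmsrl(MSR_IA32_DS_AREA, (unsigned long)ds_mgmt_area);
        wrmsrl(MSR_IA32_PEBS_ENABLE, pebs_enable_bits);  /* e.g. bit 0 for GP counter 0 */

The trapped IA32_PEBS_ENABLE write is what calls pebs_enable_changed()
below and adds or removes IA32_DS_AREA on the VM-entry MSR-load list.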

Originally-by: Andi Kleen <[email protected]>
Signed-off-by: Luwei Kang <[email protected]>
Co-developed-by: Kan Liang <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 4 ++++
arch/x86/kvm/vmx/pmu_intel.c | 45 +++++++++++++++++++++++++++++++++++++++++
arch/x86/kvm/vmx/vmx.c | 4 ++--
arch/x86/kvm/vmx/vmx.h | 4 ++++
arch/x86/kvm/x86.c | 7 +++++++
5 files changed, 62 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 033d9f9..33b990b 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -479,6 +479,8 @@ struct kvm_pmu {
u64 global_ovf_ctrl_mask;
u64 reserved_bits;
u64 pebs_enable;
+ u64 pebs_enable_mask;
+ u64 ds_area;
u8 version;
struct kvm_pmc gp_counters[INTEL_PMC_MAX_GENERIC];
struct kvm_pmc fixed_counters[INTEL_PMC_MAX_FIXED];
@@ -493,6 +495,8 @@ struct kvm_pmu {
*/
bool need_cleanup;

+ bool has_pebs_via_ds;
+
/*
* The total number of programmed perf_events and it helps to avoid
* redundant check before cleanup if guest don't use vPMU at all.
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index a67bd34..227589a 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -67,6 +67,21 @@ static void global_ctrl_changed(struct kvm_pmu *pmu, u64 data)
reprogram_counter(pmu, bit);
}

+static void pebs_enable_changed(struct kvm_pmu *pmu, u64 data)
+{
+ struct vcpu_vmx *vmx = to_vmx(pmu_to_vcpu(pmu));
+ u64 host_ds_area;
+
+ if (data) {
+ rdmsrl_safe(MSR_IA32_DS_AREA, &host_ds_area);
+ add_atomic_switch_msr(vmx, MSR_IA32_DS_AREA,
+ pmu->ds_area, host_ds_area, false);
+ } else
+ clear_atomic_switch_msr(vmx, MSR_IA32_DS_AREA);
+
+ pmu->pebs_enable = data;
+}
+
static unsigned intel_find_arch_event(struct kvm_pmu *pmu,
u8 event_select,
u8 unit_mask)
@@ -163,6 +178,10 @@ static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
ret = pmu->version > 1;
break;
+ case MSR_IA32_DS_AREA:
+ case MSR_IA32_PEBS_ENABLE:
+ ret = pmu->has_pebs_via_ds;
+ break;
default:
ret = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0) ||
get_gp_pmc(pmu, msr, MSR_P6_EVNTSEL0) ||
@@ -219,6 +238,12 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *data)
case MSR_CORE_PERF_GLOBAL_OVF_CTRL:
*data = pmu->global_ovf_ctrl;
return 0;
+ case MSR_IA32_PEBS_ENABLE:
+ *data = pmu->pebs_enable;
+ return 0;
+ case MSR_IA32_DS_AREA:
+ *data = pmu->ds_area;
+ return 0;
default:
if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0))) {
u64 val = pmc_read_counter(pmc);
@@ -275,6 +300,17 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
return 0;
}
break;
+ case MSR_IA32_PEBS_ENABLE:
+ if (pmu->pebs_enable == data)
+ return 0;
+ if (!(data & pmu->pebs_enable_mask)) {
+ pebs_enable_changed(pmu, data);
+ return 0;
+ }
+ break;
+ case MSR_IA32_DS_AREA:
+ pmu->ds_area = data;
+ return 0;
default:
if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0))) {
if (!msr_info->host_initiated)
@@ -351,6 +387,15 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
pmu->global_ovf_ctrl_mask &=
~MSR_CORE_PERF_GLOBAL_OVF_CTRL_TRACE_TOPA_PMI;

+ entry = kvm_find_cpuid_entry(vcpu, 1, 0);
+ if (entry && (entry->ecx & X86_FEATURE_DTES64) &&
+ (entry->ecx & X86_FEATURE_PDCM) &&
+ (entry->edx & X86_FEATURE_DS) &&
+ intel_is_pebs_via_ds_supported()) {
+ pmu->has_pebs_via_ds = 1;
+ pmu->pebs_enable_mask = ~pmu->global_ctrl;
+ }
+
entry = kvm_find_cpuid_entry(vcpu, 7, 0);
if (entry &&
(boot_cpu_has(X86_FEATURE_HLE) || boot_cpu_has(X86_FEATURE_RTM)) &&
diff --git a/arch/x86/kvm/vmx/vmx.c b/arch/x86/kvm/vmx/vmx.c
index cef7089..c6d9a87 100644
--- a/arch/x86/kvm/vmx/vmx.c
+++ b/arch/x86/kvm/vmx/vmx.c
@@ -864,7 +864,7 @@ int vmx_find_msr_index(struct vmx_msrs *m, u32 msr)
return -ENOENT;
}

-static void clear_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr)
+void clear_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr)
{
int i;
struct msr_autoload *m = &vmx->msr_autoload;
@@ -916,7 +916,7 @@ static void add_atomic_switch_msr_special(struct vcpu_vmx *vmx,
vm_exit_controls_setbit(vmx, exit);
}

-static void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
+void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
u64 guest_val, u64 host_val, bool entry_only)
{
int i, j = 0;
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index e64da06..ea899e7 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -536,4 +536,8 @@ static inline bool vmx_has_waitpkg(struct vcpu_vmx *vmx)

void dump_vmcs(void);

+extern void clear_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr);
+extern void add_atomic_switch_msr(struct vcpu_vmx *vmx, unsigned msr,
+ u64 guest_val, u64 host_val, bool entry_only);
+
#endif /* __KVM_X86_VMX_H */
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5de2006..7a23406 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1193,6 +1193,7 @@ bool kvm_rdpmc(struct kvm_vcpu *vcpu)
MSR_ARCH_PERFMON_EVENTSEL0 + 12, MSR_ARCH_PERFMON_EVENTSEL0 + 13,
MSR_ARCH_PERFMON_EVENTSEL0 + 14, MSR_ARCH_PERFMON_EVENTSEL0 + 15,
MSR_ARCH_PERFMON_EVENTSEL0 + 16, MSR_ARCH_PERFMON_EVENTSEL0 + 17,
+ MSR_IA32_PEBS_ENABLE, MSR_IA32_DS_AREA,
};

static u32 msrs_to_save[ARRAY_SIZE(msrs_to_save_all)];
@@ -5263,6 +5264,12 @@ static void kvm_init_msr_list(void)
if (!kvm_x86_ops->rdtscp_supported())
continue;
break;
+ case MSR_IA32_PEBS_ENABLE:
+ case MSR_IA32_DS_AREA:
+ if (!kvm_x86_ops->pmu_ops ||
+ !kvm_x86_ops->pmu_ops->is_pebs_via_ds_supported())
+ continue;
+ break;
case MSR_IA32_RTIT_CTL:
case MSR_IA32_RTIT_STATUS:
if (!kvm_x86_ops->pt_supported())
--
1.8.3.1

2020-03-05 10:01:51

by Luwei Kang

Subject: [PATCH v1 09/11] KVM: x86/pmu: Expose PEBS feature to guest

This patch exposes to the KVM guest some MSR bits related to the PEBS
feature, including bits in IA32_PERF_CAPABILITIES and IA32_MISC_ENABLE.
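
With these bits in place the guest can check PEBS availability the same
way native perf does; a rough guest-kernel sketch:

        u64 caps, misc;

        rdmsrl(MSR_IA32_PERF_CAPABILITIES, caps);
        rdmsrl(MSR_IA32_MISC_ENABLE, misc);

        if (!(misc & MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL) &&
            (caps & PERF_CAP_PEBS_FORMAT))      /* bits 11:8, non-zero on Icelake */
                pr_info("PEBS usable, record format %llu\n",
                        (caps & PERF_CAP_PEBS_FORMAT) >> 8);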

Originally-by: Andi Kleen <[email protected]>
Signed-off-by: Luwei Kang <[email protected]>
Co-developed-by: Kan Liang <[email protected]>
Signed-off-by: Kan Liang <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/include/asm/msr-index.h | 3 +++
arch/x86/kvm/vmx/pmu_intel.c | 15 +++++++++++++++
arch/x86/kvm/x86.c | 6 +++++-
4 files changed, 24 insertions(+), 1 deletion(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 33b990b..35d230e 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -481,6 +481,7 @@ struct kvm_pmu {
u64 pebs_enable;
u64 pebs_enable_mask;
u64 ds_area;
+ u64 perf_cap;
u8 version;
struct kvm_pmc gp_counters[INTEL_PMC_MAX_GENERIC];
struct kvm_pmc fixed_counters[INTEL_PMC_MAX_FIXED];
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index d5e517d..2bf66e9 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -151,6 +151,9 @@
#define MSR_PEBS_DATA_CFG 0x000003f2
#define MSR_IA32_DS_AREA 0x00000600
#define MSR_IA32_PERF_CAPABILITIES 0x00000345
+#define PERF_CAP_PEBS_TRAP BIT_ULL(6)
+#define PERF_CAP_ARCH_REG BIT_ULL(7)
+#define PERF_CAP_PEBS_FORMAT 0xf00
#define MSR_PEBS_LD_LAT_THRESHOLD 0x000003f6

#define MSR_IA32_RTIT_CTL 0x00000570
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 227589a..8161488 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -180,6 +180,7 @@ static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
break;
case MSR_IA32_DS_AREA:
case MSR_IA32_PEBS_ENABLE:
+ case MSR_IA32_PERF_CAPABILITIES:
ret = pmu->has_pebs_via_ds;
break;
default:
@@ -244,6 +245,9 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *data)
case MSR_IA32_DS_AREA:
*data = pmu->ds_area;
return 0;
+ case MSR_IA32_PERF_CAPABILITIES:
+ *data = pmu->perf_cap;
+ return 0;
default:
if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0))) {
u64 val = pmc_read_counter(pmc);
@@ -311,6 +315,8 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
case MSR_IA32_DS_AREA:
pmu->ds_area = data;
return 0;
+ case MSR_IA32_PERF_CAPABILITIES:
+ break; /* RO MSR */
default:
if ((pmc = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0))) {
if (!msr_info->host_initiated)
@@ -396,6 +402,15 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
pmu->pebs_enable_mask = ~pmu->global_ctrl;
}

+ if (pmu->has_pebs_via_ds) {
+ u64 perf_cap;
+
+ rdmsrl(MSR_IA32_PERF_CAPABILITIES, perf_cap);
+ pmu->perf_cap = (perf_cap & (PERF_CAP_PEBS_TRAP |
+ PERF_CAP_ARCH_REG |
+ PERF_CAP_PEBS_FORMAT));
+ }
+
entry = kvm_find_cpuid_entry(vcpu, 7, 0);
if (entry &&
(boot_cpu_has(X86_FEATURE_HLE) || boot_cpu_has(X86_FEATURE_RTM)) &&
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 7a23406..5ab8447 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -3086,7 +3086,11 @@ int kvm_get_msr_common(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
msr_info->data = (u64)vcpu->arch.ia32_tsc_adjust_msr;
break;
case MSR_IA32_MISC_ENABLE:
- msr_info->data = vcpu->arch.ia32_misc_enable_msr;
+ if (vcpu_to_pmu(vcpu)->has_pebs_via_ds)
+ msr_info->data = (vcpu->arch.ia32_misc_enable_msr &
+ ~MSR_IA32_MISC_ENABLE_PEBS_UNAVAIL);
+ else
+ msr_info->data = vcpu->arch.ia32_misc_enable_msr;
break;
case MSR_IA32_SMBASE:
if (!msr_info->host_initiated)
--
1.8.3.1

2020-03-05 10:02:05

by Luwei Kang

Subject: [PATCH v1 11/11] KVM: x86/pmu: Adaptive PEBS virtualization enabling

The PEBS feature enables the collection of GPRs, the eventing IP, TSC and
memory-access-related information. On Icelake it has been enhanced to
collect more CPU state information, such as XMM register values and LBR TO
and FROM addresses, as per customer usage requests. With the addition of
these new groups of data, the PEBS record size is greatly increased.
Adaptive PEBS gives software the capability to configure the PEBS records
to capture only the data of interest, keeping the record size compact. By
default, a PEBS record will only contain the Basic group. Optionally, each
counter can be configured to generate PEBS records with the groups
specified in MSR_PEBS_DATA_CFG.
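
For reference, the group-select bits in MSR_PEBS_DATA_CFG (as defined in
the host's arch/x86/include/asm/perf_event.h) are: bit 0 Memory Info,
bit 1 GPRs, bit 2 XMMs, bit 3 LBRs, with bits 31:24 holding the number of
LBR entries. A guest that wants GPRs and XMM values on top of the Basic
group would, for example, write:

        wrmsrl(MSR_PEBS_DATA_CFG, PEBS_DATACFG_GP | PEBS_DATACFG_XMMS);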

This patch implements adaptive PEBS virtualization enabling in the KVM
guest, including feature detection, MSR emulation and capability exposure.

Signed-off-by: Luwei Kang <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 3 +++
arch/x86/include/asm/msr-index.h | 1 +
arch/x86/kvm/pmu.h | 1 +
arch/x86/kvm/vmx/pmu_intel.c | 46 ++++++++++++++++++++++++++++++++++++++--
arch/x86/kvm/x86.c | 6 ++++++
5 files changed, 55 insertions(+), 2 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 6f82fb7..7b0a023 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -482,6 +482,8 @@ struct kvm_pmu {
u64 pebs_enable;
u64 pebs_enable_mask;
u64 ds_area;
+ u64 pebs_data_cfg;
+ u64 pebs_data_cfg_mask;
u64 perf_cap;
u8 version;
struct kvm_pmc gp_counters[INTEL_PMC_MAX_GENERIC];
@@ -498,6 +500,7 @@ struct kvm_pmu {
bool need_cleanup;

bool has_pebs_via_ds;
+ bool has_pebs_adaptive;

/*
* The total number of programmed perf_events and it helps to avoid
diff --git a/arch/x86/include/asm/msr-index.h b/arch/x86/include/asm/msr-index.h
index 2bf66e9..d3d6e48 100644
--- a/arch/x86/include/asm/msr-index.h
+++ b/arch/x86/include/asm/msr-index.h
@@ -154,6 +154,7 @@
#define PERF_CAP_PEBS_TRAP BIT_ULL(6)
#define PERF_CAP_ARCH_REG BIT_ULL(7)
#define PERF_CAP_PEBS_FORMAT 0xf00
+#define PERF_CAP_PEBS_BASELINE BIT_ULL(14)
#define MSR_PEBS_LD_LAT_THRESHOLD 0x000003f6

#define MSR_IA32_RTIT_CTL 0x00000570
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 476780b..9de6ef1 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -33,6 +33,7 @@ struct kvm_pmu_ops {
int (*is_valid_rdpmc_ecx)(struct kvm_vcpu *vcpu, unsigned int idx);
bool (*is_valid_msr)(struct kvm_vcpu *vcpu, u32 msr);
bool (*is_pebs_via_ds_supported)(void);
+ bool (*is_pebs_baseline_supported)(void);
int (*get_msr)(struct kvm_vcpu *vcpu, u32 msr, u64 *data);
int (*set_msr)(struct kvm_vcpu *vcpu, struct msr_data *msr_info);
void (*refresh)(struct kvm_vcpu *vcpu);
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 578b830..6a0eef3 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -70,14 +70,21 @@ static void global_ctrl_changed(struct kvm_pmu *pmu, u64 data)
static void pebs_enable_changed(struct kvm_pmu *pmu, u64 data)
{
struct vcpu_vmx *vmx = to_vmx(pmu_to_vcpu(pmu));
- u64 host_ds_area;
+ u64 host_ds_area, host_pebs_data_cfg;

if (data) {
rdmsrl_safe(MSR_IA32_DS_AREA, &host_ds_area);
add_atomic_switch_msr(vmx, MSR_IA32_DS_AREA,
pmu->ds_area, host_ds_area, false);
- } else
+
+ rdmsrl_safe(MSR_PEBS_DATA_CFG, &host_pebs_data_cfg);
+ add_atomic_switch_msr(vmx, MSR_PEBS_DATA_CFG,
+ pmu->pebs_data_cfg, host_pebs_data_cfg, false);
+
+ } else {
clear_atomic_switch_msr(vmx, MSR_IA32_DS_AREA);
+ clear_atomic_switch_msr(vmx, MSR_PEBS_DATA_CFG);
+ }

pmu->pebs_enable = data;
}
@@ -183,6 +190,9 @@ static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
case MSR_IA32_PERF_CAPABILITIES:
ret = pmu->has_pebs_via_ds;
break;
+ case MSR_PEBS_DATA_CFG:
+ ret = pmu->has_pebs_adaptive;
+ break;
default:
ret = get_gp_pmc(pmu, msr, MSR_IA32_PERFCTR0) ||
get_gp_pmc(pmu, msr, MSR_P6_EVNTSEL0) ||
@@ -209,6 +219,18 @@ static bool intel_is_pebs_via_ds_supported(void)
return true;
}

+static bool intel_is_pebs_baseline_supported(void)
+{
+ u64 perf_cap;
+
+ rdmsrl(MSR_IA32_PERF_CAPABILITIES, perf_cap);
+ if (intel_is_pebs_via_ds_supported() &&
+ (perf_cap & PERF_CAP_PEBS_BASELINE))
+ return true;
+
+ return false;
+}
+
static struct kvm_pmc *intel_msr_idx_to_pmc(struct kvm_vcpu *vcpu, u32 msr)
{
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
@@ -245,6 +267,9 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, u32 msr, u64 *data)
case MSR_IA32_DS_AREA:
*data = pmu->ds_area;
return 0;
+ case MSR_PEBS_DATA_CFG:
+ *data = pmu->pebs_data_cfg;
+ return 0;
case MSR_IA32_PERF_CAPABILITIES:
*data = pmu->perf_cap;
return 0;
@@ -315,6 +340,12 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
case MSR_IA32_DS_AREA:
pmu->ds_area = data;
return 0;
+ case MSR_PEBS_DATA_CFG:
+ if (!(data & pmu->pebs_data_cfg_mask)) {
+ pmu->pebs_data_cfg = data;
+ return 0;
+ }
+ break;
case MSR_IA32_PERF_CAPABILITIES:
break; /* RO MSR */
default:
@@ -414,6 +445,16 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
pmu->perf_cap = (perf_cap & (PERF_CAP_PEBS_TRAP |
PERF_CAP_ARCH_REG |
PERF_CAP_PEBS_FORMAT));
+
+ if (perf_cap & PERF_CAP_PEBS_BASELINE) {
+ pmu->has_pebs_adaptive = 1;
+ pmu->perf_cap |= PERF_CAP_PEBS_BASELINE;
+ pmu->pebs_data_cfg_mask = ~0xff00000full;
+ pmu->reserved_bits &= ~ICL_EVENTSEL_ADAPTIVE;
+ for (i = 0; i < pmu->nr_arch_fixed_counters; i++)
+ pmu->fixed_ctr_ctrl_mask &= ~(1ULL <<
+ (INTEL_PMC_IDX_FIXED + i * 4));
+ }
}

entry = kvm_find_cpuid_entry(vcpu, 7, 0);
@@ -484,6 +525,7 @@ struct kvm_pmu_ops intel_pmu_ops = {
.is_valid_rdpmc_ecx = intel_is_valid_rdpmc_ecx,
.is_valid_msr = intel_is_valid_msr,
.is_pebs_via_ds_supported = intel_is_pebs_via_ds_supported,
+ .is_pebs_baseline_supported = intel_is_pebs_baseline_supported,
.get_msr = intel_pmu_get_msr,
.set_msr = intel_pmu_set_msr,
.refresh = intel_pmu_refresh,
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 5ab8447..aa1344b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1194,6 +1194,7 @@ bool kvm_rdpmc(struct kvm_vcpu *vcpu)
MSR_ARCH_PERFMON_EVENTSEL0 + 14, MSR_ARCH_PERFMON_EVENTSEL0 + 15,
MSR_ARCH_PERFMON_EVENTSEL0 + 16, MSR_ARCH_PERFMON_EVENTSEL0 + 17,
MSR_IA32_PEBS_ENABLE, MSR_IA32_DS_AREA,
+ MSR_PEBS_DATA_CFG,
};

static u32 msrs_to_save[ARRAY_SIZE(msrs_to_save_all)];
@@ -5274,6 +5275,11 @@ static void kvm_init_msr_list(void)
!kvm_x86_ops->pmu_ops->is_pebs_via_ds_supported())
continue;
break;
+ case MSR_PEBS_DATA_CFG:
+ if (!kvm_x86_ops->pmu_ops ||
+ !kvm_x86_ops->pmu_ops->is_pebs_baseline_supported())
+ continue;
+ break;
case MSR_IA32_RTIT_CTL:
case MSR_IA32_RTIT_STATUS:
if (!kvm_x86_ops->pt_supported())
--
1.8.3.1

2020-03-05 16:53:13

by Paolo Bonzini

Subject: Re: [PATCH v1 00/11] PEBS virtualization enabling via DS

On 05/03/20 18:56, Luwei Kang wrote:
> BTW:
> The PEBS virtualization via Intel PT patch set v1 has been posted, and its
> later versions will be based on this patch set.
> https://lkml.kernel.org/r/[email protected]/

Thanks, I'll review both.

Paolo

2020-03-05 22:50:20

by Andi Kleen

Subject: Re: [PATCH v1 00/11] PEBS virtualization enabling via DS

> Testing:
> The guest can use PEBS feature like native. e.g.

Could you please add example qemu command lines too? That will make it much easier
for someone to reproduce.

-Andi
>
> # perf record -e instructions:ppp ./br_instr a
>
> perf report on guest:
> # Samples: 2K of event 'instructions:ppp', # Event count (approx.): 1473377250
> # Overhead Command Shared Object Symbol
> 57.74% br_instr br_instr [.] lfsr_cond
> 41.40% br_instr br_instr [.] cmp_end
> 0.21% br_instr [kernel.kallsyms] [k] __lock_acquire
>
> perf report on host:
> # Samples: 2K of event 'instructions:ppp', # Event count (approx.): 1462721386
> # Overhead Command Shared Object Symbol
> 57.90% br_instr br_instr [.] lfsr_cond
> 41.95% br_instr br_instr [.] cmp_end
> 0.05% br_instr [kernel.vmlinux] [k] lock_acquire

2020-03-06 05:38:05

by Luwei Kang

Subject: RE: [PATCH v1 00/11] PEBS virtualization enabling via DS

> Subject: Re: [PATCH v1 00/11] PEBS virtualization enabling via DS
>
> > Testing:
> > The guest can use PEBS feature like native. e.g.
>
> Could you please add example qemu command lines too? That will make it
> much easier for someone to reproduce.

I introduced a new CPU parameter "pebs" to enable PEBS in the KVM guest
(disabled by default), e.g.
"qemu-system-x86_64 -enable-kvm -M q35 -cpu Icelake-Server,pmu=true,pebs=true ...."

[PATCH v1 0/3] PEBS virtualization enabling via DS in Qemu
https://lore.kernel.org/qemu-devel/[email protected]/

Thanks,
Luwei Kang

>
> -Andi
> >
> > # perf record -e instructions:ppp ./br_instr a
> >
> > perf report on guest:
> > # Samples: 2K of event 'instructions:ppp', # Event count (approx.):
> 1473377250
> > # Overhead Command Shared Object Symbol
> > 57.74% br_instr br_instr [.] lfsr_cond
> > 41.40% br_instr br_instr [.] cmp_end
> > 0.21% br_instr [kernel.kallsyms] [k] __lock_acquire
> >
> > perf report on host:
> > # Samples: 2K of event 'instructions:ppp', # Event count (approx.):
> 1462721386
> > # Overhead Command Shared Object Symbol
> > 57.90% br_instr br_instr [.] lfsr_cond
> > 41.95% br_instr br_instr [.] cmp_end
> > 0.05% br_instr [kernel.vmlinux] [k] lock_acquire