2021-11-12 09:51:54

by Like Xu

Subject: [PATCH 0/7] KVM: x86/pmu: Four functional fixes

Hi,

The first one (patch 01) fixes my earlier sloppy code for the disallowed
fixed ctr3.

The second one (patch 02) fixes the long-standing inconsistent behaviour
around CPUID 0AH.EBX.

The third one (patch 03/04) avoids perf_event creation for
unavailable Intel CPUID events.

Finally, a new way is proposed to fix amd_event_mapping[] for new
AMD platforms.

Please check each commit message for more details,
and let me know if there is any room for improvement.

Thanks.

Like Xu (7):
KVM: x86/pmu: Make top-down.slots event unavailable in supported leaf
KVM: x86/pmu: Fix available_event_types check for REF_CPU_CYCLES event
KVM: x86/pmu: Pass "struct kvm_pmu *" to the find_fixed_event()
KVM: x86/pmu: Avoid perf_event creation for invalid counter config
KVM: x86/pmu: Refactor pmu->available_event_types field using BITMAP
perf: x86/core: Add interface to query perfmon_event_map[] directly
KVM: x86/pmu: Setup the {intel|amd}_event_mapping[] when hardware_setup

arch/x86/events/core.c | 9 +++
arch/x86/include/asm/kvm_host.h | 2 +-
arch/x86/include/asm/perf_event.h | 5 ++
arch/x86/kvm/cpuid.c | 14 ++++
arch/x86/kvm/pmu.c | 35 +++++++++-
arch/x86/kvm/pmu.h | 4 +-
arch/x86/kvm/svm/pmu.c | 24 ++-----
arch/x86/kvm/vmx/pmu_intel.c | 106 +++++++++++++++++++++++-------
arch/x86/kvm/x86.c | 1 +
9 files changed, 153 insertions(+), 47 deletions(-)

--
2.33.0



2021-11-12 09:51:56

by Like Xu

Subject: [PATCH 1/7] KVM: x86/pmu: Make top-down.slots event unavailable in supported leaf

From: Like Xu <[email protected]>

When we choose to disable the fourth fixed counter TOPDOWN.SLOTS,
we also need to reduce the length of the 0AH.EBX bit vector, which
enumerates architectural performance monitoring events, and set
0AH.EBX.[bit 7] to 1 if the new value of EAX[31:24] is still > 7.

Fixes: 2e8cd7a3b8287 ("kvm: x86: limit the maximum number of vPMU fixed counters to 3")
Signed-off-by: Like Xu <[email protected]>
---
arch/x86/kvm/cpuid.c | 14 ++++++++++++++
1 file changed, 14 insertions(+)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index 2d70edb0f323..bbf8cf3f43b0 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -746,6 +746,20 @@ static inline int __do_cpuid_func(struct kvm_cpuid_array *array, u32 function)
eax.split.mask_length = cap.events_mask_len;

edx.split.num_counters_fixed = min(cap.num_counters_fixed, MAX_FIXED_COUNTERS);
+
+ /*
+ * The 8th architectural event (top-down slots) will be supported
+ * if the 4th fixed counter exists && EAX[31:24] > 7 && EBX[7] = 0.
+ *
+ * For now, KVM needs to make this event unavailable.
+ */
+ if (edx.split.num_counters_fixed < 4) {
+ if (eax.split.mask_length > 7)
+ eax.split.mask_length--;
+ if (eax.split.mask_length > 7)
+ cap.events_mask |= BIT_ULL(7);
+ }
+
edx.split.bit_width_fixed = cap.bit_width_fixed;
if (cap.version)
edx.split.anythread_deprecated = 1;
--
2.33.0
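
The adjustment made by the hunk above can be tried outside the kernel with a small userspace sketch. The struct below is a hypothetical stand-in for the three CPUID.0AH fields involved (it is not the kernel's union cpuid10_eax/edx), and hide_topdown_slots() is an illustrative name; only the two-step "shrink, then mark bit 7" logic is taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the CPUID.0AH fields the patch touches. */
struct pmu_leaf {
	uint8_t  mask_length;        /* EAX[31:24]: length of the EBX bit vector */
	uint32_t events_mask;        /* EBX: a set bit means "event NOT available" */
	uint8_t  num_counters_fixed; /* EDX[4:0] */
};

/* Hide the 8th architectural event when the 4th fixed counter is absent. */
static void hide_topdown_slots(struct pmu_leaf *leaf)
{
	assert(leaf);

	if (leaf->num_counters_fixed >= 4)
		return;

	/* First try to drop bit 7 from the enumerated vector... */
	if (leaf->mask_length > 7)
		leaf->mask_length--;
	/* ...and if the vector still covers bit 7, flag it as unavailable. */
	if (leaf->mask_length > 7)
		leaf->events_mask |= UINT32_C(1) << 7;
}
```

With three fixed counters, a mask length of 8 shrinks to 7 and EBX is left alone, while a mask length of 9 shrinks to 8 and bit 7 of EBX is set.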


2021-11-12 09:51:58

by Like Xu

Subject: [PATCH 2/7] KVM: x86/pmu: Fix available_event_types check for REF_CPU_CYCLES event

From: Like Xu <[email protected]>

In the CPUID 0x0A.EBX bit vector, bit [7] corresponds to the Intel
architectural performance event "Topdown Slots", which is not yet
implemented, rather than to the *kernel* generic common hardware event
"REF_CPU_CYCLES". So we can skip the CPUID unavailability check in
intel_find_arch_event() for the last REF_CPU_CYCLES entry and update
the confusing comment.

Fixes: 62079d8a43128 ("KVM: PMU: add proper support for fixed counter 2")
Signed-off-by: Like Xu <[email protected]>
---
arch/x86/kvm/vmx/pmu_intel.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index b8e0d21b7c8a..bc6845265362 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -21,7 +21,6 @@
#define MSR_PMC_FULL_WIDTH_BIT (MSR_IA32_PMC0 - MSR_IA32_PERFCTR0)

static struct kvm_event_hw_type_mapping intel_arch_events[] = {
- /* Index must match CPUID 0x0A.EBX bit vector */
[0] = { 0x3c, 0x00, PERF_COUNT_HW_CPU_CYCLES },
[1] = { 0xc0, 0x00, PERF_COUNT_HW_INSTRUCTIONS },
[2] = { 0x3c, 0x01, PERF_COUNT_HW_BUS_CYCLES },
@@ -29,6 +28,7 @@ static struct kvm_event_hw_type_mapping intel_arch_events[] = {
[4] = { 0x2e, 0x41, PERF_COUNT_HW_CACHE_MISSES },
[5] = { 0xc4, 0x00, PERF_COUNT_HW_BRANCH_INSTRUCTIONS },
[6] = { 0xc5, 0x00, PERF_COUNT_HW_BRANCH_MISSES },
+ /* The above index must match CPUID 0x0A.EBX bit vector */
[7] = { 0x00, 0x03, PERF_COUNT_HW_REF_CPU_CYCLES },
};

@@ -75,9 +75,9 @@ static unsigned intel_find_arch_event(struct kvm_pmu *pmu,
int i;

for (i = 0; i < ARRAY_SIZE(intel_arch_events); i++)
- if (intel_arch_events[i].eventsel == event_select
- && intel_arch_events[i].unit_mask == unit_mask
- && (pmu->available_event_types & (1 << i)))
+ if (intel_arch_events[i].eventsel == event_select &&
+ intel_arch_events[i].unit_mask == unit_mask &&
+ ((i > 6) || pmu->available_event_types & (1 << i)))
break;

if (i == ARRAY_SIZE(intel_arch_events))
--
2.33.0
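
As a rough userspace illustration of the lookup after this patch: only table entries 0..6 are gated by the availability mask, and entry 7 is always eligible. The event encodings are copied from the intel_arch_events[] table in the diff; the names find_arch_event_idx(), NOT_FOUND and NR_CPUID_EVENTS are made up for this sketch, and it returns a table index rather than a perf_hw_id.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUID_EVENTS 7	/* entries 0..6 mirror CPUID 0x0A.EBX bits */
#define NOT_FOUND	(-1)

struct event_map {
	uint8_t eventsel;
	uint8_t unit_mask;
};

static const struct event_map arch_events[] = {
	{ 0x3c, 0x00 },	/* [0] cpu cycles */
	{ 0xc0, 0x00 },	/* [1] instructions */
	{ 0x3c, 0x01 },	/* [2] bus cycles */
	{ 0x2e, 0x4f },	/* [3] cache references */
	{ 0x2e, 0x41 },	/* [4] cache misses */
	{ 0xc4, 0x00 },	/* [5] branch instructions */
	{ 0xc5, 0x00 },	/* [6] branch misses */
	{ 0x00, 0x03 },	/* [7] ref cpu cycles: kernel-generic, no CPUID bit */
};

/* "available" has bit i set when CPUID event i is usable by the guest. */
static int find_arch_event_idx(uint32_t available, uint8_t eventsel,
			       uint8_t unit_mask)
{
	size_t i;

	for (i = 0; i < sizeof(arch_events) / sizeof(arch_events[0]); i++) {
		if (arch_events[i].eventsel != eventsel ||
		    arch_events[i].unit_mask != unit_mask)
			continue;
		if (i >= NR_CPUID_EVENTS || (available & (UINT32_C(1) << i)))
			return (int)i;
	}
	return NOT_FOUND;
}

/* Usage: ref-cpu-cycles is found even when no CPUID event is available. */
void arch_event_demo(void)
{
	assert(find_arch_event_idx(0x00, 0x00, 0x03) == 7);
	assert(find_arch_event_idx(0x7d, 0xc0, 0x00) == NOT_FOUND);
}
```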


2021-11-12 09:52:05

by Like Xu

Subject: [PATCH 3/7] KVM: x86/pmu: Pass "struct kvm_pmu *" to the find_fixed_event()

From: Like Xu <[email protected]>

KVM userspace may mark some hw events (including cpu-cycles,
instructions, ref-cpu-cycles) as unavailable by setting bits in the
guest CPUID 0AH.EBX leaf, but the corresponding counters are still
accessible.

As preparation for fixing this, pass the pmu to find_fixed_event() so
that it can check pmu->available_event_types, just as find_arch_event()
already can.

Signed-off-by: Like Xu <[email protected]>
---
arch/x86/kvm/pmu.c | 3 ++-
arch/x86/kvm/pmu.h | 2 +-
arch/x86/kvm/svm/pmu.c | 2 +-
arch/x86/kvm/vmx/pmu_intel.c | 2 +-
4 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 0772bad9165c..7093fc70cd38 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -245,6 +245,7 @@ void reprogram_fixed_counter(struct kvm_pmc *pmc, u8 ctrl, int idx)
bool pmi = ctrl & 0x8;
struct kvm_pmu_event_filter *filter;
struct kvm *kvm = pmc->vcpu->kvm;
+ struct kvm_pmu *pmu = pmc_to_pmu(pmc);

pmc_pause_counter(pmc);

@@ -268,7 +269,7 @@ void reprogram_fixed_counter(struct kvm_pmc *pmc, u8 ctrl, int idx)

pmc->current_config = (u64)ctrl;
pmc_reprogram_counter(pmc, PERF_TYPE_HARDWARE,
- kvm_x86_ops.pmu_ops->find_fixed_event(idx),
+ kvm_x86_ops.pmu_ops->find_fixed_event(pmu, idx),
!(en_field & 0x2), /* exclude user */
!(en_field & 0x1), /* exclude kernel */
pmi, false, false);
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index 0e4f2b1fa9fb..fe29537b1343 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -26,7 +26,7 @@ struct kvm_event_hw_type_mapping {
struct kvm_pmu_ops {
unsigned (*find_arch_event)(struct kvm_pmu *pmu, u8 event_select,
u8 unit_mask);
- unsigned (*find_fixed_event)(int idx);
+ unsigned int (*find_fixed_event)(struct kvm_pmu *pmu, int idx);
bool (*pmc_is_enabled)(struct kvm_pmc *pmc);
struct kvm_pmc *(*pmc_idx_to_pmc)(struct kvm_pmu *pmu, int pmc_idx);
struct kvm_pmc *(*rdpmc_ecx_to_pmc)(struct kvm_vcpu *vcpu,
diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index fdf587f19c5f..3ee8f86d9ace 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -152,7 +152,7 @@ static unsigned amd_find_arch_event(struct kvm_pmu *pmu,
}

/* return PERF_COUNT_HW_MAX as AMD doesn't have fixed events */
-static unsigned amd_find_fixed_event(int idx)
+static unsigned int amd_find_fixed_event(struct kvm_pmu *pmu, int idx)
{
return PERF_COUNT_HW_MAX;
}
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index bc6845265362..4c04e94ae548 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -86,7 +86,7 @@ static unsigned intel_find_arch_event(struct kvm_pmu *pmu,
return intel_arch_events[i].event_type;
}

-static unsigned intel_find_fixed_event(int idx)
+static unsigned int intel_find_fixed_event(struct kvm_pmu *pmu, int idx)
{
u32 event;
size_t size = ARRAY_SIZE(fixed_pmc_events);
--
2.33.0


2021-11-12 09:52:07

by Like Xu

Subject: [PATCH 4/7] KVM: x86/pmu: Avoid perf_event creation for invalid counter config

From: Like Xu <[email protected]>

KVM needs to be fixed to avoid perf_event creation when the requested
hw event on a gp or fixed counter is marked as unavailable in the Intel
guest CPUID 0AH.EBX leaf.

It's proposed to use is_intel_cpuid_event() to distinguish whether the hw
event is an Intel pre-defined architecture event, so that we can decide to
reprogram it with PERF_TYPE_HARDWARE (for fixed and gp) or
PERF_TYPE_RAW (for gp only) perf_event, or just avoid creating perf_event.

If an Intel cpuid event is marked as unavailable according to
pmu->available_event_types, intel_find_[fixed|arch]_event() returns
a new special value, "PERF_COUNT_HW_MAX + 1", to tell the caller
to avoid creating a perf_event and not to use PERF_TYPE_RAW mode for gp.

Signed-off-by: Like Xu <[email protected]>
---
arch/x86/kvm/pmu.c | 8 +++++++
arch/x86/kvm/vmx/pmu_intel.c | 45 +++++++++++++++++++++++++++++++-----
2 files changed, 47 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 7093fc70cd38..3b47bd92e7bb 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -111,6 +111,14 @@ static void pmc_reprogram_counter(struct kvm_pmc *pmc, u32 type,
.config = config,
};

+ /*
+ * The "config > PERF_COUNT_HW_MAX" only appears when
+ * the kernel generic event is marked as unavailable
+ * in the Intel guest architecture event CPUID leaf.
+ */
+ if (type == PERF_TYPE_HARDWARE && config >= PERF_COUNT_HW_MAX)
+ return;
+
attr.sample_period = get_sample_period(pmc, pmc->counter);

if (in_tx)
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 4c04e94ae548..4f58c14efa61 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -68,17 +68,39 @@ static void global_ctrl_changed(struct kvm_pmu *pmu, u64 data)
reprogram_counter(pmu, bit);
}

+/* UMask and Event Select Encodings for Intel CPUID Events */
+static inline bool is_intel_cpuid_event(u8 event_select, u8 unit_mask)
+{
+ if ((!unit_mask && event_select == 0x3C) ||
+ (!unit_mask && event_select == 0xC0) ||
+ (unit_mask == 0x01 && event_select == 0x3C) ||
+ (unit_mask == 0x4F && event_select == 0x2E) ||
+ (unit_mask == 0x41 && event_select == 0x2E) ||
+ (!unit_mask && event_select == 0xC4) ||
+ (!unit_mask && event_select == 0xC5))
+ return true;
+
+ /* the unimplemented topdown.slots event check is skipped. */
+ return false;
+}
+
static unsigned intel_find_arch_event(struct kvm_pmu *pmu,
u8 event_select,
u8 unit_mask)
{
int i;

- for (i = 0; i < ARRAY_SIZE(intel_arch_events); i++)
- if (intel_arch_events[i].eventsel == event_select &&
- intel_arch_events[i].unit_mask == unit_mask &&
- ((i > 6) || pmu->available_event_types & (1 << i)))
- break;
+ for (i = 0; i < ARRAY_SIZE(intel_arch_events); i++) {
+ if (intel_arch_events[i].eventsel != event_select ||
+ intel_arch_events[i].unit_mask != unit_mask)
+ continue;
+
+ if (is_intel_cpuid_event(event_select, unit_mask) &&
+ !(pmu->available_event_types & BIT_ULL(i)))
+ return PERF_COUNT_HW_MAX + 1;
+
+ break;
+ }

if (i == ARRAY_SIZE(intel_arch_events))
return PERF_COUNT_HW_MAX;
@@ -90,12 +112,23 @@ static unsigned int intel_find_fixed_event(struct kvm_pmu *pmu, int idx)
{
u32 event;
size_t size = ARRAY_SIZE(fixed_pmc_events);
+ u8 event_select, unit_mask;
+ unsigned int event_type;

if (idx >= size)
return PERF_COUNT_HW_MAX;

event = fixed_pmc_events[array_index_nospec(idx, size)];
- return intel_arch_events[event].event_type;
+
+ event_select = intel_arch_events[event].eventsel;
+ unit_mask = intel_arch_events[event].unit_mask;
+ event_type = intel_arch_events[event].event_type;
+
+ if (is_intel_cpuid_event(event_select, unit_mask) &&
+ !(pmu->available_event_types & BIT_ULL(event_type)))
+ return PERF_COUNT_HW_MAX + 1;
+
+ return event_type;
}

/* check if a PMC is enabled by comparing it with globl_ctrl bits. */
--
2.33.0
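
The three outcomes described in the commit message (program a generic hardware event, fall back to a raw event, or create nothing) can be summarised for a gp counter as below. This is only a reading of the commit message, not code from the patch: the enum, the function name and the HW_* macros are invented for the sketch, and HW_MAX is set to 10 on the assumption that PERF_COUNT_HW_MAX has ten entries.

```c
#include <assert.h>

#define HW_MAX		10		/* stands in for PERF_COUNT_HW_MAX */
#define HW_UNAVAILABLE	(HW_MAX + 1)	/* the new "do not create" sentinel */

enum gp_action {
	GP_USE_HARDWARE,	/* known generic event: PERF_TYPE_HARDWARE */
	GP_USE_RAW,		/* not an architectural event: PERF_TYPE_RAW */
	GP_SKIP,		/* architectural but hidden by guest CPUID */
};

/* Decide what to do with the value returned by a find_arch_event() lookup. */
static enum gp_action classify_gp_lookup(unsigned int found)
{
	if (found < HW_MAX)
		return GP_USE_HARDWARE;
	if (found == HW_MAX)
		return GP_USE_RAW;
	return GP_SKIP;
}

/* Usage: the sentinel must never be mistaken for "fall back to raw". */
void classify_demo(void)
{
	assert(classify_gp_lookup(HW_UNAVAILABLE) == GP_SKIP);
	assert(classify_gp_lookup(HW_MAX) == GP_USE_RAW);
}
```

Fixed counters have no raw fallback, so for them both out-of-range values end in no perf_event being created.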


2021-11-12 09:52:11

by Like Xu

Subject: [PATCH 5/7] KVM: x86/pmu: Refactor pmu->available_event_types field using BITMAP

From: Like Xu <[email protected]>

Replace the explicit declaration of "unsigned available_event_types" with
the generic macro DECLARE_BITMAP and rename it to "avail_cpuid_events"
for better self-explanation.

Signed-off-by: Like Xu <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 2 +-
arch/x86/kvm/vmx/pmu_intel.c | 11 +++++++----
2 files changed, 8 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 88fce6ab4bbd..2e69dec3ad7b 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -495,7 +495,6 @@ struct kvm_pmc {
struct kvm_pmu {
unsigned nr_arch_gp_counters;
unsigned nr_arch_fixed_counters;
- unsigned available_event_types;
u64 fixed_ctr_ctrl;
u64 global_ctrl;
u64 global_status;
@@ -510,6 +509,7 @@ struct kvm_pmu {
DECLARE_BITMAP(reprogram_pmi, X86_PMC_IDX_MAX);
DECLARE_BITMAP(all_valid_pmc_idx, X86_PMC_IDX_MAX);
DECLARE_BITMAP(pmc_in_use, X86_PMC_IDX_MAX);
+ DECLARE_BITMAP(avail_cpuid_events, X86_PMC_IDX_MAX);

/*
* The gate to release perf_events not marked in
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index 4f58c14efa61..db36e743c3cc 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -96,7 +96,7 @@ static unsigned intel_find_arch_event(struct kvm_pmu *pmu,
continue;

if (is_intel_cpuid_event(event_select, unit_mask) &&
- !(pmu->available_event_types & BIT_ULL(i)))
+ !test_bit(i, pmu->avail_cpuid_events))
return PERF_COUNT_HW_MAX + 1;

break;
@@ -125,7 +125,7 @@ static unsigned int intel_find_fixed_event(struct kvm_pmu *pmu, int idx)
event_type = intel_arch_events[event].event_type;

if (is_intel_cpuid_event(event_select, unit_mask) &&
- !(pmu->available_event_types & BIT_ULL(event_type)))
+ !test_bit(event_type, pmu->avail_cpuid_events))
return PERF_COUNT_HW_MAX + 1;

return event_type;
@@ -497,6 +497,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
{
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
struct lbr_desc *lbr_desc = vcpu_to_lbr_desc(vcpu);
+ unsigned long avail_cpuid_events;

struct x86_pmu_capability x86_pmu;
struct kvm_cpuid_entry2 *entry;
@@ -527,8 +528,10 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
eax.split.bit_width = min_t(int, eax.split.bit_width, x86_pmu.bit_width_gp);
pmu->counter_bitmask[KVM_PMC_GP] = ((u64)1 << eax.split.bit_width) - 1;
eax.split.mask_length = min_t(int, eax.split.mask_length, x86_pmu.events_mask_len);
- pmu->available_event_types = ~entry->ebx &
- ((1ull << eax.split.mask_length) - 1);
+ avail_cpuid_events = ~entry->ebx & ((1ull << eax.split.mask_length) - 1);
+ bitmap_copy(pmu->avail_cpuid_events,
+ (unsigned long *)&avail_cpuid_events,
+ eax.split.mask_length);

if (pmu->version == 1) {
pmu->nr_arch_fixed_counters = 0;
--
2.33.0


2021-11-12 09:52:19

by Like Xu

Subject: [PATCH 6/7] perf: x86/core: Add interface to query perfmon_event_map[] directly

From: Like Xu <[email protected]>

Currently, we have [intel|knc|p4|p6]_perfmon_event_map on the Intel
platforms and amd_[f17h]_perfmon_event_map on the AMD platforms.

Early KVM code and other potential perf_event users may have
hard-coded copies of these perfmon maps (e.g., arch/x86/kvm/svm/pmu.c),
so programming a common hardware event based on the generic
"enum perf_hw_id" no longer makes sense once the two tables diverge.

Let's provide an interface for callers outside the perf subsystem to get
the counter config based on the perfmon_event_map currently in use;
it also helps to save bytes.

Cc: Peter Zijlstra <[email protected]>
Signed-off-by: Like Xu <[email protected]>
---
arch/x86/events/core.c | 9 +++++++++
arch/x86/include/asm/perf_event.h | 5 +++++
2 files changed, 14 insertions(+)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 2a57dbed4894..dc88d39cec1b 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -691,6 +691,15 @@ void x86_pmu_disable_all(void)
}
}

+u64 perf_get_hw_event_config(int perf_hw_id)
+{
+ if (perf_hw_id < x86_pmu.max_events)
+ return x86_pmu.event_map(perf_hw_id);
+
+ return 0;
+}
+EXPORT_SYMBOL_GPL(perf_get_hw_event_config);
+
struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr)
{
return static_call(x86_pmu_guest_get_msrs)(nr);
diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 8fc1b5003713..11a93cb1198b 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -492,9 +492,14 @@ static inline void perf_check_microcode(void) { }

#if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_INTEL)
extern struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr);
+extern u64 perf_get_hw_event_config(int perf_hw_id);
extern int x86_perf_get_lbr(struct x86_pmu_lbr *lbr);
#else
struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr);
+u64 perf_get_hw_event_config(int perf_hw_id);
+{
+ return 0;
+}
static inline int x86_perf_get_lbr(struct x86_pmu_lbr *lbr)
{
return -1;
--
2.33.0
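
Note that the fallback branch in the header hunk above writes the stub as a prototype terminated by ';' followed by a brace block, which a C compiler will not accept. A minimal sketch of the usual shape for such a header stub follows, assuming the intent is "return 0 when the perf backend is compiled out". HAVE_PERF_BACKEND is a hypothetical switch used here in place of the real CONFIG_ checks, and uint64_t stands in for the kernel's u64.

```c
#include <assert.h>
#include <stdint.h>

#ifdef HAVE_PERF_BACKEND
/* Real implementation lives in a .c file. */
extern uint64_t perf_get_hw_event_config(int perf_hw_id);
#else
/*
 * Stub for builds without the backend: no ';' between the declarator
 * and the body, and "static inline" so every includer gets its own copy
 * without multiple-definition link errors.
 */
static inline uint64_t perf_get_hw_event_config(int perf_hw_id)
{
	(void)perf_hw_id;
	return 0;
}
#endif

/* Usage: without the backend every generic event maps to config 0. */
void stub_demo(void)
{
	assert(perf_get_hw_event_config(0) == 0);
}
```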


2021-11-12 09:52:23

by Like Xu

Subject: [PATCH 7/7] KVM: x86/pmu: Setup the {intel|amd}_event_mapping[] when hardware_setup

From: Like Xu <[email protected]>

The current amd_event_mapping[] is only valid for "K7 and later, up to and
including Family 16h"; "Family 17h and later" needs
amd_f17h_perfmon_event_map[]. It's proposed to fix this with a more
generic approach.

For AMD platforms, the newly introduced interface perf_get_hw_event_config()
can be used to fill the newly introduced global kernel_arch_events[].

For Intel platforms, we need to distinguish "kernel_arch_events"
(defined based on the kernel generic "enum perf_hw_id")
from "intel_cpuid_events" (defined based on the Intel CPUID).

To keep the validation check for Intel cpuid events,
get_perf_hw_id_from_cpuid_idx() is added to correct the bit index
in pmu->avail_cpuid_events based on "enum perf_hw_id" when
the original strictly ordered intel_arch_events[] is replaced
by the new strictly ordered kernel_arch_events[].

When kernel_arch_events[] is initialized, the original 8-element array
is replaced by a new 10-element array, and the eventsel and unit_mask of
the two new members will be zero, which makes the "perf_hw_id" lookup
in find_arch_event() very confusing. In this case, KVM will not query
kernel_arch_events[] when the trapped event_select and unit_mask are
both 0; it will fall back to PERF_TYPE_RAW mode to program the perf_event.

Signed-off-by: Like Xu <[email protected]>
---
arch/x86/kvm/pmu.c | 24 ++++++++++++-
arch/x86/kvm/pmu.h | 2 ++
arch/x86/kvm/svm/pmu.c | 22 +++---------
arch/x86/kvm/vmx/pmu_intel.c | 70 +++++++++++++++++++++++-------------
arch/x86/kvm/x86.c | 1 +
5 files changed, 76 insertions(+), 43 deletions(-)

diff --git a/arch/x86/kvm/pmu.c b/arch/x86/kvm/pmu.c
index 3b47bd92e7bb..03d28912309a 100644
--- a/arch/x86/kvm/pmu.c
+++ b/arch/x86/kvm/pmu.c
@@ -19,6 +19,9 @@
#include "lapic.h"
#include "pmu.h"

+struct kvm_event_hw_type_mapping kernel_arch_events[PERF_COUNT_HW_MAX];
+EXPORT_SYMBOL_GPL(kernel_arch_events);
+
/* This is enough to filter the vast majority of currently defined events. */
#define KVM_PMU_EVENT_FILTER_MAX_EVENTS 300

@@ -217,7 +220,9 @@ void reprogram_gp_counter(struct kvm_pmc *pmc, u64 eventsel)
event_select = eventsel & ARCH_PERFMON_EVENTSEL_EVENT;
unit_mask = (eventsel & ARCH_PERFMON_EVENTSEL_UMASK) >> 8;

- if (!(eventsel & (ARCH_PERFMON_EVENTSEL_EDGE |
+ /* Fall back to PERF_TYPE_RAW mode if event_select and unit_mask are both 0. */
+ if ((event_select | unit_mask) &&
+ !(eventsel & (ARCH_PERFMON_EVENTSEL_EDGE |
ARCH_PERFMON_EVENTSEL_INV |
ARCH_PERFMON_EVENTSEL_CMASK |
HSW_IN_TX |
@@ -499,6 +504,23 @@ void kvm_pmu_destroy(struct kvm_vcpu *vcpu)
kvm_pmu_reset(vcpu);
}

+/* Initialize common hardware events mapping based on enum perf_hw_id. */
+void kvm_pmu_hw_events_mapping_setup(void)
+{
+ u64 config;
+ int i;
+
+ for (i = 0; i < PERF_COUNT_HW_MAX; i++) {
+ config = perf_get_hw_event_config(i) & 0xFFFFULL;
+
+ kernel_arch_events[i] = (struct kvm_event_hw_type_mapping){
+ .eventsel = config & ARCH_PERFMON_EVENTSEL_EVENT,
+ .unit_mask = (config & ARCH_PERFMON_EVENTSEL_UMASK) >> 8,
+ .event_type = i,
+ };
+ }
+}
+
int kvm_vm_ioctl_set_pmu_event_filter(struct kvm *kvm, void __user *argp)
{
struct kvm_pmu_event_filter tmp, *filter;
diff --git a/arch/x86/kvm/pmu.h b/arch/x86/kvm/pmu.h
index fe29537b1343..688b784f1e26 100644
--- a/arch/x86/kvm/pmu.h
+++ b/arch/x86/kvm/pmu.h
@@ -160,8 +160,10 @@ void kvm_pmu_cleanup(struct kvm_vcpu *vcpu);
void kvm_pmu_destroy(struct kvm_vcpu *vcpu);
int kvm_vm_ioctl_set_pmu_event_filter(struct kvm *kvm, void __user *argp);

+void kvm_pmu_hw_events_mapping_setup(void);
bool is_vmware_backdoor_pmc(u32 pmc_idx);

extern struct kvm_pmu_ops intel_pmu_ops;
extern struct kvm_pmu_ops amd_pmu_ops;
+extern struct kvm_event_hw_type_mapping kernel_arch_events[];
#endif /* __KVM_X86_PMU_H */
diff --git a/arch/x86/kvm/svm/pmu.c b/arch/x86/kvm/svm/pmu.c
index 3ee8f86d9ace..68814b3b6e27 100644
--- a/arch/x86/kvm/svm/pmu.c
+++ b/arch/x86/kvm/svm/pmu.c
@@ -32,18 +32,6 @@ enum index {
INDEX_ERROR,
};

-/* duplicated from amd_perfmon_event_map, K7 and above should work. */
-static struct kvm_event_hw_type_mapping amd_event_mapping[] = {
- [0] = { 0x76, 0x00, PERF_COUNT_HW_CPU_CYCLES },
- [1] = { 0xc0, 0x00, PERF_COUNT_HW_INSTRUCTIONS },
- [2] = { 0x7d, 0x07, PERF_COUNT_HW_CACHE_REFERENCES },
- [3] = { 0x7e, 0x07, PERF_COUNT_HW_CACHE_MISSES },
- [4] = { 0xc2, 0x00, PERF_COUNT_HW_BRANCH_INSTRUCTIONS },
- [5] = { 0xc3, 0x00, PERF_COUNT_HW_BRANCH_MISSES },
- [6] = { 0xd0, 0x00, PERF_COUNT_HW_STALLED_CYCLES_FRONTEND },
- [7] = { 0xd1, 0x00, PERF_COUNT_HW_STALLED_CYCLES_BACKEND },
-};
-
static unsigned int get_msr_base(struct kvm_pmu *pmu, enum pmu_type type)
{
struct kvm_vcpu *vcpu = pmu_to_vcpu(pmu);
@@ -140,15 +128,15 @@ static unsigned amd_find_arch_event(struct kvm_pmu *pmu,
{
int i;

- for (i = 0; i < ARRAY_SIZE(amd_event_mapping); i++)
- if (amd_event_mapping[i].eventsel == event_select
- && amd_event_mapping[i].unit_mask == unit_mask)
+ for (i = 0; i < PERF_COUNT_HW_MAX; i++)
+ if (kernel_arch_events[i].eventsel == event_select &&
+ kernel_arch_events[i].unit_mask == unit_mask)
break;

- if (i == ARRAY_SIZE(amd_event_mapping))
+ if (i == PERF_COUNT_HW_MAX)
return PERF_COUNT_HW_MAX;

- return amd_event_mapping[i].event_type;
+ return kernel_arch_events[i].event_type;
}

/* return PERF_COUNT_HW_MAX as AMD doesn't have fixed events */
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index db36e743c3cc..40b4112aefa4 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -20,20 +20,14 @@

#define MSR_PMC_FULL_WIDTH_BIT (MSR_IA32_PMC0 - MSR_IA32_PERFCTR0)

-static struct kvm_event_hw_type_mapping intel_arch_events[] = {
- [0] = { 0x3c, 0x00, PERF_COUNT_HW_CPU_CYCLES },
- [1] = { 0xc0, 0x00, PERF_COUNT_HW_INSTRUCTIONS },
- [2] = { 0x3c, 0x01, PERF_COUNT_HW_BUS_CYCLES },
- [3] = { 0x2e, 0x4f, PERF_COUNT_HW_CACHE_REFERENCES },
- [4] = { 0x2e, 0x41, PERF_COUNT_HW_CACHE_MISSES },
- [5] = { 0xc4, 0x00, PERF_COUNT_HW_BRANCH_INSTRUCTIONS },
- [6] = { 0xc5, 0x00, PERF_COUNT_HW_BRANCH_MISSES },
- /* The above index must match CPUID 0x0A.EBX bit vector */
- [7] = { 0x00, 0x03, PERF_COUNT_HW_REF_CPU_CYCLES },
-};
-
-/* mapping between fixed pmc index and intel_arch_events array */
-static int fixed_pmc_events[] = {1, 0, 7};
+/*
+ * mapping between fixed pmc index and kernel_arch_events array
+ *
+ * PERF_COUNT_HW_INSTRUCTIONS
+ * PERF_COUNT_HW_CPU_CYCLES
+ * PERF_COUNT_HW_REF_CPU_CYCLES
+ */
+static int fixed_pmc_events[] = {1, 0, 9};

static void reprogram_fixed_counters(struct kvm_pmu *pmu, u64 data)
{
@@ -90,9 +84,9 @@ static unsigned intel_find_arch_event(struct kvm_pmu *pmu,
{
int i;

- for (i = 0; i < ARRAY_SIZE(intel_arch_events); i++) {
- if (intel_arch_events[i].eventsel != event_select ||
- intel_arch_events[i].unit_mask != unit_mask)
+ for (i = 0; i < PERF_COUNT_HW_MAX; i++) {
+ if (kernel_arch_events[i].eventsel != event_select ||
+ kernel_arch_events[i].unit_mask != unit_mask)
continue;

if (is_intel_cpuid_event(event_select, unit_mask) &&
@@ -102,10 +96,10 @@ static unsigned intel_find_arch_event(struct kvm_pmu *pmu,
break;
}

- if (i == ARRAY_SIZE(intel_arch_events))
+ if (i == PERF_COUNT_HW_MAX)
return PERF_COUNT_HW_MAX;

- return intel_arch_events[i].event_type;
+ return kernel_arch_events[i].event_type;
}

static unsigned int intel_find_fixed_event(struct kvm_pmu *pmu, int idx)
@@ -120,9 +114,9 @@ static unsigned int intel_find_fixed_event(struct kvm_pmu *pmu, int idx)

event = fixed_pmc_events[array_index_nospec(idx, size)];

- event_select = intel_arch_events[event].eventsel;
- unit_mask = intel_arch_events[event].unit_mask;
- event_type = intel_arch_events[event].event_type;
+ event_select = kernel_arch_events[event].eventsel;
+ unit_mask = kernel_arch_events[event].unit_mask;
+ event_type = kernel_arch_events[event].event_type;

if (is_intel_cpuid_event(event_select, unit_mask) &&
!test_bit(event_type, pmu->avail_cpuid_events))
@@ -493,6 +487,33 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
return 1;
}

+static inline int get_perf_hw_id_from_cpuid_idx(int bit)
+{
+ switch (bit) {
+ case 0:
+ case 1:
+ return bit;
+ case 2:
+ return PERF_COUNT_HW_BUS_CYCLES;
+ case 3:
+ case 4:
+ case 5:
+ case 6:
+ return --bit;
+ }
+
+ return PERF_COUNT_HW_MAX;
+}
+
+static inline void setup_available_kernel_arch_events(struct kvm_pmu *pmu,
+ unsigned int avail_cpuid_events, unsigned int mask_length)
+{
+ int bit;
+
+ for_each_set_bit(bit, (unsigned long *)&avail_cpuid_events, mask_length)
+ __set_bit(get_perf_hw_id_from_cpuid_idx(bit), pmu->avail_cpuid_events);
+}
+
static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
{
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
@@ -529,9 +550,8 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
pmu->counter_bitmask[KVM_PMC_GP] = ((u64)1 << eax.split.bit_width) - 1;
eax.split.mask_length = min_t(int, eax.split.mask_length, x86_pmu.events_mask_len);
avail_cpuid_events = ~entry->ebx & ((1ull << eax.split.mask_length) - 1);
- bitmap_copy(pmu->avail_cpuid_events,
- (unsigned long *)&avail_cpuid_events,
- eax.split.mask_length);
+ setup_available_kernel_arch_events(pmu, avail_cpuid_events,
+ eax.split.mask_length);

if (pmu->version == 1) {
pmu->nr_arch_fixed_counters = 0;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index ac83d873d65b..8f7e70f59665 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -11317,6 +11317,7 @@ int kvm_arch_hardware_setup(void *opaque)
memcpy(&kvm_x86_ops, ops->runtime_ops, sizeof(kvm_x86_ops));
kvm_ops_static_call_update();

+ kvm_pmu_hw_events_mapping_setup();
if (!kvm_cpu_cap_has(X86_FEATURE_XSAVES))
supported_xss = 0;

--
2.33.0
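
Two small pieces of this patch are easy to check in isolation: splitting a perf event config into event select and unit mask, and translating a CPUID 0AH.EBX bit index into the generic perf id order. The sketch below is a userspace approximation. The enum is a local copy written in "enum perf_hw_id" order, the two masks carry the same values as ARCH_PERFMON_EVENTSEL_EVENT and ARCH_PERFMON_EVENTSEL_UMASK, and the struct and function names are illustrative rather than the kernel's.

```c
#include <assert.h>
#include <stdint.h>

#define EVENTSEL_EVENT	0x00FFULL	/* low byte: event select */
#define EVENTSEL_UMASK	0xFF00ULL	/* second byte: unit mask */

/* Local copy of the generic ids, in enum perf_hw_id order. */
enum hw_id {
	HW_CPU_CYCLES, HW_INSTRUCTIONS, HW_CACHE_REFERENCES, HW_CACHE_MISSES,
	HW_BRANCH_INSTRUCTIONS, HW_BRANCH_MISSES, HW_BUS_CYCLES,
	HW_STALLED_CYCLES_FRONTEND, HW_STALLED_CYCLES_BACKEND,
	HW_REF_CPU_CYCLES, HW_MAX,
};

struct hw_mapping {
	uint8_t eventsel;
	uint8_t unit_mask;
	unsigned int event_type;
};

/* Build one table entry from a config value such as 0x4f2e. */
static struct hw_mapping split_config(uint64_t config, unsigned int id)
{
	struct hw_mapping m = {
		.eventsel = (uint8_t)(config & EVENTSEL_EVENT),
		.unit_mask = (uint8_t)((config & EVENTSEL_UMASK) >> 8),
		.event_type = id,
	};
	return m;
}

/* CPUID bit order puts bus cycles at bit 2; the generic enum has it at 6. */
static int cpuid_bit_to_hw_id(int bit)
{
	switch (bit) {
	case 0:
	case 1:
		return bit;
	case 2:
		return HW_BUS_CYCLES;
	case 3:
	case 4:
	case 5:
	case 6:
		return bit - 1;
	default:
		return HW_MAX;
	}
}

/* Usage: cache references (0x2e, umask 0x4f) sits at CPUID bit 3. */
void mapping_demo(void)
{
	struct hw_mapping m = split_config(0x4f2e, HW_CACHE_REFERENCES);

	assert(m.eventsel == 0x2e && m.unit_mask == 0x4f);
	assert(cpuid_bit_to_hw_id(3) == HW_CACHE_REFERENCES);
}
```

A bit index outside 0..6 maps to HW_MAX in this sketch, so a caller that sets a bit with the result needs a bitmap wide enough to hold that index.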


2021-11-17 23:22:36

by kernel test robot

Subject: Re: [PATCH 6/7] perf: x86/core: Add interface to query perfmon_event_map[] directly

Hi Like,

Thank you for the patch! Yet something to improve:

[auto build test ERROR on kvm/queue]
[also build test ERROR on tip/perf/core mst-vhost/linux-next linus/master v5.16-rc1 next-20211117]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url: https://github.com/0day-ci/linux/commits/Like-Xu/KVM-x86-pmu-Four-functional-fixes/20211112-175332
base: https://git.kernel.org/pub/scm/virt/kvm/kvm.git queue
config: x86_64-randconfig-c022-20211115 (attached as .config)
compiler: gcc-9 (Debian 9.3.0-22) 9.3.0
reproduce (this is a W=1 build):
# https://github.com/0day-ci/linux/commit/43c9d66955e7ece2fd8f6c03cc606cf72be8e8d4
git remote add linux-review https://github.com/0day-ci/linux
git fetch --no-tags linux-review Like-Xu/KVM-x86-pmu-Four-functional-fixes/20211112-175332
git checkout 43c9d66955e7ece2fd8f6c03cc606cf72be8e8d4
# save the attached .config to linux build tree
mkdir build_dir
make W=1 O=build_dir ARCH=x86_64 SHELL=/bin/bash

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <[email protected]>

All errors (new ones prefixed by >>):

In file included from include/linux/perf_event.h:25,
from include/linux/trace_events.h:10,
from include/trace/trace_events.h:21,
from include/trace/define_trace.h:102,
from drivers/base/regmap/trace.h:257,
from drivers/base/regmap/regmap.c:23:
>> arch/x86/include/asm/perf_event.h:500:1: error: expected identifier or '(' before '{' token
500 | {
| ^
--
In file included from include/linux/perf_event.h:25,
from include/linux/trace_events.h:10,
from include/trace/syscall.h:7,
from include/linux/syscalls.h:87,
from init/main.c:21:
>> arch/x86/include/asm/perf_event.h:500:1: error: expected identifier or '(' before '{' token
500 | {
| ^
init/main.c:788:20: warning: no previous prototype for 'mem_encrypt_init' [-Wmissing-prototypes]
788 | void __init __weak mem_encrypt_init(void) { }
| ^~~~~~~~~~~~~~~~
--
In file included from include/linux/perf_event.h:25,
from include/linux/trace_events.h:10,
from include/trace/syscall.h:7,
from include/linux/syscalls.h:87,
from include/linux/entry-common.h:7,
from arch/x86/include/asm/idtentry.h:9,
from arch/x86/include/asm/traps.h:9,
from arch/x86/mm/extable.c:9:
>> arch/x86/include/asm/perf_event.h:500:1: error: expected identifier or '(' before '{' token
500 | {
| ^
arch/x86/mm/extable.c:27:16: warning: no previous prototype for 'ex_handler_default' [-Wmissing-prototypes]
27 | __visible bool ex_handler_default(const struct exception_table_entry *fixup,
| ^~~~~~~~~~~~~~~~~~
arch/x86/mm/extable.c:37:16: warning: no previous prototype for 'ex_handler_fault' [-Wmissing-prototypes]
37 | __visible bool ex_handler_fault(const struct exception_table_entry *fixup,
| ^~~~~~~~~~~~~~~~
arch/x86/mm/extable.c:58:16: warning: no previous prototype for 'ex_handler_fprestore' [-Wmissing-prototypes]
58 | __visible bool ex_handler_fprestore(const struct exception_table_entry *fixup,
| ^~~~~~~~~~~~~~~~~~~~
arch/x86/mm/extable.c:73:16: warning: no previous prototype for 'ex_handler_uaccess' [-Wmissing-prototypes]
73 | __visible bool ex_handler_uaccess(const struct exception_table_entry *fixup,
| ^~~~~~~~~~~~~~~~~~
arch/x86/mm/extable.c:84:16: warning: no previous prototype for 'ex_handler_copy' [-Wmissing-prototypes]
84 | __visible bool ex_handler_copy(const struct exception_table_entry *fixup,
| ^~~~~~~~~~~~~~~
arch/x86/mm/extable.c:96:16: warning: no previous prototype for 'ex_handler_rdmsr_unsafe' [-Wmissing-prototypes]
96 | __visible bool ex_handler_rdmsr_unsafe(const struct exception_table_entry *fixup,
| ^~~~~~~~~~~~~~~~~~~~~~~
arch/x86/mm/extable.c:113:16: warning: no previous prototype for 'ex_handler_wrmsr_unsafe' [-Wmissing-prototypes]
113 | __visible bool ex_handler_wrmsr_unsafe(const struct exception_table_entry *fixup,
| ^~~~~~~~~~~~~~~~~~~~~~~
arch/x86/mm/extable.c:129:16: warning: no previous prototype for 'ex_handler_clear_fs' [-Wmissing-prototypes]
129 | __visible bool ex_handler_clear_fs(const struct exception_table_entry *fixup,
| ^~~~~~~~~~~~~~~~~~~
--
In file included from include/linux/perf_event.h:25,
from include/linux/trace_events.h:10,
from include/trace/syscall.h:7,
from include/linux/syscalls.h:87,
from kernel/exit.c:42:
>> arch/x86/include/asm/perf_event.h:500:1: error: expected identifier or '(' before '{' token
500 | {
| ^
kernel/exit.c:1810:13: warning: no previous prototype for 'abort' [-Wmissing-prototypes]
1810 | __weak void abort(void)
| ^~~~~
--
In file included from include/linux/perf_event.h:25,
from include/linux/trace_events.h:10,
from include/trace/trace_events.h:21,
from include/trace/define_trace.h:102,
from include/trace/events/vmscan.h:460,
from mm/vmscan.c:63:
>> arch/x86/include/asm/perf_event.h:500:1: error: expected identifier or '(' before '{' token
500 | {
| ^
mm/vmscan.c: In function 'demote_page_list':
mm/vmscan.c:1340:6: warning: variable 'err' set but not used [-Wunused-but-set-variable]
1340 | int err;
| ^~~
--
In file included from include/linux/perf_event.h:25,
from include/linux/trace_events.h:10,
from include/trace/syscall.h:7,
from include/linux/syscalls.h:87,
from fs/pipe.c:24:
>> arch/x86/include/asm/perf_event.h:500:1: error: expected identifier or '(' before '{' token
500 | {
| ^
fs/pipe.c:755:15: warning: no previous prototype for 'account_pipe_buffers' [-Wmissing-prototypes]
755 | unsigned long account_pipe_buffers(struct user_struct *user,
| ^~~~~~~~~~~~~~~~~~~~
fs/pipe.c:761:6: warning: no previous prototype for 'too_many_pipe_buffers_soft' [-Wmissing-prototypes]
761 | bool too_many_pipe_buffers_soft(unsigned long user_bufs)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
fs/pipe.c:768:6: warning: no previous prototype for 'too_many_pipe_buffers_hard' [-Wmissing-prototypes]
768 | bool too_many_pipe_buffers_hard(unsigned long user_bufs)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
fs/pipe.c:775:6: warning: no previous prototype for 'pipe_is_unprivileged_user' [-Wmissing-prototypes]
775 | bool pipe_is_unprivileged_user(void)
| ^~~~~~~~~~~~~~~~~~~~~~~~~
fs/pipe.c:1245:5: warning: no previous prototype for 'pipe_resize_ring' [-Wmissing-prototypes]
1245 | int pipe_resize_ring(struct pipe_inode_info *pipe, unsigned int nr_slots)
| ^~~~~~~~~~~~~~~~
--
In file included from include/linux/perf_event.h:25,
from include/linux/trace_events.h:10,
from include/trace/syscall.h:7,
from include/linux/syscalls.h:87,
from fs/d_path.c:2:
>> arch/x86/include/asm/perf_event.h:500:1: error: expected identifier or '(' before '{' token
500 | {
| ^
fs/d_path.c:320:7: warning: no previous prototype for 'simple_dname' [-Wmissing-prototypes]
320 | char *simple_dname(struct dentry *dentry, char *buffer, int buflen)
| ^~~~~~~~~~~~
--
In file included from include/linux/perf_event.h:25,
from include/linux/hw_breakpoint.h:5,
from kernel/trace/trace.h:15,
from kernel/trace/trace_output.h:6,
from kernel/trace/ftrace.c:45:
>> arch/x86/include/asm/perf_event.h:500:1: error: expected identifier or '(' before '{' token
500 | {
| ^
kernel/trace/ftrace.c:302:5: warning: no previous prototype for '__register_ftrace_function' [-Wmissing-prototypes]
302 | int __register_ftrace_function(struct ftrace_ops *ops)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~
kernel/trace/ftrace.c:345:5: warning: no previous prototype for '__unregister_ftrace_function' [-Wmissing-prototypes]
345 | int __unregister_ftrace_function(struct ftrace_ops *ops)
| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~
kernel/trace/ftrace.c:3876:15: warning: no previous prototype for 'arch_ftrace_match_adjust' [-Wmissing-prototypes]
3876 | char * __weak arch_ftrace_match_adjust(char *str, const char *search)
| ^~~~~~~~~~~~~~~~~~~~~~~~
--
In file included from include/linux/perf_event.h:25,
from include/linux/hw_breakpoint.h:5,
from kernel/trace/trace.h:15,
from kernel/trace/trace.c:53:
>> arch/x86/include/asm/perf_event.h:500:1: error: expected identifier or '(' before '{' token
500 | {
| ^
kernel/trace/trace.c: In function 'trace_check_vprintf':
kernel/trace/trace.c:3837:3: warning: function 'trace_check_vprintf' might be a candidate for 'gnu_printf' format attribute [-Wsuggest-attribute=format]
3837 | trace_seq_vprintf(&iter->seq, iter->fmt, ap);
| ^~~~~~~~~~~~~~~~~
kernel/trace/trace.c:3892:3: warning: function 'trace_check_vprintf' might be a candidate for 'gnu_printf' format attribute [-Wsuggest-attribute=format]
3892 | trace_seq_vprintf(&iter->seq, p, ap);
| ^~~~~~~~~~~~~~~~~
--
In file included from include/linux/perf_event.h:25,
from include/linux/hw_breakpoint.h:5,
from kernel/trace/trace.h:15,
from kernel/trace/trace_output.h:6,
from kernel/trace/trace_output.c:14:
>> arch/x86/include/asm/perf_event.h:500:1: error: expected identifier or '(' before '{' token
500 | {
| ^
kernel/trace/trace_output.c: In function 'trace_output_raw':
kernel/trace/trace_output.c:331:2: warning: function 'trace_output_raw' might be a candidate for 'gnu_printf' format attribute [-Wsuggest-attribute=format]
331 | trace_seq_vprintf(s, trace_event_format(iter, fmt), ap);
| ^~~~~~~~~~~~~~~~~
--
In file included from include/linux/perf_event.h:25,
from include/linux/hw_breakpoint.h:5,
from kernel/trace/trace.h:15,
from kernel/trace/trace_preemptirq.c:13:
>> arch/x86/include/asm/perf_event.h:500:1: error: expected identifier or '(' before '{' token
500 | {
| ^
kernel/trace/trace_preemptirq.c:88:16: warning: no previous prototype for 'trace_hardirqs_on_caller' [-Wmissing-prototypes]
88 | __visible void trace_hardirqs_on_caller(unsigned long caller_addr)
| ^~~~~~~~~~~~~~~~~~~~~~~~
kernel/trace/trace_preemptirq.c:103:16: warning: no previous prototype for 'trace_hardirqs_off_caller' [-Wmissing-prototypes]
103 | __visible void trace_hardirqs_off_caller(unsigned long caller_addr)
| ^~~~~~~~~~~~~~~~~~~~~~~~~
..


vim +500 arch/x86/include/asm/perf_event.h

492
493 #if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_INTEL)
494 extern struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr);
495 extern u64 perf_get_hw_event_config(int perf_hw_id);
496 extern int x86_perf_get_lbr(struct x86_pmu_lbr *lbr);
497 #else
498 struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr);
499 u64 perf_get_hw_event_config(int perf_hw_id);
> 500 {
501 return 0;
502 }
503 static inline int x86_perf_get_lbr(struct x86_pmu_lbr *lbr)
504 {
505 return -1;
506 }
507 #endif
508
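
For readers following the error above: line 499 ends in ';', so it is a complete declaration, and the '{' on line 500 is then a body with no function attached to it. The snippet below is a minimal userspace sketch of the usual stub pattern, not the patch itself. DEMO_HAVE_PERF_EVENTS, the u64 typedef and demo_check() are hypothetical stand-ins, and making the stub "static inline" is an assumption that mirrors the neighbouring x86_perf_get_lbr() stub.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t u64;	/* stand-in for the kernel's u64 type */

/*
 * DEMO_HAVE_PERF_EVENTS is a hypothetical switch standing in for
 * CONFIG_PERF_EVENTS && CONFIG_CPU_SUP_INTEL.
 */
#ifdef DEMO_HAVE_PERF_EVENTS
/* Real implementation lives elsewhere; only declare it here. */
extern u64 perf_get_hw_event_config(int perf_hw_id);
#else
/*
 * Fallback stub. There is no ';' after the parameter list, so the
 * braces that follow form the body of this definition.
 */
static inline u64 perf_get_hw_event_config(int perf_hw_id)
{
	(void)perf_hw_id;	/* unused in the stub */
	return 0;
}

/* Hypothetical helper: the stub always reports "no event config". */
static inline void demo_check(void)
{
	assert(perf_get_hw_event_config(0) == 0);
}
#endif
```

With the switch left undefined, callers compile against the stub and always get 0 back, which matches the "return 0;" body shown at lines 500-502 of the listing.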

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/[email protected]


Attachments:
(No filename) (12.13 kB)
.config.gz (33.22 kB)

2021-11-18 08:06:43

by Like Xu

Subject: Re: [PATCH 6/7] perf: x86/core: Add interface to query perfmon_event_map[] directly

On 18/11/2021 7:21 am, kernel test robot wrote:
> Hi Like,
>
> Thank you for the patch! Yet something to improve:
>
> [auto build test ERROR on kvm/queue]
> [also build test ERROR on tip/perf/core mst-vhost/linux-next linus/master v5.16-rc1 next-20211117]
> [If your patch is applied to the wrong git tree, kindly drop us a note.
> And when submitting patch, we suggest to use '--base' as documented in
> https://git-scm.com/docs/git-format-patch]
>

...

> vim +500 arch/x86/include/asm/perf_event.h
>
> 492
> 493 #if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_INTEL)
> 494 extern struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr);
> 495 extern u64 perf_get_hw_event_config(int perf_hw_id);
> 496 extern int x86_perf_get_lbr(struct x86_pmu_lbr *lbr);
> 497 #else
> 498 struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr);
> 499 u64 perf_get_hw_event_config(int perf_hw_id);

Thanks to the robot, I should have removed the ";" from this line.

Awaiting further review comments.

> > 500 {
> 501 return 0;
> 502 }
> 503 static inline int x86_perf_get_lbr(struct x86_pmu_lbr *lbr)
> 504 {
> 505 return -1;
> 506 }
> 507 #endif
> 508
>
> ---
> 0-DAY CI Kernel Test Service, Intel Corporation
> https://lists.01.org/hyperkitty/list/[email protected]
>

2021-11-18 13:36:50

by Like Xu

Subject: Re: [PATCH 6/7] perf: x86/core: Add interface to query perfmon_event_map[] directly

On 18/11/2021 4:06 pm, Like Xu wrote:
> On 18/11/2021 7:21 am, kernel test robot wrote:
>> Hi Like,
>>
>> Thank you for the patch! Yet something to improve:
>>
>> [auto build test ERROR on kvm/queue]
>> [also build test ERROR on tip/perf/core mst-vhost/linux-next linus/master
>> v5.16-rc1 next-20211117]
>> [If your patch is applied to the wrong git tree, kindly drop us a note.
>> And when submitting patch, we suggest to use '--base' as documented in
>> https://git-scm.com/docs/git-format-patch]
>>
>
> ...
>
>> vim +500 arch/x86/include/asm/perf_event.h
>>
>>     492
>>     493    #if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_INTEL)
>>     494    extern struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr);
>>     495    extern u64 perf_get_hw_event_config(int perf_hw_id);
>>     496    extern int x86_perf_get_lbr(struct x86_pmu_lbr *lbr);
>>     497    #else
>>     498    struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr);
>>     499    u64 perf_get_hw_event_config(int perf_hw_id);
>
> Thanks to the robot, I should have removed the ";" from this line.
>

Sorry, my bot is shouting at me again. This part should be:

diff --git a/arch/x86/include/asm/perf_event.h b/arch/x86/include/asm/perf_event.h
index 8fc1b5003713..85fd768d49d7 100644
--- a/arch/x86/include/asm/perf_event.h
+++ b/arch/x86/include/asm/perf_event.h
@@ -492,9 +492,11 @@ static inline void perf_check_microcode(void) { }

#if defined(CONFIG_PERF_EVENTS) && defined(CONFIG_CPU_SUP_INTEL)
extern struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr);
+extern u64 perf_get_hw_event_config(int perf_hw_id);
extern int x86_perf_get_lbr(struct x86_pmu_lbr *lbr);
#else
struct perf_guest_switch_msr *perf_guest_get_msrs(int *nr);
+static u64 perf_get_hw_event_config(int perf_hw_id);
static inline int x86_perf_get_lbr(struct x86_pmu_lbr *lbr)
{
return -1;

> Awaiting further review comments.
>
>>   > 500    {
>>     501        return 0;
>>     502    }
>>     503    static inline int x86_perf_get_lbr(struct x86_pmu_lbr *lbr)
>>     504    {
>>     505        return -1;
>>     506    }
>>     507    #endif
>>     508
>>
>> ---
>> 0-DAY CI Kernel Test Service, Intel Corporation
>> https://lists.01.org/hyperkitty/list/[email protected]
>>

2021-11-25 13:21:24

by kernel test robot

Subject: [KVM] 54244a5dd7: BUG:KASAN:stack-out-of-bounds_in_find_first_bit



Greeting,

FYI, we noticed the following commit (built with gcc-9):

commit: 54244a5dd79183120f8c5f26d3a89f3966b48022 ("[PATCH 7/7] KVM: x86/pmu: Setup the {inte|amd}_event_mapping[] when hardware_setup")
url: https://github.com/0day-ci/linux/commits/Like-Xu/KVM-x86-pmu-Four-functional-fixes/20211112-175332
base: https://git.kernel.org/cgit/virt/kvm/kvm.git queue
patch link: https://lore.kernel.org/kvm/[email protected]

in testcase: kvm-unit-tests
version: kvm-unit-tests-x86_64-49934b5-1_20211109
with following parameters:

ucode: 0x28



on test machine: 8 threads 1 sockets Intel(R) Core(TM) i7-4790T CPU @ 2.70GHz with 16G memory

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):



If you fix the issue, kindly add following tag
Reported-by: kernel test robot <[email protected]>


[ 84.771702][ T4365] BUG: KASAN: stack-out-of-bounds in _find_first_bit (lib/find_bit.c:83)
[ 84.780637][ T4365] Read of size 8 at addr ffffc9000c60f8f8 by task qemu-system-x86/4365
[ 84.790296][ T4365]
[ 84.794004][ T4365] CPU: 0 PID: 4365 Comm: qemu-system-x86 Not tainted 5.15.0-rc2-00208-g54244a5dd791 #1
[ 84.805011][ T4365] Hardware name: Gigabyte Technology Co., Ltd. Z97X-UD5H/Z97X-UD5H, BIOS F9 04/21/2015
[ 84.816034][ T4365] Call Trace:
[ 84.820699][ T4365] dump_stack_lvl (lib/dump_stack.c:107)
[ 84.826539][ T4365] print_address_description+0x21/0x140
[ 84.834470][ T4365] ? _find_first_bit (lib/find_bit.c:83)
[ 84.840512][ T4365] kasan_report.cold (mm/kasan/report.c:443 mm/kasan/report.c:459)
[ 84.846572][ T4365] ? _find_first_bit (lib/find_bit.c:83)
[ 84.852560][ T4365] _find_first_bit (lib/find_bit.c:83)
[ 84.858377][ T4365] intel_pmu_refresh (arch/x86/kvm/vmx/pmu_intel.c:513 (discriminator 3) arch/x86/kvm/vmx/pmu_intel.c:553 (discriminator 3)) kvm_intel
[ 84.865539][ T4365] ? __kernel_text_address (kernel/extable.c:105)
[ 84.871885][ T4365] ? vmemdup_user (mm/util.c:200)
[ 84.877581][ T4365] ? intel_msr_idx_to_pmc (arch/x86/kvm/vmx/pmu_intel.c:518) kvm_intel
[ 84.885068][ T4365] ? arch_stack_walk (arch/x86/kernel/stacktrace.c:26)
[ 84.890932][ T4365] ? kasan_unpoison (mm/kasan/shadow.c:108 mm/kasan/shadow.c:142)
[ 84.896629][ T4365] kvm_vcpu_after_set_cpuid (arch/x86/kvm/cpuid.c:1125 arch/x86/kvm/cpuid.h:77 arch/x86/kvm/cpuid.h:89 arch/x86/kvm/cpuid.c:204) kvm
[ 84.903810][ T4365] kvm_vcpu_ioctl_set_cpuid2 (arch/x86/kvm/cpuid.c:327) kvm
[ 84.910961][ T4365] kvm_arch_vcpu_ioctl (arch/x86/kvm/x86.c:5208) kvm
[ 84.917710][ T4365] ? kmem_cache_alloc (mm/slab.h:520 mm/slub.c:3206 mm/slub.c:3214 mm/slub.c:3219)
[ 84.923632][ T4365] ? vm_area_alloc (kernel/fork.c:349)
[ 84.929232][ T4365] ? mmap_region (mm/mmap.c:1767)
[ 84.934827][ T4365] ? do_mmap (mm/mmap.c:1575)
[ 84.939958][ T4365] ? vm_mmap_pgoff (mm/util.c:519)
[ 84.945616][ T4365] ? ksys_mmap_pgoff (mm/mmap.c:1624)
[ 84.951437][ T4365] ? kvm_arch_vcpu_put (arch/x86/kvm/x86.c:5124) kvm
[ 84.957991][ T4365] ? rmqueue_bulk (mm/page_alloc.c:3677)
[ 84.963736][ T4365] ? kernel_init_free_pages+0xc7/0x1c0
[ 84.970700][ T4365] ? prep_new_page (mm/page_alloc.c:1267 mm/page_alloc.c:2414 mm/page_alloc.c:2424)
[ 84.976358][ T4365] ? get_page_from_freelist (mm/page_alloc.c:4159)
[ 84.982821][ T4365] ? mem_cgroup_oom_trylock (mm/memcontrol.c:2531)
[ 84.989391][ T4365] ? __alloc_pages_slowpath+0x1fc0/0x1fc0
[ 84.997091][ T4365] ? __mod_memcg_lruvec_state (mm/memcontrol.c:684)
[ 85.003658][ T4365] ? __mod_lruvec_page_state (arch/x86/include/asm/preempt.h:85 include/linux/rcupdate.h:73 include/linux/rcupdate.h:719 mm/memcontrol.c:729)
[ 85.010244][ T4365] ? pagevec_add_and_need_flush (arch/x86/include/asm/atomic.h:29 include/linux/atomic/atomic-instrumented.h:28 include/linux/swap.h:355 mm/swap.c:223 mm/swap.c:218)
[ 85.016966][ T4365] ? mutex_lock_killable (arch/x86/include/asm/atomic64_64.h:190 include/linux/atomic/atomic-long.h:443 include/linux/atomic/atomic-instrumented.h:1669 kernel/locking/mutex.c:165 kernel/locking/mutex.c:949)
[ 85.023061][ T4365] ? __mutex_lock_killable_slowpath (kernel/locking/mutex.c:946)
[ 85.030041][ T4365] kvm_vcpu_ioctl (arch/x86/kvm/../../../virt/kvm/kvm_main.c:3747) kvm
[ 85.036224][ T4365] ? fiemap_prep (fs/ioctl.c:778)
[ 85.041714][ T4365] ? kvm_set_memory_region (arch/x86/kvm/../../../virt/kvm/kvm_main.c:3743) kvm
[ 85.048494][ T4365] ? copy_page_range (mm/memory.c:4609)
[ 85.054537][ T4365] ? __might_fault (mm/memory.c:5263)
[ 85.060056][ T4365] ? down_read_trylock (arch/x86/include/asm/atomic64_64.h:34 include/linux/atomic/atomic-long.h:41 include/linux/atomic/atomic-instrumented.h:1198 kernel/locking/rwsem.c:171 kernel/locking/rwsem.c:176 kernel/locking/rwsem.c:1249 kernel/locking/rwsem.c:1503)
[ 85.066011][ T4365] ? __fget_files (fs/file.c:865)
[ 85.071629][ T4365] __x64_sys_ioctl (fs/ioctl.c:52 fs/ioctl.c:874 fs/ioctl.c:860 fs/ioctl.c:860)
[ 85.077309][ T4365] do_syscall_64 (arch/x86/entry/common.c:50 arch/x86/entry/common.c:80)
[ 85.082620][ T4365] entry_SYSCALL_64_after_hwframe (arch/x86/entry/entry_64.S:113)
[ 85.089453][ T4365] RIP: 0033:0x7f06dc8f1427
[ 85.094794][ T4365] Code: 00 00 90 48 8b 05 69 aa 0c 00 64 c7 00 26 00 00 00 48 c7 c0 ff ff ff ff c3 66 2e 0f 1f 84 00 00 00 00 00 b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d 39 aa 0c 00 f7 d8 64 89 01 48
All code
========
0: 00 00 add %al,(%rax)
2: 90 nop
3: 48 8b 05 69 aa 0c 00 mov 0xcaa69(%rip),%rax # 0xcaa73
a: 64 c7 00 26 00 00 00 movl $0x26,%fs:(%rax)
11: 48 c7 c0 ff ff ff ff mov $0xffffffffffffffff,%rax
18: c3 retq
19: 66 2e 0f 1f 84 00 00 nopw %cs:0x0(%rax,%rax,1)
20: 00 00 00
23: b8 10 00 00 00 mov $0x10,%eax
28: 0f 05 syscall
2a:* 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax <-- trapping instruction
30: 73 01 jae 0x33
32: c3 retq
33: 48 8b 0d 39 aa 0c 00 mov 0xcaa39(%rip),%rcx # 0xcaa73
3a: f7 d8 neg %eax
3c: 64 89 01 mov %eax,%fs:(%rcx)
3f: 48 rex.W

Code starting with the faulting instruction
===========================================
0: 48 3d 01 f0 ff ff cmp $0xfffffffffffff001,%rax
6: 73 01 jae 0x9
8: c3 retq
9: 48 8b 0d 39 aa 0c 00 mov 0xcaa39(%rip),%rcx # 0xcaa49
10: f7 d8 neg %eax
12: 64 89 01 mov %eax,%fs:(%rcx)
15: 48 rex.W
[ 85.116510][ T4365] RSP: 002b:00007f06d9f6c558 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 85.125943][ T4365] RAX: ffffffffffffffda RBX: 000000004008ae90 RCX: 00007f06dc8f1427
[ 85.134961][ T4365] RDX: 00007f06d9f6c6d0 RSI: 000000004008ae90 RDI: 000000000000000e
[ 85.143958][ T4365] RBP: 00007f06d9f6c6d0 R08: 000000000000000c R09: 0000000000000000
[ 85.152953][ T4365] R10: 0000000000000000 R11: 0000000000000246 R12: 000055b6978d4620
[ 85.161952][ T4365] R13: 0000000000000020 R14: 000055b6978d4620 R15: 0000000000000022
[ 85.170960][ T4365]
[ 85.174322][ T4365]
[ 85.177658][ T4365] addr ffffc9000c60f8f8 is located in stack of task qemu-system-x86/4365 at offset 48 in frame:
[ 85.189115][ T4365] intel_pmu_refresh (arch/x86/kvm/vmx/pmu_intel.c:518) kvm_intel
[ 85.195987][ T4365]
[ 85.199373][ T4365] this frame has 2 objects:
[ 85.204920][ T4365] [48, 52) 'avail_cpuid_events'
[ 85.204922][ T4365] [64, 92) 'x86_pmu'
[ 85.210898][ T4365]
[ 85.219235][ T4365] Memory state around the buggy address:
[ 85.225867][ T4365] ffffc9000c60f780: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 85.234910][ T4365] ffffc9000c60f800: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 85.243951][ T4365] >ffffc9000c60f880: 00 00 00 00 00 00 00 00 00 f1 f1 f1 f1 f1 f1 04
[ 85.252982][ T4365] ^
[ 85.261951][ T4365] ffffc9000c60f900: f2 00 00 00 04 f3 f3 f3 f3 00 00 00 00 00 00 00
[ 85.270999][ T4365] ffffc9000c60f980: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 85.280044][ T4365] ==================================================================
[ 85.289098][ T4365] Disabling lock debugging due to kernel taint
[ 95.394905][ T375] IPMI BMC is not supported on this machine, skip bmc-watchdog setup!
[ 95.394916][ T375]
[ 97.539025][ T375]
[ 97.555987][ T375]
[ 99.429527][ T375]
[ 101.300244][ T375]
[ 103.169726][ T375]
[ 105.510268][ T375]
[ 107.393317][ T5543] kvm: emulating exchange as write
[ 107.481028][ T375]
[ 109.383408][ T375]
[ 111.259366][ T375]
[ 113.128375][ T375]
[ 115.008256][ T375]
[ 116.886251][ T375]
[ 119.219491][ T375]
[ 122.445594][ T375]
[ 124.341055][ T375]
[ 126.072608][ T375]
[ 129.915795][ T375]
[ 131.824177][ T375]
[ 138.705573][ T375]
[ 140.580108][ T375]
[ 142.455213][ T375]
[ 144.326085][ T375]
[ 146.221456][ T375]
[ 148.105465][ T375]
[ 150.001780][ T375]
[ 150.013556][ T375]
[ 150.024507][ T375]
[ 155.625268][ T375]
[ 157.506774][ T375]
[ 159.384324][ T375]
[ 161.257915][ T375]
[ 163.132870][ T375]
[ 165.008923][ T375]
[ 165.020671][ T375]
[ 167.307168][T10789] kvm [10786]: vcpu0, guest rIP: 0x4091d8 vmx_set_msr: BTF|LBR in IA32_DEBUGCTLMSR 0x3, nop
[ 167.320146][T10789] kvm [10786]: vcpu0, guest rIP: 0x409277 vmx_set_msr: BTF|LBR in IA32_DEBUGCTLMSR 0x3, nop
[ 179.394522][ T375]
[ 181.402357][ T375]
[ 183.369745][ T375]
[ 185.345571][ T375]
[ 187.293076][ T375]
[ 189.262862][ T375]
[ 191.364860][ T375]
[ 193.434728][ T375]
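
Reading the report above: KASAN lists 'avail_cpuid_events' at frame offsets [48, 52), so the object is 4 bytes, while the faulting access is a read of size 8 starting at that offset. A bitmap scan that works on whole unsigned long words will read past a 4-byte object on a 64-bit build. The snippet below is a minimal userspace sketch of one way to avoid that kind of over-read, by widening the 32-bit mask into an unsigned long before scanning. It is not the fix that was applied to the patch; first_set_bit(), first_event_bit() and demo_check() are hypothetical names, and first_set_bit() is only a simplified stand-in for a word-based bitmap scan.

```c
#include <assert.h>
#include <limits.h>
#include <stdint.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/*
 * Hypothetical stand-in for a bitmap scan. It reads whole
 * unsigned long words, so 'addr' must point at storage that is
 * at least one unsigned long wide per BITS_PER_LONG bits scanned.
 * Returns the index of the first set bit, or 'size' if none is set.
 */
static unsigned long first_set_bit(const unsigned long *addr,
				   unsigned long size)
{
	unsigned long idx, bit;

	for (idx = 0; idx * BITS_PER_LONG < size; idx++) {
		unsigned long word = addr[idx];

		if (!word)
			continue;
		for (bit = 0; bit < BITS_PER_LONG; bit++) {
			if (word & (1UL << bit)) {
				unsigned long pos = idx * BITS_PER_LONG + bit;

				return pos < size ? pos : size;
			}
		}
	}
	return size;
}

/*
 * Copy the 32-bit mask into a full unsigned long first, so the
 * word-sized read stays inside an object of the right size.
 */
static unsigned long first_event_bit(uint32_t mask, unsigned long nbits)
{
	unsigned long word = mask;

	if (nbits > 32)		/* the mask only carries 32 bits */
		nbits = 32;
	return first_set_bit(&word, nbits);
}

/* Hypothetical self-check of the helper above. */
static void demo_check(void)
{
	assert(first_event_bit(0x8, 7) == 3);	/* bit 3 is the first set bit */
	assert(first_event_bit(0x0, 7) == 7);	/* nothing set: returns nbits */
}
```

The design choice here is to keep the scan word-based and fix the caller's storage instead, since the size of the object passed in is what the report flags.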


To reproduce:

git clone https://github.com/intel/lkp-tests.git
cd lkp-tests
sudo bin/lkp install job.yaml # job file is attached in this email
bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
sudo bin/lkp run generated-yaml-file

# if come across any failure that blocks the test,
# please remove ~/.lkp and /lkp dir to run from a clean state.



---
0DAY/LKP+ Test Infrastructure Open Source Technology Center
https://lists.01.org/hyperkitty/list/[email protected] Intel Corporation

Thanks,
Oliver Sang


Attachments:
(No filename) (10.08 kB)
config-5.15.0-rc2-00208-g54244a5dd791 (170.52 kB)
job-script (5.29 kB)
dmesg.xz (4.29 kB)
kvm-unit-tests (2.50 kB)
job.yaml (4.29 kB)
reproduce (15.00 B)