2020-06-13 08:18:19

by Like Xu

Subject: [PATCH v12 00/11] Guest Last Branch Recording Enabling

Hi all,

Please help review this new version for the kernel 5.9 release.

Now, you may apply the last two qemu-devel patches to the upstream
qemu and try the guest LBR feature with the '-cpu host' command line.
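
A minimal invocation might look like the following (a hypothetical
command line; the disk image and the -smp/-m values are placeholders
for your own setup):

  qemu-system-x86_64 -enable-kvm -cpu host -smp 4 -m 4096 \
      -drive file=guest.img,format=raw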

v11->v12 Changelog:
- apply "Signed-off-by" form PeterZ and his codes for the perf subsystem;
- add validity checks before expose LBR via MSR_IA32_PERF_CAPABILITIES;
- refactor MSR_IA32_DEBUGCTLMSR emulation with validity check;
- reorder "perf_event_attr" fields according to how they're declared;
- replace event_is_oncpu() with "event->state" check;
- make LBR emulation specific to vmx rather than generic x86;
- move pass-through LBR code to vmx.c instead of pmu_intel.c;
- add vmx_lbr_en/disable_passthrough layer to make code readable;
- rewrite pmu availability check with vmx_passthrough_lbr_msrs();

You may find more details in each commit.

Previous:
https://lore.kernel.org/kvm/[email protected]/

---

The last branch recording (LBR) is a performance monitor unit (PMU)
feature on Intel processors that records a running trace of the most
recent branches taken by the processor in the LBR stack. This patch
series enables the feature for KVM guests.

Userspace can configure whether LBR is enabled for each guest via the
MSR_IA32_PERF_CAPABILITIES MSR. As a first step, a guest can only
enable the LBR feature if its CPU model is the same as the host's,
since LBR is still a model-specific feature.

If it's enabled on the guest, the guest LBR driver accesses the LBR
MSRs (including IA32_DEBUGCTLMSR and the record MSRs) just as the host
does. The first guest access to the LBR-related MSRs is always
intercepted. The KVM trap then creates a special LBR event (called the
guest LBR event) which enables callstack mode and is assigned no
hardware counter. Host perf enables and schedules this event as usual.

The guest's first access to an LBR register is trapped to KVM, which
creates a guest LBR perf event. It's a regular LBR perf event which
gets the LBR facility assigned by the perf subsystem. Once that
succeeds, the LBR stack MSRs are passed through to the guest for
efficient access. However, if another host LBR event comes in and
takes over the LBR facility, the LBR MSRs are made interceptable
again, and subsequent guest accesses to them are trapped and return
meaningless data.

Because saving/restoring tens of LBR MSRs (e.g. 32 LBR stack entries)
on every VMX transition would add excessive overhead to an already
frequent operation, the guest LBR event instead saves/restores the LBR
stack MSRs (including the LBR_SELECT MSR) at context-switch time,
reusing the native LBR event callstack mechanism.

If the guest does not access the LBR-related MSRs within a scheduling
time slice and the LBR enable bit is unset, the vPMU releases its
guest LBR event, just like a normal event for an unused vPMC, and the
pass-through state of the LBR stack MSRs is canceled.
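
For illustration, such a guest LBR event might be configured roughly
as below. This is a sketch under stated assumptions: the attribute
values are illustrative rather than the exact ones used by this
series, and the special config encoding for a counter-less LBR event
is omitted.

	/*
	 * Sketch of a "guest LBR event": callstack mode enabled, pinned
	 * so that perf keeps the LBR facility claimed, and counting
	 * excluded on the host side. Field values are illustrative
	 * assumptions, not the exact attributes from this series.
	 */
	struct perf_event_attr attr = {
		.type			= PERF_TYPE_RAW,
		.size			= sizeof(attr),
		.pinned			= true,
		.exclude_host		= true,
		.sample_type		= PERF_SAMPLE_BRANCH_STACK,
		.branch_sample_type	= PERF_SAMPLE_BRANCH_CALL_STACK |
					  PERF_SAMPLE_BRANCH_USER |
					  PERF_SAMPLE_BRANCH_KERNEL,
	};
	struct perf_event *event;

	/* In-kernel creation, as KVM would do on the trapped access. */
	event = perf_event_create_kernel_counter(&attr, -1, current,
						 NULL, NULL);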

---

LBR testcase:
echo 1 > /proc/sys/kernel/watchdog
echo 25 > /proc/sys/kernel/perf_cpu_time_max_percent
echo 5000 > /proc/sys/kernel/perf_event_max_sample_rate
echo 0 > /proc/sys/kernel/perf_cpu_time_max_percent
./perf record -b ./br_instr a

- Perf report on the host:
Samples: 72K of event 'cycles', Event count (approx.): 72512
Overhead  Command   Source Shared Object  Source Symbol  Target Symbol  Basic Block Cycles
  12.12%  br_instr  br_instr              [.] cmp_end    [.] lfsr_cond                   1
  11.05%  br_instr  br_instr              [.] lfsr_cond  [.] cmp_end                     5
   8.81%  br_instr  br_instr              [.] lfsr_cond  [.] cmp_end                     4
   5.04%  br_instr  br_instr              [.] cmp_end    [.] lfsr_cond                  20
   4.92%  br_instr  br_instr              [.] lfsr_cond  [.] cmp_end                     6
   4.88%  br_instr  br_instr              [.] cmp_end    [.] lfsr_cond                   6
   4.58%  br_instr  br_instr              [.] cmp_end    [.] lfsr_cond                   5

- Perf report on the guest:
Samples: 92K of event 'cycles', Event count (approx.): 92544
Overhead  Command   Source Shared Object  Source Symbol  Target Symbol  Basic Block Cycles
  12.03%  br_instr  br_instr              [.] cmp_end    [.] lfsr_cond                   1
  11.09%  br_instr  br_instr              [.] lfsr_cond  [.] cmp_end                     5
   8.57%  br_instr  br_instr              [.] lfsr_cond  [.] cmp_end                     4
   5.08%  br_instr  br_instr              [.] lfsr_cond  [.] cmp_end                     6
   5.06%  br_instr  br_instr              [.] cmp_end    [.] lfsr_cond                  20
   4.87%  br_instr  br_instr              [.] cmp_end    [.] lfsr_cond                   6
   4.70%  br_instr  br_instr              [.] cmp_end    [.] lfsr_cond                   5

Conclusion: the profiling results on the guest are similar to those on the host.

Like Xu (10):
perf/x86/core: Refactor hw->idx checks and cleanup
perf/x86/lbr: Add interface to get LBR information
perf/x86: Add constraint to create guest LBR event without hw counter
perf/x86: Keep LBR records unchanged in host context for guest usage
KVM: vmx/pmu: Expose LBR to guest via MSR_IA32_PERF_CAPABILITIES
KVM: vmx/pmu: Unmask LBR fields in the MSR_IA32_DEBUGCTLMSR emulation
KVM: vmx/pmu: Pass-through LBR msrs when guest LBR event is scheduled
KVM: vmx/pmu: Emulate legacy freezing LBRs on virtual PMI
KVM: vmx/pmu: Reduce the overhead of LBR pass-through or cancellation
KVM: vmx/pmu: Release guest LBR event via lazy release mechanism

Wei Wang (1):
perf/x86: Fix variable types for LBR registers

Qemu-devel:
target/i386: define a MSR based feature word - FEAT_PERF_CAPABILITIES
target/i386: add -cpu,lbr=true support to enable guest LBR

arch/x86/events/core.c | 26 +--
arch/x86/events/intel/core.c | 109 ++++++++-----
arch/x86/events/intel/lbr.c | 51 +++++-
arch/x86/events/perf_event.h | 8 +-
arch/x86/include/asm/perf_event.h | 34 +++-
arch/x86/kvm/pmu.c | 12 +-
arch/x86/kvm/pmu.h | 5 +
arch/x86/kvm/vmx/capabilities.h | 23 ++-
arch/x86/kvm/vmx/pmu_intel.c | 253 +++++++++++++++++++++++++++++-
arch/x86/kvm/vmx/vmx.c | 86 +++++++++-
arch/x86/kvm/vmx/vmx.h | 17 ++
arch/x86/kvm/x86.c | 13 --
12 files changed, 559 insertions(+), 78 deletions(-)

--
2.21.3


2020-06-13 08:18:29

by Like Xu

Subject: [PATCH v12 02/11] perf/x86/core: Refactor hw->idx checks and cleanup

For intel_pmu_en/disable_event(), reorder the branch checks on hw->idx
and sort them by probability: gp, fixed, bts, others.

Clean up the x86_assign_hw_event() by converting multiple if-else
statements to a switch statement.

To skip x86_perf_event_update() and x86_perf_event_set_period() early,
replace the "idx == INTEL_PMC_IDX_FIXED_BTS" check with the more
generic '!hwc->event_base', since event_base should be 0 for all
non-gp/fixed cases.

Wrap the related bit operations into intel_set/clear_masks() and make
the main path cleaner and more readable.

No functional changes.

Original-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Like Xu <[email protected]>
---
arch/x86/events/core.c | 25 +++++++----
arch/x86/events/intel/core.c | 85 +++++++++++++++++++-----------------
2 files changed, 62 insertions(+), 48 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 9e63ee50b19a..9a5056472b67 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -71,10 +71,9 @@ u64 x86_perf_event_update(struct perf_event *event)
struct hw_perf_event *hwc = &event->hw;
int shift = 64 - x86_pmu.cntval_bits;
u64 prev_raw_count, new_raw_count;
- int idx = hwc->idx;
u64 delta;

- if (idx == INTEL_PMC_IDX_FIXED_BTS)
+ if (unlikely(!hwc->event_base))
return 0;

/*
@@ -1097,22 +1096,30 @@ static inline void x86_assign_hw_event(struct perf_event *event,
struct cpu_hw_events *cpuc, int i)
{
struct hw_perf_event *hwc = &event->hw;
+ int idx;

- hwc->idx = cpuc->assign[i];
+ idx = hwc->idx = cpuc->assign[i];
hwc->last_cpu = smp_processor_id();
hwc->last_tag = ++cpuc->tags[i];

- if (hwc->idx == INTEL_PMC_IDX_FIXED_BTS) {
+ switch (hwc->idx) {
+ case INTEL_PMC_IDX_FIXED_BTS:
hwc->config_base = 0;
hwc->event_base = 0;
- } else if (hwc->idx >= INTEL_PMC_IDX_FIXED) {
+ break;
+
+ case INTEL_PMC_IDX_FIXED ... INTEL_PMC_IDX_FIXED_BTS-1:
hwc->config_base = MSR_ARCH_PERFMON_FIXED_CTR_CTRL;
- hwc->event_base = MSR_ARCH_PERFMON_FIXED_CTR0 + (hwc->idx - INTEL_PMC_IDX_FIXED);
- hwc->event_base_rdpmc = (hwc->idx - INTEL_PMC_IDX_FIXED) | 1<<30;
- } else {
+ hwc->event_base = MSR_ARCH_PERFMON_FIXED_CTR0 +
+ (idx - INTEL_PMC_IDX_FIXED);
+ hwc->event_base_rdpmc = (idx - INTEL_PMC_IDX_FIXED) | 1<<30;
+ break;
+
+ default:
hwc->config_base = x86_pmu_config_addr(hwc->idx);
hwc->event_base = x86_pmu_event_addr(hwc->idx);
hwc->event_base_rdpmc = x86_pmu_rdpmc_index(hwc->idx);
+ break;
}
}

@@ -1233,7 +1240,7 @@ int x86_perf_event_set_period(struct perf_event *event)
s64 period = hwc->sample_period;
int ret = 0, idx = hwc->idx;

- if (idx == INTEL_PMC_IDX_FIXED_BTS)
+ if (unlikely(!hwc->event_base))
return 0;

/*
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index ca35c8b5ee10..8dac4c61bf76 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2136,8 +2136,35 @@ static inline void intel_pmu_ack_status(u64 ack)
wrmsrl(MSR_CORE_PERF_GLOBAL_OVF_CTRL, ack);
}

-static void intel_pmu_disable_fixed(struct hw_perf_event *hwc)
+static inline bool event_is_checkpointed(struct perf_event *event)
+{
+ return unlikely(event->hw.config & HSW_IN_TX_CHECKPOINTED) != 0;
+}
+
+static inline void intel_set_masks(struct perf_event *event, int idx)
+{
+ struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+
+ if (event->attr.exclude_host)
+ __set_bit(idx, (unsigned long *)&cpuc->intel_ctrl_guest_mask);
+ if (event->attr.exclude_guest)
+ __set_bit(idx, (unsigned long *)&cpuc->intel_ctrl_host_mask);
+ if (event_is_checkpointed(event))
+ __set_bit(idx, (unsigned long *)&cpuc->intel_cp_status);
+}
+
+static inline void intel_clear_masks(struct perf_event *event, int idx)
{
+ struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+
+ __clear_bit(idx, (unsigned long *)&cpuc->intel_ctrl_guest_mask);
+ __clear_bit(idx, (unsigned long *)&cpuc->intel_ctrl_host_mask);
+ __clear_bit(idx, (unsigned long *)&cpuc->intel_cp_status);
+}
+
+static void intel_pmu_disable_fixed(struct perf_event *event)
+{
+ struct hw_perf_event *hwc = &event->hw;
int idx = hwc->idx - INTEL_PMC_IDX_FIXED;
u64 ctrl_val, mask;

@@ -2148,31 +2175,22 @@ static void intel_pmu_disable_fixed(struct hw_perf_event *hwc)
wrmsrl(hwc->config_base, ctrl_val);
}

-static inline bool event_is_checkpointed(struct perf_event *event)
-{
- return (event->hw.config & HSW_IN_TX_CHECKPOINTED) != 0;
-}
-
static void intel_pmu_disable_event(struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
- struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+ int idx = hwc->idx;

- if (unlikely(hwc->idx == INTEL_PMC_IDX_FIXED_BTS)) {
+ if (idx < INTEL_PMC_IDX_FIXED) {
+ intel_clear_masks(event, idx);
+ x86_pmu_disable_event(event);
+ } else if (idx < INTEL_PMC_IDX_FIXED_BTS) {
+ intel_clear_masks(event, idx);
+ intel_pmu_disable_fixed(event);
+ } else if (idx == INTEL_PMC_IDX_FIXED_BTS) {
intel_pmu_disable_bts();
intel_pmu_drain_bts_buffer();
- return;
}

- cpuc->intel_ctrl_guest_mask &= ~(1ull << hwc->idx);
- cpuc->intel_ctrl_host_mask &= ~(1ull << hwc->idx);
- cpuc->intel_cp_status &= ~(1ull << hwc->idx);
-
- if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL))
- intel_pmu_disable_fixed(hwc);
- else
- x86_pmu_disable_event(event);
-
/*
* Needs to be called after x86_pmu_disable_event,
* so we don't trigger the event without PEBS bit set.
@@ -2238,33 +2256,22 @@ static void intel_pmu_enable_fixed(struct perf_event *event)
static void intel_pmu_enable_event(struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
- struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
-
- if (unlikely(hwc->idx == INTEL_PMC_IDX_FIXED_BTS)) {
- if (!__this_cpu_read(cpu_hw_events.enabled))
- return;
-
- intel_pmu_enable_bts(hwc->config);
- return;
- }
-
- if (event->attr.exclude_host)
- cpuc->intel_ctrl_guest_mask |= (1ull << hwc->idx);
- if (event->attr.exclude_guest)
- cpuc->intel_ctrl_host_mask |= (1ull << hwc->idx);
-
- if (unlikely(event_is_checkpointed(event)))
- cpuc->intel_cp_status |= (1ull << hwc->idx);
+ int idx = hwc->idx;

if (unlikely(event->attr.precise_ip))
intel_pmu_pebs_enable(event);

- if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL)) {
+ if (idx < INTEL_PMC_IDX_FIXED) {
+ intel_set_masks(event, idx);
+ __x86_pmu_enable_event(hwc, ARCH_PERFMON_EVENTSEL_ENABLE);
+ } else if (idx < INTEL_PMC_IDX_FIXED_BTS) {
+ intel_set_masks(event, idx);
intel_pmu_enable_fixed(event);
- return;
+ } else if (idx == INTEL_PMC_IDX_FIXED_BTS) {
+ if (!__this_cpu_read(cpu_hw_events.enabled))
+ return;
+ intel_pmu_enable_bts(hwc->config);
}
-
- __x86_pmu_enable_event(hwc, ARCH_PERFMON_EVENTSEL_ENABLE);
}

static void intel_pmu_add_event(struct perf_event *event)
--
2.21.3

2020-06-13 08:18:33

by Like Xu

Subject: [PATCH v12 06/11] KVM: vmx/pmu: Expose LBR to guest via MSR_IA32_PERF_CAPABILITIES

Bits [0,5] of the read-only MSR_IA32_PERF_CAPABILITIES describe the
format of the records stored in the LBR stack. Userspace can expose
LBR to the guest when the host supports LBR, the exact host-supported
LBR format value is written to the guest's MSR_IA32_PERF_CAPABILITIES,
and the vcpu model is compatible.
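
As a quick illustration of that bit layout (a trivial sketch using the
PMU_CAP_LBR_FMT mask added by this patch):

	static inline u64 lbr_format(u64 perf_capabilities)
	{
		/* MSR_IA32_PERF_CAPABILITIES bits [0,5] */
		return perf_capabilities & PMU_CAP_LBR_FMT;
	}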

Signed-off-by: Like Xu <[email protected]>
---
arch/x86/kvm/vmx/capabilities.h | 11 ++++++-
arch/x86/kvm/vmx/pmu_intel.c | 52 +++++++++++++++++++++++++++++++--
arch/x86/kvm/vmx/vmx.h | 6 ++++
3 files changed, 65 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/vmx/capabilities.h b/arch/x86/kvm/vmx/capabilities.h
index 4bbd8b448d22..b633a90320ee 100644
--- a/arch/x86/kvm/vmx/capabilities.h
+++ b/arch/x86/kvm/vmx/capabilities.h
@@ -19,6 +19,7 @@ extern int __read_mostly pt_mode;
#define PT_MODE_HOST_GUEST 1

#define PMU_CAP_FW_WRITES (1ULL << 13)
+#define PMU_CAP_LBR_FMT 0x3f

struct nested_vmx_msrs {
/*
@@ -375,7 +376,15 @@ static inline u64 vmx_get_perf_capabilities(void)
* Since counters are virtualized, KVM would support full
* width counting unconditionally, even if the host lacks it.
*/
- return PMU_CAP_FW_WRITES;
+ u64 perf_cap = PMU_CAP_FW_WRITES;
+
+ if (boot_cpu_has(X86_FEATURE_PDCM))
+ rdmsrl(MSR_IA32_PERF_CAPABILITIES, perf_cap);
+
+ /* From now on, KVM will support LBR. */
+ perf_cap |= perf_cap & PMU_CAP_LBR_FMT;
+
+ return perf_cap;
}

#endif /* __KVM_X86_VMX_CAPS_H */
diff --git a/arch/x86/kvm/vmx/pmu_intel.c b/arch/x86/kvm/vmx/pmu_intel.c
index bdcce65c7a1d..a953c7d633f6 100644
--- a/arch/x86/kvm/vmx/pmu_intel.c
+++ b/arch/x86/kvm/vmx/pmu_intel.c
@@ -168,6 +168,13 @@ static inline struct kvm_pmc *get_fw_gp_pmc(struct kvm_pmu *pmu, u32 msr)
return get_gp_pmc(pmu, msr, MSR_IA32_PMC0);
}

+static inline bool lbr_is_enabled(struct kvm_vcpu *vcpu)
+{
+ struct x86_pmu_lbr *lbr = &to_vmx(vcpu)->lbr_desc.lbr;
+
+ return lbr->nr && (vcpu->arch.perf_capabilities & PMU_CAP_LBR_FMT);
+}
+
static bool intel_is_valid_msr(struct kvm_vcpu *vcpu, u32 msr)
{
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
@@ -251,6 +258,30 @@ static int intel_pmu_get_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
return 1;
}

+static inline bool lbr_fmt_is_matched(u64 data)
+{
+ return (data & PMU_CAP_LBR_FMT) ==
+ (vmx_get_perf_capabilities() & PMU_CAP_LBR_FMT);
+}
+
+static inline bool lbr_is_compatible(struct kvm_vcpu *vcpu)
+{
+ struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+
+ if (pmu->version < 2)
+ return false;
+
+ /*
+ * As a first step, a guest could only enable LBR feature if its cpu
+ * model is the same as the host because the LBR registers would
+ * be pass-through to the guest and they're model specific.
+ */
+ if (boot_cpu_data.x86_model != guest_cpuid_model(vcpu))
+ return false;
+
+ return true;
+}
+
static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
{
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
@@ -295,6 +326,14 @@ static int intel_pmu_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr_info)
if (guest_cpuid_has(vcpu, X86_FEATURE_PDCM) ?
(data & ~vmx_get_perf_capabilities()) : data)
return 1;
+ if (data & PMU_CAP_LBR_FMT) {
+ if (!lbr_fmt_is_matched(data))
+ return 1;
+ if (!lbr_is_compatible(vcpu))
+ return 1;
+ if (x86_perf_get_lbr(&to_vmx(vcpu)->lbr_desc.lbr))
+ return 1;
+ }
vcpu->arch.perf_capabilities = data;
return 0;
default:
@@ -337,6 +376,7 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
struct kvm_cpuid_entry2 *entry;
union cpuid10_eax eax;
union cpuid10_edx edx;
+ struct lbr_desc *lbr_desc = &to_vmx(vcpu)->lbr_desc;

pmu->nr_arch_gp_counters = 0;
pmu->nr_arch_fixed_counters = 0;
@@ -344,7 +384,6 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
pmu->counter_bitmask[KVM_PMC_FIXED] = 0;
pmu->version = 0;
pmu->reserved_bits = 0xffffffff00200000ull;
- vcpu->arch.perf_capabilities = 0;

entry = kvm_find_cpuid_entry(vcpu, 0xa, 0);
if (!entry)
@@ -357,8 +396,6 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
return;

perf_get_x86_pmu_capability(&x86_pmu);
- if (guest_cpuid_has(vcpu, X86_FEATURE_PDCM))
- vcpu->arch.perf_capabilities = vmx_get_perf_capabilities();

pmu->nr_arch_gp_counters = min_t(int, eax.split.num_counters,
x86_pmu.num_counters_gp);
@@ -397,6 +434,10 @@ static void intel_pmu_refresh(struct kvm_vcpu *vcpu)
bitmap_set(pmu->all_valid_pmc_idx,
INTEL_PMC_MAX_GENERIC, pmu->nr_arch_fixed_counters);

+ if ((vcpu->arch.perf_capabilities & PMU_CAP_LBR_FMT) &&
+ x86_perf_get_lbr(&lbr_desc->lbr))
+ vcpu->arch.perf_capabilities &= ~PMU_CAP_LBR_FMT;
+
nested_vmx_pmu_entry_exit_ctls_update(vcpu);
}

@@ -404,6 +445,7 @@ static void intel_pmu_init(struct kvm_vcpu *vcpu)
{
int i;
struct kvm_pmu *pmu = vcpu_to_pmu(vcpu);
+ struct lbr_desc *lbr_desc = &to_vmx(vcpu)->lbr_desc;

for (i = 0; i < INTEL_PMC_MAX_GENERIC; i++) {
pmu->gp_counters[i].type = KVM_PMC_GP;
@@ -418,6 +460,10 @@ static void intel_pmu_init(struct kvm_vcpu *vcpu)
pmu->fixed_counters[i].idx = i + INTEL_PMC_IDX_FIXED;
pmu->fixed_counters[i].current_config = 0;
}
+
+ vcpu->arch.perf_capabilities = guest_cpuid_has(vcpu, X86_FEATURE_PDCM) ?
+ vmx_get_perf_capabilities() : 0;
+ lbr_desc->lbr.nr = 0;
}

static void intel_pmu_reset(struct kvm_vcpu *vcpu)
diff --git a/arch/x86/kvm/vmx/vmx.h b/arch/x86/kvm/vmx/vmx.h
index 8a83b5edc820..ef24338b194d 100644
--- a/arch/x86/kvm/vmx/vmx.h
+++ b/arch/x86/kvm/vmx/vmx.h
@@ -91,6 +91,11 @@ struct pt_desc {
struct pt_ctx guest;
};

+struct lbr_desc {
+ /* Basic information about LBR records. */
+ struct x86_pmu_lbr lbr;
+};
+
/*
* The nested_vmx structure is part of vcpu_vmx, and holds information we need
* for correct emulation of VMX (i.e., nested VMX) on this vcpu.
@@ -302,6 +307,7 @@ struct vcpu_vmx {
u64 ept_pointer;

struct pt_desc pt_desc;
+ struct lbr_desc lbr_desc;
};

enum ept_pointers_status {
--
2.21.3

2020-06-23 13:15:47

by Like Xu

Subject: Re: [PATCH v12 00/11] Guest Last Branch Recording Enabling

On 2020/6/13 16:09, Like Xu wrote:
> Hi all,
>
> Please help review this new version for the kernel 5.9 release.
>
> Now, you may apply the last two qemu-devel patches to the upstream
> qemu and try the guest LBR feature with '-cpu host' command line.
>
> v11->v12 Changelog:
> - apply "Signed-off-by" from PeterZ and his code for the perf subsystem;
> - add validity checks before expose LBR via MSR_IA32_PERF_CAPABILITIES;
> - refactor MSR_IA32_DEBUGCTLMSR emulation with validity check;
> - reorder "perf_event_attr" fields according to how they're declared;
> - replace event_is_oncpu() with "event->state" check;
> - make LBR emulation specific to vmx rather than generic x86;
> - move pass-through LBR code to vmx.c instead of pmu_intel.c;
> - add vmx_lbr_en/disable_passthrough layer to make code readable;
> - rewrite pmu availability check with vmx_passthrough_lbr_msrs();
>
> You may check more details in each commit.
>
> Previous:
> https://lore.kernel.org/kvm/[email protected]/
>
> ---
...
>
> Wei Wang (1):
> perf/x86: Fix variable types for LBR registers
>
> Like Xu (10):
> perf/x86/core: Refactor hw->idx checks and cleanup
> perf/x86/lbr: Add interface to get LBR information
> perf/x86: Add constraint to create guest LBR event without hw counter
> perf/x86: Keep LBR records unchanged in host context for guest usage

Hi Peter,
Would you like to add "Acked-by" to the first three perf patches?

> KVM: vmx/pmu: Expose LBR to guest via MSR_IA32_PERF_CAPABILITIES
> KVM: vmx/pmu: Unmask LBR fields in the MSR_IA32_DEBUGCTLMSR emulation
> KVM: vmx/pmu: Pass-through LBR msrs when guest LBR event is scheduled
> KVM: vmx/pmu: Emulate legacy freezing LBRs on virtual PMI
> KVM: vmx/pmu: Reduce the overhead of LBR pass-through or cancellation
> KVM: vmx/pmu: Release guest LBR event via lazy release mechanism
>

Hi Paolo,
Would you like to take a moment to review the KVM part of this feature?

Thanks,
Like Xu

>
> Qemu-devel:
> target/i386: add -cpu,lbr=true support to enable guest LBR
>
> arch/x86/events/core.c | 26 +--
> arch/x86/events/intel/core.c | 109 ++++++++-----
> arch/x86/events/intel/lbr.c | 51 +++++-
> arch/x86/events/perf_event.h | 8 +-
> arch/x86/include/asm/perf_event.h | 34 +++-
> arch/x86/kvm/pmu.c | 12 +-
> arch/x86/kvm/pmu.h | 5 +
> arch/x86/kvm/vmx/capabilities.h | 23 ++-
> arch/x86/kvm/vmx/pmu_intel.c | 253 +++++++++++++++++++++++++++++-
> arch/x86/kvm/vmx/vmx.c | 86 +++++++++-
> arch/x86/kvm/vmx/vmx.h | 17 ++
> arch/x86/kvm/x86.c | 13 --
> 12 files changed, 559 insertions(+), 78 deletions(-)
>

2020-07-01 02:39:58

by Like Xu

Subject: Re: [PATCH v12 00/11] Guest Last Branch Recording Enabling

A friendly ping.

If there is room for improvement, please let me know.

On 2020/6/23 21:13, Like Xu wrote:
> On 2020/6/13 16:09, Like Xu wrote:
>> Hi all,
>>
>> Please help review this new version for the kernel 5.9 release.
>>
>> Now, you may apply the last two qemu-devel patches to the upstream
>> qemu and try the guest LBR feature with '-cpu host' command line.
>>
>> v11->v12 Changelog:
>> - apply "Signed-off-by" from PeterZ and his code for the perf subsystem;
>> - add validity checks before expose LBR via MSR_IA32_PERF_CAPABILITIES;
>> - refactor MSR_IA32_DEBUGCTLMSR emulation with validity check;
>> - reorder "perf_event_attr" fields according to how they're declared;
>> - replace event_is_oncpu() with "event->state" check;
>> - make LBR emulation specific to vmx rather than generic x86;
>> - move pass-through LBR code to vmx.c instead of pmu_intel.c;
>> - add vmx_lbr_en/disable_passthrough layer to make code readable;
>> - rewrite pmu availability check with vmx_passthrough_lbr_msrs();
>>
>> You may check more details in each commit.
>>
>> Previous:
>> https://lore.kernel.org/kvm/[email protected]/
>>
>> ---
> ...
>>
>> Wei Wang (1):
>>   perf/x86: Fix variable types for LBR registers
>>
>> Like Xu (10):
>>    perf/x86/core: Refactor hw->idx checks and cleanup
>>    perf/x86/lbr: Add interface to get LBR information
>>    perf/x86: Add constraint to create guest LBR event without hw counter
>>    perf/x86: Keep LBR records unchanged in host context for guest usage
>
> Hi Peter,
> Would you like to add "Acked-by" to the first three perf patches?
>
>>    KVM: vmx/pmu: Expose LBR to guest via MSR_IA32_PERF_CAPABILITIES
>>    KVM: vmx/pmu: Unmask LBR fields in the MSR_IA32_DEBUGCTLMSR emulation
>>    KVM: vmx/pmu: Pass-through LBR msrs when guest LBR event is scheduled
>>    KVM: vmx/pmu: Emulate legacy freezing LBRs on virtual PMI
>>    KVM: vmx/pmu: Reduce the overhead of LBR pass-through or cancellation
>>    KVM: vmx/pmu: Release guest LBR event via lazy release mechanism
>>
>
> Hi Paolo,
> Would you like to take a moment to review the KVM part of this feature?
>
> Thanks,
> Like Xu
>
>>
>> Qemu-devel:
>>    target/i386: add -cpu,lbr=true support to enable guest LBR
>>
>>   arch/x86/events/core.c            |  26 +--
>>   arch/x86/events/intel/core.c      | 109 ++++++++-----
>>   arch/x86/events/intel/lbr.c       |  51 +++++-
>>   arch/x86/events/perf_event.h      |   8 +-
>>   arch/x86/include/asm/perf_event.h |  34 +++-
>>   arch/x86/kvm/pmu.c                |  12 +-
>>   arch/x86/kvm/pmu.h                |   5 +
>>   arch/x86/kvm/vmx/capabilities.h   |  23 ++-
>>   arch/x86/kvm/vmx/pmu_intel.c      | 253 +++++++++++++++++++++++++++++-
>>   arch/x86/kvm/vmx/vmx.c            |  86 +++++++++-
>>   arch/x86/kvm/vmx/vmx.h            |  17 ++
>>   arch/x86/kvm/x86.c                |  13 --
>>   12 files changed, 559 insertions(+), 78 deletions(-)
>>
>

2020-07-02 07:42:11

by Peter Zijlstra

Subject: Re: [PATCH v12 00/11] Guest Last Branch Recording Enabling

On Sat, Jun 13, 2020 at 04:09:45PM +0800, Like Xu wrote:
> Like Xu (10):
> perf/x86/core: Refactor hw->idx checks and cleanup
> perf/x86/lbr: Add interface to get LBR information
> perf/x86: Add constraint to create guest LBR event without hw counter
> perf/x86: Keep LBR records unchanged in host context for guest usage

> Wei Wang (1):
> perf/x86: Fix variable types for LBR registers

> arch/x86/events/core.c | 26 +--
> arch/x86/events/intel/core.c | 109 ++++++++-----
> arch/x86/events/intel/lbr.c | 51 +++++-
> arch/x86/events/perf_event.h | 8 +-
> arch/x86/include/asm/perf_event.h | 34 +++-

These look good to me; but at the same time Kan is sending me
Architectural LBR patches.

Kan, if I take these perf patches and stick them in a tip/perf/vlbr
topic branch, can you rebase the arch lbr stuff on top, or is there
anything in the arch-lbr series that badly conflicts with this work?

Paolo, would that topic branch work for you too, to then stick these
patches on top?

> KVM: vmx/pmu: Expose LBR to guest via MSR_IA32_PERF_CAPABILITIES
> KVM: vmx/pmu: Unmask LBR fields in the MSR_IA32_DEBUGCTLMSR emulation
> KVM: vmx/pmu: Pass-through LBR msrs when guest LBR event is scheduled
> KVM: vmx/pmu: Emulate legacy freezing LBRs on virtual PMI
> KVM: vmx/pmu: Reduce the overhead of LBR pass-through or cancellation
> KVM: vmx/pmu: Release guest LBR event via lazy release mechanism

> arch/x86/kvm/pmu.c | 12 +-
> arch/x86/kvm/pmu.h | 5 +
> arch/x86/kvm/vmx/capabilities.h | 23 ++-
> arch/x86/kvm/vmx/pmu_intel.c | 253 +++++++++++++++++++++++++++++-
> arch/x86/kvm/vmx/vmx.c | 86 +++++++++-
> arch/x86/kvm/vmx/vmx.h | 17 ++
> arch/x86/kvm/x86.c | 13 --

> 12 files changed, 559 insertions(+), 78 deletions(-)

2020-07-02 13:11:50

by Liang, Kan

Subject: Re: [PATCH v12 00/11] Guest Last Branch Recording Enabling



On 7/2/2020 3:40 AM, Peter Zijlstra wrote:
> On Sat, Jun 13, 2020 at 04:09:45PM +0800, Like Xu wrote:
>> Like Xu (10):
>> perf/x86/core: Refactor hw->idx checks and cleanup
>> perf/x86/lbr: Add interface to get LBR information
>> perf/x86: Add constraint to create guest LBR event without hw counter
>> perf/x86: Keep LBR records unchanged in host context for guest usage
>
>> Wei Wang (1):
>> perf/x86: Fix variable types for LBR registers
>
>> arch/x86/events/core.c | 26 +--
>> arch/x86/events/intel/core.c | 109 ++++++++-----
>> arch/x86/events/intel/lbr.c | 51 +++++-
>> arch/x86/events/perf_event.h | 8 +-
>> arch/x86/include/asm/perf_event.h | 34 +++-
>
> These look good to me; but at the same time Kan is sending me
> Architectural LBR patches.
>
> Kan, if I take these perf patches and stick them in a tip/perf/vlbr
> topic branch, can you rebase the arch lbr stuff on top, or is there
> anything in the arch-lbr series that badly conflicts with this work?
>

Yes, I can rebase the arch lbr patches on top of them.
Please push the tip/perf/vlbr branch, so I can pull and rebase my patches.

Thanks,
Kan

2020-07-02 14:01:45

by Peter Zijlstra

Subject: Re: [PATCH v12 00/11] Guest Last Branch Recording Enabling

On Thu, Jul 02, 2020 at 09:11:06AM -0400, Liang, Kan wrote:
> On 7/2/2020 3:40 AM, Peter Zijlstra wrote:
> > On Sat, Jun 13, 2020 at 04:09:45PM +0800, Like Xu wrote:
> > > Like Xu (10):
> > > perf/x86/core: Refactor hw->idx checks and cleanup
> > > perf/x86/lbr: Add interface to get LBR information
> > > perf/x86: Add constraint to create guest LBR event without hw counter
> > > perf/x86: Keep LBR records unchanged in host context for guest usage
> >
> > > Wei Wang (1):
> > > perf/x86: Fix variable types for LBR registers
> >
> > > arch/x86/events/core.c | 26 +--
> > > arch/x86/events/intel/core.c | 109 ++++++++-----
> > > arch/x86/events/intel/lbr.c | 51 +++++-
> > > arch/x86/events/perf_event.h | 8 +-
> > > arch/x86/include/asm/perf_event.h | 34 +++-
> >
> > These look good to me; but at the same time Kan is sending me
> > Architectural LBR patches.
> >
> > Kan, if I take these perf patches and stick them in a tip/perf/vlbr
> > topic branch, can you rebase the arch lbr stuff on top, or is there
> > anything in the arch-lbr series that badly conflicts with this work?
> >
>
> Yes, I can rebase the arch lbr patches on top of them.
> Please push the tip/perf/vlbr branch, so I can pull and rebase my patches.

For now I have:

git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git perf/vlbr

Once the 0day robot comes back all-green, I'll push it out to
tip/perf/vlbr and merge it into tip/perf/core.

Thanks!

2020-07-03 07:58:05

by Peter Zijlstra

Subject: Re: [PATCH v12 00/11] Guest Last Branch Recording Enabling

On Thu, Jul 02, 2020 at 03:58:42PM +0200, Peter Zijlstra wrote:
> On Thu, Jul 02, 2020 at 09:11:06AM -0400, Liang, Kan wrote:
> > On 7/2/2020 3:40 AM, Peter Zijlstra wrote:
> > > On Sat, Jun 13, 2020 at 04:09:45PM +0800, Like Xu wrote:
> > > > Like Xu (10):
> > > > perf/x86/core: Refactor hw->idx checks and cleanup
> > > > perf/x86/lbr: Add interface to get LBR information
> > > > perf/x86: Add constraint to create guest LBR event without hw counter
> > > > perf/x86: Keep LBR records unchanged in host context for guest usage
> > >
> > > > Wei Wang (1):
> > > > perf/x86: Fix variable types for LBR registers
> > >
> > > > arch/x86/events/core.c | 26 +--
> > > > arch/x86/events/intel/core.c | 109 ++++++++-----
> > > > arch/x86/events/intel/lbr.c | 51 +++++-
> > > > arch/x86/events/perf_event.h | 8 +-
> > > > arch/x86/include/asm/perf_event.h | 34 +++-
> > >
> > > These look good to me; but at the same time Kan is sending me
> > > Architectural LBR patches.
> > >
> > > Kan, if I take these perf patches and stick them in a tip/perf/vlbr
> > > topic branch, can you rebase the arch lbr stuff on top, or is there
> > > anything in the arch-lbr series that badly conflicts with this work?
> > >
> >
> > Yes, I can rebase the arch lbr patches on top of them.
> > Please push the tip/perf/vlbr branch, so I can pull and rebase my patches.
>
> For now I have:
>
> git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git perf/vlbr
>
> Once the 0day robot comes back all-green, I'll push it out to
> tip/perf/vlbr and merge it into tip/perf/core.

tip/perf/vlbr now exists, thanks!

2020-07-03 08:02:09

by tip-bot2 for Jacob Pan

Subject: [tip: perf/core] perf/x86/core: Refactor hw->idx checks and cleanup

The following commit has been merged into the perf/core branch of tip:

Commit-ID: 027440b5d426a51f33b515bbd236cc479d1e051f
Gitweb: https://git.kernel.org/tip/027440b5d426a51f33b515bbd236cc479d1e051f
Author: Like Xu <[email protected]>
AuthorDate: Sat, 13 Jun 2020 16:09:47 +08:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Thu, 02 Jul 2020 15:51:46 +02:00

perf/x86/core: Refactor hw->idx checks and cleanup

For intel_pmu_en/disable_event(), reorder the branch checks on hw->idx
and sort them by probability: gp, fixed, bts, others.

Clean up the x86_assign_hw_event() by converting multiple if-else
statements to a switch statement.

To skip x86_perf_event_update() and x86_perf_event_set_period() early,
replace the "idx == INTEL_PMC_IDX_FIXED_BTS" check with the more
generic '!hwc->event_base', since event_base should be 0 for all
non-gp/fixed cases.

Wrap the related bit operations into intel_set/clear_masks() and make
the main path cleaner and more readable.

No functional changes.

Signed-off-by: Like Xu <[email protected]>
Original-by: Peter Zijlstra (Intel) <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
arch/x86/events/core.c | 25 ++++++----
arch/x86/events/intel/core.c | 85 ++++++++++++++++++-----------------
2 files changed, 62 insertions(+), 48 deletions(-)

diff --git a/arch/x86/events/core.c b/arch/x86/events/core.c
index 4103665..15cb7af 100644
--- a/arch/x86/events/core.c
+++ b/arch/x86/events/core.c
@@ -71,10 +71,9 @@ u64 x86_perf_event_update(struct perf_event *event)
struct hw_perf_event *hwc = &event->hw;
int shift = 64 - x86_pmu.cntval_bits;
u64 prev_raw_count, new_raw_count;
- int idx = hwc->idx;
u64 delta;

- if (idx == INTEL_PMC_IDX_FIXED_BTS)
+ if (unlikely(!hwc->event_base))
return 0;

/*
@@ -1097,22 +1096,30 @@ static inline void x86_assign_hw_event(struct perf_event *event,
struct cpu_hw_events *cpuc, int i)
{
struct hw_perf_event *hwc = &event->hw;
+ int idx;

- hwc->idx = cpuc->assign[i];
+ idx = hwc->idx = cpuc->assign[i];
hwc->last_cpu = smp_processor_id();
hwc->last_tag = ++cpuc->tags[i];

- if (hwc->idx == INTEL_PMC_IDX_FIXED_BTS) {
+ switch (hwc->idx) {
+ case INTEL_PMC_IDX_FIXED_BTS:
hwc->config_base = 0;
hwc->event_base = 0;
- } else if (hwc->idx >= INTEL_PMC_IDX_FIXED) {
+ break;
+
+ case INTEL_PMC_IDX_FIXED ... INTEL_PMC_IDX_FIXED_BTS-1:
hwc->config_base = MSR_ARCH_PERFMON_FIXED_CTR_CTRL;
- hwc->event_base = MSR_ARCH_PERFMON_FIXED_CTR0 + (hwc->idx - INTEL_PMC_IDX_FIXED);
- hwc->event_base_rdpmc = (hwc->idx - INTEL_PMC_IDX_FIXED) | 1<<30;
- } else {
+ hwc->event_base = MSR_ARCH_PERFMON_FIXED_CTR0 +
+ (idx - INTEL_PMC_IDX_FIXED);
+ hwc->event_base_rdpmc = (idx - INTEL_PMC_IDX_FIXED) | 1<<30;
+ break;
+
+ default:
hwc->config_base = x86_pmu_config_addr(hwc->idx);
hwc->event_base = x86_pmu_event_addr(hwc->idx);
hwc->event_base_rdpmc = x86_pmu_rdpmc_index(hwc->idx);
+ break;
}
}

@@ -1233,7 +1240,7 @@ int x86_perf_event_set_period(struct perf_event *event)
s64 period = hwc->sample_period;
int ret = 0, idx = hwc->idx;

- if (idx == INTEL_PMC_IDX_FIXED_BTS)
+ if (unlikely(!hwc->event_base))
return 0;

/*
diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index ca35c8b..8dac4c6 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -2136,8 +2136,35 @@ static inline void intel_pmu_ack_status(u64 ack)
wrmsrl(MSR_CORE_PERF_GLOBAL_OVF_CTRL, ack);
}

-static void intel_pmu_disable_fixed(struct hw_perf_event *hwc)
+static inline bool event_is_checkpointed(struct perf_event *event)
+{
+ return unlikely(event->hw.config & HSW_IN_TX_CHECKPOINTED) != 0;
+}
+
+static inline void intel_set_masks(struct perf_event *event, int idx)
+{
+ struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+
+ if (event->attr.exclude_host)
+ __set_bit(idx, (unsigned long *)&cpuc->intel_ctrl_guest_mask);
+ if (event->attr.exclude_guest)
+ __set_bit(idx, (unsigned long *)&cpuc->intel_ctrl_host_mask);
+ if (event_is_checkpointed(event))
+ __set_bit(idx, (unsigned long *)&cpuc->intel_cp_status);
+}
+
+static inline void intel_clear_masks(struct perf_event *event, int idx)
{
+ struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+
+ __clear_bit(idx, (unsigned long *)&cpuc->intel_ctrl_guest_mask);
+ __clear_bit(idx, (unsigned long *)&cpuc->intel_ctrl_host_mask);
+ __clear_bit(idx, (unsigned long *)&cpuc->intel_cp_status);
+}
+
+static void intel_pmu_disable_fixed(struct perf_event *event)
+{
+ struct hw_perf_event *hwc = &event->hw;
int idx = hwc->idx - INTEL_PMC_IDX_FIXED;
u64 ctrl_val, mask;

@@ -2148,31 +2175,22 @@ static void intel_pmu_disable_fixed(struct hw_perf_event *hwc)
wrmsrl(hwc->config_base, ctrl_val);
}

-static inline bool event_is_checkpointed(struct perf_event *event)
-{
- return (event->hw.config & HSW_IN_TX_CHECKPOINTED) != 0;
-}
-
static void intel_pmu_disable_event(struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
- struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
+ int idx = hwc->idx;

- if (unlikely(hwc->idx == INTEL_PMC_IDX_FIXED_BTS)) {
+ if (idx < INTEL_PMC_IDX_FIXED) {
+ intel_clear_masks(event, idx);
+ x86_pmu_disable_event(event);
+ } else if (idx < INTEL_PMC_IDX_FIXED_BTS) {
+ intel_clear_masks(event, idx);
+ intel_pmu_disable_fixed(event);
+ } else if (idx == INTEL_PMC_IDX_FIXED_BTS) {
intel_pmu_disable_bts();
intel_pmu_drain_bts_buffer();
- return;
}

- cpuc->intel_ctrl_guest_mask &= ~(1ull << hwc->idx);
- cpuc->intel_ctrl_host_mask &= ~(1ull << hwc->idx);
- cpuc->intel_cp_status &= ~(1ull << hwc->idx);
-
- if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL))
- intel_pmu_disable_fixed(hwc);
- else
- x86_pmu_disable_event(event);
-
/*
* Needs to be called after x86_pmu_disable_event,
* so we don't trigger the event without PEBS bit set.
@@ -2238,33 +2256,22 @@ static void intel_pmu_enable_fixed(struct perf_event *event)
static void intel_pmu_enable_event(struct perf_event *event)
{
struct hw_perf_event *hwc = &event->hw;
- struct cpu_hw_events *cpuc = this_cpu_ptr(&cpu_hw_events);
-
- if (unlikely(hwc->idx == INTEL_PMC_IDX_FIXED_BTS)) {
- if (!__this_cpu_read(cpu_hw_events.enabled))
- return;
-
- intel_pmu_enable_bts(hwc->config);
- return;
- }
-
- if (event->attr.exclude_host)
- cpuc->intel_ctrl_guest_mask |= (1ull << hwc->idx);
- if (event->attr.exclude_guest)
- cpuc->intel_ctrl_host_mask |= (1ull << hwc->idx);
-
- if (unlikely(event_is_checkpointed(event)))
- cpuc->intel_cp_status |= (1ull << hwc->idx);
+ int idx = hwc->idx;

if (unlikely(event->attr.precise_ip))
intel_pmu_pebs_enable(event);

- if (unlikely(hwc->config_base == MSR_ARCH_PERFMON_FIXED_CTR_CTRL)) {
+ if (idx < INTEL_PMC_IDX_FIXED) {
+ intel_set_masks(event, idx);
+ __x86_pmu_enable_event(hwc, ARCH_PERFMON_EVENTSEL_ENABLE);
+ } else if (idx < INTEL_PMC_IDX_FIXED_BTS) {
+ intel_set_masks(event, idx);
intel_pmu_enable_fixed(event);
- return;
+ } else if (idx == INTEL_PMC_IDX_FIXED_BTS) {
+ if (!__this_cpu_read(cpu_hw_events.enabled))
+ return;
+ intel_pmu_enable_bts(hwc->config);
}
-
- __x86_pmu_enable_event(hwc, ARCH_PERFMON_EVENTSEL_ENABLE);
}

static void intel_pmu_add_event(struct perf_event *event)

2020-07-03 08:05:09

by Xu, Like

Subject: Re: [PATCH v12 00/11] Guest Last Branch Recording Enabling

On 2020/7/3 15:56, Peter Zijlstra wrote:
> On Thu, Jul 02, 2020 at 03:58:42PM +0200, Peter Zijlstra wrote:
>> On Thu, Jul 02, 2020 at 09:11:06AM -0400, Liang, Kan wrote:
>>> On 7/2/2020 3:40 AM, Peter Zijlstra wrote:
>>>> On Sat, Jun 13, 2020 at 04:09:45PM +0800, Like Xu wrote:
>>>>> Like Xu (10):
>>>>> perf/x86/core: Refactor hw->idx checks and cleanup
>>>>> perf/x86/lbr: Add interface to get LBR information
>>>>> perf/x86: Add constraint to create guest LBR event without hw counter
>>>>> perf/x86: Keep LBR records unchanged in host context for guest usage
>>>>> Wei Wang (1):
>>>>> perf/x86: Fix variable types for LBR registers
>>>>> arch/x86/events/core.c | 26 +--
>>>>> arch/x86/events/intel/core.c | 109 ++++++++-----
>>>>> arch/x86/events/intel/lbr.c | 51 +++++-
>>>>> arch/x86/events/perf_event.h | 8 +-
>>>>> arch/x86/include/asm/perf_event.h | 34 +++-
>>>> These look good to me; but at the same time Kan is sending me
>>>> Architectural LBR patches.
>>>>
>>>> Kan, if I take these perf patches and stick them in a tip/perf/vlbr
>>>> topic branch, can you rebase the arch lbr stuff on top, or is there
>>>> anything in the arch-lbr series that badly conflicts with this work?
>>>>
>>> Yes, I can rebase the arch lbr patches on top of them.
>>> Please push the tip/perf/vlbr branch, so I can pull and rebase my patches.
>> For now I have:
>>
>> git://git.kernel.org/pub/scm/linux/kernel/git/peterz/queue.git perf/vlbr
>>
>> Once the 0day robot comes back all-green, I'll push it out to
>> tip/perf/vlbr and merge it into tip/perf/core.
> tip/perf/vlbr now exists, thanks!
Hi Peter,

Thanks for your patience and professional support on this feature!

Thanks,
Like Xu

2020-07-08 13:38:25

by Andi Kleen

Subject: Re: [PATCH v12 06/11] KVM: vmx/pmu: Expose LBR to guest via MSR_IA32_PERF_CAPABILITIES

> + /*
> + * As a first step, a guest could only enable LBR feature if its cpu
> + * model is the same as the host because the LBR registers would
> + * be pass-through to the guest and they're model specific.
> + */
> + if (boot_cpu_data.x86_model != guest_cpuid_model(vcpu))
> + return false;

Could we relax this in a follow-on patch? (after this series is merged)

It's enough if the perf cap LBR version matches; we don't need a full
model number match. This would require a way to configure the LBR
version from qemu.

This would allow more flexibility, for example migration from
Icelake to Skylake and vice versa.
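
A relaxed check along those lines might look like this (a hypothetical
sketch, not code from this series; it reuses PMU_CAP_LBR_FMT and
vmx_get_perf_capabilities() from patch 06 and simply drops the model
comparison):

	/* Require only a matching LBR format, not a full model match. */
	static bool lbr_fmt_is_compatible(u64 data)
	{
		u64 host_fmt = vmx_get_perf_capabilities() & PMU_CAP_LBR_FMT;

		return (data & PMU_CAP_LBR_FMT) == host_fmt;
	}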

-Andi

2020-07-08 14:41:01

by Xu, Like

Subject: Re: [PATCH v12 06/11] KVM: vmx/pmu: Expose LBR to guest via MSR_IA32_PERF_CAPABILITIES

On 2020/7/8 21:36, Andi Kleen wrote:
>> + /*
>> + * As a first step, a guest could only enable LBR feature if its cpu
>> + * model is the same as the host because the LBR registers would
>> + * be pass-through to the guest and they're model specific.
>> + */
>> + if (boot_cpu_data.x86_model != guest_cpuid_model(vcpu))
>> + return false;
> Could we relax this in a follow-on patch? (after this series is merged)
Sure, there would be a follow-on patch to relax this check after it's merged.
>
> It's enough if the perf cap LBR version matches; we don't need a full
> model number match.
I assume you are referring to the LBR_FMT value in the perf_capabilities.
> This would require a way to configure the LBR version
> from qemu.
Sure, I may propose this configuration in the QEMU community.
>
> This would allow more flexibility, for example migration from
> Icelake to Skylake and vice versa.
Yes, we need this flexibility to cover as many platforms as possible.

Thanks,
Like Xu
>
> -Andi