2024-01-04 15:40:03

by Paolo Bonzini

Subject: [PATCH] KVM: x86/pmu: fix masking logic for MSR_CORE_PERF_GLOBAL_CTRL

When commit c59a1f106f5c ("KVM: x86/pmu: Add IA32_PEBS_ENABLE
MSR emulation for extended PEBS") switched the initialization of
cpuc->guest_switch_msrs to use compound literals, it screwed up
the boolean logic:

+ u64 pebs_mask = cpuc->pebs_enabled & x86_pmu.pebs_capable;
...
- arr[0].guest = intel_ctrl & ~cpuc->intel_ctrl_host_mask;
- arr[0].guest &= ~(cpuc->pebs_enabled & x86_pmu.pebs_capable);
+ .guest = intel_ctrl & (~cpuc->intel_ctrl_host_mask | ~pebs_mask),

Before the patch, the value of arr[0].guest would have been intel_ctrl &
~cpuc->intel_ctrl_host_mask & ~pebs_mask. The intent is to always treat
PEBS events as host-only because, while the guest runs, there is no way
to tell the processor about the virtual address where to put PEBS records
intended for the host.

Unfortunately, the new expression can be expanded to

(intel_ctrl & ~cpuc->intel_ctrl_host_mask) | (intel_ctrl & ~pebs_mask)

which makes no sense; it includes any bit that isn't *both* marked as
exclude_guest and using PEBS. So, reinstate the old logic. Another
way to write it could be "intel_ctrl & ~(cpuc->intel_ctrl_host_mask |
pebs_mask)", presumably the intention of the author of the faulty commit.
However, I personally find the repeated application of A AND NOT B to
be a bit more readable.
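
To make the difference concrete, here is a minimal sketch comparing the three expressions discussed above. The function names and the example bit values are hypothetical and chosen only for illustration; they are not part of the kernel source.

```c
#include <assert.h>
#include <stdint.h>

/* Expression introduced by the faulty commit. */
uint64_t guest_ctrl_buggy(uint64_t intel_ctrl, uint64_t host_mask,
			  uint64_t pebs_mask)
{
	return intel_ctrl & (~host_mask | ~pebs_mask);
}

/* Expression reinstated by this patch: repeated "A AND NOT B". */
uint64_t guest_ctrl_fixed(uint64_t intel_ctrl, uint64_t host_mask,
			  uint64_t pebs_mask)
{
	return intel_ctrl & ~host_mask & ~pebs_mask;
}

/* Equivalent form obtained via De Morgan's law. */
uint64_t guest_ctrl_demorgan(uint64_t intel_ctrl, uint64_t host_mask,
			     uint64_t pebs_mask)
{
	return intel_ctrl & ~(host_mask | pebs_mask);
}

/*
 * Hypothetical values: counters 0-3 are enabled, counter 0 is
 * exclude_guest, counter 1 is used for PEBS.
 */
void check_guest_ctrl_examples(void)
{
	const uint64_t intel_ctrl = 0xf, host_mask = 0x1, pebs_mask = 0x2;

	/* No bit is in both masks, so the buggy form clears nothing. */
	assert(guest_ctrl_buggy(intel_ctrl, host_mask, pebs_mask) == 0xf);
	/* The fixed form clears counter 0 and counter 1. */
	assert(guest_ctrl_fixed(intel_ctrl, host_mask, pebs_mask) == 0xc);
	/* Both correct spellings agree. */
	assert(guest_ctrl_demorgan(intel_ctrl, host_mask, pebs_mask) ==
	       guest_ctrl_fixed(intel_ctrl, host_mask, pebs_mask));
}
```

With these values the buggy form leaves the PEBS counter enabled while the guest runs, which is the situation the patch is meant to prevent.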

This shows up as guest failures when running concurrent long-running
perf workloads on the host, and was reported to happen with rcutorture.
All guests on a given host would die simultaneously with something like an
instruction fault or a segmentation violation.

Reported-by: Paul E. McKenney <[email protected]>
Analyzed-by: Sean Christopherson <[email protected]>
Tested-by: Paul E. McKenney <[email protected]>
Cc: [email protected]
Fixes: c59a1f106f5c ("KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS")
Signed-off-by: Paolo Bonzini <[email protected]>
---
arch/x86/events/intel/core.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
index ce1c777227b4..0f2786d4e405 100644
--- a/arch/x86/events/intel/core.c
+++ b/arch/x86/events/intel/core.c
@@ -4051,12 +4051,17 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
 	u64 pebs_mask = cpuc->pebs_enabled & x86_pmu.pebs_capable;
 	int global_ctrl, pebs_enable;

+	/*
+	 * In addition to obeying exclude_guest/exclude_host, remove bits being
+	 * used for PEBS when running a guest, because PEBS writes to virtual
+	 * addresses (not physical addresses).
+	 */
 	*nr = 0;
 	global_ctrl = (*nr)++;
 	arr[global_ctrl] = (struct perf_guest_switch_msr){
 		.msr = MSR_CORE_PERF_GLOBAL_CTRL,
 		.host = intel_ctrl & ~cpuc->intel_ctrl_guest_mask,
-		.guest = intel_ctrl & (~cpuc->intel_ctrl_host_mask | ~pebs_mask),
+		.guest = intel_ctrl & ~cpuc->intel_ctrl_host_mask & ~pebs_mask,
 	};

 	if (!x86_pmu.pebs)
--
2.43.0



2024-01-04 17:39:08

by Liang, Kan

Subject: Re: [PATCH] KVM: x86/pmu: fix masking logic for MSR_CORE_PERF_GLOBAL_CTRL



On 2024-01-04 10:39 a.m., Paolo Bonzini wrote:
> When commit c59a1f106f5c ("KVM: x86/pmu: Add IA32_PEBS_ENABLE
> MSR emulation for extended PEBS") switched the initialization of
> cpuc->guest_switch_msrs to use compound literals, it screwed up
> the boolean logic:
>
> + u64 pebs_mask = cpuc->pebs_enabled & x86_pmu.pebs_capable;
> ...
> - arr[0].guest = intel_ctrl & ~cpuc->intel_ctrl_host_mask;
> - arr[0].guest &= ~(cpuc->pebs_enabled & x86_pmu.pebs_capable);
> + .guest = intel_ctrl & (~cpuc->intel_ctrl_host_mask | ~pebs_mask),
>
> Before the patch, the value of arr[0].guest would have been intel_ctrl &
> ~cpuc->intel_ctrl_host_mask & ~pebs_mask. The intent is to always treat
> PEBS events as host-only because, while the guest runs, there is no way
> to tell the processor about the virtual address where to put PEBS records
> intended for the host.
>
> Unfortunately, the new expression can be expanded to
>
> (intel_ctrl & ~cpuc->intel_ctrl_host_mask) | (intel_ctrl & ~pebs_mask)
>
> which makes no sense; it includes any bit that isn't *both* marked as
> exclude_guest and using PEBS. So, reinstate the old logic.

I think the old logic will completely disable the PEBS-in-guest
capability, because the counter that is assigned to a guest PEBS event
will also be set in pebs_mask. The old logic disables that counter in
the guest's GLOBAL_CTRL, so nothing will be counted.

Like once proposed a fix in intel_guest_get_msrs():
https://lore.kernel.org/lkml/[email protected]/
It should work for this issue.

Ideally, we should prevent host PEBS from profiling a guest by
rejecting the event creation in perf. But I couldn't find a good way
to distinguish host-created PEBS events from guest-created ones, so
Like's proposal should be a good alternative for now.

Thanks,
Kan

> Another
> way to write it could be "intel_ctrl & ~(cpuc->intel_ctrl_host_mask |
> pebs_mask)", presumably the intention of the author of the faulty commit.
> However, I personally find the repeated application of A AND NOT B to
> be a bit more readable.
>
> This shows up as guest failures when running concurrent long-running
> perf workloads on the host, and was reported to happen with rcutorture.
> All guests on a given host would die simultaneously with something like an
> instruction fault or a segmentation violation.
>
> Reported-by: Paul E. McKenney <[email protected]>
> Analyzed-by: Sean Christopherson <[email protected]>
> Tested-by: Paul E. McKenney <[email protected]>
> Cc: [email protected]
> Fixes: c59a1f106f5c ("KVM: x86/pmu: Add IA32_PEBS_ENABLE MSR emulation for extended PEBS")
> Signed-off-by: Paolo Bonzini <[email protected]>
> ---
> arch/x86/events/intel/core.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/events/intel/core.c b/arch/x86/events/intel/core.c
> index ce1c777227b4..0f2786d4e405 100644
> --- a/arch/x86/events/intel/core.c
> +++ b/arch/x86/events/intel/core.c
> @@ -4051,12 +4051,17 @@ static struct perf_guest_switch_msr *intel_guest_get_msrs(int *nr, void *data)
> u64 pebs_mask = cpuc->pebs_enabled & x86_pmu.pebs_capable;
> int global_ctrl, pebs_enable;
>
> + /*
> + * In addition to obeying exclude_guest/exclude_host, remove bits being
> + * used for PEBS when running a guest, because PEBS writes to virtual
> + * addresses (not physical addresses).
> + */
> *nr = 0;
> global_ctrl = (*nr)++;
> arr[global_ctrl] = (struct perf_guest_switch_msr){
> .msr = MSR_CORE_PERF_GLOBAL_CTRL,
> .host = intel_ctrl & ~cpuc->intel_ctrl_guest_mask,
> - .guest = intel_ctrl & (~cpuc->intel_ctrl_host_mask | ~pebs_mask),
> + .guest = intel_ctrl & ~cpuc->intel_ctrl_host_mask & ~pebs_mask,
> };
>
> if (!x86_pmu.pebs)

2024-01-04 18:23:24

by Sean Christopherson

Subject: Re: [PATCH] KVM: x86/pmu: fix masking logic for MSR_CORE_PERF_GLOBAL_CTRL

On Thu, Jan 04, 2024, Liang, Kan wrote:
>
>
> On 2024-01-04 10:39 a.m., Paolo Bonzini wrote:
> > When commit c59a1f106f5c ("KVM: x86/pmu: Add IA32_PEBS_ENABLE
> > MSR emulation for extended PEBS") switched the initialization of
> > cpuc->guest_switch_msrs to use compound literals, it screwed up
> > the boolean logic:
> >
> > + u64 pebs_mask = cpuc->pebs_enabled & x86_pmu.pebs_capable;
> > ...
> > - arr[0].guest = intel_ctrl & ~cpuc->intel_ctrl_host_mask;
> > - arr[0].guest &= ~(cpuc->pebs_enabled & x86_pmu.pebs_capable);
> > + .guest = intel_ctrl & (~cpuc->intel_ctrl_host_mask | ~pebs_mask),
> >
> > Before the patch, the value of arr[0].guest would have been intel_ctrl &
> > ~cpuc->intel_ctrl_host_mask & ~pebs_mask. The intent is to always treat
> > PEBS events as host-only because, while the guest runs, there is no way
> > to tell the processor about the virtual address where to put PEBS records
> > intended for the host.
> >
> > Unfortunately, the new expression can be expanded to
> >
> > (intel_ctrl & ~cpuc->intel_ctrl_host_mask) | (intel_ctrl & ~pebs_mask)
> >
> > which makes no sense; it includes any bit that isn't *both* marked as
> > exclude_guest and using PEBS. So, reinstate the old logic.
>
> I think the old logic will completely disable the PEBS in guest
> capability. Because the counter which is assigned to a guest PEBS event
> will also be set in the pebs_mask. The old logic disables the counter in
> GLOBAL_CTRL in guest. Nothing will be counted.
>
> Like once proposed a fix in the intel_guest_get_msrs().
> https://lore.kernel.org/lkml/[email protected]/
> It should work for the issue.

No, that patch only affects the path where hardware supports enabling PEBS in
the guest, i.e. intel_guest_get_msrs() will bail before getting to that code due
to the lack of x86_pmu.pebs_ept support, which IIUC is all pre-Icelake Intel CPUs.

	if (!kvm_pmu || !x86_pmu.pebs_ept)
		return arr;

2024-01-04 19:16:37

by Liang, Kan

Subject: Re: [PATCH] KVM: x86/pmu: fix masking logic for MSR_CORE_PERF_GLOBAL_CTRL



On 2024-01-04 1:22 p.m., Sean Christopherson wrote:
> On Thu, Jan 04, 2024, Liang, Kan wrote:
>>
>>
>> On 2024-01-04 10:39 a.m., Paolo Bonzini wrote:
>>> When commit c59a1f106f5c ("KVM: x86/pmu: Add IA32_PEBS_ENABLE
>>> MSR emulation for extended PEBS") switched the initialization of
>>> cpuc->guest_switch_msrs to use compound literals, it screwed up
>>> the boolean logic:
>>>
>>> + u64 pebs_mask = cpuc->pebs_enabled & x86_pmu.pebs_capable;
>>> ...
>>> - arr[0].guest = intel_ctrl & ~cpuc->intel_ctrl_host_mask;
>>> - arr[0].guest &= ~(cpuc->pebs_enabled & x86_pmu.pebs_capable);
>>> + .guest = intel_ctrl & (~cpuc->intel_ctrl_host_mask | ~pebs_mask),
>>>
>>> Before the patch, the value of arr[0].guest would have been intel_ctrl &
>>> ~cpuc->intel_ctrl_host_mask & ~pebs_mask. The intent is to always treat
>>> PEBS events as host-only because, while the guest runs, there is no way
>>> to tell the processor about the virtual address where to put PEBS records
>>> intended for the host.
>>>
>>> Unfortunately, the new expression can be expanded to
>>>
>>> (intel_ctrl & ~cpuc->intel_ctrl_host_mask) | (intel_ctrl & ~pebs_mask)
>>>
>>> which makes no sense; it includes any bit that isn't *both* marked as
>>> exclude_guest and using PEBS. So, reinstate the old logic.
>>
>> I think the old logic will completely disable the PEBS in guest
>> capability. Because the counter which is assigned to a guest PEBS event
>> will also be set in the pebs_mask. The old logic disables the counter in
>> GLOBAL_CTRL in guest. Nothing will be counted.
>>
>> Like once proposed a fix in the intel_guest_get_msrs().
>> https://lore.kernel.org/lkml/[email protected]/
>> It should work for the issue.
>
> No, that patch only affects the path where hardware supports enabling PEBS in the
> guest, i.e. intel_guest_get_msrs() will bail before getting to that code due
> to the lack of x86_pmu.pebs_ept support, which IIUC is all pre-Icelake Intel CPUs.
>
> if (!kvm_pmu || !x86_pmu.pebs_ept)
> return arr;
>

True, we have to disable all PEBS counters for pre-ICL as well.

What I missed is that the disabling here is temporary:
arr[global_ctrl].guest is updated later on x86_pmu.pebs_ept platforms,
so guest PEBS events should still work.
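
The two-stage behaviour described here can be sketched roughly as follows. This is a simplified model and not the kernel code: the struct, its field names, the guest_pebs input and the exact form of the second step are assumptions made for illustration, and the !kvm_pmu half of the early-return check is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for the state the real function reads. */
struct guest_ctrl_inputs {
	uint64_t intel_ctrl;	/* counters currently enabled */
	uint64_t host_mask;	/* counters for exclude_guest events */
	uint64_t pebs_mask;	/* counters with PEBS enabled */
	uint64_t guest_pebs;	/* assumed: PEBS counters owned by the guest */
	bool pebs_ept;		/* assumed stand-in for x86_pmu.pebs_ept */
};

uint64_t guest_global_ctrl(const struct guest_ctrl_inputs *in)
{
	/* Step 1: drop exclude_guest counters and every PEBS counter. */
	uint64_t guest = in->intel_ctrl & ~in->host_mask & ~in->pebs_mask;

	/* Without PEBS-via-EPT support the result of step 1 is final. */
	if (!in->pebs_ept)
		return guest;

	/* Step 2 (assumed shape): re-enable counters used by guest PEBS. */
	return guest | (in->guest_pebs & in->pebs_mask & in->intel_ctrl);
}
```

In this model, hardware without pebs_ept keeps every PEBS counter disabled while the guest runs, whereas hardware with pebs_ept gets the guest-owned PEBS counters back in the second step.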

The patch looks good to me.

Reviewed-by: Kan Liang <[email protected]>

Thanks,
Kan