2024-04-19 08:01:46

by Mi, Dapeng

Subject: [kvm-unit-tests Patch v4 04/17] x86: pmu: Fix the issue that pmu_counter_t.config crosses cache line

When running the pmu test on SPR (Sapphire Rapids), the following #GP fault
is reported:

Unhandled exception 13 #GP at ip 000000000040771f
error_code=0000 rflags=00010046 cs=00000008
rax=00000000004031ad rcx=0000000000000186 rdx=0000000000000000 rbx=00000000005142f0
rbp=0000000000514260 rsi=0000000000000020 rdi=0000000000000340
r8=0000000000513a65 r9=00000000000003f8 r10=000000000000000d r11=00000000ffffffff
r12=000000000043003c r13=0000000000514450 r14=000000000000000b r15=0000000000000001
cr0=0000000080010011 cr2=0000000000000000 cr3=0000000001007000 cr4=0000000000000020
cr8=0000000000000000
STACK: @40771f 40040e 400976 400aef 40148d 401da9 4001ad
FAIL pmu

It looks like an invalid value (0x4031ad) is written to the EVENTSEL0 MSR
(0x186), which causes the #GP.

Further investigation shows that the #GP is triggered by the following code
in __start_event():

    wrmsr(MSR_GP_EVENT_SELECTx(event_to_global_idx(evt)),
          evt->config | EVNTSEL_EN);

evt->config is initialized correctly but appears to be corrupted before it
is written to the MSR.

The original pmu_counter_t layout is as follows:

    typedef struct {
        uint32_t ctr;
        uint64_t config;
        uint64_t count;
        int idx;
    } pmu_counter_t;

Obviously the config field crosses two cache lines. When the two cache
lines are not updated simultaneously, the config value is corrupted.

Adjust the pmu_counter_t field order to ensure that the config field is
cache-line aligned.
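
For anyone who wants to inspect the two layouts, here is a minimal sketch
that is not part of the patch. It assumes a 64-byte cache line. The names
pmu_counter_old_t, pmu_counter_new_t, crosses_cache_line() and
print_layouts() are hypothetical, introduced only for illustration. The
printed offsets depend on the ABI of the compiler used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Assumption: 64-byte cache lines. */
#define CACHE_LINE_SIZE 64u

/* Field order before this patch (hypothetical name). */
typedef struct {
	uint32_t ctr;
	uint64_t config;
	uint64_t count;
	int idx;
} pmu_counter_old_t;

/* Field order after this patch (hypothetical name). */
typedef struct {
	uint32_t ctr;
	uint32_t idx;
	uint64_t config;
	uint64_t count;
} pmu_counter_new_t;

/*
 * Hypothetical helper: return true if the byte range
 * [addr, addr + size) touches more than one cache line.
 */
static bool crosses_cache_line(uintptr_t addr, size_t size)
{
	assert(size > 0);
	return addr / CACHE_LINE_SIZE != (addr + size - 1) / CACHE_LINE_SIZE;
}

/* Hypothetical helper: print where config sits in each layout. */
static void print_layouts(void)
{
	printf("old: config at offset %zu, sizeof %zu\n",
	       offsetof(pmu_counter_old_t, config), sizeof(pmu_counter_old_t));
	printf("new: config at offset %zu, sizeof %zu\n",
	       offsetof(pmu_counter_new_t, config), sizeof(pmu_counter_new_t));
}
```

A check such as crosses_cache_line((uintptr_t)&evt->config,
sizeof(evt->config)) placed just before the MSR write in __start_event()
would report whether that particular counter instance straddles a line,
since the result depends on the address of the object as well as on the
field offset.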

Signed-off-by: Dapeng Mi <[email protected]>
---
x86/pmu.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/x86/pmu.c b/x86/pmu.c
index c971386db4e6..5fd7439a0eba 100644
--- a/x86/pmu.c
+++ b/x86/pmu.c
@@ -20,9 +20,9 @@

typedef struct {
uint32_t ctr;
+ uint32_t idx;
uint64_t config;
uint64_t count;
- int idx;
} pmu_counter_t;

struct pmu_event {
--
2.34.1