2022-07-18 08:41:24

by Suthikulpanit, Suravee

Subject: [PATCH] KVM: x86: Do not block APIC write for non ICR registers

Commit 5413bcba7ed5 ("KVM: x86: Add support for vICR APIC-write
VM-Exits in x2APIC mode") introduced logic that rejects APIC writes
for any offset other than ICR. This breaks x2AVIC support, which requires
KVM to trap and emulate x2APIC MSR writes.

Therefore, remove the warning and modify the logic to allow MSR writes.

Fixes: 5413bcba7ed5 ("KVM: x86: Add support for vICR APIC-write VM-Exits in x2APIC mode")
Cc: Zeng Guang <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
 arch/x86/kvm/lapic.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 9d4f73c4dc02..f688090d98b0 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -69,6 +69,7 @@ static bool lapic_timer_advance_dynamic __read_mostly;
 /* step-by-step approximation to mitigate fluctuation */
 #define LAPIC_TIMER_ADVANCE_ADJUST_STEP 8
 static int kvm_lapic_msr_read(struct kvm_lapic *apic, u32 reg, u64 *data);
+static int kvm_lapic_msr_write(struct kvm_lapic *apic, u32 reg, u64 data);
 
 static inline void __kvm_lapic_set_reg(char *regs, int reg_off, u32 val)
 {
@@ -2284,17 +2285,23 @@ void kvm_apic_write_nodecode(struct kvm_vcpu *vcpu, u32 offset)
 	u64 val;
 
 	if (apic_x2apic_mode(apic)) {
+		kvm_lapic_msr_read(apic, offset, &val);
+
 		/*
 		 * When guest APIC is in x2APIC mode and IPI virtualization
 		 * is enabled, accessing APIC_ICR may cause trap-like VM-exit
 		 * on Intel hardware. Other offsets are not possible.
+		 *
+		 * For AMD AVIC, write to some APIC registers can cause
+		 * trap-like VM-exit (see arch/x86/kvm/svm/avic.c:
+		 * avic_unaccel_trap_write()).
 		 */
-		if (WARN_ON_ONCE(offset != APIC_ICR))
+		if (offset == APIC_ICR) {
+			kvm_apic_send_ipi(apic, (u32)val, (u32)(val >> 32));
+			trace_kvm_apic_write(APIC_ICR, val);
 			return;
-
-		kvm_lapic_msr_read(apic, offset, &val);
-		kvm_apic_send_ipi(apic, (u32)val, (u32)(val >> 32));
-		trace_kvm_apic_write(APIC_ICR, val);
+		}
+		kvm_lapic_msr_write(apic, offset, val);
 	} else {
 		val = kvm_lapic_get_reg(apic, offset);
 
--
2.34.1
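
For readers following along, the behavioural change in the x2APIC branch can be modelled outside the kernel. The sketch below is not KVM code: `struct mock_apic`, `old_x2apic_write()` and `new_x2apic_write()` are hypothetical stand-ins that only count which emulation path ran, and `APIC_SPIV` is picked purely as an example of a non-ICR offset. The offset values match the kernel's apicdef.h.

```c
#include <stdbool.h>
#include <stdint.h>

#define APIC_SPIV	0xF0	/* example non-ICR register offset */
#define APIC_ICR	0x300	/* interrupt command register offset */

/* Hypothetical stand-in for struct kvm_lapic: counts what was emulated. */
struct mock_apic {
	unsigned int ipis_sent;
	unsigned int regs_written;
};

/* Pre-patch x2APIC branch: any offset other than ICR is rejected. */
static bool old_x2apic_write(struct mock_apic *apic, uint32_t offset)
{
	if (offset != APIC_ICR)
		return false;	/* WARN_ON_ONCE() fired and the write was dropped */

	apic->ipis_sent++;
	return true;
}

/* Patched x2APIC branch: ICR sends an IPI, other offsets are emulated. */
static bool new_x2apic_write(struct mock_apic *apic, uint32_t offset)
{
	if (offset == APIC_ICR)
		apic->ipis_sent++;
	else
		apic->regs_written++;

	return true;
}
```

With the old logic a trapped write to a non-ICR offset never reaches the register-write path, which is the breakage described in the commit message above.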


2022-07-18 10:22:49

by Maxim Levitsky

Subject: Re: [PATCH] KVM: x86: Do not block APIC write for non ICR registers

Reviewed-by: Maxim Levitsky <[email protected]>

Best regards,
Maxim Levitsky

2022-07-18 17:55:15

by Sean Christopherson

Subject: Re: [PATCH] KVM: x86: Do not block APIC write for non ICR registers

On Mon, Jul 18, 2022, Suravee Suthikulpanit wrote:
> Commit 5413bcba7ed5 ("KVM: x86: Add support for vICR APIC-write
> VM-Exits in x2APIC mode") introduced logic that rejects APIC writes
> for any offset other than ICR. This breaks x2AVIC support, which requires
> KVM to trap and emulate x2APIC MSR writes.
>
> Therefore, remove the warning and modify the logic to allow MSR writes.
>
> Fixes: 5413bcba7ed5 ("KVM: x86: Add support for vICR APIC-write VM-Exits in x2APIC mode")

This tag is wrong, I believe it should be:

Fixes: 4d1d7942e36a ("KVM: SVM: Introduce logic to (de)activate x2AVIC mode")

And that absolutely matters because this should not be backported to older
kernels that don't support x2avic.

> Cc: Zeng Guang <[email protected]>
> Signed-off-by: Suravee Suthikulpanit <[email protected]>
> ---
>  arch/x86/kvm/lapic.c | 17 ++++++++++++-----
>  1 file changed, 12 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 9d4f73c4dc02..f688090d98b0 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -69,6 +69,7 @@ static bool lapic_timer_advance_dynamic __read_mostly;
>  /* step-by-step approximation to mitigate fluctuation */
>  #define LAPIC_TIMER_ADVANCE_ADJUST_STEP 8
>  static int kvm_lapic_msr_read(struct kvm_lapic *apic, u32 reg, u64 *data);
> +static int kvm_lapic_msr_write(struct kvm_lapic *apic, u32 reg, u64 data);
>
>  static inline void __kvm_lapic_set_reg(char *regs, int reg_off, u32 val)
>  {
> @@ -2284,17 +2285,23 @@ void kvm_apic_write_nodecode(struct kvm_vcpu *vcpu, u32 offset)
>  	u64 val;
>
>  	if (apic_x2apic_mode(apic)) {
> +		kvm_lapic_msr_read(apic, offset, &val);
> +
>  		/*
>  		 * When guest APIC is in x2APIC mode and IPI virtualization
>  		 * is enabled, accessing APIC_ICR may cause trap-like VM-exit
>  		 * on Intel hardware. Other offsets are not possible.
> +		 *
> +		 * For AMD AVIC, write to some APIC registers can cause

x2AVIC if we're going to keep a comment. But at this point, the WARN has done its
job and the comment is obsolete. My preference is to just document that ICR is
special in x2APIC mode and not bother with vendor/feature specific behavior.

> +		 * trap-like VM-exit (see arch/x86/kvm/svm/avic.c:

Specifying the file name and full path is completely unnecessary.

> +		 * avic_unaccel_trap_write()).

If something like this comes up again in the future, please just explicitly document
the architectural (or micro-architectural) behavior. Redirecting to KVM code is
annoying for the reader, and comments that reference function names all too often
become stale. There _are_ times when explicitly referencing a function is appropriate,
but IMO this isn't one of them.

But if we just drop the Intel vs. AMD stuff, this all goes away.

>  		 */
> -		if (WARN_ON_ONCE(offset != APIC_ICR))
> +		if (offset == APIC_ICR) {
> +			kvm_apic_send_ipi(apic, (u32)val, (u32)(val >> 32));
> +			trace_kvm_apic_write(APIC_ICR, val);
>  			return;
> -
> -		kvm_lapic_msr_read(apic, offset, &val);
> -		kvm_apic_send_ipi(apic, (u32)val, (u32)(val >> 32));
> -		trace_kvm_apic_write(APIC_ICR, val);
> +		}
> +		kvm_lapic_msr_write(apic, offset, val);

Because this lacks the TODO below, what about tweaking this so that there's a
single call to kvm_lapic_msr_write()? gcc-11 even generates more efficient code
for this. Alternatively, the ICR path can be an early return inside a single
x2APIC check, but gcc generates identical code and I like making x2APIC+ICR stand
out as being truly special.
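
The two layouts compared above can be sketched side by side. This is a simplified model and not the kernel function: `enum emul_path`, `dispatch_combined()` and `dispatch_early_return()` are hypothetical names, and the sketch only records which emulation path is chosen, leaving out the register reads and the actual side effects.

```c
#include <stdbool.h>
#include <stdint.h>

#define APIC_ICR	0x300	/* ICR offset, as in the kernel's apicdef.h */

/* Hypothetical outcome type; the real function returns void. */
enum emul_path { PATH_SEND_IPI, PATH_REG_WRITE };

/* Layout of the proposed patch: x2APIC+ICR is tested as one special case. */
static enum emul_path dispatch_combined(bool x2apic, uint32_t offset)
{
	if (x2apic && offset == APIC_ICR)
		return PATH_SEND_IPI;

	return PATH_REG_WRITE;
}

/* Alternative layout: ICR is an early return inside a single x2APIC check. */
static enum emul_path dispatch_early_return(bool x2apic, uint32_t offset)
{
	if (x2apic) {
		if (offset == APIC_ICR)
			return PATH_SEND_IPI;
	}

	return PATH_REG_WRITE;
}
```

Both functions select the same path for every input, so the choice between them is about readability: the combined condition keeps the x2APIC ICR case visibly separate from the common write path.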

Compile tested only.

---
From: Sean Christopherson <[email protected]>
Date: Mon, 18 Jul 2022 10:16:02 -0700
Subject: [PATCH] KVM: x86: Handle trap-like x2APIC accesses for any APIC
register

Handle trap-like VM-Exits for all APIC registers when the guest is in
x2APIC mode and drop the now-stale WARN that KVM encounters trap-like
exits only for ICR. On Intel, only writes to ICR can be trap-like when
APICv and x2APIC are enabled, but AMD's x2AVIC can trap more registers,
e.g. LDR and DFR.

Fixes: 4d1d7942e36a ("KVM: SVM: Introduce logic to (de)activate x2AVIC mode")
Reported-by: Suravee Suthikulpanit <[email protected]>
Cc: Zeng Guang <[email protected]>
Cc: Maxim Levitsky <[email protected]>
Signed-off-by: Sean Christopherson <[email protected]>
---
 arch/x86/kvm/lapic.c | 21 ++++++++++-----------
 1 file changed, 10 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 9d4f73c4dc02..95bb1ef37a12 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2283,21 +2283,20 @@ void kvm_apic_write_nodecode(struct kvm_vcpu *vcpu, u32 offset)
 	struct kvm_lapic *apic = vcpu->arch.apic;
 	u64 val;
 
-	if (apic_x2apic_mode(apic)) {
-		/*
-		 * When guest APIC is in x2APIC mode and IPI virtualization
-		 * is enabled, accessing APIC_ICR may cause trap-like VM-exit
-		 * on Intel hardware. Other offsets are not possible.
-		 */
-		if (WARN_ON_ONCE(offset != APIC_ICR))
-			return;
-
+	if (apic_x2apic_mode(apic))
 		kvm_lapic_msr_read(apic, offset, &val);
+	else
+		val = kvm_lapic_get_reg(apic, offset);
+
+	/*
+	 * ICR is a single 64-bit register when x2APIC is enabled.  For legacy
+	 * xAPIC, ICR writes need to go down the common (slightly slower) path
+	 * to get the upper half from ICR2.
+	 */
+	if (apic_x2apic_mode(apic) && offset == APIC_ICR) {
 		kvm_apic_send_ipi(apic, (u32)val, (u32)(val >> 32));
 		trace_kvm_apic_write(APIC_ICR, val);
 	} else {
-		val = kvm_lapic_get_reg(apic, offset);
-
 		/* TODO: optimize to just emulate side effect w/o one more write */
 		kvm_lapic_reg_write(apic, offset, (u32)val);
 	}

base-commit: 8031d87aa9953ddeb047a5356ebd0b240c30f233
--
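
The comment in the patch above relies on ICR being one 64-bit register in x2APIC mode. A small standalone sketch of that arithmetic follows; `x2apic_msr_index()`, `struct icr_halves` and `split_x2apic_icr()` are hypothetical helpers written for illustration, while the constants match the kernel's apicdef.h. In x2APIC mode each 16-byte MMIO register slot corresponds to one MSR starting at 0x800, and the two halves of the ICR value are what the patch passes to kvm_apic_send_ipi().

```c
#include <stdint.h>

#define APIC_BASE_MSR	0x800	/* first x2APIC MSR */
#define APIC_ICR	0x300	/* ICR register offset */

/* x2APIC maps each 16-byte MMIO register slot to one MSR. */
static uint32_t x2apic_msr_index(uint32_t offset)
{
	return APIC_BASE_MSR + (offset >> 4);
}

/* Hypothetical helper type: the two values handed to kvm_apic_send_ipi(). */
struct icr_halves {
	uint32_t low;	/* vector, delivery mode, etc. */
	uint32_t high;	/* destination (ICR2 in legacy xAPIC mode) */
};

/* Split the single 64-bit x2APIC ICR value the way the patch does. */
static struct icr_halves split_x2apic_icr(uint64_t val)
{
	struct icr_halves h = { (uint32_t)val, (uint32_t)(val >> 32) };

	return h;
}
```

For example, `x2apic_msr_index(APIC_ICR)` evaluates to 0x830. In legacy xAPIC mode the high half lives in the separate ICR2 register, which is why those writes take the common register-write path instead.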

2022-07-19 03:53:44

by Suthikulpanit, Suravee

Subject: Re: [PATCH] KVM: x86: Do not block APIC write for non ICR registers

Sean,

On 7/19/2022 12:28 AM, Sean Christopherson wrote:
> On Mon, Jul 18, 2022, Suravee Suthikulpanit wrote:
>> Commit 5413bcba7ed5 ("KVM: x86: Add support for vICR APIC-write
>> VM-Exits in x2APIC mode") introduced logic that rejects APIC writes
>> for any offset other than ICR. This breaks x2AVIC support, which requires
>> KVM to trap and emulate x2APIC MSR writes.
>>
>> Therefore, remove the warning and modify the logic to allow MSR writes.
>>
>> Fixes: 5413bcba7ed5 ("KVM: x86: Add support for vICR APIC-write VM-Exits in x2APIC mode")
>
> This tag is wrong, I believe it should be:
>
> Fixes: 4d1d7942e36a ("KVM: SVM: Introduce logic to (de)activate x2AVIC mode")
>
> And that absolutely matters because this should not be backported to older
> kernels that don't support x2avic.

Commit 5413bcba7ed5 is the one that modifies the logic in kvm_apic_write_nodecode().
I understand your point that 5413bcba7ed5 was committed later than 4d1d7942e36a, which
is affected by the change. However, if only the x2AVIC changes are backported without
the IPI virtualization changes, then this fix is not needed. Hence, I would say the fix
is for 5413bcba7ed5, as specified in the original patch.

>> .....
>>  		 */
>> -		if (WARN_ON_ONCE(offset != APIC_ICR))
>> +		if (offset == APIC_ICR) {
>> +			kvm_apic_send_ipi(apic, (u32)val, (u32)(val >> 32));
>> +			trace_kvm_apic_write(APIC_ICR, val);
>>  			return;
>> -
>> -		kvm_lapic_msr_read(apic, offset, &val);
>> -		kvm_apic_send_ipi(apic, (u32)val, (u32)(val >> 32));
>> -		trace_kvm_apic_write(APIC_ICR, val);
>> +		}
>> +		kvm_lapic_msr_write(apic, offset, val);
>
> Because this lacks the TODO below, what about tweaking this so that there's a
> single call to kvm_lapic_msr_write()? gcc-11 even generates more efficient code
> for this. Alternatively, the ICR path can be an early return inside a single
> x2APIC check, but gcc generates identical code and I like making x2APIC+ICR stand
> out as being truly special.

That sounds good.

> Compile tested only.
>
> ---
> From: Sean Christopherson <[email protected]>
> Date: Mon, 18 Jul 2022 10:16:02 -0700
> Subject: [PATCH] KVM: x86: Handle trap-like x2APIC accesses for any APIC
> register
>
> Handle trap-like VM-Exits for all APIC registers when the guest is in
> x2APIC mode and drop the now-stale WARN that KVM encounters trap-like
> exits only for ICR. On Intel, only writes to ICR can be trap-like when
> APICv and x2APIC are enabled, but AMD's x2AVIC can trap more registers,
> e.g. LDR and DFR.
>
> Fixes: 4d1d7942e36a ("KVM: SVM: Introduce logic to (de)activate x2AVIC mode")
> Reported-by: Suravee Suthikulpanit <[email protected]>
> Cc: Zeng Guang <[email protected]>
> Cc: Maxim Levitsky <[email protected]>
> Signed-off-by: Sean Christopherson <[email protected]>
> ---
>  arch/x86/kvm/lapic.c | 21 ++++++++++-----------
>  1 file changed, 10 insertions(+), 11 deletions(-)
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 9d4f73c4dc02..95bb1ef37a12 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -2283,21 +2283,20 @@ void kvm_apic_write_nodecode(struct kvm_vcpu *vcpu, u32 offset)
>  	struct kvm_lapic *apic = vcpu->arch.apic;
>  	u64 val;
>
> -	if (apic_x2apic_mode(apic)) {
> -		/*
> -		 * When guest APIC is in x2APIC mode and IPI virtualization
> -		 * is enabled, accessing APIC_ICR may cause trap-like VM-exit
> -		 * on Intel hardware. Other offsets are not possible.
> -		 */
> -		if (WARN_ON_ONCE(offset != APIC_ICR))
> -			return;
> -
> +	if (apic_x2apic_mode(apic))
>  		kvm_lapic_msr_read(apic, offset, &val);
> +	else
> +		val = kvm_lapic_get_reg(apic, offset);
> +
> +	/*
> +	 * ICR is a single 64-bit register when x2APIC is enabled.  For legacy
> +	 * xAPIC, ICR writes need to go down the common (slightly slower) path
> +	 * to get the upper half from ICR2.
> +	 */
> +	if (apic_x2apic_mode(apic) && offset == APIC_ICR) {
>  		kvm_apic_send_ipi(apic, (u32)val, (u32)(val >> 32));
>  		trace_kvm_apic_write(APIC_ICR, val);
>  	} else {
> -		val = kvm_lapic_get_reg(apic, offset);
> -
>  		/* TODO: optimize to just emulate side effect w/o one more write */
>  		kvm_lapic_reg_write(apic, offset, (u32)val);
>  	}
>
> base-commit: 8031d87aa9953ddeb047a5356ebd0b240c30f233
> --

Tested-by: Suravee Suthikulpanit <[email protected]>

Suravee

2022-07-20 10:15:43

by Suthikulpanit, Suravee

Subject: Re: [PATCH] KVM: x86: Do not block APIC write for non ICR registers

Paolo / Sean,

Please let me know if you want me to send v2 with changes proposed by Sean.

Regards,
Suravee
