2022-10-01 01:18:51

by Sean Christopherson

Subject: [PATCH v4 11/32] KVM: x86: Inhibit APIC memslot if x2APIC and AVIC are enabled

Free the APIC access page memslot if any vCPU enables x2APIC and SVM's
AVIC is enabled to prevent accesses to the virtual APIC on vCPUs with
x2APIC enabled. On AMD, due to its "hybrid" mode where AVIC is enabled
when x2APIC is enabled even without x2AVIC support, keeping the APIC
access page memslot results in the guest being able to access the virtual
APIC page as x2APIC is fully emulated by KVM. I.e. hardware isn't aware
that the guest is operating in x2APIC mode.

Exempt nested SVM's update of APICv state from new logic as x2APIC can't
be toggled on VM-Exit. In practice, invoking the x2APIC logic should be
harmless precisely because it should be a glorified nop, but play it
safe to avoid latent bugs, e.g. with dropping the vCPU's SRCU lock.

Intel doesn't suffer from the same issue as APICv has fully independent
VMCS controls for xAPIC vs. x2APIC virtualization. Technically, KVM
should provide bus error semantics and not memory semantics for the APIC
page when x2APIC is enabled, but KVM already provides memory semantics in
other scenarios, e.g. if APICv/AVIC is enabled and the APIC is hardware
disabled (via APIC_BASE MSR).

Reserve an inhibit bit so that common code can detect whether or not the
"x2APIC inhibit" applies, but use a dedicated flag to track the inhibit
so that it doesn't need to be stripped from apicv_inhibit_reasons (since
it's not a "full" inhibit).

Note, checking apic_access_memslot_enabled without taking locks relies
on it being set during vCPU creation (before kvm_vcpu_reset()). vCPUs can
race to set the inhibit and delete the memslot, i.e. can get false
positives, but can't get false negatives as apic_access_memslot_enabled
can't be toggled "on" once any vCPU reaches KVM_RUN.

Opportunistically drop the "can" while updating avic_activate_vmcb()'s
comment, i.e. to state that KVM _does_ support the hybrid mode. Move
the "Note:" down a line to conform to preferred kernel/KVM multi-line
comment style.

Opportunistically update the apicv_update_lock comment, as it isn't
actually used to protect apic_access_memslot_enabled (it's protected by
slots_lock).

Fixes: 0e311d33bfbe ("KVM: SVM: Introduce hybrid-AVIC mode")
Signed-off-by: Sean Christopherson <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 20 +++++++++++++----
arch/x86/kvm/lapic.c | 38 ++++++++++++++++++++++++++++++++-
arch/x86/kvm/lapic.h | 1 +
arch/x86/kvm/svm/avic.c | 15 +++++++------
arch/x86/kvm/svm/nested.c | 2 +-
arch/x86/kvm/x86.c | 16 ++++++++++++--
6 files changed, 77 insertions(+), 15 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index d40206b16d6c..062758135c86 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1139,6 +1139,17 @@ enum kvm_apicv_inhibit {
* AVIC is disabled because SEV doesn't support it.
*/
APICV_INHIBIT_REASON_SEV,
+
+ /*
+ * Due to sharing page tables across vCPUs, the xAPIC memslot must be
+ * deleted if any vCPU has x2APIC enabled as SVM doesn't provide fully
+ * independent controls for AVIC vs. x2AVIC, and also because SVM
+ * supports a "hybrid" AVIC mode for CPUs that support AVIC but not
+ * x2AVIC. Note, this isn't a "full" inhibit and is tracked separately.
+ * AVIC can still be activated, but KVM must not create SPTEs for the
+ * APIC base. For simplicity, this is sticky.
+ */
+ APICV_INHIBIT_REASON_X2APIC,
};

struct kvm_arch {
@@ -1176,10 +1187,11 @@ struct kvm_arch {
struct kvm_apic_map __rcu *apic_map;
atomic_t apic_map_dirty;

- /* Protects apic_access_memslot_enabled and apicv_inhibit_reasons */
- struct rw_semaphore apicv_update_lock;
-
bool apic_access_memslot_enabled;
+ bool apic_access_memslot_inhibited;
+
+ /* Protects apicv_inhibit_reasons */
+ struct rw_semaphore apicv_update_lock;
unsigned long apicv_inhibit_reasons;

gpa_t wall_clock;
@@ -1912,7 +1924,7 @@ gpa_t kvm_mmu_gva_to_gpa_system(struct kvm_vcpu *vcpu, gva_t gva,

bool kvm_apicv_activated(struct kvm *kvm);
bool kvm_vcpu_apicv_activated(struct kvm_vcpu *vcpu);
-void kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu);
+void __kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu);
void __kvm_set_or_clear_apicv_inhibit(struct kvm *kvm,
enum kvm_apicv_inhibit reason, bool set);
void kvm_set_or_clear_apicv_inhibit(struct kvm *kvm,
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 80e8b1cc6dc2..42b61469674d 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2443,7 +2443,8 @@ int kvm_alloc_apic_access_page(struct kvm *kvm)
int ret = 0;

mutex_lock(&kvm->slots_lock);
- if (kvm->arch.apic_access_memslot_enabled)
+ if (kvm->arch.apic_access_memslot_enabled ||
+ kvm->arch.apic_access_memslot_inhibited)
goto out;

hva = __x86_set_memory_region(kvm, APIC_ACCESS_PAGE_PRIVATE_MEMSLOT,
@@ -2471,6 +2472,41 @@ int kvm_alloc_apic_access_page(struct kvm *kvm)
}
EXPORT_SYMBOL_GPL(kvm_alloc_apic_access_page);

+void kvm_inhibit_apic_access_page(struct kvm_vcpu *vcpu)
+{
+ struct kvm *kvm = vcpu->kvm;
+
+ if (!kvm->arch.apic_access_memslot_enabled)
+ return;
+
+ kvm_vcpu_srcu_read_unlock(vcpu);
+
+ mutex_lock(&kvm->slots_lock);
+
+ if (kvm->arch.apic_access_memslot_enabled) {
+ __x86_set_memory_region(kvm, APIC_ACCESS_PAGE_PRIVATE_MEMSLOT, 0, 0);
+ /*
+ * Clear "enabled" after the memslot is deleted so that a
+ * different vCPU doesn't get a false negative when checking
+ * the flag out of slots_lock. No additional memory barrier is
+ * needed as modifying memslots requires waiting for other vCPUs to
+ * drop SRCU (see above), and false positives are ok as the
+ * flag is rechecked after acquiring slots_lock.
+ */
+ kvm->arch.apic_access_memslot_enabled = false;
+
+ /*
+ * Mark the memslot as inhibited to prevent reallocating the
+ * memslot during vCPU creation, e.g. if a vCPU is hotplugged.
+ */
+ kvm->arch.apic_access_memslot_inhibited = true;
+ }
+
+ mutex_unlock(&kvm->slots_lock);
+
+ kvm_vcpu_srcu_read_lock(vcpu);
+}
+
void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event)
{
struct kvm_lapic *apic = vcpu->arch.apic;
diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
index 0587a8282cb3..a318609bb050 100644
--- a/arch/x86/kvm/lapic.h
+++ b/arch/x86/kvm/lapic.h
@@ -113,6 +113,7 @@ int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq,
int kvm_apic_local_deliver(struct kvm_lapic *apic, int lvt_type);
void kvm_apic_update_apicv(struct kvm_vcpu *vcpu);
int kvm_alloc_apic_access_page(struct kvm *kvm);
+void kvm_inhibit_apic_access_page(struct kvm_vcpu *vcpu);

bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm, struct kvm_lapic *src,
struct kvm_lapic_irq *irq, int *r, struct dest_map *dest_map);
diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index ec28ba4c5f1b..535e35edce1d 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -72,12 +72,12 @@ static void avic_activate_vmcb(struct vcpu_svm *svm)

vmcb->control.int_ctl |= AVIC_ENABLE_MASK;

- /* Note:
- * KVM can support hybrid-AVIC mode, where KVM emulates x2APIC
- * MSR accesses, while interrupt injection to a running vCPU
- * can be achieved using AVIC doorbell. The AVIC hardware still
- * accelerate MMIO accesses, but this does not cause any harm
- * as the guest is not supposed to access xAPIC mmio when uses x2APIC.
+ /*
+ * Note: KVM supports hybrid-AVIC mode, where KVM emulates x2APIC MSR
+ * accesses, while interrupt injection to a running vCPU can be
+ * achieved using AVIC doorbell. KVM disables the APIC access page
+ * (deletes the memslot) if any vCPU has x2APIC enabled, thus enabling
+ * AVIC in hybrid mode activates only the doorbell mechanism.
*/
if (apic_x2apic_mode(svm->vcpu.arch.apic) &&
avic_mode == AVIC_MODE_X2) {
@@ -975,7 +975,8 @@ bool avic_check_apicv_inhibit_reasons(enum kvm_apicv_inhibit reason)
BIT(APICV_INHIBIT_REASON_BLOCKIRQ) |
BIT(APICV_INHIBIT_REASON_SEV) |
BIT(APICV_INHIBIT_REASON_APIC_ID_MODIFIED) |
- BIT(APICV_INHIBIT_REASON_APIC_BASE_MODIFIED);
+ BIT(APICV_INHIBIT_REASON_APIC_BASE_MODIFIED) |
+ BIT(APICV_INHIBIT_REASON_X2APIC);

return supported & BIT(reason);
}
diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
index 4c620999d230..8d5e00a7ef84 100644
--- a/arch/x86/kvm/svm/nested.c
+++ b/arch/x86/kvm/svm/nested.c
@@ -1084,7 +1084,7 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
* to benefit from it right away.
*/
if (kvm_apicv_activated(vcpu->kvm))
- kvm_vcpu_update_apicv(vcpu);
+ __kvm_vcpu_update_apicv(vcpu);

return 0;
}
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index eb9d2c23fb04..a20002924eb4 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10251,7 +10251,7 @@ void kvm_make_scan_ioapic_request(struct kvm *kvm)
kvm_make_all_cpus_request(kvm, KVM_REQ_SCAN_IOAPIC);
}

-void kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu)
+void __kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu)
{
struct kvm_lapic *apic = vcpu->arch.apic;
bool activate;
@@ -10286,7 +10286,19 @@ void kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu)
preempt_enable();
up_read(&vcpu->kvm->arch.apicv_update_lock);
}
-EXPORT_SYMBOL_GPL(kvm_vcpu_update_apicv);
+EXPORT_SYMBOL_GPL(__kvm_vcpu_update_apicv);
+
+static void kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu)
+{
+ if (!lapic_in_kernel(vcpu))
+ return;
+
+ if (apic_x2apic_mode(vcpu->arch.apic) &&
+ static_call(kvm_x86_check_apicv_inhibit_reasons)(APICV_INHIBIT_REASON_X2APIC))
+ kvm_inhibit_apic_access_page(vcpu);
+
+ __kvm_vcpu_update_apicv(vcpu);
+}

void __kvm_set_or_clear_apicv_inhibit(struct kvm *kvm,
enum kvm_apicv_inhibit reason, bool set)
--
2.38.0.rc1.362.ged0d419d3c-goog

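The lockless check followed by a recheck under slots_lock in kvm_inhibit_apic_access_page() above can be modeled in userspace. The following is only a rough sketch, not kernel code: struct vm_model, VM_MODEL_INIT, model_lock(), model_unlock(), vm_model_alloc() and vm_model_inhibit() are hypothetical names, a toy spinlock stands in for slots_lock, and SRCU is not modeled at all.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the relevant kvm->arch fields. */
struct vm_model {
	atomic_flag slots_lock;		/* toy spinlock standing in for slots_lock */
	atomic_bool memslot_enabled;	/* read locklessly, written under the lock */
	bool memslot_inhibited;		/* only touched under the lock */
	int nr_deletions;		/* how many times the "memslot" was deleted */
};

#define VM_MODEL_INIT { ATOMIC_FLAG_INIT, false, false, 0 }

static void model_lock(atomic_flag *lock)
{
	while (atomic_flag_test_and_set_explicit(lock, memory_order_acquire))
		;	/* spin until the holder clears the flag */
}

static void model_unlock(atomic_flag *lock)
{
	atomic_flag_clear_explicit(lock, memory_order_release);
}

/* Models kvm_alloc_apic_access_page(): a nop if enabled or inhibited. */
static void vm_model_alloc(struct vm_model *vm)
{
	model_lock(&vm->slots_lock);
	if (!atomic_load(&vm->memslot_enabled) && !vm->memslot_inhibited)
		atomic_store(&vm->memslot_enabled, true);
	model_unlock(&vm->slots_lock);
}

/* Models kvm_inhibit_apic_access_page(): lockless check, then recheck. */
static void vm_model_inhibit(struct vm_model *vm)
{
	/* Fast path; a stale "true" is harmless as it is rechecked below. */
	if (!atomic_load(&vm->memslot_enabled))
		return;

	model_lock(&vm->slots_lock);
	if (atomic_load(&vm->memslot_enabled)) {
		vm->nr_deletions++;	/* stands in for deleting the memslot */
		atomic_store(&vm->memslot_enabled, false);
		vm->memslot_inhibited = true;	/* sticky, blocks reallocation */
	}
	model_unlock(&vm->slots_lock);

	/* Once inhibited, vm_model_alloc() can never re-enable the slot. */
	assert(!atomic_load(&vm->memslot_enabled));
}
```

In this model, calling vm_model_inhibit() any number of times after a single vm_model_alloc() deletes the slot exactly once, and a later vm_model_alloc() (think vCPU hotplug) is a nop. As in the patch, inhibiting a VM whose slot was never enabled takes the early return and does not set the inhibited flag.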

2022-12-08 22:40:15

by Maxim Levitsky

Subject: Re: [PATCH v4 11/32] KVM: x86: Inhibit APIC memslot if x2APIC and AVIC are enabled

On Sat, 2022-10-01 at 00:58 +0000, Sean Christopherson wrote:
> Free the APIC access page memslot if any vCPU enables x2APIC and SVM's
> AVIC is enabled to prevent accesses to the virtual APIC on vCPUs with
> x2APIC enabled. On AMD, due to its "hybrid" mode where AVIC is enabled
> when x2APIC is enabled even without x2AVIC support, keeping the APIC
> access page memslot results in the guest being able to access the virtual
> APIC page as x2APIC is fully emulated by KVM. I.e. hardware isn't aware
> that the guest is operating in x2APIC mode.
>
> Exempt nested SVM's update of APICv state from new logic as x2APIC can't
> be toggled on VM-Exit. In practice, invoking the x2APIC logic should be
> harmless precisely because it should be a glorified nop, but play it
> safe to avoid latent bugs, e.g. with dropping the vCPU's SRCU lock.
>
> Intel doesn't suffer from the same issue as APICv has fully independent
> VMCS controls for xAPIC vs. x2APIC virtualization. Technically, KVM
> should provide bus error semantics and not memory semantics for the APIC
> page when x2APIC is enabled, but KVM already provides memory semantics in
> other scenarios, e.g. if APICv/AVIC is enabled and the APIC is hardware
> disabled (via APIC_BASE MSR).
>
> Reserve an inhibit bit so that common code can detect whether or not the
> "x2APIC inhibit" applies, but use a dedicated flag to track the inhibit
> so that it doesn't need to be stripped from apicv_inhibit_reasons (since
> it's not a "full" inhibit).
>
> Note, checking apic_access_memslot_enabled without taking locks relies
> on it being set during vCPU creation (before kvm_vcpu_reset()). vCPUs can
> race to set the inhibit and delete the memslot, i.e. can get false
> positives, but can't get false negatives as apic_access_memslot_enabled
> can't be toggled "on" once any vCPU reaches KVM_RUN.
>
> Opportunistically drop the "can" while updating avic_activate_vmcb()'s
> comment, i.e. to state that KVM _does_ support the hybrid mode. Move
> the "Note:" down a line to conform to preferred kernel/KVM multi-line
> comment style.
>
> Opportunistically update the apicv_update_lock comment, as it isn't
> actually used to protect apic_access_memslot_enabled (it's protected by
> slots_lock).
>
> Fixes: 0e311d33bfbe ("KVM: SVM: Introduce hybrid-AVIC mode")
> Signed-off-by: Sean Christopherson <[email protected]>
> ---
> arch/x86/include/asm/kvm_host.h | 20 +++++++++++++----
> arch/x86/kvm/lapic.c | 38 ++++++++++++++++++++++++++++++++-
> arch/x86/kvm/lapic.h | 1 +
> arch/x86/kvm/svm/avic.c | 15 +++++++------
> arch/x86/kvm/svm/nested.c | 2 +-
> arch/x86/kvm/x86.c | 16 ++++++++++++--
> 6 files changed, 77 insertions(+), 15 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index d40206b16d6c..062758135c86 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1139,6 +1139,17 @@ enum kvm_apicv_inhibit {
> * AVIC is disabled because SEV doesn't support it.
> */
> APICV_INHIBIT_REASON_SEV,
> +
> + /*
> + * Due to sharing page tables across vCPUs, the xAPIC memslot must be
> + * deleted if any vCPU has x2APIC enabled as SVM doesn't provide fully
> + * independent controls for AVIC vs. x2AVIC, and also because SVM
> + * supports a "hybrid" AVIC mode for CPUs that support AVIC but not
> + * x2AVIC. Note, this isn't a "full" inhibit and is tracked separately.
> + * AVIC can still be activated, but KVM must not create SPTEs for the
> + * APIC base. For simplicity, this is sticky.
> + */
> + APICV_INHIBIT_REASON_X2APIC,

I still don't understand why do you want this to be an inhibit bit.

Now this 'inhibit' is not even set/clear.

I prefer to just have a boolean 'is_avic' or, '.needs_x2apic_memslot_inhibition'
in the vendor ops, and check it in 'kvm_vcpu_update_apicv' with the above comment on top of it.

needs_x2apic_memslot_inhibition can even be set to false when x2avic is supported at the initialization time,
because then AVIC behaves just like APICv (when x2avic bit is enabled, AVIC mmio is no longer decoded).



static void kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu)
{
	if (!lapic_in_kernel(vcpu))
		return;

	/*
	 * Due to sharing page tables across vCPUs, the xAPIC memslot must be
	 * deleted if any vCPU has x2APIC enabled as SVM without x2AVIC support
	 * doesn't provide fully independent controls for AVIC vs. x2AVIC.
	 * For simplicity, this is sticky.
	 */
	if (apic_x2apic_mode(vcpu->arch.apic) &&
	    kvm_x86_ops.needs_x2apic_memslot_inhibition)
		kvm_inhibit_apic_access_page(vcpu);

	__kvm_vcpu_update_apicv(vcpu);
}

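As a rough illustration of the suggestion above (not part of the review itself), the flag could be derived once at setup time from the CPU's capabilities. Everything in this sketch is a hypothetical userspace model: struct svm_caps_model, struct vendor_ops_model, svm_model_setup() and should_inhibit_memslot() do not exist in KVM, and only the flag name is taken from the proposal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical capability summary; not a real KVM structure. */
struct svm_caps_model {
	bool avic;	/* AVIC (xAPIC virtualization) is usable */
	bool x2avic;	/* x2AVIC is usable as well */
};

/* Hypothetical subset of the vendor ops carrying the proposed flag. */
struct vendor_ops_model {
	bool needs_x2apic_memslot_inhibition;
};

/* Computed once at initialization, as suggested in the review. */
static void svm_model_setup(struct vendor_ops_model *ops,
			    struct svm_caps_model caps)
{
	assert(ops != NULL);

	/* Only the hybrid case (AVIC without x2AVIC) needs the inhibit. */
	ops->needs_x2apic_memslot_inhibition = caps.avic && !caps.x2avic;
}

/* Mirrors the condition checked in the proposed kvm_vcpu_update_apicv(). */
static bool should_inhibit_memslot(const struct vendor_ops_model *ops,
				   bool vcpu_in_x2apic)
{
	assert(ops != NULL);

	return vcpu_in_x2apic && ops->needs_x2apic_memslot_inhibition;
}
```

With x2AVIC present the flag stays false, so a vCPU switching to x2APIC leaves the memslot alone. With AVIC only, the first vCPU to enter x2APIC mode triggers the inhibit.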

With this fixed:

Reviewed-by: Maxim Levitsky <[email protected]>

Best regards,
Maxim Levitsky


> };
>
> struct kvm_arch {
> @@ -1176,10 +1187,11 @@ struct kvm_arch {
> struct kvm_apic_map __rcu *apic_map;
> atomic_t apic_map_dirty;
>
> - /* Protects apic_access_memslot_enabled and apicv_inhibit_reasons */
> - struct rw_semaphore apicv_update_lock;
> -
> bool apic_access_memslot_enabled;
> + bool apic_access_memslot_inhibited;
> +
> + /* Protects apicv_inhibit_reasons */
> + struct rw_semaphore apicv_update_lock;
> unsigned long apicv_inhibit_reasons;
>
> gpa_t wall_clock;
> @@ -1912,7 +1924,7 @@ gpa_t kvm_mmu_gva_to_gpa_system(struct kvm_vcpu *vcpu, gva_t gva,
>
> bool kvm_apicv_activated(struct kvm *kvm);
> bool kvm_vcpu_apicv_activated(struct kvm_vcpu *vcpu);
> -void kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu);
> +void __kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu);
> void __kvm_set_or_clear_apicv_inhibit(struct kvm *kvm,
> enum kvm_apicv_inhibit reason, bool set);
> void kvm_set_or_clear_apicv_inhibit(struct kvm *kvm,
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 80e8b1cc6dc2..42b61469674d 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -2443,7 +2443,8 @@ int kvm_alloc_apic_access_page(struct kvm *kvm)
> int ret = 0;
>
> mutex_lock(&kvm->slots_lock);
> - if (kvm->arch.apic_access_memslot_enabled)
> + if (kvm->arch.apic_access_memslot_enabled ||
> + kvm->arch.apic_access_memslot_inhibited)
> goto out;
>
> hva = __x86_set_memory_region(kvm, APIC_ACCESS_PAGE_PRIVATE_MEMSLOT,
> @@ -2471,6 +2472,41 @@ int kvm_alloc_apic_access_page(struct kvm *kvm)
> }
> EXPORT_SYMBOL_GPL(kvm_alloc_apic_access_page);
>
> +void kvm_inhibit_apic_access_page(struct kvm_vcpu *vcpu)
> +{
> + struct kvm *kvm = vcpu->kvm;
> +
> + if (!kvm->arch.apic_access_memslot_enabled)
> + return;
> +
> + kvm_vcpu_srcu_read_unlock(vcpu);
> +
> + mutex_lock(&kvm->slots_lock);
> +
> + if (kvm->arch.apic_access_memslot_enabled) {
> + __x86_set_memory_region(kvm, APIC_ACCESS_PAGE_PRIVATE_MEMSLOT, 0, 0);
> + /*
> + * Clear "enabled" after the memslot is deleted so that a
> + * different vCPU doesn't get a false negative when checking
> + * the flag out of slots_lock. No additional memory barrier is
> + * needed as modifying memslots requires waiting for other vCPUs to
> + * drop SRCU (see above), and false positives are ok as the
> + * flag is rechecked after acquiring slots_lock.
> + */
> + kvm->arch.apic_access_memslot_enabled = false;
> +
> + /*
> + * Mark the memslot as inhibited to prevent reallocating the
> + * memslot during vCPU creation, e.g. if a vCPU is hotplugged.
> + */
> + kvm->arch.apic_access_memslot_inhibited = true;
> + }
> +
> + mutex_unlock(&kvm->slots_lock);
> +
> + kvm_vcpu_srcu_read_lock(vcpu);
> +}
> +
> void kvm_lapic_reset(struct kvm_vcpu *vcpu, bool init_event)
> {
> struct kvm_lapic *apic = vcpu->arch.apic;
> diff --git a/arch/x86/kvm/lapic.h b/arch/x86/kvm/lapic.h
> index 0587a8282cb3..a318609bb050 100644
> --- a/arch/x86/kvm/lapic.h
> +++ b/arch/x86/kvm/lapic.h
> @@ -113,6 +113,7 @@ int kvm_apic_set_irq(struct kvm_vcpu *vcpu, struct kvm_lapic_irq *irq,
> int kvm_apic_local_deliver(struct kvm_lapic *apic, int lvt_type);
> void kvm_apic_update_apicv(struct kvm_vcpu *vcpu);
> int kvm_alloc_apic_access_page(struct kvm *kvm);
> +void kvm_inhibit_apic_access_page(struct kvm_vcpu *vcpu);
>
> bool kvm_irq_delivery_to_apic_fast(struct kvm *kvm, struct kvm_lapic *src,
> struct kvm_lapic_irq *irq, int *r, struct dest_map *dest_map);
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index ec28ba4c5f1b..535e35edce1d 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -72,12 +72,12 @@ static void avic_activate_vmcb(struct vcpu_svm *svm)
>
> vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
>
> - /* Note:
> - * KVM can support hybrid-AVIC mode, where KVM emulates x2APIC
> - * MSR accesses, while interrupt injection to a running vCPU
> - * can be achieved using AVIC doorbell. The AVIC hardware still
> - * accelerate MMIO accesses, but this does not cause any harm
> - * as the guest is not supposed to access xAPIC mmio when uses x2APIC.
> + /*
> + * Note: KVM supports hybrid-AVIC mode, where KVM emulates x2APIC MSR
> + * accesses, while interrupt injection to a running vCPU can be
> + * achieved using AVIC doorbell. KVM disables the APIC access page
> + * (deletes the memslot) if any vCPU has x2APIC enabled, thus enabling
> + * AVIC in hybrid mode activates only the doorbell mechanism.
> */
> if (apic_x2apic_mode(svm->vcpu.arch.apic) &&
> avic_mode == AVIC_MODE_X2) {
> @@ -975,7 +975,8 @@ bool avic_check_apicv_inhibit_reasons(enum kvm_apicv_inhibit reason)
> BIT(APICV_INHIBIT_REASON_BLOCKIRQ) |
> BIT(APICV_INHIBIT_REASON_SEV) |
> BIT(APICV_INHIBIT_REASON_APIC_ID_MODIFIED) |
> - BIT(APICV_INHIBIT_REASON_APIC_BASE_MODIFIED);
> + BIT(APICV_INHIBIT_REASON_APIC_BASE_MODIFIED) |
> + BIT(APICV_INHIBIT_REASON_X2APIC);
>
> return supported & BIT(reason);
> }
> diff --git a/arch/x86/kvm/svm/nested.c b/arch/x86/kvm/svm/nested.c
> index 4c620999d230..8d5e00a7ef84 100644
> --- a/arch/x86/kvm/svm/nested.c
> +++ b/arch/x86/kvm/svm/nested.c
> @@ -1084,7 +1084,7 @@ int nested_svm_vmexit(struct vcpu_svm *svm)
> * to benefit from it right away.
> */
> if (kvm_apicv_activated(vcpu->kvm))
> - kvm_vcpu_update_apicv(vcpu);
> + __kvm_vcpu_update_apicv(vcpu);
>
> return 0;
> }
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index eb9d2c23fb04..a20002924eb4 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -10251,7 +10251,7 @@ void kvm_make_scan_ioapic_request(struct kvm *kvm)
> kvm_make_all_cpus_request(kvm, KVM_REQ_SCAN_IOAPIC);
> }
>
> -void kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu)
> +void __kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu)
> {
> struct kvm_lapic *apic = vcpu->arch.apic;
> bool activate;
> @@ -10286,7 +10286,19 @@ void kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu)
> preempt_enable();
> up_read(&vcpu->kvm->arch.apicv_update_lock);
> }
> -EXPORT_SYMBOL_GPL(kvm_vcpu_update_apicv);
> +EXPORT_SYMBOL_GPL(__kvm_vcpu_update_apicv);
> +
> +static void kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu)
> +{
> + if (!lapic_in_kernel(vcpu))
> + return;
> +
> + if (apic_x2apic_mode(vcpu->arch.apic) &&
> + static_call(kvm_x86_check_apicv_inhibit_reasons)(APICV_INHIBIT_REASON_X2APIC))
> + kvm_inhibit_apic_access_page(vcpu);
> +
> + __kvm_vcpu_update_apicv(vcpu);
> +}
>
> void __kvm_set_or_clear_apicv_inhibit(struct kvm *kvm,
> enum kvm_apicv_inhibit reason, bool set)


2022-12-16 19:12:53

by Sean Christopherson

Subject: Re: [PATCH v4 11/32] KVM: x86: Inhibit APIC memslot if x2APIC and AVIC are enabled

On Thu, Dec 08, 2022, Maxim Levitsky wrote:
> On Sat, 2022-10-01 at 00:58 +0000, Sean Christopherson wrote:
> > diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> > index d40206b16d6c..062758135c86 100644
> > --- a/arch/x86/include/asm/kvm_host.h
> > +++ b/arch/x86/include/asm/kvm_host.h
> > @@ -1139,6 +1139,17 @@ enum kvm_apicv_inhibit {
> > * AVIC is disabled because SEV doesn't support it.
> > */
> > APICV_INHIBIT_REASON_SEV,
> > +
> > + /*
> > + * Due to sharing page tables across vCPUs, the xAPIC memslot must be
> > + * deleted if any vCPU has x2APIC enabled as SVM doesn't provide fully
> > + * independent controls for AVIC vs. x2AVIC, and also because SVM
> > + * supports a "hybrid" AVIC mode for CPUs that support AVIC but not
> > + * x2AVIC. Note, this isn't a "full" inhibit and is tracked separately.
> > + * AVIC can still be activated, but KVM must not create SPTEs for the
> > + * APIC base. For simplicity, this is sticky.
> > + */
> > + APICV_INHIBIT_REASON_X2APIC,
>
> I still don't understand why do you want this to be an inhibit bit.

Because in my mental model, it's an inhibit, but with special properties. But I
totally get why that's confusing.

> Now this 'inhibit' is not even set/clear.
>
> I prefer to just have a boolean 'is_avic' or,
> '.needs_x2apic_memslot_inhibition' in the vendor ops, and check it in
> 'kvm_vcpu_update_apicv' with the above comment on top of it.
>
> needs_x2apic_memslot_inhibition can even be set to false when x2avic is
> supported at the initialization time, because then AVIC behaves just like
> APICv (when x2avic bit is enabled, AVIC mmio is no longer decoded).

Oh, so SVM does effectively have independent controls, it's only the "hybrid" mode
that's affected? In that case, how about this?

	/*
	 * Due to sharing page tables across vCPUs, the xAPIC memslot must be
	 * deleted if any vCPU has x2APIC enabled and hardware doesn't support
	 * x2APIC virtualization. E.g. some AMD CPUs support AVIC but not
	 * x2AVIC. KVM still allows enabling AVIC in this case so that KVM can
	 * use the AVIC doorbell to inject interrupts to running vCPUs, but KVM
	 * mustn't create SPTEs for the APIC base as the vCPU would incorrectly
	 * be able to access the vAPIC page via MMIO despite being in x2APIC
	 * mode. For simplicity, inhibiting the APIC access page is sticky.
	 */
	if (apic_x2apic_mode(vcpu->arch.apic) &&
	    !kvm_x86_ops.has_hardware_x2apic_virtualization)
		kvm_inhibit_apic_access_page(vcpu);

2022-12-16 19:54:37

by Sean Christopherson

Subject: Re: [PATCH v4 11/32] KVM: x86: Inhibit APIC memslot if x2APIC and AVIC are enabled

On Fri, Dec 16, 2022, Sean Christopherson wrote:
> On Thu, Dec 08, 2022, Maxim Levitsky wrote:
> > I prefer to just have a boolean 'is_avic' or,
> > '.needs_x2apic_memslot_inhibition' in the vendor ops, and check it in
> > 'kvm_vcpu_update_apicv' with the above comment on top of it.
> >
> > needs_x2apic_memslot_inhibition can even be set to false when x2avic is
> > supported at the initialization time, because then AVIC behaves just like
> > APICv (when x2avic bit is enabled, AVIC mmio is no longer decoded).
>
> Oh, so SVM does effectively have independent controls, it's only the "hybrid" mode
> that's affected? In that case, how about this?
>
> /*
> * Due to sharing page tables across vCPUs, the xAPIC memslot must be
> * deleted if any vCPU has x2APIC enabled and hardware doesn't support
> * x2APIC virtualization. E.g. some AMD CPUs support AVIC but not
> * x2AVIC. KVM still allows enabling AVIC in this case so that KVM can
> * use the AVIC doorbell to inject interrupts to running vCPUs, but KVM
> * mustn't create SPTEs for the APIC base as the vCPU would incorrectly
> * be able to access the vAPIC page via MMIO despite being in x2APIC
> * mode. For simplicity, inhibiting the APIC access page is sticky.
> */
> if (apic_x2apic_mode(vcpu->arch.apic) &&
> !kvm_x86_ops.has_hardware_x2apic_virtualization)

Hrm, that's not quite right either since it's obviously possible to have an Intel
CPU that supports APICv but not x2APIC virtualization. And in that case KVM
doesn't need to inhibit the memslot, e.g. if not all vCPUs are in x2APIC.

I was hoping to have a name that communicates _why_ the memslot needs to be
inhibited, but it's turning out to be really hard to come up with a name that's
descriptive without being ridiculously verbose. The best I've come up with is:

allow_apicv_in_x2apic_without_x2apic_virtualization

It's heinous, but I'm inclined to go with it unless someone has a better idea.
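To make the difference between the two candidate conditions concrete, here is a small userspace sketch comparing them across four hardware configurations. It is an illustration under assumptions, not KVM code: struct hw_model, hw_models[] and both helper functions are hypothetical, and the per-configuration values of the verbose flag are inferred from the discussion above rather than taken from any posted patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-configuration description; field values are assumed. */
struct hw_model {
	const char *name;
	bool has_hardware_x2apic_virtualization;
	/* True only if APICv/AVIC stays active for x2APIC vCPUs anyway. */
	bool allow_apicv_in_x2apic_without_x2apic_virtualization;
};

static const struct hw_model hw_models[] = {
	{ "Intel APICv without x2APIC virtualization", false, false },
	{ "Intel APICv with x2APIC virtualization",    true,  false },
	{ "AMD AVIC without x2AVIC (hybrid mode)",     false, true  },
	{ "AMD AVIC with x2AVIC",                      true,  false },
};

/* The first attempt: inhibit whenever hardware lacks x2APIC virtualization. */
static bool inhibit_by_first_attempt(const struct hw_model *hw, bool x2apic)
{
	assert(hw != NULL);

	return x2apic && !hw->has_hardware_x2apic_virtualization;
}

/* The verbose flag: inhibit only where the hybrid behavior is in use. */
static bool inhibit_by_verbose_flag(const struct hw_model *hw, bool x2apic)
{
	assert(hw != NULL);

	return x2apic &&
	       hw->allow_apicv_in_x2apic_without_x2apic_virtualization;
}
```

The two predicates disagree only for the first entry: the first attempt would delete the memslot on an Intel CPU that has APICv but no x2APIC virtualization, which is the case called out above as not needing the inhibit.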

2022-12-27 11:56:01

by Paolo Bonzini

Subject: Re: [PATCH v4 11/32] KVM: x86: Inhibit APIC memslot if x2APIC and AVIC are enabled

On 12/16/22 20:40, Sean Christopherson wrote:
> I was hoping to have a name that communicates _why_ the memslot needs to be
> inhibited, but it's turning out to be really hard to come up with a name that's
> descriptive without being ridiculously verbose. The best I've come up with is:
>
> allow_apicv_in_x2apic_without_x2apic_virtualization
>
> It's heinous, but I'm inclined to go with it unless someone has a better idea.

Can any of you provide a patch to squash on top of this one (or on top
of kvm/queue, as you prefer)?

Paolo

2023-01-03 17:20:24

by Sean Christopherson

Subject: Re: [PATCH v4 11/32] KVM: x86: Inhibit APIC memslot if x2APIC and AVIC are enabled

On Tue, Dec 27, 2022, Paolo Bonzini wrote:
> On 12/16/22 20:40, Sean Christopherson wrote:
> > I was hoping to have a name that communicates _why_ the memslot needs to be
> > inhibited, but it's turning out to be really hard to come up with a name that's
> > descriptive without being ridiculously verbose. The best I've come up with is:
> >
> > allow_apicv_in_x2apic_without_x2apic_virtualization
> >
> > It's heinous, but I'm inclined to go with it unless someone has a better idea.
>
> Can any of you provide a patch to squash on top of this one (or on top of
> kvm/queue, as you prefer)?

I'll send patches, though it might take a day or three. I had a reworked version
of this series prepped and tested before vacation kicked in, but lost power before
I could send my traditional pre-vacation patch bomb :-/