2022-05-09 03:08:23

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: [PATCH v4 00/15] Introducing AMD x2AVIC and hybrid-AVIC modes

Introducing support for AMD x2APIC virtualization. This feature is
indicated by the CPUID Fn8000_000A EDX[14], and it can be activated
by setting bit 31 (enable AVIC) and bit 30 (x2APIC mode) of VMCB
offset 60h.

With x2AVIC support, the guest local APIC can be fully virtualized in
both xAPIC and x2APIC modes, and the mode can be changed during runtime.
For example, when AVIC is enabled, the hypervisor set VMCB bit 31
to activate AVIC for each vCPU. Then, it keeps track of each vCPU's
APIC mode, and updates VMCB bit 30 to enable/disable x2APIC
virtualization mode accordingly.

Besides setting bit VMCB bit 30 and 31, for x2AVIC, kvm_amd driver needs
to disable interception for the x2APIC MSR range to allow AVIC hardware
to virtualize register accesses.

This series also introduce a partial APIC virtualization (hybrid-AVIC)
mode, where APIC register accesses are trapped (i.e. not virtualized
by hardware), but leverage AVIC doorbell for interrupt injection.
This eliminates need to disable x2APIC in the guest on system without
x2AVIC support. (Note: suggested by Maxim)

Regards,
Suravee

Testing for v4:
* Tested booting a Linux VM with x2APIC physical and logical modes upto 512 vCPUs.
* Test enable AVIC in L0 with xAPIC and x2AVIC modes in L1 and launch L2 guest
* Test partial AVIC mode by launching a VM with x2APIC mode

Changes from v3:
(https://lore.kernel.org/lkml/[email protected]/T/)
* Patch 3 : Update logic force_avic
* Patch 8 : Move logic for handling APIC disable to common code (new)
* Patch 9 : Only call avic_refresh_apicv_exec_ctrl
* Patch 12 : Remove APICV_INHIBIT_REASON_X2APIC, and add more comment for hybrid-AVIC mode

Suravee Suthikulpanit (15):
x86/cpufeatures: Introduce x2AVIC CPUID bit
KVM: x86: lapic: Rename [GET/SET]_APIC_DEST_FIELD to
[GET/SET]_XAPIC_DEST_FIELD
KVM: SVM: Detect X2APIC virtualization (x2AVIC) support
KVM: SVM: Update max number of vCPUs supported for x2AVIC mode
KVM: SVM: Update avic_kick_target_vcpus to support 32-bit APIC ID
KVM: SVM: Do not support updating APIC ID when in x2APIC mode
KVM: SVM: Adding support for configuring x2APIC MSRs interception
KVM: x86: Deactivate APICv on vCPU with APIC disabled
KVM: SVM: Refresh AVIC configuration when changing APIC mode
KVM: SVM: Introduce helper functions to (de)activate AVIC and x2AVIC
KVM: SVM: Do not throw warning when calling avic_vcpu_load on a
running vcpu
KVM: SVM: Introduce hybrid-AVIC mode
KVM: x86: Warning APICv inconsistency only when vcpu APIC mode is
valid
KVM: SVM: Use target APIC ID to complete x2AVIC IRQs when possible
KVM: SVM: Add AVIC doorbell tracepoint

arch/x86/hyperv/hv_apic.c | 2 +-
arch/x86/include/asm/apicdef.h | 4 +-
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/kvm_host.h | 1 -
arch/x86/include/asm/svm.h | 21 +++-
arch/x86/kernel/apic/apic.c | 2 +-
arch/x86/kernel/apic/ipi.c | 2 +-
arch/x86/kvm/lapic.c | 6 +-
arch/x86/kvm/svm/avic.c | 191 ++++++++++++++++++++++++++---
arch/x86/kvm/svm/svm.c | 56 +++++----
arch/x86/kvm/svm/svm.h | 6 +-
arch/x86/kvm/trace.h | 18 +++
arch/x86/kvm/x86.c | 8 +-
13 files changed, 262 insertions(+), 56 deletions(-)

--
2.25.1



2022-05-09 04:14:33

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: [PATCH v4 03/15] KVM: SVM: Detect X2APIC virtualization (x2AVIC) support

Add CPUID check for the x2APIC virtualization (x2AVIC) feature.
If available, the SVM driver can support both AVIC and x2AVIC modes
when load the kvm_amd driver with avic=1. The operating mode will be
determined at runtime depending on the guest APIC mode.

Reviewed-by: Maxim Levitsky <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/include/asm/svm.h | 3 +++
arch/x86/kvm/svm/avic.c | 51 ++++++++++++++++++++++++++++++++++++++
arch/x86/kvm/svm/svm.c | 15 ++---------
arch/x86/kvm/svm/svm.h | 1 +
4 files changed, 57 insertions(+), 13 deletions(-)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index f70a5108d464..2c2a104b777e 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -195,6 +195,9 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
#define AVIC_ENABLE_SHIFT 31
#define AVIC_ENABLE_MASK (1 << AVIC_ENABLE_SHIFT)

+#define X2APIC_MODE_SHIFT 30
+#define X2APIC_MODE_MASK (1 << X2APIC_MODE_SHIFT)
+
#define LBR_CTL_ENABLE_MASK BIT_ULL(0)
#define VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK BIT_ULL(1)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index a8f514212b87..95006bbdf970 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -40,6 +40,15 @@
#define AVIC_GATAG_TO_VMID(x) ((x >> AVIC_VCPU_ID_BITS) & AVIC_VM_ID_MASK)
#define AVIC_GATAG_TO_VCPUID(x) (x & AVIC_VCPU_ID_MASK)

+enum avic_modes {
+ AVIC_MODE_NONE = 0,
+ AVIC_MODE_X1,
+ AVIC_MODE_X2,
+};
+
+static bool force_avic;
+module_param_unsafe(force_avic, bool, 0444);
+
/* Note:
* This hash table is used to map VM_ID to a struct kvm_svm,
* when handling AMD IOMMU GALOG notification to schedule in
@@ -50,6 +59,7 @@ static DEFINE_HASHTABLE(svm_vm_data_hash, SVM_VM_DATA_HASH_BITS);
static u32 next_vm_id = 0;
static bool next_vm_id_wrapped = 0;
static DEFINE_SPINLOCK(svm_vm_data_hash_lock);
+static enum avic_modes avic_mode;

/*
* This is a wrapper of struct amd_iommu_ir_data.
@@ -1077,3 +1087,44 @@ void avic_vcpu_unblocking(struct kvm_vcpu *vcpu)

avic_vcpu_load(vcpu);
}
+
+/*
+ * Note:
+ * - The module param avic enable both xAPIC and x2APIC mode.
+ * - Hypervisor can support both xAVIC and x2AVIC in the same guest.
+ * - The mode can be switched at run-time.
+ */
+bool avic_hardware_setup(struct kvm_x86_ops *x86_ops)
+{
+ if (!npt_enabled)
+ return false;
+
+ if (boot_cpu_has(X86_FEATURE_AVIC)) {
+ avic_mode = AVIC_MODE_X1;
+ pr_info("AVIC enabled\n");
+ } else if (force_avic) {
+ /*
+ * Some older systems does not advertise AVIC support.
+ * See Revision Guide for specific AMD processor for more detail.
+ */
+ avic_mode = AVIC_MODE_X1;
+ pr_warn("AVIC is not supported in CPUID but force enabled");
+ pr_warn("Your system might crash and burn");
+ }
+
+ /* AVIC is a prerequisite for x2AVIC. */
+ if (boot_cpu_has(X86_FEATURE_X2AVIC)) {
+ if (avic_mode == AVIC_MODE_X1) {
+ avic_mode = AVIC_MODE_X2;
+ pr_info("x2AVIC enabled\n");
+ } else {
+ pr_warn(FW_BUG "Cannot support x2AVIC due to AVIC is disabled");
+ pr_warn(FW_BUG "Try enable AVIC using force_avic option");
+ }
+ }
+
+ if (avic_mode != AVIC_MODE_NONE)
+ amd_iommu_register_ga_log_notifier(&avic_ga_log_notifier);
+
+ return !!avic_mode;
+}
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 3b49337998ec..74e6f86f5dc3 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -188,9 +188,6 @@ module_param(tsc_scaling, int, 0444);
static bool avic;
module_param(avic, bool, 0444);

-static bool force_avic;
-module_param_unsafe(force_avic, bool, 0444);
-
bool __read_mostly dump_invalid_vmcb;
module_param(dump_invalid_vmcb, bool, 0644);

@@ -4913,17 +4910,9 @@ static __init int svm_hardware_setup(void)
nrips = false;
}

- enable_apicv = avic = avic && npt_enabled && (boot_cpu_has(X86_FEATURE_AVIC) || force_avic);
+ enable_apicv = avic = avic && avic_hardware_setup(&svm_x86_ops);

- if (enable_apicv) {
- if (!boot_cpu_has(X86_FEATURE_AVIC)) {
- pr_warn("AVIC is not supported in CPUID but force enabled");
- pr_warn("Your system might crash and burn");
- } else
- pr_info("AVIC enabled\n");
-
- amd_iommu_register_ga_log_notifier(&avic_ga_log_notifier);
- } else {
+ if (!enable_apicv) {
svm_x86_ops.vcpu_blocking = NULL;
svm_x86_ops.vcpu_unblocking = NULL;
svm_x86_ops.vcpu_get_apicv_inhibit_reasons = NULL;
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 32220a1b0ea2..678fc7757fe4 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -603,6 +603,7 @@ extern struct kvm_x86_nested_ops svm_nested_ops;

/* avic.c */

+bool avic_hardware_setup(struct kvm_x86_ops *ops);
int avic_ga_log_notifier(u32 ga_tag);
void avic_vm_destroy(struct kvm *kvm);
int avic_vm_init(struct kvm *kvm);
--
2.25.1


2022-05-09 05:25:52

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: [PATCH v4 08/15] KVM: x86: Deactivate APICv on vCPU with APIC disabled

APICv should be deactivated on vCPU that has APIC disabled.
Therefore, call kvm_vcpu_update_apicv() when changing
APIC mode, and add additional check for APIC disable mode
when determine APICV activation,

Suggested-by: Maxim Levitsky <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/kvm/lapic.c | 4 +++-
arch/x86/kvm/x86.c | 4 +++-
2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 8b8c4a905976..680824d7aa0d 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2346,8 +2346,10 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value)
if (((old_value ^ value) & X2APIC_ENABLE) && (value & X2APIC_ENABLE))
kvm_apic_set_x2apic_id(apic, vcpu->vcpu_id);

- if ((old_value ^ value) & (MSR_IA32_APICBASE_ENABLE | X2APIC_ENABLE))
+ if ((old_value ^ value) & (MSR_IA32_APICBASE_ENABLE | X2APIC_ENABLE)) {
+ kvm_vcpu_update_apicv(vcpu);
static_call_cond(kvm_x86_set_virtual_apic_mode)(vcpu);
+ }

apic->base_address = apic->vcpu->arch.apic_base &
MSR_IA32_APICBASE_BASE;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8ee8c91fa762..77e49892dea1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9836,7 +9836,9 @@ void kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu)

down_read(&vcpu->kvm->arch.apicv_update_lock);

- activate = kvm_vcpu_apicv_activated(vcpu);
+ /* Do not activate APICV when APIC is disabled */
+ activate = kvm_vcpu_apicv_activated(vcpu) &&
+ (kvm_get_apic_mode(vcpu) != LAPIC_MODE_DISABLED);

if (vcpu->arch.apicv_active == activate)
goto out;
--
2.25.1


2022-05-09 05:45:24

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: [PATCH v4 10/15] KVM: SVM: Introduce helper functions to (de)activate AVIC and x2AVIC

Refactor the current logic for (de)activate AVIC into helper functions,
and also add logic for (de)activate x2AVIC. The helper function are used
when initializing AVIC and switching from AVIC to x2AVIC mode
(handled by svm_refresh_spicv_exec_ctrl()).

When an AVIC-enabled guest switches from APIC to x2APIC mode during
runtime, the SVM driver needs to perform the following steps:

1. Set the x2APIC mode bit for AVIC in VMCB along with the maximum
APIC ID support for each mode accodingly.

2. Disable x2APIC MSRs interception in order to allow the hardware
to virtualize x2APIC MSRs accesses.

Reported-by: kernel test robot <[email protected]>
Reviewed-by: Maxim Levitsky <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/include/asm/svm.h | 6 +++++
arch/x86/kvm/svm/avic.c | 54 ++++++++++++++++++++++++++++++++++----
arch/x86/kvm/svm/svm.c | 6 ++---
arch/x86/kvm/svm/svm.h | 1 +
4 files changed, 58 insertions(+), 9 deletions(-)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 4c26b0d47d76..f5525c0e03f7 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -256,6 +256,7 @@ enum avic_ipi_failure_cause {
AVIC_IPI_FAILURE_INVALID_BACKING_PAGE,
};

+#define AVIC_PHYSICAL_MAX_INDEX_MASK GENMASK_ULL(9, 0)

/*
* For AVIC, the max index allowed for physical APIC ID
@@ -500,4 +501,9 @@ DEFINE_GHCB_ACCESSORS(sw_exit_info_2)
DEFINE_GHCB_ACCESSORS(sw_scratch)
DEFINE_GHCB_ACCESSORS(xcr0)

+struct svm_direct_access_msrs {
+ u32 index; /* Index of the MSR */
+ bool always; /* True if intercept is initially cleared */
+};
+
#endif
diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index a82981722018..ad2ef6c00559 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -69,6 +69,51 @@ struct amd_svm_iommu_ir {
void *data; /* Storing pointer to struct amd_ir_data */
};

+static inline void avic_set_x2apic_msr_interception(struct vcpu_svm *svm, bool disable)
+{
+ int i;
+
+ for (i = 0; i < MAX_DIRECT_ACCESS_MSRS; i++) {
+ int index = direct_access_msrs[i].index;
+
+ if ((index < APIC_BASE_MSR) ||
+ (index > APIC_BASE_MSR + 0xff))
+ continue;
+ set_msr_interception(&svm->vcpu, svm->msrpm, index,
+ !disable, !disable);
+ }
+}
+
+static void avic_activate_vmcb(struct vcpu_svm *svm)
+{
+ struct vmcb *vmcb = svm->vmcb01.ptr;
+
+ vmcb->control.int_ctl &= ~(AVIC_ENABLE_MASK | X2APIC_MODE_MASK);
+ vmcb->control.avic_physical_id &= ~AVIC_PHYSICAL_MAX_INDEX_MASK;
+
+ vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
+ if (apic_x2apic_mode(svm->vcpu.arch.apic)) {
+ vmcb->control.int_ctl |= X2APIC_MODE_MASK;
+ vmcb->control.avic_physical_id |= X2AVIC_MAX_PHYSICAL_ID;
+ /* Disabling MSR intercept for x2APIC registers */
+ avic_set_x2apic_msr_interception(svm, false);
+ } else {
+ vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID;
+ /* Enabling MSR intercept for x2APIC registers */
+ avic_set_x2apic_msr_interception(svm, true);
+ }
+}
+
+static void avic_deactivate_vmcb(struct vcpu_svm *svm)
+{
+ struct vmcb *vmcb = svm->vmcb01.ptr;
+
+ vmcb->control.int_ctl &= ~(AVIC_ENABLE_MASK | X2APIC_MODE_MASK);
+ vmcb->control.avic_physical_id &= ~AVIC_PHYSICAL_MAX_INDEX_MASK;
+
+ /* Enabling MSR intercept for x2APIC registers */
+ avic_set_x2apic_msr_interception(svm, true);
+}

/* Note:
* This function is called from IOMMU driver to notify
@@ -185,13 +230,12 @@ void avic_init_vmcb(struct vcpu_svm *svm, struct vmcb *vmcb)
vmcb->control.avic_backing_page = bpa & AVIC_HPA_MASK;
vmcb->control.avic_logical_id = lpa & AVIC_HPA_MASK;
vmcb->control.avic_physical_id = ppa & AVIC_HPA_MASK;
- vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID;
vmcb->control.avic_vapic_bar = APIC_DEFAULT_PHYS_BASE & VMCB_AVIC_APIC_BAR_MASK;

if (kvm_apicv_activated(svm->vcpu.kvm))
- vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
+ avic_activate_vmcb(svm);
else
- vmcb->control.int_ctl &= ~AVIC_ENABLE_MASK;
+ avic_deactivate_vmcb(svm);
}

static u64 *avic_get_physical_id_entry(struct kvm_vcpu *vcpu,
@@ -1082,9 +1126,9 @@ void avic_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
* accordingly before re-activating.
*/
avic_apicv_post_state_restore(vcpu);
- vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
+ avic_activate_vmcb(svm);
} else {
- vmcb->control.int_ctl &= ~AVIC_ENABLE_MASK;
+ avic_deactivate_vmcb(svm);
}
vmcb_mark_dirty(vmcb, VMCB_AVIC);

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 9066568fd19d..96a1fc1a1d1b 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -74,10 +74,8 @@ static uint64_t osvw_len = 4, osvw_status;

static DEFINE_PER_CPU(u64, current_tsc_ratio);

-static const struct svm_direct_access_msrs {
- u32 index; /* Index of the MSR */
- bool always; /* True if intercept is initially cleared */
-} direct_access_msrs[MAX_DIRECT_ACCESS_MSRS] = {
+const struct svm_direct_access_msrs
+direct_access_msrs[MAX_DIRECT_ACCESS_MSRS] = {
{ .index = MSR_STAR, .always = true },
{ .index = MSR_IA32_SYSENTER_CS, .always = true },
{ .index = MSR_IA32_SYSENTER_EIP, .always = false },
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 5ed958863b81..bb5bf70de3b2 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -600,6 +600,7 @@ void nested_vmcb02_compute_g_pat(struct vcpu_svm *svm);
void svm_switch_vmcb(struct vcpu_svm *svm, struct kvm_vmcb_info *target_vmcb);

extern struct kvm_x86_nested_ops svm_nested_ops;
+extern const struct svm_direct_access_msrs direct_access_msrs[];

/* avic.c */

--
2.25.1


2022-05-09 06:31:17

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: [PATCH v4 12/15] KVM: SVM: Introduce hybrid-AVIC mode

Currently, AVIC is inhibited when booting a VM w/ x2APIC support.
because AVIC cannot virtualize x2APIC MSR register accesses.
However, the AVIC doorbell can be used to accelerate interrupt
injection into a running vCPU, while all guest accesses to x2APIC MSRs
will be intercepted and emulated by KVM.

With hybrid-AVIC support, the APICV_INHIBIT_REASON_X2APIC is
no longer enforced.

Suggested-by: Maxim Levitsky <[email protected]>
Reviewed-by: Maxim Levitsky <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 1 -
arch/x86/kvm/svm/avic.c | 13 +++++++++++--
arch/x86/kvm/svm/svm.c | 9 ---------
3 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c59fea4bdb6e..da03111b05f6 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1051,7 +1051,6 @@ enum kvm_apicv_inhibit {
APICV_INHIBIT_REASON_NESTED,
APICV_INHIBIT_REASON_IRQWIN,
APICV_INHIBIT_REASON_PIT_REINJ,
- APICV_INHIBIT_REASON_X2APIC,
APICV_INHIBIT_REASON_BLOCKIRQ,
APICV_INHIBIT_REASON_ABSENT,
APICV_INHIBIT_REASON_SEV,
diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 8e90c659de2d..ceed4b39b884 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -92,12 +92,22 @@ static void avic_activate_vmcb(struct vcpu_svm *svm)
vmcb->control.avic_physical_id &= ~AVIC_PHYSICAL_MAX_INDEX_MASK;

vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
- if (apic_x2apic_mode(svm->vcpu.arch.apic)) {
+
+ /* Note:
+ * KVM can support hybrid-AVIC mode, where KVM emulates x2APIC
+ * MSR accesses, while interrupt injection to a running vCPU
+ * can be achieved using AVIC doorbell. The AVIC hardware still
+ * accelerate MMIO accesses, but this does not cause any harm
+ * as the guest is not supposed to access xAPIC mmio when uses x2APIC.
+ */
+ if (apic_x2apic_mode(svm->vcpu.arch.apic) &&
+ (avic_mode == AVIC_MODE_X2)) {
vmcb->control.int_ctl |= X2APIC_MODE_MASK;
vmcb->control.avic_physical_id |= X2AVIC_MAX_PHYSICAL_ID;
/* Disabling MSR intercept for x2APIC registers */
avic_set_x2apic_msr_interception(svm, false);
} else {
+ /* For xAVIC and hybrid-x2AVIC modes */
vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID;
/* Enabling MSR intercept for x2APIC registers */
avic_set_x2apic_msr_interception(svm, true);
@@ -999,7 +1009,6 @@ bool avic_check_apicv_inhibit_reasons(enum kvm_apicv_inhibit reason)
BIT(APICV_INHIBIT_REASON_NESTED) |
BIT(APICV_INHIBIT_REASON_IRQWIN) |
BIT(APICV_INHIBIT_REASON_PIT_REINJ) |
- BIT(APICV_INHIBIT_REASON_X2APIC) |
BIT(APICV_INHIBIT_REASON_BLOCKIRQ) |
BIT(APICV_INHIBIT_REASON_SEV);

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 96a1fc1a1d1b..c0a3d4a1f3dc 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4041,7 +4041,6 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
struct kvm_cpuid_entry2 *best;
- struct kvm *kvm = vcpu->kvm;

vcpu->arch.xsaves_enabled = guest_cpuid_has(vcpu, X86_FEATURE_XSAVE) &&
boot_cpu_has(X86_FEATURE_XSAVE) &&
@@ -4073,14 +4072,6 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
vcpu->arch.reserved_gpa_bits &= ~(1UL << (best->ebx & 0x3f));
}

- if (kvm_vcpu_apicv_active(vcpu)) {
- /*
- * AVIC does not work with an x2APIC mode guest. If the X2APIC feature
- * is exposed to the guest, disable AVIC.
- */
- if (guest_cpuid_has(vcpu, X86_FEATURE_X2APIC))
- kvm_set_apicv_inhibit(kvm, APICV_INHIBIT_REASON_X2APIC);
- }
init_vmcb_after_set_cpuid(vcpu);
}

--
2.25.1


2022-05-09 06:40:15

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: [PATCH v4 11/15] KVM: SVM: Do not throw warning when calling avic_vcpu_load on a running vcpu

Originalliy, this WARN_ON is designed to detect when calling
avic_vcpu_load() on an already running vcpu in AVIC mode (i.e. the AVIC
is_running bit is set).

However, for x2AVIC, the vCPU can switch from xAPIC to x2APIC mode while in
running state, in which the avic_vcpu_load() will be called from
svm_refresh_apicv_exec_ctrl().

Therefore, remove this warning since it is no longer appropriate.

Reviewed-by: Maxim Levitsky <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/kvm/svm/avic.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index ad2ef6c00559..8e90c659de2d 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -1059,7 +1059,6 @@ void __avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
return;

entry = READ_ONCE(*(svm->avic_physical_id_cache));
- WARN_ON(entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK);

entry &= ~AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK;
entry |= (h_physical_id & AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK);
--
2.25.1


2022-05-09 07:43:24

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: [PATCH v4 07/15] KVM: SVM: Adding support for configuring x2APIC MSRs interception

When enabling x2APIC virtualization (x2AVIC), the interception of
x2APIC MSRs must be disabled to let the hardware virtualize guest
MSR accesses.

Current implementation keeps track of list of MSR interception state
in the svm_direct_access_msrs array. Therefore, extends the array to
include x2APIC MSRs.

Reviewed-by: Maxim Levitsky <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/kvm/svm/svm.c | 25 +++++++++++++++++++++++++
arch/x86/kvm/svm/svm.h | 4 ++--
2 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 74e6f86f5dc3..314628b6bff4 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -100,6 +100,31 @@ static const struct svm_direct_access_msrs {
{ .index = MSR_IA32_CR_PAT, .always = false },
{ .index = MSR_AMD64_SEV_ES_GHCB, .always = true },
{ .index = MSR_TSC_AUX, .always = false },
+ { .index = (APIC_BASE_MSR + APIC_ID), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_LVR), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_TASKPRI), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_ARBPRI), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_PROCPRI), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_EOI), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_RRR), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_LDR), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_DFR), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_SPIV), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_ISR), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_TMR), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_IRR), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_ESR), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_ICR), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_ICR2), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_LVTT), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_LVTTHMR), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_LVTPC), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_LVT0), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_LVT1), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_LVTERR), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_TMICT), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_TMCCT), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_TDCR), .always = false },
{ .index = MSR_INVALID, .always = false },
};

diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 678fc7757fe4..5ed958863b81 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -29,8 +29,8 @@
#define IOPM_SIZE PAGE_SIZE * 3
#define MSRPM_SIZE PAGE_SIZE * 2

-#define MAX_DIRECT_ACCESS_MSRS 21
-#define MSRPM_OFFSETS 16
+#define MAX_DIRECT_ACCESS_MSRS 46
+#define MSRPM_OFFSETS 32
extern u32 msrpm_offsets[MSRPM_OFFSETS] __read_mostly;
extern bool npt_enabled;
extern int vgif;
--
2.25.1


2022-05-09 08:14:49

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: [PATCH v4 04/15] KVM: SVM: Update max number of vCPUs supported for x2AVIC mode

xAVIC and x2AVIC modes can support diffferent number of vcpus.
Update existing logics to support each mode accordingly.

Also, modify the maximum physical APIC ID for AVIC to 255 to reflect
the actual value supported by the architecture.

Reviewed-by: Maxim Levitsky <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/include/asm/svm.h | 12 +++++++++---
arch/x86/kvm/svm/avic.c | 8 +++++---
2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 2c2a104b777e..4c26b0d47d76 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -258,10 +258,16 @@ enum avic_ipi_failure_cause {


/*
- * 0xff is broadcast, so the max index allowed for physical APIC ID
- * table is 0xfe. APIC IDs above 0xff are reserved.
+ * For AVIC, the max index allowed for physical APIC ID
+ * table is 0xff (255).
*/
-#define AVIC_MAX_PHYSICAL_ID_COUNT 0xff
+#define AVIC_MAX_PHYSICAL_ID 0XFEULL
+
+/*
+ * For x2AVIC, the max index allowed for physical APIC ID
+ * table is 0x1ff (511).
+ */
+#define X2AVIC_MAX_PHYSICAL_ID 0x1FFUL

#define AVIC_HPA_MASK ~((0xFFFULL << 52) | 0xFFF)
#define VMCB_AVIC_APIC_BAR_MASK 0xFFFFFFFFFF000ULL
diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 95006bbdf970..29665b3e4e4e 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -185,7 +185,7 @@ void avic_init_vmcb(struct vcpu_svm *svm, struct vmcb *vmcb)
vmcb->control.avic_backing_page = bpa & AVIC_HPA_MASK;
vmcb->control.avic_logical_id = lpa & AVIC_HPA_MASK;
vmcb->control.avic_physical_id = ppa & AVIC_HPA_MASK;
- vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID_COUNT;
+ vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID;
vmcb->control.avic_vapic_bar = APIC_DEFAULT_PHYS_BASE & VMCB_AVIC_APIC_BAR_MASK;

if (kvm_apicv_activated(svm->vcpu.kvm))
@@ -200,7 +200,8 @@ static u64 *avic_get_physical_id_entry(struct kvm_vcpu *vcpu,
u64 *avic_physical_id_table;
struct kvm_svm *kvm_svm = to_kvm_svm(vcpu->kvm);

- if (index >= AVIC_MAX_PHYSICAL_ID_COUNT)
+ if ((avic_mode == AVIC_MODE_X1 && index > AVIC_MAX_PHYSICAL_ID) ||
+ (avic_mode == AVIC_MODE_X2 && index > X2AVIC_MAX_PHYSICAL_ID))
return NULL;

avic_physical_id_table = page_address(kvm_svm->avic_physical_id_table_page);
@@ -247,7 +248,8 @@ static int avic_init_backing_page(struct kvm_vcpu *vcpu)
int id = vcpu->vcpu_id;
struct vcpu_svm *svm = to_svm(vcpu);

- if (id >= AVIC_MAX_PHYSICAL_ID_COUNT)
+ if ((avic_mode == AVIC_MODE_X1 && id > AVIC_MAX_PHYSICAL_ID) ||
+ (avic_mode == AVIC_MODE_X2 && id > X2AVIC_MAX_PHYSICAL_ID))
return -EINVAL;

if (!vcpu->arch.apic->regs)
--
2.25.1


2022-05-09 09:02:25

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: [PATCH v4 01/15] x86/cpufeatures: Introduce x2AVIC CPUID bit

Introduce a new feature bit for virtualized x2APIC (x2AVIC) in
CPUID_Fn8000000A_EDX [SVM Revision and Feature Identification].

Reviewed-by: Maxim Levitsky <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/include/asm/cpufeatures.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 1d6826eac3e6..2721bd1e8e1e 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -343,6 +343,7 @@
#define X86_FEATURE_AVIC (15*32+13) /* Virtual Interrupt Controller */
#define X86_FEATURE_V_VMSAVE_VMLOAD (15*32+15) /* Virtual VMSAVE VMLOAD */
#define X86_FEATURE_VGIF (15*32+16) /* Virtual GIF */
+#define X86_FEATURE_X2AVIC (15*32+18) /* Virtual x2apic */
#define X86_FEATURE_V_SPEC_CTRL (15*32+20) /* Virtual SPEC_CTRL */
#define X86_FEATURE_SVME_ADDR_CHK (15*32+28) /* "" SVME addr check */

--
2.25.1


2022-05-09 09:24:17

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: [PATCH v4 06/15] KVM: SVM: Do not support updating APIC ID when in x2APIC mode

In X2APIC mode, the Logical Destination Register is read-only,
which provides a fixed mapping between the logical and physical
APIC IDs. Therefore, there is no Logical APIC ID table in X2AVIC
and the processor uses the X2APIC ID in the backing page to create
a vCPU’s logical ID.

In addition, KVM does not support updating APIC ID in x2APIC mode,
which means AVIC does not need to handle this case.

Therefore, check x2APIC mode when handling physical and logical
APIC ID update, and when invalidating logical APIC ID table.

Reviewed-by: Maxim Levitsky <[email protected]>
Suggested-by: Maxim Levitsky <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/kvm/svm/avic.c | 19 ++++++++++++++++++-
1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 7f820cf45173..16ce2d50efac 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -499,8 +499,13 @@ static void avic_invalidate_logical_id_entry(struct kvm_vcpu *vcpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
bool flat = svm->dfr_reg == APIC_DFR_FLAT;
- u32 *entry = avic_get_logical_id_entry(vcpu, svm->ldr_reg, flat);
+ u32 *entry;

+ /* Note: x2AVIC does not use logical APIC ID table */
+ if (apic_x2apic_mode(vcpu->arch.apic))
+ return;
+
+ entry = avic_get_logical_id_entry(vcpu, svm->ldr_reg, flat);
if (entry)
clear_bit(AVIC_LOGICAL_ID_ENTRY_VALID_BIT, (unsigned long *)entry);
}
@@ -512,6 +517,10 @@ static int avic_handle_ldr_update(struct kvm_vcpu *vcpu)
u32 ldr = kvm_lapic_get_reg(vcpu->arch.apic, APIC_LDR);
u32 id = kvm_xapic_id(vcpu->arch.apic);

+ /* AVIC does not support LDR update for x2APIC */
+ if (apic_x2apic_mode(vcpu->arch.apic))
+ return 0;
+
if (ldr == svm->ldr_reg)
return 0;

@@ -532,6 +541,14 @@ static int avic_handle_apic_id_update(struct kvm_vcpu *vcpu)
struct vcpu_svm *svm = to_svm(vcpu);
u32 id = kvm_xapic_id(vcpu->arch.apic);

+ /*
+ * KVM does not support apic ID update for x2APIC.
+ * Also, need to check if the APIC ID exceed 254.
+ */
+ if (apic_x2apic_mode(vcpu->arch.apic) ||
+ (vcpu->vcpu_id >= APIC_BROADCAST))
+ return 0;
+
if (vcpu->vcpu_id == id)
return 0;

--
2.25.1


2022-05-09 11:01:45

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [PATCH v4 03/15] KVM: SVM: Detect X2APIC virtualization (x2AVIC) support

On Sat, 2022-05-07 at 21:39 -0500, Suravee Suthikulpanit wrote:
> Add CPUID check for the x2APIC virtualization (x2AVIC) feature.
> If available, the SVM driver can support both AVIC and x2AVIC modes
> when load the kvm_amd driver with avic=1. The operating mode will be
> determined at runtime depending on the guest APIC mode.
>
> Reviewed-by: Maxim Levitsky <[email protected]>
> Signed-off-by: Suravee Suthikulpanit <[email protected]>
> ---
> arch/x86/include/asm/svm.h | 3 +++
> arch/x86/kvm/svm/avic.c | 51 ++++++++++++++++++++++++++++++++++++++
> arch/x86/kvm/svm/svm.c | 15 ++---------
> arch/x86/kvm/svm/svm.h | 1 +
> 4 files changed, 57 insertions(+), 13 deletions(-)
>
> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
> index f70a5108d464..2c2a104b777e 100644
> --- a/arch/x86/include/asm/svm.h
> +++ b/arch/x86/include/asm/svm.h
> @@ -195,6 +195,9 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
> #define AVIC_ENABLE_SHIFT 31
> #define AVIC_ENABLE_MASK (1 << AVIC_ENABLE_SHIFT)
>
> +#define X2APIC_MODE_SHIFT 30
> +#define X2APIC_MODE_MASK (1 << X2APIC_MODE_SHIFT)
> +
> #define LBR_CTL_ENABLE_MASK BIT_ULL(0)
> #define VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK BIT_ULL(1)
>
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index a8f514212b87..95006bbdf970 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -40,6 +40,15 @@
> #define AVIC_GATAG_TO_VMID(x) ((x >> AVIC_VCPU_ID_BITS) & AVIC_VM_ID_MASK)
> #define AVIC_GATAG_TO_VCPUID(x) (x & AVIC_VCPU_ID_MASK)
>
> +enum avic_modes {
> + AVIC_MODE_NONE = 0,
> + AVIC_MODE_X1,
> + AVIC_MODE_X2,
> +};
> +
> +static bool force_avic;
> +module_param_unsafe(force_avic, bool, 0444);
> +
> /* Note:
> * This hash table is used to map VM_ID to a struct kvm_svm,
> * when handling AMD IOMMU GALOG notification to schedule in
> @@ -50,6 +59,7 @@ static DEFINE_HASHTABLE(svm_vm_data_hash, SVM_VM_DATA_HASH_BITS);
> static u32 next_vm_id = 0;
> static bool next_vm_id_wrapped = 0;
> static DEFINE_SPINLOCK(svm_vm_data_hash_lock);
> +static enum avic_modes avic_mode;
>
> /*
> * This is a wrapper of struct amd_iommu_ir_data.
> @@ -1077,3 +1087,44 @@ void avic_vcpu_unblocking(struct kvm_vcpu *vcpu)
>
> avic_vcpu_load(vcpu);
> }
> +
> +/*
> + * Note:
> + * - The module param avic enable both xAPIC and x2APIC mode.
> + * - Hypervisor can support both xAVIC and x2AVIC in the same guest.
> + * - The mode can be switched at run-time.
> + */
> +bool avic_hardware_setup(struct kvm_x86_ops *x86_ops)
> +{
> + if (!npt_enabled)
> + return false;
> +
> + if (boot_cpu_has(X86_FEATURE_AVIC)) {
> + avic_mode = AVIC_MODE_X1;
> + pr_info("AVIC enabled\n");
> + } else if (force_avic) {
> + /*
> + * Some older systems does not advertise AVIC support.
> + * See Revision Guide for specific AMD processor for more detail.
> + */
> + avic_mode = AVIC_MODE_X1;
> + pr_warn("AVIC is not supported in CPUID but force enabled");
> + pr_warn("Your system might crash and burn");
> + }
> +
> + /* AVIC is a prerequisite for x2AVIC. */
> + if (boot_cpu_has(X86_FEATURE_X2AVIC)) {
> + if (avic_mode == AVIC_MODE_X1) {
> + avic_mode = AVIC_MODE_X2;
> + pr_info("x2AVIC enabled\n");
> + } else {
> + pr_warn(FW_BUG "Cannot support x2AVIC due to AVIC is disabled");
> + pr_warn(FW_BUG "Try enable AVIC using force_avic option");
> + }
> + }
> +
> + if (avic_mode != AVIC_MODE_NONE)
> + amd_iommu_register_ga_log_notifier(&avic_ga_log_notifier);
> +
> + return !!avic_mode;
> +}
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 3b49337998ec..74e6f86f5dc3 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -188,9 +188,6 @@ module_param(tsc_scaling, int, 0444);
> static bool avic;
> module_param(avic, bool, 0444);
>
> -static bool force_avic;
> -module_param_unsafe(force_avic, bool, 0444);
> -
> bool __read_mostly dump_invalid_vmcb;
> module_param(dump_invalid_vmcb, bool, 0644);
>
> @@ -4913,17 +4910,9 @@ static __init int svm_hardware_setup(void)
> nrips = false;
> }
>
> - enable_apicv = avic = avic && npt_enabled && (boot_cpu_has(X86_FEATURE_AVIC) || force_avic);
> + enable_apicv = avic = avic && avic_hardware_setup(&svm_x86_ops);
>
> - if (enable_apicv) {
> - if (!boot_cpu_has(X86_FEATURE_AVIC)) {
> - pr_warn("AVIC is not supported in CPUID but force enabled");
> - pr_warn("Your system might crash and burn");
> - } else
> - pr_info("AVIC enabled\n");
> -
> - amd_iommu_register_ga_log_notifier(&avic_ga_log_notifier);
> - } else {
> + if (!enable_apicv) {
> svm_x86_ops.vcpu_blocking = NULL;
> svm_x86_ops.vcpu_unblocking = NULL;
> svm_x86_ops.vcpu_get_apicv_inhibit_reasons = NULL;
> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index 32220a1b0ea2..678fc7757fe4 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -603,6 +603,7 @@ extern struct kvm_x86_nested_ops svm_nested_ops;
>
> /* avic.c */
>
> +bool avic_hardware_setup(struct kvm_x86_ops *ops);
> int avic_ga_log_notifier(u32 ga_tag);
> void avic_vm_destroy(struct kvm *kvm);
> int avic_vm_init(struct kvm *kvm);

Looks great!

Best regars,
Maxim Levitsky


2022-05-09 11:36:57

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [PATCH v4 08/15] KVM: x86: Deactivate APICv on vCPU with APIC disabled

On Sat, 2022-05-07 at 21:39 -0500, Suravee Suthikulpanit wrote:
> APICv should be deactivated on vCPU that has APIC disabled.
> Therefore, call kvm_vcpu_update_apicv() when changing
> APIC mode, and add additional check for APIC disable mode
> when determine APICV activation,
>
> Suggested-by: Maxim Levitsky <[email protected]>
> Signed-off-by: Suravee Suthikulpanit <[email protected]>
> ---
> arch/x86/kvm/lapic.c | 4 +++-
> arch/x86/kvm/x86.c | 4 +++-
> 2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 8b8c4a905976..680824d7aa0d 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -2346,8 +2346,10 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value)
> if (((old_value ^ value) & X2APIC_ENABLE) && (value & X2APIC_ENABLE))
> kvm_apic_set_x2apic_id(apic, vcpu->vcpu_id);
>
> - if ((old_value ^ value) & (MSR_IA32_APICBASE_ENABLE | X2APIC_ENABLE))
> + if ((old_value ^ value) & (MSR_IA32_APICBASE_ENABLE | X2APIC_ENABLE)) {
> + kvm_vcpu_update_apicv(vcpu);
> static_call_cond(kvm_x86_set_virtual_apic_mode)(vcpu);

As futher optimization, we might even get rid of .set_virtual_apic_mode
and do all of this in kvm_vcpu_update_apicv.
But no need to do this now.


> + }
>
> apic->base_address = apic->vcpu->arch.apic_base &
> MSR_IA32_APICBASE_BASE;
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 8ee8c91fa762..77e49892dea1 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9836,7 +9836,9 @@ void kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu)
>
> down_read(&vcpu->kvm->arch.apicv_update_lock);
>
> - activate = kvm_vcpu_apicv_activated(vcpu);
> + /* Do not activate APICV when APIC is disabled */
> + activate = kvm_vcpu_apicv_activated(vcpu) &&
> + (kvm_get_apic_mode(vcpu) != LAPIC_MODE_DISABLED);
>
> if (vcpu->arch.apicv_active == activate)
> goto out;

Looks very good!

Reviewed-by: Maxim Levitsky <[email protected]>

Best regards,
Maxim Levitsky


2022-05-09 11:37:00

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [PATCH v4 08/15] KVM: x86: Deactivate APICv on vCPU with APIC disabled

On Sat, 2022-05-07 at 21:39 -0500, Suravee Suthikulpanit wrote:
> APICv should be deactivated on vCPU that has APIC disabled.
> Therefore, call kvm_vcpu_update_apicv() when changing
> APIC mode, and add additional check for APIC disable mode
> when determine APICV activation,
>
> Suggested-by: Maxim Levitsky <[email protected]>
> Signed-off-by: Suravee Suthikulpanit <[email protected]>
> ---
> arch/x86/kvm/lapic.c | 4 +++-
> arch/x86/kvm/x86.c | 4 +++-
> 2 files changed, 6 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 8b8c4a905976..680824d7aa0d 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -2346,8 +2346,10 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value)
> if (((old_value ^ value) & X2APIC_ENABLE) && (value & X2APIC_ENABLE))
> kvm_apic_set_x2apic_id(apic, vcpu->vcpu_id);
>
> - if ((old_value ^ value) & (MSR_IA32_APICBASE_ENABLE | X2APIC_ENABLE))
> + if ((old_value ^ value) & (MSR_IA32_APICBASE_ENABLE | X2APIC_ENABLE)) {
> + kvm_vcpu_update_apicv(vcpu);
> static_call_cond(kvm_x86_set_virtual_apic_mode)(vcpu);

As futher optimization, we might even get rid of .set_virtual_apic_mode
and do all of this in kvm_vcpu_update_apicv.
But no need to this now.


> + }
>
> apic->base_address = apic->vcpu->arch.apic_base &
> MSR_IA32_APICBASE_BASE;
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 8ee8c91fa762..77e49892dea1 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -9836,7 +9836,9 @@ void kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu)
>
> down_read(&vcpu->kvm->arch.apicv_update_lock);
>
> - activate = kvm_vcpu_apicv_activated(vcpu);
> + /* Do not activate APICV when APIC is disabled */
> + activate = kvm_vcpu_apicv_activated(vcpu) &&
> + (kvm_get_apic_mode(vcpu) != LAPIC_MODE_DISABLED);
>
> if (vcpu->arch.apicv_active == activate)
> goto out;

Looks good!

Reviewed-by: Maxim Levitsky <[email protected]>

Best regards,
Maxim Levitsky


2022-05-09 11:38:08

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [PATCH v4 08/15] KVM: x86: Deactivate APICv on vCPU with APIC disabled

On Mon, 2022-05-09 at 13:18 +0300, Maxim Levitsky wrote:
> On Sat, 2022-05-07 at 21:39 -0500, Suravee Suthikulpanit wrote:
> > APICv should be deactivated on vCPU that has APIC disabled.
> > Therefore, call kvm_vcpu_update_apicv() when changing
> > APIC mode, and add additional check for APIC disable mode
> > when determine APICV activation,
> >
> > Suggested-by: Maxim Levitsky <[email protected]>
> > Signed-off-by: Suravee Suthikulpanit <[email protected]>
> > ---
> > arch/x86/kvm/lapic.c | 4 +++-
> > arch/x86/kvm/x86.c | 4 +++-
> > 2 files changed, 6 insertions(+), 2 deletions(-)
> >
> > diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> > index 8b8c4a905976..680824d7aa0d 100644
> > --- a/arch/x86/kvm/lapic.c
> > +++ b/arch/x86/kvm/lapic.c
> > @@ -2346,8 +2346,10 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value)
> > if (((old_value ^ value) & X2APIC_ENABLE) && (value & X2APIC_ENABLE))
> > kvm_apic_set_x2apic_id(apic, vcpu->vcpu_id);
> >
> > - if ((old_value ^ value) & (MSR_IA32_APICBASE_ENABLE | X2APIC_ENABLE))
> > + if ((old_value ^ value) & (MSR_IA32_APICBASE_ENABLE | X2APIC_ENABLE)) {
> > + kvm_vcpu_update_apicv(vcpu);
> > static_call_cond(kvm_x86_set_virtual_apic_mode)(vcpu);
>
> As futher optimization, we might even get rid of .set_virtual_apic_mode
> and do all of this in kvm_vcpu_update_apicv.
> But no need to this now.
>
>
> > + }
> >
> > apic->base_address = apic->vcpu->arch.apic_base &
> > MSR_IA32_APICBASE_BASE;
> > diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> > index 8ee8c91fa762..77e49892dea1 100644
> > --- a/arch/x86/kvm/x86.c
> > +++ b/arch/x86/kvm/x86.c
> > @@ -9836,7 +9836,9 @@ void kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu)
> >
> > down_read(&vcpu->kvm->arch.apicv_update_lock);
> >
> > - activate = kvm_vcpu_apicv_activated(vcpu);
> > + /* Do not activate APICV when APIC is disabled */
> > + activate = kvm_vcpu_apicv_activated(vcpu) &&
> > + (kvm_get_apic_mode(vcpu) != LAPIC_MODE_DISABLED);
> >
> > if (vcpu->arch.apicv_active == activate)
> > goto out;
>
> Looks good!
>
> Reviewed-by: Maxim Levitsky <[email protected]>
>
> Best regards,
> Maxim Levitsky
>

Sorry for a duplicated reply - I tried to cancel it
to correct a typo, but I was too late I see.

Best regards,
Maxim Levitsky


2022-05-09 11:39:08

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [PATCH v4 00/15] Introducing AMD x2AVIC and hybrid-AVIC modes

On Sat, 2022-05-07 at 21:39 -0500, Suravee Suthikulpanit wrote:
> Introducing support for AMD x2APIC virtualization. This feature is
> indicated by the CPUID Fn8000_000A EDX[14], and it can be activated
> by setting bit 31 (enable AVIC) and bit 30 (x2APIC mode) of VMCB
> offset 60h.
>
> With x2AVIC support, the guest local APIC can be fully virtualized in
> both xAPIC and x2APIC modes, and the mode can be changed during runtime.
> For example, when AVIC is enabled, the hypervisor set VMCB bit 31
> to activate AVIC for each vCPU. Then, it keeps track of each vCPU's
> APIC mode, and updates VMCB bit 30 to enable/disable x2APIC
> virtualization mode accordingly.
>
> Besides setting bit VMCB bit 30 and 31, for x2AVIC, kvm_amd driver needs
> to disable interception for the x2APIC MSR range to allow AVIC hardware
> to virtualize register accesses.
>
> This series also introduce a partial APIC virtualization (hybrid-AVIC)
> mode, where APIC register accesses are trapped (i.e. not virtualized
> by hardware), but leverage AVIC doorbell for interrupt injection.
> This eliminates need to disable x2APIC in the guest on system without
> x2AVIC support. (Note: suggested by Maxim)
>
> Regards,
> Suravee
>
> Testing for v4:
> * Tested booting a Linux VM with x2APIC physical and logical modes upto 512 vCPUs.
> * Test enable AVIC in L0 with xAPIC and x2AVIC modes in L1 and launch L2 guest
> * Test partial AVIC mode by launching a VM with x2APIC mode
>
> Changes from v3:
> (https://lore.kernel.org/lkml/[email protected]/T/)
> * Patch 3 : Update logic force_avic
> * Patch 8 : Move logic for handling APIC disable to common code (new)
> * Patch 9 : Only call avic_refresh_apicv_exec_ctrl
> * Patch 12 : Remove APICV_INHIBIT_REASON_X2APIC, and add more comment for hybrid-AVIC mode
>
> Suravee Suthikulpanit (15):
> x86/cpufeatures: Introduce x2AVIC CPUID bit
> KVM: x86: lapic: Rename [GET/SET]_APIC_DEST_FIELD to
> [GET/SET]_XAPIC_DEST_FIELD
> KVM: SVM: Detect X2APIC virtualization (x2AVIC) support
> KVM: SVM: Update max number of vCPUs supported for x2AVIC mode
> KVM: SVM: Update avic_kick_target_vcpus to support 32-bit APIC ID
> KVM: SVM: Do not support updating APIC ID when in x2APIC mode
> KVM: SVM: Adding support for configuring x2APIC MSRs interception
> KVM: x86: Deactivate APICv on vCPU with APIC disabled
> KVM: SVM: Refresh AVIC configuration when changing APIC mode
> KVM: SVM: Introduce helper functions to (de)activate AVIC and x2AVIC
> KVM: SVM: Do not throw warning when calling avic_vcpu_load on a
> running vcpu
> KVM: SVM: Introduce hybrid-AVIC mode
> KVM: x86: Warning APICv inconsistency only when vcpu APIC mode is
> valid
> KVM: SVM: Use target APIC ID to complete x2AVIC IRQs when possible
> KVM: SVM: Add AVIC doorbell tracepoint
>
> arch/x86/hyperv/hv_apic.c | 2 +-
> arch/x86/include/asm/apicdef.h | 4 +-
> arch/x86/include/asm/cpufeatures.h | 1 +
> arch/x86/include/asm/kvm_host.h | 1 -
> arch/x86/include/asm/svm.h | 21 +++-
> arch/x86/kernel/apic/apic.c | 2 +-
> arch/x86/kernel/apic/ipi.c | 2 +-
> arch/x86/kvm/lapic.c | 6 +-
> arch/x86/kvm/svm/avic.c | 191 ++++++++++++++++++++++++++---
> arch/x86/kvm/svm/svm.c | 56 +++++----
> arch/x86/kvm/svm/svm.h | 6 +-
> arch/x86/kvm/trace.h | 18 +++
> arch/x86/kvm/x86.c | 8 +-
> 13 files changed, 262 insertions(+), 56 deletions(-)
>

Patch series looks good.

I will smoke test it today on my normal AVIC, just in case.

Did you had a chance to look at my comments on your report
that nesting got broken by my nested PAUSE filtering patch?

I tried to reproduce it on my side, so far no luck.

I tried to oversubscribe L1, by booting a VM with 16 vCPUs
all pinned to single physical CPU, and then booting a nested guest
in it with about the same amount of vCPUs. Slow but it did work.


Also did you had a chance to look for my comments about the AMD's manual
asking the user to flush guest's TLB when changing apic backing page,
regardless of ASID?

Best regards,
Maxim Levitsky






2022-05-09 12:03:16

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: Re: [PATCH v4 12/15] KVM: SVM: Introduce hybrid-AVIC mode

Maxim / Paolo,

On 5/8/2022 9:39 AM, Suravee Suthikulpanit wrote:
> Currently, AVIC is inhibited when booting a VM w/ x2APIC support.
> because AVIC cannot virtualize x2APIC MSR register accesses.
> However, the AVIC doorbell can be used to accelerate interrupt
> injection into a running vCPU, while all guest accesses to x2APIC MSRs
> will be intercepted and emulated by KVM.
>
> With hybrid-AVIC support, the APICV_INHIBIT_REASON_X2APIC is
> no longer enforced.
>
> Suggested-by: Maxim Levitsky<[email protected]>
> Reviewed-by: Maxim Levitsky<[email protected]>

Sorry for a typo here in the email of the "Reviewed-by" line.

Suravee

2022-05-09 13:48:49

by Gupta, Pankaj

[permalink] [raw]
Subject: Re: [PATCH v4 04/15] KVM: SVM: Update max number of vCPUs supported for x2AVIC mode


> xAVIC and x2AVIC modes can support diffferent number of vcpus.
> Update existing logics to support each mode accordingly.
>
> Also, modify the maximum physical APIC ID for AVIC to 255 to reflect
> the actual value supported by the architecture.
>
> Reviewed-by: Maxim Levitsky <[email protected]>
> Signed-off-by: Suravee Suthikulpanit <[email protected]>
> ---
> arch/x86/include/asm/svm.h | 12 +++++++++---
> arch/x86/kvm/svm/avic.c | 8 +++++---
> 2 files changed, 14 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
> index 2c2a104b777e..4c26b0d47d76 100644
> --- a/arch/x86/include/asm/svm.h
> +++ b/arch/x86/include/asm/svm.h
> @@ -258,10 +258,16 @@ enum avic_ipi_failure_cause {
>
>
> /*
> - * 0xff is broadcast, so the max index allowed for physical APIC ID
> - * table is 0xfe. APIC IDs above 0xff are reserved.
> + * For AVIC, the max index allowed for physical APIC ID
> + * table is 0xff (255).
> */
> -#define AVIC_MAX_PHYSICAL_ID_COUNT 0xff
> +#define AVIC_MAX_PHYSICAL_ID 0XFEULL
> +
> +/*
> + * For x2AVIC, the max index allowed for physical APIC ID
> + * table is 0x1ff (511).
> + */
> +#define X2AVIC_MAX_PHYSICAL_ID 0x1FFUL
>
> #define AVIC_HPA_MASK ~((0xFFFULL << 52) | 0xFFF)
> #define VMCB_AVIC_APIC_BAR_MASK 0xFFFFFFFFFF000ULL
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index 95006bbdf970..29665b3e4e4e 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -185,7 +185,7 @@ void avic_init_vmcb(struct vcpu_svm *svm, struct vmcb *vmcb)
> vmcb->control.avic_backing_page = bpa & AVIC_HPA_MASK;
> vmcb->control.avic_logical_id = lpa & AVIC_HPA_MASK;
> vmcb->control.avic_physical_id = ppa & AVIC_HPA_MASK;
> - vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID_COUNT;
> + vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID;
> vmcb->control.avic_vapic_bar = APIC_DEFAULT_PHYS_BASE & VMCB_AVIC_APIC_BAR_MASK;
>
> if (kvm_apicv_activated(svm->vcpu.kvm))
> @@ -200,7 +200,8 @@ static u64 *avic_get_physical_id_entry(struct kvm_vcpu *vcpu,
> u64 *avic_physical_id_table;
> struct kvm_svm *kvm_svm = to_kvm_svm(vcpu->kvm);
>
> - if (index >= AVIC_MAX_PHYSICAL_ID_COUNT)
> + if ((avic_mode == AVIC_MODE_X1 && index > AVIC_MAX_PHYSICAL_ID) ||
> + (avic_mode == AVIC_MODE_X2 && index > X2AVIC_MAX_PHYSICAL_ID))
> return NULL;
>
> avic_physical_id_table = page_address(kvm_svm->avic_physical_id_table_page);
> @@ -247,7 +248,8 @@ static int avic_init_backing_page(struct kvm_vcpu *vcpu)
> int id = vcpu->vcpu_id;
> struct vcpu_svm *svm = to_svm(vcpu);
>
> - if (id >= AVIC_MAX_PHYSICAL_ID_COUNT)
> + if ((avic_mode == AVIC_MODE_X1 && id > AVIC_MAX_PHYSICAL_ID) ||
> + (avic_mode == AVIC_MODE_X2 && id > X2AVIC_MAX_PHYSICAL_ID))
> return -EINVAL;
>
> if (!vcpu->arch.apic->regs)

Looks good to me.

Reviewed-by: Pankaj Gupta <[email protected]>

2022-05-09 13:54:58

by Gupta, Pankaj

[permalink] [raw]
Subject: Re: [PATCH v4 11/15] KVM: SVM: Do not throw warning when calling avic_vcpu_load on a running vcpu

On 5/8/2022 4:39 AM, Suravee Suthikulpanit wrote:
> Originalliy, this WARN_ON is designed to detect when calling
> avic_vcpu_load() on an already running vcpu in AVIC mode (i.e. the AVIC
> is_running bit is set).
>
> However, for x2AVIC, the vCPU can switch from xAPIC to x2APIC mode while in
> running state, in which the avic_vcpu_load() will be called from
> svm_refresh_apicv_exec_ctrl().
>
> Therefore, remove this warning since it is no longer appropriate.
>
> Reviewed-by: Maxim Levitsky <[email protected]>
> Signed-off-by: Suravee Suthikulpanit <[email protected]>
> ---
> arch/x86/kvm/svm/avic.c | 1 -
> 1 file changed, 1 deletion(-)
>
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index ad2ef6c00559..8e90c659de2d 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -1059,7 +1059,6 @@ void __avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> return;
>
> entry = READ_ONCE(*(svm->avic_physical_id_cache));
> - WARN_ON(entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK);
>
> entry &= ~AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK;
> entry |= (h_physical_id & AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK);

Looks good to me.

Reviewed-by: Pankaj Gupta <[email protected]>

2022-05-09 13:55:08

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [PATCH v4 10/15] KVM: SVM: Introduce helper functions to (de)activate AVIC and x2AVIC

On Sat, 2022-05-07 at 21:39 -0500, Suravee Suthikulpanit wrote:
> Refactor the current logic for (de)activate AVIC into helper functions,
> and also add logic for (de)activate x2AVIC. The helper function are used
> when initializing AVIC and switching from AVIC to x2AVIC mode
> (handled by svm_refresh_spicv_exec_ctrl()).
>
> When an AVIC-enabled guest switches from APIC to x2APIC mode during
> runtime, the SVM driver needs to perform the following steps:
>
> 1. Set the x2APIC mode bit for AVIC in VMCB along with the maximum
> APIC ID support for each mode accodingly.
>
> 2. Disable x2APIC MSRs interception in order to allow the hardware
> to virtualize x2APIC MSRs accesses.
>
> Reported-by: kernel test robot <[email protected]>
> Reviewed-by: Maxim Levitsky <[email protected]>
> Signed-off-by: Suravee Suthikulpanit <[email protected]>
> ---
> arch/x86/include/asm/svm.h | 6 +++++
> arch/x86/kvm/svm/avic.c | 54 ++++++++++++++++++++++++++++++++++----
> arch/x86/kvm/svm/svm.c | 6 ++---
> arch/x86/kvm/svm/svm.h | 1 +
> 4 files changed, 58 insertions(+), 9 deletions(-)
>
> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
> index 4c26b0d47d76..f5525c0e03f7 100644
> --- a/arch/x86/include/asm/svm.h
> +++ b/arch/x86/include/asm/svm.h
> @@ -256,6 +256,7 @@ enum avic_ipi_failure_cause {
> AVIC_IPI_FAILURE_INVALID_BACKING_PAGE,
> };
>
> +#define AVIC_PHYSICAL_MAX_INDEX_MASK GENMASK_ULL(9, 0)
>
> /*
> * For AVIC, the max index allowed for physical APIC ID
> @@ -500,4 +501,9 @@ DEFINE_GHCB_ACCESSORS(sw_exit_info_2)
> DEFINE_GHCB_ACCESSORS(sw_scratch)
> DEFINE_GHCB_ACCESSORS(xcr0)
>
> +struct svm_direct_access_msrs {
> + u32 index; /* Index of the MSR */
> + bool always; /* True if intercept is initially cleared */
> +};
> +
> #endif
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index a82981722018..ad2ef6c00559 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -69,6 +69,51 @@ struct amd_svm_iommu_ir {
> void *data; /* Storing pointer to struct amd_ir_data */
> };
>
> +static inline void avic_set_x2apic_msr_interception(struct vcpu_svm *svm, bool disable)
> +{
> + int i;
> +
> + for (i = 0; i < MAX_DIRECT_ACCESS_MSRS; i++) {
> + int index = direct_access_msrs[i].index;
> +
> + if ((index < APIC_BASE_MSR) ||
> + (index > APIC_BASE_MSR + 0xff))
> + continue;
> + set_msr_interception(&svm->vcpu, svm->msrpm, index,
> + !disable, !disable);
> + }
> +}
> +
> +static void avic_activate_vmcb(struct vcpu_svm *svm)
> +{
> + struct vmcb *vmcb = svm->vmcb01.ptr;
> +
> + vmcb->control.int_ctl &= ~(AVIC_ENABLE_MASK | X2APIC_MODE_MASK);
> + vmcb->control.avic_physical_id &= ~AVIC_PHYSICAL_MAX_INDEX_MASK;
> +
> + vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
> + if (apic_x2apic_mode(svm->vcpu.arch.apic)) {
> + vmcb->control.int_ctl |= X2APIC_MODE_MASK;
> + vmcb->control.avic_physical_id |= X2AVIC_MAX_PHYSICAL_ID;
> + /* Disabling MSR intercept for x2APIC registers */
> + avic_set_x2apic_msr_interception(svm, false);
> + } else {
> + vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID;
> + /* Enabling MSR intercept for x2APIC registers */
> + avic_set_x2apic_msr_interception(svm, true);
> + }
> +}
> +
> +static void avic_deactivate_vmcb(struct vcpu_svm *svm)
> +{
> + struct vmcb *vmcb = svm->vmcb01.ptr;
> +
> + vmcb->control.int_ctl &= ~(AVIC_ENABLE_MASK | X2APIC_MODE_MASK);
> + vmcb->control.avic_physical_id &= ~AVIC_PHYSICAL_MAX_INDEX_MASK;
> +
> + /* Enabling MSR intercept for x2APIC registers */
> + avic_set_x2apic_msr_interception(svm, true);
> +}
>
> /* Note:
> * This function is called from IOMMU driver to notify
> @@ -185,13 +230,12 @@ void avic_init_vmcb(struct vcpu_svm *svm, struct vmcb *vmcb)
> vmcb->control.avic_backing_page = bpa & AVIC_HPA_MASK;
> vmcb->control.avic_logical_id = lpa & AVIC_HPA_MASK;
> vmcb->control.avic_physical_id = ppa & AVIC_HPA_MASK;
> - vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID;
> vmcb->control.avic_vapic_bar = APIC_DEFAULT_PHYS_BASE & VMCB_AVIC_APIC_BAR_MASK;
>
> if (kvm_apicv_activated(svm->vcpu.kvm))
> - vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
> + avic_activate_vmcb(svm);
> else
> - vmcb->control.int_ctl &= ~AVIC_ENABLE_MASK;
> + avic_deactivate_vmcb(svm);
> }
>
> static u64 *avic_get_physical_id_entry(struct kvm_vcpu *vcpu,
> @@ -1082,9 +1126,9 @@ void avic_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
> * accordingly before re-activating.
> */
> avic_apicv_post_state_restore(vcpu);
> - vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
> + avic_activate_vmcb(svm);
> } else {
> - vmcb->control.int_ctl &= ~AVIC_ENABLE_MASK;
> + avic_deactivate_vmcb(svm);
> }
> vmcb_mark_dirty(vmcb, VMCB_AVIC);
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 9066568fd19d..96a1fc1a1d1b 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -74,10 +74,8 @@ static uint64_t osvw_len = 4, osvw_status;
>
> static DEFINE_PER_CPU(u64, current_tsc_ratio);
>
> -static const struct svm_direct_access_msrs {
> - u32 index; /* Index of the MSR */
> - bool always; /* True if intercept is initially cleared */
> -} direct_access_msrs[MAX_DIRECT_ACCESS_MSRS] = {
> +const struct svm_direct_access_msrs
> +direct_access_msrs[MAX_DIRECT_ACCESS_MSRS] = {
> { .index = MSR_STAR, .always = true },
> { .index = MSR_IA32_SYSENTER_CS, .always = true },
> { .index = MSR_IA32_SYSENTER_EIP, .always = false },
> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index 5ed958863b81..bb5bf70de3b2 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -600,6 +600,7 @@ void nested_vmcb02_compute_g_pat(struct vcpu_svm *svm);
> void svm_switch_vmcb(struct vcpu_svm *svm, struct kvm_vmcb_info *target_vmcb);
>
> extern struct kvm_x86_nested_ops svm_nested_ops;
> +extern const struct svm_direct_access_msrs direct_access_msrs[];
>
> /* avic.c */
>


So I did some testing, and reviewed this code again with regard to nesting,
and now I see that it has CVE worthy bug, so have to revoke my Reviewed-By.

This is what happens:

On nested VM entry, *request to inhibit AVIC is done*, and then nested msr bitmap
is calculated, still with all X2AVIC msrs open,

1. nested_svm_vmrun -> enter_svm_guest_mode -> kvm_make_request(KVM_REQ_APICV_UPDATE, vcpu);
2. nested_svm_vmrun -> nested_svm_vmrun_msrpm


But the nested guest will be entered without AVIC active
(since we don't yet support nested avic and it is optional anyway), thus if the nested guest
also doesn't intercept those msrs, it will gain access to the *host* x2apic msrs. Ooops.

I think the easist way to fix this for now, is to make nested_svm_vmrun_msrpm
never open access to x2apic msrs regardless of the host bitmap value, but in the long
term the whole thing needs to be refactored.


Another thing I noted is that avic_deactivate_vmcb should not touch avic msrs
when avic_mode == AVIC_MODE_X1, it is just a waste of time.

Also updating these msr intercepts is pointless if the guest doesn't use x2apic.

Same it true while entering the nested guest - AVIC is inhibited, but there is
no need to update the msr intercepts in L1 msr bitmap, since this bitmap isn't
used by the CPU and vise versa while returing back to L1 from the nested guest.

However optimizing all of this should also be done very carefully to
avoid issue like the above.

I need to think on how to correctly fix/refactor all of this to be honest.

Best regards,
Maxim levitsky




2022-05-11 23:00:05

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [PATCH v4 10/15] KVM: SVM: Introduce helper functions to (de)activate AVIC and x2AVIC

On Wed, 2022-05-11 at 22:37 +0700, Suravee Suthikulpanit wrote:
> Maxim,
>
> On 5/9/22 8:42 PM, Maxim Levitsky wrote:
> > ...
> >
> > So I did some testing, and reviewed this code again with regard to nesting,
> > and now I see that it has CVE worthy bug, so have to revoke my Reviewed-By.
> >
> > This is what happens:
> >
> > On nested VM entry, *request to inhibit AVIC is done*, and then nested msr bitmap
> > is calculated, still with all X2AVIC msrs open,
> >
> > 1. nested_svm_vmrun -> enter_svm_guest_mode -> kvm_make_request(KVM_REQ_APICV_UPDATE, vcpu);
> > 2. nested_svm_vmrun -> nested_svm_vmrun_msrpm
> >
> > But the nested guest will be entered without AVIC active
> > (since we don't yet support nested avic and it is optional anyway), thus if the nested guest
> > also doesn't intercept those msrs, it will gain access to the *host* x2apic msrs. Ooops.
>
> Shouldn't this be changed to intercept the x2APIC msrs because of the following logic?
>
> kvm_make_request(KVM_REQ_APICV_UPDATE, vcpu)
> kvm_vcpu_update_apicv(vcpu)
> static_call(kvm_x86_refresh_apicv_exec_ctrl)(vcpu)
> avic_deactivate_vmcb()
> svm_set_x2apic_msr_interception(true)

Nope because the above only updates L1 msr intercept bitmap, while 'merged'
msr bitmap that L2 uses still has those msrs open.

Other and better way to fix it would be to fix set_msr_interception
to update the merged bitmap as well.

I think I will post a patch series to clean up this mess soon.

Best regards,
Maxim Levitsky

>
> > I think the easist way to fix this for now, is to make nested_svm_vmrun_msrpm
> > never open access to x2apic msrs regardless of the host bitmap value, but in the long
> > term the whole thing needs to be refactored.
>
> Agree.
>
> > Another thing I noted is that avic_deactivate_vmcb should not touch avic msrs
> > when avic_mode == AVIC_MODE_X1, it is just a waste of time.
>
> We can add the check.
>
> > Also updating these msr intercepts is pointless if the guest doesn't use x2apic.
>
> We can also add the check.
>
> Best Regards,
> Suravee
>



2022-05-12 13:47:14

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: Re: [PATCH v4 10/15] KVM: SVM: Introduce helper functions to (de)activate AVIC and x2AVIC

Maxim,

On 5/9/22 8:42 PM, Maxim Levitsky wrote:
>...
>
> So I did some testing, and reviewed this code again with regard to nesting,
> and now I see that it has CVE worthy bug, so have to revoke my Reviewed-By.
>
> This is what happens:
>
> On nested VM entry, *request to inhibit AVIC is done*, and then nested msr bitmap
> is calculated, still with all X2AVIC msrs open,
>
> 1. nested_svm_vmrun -> enter_svm_guest_mode -> kvm_make_request(KVM_REQ_APICV_UPDATE, vcpu);
> 2. nested_svm_vmrun -> nested_svm_vmrun_msrpm
>
> But the nested guest will be entered without AVIC active
> (since we don't yet support nested avic and it is optional anyway), thus if the nested guest
> also doesn't intercept those msrs, it will gain access to the *host* x2apic msrs. Ooops.

Shouldn't this be changed to intercept the x2APIC msrs because of the following logic?

kvm_make_request(KVM_REQ_APICV_UPDATE, vcpu)
kvm_vcpu_update_apicv(vcpu)
static_call(kvm_x86_refresh_apicv_exec_ctrl)(vcpu)
avic_deactivate_vmcb()
svm_set_x2apic_msr_interception(true)

> I think the easist way to fix this for now, is to make nested_svm_vmrun_msrpm
> never open access to x2apic msrs regardless of the host bitmap value, but in the long
> term the whole thing needs to be refactored.

Agree.

> Another thing I noted is that avic_deactivate_vmcb should not touch avic msrs
> when avic_mode == AVIC_MODE_X1, it is just a waste of time.

We can add the check.

> Also updating these msr intercepts is pointless if the guest doesn't use x2apic.

We can also add the check.

Best Regards,
Suravee