2022-05-19 12:57:18

by Suthikulpanit, Suravee

Subject: [PATCH v6 00/17] Introducing AMD x2AVIC and hybrid-AVIC modes

Introducing support for AMD x2APIC virtualization. This feature is
indicated by the CPUID Fn8000_000A EDX[14], and it can be activated
by setting bit 31 (enable AVIC) and bit 30 (x2APIC mode) of VMCB
offset 60h.

With x2AVIC support, the guest local APIC can be fully virtualized in
both xAPIC and x2APIC modes, and the mode can be changed during runtime.
For example, when AVIC is enabled, the hypervisor sets VMCB bit 31
to activate AVIC for each vCPU. Then, it keeps track of each vCPU's
APIC mode, and updates VMCB bit 30 to enable/disable x2APIC
virtualization mode accordingly.
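
As a rough illustration of the two control bits described above, the
following user-space sketch models the 32-bit VMCB word at offset 60h.
The helper name avic_update_int_ctl() and its parameters are hypothetical
and exist only for this example; only the bit positions come from the text.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions in the VMCB word at offset 60h, as described above. */
#define AVIC_ENABLE_MASK  (UINT32_C(1) << 31)  /* enable AVIC */
#define X2APIC_MODE_MASK  (UINT32_C(1) << 30)  /* x2APIC virtualization mode */

/*
 * Hypothetical helper: return the updated control word for one vCPU.
 * 'guest_x2apic' is the APIC mode the hypervisor tracks for that vCPU.
 */
static uint32_t avic_update_int_ctl(uint32_t int_ctl, bool avic_on,
				    bool guest_x2apic)
{
	/* Start from a clean state for both AVIC-related bits. */
	int_ctl &= ~(AVIC_ENABLE_MASK | X2APIC_MODE_MASK);
	if (!avic_on)
		return int_ctl;

	int_ctl |= AVIC_ENABLE_MASK;
	if (guest_x2apic)
		int_ctl |= X2APIC_MODE_MASK;
	return int_ctl;
}
```

Calling the helper again whenever the tracked APIC mode changes mirrors the
runtime mode switch described in the paragraph above.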

Besides setting VMCB bits 30 and 31, for x2AVIC the kvm_amd driver needs
to disable interception for the x2APIC MSR range to allow the AVIC hardware
to virtualize register accesses.

This series also introduces a partial APIC virtualization (hybrid-AVIC)
mode, where APIC register accesses are trapped (i.e. not virtualized
by hardware), but the AVIC doorbell is still used for interrupt injection.
This eliminates the need to disable x2APIC in the guest on systems without
x2AVIC support. (Note: suggested by Maxim)

Testing for v5:
* Tested partial AVIC mode by launching a VM with x2APIC mode
* Tested booting a Linux VM with x2APIC physical and logical modes up to 512 vCPUs.
* Tested the following nested SVM use cases:

L0 | L1 | L2
----------------------------------
AVIC | APIC | APIC
AVIC | APIC | x2APIC
hybrid-AVIC | x2APIC | APIC
hybrid-AVIC | x2APIC | x2APIC
x2AVIC | APIC | APIC
x2AVIC | APIC | x2APIC
x2AVIC | x2APIC | APIC
x2AVIC | x2APIC | x2APIC

Changes from v5:
(https://lore.kernel.org/lkml/[email protected]/T/#t)
* Re-order patch 16 to 10
* Patch 11: Update commit message

Changes from v4:
(https://lore.kernel.org/lkml/[email protected]/T/)
* Patch 3: Move enum_avic_modes definition to svm.h
* Patch 10: Rename avic_set_x2apic_msr_interception to
svm_set_x2apic_msr_interception and move it to svm.c
to simplify the struct svm_direct_access_msrs declaration.
* Patch 16: New from Maxim
* Patch 17: New from Maxim

Best Regards,
Suravee

Maxim Levitsky (2):
KVM: x86: nSVM: always intercept x2apic msrs
KVM: x86: nSVM: optimize svm_set_x2apic_msr_interception

Suravee Suthikulpanit (15):
x86/cpufeatures: Introduce x2AVIC CPUID bit
KVM: x86: lapic: Rename [GET/SET]_APIC_DEST_FIELD to
[GET/SET]_XAPIC_DEST_FIELD
KVM: SVM: Detect X2APIC virtualization (x2AVIC) support
KVM: SVM: Update max number of vCPUs supported for x2AVIC mode
KVM: SVM: Update avic_kick_target_vcpus to support 32-bit APIC ID
KVM: SVM: Do not support updating APIC ID when in x2APIC mode
KVM: SVM: Adding support for configuring x2APIC MSRs interception
KVM: x86: Deactivate APICv on vCPU with APIC disabled
KVM: SVM: Refresh AVIC configuration when changing APIC mode
KVM: SVM: Introduce logic to (de)activate x2AVIC mode
KVM: SVM: Do not throw warning when calling avic_vcpu_load on a
running vcpu
KVM: SVM: Introduce hybrid-AVIC mode
KVM: x86: Warning APICv inconsistency only when vcpu APIC mode is
valid
KVM: SVM: Use target APIC ID to complete x2AVIC IRQs when possible
KVM: SVM: Add AVIC doorbell tracepoint

arch/x86/hyperv/hv_apic.c | 2 +-
arch/x86/include/asm/apicdef.h | 4 +-
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/kvm_host.h | 1 -
arch/x86/include/asm/svm.h | 16 ++-
arch/x86/kernel/apic/apic.c | 2 +-
arch/x86/kernel/apic/ipi.c | 2 +-
arch/x86/kvm/lapic.c | 6 +-
arch/x86/kvm/svm/avic.c | 178 ++++++++++++++++++++++++++---
arch/x86/kvm/svm/nested.c | 5 +
arch/x86/kvm/svm/svm.c | 75 ++++++++----
arch/x86/kvm/svm/svm.h | 25 +++-
arch/x86/kvm/trace.h | 18 +++
arch/x86/kvm/x86.c | 8 +-
14 files changed, 291 insertions(+), 52 deletions(-)

--
2.25.1



2022-05-19 12:58:02

by Suthikulpanit, Suravee

Subject: [PATCH v6 08/17] KVM: x86: Deactivate APICv on vCPU with APIC disabled

APICv should be deactivated on a vCPU that has its APIC disabled.
Therefore, call kvm_vcpu_update_apicv() when changing the
APIC mode, and add a check for the APIC-disabled mode
when determining APICv activation.
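
The activation rule described above can be sketched as a small predicate.
This is a user-space illustration only: the enum and the function name are
hypothetical stand-ins, not the kernel's definitions.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the kernel's local APIC mode enumeration. */
enum lapic_mode_sketch {
	MODE_DISABLED,
	MODE_XAPIC,
	MODE_X2APIC,
};

/*
 * APICv is active on a vCPU only if it is activated for the VM
 * and the vCPU's local APIC is not disabled.
 */
static bool vcpu_apicv_should_activate(bool vm_apicv_activated,
				       enum lapic_mode_sketch mode)
{
	return vm_apicv_activated && mode != MODE_DISABLED;
}
```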

Suggested-by: Maxim Levitsky <[email protected]>
Reviewed-by: Maxim Levitsky <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/kvm/lapic.c | 4 +++-
arch/x86/kvm/x86.c | 4 +++-
2 files changed, 6 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 8b8c4a905976..680824d7aa0d 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -2346,8 +2346,10 @@ void kvm_lapic_set_base(struct kvm_vcpu *vcpu, u64 value)
if (((old_value ^ value) & X2APIC_ENABLE) && (value & X2APIC_ENABLE))
kvm_apic_set_x2apic_id(apic, vcpu->vcpu_id);

- if ((old_value ^ value) & (MSR_IA32_APICBASE_ENABLE | X2APIC_ENABLE))
+ if ((old_value ^ value) & (MSR_IA32_APICBASE_ENABLE | X2APIC_ENABLE)) {
+ kvm_vcpu_update_apicv(vcpu);
static_call_cond(kvm_x86_set_virtual_apic_mode)(vcpu);
+ }

apic->base_address = apic->vcpu->arch.apic_base &
MSR_IA32_APICBASE_BASE;
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 8ee8c91fa762..77e49892dea1 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -9836,7 +9836,9 @@ void kvm_vcpu_update_apicv(struct kvm_vcpu *vcpu)

down_read(&vcpu->kvm->arch.apicv_update_lock);

- activate = kvm_vcpu_apicv_activated(vcpu);
+ /* Do not activate APICV when APIC is disabled */
+ activate = kvm_vcpu_apicv_activated(vcpu) &&
+ (kvm_get_apic_mode(vcpu) != LAPIC_MODE_DISABLED);

if (vcpu->arch.apicv_active == activate)
goto out;
--
2.25.1


2022-05-19 13:07:44

by Suthikulpanit, Suravee

Subject: [PATCH v6 03/17] KVM: SVM: Detect X2APIC virtualization (x2AVIC) support

Add CPUID check for the x2APIC virtualization (x2AVIC) feature.
If available, the SVM driver can support both AVIC and x2AVIC modes
when the kvm_amd driver is loaded with avic=1. The operating mode will be
determined at runtime depending on the guest APIC mode.

Reviewed-by: Maxim Levitsky <[email protected]>
Reviewed-by: Pankaj Gupta <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/include/asm/svm.h | 3 +++
arch/x86/kvm/svm/avic.c | 45 ++++++++++++++++++++++++++++++++++++++
arch/x86/kvm/svm/svm.c | 15 ++-----------
arch/x86/kvm/svm/svm.h | 9 ++++++++
4 files changed, 59 insertions(+), 13 deletions(-)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index f70a5108d464..2c2a104b777e 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -195,6 +195,9 @@ struct __attribute__ ((__packed__)) vmcb_control_area {
#define AVIC_ENABLE_SHIFT 31
#define AVIC_ENABLE_MASK (1 << AVIC_ENABLE_SHIFT)

+#define X2APIC_MODE_SHIFT 30
+#define X2APIC_MODE_MASK (1 << X2APIC_MODE_SHIFT)
+
#define LBR_CTL_ENABLE_MASK BIT_ULL(0)
#define VIRTUAL_VMLOAD_VMSAVE_ENABLE_MASK BIT_ULL(1)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index a8f514212b87..7d4e73e95acd 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -40,6 +40,9 @@
#define AVIC_GATAG_TO_VMID(x) ((x >> AVIC_VCPU_ID_BITS) & AVIC_VM_ID_MASK)
#define AVIC_GATAG_TO_VCPUID(x) (x & AVIC_VCPU_ID_MASK)

+static bool force_avic;
+module_param_unsafe(force_avic, bool, 0444);
+
/* Note:
* This hash table is used to map VM_ID to a struct kvm_svm,
* when handling AMD IOMMU GALOG notification to schedule in
@@ -50,6 +53,7 @@ static DEFINE_HASHTABLE(svm_vm_data_hash, SVM_VM_DATA_HASH_BITS);
static u32 next_vm_id = 0;
static bool next_vm_id_wrapped = 0;
static DEFINE_SPINLOCK(svm_vm_data_hash_lock);
+enum avic_modes avic_mode;

/*
* This is a wrapper of struct amd_iommu_ir_data.
@@ -1077,3 +1081,44 @@ void avic_vcpu_unblocking(struct kvm_vcpu *vcpu)

avic_vcpu_load(vcpu);
}
+
+/*
+ * Note:
+ * - The module param avic enable both xAPIC and x2APIC mode.
+ * - Hypervisor can support both xAVIC and x2AVIC in the same guest.
+ * - The mode can be switched at run-time.
+ */
+bool avic_hardware_setup(struct kvm_x86_ops *x86_ops)
+{
+ if (!npt_enabled)
+ return false;
+
+ if (boot_cpu_has(X86_FEATURE_AVIC)) {
+ avic_mode = AVIC_MODE_X1;
+ pr_info("AVIC enabled\n");
+ } else if (force_avic) {
+ /*
+ * Some older systems does not advertise AVIC support.
+ * See Revision Guide for specific AMD processor for more detail.
+ */
+ avic_mode = AVIC_MODE_X1;
+ pr_warn("AVIC is not supported in CPUID but force enabled");
+ pr_warn("Your system might crash and burn");
+ }
+
+ /* AVIC is a prerequisite for x2AVIC. */
+ if (boot_cpu_has(X86_FEATURE_X2AVIC)) {
+ if (avic_mode == AVIC_MODE_X1) {
+ avic_mode = AVIC_MODE_X2;
+ pr_info("x2AVIC enabled\n");
+ } else {
+ pr_warn(FW_BUG "Cannot support x2AVIC due to AVIC is disabled");
+ pr_warn(FW_BUG "Try enable AVIC using force_avic option");
+ }
+ }
+
+ if (avic_mode != AVIC_MODE_NONE)
+ amd_iommu_register_ga_log_notifier(&avic_ga_log_notifier);
+
+ return !!avic_mode;
+}
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index aa7b387e0b7c..196bca5751a1 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -188,9 +188,6 @@ module_param(tsc_scaling, int, 0444);
static bool avic;
module_param(avic, bool, 0444);

-static bool force_avic;
-module_param_unsafe(force_avic, bool, 0444);
-
bool __read_mostly dump_invalid_vmcb;
module_param(dump_invalid_vmcb, bool, 0644);

@@ -4913,17 +4910,9 @@ static __init int svm_hardware_setup(void)
nrips = false;
}

- enable_apicv = avic = avic && npt_enabled && (boot_cpu_has(X86_FEATURE_AVIC) || force_avic);
+ enable_apicv = avic = avic && avic_hardware_setup(&svm_x86_ops);

- if (enable_apicv) {
- if (!boot_cpu_has(X86_FEATURE_AVIC)) {
- pr_warn("AVIC is not supported in CPUID but force enabled");
- pr_warn("Your system might crash and burn");
- } else
- pr_info("AVIC enabled\n");
-
- amd_iommu_register_ga_log_notifier(&avic_ga_log_notifier);
- } else {
+ if (!enable_apicv) {
svm_x86_ops.vcpu_blocking = NULL;
svm_x86_ops.vcpu_unblocking = NULL;
svm_x86_ops.vcpu_get_apicv_inhibit_reasons = NULL;
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 32220a1b0ea2..1731c1f3884b 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -36,6 +36,14 @@ extern bool npt_enabled;
extern int vgif;
extern bool intercept_smi;

+enum avic_modes {
+ AVIC_MODE_NONE = 0,
+ AVIC_MODE_X1,
+ AVIC_MODE_X2,
+};
+
+extern enum avic_modes avic_mode;
+
/*
* Clean bits in VMCB.
* VMCB_ALL_CLEAN_MASK might also need to
@@ -603,6 +611,7 @@ extern struct kvm_x86_nested_ops svm_nested_ops;

/* avic.c */

+bool avic_hardware_setup(struct kvm_x86_ops *ops);
int avic_ga_log_notifier(u32 ga_tag);
void avic_vm_destroy(struct kvm *kvm);
int avic_vm_init(struct kvm *kvm);
--
2.25.1


2022-05-19 14:00:24

by Suthikulpanit, Suravee

Subject: [PATCH v6 16/17] KVM: SVM: Add AVIC doorbell tracepoint

Add a tracepoint to track the number of doorbells sent
to signal a running vCPU to process an IRQ after it has been injected.

Reviewed-by: Maxim Levitsky <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/kvm/svm/avic.c | 4 +++-
arch/x86/kvm/trace.h | 18 ++++++++++++++++++
arch/x86/kvm/x86.c | 1 +
3 files changed, 22 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 9c439a32c343..2a9eb419bdb9 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -324,8 +324,10 @@ void avic_ring_doorbell(struct kvm_vcpu *vcpu)
*/
int cpu = READ_ONCE(vcpu->cpu);

- if (cpu != get_cpu())
+ if (cpu != get_cpu()) {
wrmsrl(MSR_AMD64_SVM_AVIC_DOORBELL, kvm_cpu_get_apicid(cpu));
+ trace_kvm_avic_doorbell(vcpu->vcpu_id, kvm_cpu_get_apicid(cpu));
+ }
put_cpu();
}

diff --git a/arch/x86/kvm/trace.h b/arch/x86/kvm/trace.h
index de4762517569..a47bb0fdea70 100644
--- a/arch/x86/kvm/trace.h
+++ b/arch/x86/kvm/trace.h
@@ -1479,6 +1479,24 @@ TRACE_EVENT(kvm_avic_kick_vcpu_slowpath,
__entry->icrh, __entry->icrl, __entry->index)
);

+TRACE_EVENT(kvm_avic_doorbell,
+ TP_PROTO(u32 vcpuid, u32 apicid),
+ TP_ARGS(vcpuid, apicid),
+
+ TP_STRUCT__entry(
+ __field(u32, vcpuid)
+ __field(u32, apicid)
+ ),
+
+ TP_fast_assign(
+ __entry->vcpuid = vcpuid;
+ __entry->apicid = apicid;
+ ),
+
+ TP_printk("vcpuid=%u, apicid=%u",
+ __entry->vcpuid, __entry->apicid)
+);
+
TRACE_EVENT(kvm_hv_timer_state,
TP_PROTO(unsigned int vcpu_id, unsigned int hv_timer_in_use),
TP_ARGS(vcpu_id, hv_timer_in_use),
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 0febaca80feb..d013f6fc2e33 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -13095,6 +13095,7 @@ EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_avic_unaccelerated_access);
EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_avic_incomplete_ipi);
EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_avic_ga_log);
EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_avic_kick_vcpu_slowpath);
+EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_avic_doorbell);
EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_apicv_accept_irq);
EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_vmgexit_enter);
EXPORT_TRACEPOINT_SYMBOL_GPL(kvm_vmgexit_exit);
--
2.25.1


2022-05-19 14:47:15

by Suthikulpanit, Suravee

Subject: [PATCH v6 05/17] KVM: SVM: Update avic_kick_target_vcpus to support 32-bit APIC ID

In x2APIC mode, ICRH contains a 32-bit destination APIC ID.
So, update avic_kick_target_vcpus() accordingly.
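
To make the difference concrete, here is a small user-space sketch of the
destination extraction. The function name ipi_dest() is hypothetical; the
macro reproduces the xAPIC layout, where the 8-bit destination sits in
ICRH[31:24].

```c
#include <stdbool.h>
#include <stdint.h>

/* xAPIC keeps the 8-bit destination in ICRH[31:24]. */
#define GET_XAPIC_DEST_FIELD(x) (((x) >> 24) & 0xFF)

/* In x2APIC mode the whole 32-bit ICRH value is the destination. */
static uint32_t ipi_dest(uint32_t icrh, bool x2apic_mode)
{
	return x2apic_mode ? icrh : GET_XAPIC_DEST_FIELD(icrh);
}
```

The same ICRH value therefore names different targets depending on the
vCPU's APIC mode, which is why the mode has to be checked per vCPU.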

Reviewed-by: Maxim Levitsky <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/kvm/svm/avic.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 6b89303034e3..560c8a886199 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -369,9 +369,15 @@ static void avic_kick_target_vcpus(struct kvm *kvm, struct kvm_lapic *source,
* since entered the guest will have processed pending IRQs at VMRUN.
*/
kvm_for_each_vcpu(i, vcpu, kvm) {
+ u32 dest;
+
+ if (apic_x2apic_mode(vcpu->arch.apic))
+ dest = icrh;
+ else
+ dest = GET_XAPIC_DEST_FIELD(icrh);
+
if (kvm_apic_match_dest(vcpu, source, icrl & APIC_SHORT_MASK,
- GET_XAPIC_DEST_FIELD(icrh),
- icrl & APIC_DEST_MASK)) {
+ dest, icrl & APIC_DEST_MASK)) {
vcpu->arch.apic->irr_pending = true;
svm_complete_interrupt_delivery(vcpu,
icrl & APIC_MODE_MASK,
--
2.25.1


2022-05-19 15:44:58

by Suthikulpanit, Suravee

Subject: [PATCH v6 09/17] KVM: SVM: Refresh AVIC configuration when changing APIC mode

AMD AVIC can support xAPIC and x2APIC virtualization,
which requires changing the x2APIC bit in the VMCB and the MSR interception
for x2APIC MSRs. Therefore, call avic_refresh_apicv_exec_ctrl()
to refresh the configuration accordingly.

Reviewed-by: Maxim Levitsky <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/kvm/svm/avic.c | 12 ++++++++++++
arch/x86/kvm/svm/svm.c | 1 +
2 files changed, 13 insertions(+)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 7aa75931bec1..aa88cef3d41f 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -685,6 +685,18 @@ void avic_apicv_post_state_restore(struct kvm_vcpu *vcpu)
avic_handle_ldr_update(vcpu);
}

+void avic_set_virtual_apic_mode(struct kvm_vcpu *vcpu)
+{
+ if (!lapic_in_kernel(vcpu) || (avic_mode == AVIC_MODE_NONE))
+ return;
+
+ if (kvm_get_apic_mode(vcpu) == LAPIC_MODE_INVALID) {
+ WARN_ONCE(true, "Invalid local APIC state (vcpu_id=%d)", vcpu->vcpu_id);
+ return;
+ }
+ avic_refresh_apicv_exec_ctrl(vcpu);
+}
+
static int avic_set_pi_irte_mode(struct kvm_vcpu *vcpu, bool activate)
{
int ret = 0;
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 2cf6710333f8..31b669f3f3de 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4692,6 +4692,7 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
.enable_nmi_window = svm_enable_nmi_window,
.enable_irq_window = svm_enable_irq_window,
.update_cr8_intercept = svm_update_cr8_intercept,
+ .set_virtual_apic_mode = avic_set_virtual_apic_mode,
.refresh_apicv_exec_ctrl = avic_refresh_apicv_exec_ctrl,
.check_apicv_inhibit_reasons = avic_check_apicv_inhibit_reasons,
.apicv_post_state_restore = avic_apicv_post_state_restore,
--
2.25.1


2022-05-19 16:11:30

by Suthikulpanit, Suravee

Subject: [PATCH v6 15/17] KVM: SVM: Use target APIC ID to complete x2AVIC IRQs when possible

For x2AVIC, the index from incomplete IPI #vmexit info is invalid
for logical cluster mode. Only ICRH/ICRL values can be used
to determine the IPI destination APIC ID.

Since QEMU defines the guest physical APIC ID to be the same as the
vCPU ID, it can be used to quickly identify the target vCPU for IPI delivery,
avoiding the overhead of searching through all vCPUs to match the target
vCPU.
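
The logical-to-physical translation can be illustrated in user space as
follows. In x2APIC cluster mode the logical ID carries a cluster number in
bits 31:16 and a one-hot member bitmap in bits 15:0, and the physical ID is
(cluster << 4) plus the bit position. The function name is hypothetical, and
the sketch uses a power-of-two test where the patch uses ffs()/fls().

```c
#include <stdint.h>

/*
 * Return the physical x2APIC ID named by a logical destination, or -1
 * if the destination does not select exactly one vCPU.
 */
static int64_t x2apic_logical_to_physical(uint32_t icrh)
{
	uint32_t bitmap = icrh & 0xffff;   /* one bit per cluster member */
	uint32_t cluster = icrh >> 16;     /* cluster number */
	uint32_t bit = 0;

	/* Zero bits or more than one bit set: no single target. */
	if (bitmap == 0 || (bitmap & (bitmap - 1)) != 0)
		return -1;

	/* Find the position of the single set bit. */
	while (!(bitmap & 1)) {
		bitmap >>= 1;
		bit++;
	}

	return ((int64_t)cluster << 4) + bit;
}
```

For instance, ICRH 0x00010004 selects member 2 of cluster 1, i.e. physical
ID 18; a value with two member bits set falls back to the slow path.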

Reviewed-by: Maxim Levitsky <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/kvm/svm/avic.c | 21 ++++++++++++++++++++-
1 file changed, 20 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index bac876bb1cf1..9c439a32c343 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -358,7 +358,26 @@ static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source
/* For xAPIC logical mode, the index is for logical APIC table. */
apic_id = avic_logical_id_table[index] & 0x1ff;
} else {
- return -EINVAL;
+ /* For x2APIC logical mode, cannot leverage the index.
+ * Instead, calculate physical ID from logical ID in ICRH.
+ */
+ int apic;
+ int first = ffs(icrh & 0xffff);
+ int last = fls(icrh & 0xffff);
+ int cluster = (icrh & 0xffff0000) >> 16;
+
+ /*
+ * If the x2APIC logical ID sub-field (i.e. icrh[15:0]) contains zero
+ * or more than 1 bits, we cannot match just one vcpu to kick for
+ * fast path.
+ */
+ if (!first || (first != last))
+ return -EINVAL;
+
+ apic = first - 1;
+ if ((apic < 0) || (apic > 15) || (cluster >= 0xfffff))
+ return -EINVAL;
+ apic_id = (cluster << 4) + apic;
}
}

--
2.25.1


2022-05-19 16:13:59

by Suthikulpanit, Suravee

Subject: [PATCH v6 14/17] KVM: x86: Warning APICv inconsistency only when vcpu APIC mode is valid

When launching a VM with x2APIC and more than 255 vCPUs,
the guest kernel can disable x2APIC (e.g. via the nox2apic kernel option).
The VM then falls back to xAPIC mode and disables vCPUs with ID 255 and greater.

In this case, APICV is deactivated for the disabled vCPUs.
However, the current APICv consistency warning does not account for
this case, which results in a warning.

Therefore, modify the warning logic to report only when the vCPU APIC mode
is valid.

Reviewed-by: Maxim Levitsky <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/kvm/x86.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 77e49892dea1..0febaca80feb 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -10242,7 +10242,8 @@ static int vcpu_enter_guest(struct kvm_vcpu *vcpu)
* per-VM state, and responsing vCPUs must wait for the update
* to complete before servicing KVM_REQ_APICV_UPDATE.
*/
- WARN_ON_ONCE(kvm_vcpu_apicv_activated(vcpu) != kvm_vcpu_apicv_active(vcpu));
+ WARN_ON_ONCE((kvm_vcpu_apicv_activated(vcpu) != kvm_vcpu_apicv_active(vcpu)) &&
+ (kvm_get_apic_mode(vcpu) != LAPIC_MODE_DISABLED));

exit_fastpath = static_call(kvm_x86_vcpu_run)(vcpu);
if (likely(exit_fastpath != EXIT_FASTPATH_REENTER_GUEST))
--
2.25.1


2022-05-19 17:22:40

by Suthikulpanit, Suravee

Subject: [PATCH v6 12/17] KVM: SVM: Do not throw warning when calling avic_vcpu_load on a running vcpu

Originally, this WARN_ON was designed to detect calls to
avic_vcpu_load() on an already running vCPU in AVIC mode (i.e. the AVIC
is_running bit is set).

However, for x2AVIC, the vCPU can switch from xAPIC to x2APIC mode while
running, in which case avic_vcpu_load() will be called from
svm_refresh_apicv_exec_ctrl().

Therefore, remove this warning since it is no longer appropriate.

Reviewed-by: Maxim Levitsky <[email protected]>
Reviewed-by: Pankaj Gupta <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/kvm/svm/avic.c | 1 -
1 file changed, 1 deletion(-)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index d40170082716..2d9455338b1f 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -1038,7 +1038,6 @@ void __avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
return;

entry = READ_ONCE(*(svm->avic_physical_id_cache));
- WARN_ON(entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK);

entry &= ~AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK;
entry |= (h_physical_id & AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK);
--
2.25.1


2022-05-19 17:23:25

by Suthikulpanit, Suravee

Subject: [PATCH v6 04/17] KVM: SVM: Update max number of vCPUs supported for x2AVIC mode

xAVIC and x2AVIC modes can support different numbers of vCPUs.
Update the existing logic to support each mode accordingly.

Also, modify the maximum physical APIC ID for AVIC to 255 to reflect
the actual value supported by the architecture.

Reviewed-by: Maxim Levitsky <[email protected]>
Reviewed-by: Pankaj Gupta <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/include/asm/svm.h | 12 +++++++++---
arch/x86/kvm/svm/avic.c | 8 +++++---
2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 2c2a104b777e..4c26b0d47d76 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -258,10 +258,16 @@ enum avic_ipi_failure_cause {


/*
- * 0xff is broadcast, so the max index allowed for physical APIC ID
- * table is 0xfe. APIC IDs above 0xff are reserved.
+ * For AVIC, the max index allowed for physical APIC ID
+ * table is 0xff (255).
*/
-#define AVIC_MAX_PHYSICAL_ID_COUNT 0xff
+#define AVIC_MAX_PHYSICAL_ID 0XFEULL
+
+/*
+ * For x2AVIC, the max index allowed for physical APIC ID
+ * table is 0x1ff (511).
+ */
+#define X2AVIC_MAX_PHYSICAL_ID 0x1FFUL

#define AVIC_HPA_MASK ~((0xFFFULL << 52) | 0xFFF)
#define VMCB_AVIC_APIC_BAR_MASK 0xFFFFFFFFFF000ULL
diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 7d4e73e95acd..6b89303034e3 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -179,7 +179,7 @@ void avic_init_vmcb(struct vcpu_svm *svm, struct vmcb *vmcb)
vmcb->control.avic_backing_page = bpa & AVIC_HPA_MASK;
vmcb->control.avic_logical_id = lpa & AVIC_HPA_MASK;
vmcb->control.avic_physical_id = ppa & AVIC_HPA_MASK;
- vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID_COUNT;
+ vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID;
vmcb->control.avic_vapic_bar = APIC_DEFAULT_PHYS_BASE & VMCB_AVIC_APIC_BAR_MASK;

if (kvm_apicv_activated(svm->vcpu.kvm))
@@ -194,7 +194,8 @@ static u64 *avic_get_physical_id_entry(struct kvm_vcpu *vcpu,
u64 *avic_physical_id_table;
struct kvm_svm *kvm_svm = to_kvm_svm(vcpu->kvm);

- if (index >= AVIC_MAX_PHYSICAL_ID_COUNT)
+ if ((avic_mode == AVIC_MODE_X1 && index > AVIC_MAX_PHYSICAL_ID) ||
+ (avic_mode == AVIC_MODE_X2 && index > X2AVIC_MAX_PHYSICAL_ID))
return NULL;

avic_physical_id_table = page_address(kvm_svm->avic_physical_id_table_page);
@@ -241,7 +242,8 @@ static int avic_init_backing_page(struct kvm_vcpu *vcpu)
int id = vcpu->vcpu_id;
struct vcpu_svm *svm = to_svm(vcpu);

- if (id >= AVIC_MAX_PHYSICAL_ID_COUNT)
+ if ((avic_mode == AVIC_MODE_X1 && id > AVIC_MAX_PHYSICAL_ID) ||
+ (avic_mode == AVIC_MODE_X2 && id > X2AVIC_MAX_PHYSICAL_ID))
return -EINVAL;

if (!vcpu->arch.apic->regs)
--
2.25.1


2022-05-19 17:30:07

by Suthikulpanit, Suravee

Subject: [PATCH v6 07/17] KVM: SVM: Adding support for configuring x2APIC MSRs interception

When enabling x2APIC virtualization (x2AVIC), the interception of
x2APIC MSRs must be disabled to let the hardware virtualize guest
MSR accesses.

The current implementation keeps track of the MSR interception state
in the svm_direct_access_msrs array. Therefore, extend the array to
include x2APIC MSRs.
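
For reference, the x2APIC architecture exposes each APIC register as an MSR
starting at 0x800, with the MSR index derived from the register's xAPIC MMIO
offset shifted right by four. The sketch below shows only that architectural
mapping; the helper name x2apic_msr() is hypothetical, and the sketch makes
no claim about how the table in this patch computes its indices.

```c
#include <stdint.h>

#define APIC_BASE_MSR 0x800u	/* first MSR of the x2APIC range */

/* Architectural mapping: xAPIC MMIO offset -> x2APIC MSR index. */
static uint32_t x2apic_msr(uint32_t mmio_offset)
{
	return APIC_BASE_MSR + (mmio_offset >> 4);
}
```

Under this mapping the APIC ID register (MMIO offset 0x20) is MSR 0x802 and
the ICR (MMIO offset 0x300) is MSR 0x830.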

Reviewed-by: Maxim Levitsky <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/kvm/svm/svm.c | 25 +++++++++++++++++++++++++
arch/x86/kvm/svm/svm.h | 4 ++--
2 files changed, 27 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 196bca5751a1..2cf6710333f8 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -100,6 +100,31 @@ static const struct svm_direct_access_msrs {
{ .index = MSR_IA32_CR_PAT, .always = false },
{ .index = MSR_AMD64_SEV_ES_GHCB, .always = true },
{ .index = MSR_TSC_AUX, .always = false },
+ { .index = (APIC_BASE_MSR + APIC_ID), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_LVR), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_TASKPRI), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_ARBPRI), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_PROCPRI), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_EOI), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_RRR), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_LDR), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_DFR), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_SPIV), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_ISR), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_TMR), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_IRR), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_ESR), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_ICR), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_ICR2), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_LVTT), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_LVTTHMR), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_LVTPC), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_LVT0), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_LVT1), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_LVTERR), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_TMICT), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_TMCCT), .always = false },
+ { .index = (APIC_BASE_MSR + APIC_TDCR), .always = false },
{ .index = MSR_INVALID, .always = false },
};

diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 1731c1f3884b..16f1d117c98b 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -29,8 +29,8 @@
#define IOPM_SIZE PAGE_SIZE * 3
#define MSRPM_SIZE PAGE_SIZE * 2

-#define MAX_DIRECT_ACCESS_MSRS 21
-#define MSRPM_OFFSETS 16
+#define MAX_DIRECT_ACCESS_MSRS 46
+#define MSRPM_OFFSETS 32
extern u32 msrpm_offsets[MSRPM_OFFSETS] __read_mostly;
extern bool npt_enabled;
extern int vgif;
--
2.25.1


2022-05-19 17:52:23

by Suthikulpanit, Suravee

Subject: [PATCH v6 17/17] KVM: x86: nSVM: optimize svm_set_x2apic_msr_interception

From: Maxim Levitsky <[email protected]>

- Avoid toggling the x2apic msr interception if it is already up to date.

- Avoid touching L0 msr bitmap when AVIC is inhibited on entry to
the guest mode, because in this case the guest usually uses its
own msr bitmap.

Later on VM exit, the 1st optimization will allow KVM to skip
touching the L0 msr bitmap as well.
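
The first optimization amounts to caching the last requested interception
state and returning early when nothing changes. A minimal user-space sketch
follows; the struct, the counter and the function name are hypothetical, and
the counter merely stands in for the walk over the MSR bitmap.

```c
#include <stdbool.h>

struct vcpu_state_sketch {
	bool x2avic_msrs_intercepted;	/* cached state, starts out true */
	unsigned int bitmap_updates;	/* counts simulated bitmap rewrites */
};

static void set_x2apic_msr_interception(struct vcpu_state_sketch *v,
					bool intercept)
{
	/* Nothing to do if the bitmap already reflects the request. */
	if (intercept == v->x2avic_msrs_intercepted)
		return;

	v->bitmap_updates++;		/* stand-in for updating each MSR */
	v->x2avic_msrs_intercepted = intercept;
}
```

Repeated calls with the same argument then cost a single comparison, which
is what lets the VM exit path skip the bitmap as described above.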

Reviewed-by: Suravee Suthikulpanit <[email protected]>
Tested-by: Suravee Suthikulpanit <[email protected]>
Signed-off-by: Maxim Levitsky <[email protected]>
---
arch/x86/kvm/svm/avic.c | 8 ++++++++
arch/x86/kvm/svm/svm.c | 7 +++++++
arch/x86/kvm/svm/svm.h | 2 ++
3 files changed, 17 insertions(+)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 2a9eb419bdb9..0d7499678cb9 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -100,6 +100,14 @@ static void avic_deactivate_vmcb(struct vcpu_svm *svm)
vmcb->control.int_ctl &= ~(AVIC_ENABLE_MASK | X2APIC_MODE_MASK);
vmcb->control.avic_physical_id &= ~AVIC_PHYSICAL_MAX_INDEX_MASK;

+ /*
+ * If running nested and the guest uses its own MSR bitmap, there
+ * is no need to update L0's msr bitmap
+ */
+ if (is_guest_mode(&svm->vcpu) &&
+ vmcb12_is_intercept(&svm->nested.ctl, INTERCEPT_MSR_PROT))
+ return;
+
/* Enabling MSR intercept for x2APIC registers */
svm_set_x2apic_msr_interception(svm, true);
}
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index e04a133b98d0..4165317c0b00 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -750,6 +750,9 @@ void svm_set_x2apic_msr_interception(struct vcpu_svm *svm, bool intercept)
{
int i;

+ if (intercept == svm->x2avic_msrs_intercepted)
+ return;
+
if (avic_mode != AVIC_MODE_X2 ||
!apic_x2apic_mode(svm->vcpu.arch.apic))
return;
@@ -763,6 +766,8 @@ void svm_set_x2apic_msr_interception(struct vcpu_svm *svm, bool intercept)
set_msr_interception(&svm->vcpu, svm->msrpm, index,
!intercept, !intercept);
}
+
+ svm->x2avic_msrs_intercepted = intercept;
}

void svm_vcpu_free_msrpm(u32 *msrpm)
@@ -1333,6 +1338,8 @@ static int svm_vcpu_create(struct kvm_vcpu *vcpu)
goto error_free_vmsa_page;
}

+ svm->x2avic_msrs_intercepted = true;
+
svm->vmcb01.ptr = page_address(vmcb01_page);
svm->vmcb01.pa = __sme_set(page_to_pfn(vmcb01_page) << PAGE_SHIFT);
svm_switch_vmcb(svm, &svm->vmcb01);
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 309445619756..6395b7791f26 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -272,6 +272,8 @@ struct vcpu_svm {
struct vcpu_sev_es_state sev_es;

bool guest_state_loaded;
+
+ bool x2avic_msrs_intercepted;
};

struct svm_cpu_data {
--
2.25.1


2022-05-19 23:44:39

by Suthikulpanit, Suravee

Subject: [PATCH v6 13/17] KVM: SVM: Introduce hybrid-AVIC mode

Currently, AVIC is inhibited when booting a VM with x2APIC support,
because AVIC cannot virtualize x2APIC MSR register accesses.
However, the AVIC doorbell can be used to accelerate interrupt
injection into a running vCPU, while all guest accesses to x2APIC MSRs
will be intercepted and emulated by KVM.

With hybrid-AVIC support, the APICV_INHIBIT_REASON_X2APIC is
no longer enforced.
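
One way to picture the resulting per-vCPU behaviour is the decision sketch
below. The avic_modes enum mirrors the one introduced earlier in this
series; the vcpu_accel enum and both function names are hypothetical and
used only for illustration.

```c
#include <stdbool.h>

enum avic_modes {
	AVIC_MODE_NONE = 0,
	AVIC_MODE_X1,
	AVIC_MODE_X2,
};

/* Hypothetical labels for how a vCPU ends up being accelerated. */
enum vcpu_accel {
	ACCEL_NONE,	/* no AVIC at all */
	ACCEL_XAVIC,	/* guest in xAPIC mode, registers virtualized */
	ACCEL_X2AVIC,	/* guest in x2APIC mode, MSRs virtualized */
	ACCEL_HYBRID,	/* guest in x2APIC mode, MSRs trapped, doorbell used */
};

static enum vcpu_accel pick_accel(enum avic_modes hw, bool guest_x2apic)
{
	if (hw == AVIC_MODE_NONE)
		return ACCEL_NONE;
	if (!guest_x2apic)
		return ACCEL_XAVIC;
	return hw == AVIC_MODE_X2 ? ACCEL_X2AVIC : ACCEL_HYBRID;
}

/* x2APIC MSR accesses are left unintercepted only in full x2AVIC mode. */
static bool x2apic_msrs_intercepted(enum vcpu_accel accel)
{
	return accel != ACCEL_X2AVIC;
}
```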

Suggested-by: Maxim Levitsky <[email protected]>
Reviewed-by: Maxim Levitsky <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 1 -
arch/x86/kvm/svm/avic.c | 13 +++++++++++--
arch/x86/kvm/svm/svm.c | 9 ---------
3 files changed, 11 insertions(+), 12 deletions(-)

diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c59fea4bdb6e..da03111b05f6 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1051,7 +1051,6 @@ enum kvm_apicv_inhibit {
APICV_INHIBIT_REASON_NESTED,
APICV_INHIBIT_REASON_IRQWIN,
APICV_INHIBIT_REASON_PIT_REINJ,
- APICV_INHIBIT_REASON_X2APIC,
APICV_INHIBIT_REASON_BLOCKIRQ,
APICV_INHIBIT_REASON_ABSENT,
APICV_INHIBIT_REASON_SEV,
diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 2d9455338b1f..bac876bb1cf1 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -71,12 +71,22 @@ static void avic_activate_vmcb(struct vcpu_svm *svm)
vmcb->control.avic_physical_id &= ~AVIC_PHYSICAL_MAX_INDEX_MASK;

vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
- if (apic_x2apic_mode(svm->vcpu.arch.apic)) {
+
+ /* Note:
+ * KVM can support hybrid-AVIC mode, where KVM emulates x2APIC
+ * MSR accesses, while interrupt injection to a running vCPU
+ * can be achieved using AVIC doorbell. The AVIC hardware still
+ * accelerate MMIO accesses, but this does not cause any harm
+ * as the guest is not supposed to access xAPIC mmio when uses x2APIC.
+ */
+ if (apic_x2apic_mode(svm->vcpu.arch.apic) &&
+ (avic_mode == AVIC_MODE_X2)) {
vmcb->control.int_ctl |= X2APIC_MODE_MASK;
vmcb->control.avic_physical_id |= X2AVIC_MAX_PHYSICAL_ID;
/* Disabling MSR intercept for x2APIC registers */
svm_set_x2apic_msr_interception(svm, false);
} else {
+ /* For xAVIC and hybrid-xAVIC modes */
vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID;
/* Enabling MSR intercept for x2APIC registers */
svm_set_x2apic_msr_interception(svm, true);
@@ -978,7 +988,6 @@ bool avic_check_apicv_inhibit_reasons(enum kvm_apicv_inhibit reason)
BIT(APICV_INHIBIT_REASON_NESTED) |
BIT(APICV_INHIBIT_REASON_IRQWIN) |
BIT(APICV_INHIBIT_REASON_PIT_REINJ) |
- BIT(APICV_INHIBIT_REASON_X2APIC) |
BIT(APICV_INHIBIT_REASON_BLOCKIRQ) |
BIT(APICV_INHIBIT_REASON_SEV);

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 0ec2444c342d..e04a133b98d0 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4061,7 +4061,6 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
struct kvm_cpuid_entry2 *best;
- struct kvm *kvm = vcpu->kvm;

vcpu->arch.xsaves_enabled = guest_cpuid_has(vcpu, X86_FEATURE_XSAVE) &&
boot_cpu_has(X86_FEATURE_XSAVE) &&
@@ -4093,14 +4092,6 @@ static void svm_vcpu_after_set_cpuid(struct kvm_vcpu *vcpu)
vcpu->arch.reserved_gpa_bits &= ~(1UL << (best->ebx & 0x3f));
}

- if (kvm_vcpu_apicv_active(vcpu)) {
- /*
- * AVIC does not work with an x2APIC mode guest. If the X2APIC feature
- * is exposed to the guest, disable AVIC.
- */
- if (guest_cpuid_has(vcpu, X86_FEATURE_X2APIC))
- kvm_set_apicv_inhibit(kvm, APICV_INHIBIT_REASON_X2APIC);
- }
init_vmcb_after_set_cpuid(vcpu);
}

--
2.25.1


2022-05-20 09:11:03

by Suthikulpanit, Suravee

Subject: [PATCH v6 11/17] KVM: SVM: Introduce logic to (de)activate x2AVIC mode

Introduce logic to (de)activate AVIC, which also allows
switching between AVIC and x2AVIC modes at runtime.

When an AVIC-enabled guest switches from APIC to x2APIC mode,
the SVM driver needs to perform the following steps:

1. Set the x2APIC mode bit for AVIC in the VMCB, along with the maximum
APIC ID supported for each mode accordingly.

2. Disable interception of the x2APIC MSRs in order to allow the hardware
to virtualize x2APIC MSR accesses.
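
The two steps above can be sketched in isolation as follows. This is a
minimal user-space model, not the kernel code: the struct, the sketch_*
names, the two maximum-ID values, and the boolean standing in for the MSR
permission map are all assumptions made for illustration. Only the bit
positions (31 and 30 of VMCB offset 60h) and the 10-bit index mask come
from the series itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions from the cover letter: VMCB offset 60h (int_ctl). */
#define AVIC_ENABLE_MASK             (1u << 31)
#define X2APIC_MODE_MASK             (1u << 30)

/* From the patch: bits 9:0 of avic_physical_id hold the max index. */
#define AVIC_PHYSICAL_MAX_INDEX_MASK 0x3ffull

/* Illustrative placeholder values, not taken from this patch. */
#define SKETCH_XAVIC_MAX_ID          0xfeull
#define SKETCH_X2AVIC_MAX_ID         0x1ffull

/* Hypothetical stand-in for the relevant VMCB control fields. */
struct sketch_vmcb_control {
	uint32_t int_ctl;
	uint64_t avic_physical_id;
	bool x2apic_msrs_intercepted; /* models the MSR permission map */
};

static void sketch_avic_activate(struct sketch_vmcb_control *c,
				 bool guest_x2apic)
{
	/* Start from a clean state so a mode switch never leaves stale bits. */
	c->int_ctl &= ~(AVIC_ENABLE_MASK | X2APIC_MODE_MASK);
	c->avic_physical_id &= ~AVIC_PHYSICAL_MAX_INDEX_MASK;

	c->int_ctl |= AVIC_ENABLE_MASK;
	if (guest_x2apic) {
		/* Step 1: x2APIC mode bit plus the larger max index. */
		c->int_ctl |= X2APIC_MODE_MASK;
		c->avic_physical_id |= SKETCH_X2AVIC_MAX_ID;
		/* Step 2: let hardware handle x2APIC MSR accesses. */
		c->x2apic_msrs_intercepted = false;
	} else {
		c->avic_physical_id |= SKETCH_XAVIC_MAX_ID;
		c->x2apic_msrs_intercepted = true;
	}
}
```

Clearing both fields before setting them is what makes the same helper
usable for the initial activation and for a later xAPIC/x2APIC switch.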

Reported-by: kernel test robot <[email protected]>
Reviewed-by: Maxim Levitsky <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/include/asm/svm.h | 1 +
arch/x86/kvm/svm/avic.c | 39 +++++++++++++++++++++++++++++++++-----
arch/x86/kvm/svm/svm.c | 18 ++++++++++++++++++
arch/x86/kvm/svm/svm.h | 1 +
4 files changed, 54 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 4c26b0d47d76..13d315b4eaba 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -256,6 +256,7 @@ enum avic_ipi_failure_cause {
AVIC_IPI_FAILURE_INVALID_BACKING_PAGE,
};

+#define AVIC_PHYSICAL_MAX_INDEX_MASK GENMASK_ULL(9, 0)

/*
* For AVIC, the max index allowed for physical APIC ID
diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index aa88cef3d41f..d40170082716 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -63,6 +63,36 @@ struct amd_svm_iommu_ir {
void *data; /* Storing pointer to struct amd_ir_data */
};

+static void avic_activate_vmcb(struct vcpu_svm *svm)
+{
+ struct vmcb *vmcb = svm->vmcb01.ptr;
+
+ vmcb->control.int_ctl &= ~(AVIC_ENABLE_MASK | X2APIC_MODE_MASK);
+ vmcb->control.avic_physical_id &= ~AVIC_PHYSICAL_MAX_INDEX_MASK;
+
+ vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
+ if (apic_x2apic_mode(svm->vcpu.arch.apic)) {
+ vmcb->control.int_ctl |= X2APIC_MODE_MASK;
+ vmcb->control.avic_physical_id |= X2AVIC_MAX_PHYSICAL_ID;
+ /* Disabling MSR intercept for x2APIC registers */
+ svm_set_x2apic_msr_interception(svm, false);
+ } else {
+ vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID;
+ /* Enabling MSR intercept for x2APIC registers */
+ svm_set_x2apic_msr_interception(svm, true);
+ }
+}
+
+static void avic_deactivate_vmcb(struct vcpu_svm *svm)
+{
+ struct vmcb *vmcb = svm->vmcb01.ptr;
+
+ vmcb->control.int_ctl &= ~(AVIC_ENABLE_MASK | X2APIC_MODE_MASK);
+ vmcb->control.avic_physical_id &= ~AVIC_PHYSICAL_MAX_INDEX_MASK;
+
+ /* Enabling MSR intercept for x2APIC registers */
+ svm_set_x2apic_msr_interception(svm, true);
+}

/* Note:
* This function is called from IOMMU driver to notify
@@ -179,13 +209,12 @@ void avic_init_vmcb(struct vcpu_svm *svm, struct vmcb *vmcb)
vmcb->control.avic_backing_page = bpa & AVIC_HPA_MASK;
vmcb->control.avic_logical_id = lpa & AVIC_HPA_MASK;
vmcb->control.avic_physical_id = ppa & AVIC_HPA_MASK;
- vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID;
vmcb->control.avic_vapic_bar = APIC_DEFAULT_PHYS_BASE & VMCB_AVIC_APIC_BAR_MASK;

if (kvm_apicv_activated(svm->vcpu.kvm))
- vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
+ avic_activate_vmcb(svm);
else
- vmcb->control.int_ctl &= ~AVIC_ENABLE_MASK;
+ avic_deactivate_vmcb(svm);
}

static u64 *avic_get_physical_id_entry(struct kvm_vcpu *vcpu,
@@ -1076,9 +1105,9 @@ void avic_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
* accordingly before re-activating.
*/
avic_apicv_post_state_restore(vcpu);
- vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
+ avic_activate_vmcb(svm);
} else {
- vmcb->control.int_ctl &= ~AVIC_ENABLE_MASK;
+ avic_deactivate_vmcb(svm);
}
vmcb_mark_dirty(vmcb, VMCB_AVIC);

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 31b669f3f3de..0ec2444c342d 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -746,6 +746,24 @@ void svm_vcpu_init_msrpm(struct kvm_vcpu *vcpu, u32 *msrpm)
}
}

+void svm_set_x2apic_msr_interception(struct vcpu_svm *svm, bool intercept)
+{
+ int i;
+
+ if (avic_mode != AVIC_MODE_X2 ||
+ !apic_x2apic_mode(svm->vcpu.arch.apic))
+ return;
+
+ for (i = 0; i < MAX_DIRECT_ACCESS_MSRS; i++) {
+ int index = direct_access_msrs[i].index;
+
+ if ((index < APIC_BASE_MSR) ||
+ (index > APIC_BASE_MSR + 0xff))
+ continue;
+ set_msr_interception(&svm->vcpu, svm->msrpm, index,
+ !intercept, !intercept);
+ }
+}

void svm_vcpu_free_msrpm(u32 *msrpm)
{
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 7e53474c8834..309445619756 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -551,6 +551,7 @@ void svm_set_gif(struct vcpu_svm *svm, bool value);
int svm_invoke_exit_handler(struct kvm_vcpu *vcpu, u64 exit_code);
void set_msr_interception(struct kvm_vcpu *vcpu, u32 *msrpm, u32 msr,
int read, int write);
+void svm_set_x2apic_msr_interception(struct vcpu_svm *svm, bool disable);
void svm_complete_interrupt_delivery(struct kvm_vcpu *vcpu, int delivery_mode,
int trig_mode, int vec);

--
2.25.1


2022-05-20 14:15:30

by Suthikulpanit, Suravee

Subject: [PATCH v6 01/17] x86/cpufeatures: Introduce x2AVIC CPUID bit

Introduce a new feature bit for virtualized x2APIC (x2AVIC) in
CPUID_Fn8000000A_EDX [SVM Revision and Feature Identification].
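
For readers unfamiliar with the encoding used in cpufeatures.h, the
sketch below shows how a value such as (15*32+18) splits into a feature
word and a bit position; word 15 corresponds to CPUID 0x8000000A EDX.
The sketch_* helpers are hypothetical and exist only for illustration;
they are not kernel APIs.

```c
#include <stdint.h>

/* Same encoding as the define added by this patch: word * 32 + bit. */
#define SKETCH_X86_FEATURE_X2AVIC (15 * 32 + 18)

/* Index of the 32-bit capability word that holds the flag. */
static inline unsigned int sketch_feature_word(unsigned int feature)
{
	return feature / 32;
}

/* Bit position of the flag inside that word. */
static inline unsigned int sketch_feature_bit(unsigned int feature)
{
	return feature % 32;
}

/* Test the flag against a raw register value (here: EDX of 0x8000000A). */
static inline int sketch_has_feature(uint32_t reg, unsigned int feature)
{
	return (reg >> sketch_feature_bit(feature)) & 1u;
}
```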

Reviewed-by: Maxim Levitsky <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/include/asm/cpufeatures.h | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index 1d6826eac3e6..2721bd1e8e1e 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -343,6 +343,7 @@
#define X86_FEATURE_AVIC (15*32+13) /* Virtual Interrupt Controller */
#define X86_FEATURE_V_VMSAVE_VMLOAD (15*32+15) /* Virtual VMSAVE VMLOAD */
#define X86_FEATURE_VGIF (15*32+16) /* Virtual GIF */
+#define X86_FEATURE_X2AVIC (15*32+18) /* Virtual x2apic */
#define X86_FEATURE_V_SPEC_CTRL (15*32+20) /* Virtual SPEC_CTRL */
#define X86_FEATURE_SVME_ADDR_CHK (15*32+28) /* "" SVME addr check */

--
2.25.1


2022-05-23 07:08:39

by Suthikulpanit, Suravee

Subject: [PATCH v6 02/17] KVM: x86: lapic: Rename [GET/SET]_APIC_DEST_FIELD to [GET/SET]_XAPIC_DEST_FIELD

To signify that the macros support only 8-bit xAPIC destination IDs.
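
A small standalone sketch of why the distinction matters: in xAPIC mode
the destination occupies bits 31:24 of ICR2, so anything wider than
8 bits is lost, while in x2APIC mode the whole 32-bit value is the
destination. The macro bodies are copied from the patch; the
sketch_icr_dest() helper is hypothetical and only mirrors the
if/else pattern used by the callers.

```c
#include <stdbool.h>
#include <stdint.h>

/* Macro bodies as in apicdef.h after the rename. */
#define GET_XAPIC_DEST_FIELD(x) (((x) >> 24) & 0xFF)
#define SET_XAPIC_DEST_FIELD(x) ((x) << 24)

/*
 * Extract the destination ID from the high half of the ICR.
 * x2APIC: the full 32 bits are the destination.
 * xAPIC:  only bits 31:24 are, hence the 8-bit limit.
 */
static uint32_t sketch_icr_dest(uint32_t icr_high, bool x2apic_mode)
{
	return x2apic_mode ? icr_high : GET_XAPIC_DEST_FIELD(icr_high);
}
```

Passing a 9-bit ID such as 0x1ff through SET_XAPIC_DEST_FIELD() on a
32-bit unsigned value silently drops the top bit, which is the
limitation the new names are meant to make visible.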

Suggested-by: Maxim Levitsky <[email protected]>
Reviewed-by: Maxim Levitsky <[email protected]>
Reviewed-by: Pankaj Gupta <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/hyperv/hv_apic.c | 2 +-
arch/x86/include/asm/apicdef.h | 4 ++--
arch/x86/kernel/apic/apic.c | 2 +-
arch/x86/kernel/apic/ipi.c | 2 +-
arch/x86/kvm/lapic.c | 2 +-
arch/x86/kvm/svm/avic.c | 4 ++--
6 files changed, 8 insertions(+), 8 deletions(-)

diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c
index db2d92fb44da..fb8b2c088681 100644
--- a/arch/x86/hyperv/hv_apic.c
+++ b/arch/x86/hyperv/hv_apic.c
@@ -46,7 +46,7 @@ static void hv_apic_icr_write(u32 low, u32 id)
{
u64 reg_val;

- reg_val = SET_APIC_DEST_FIELD(id);
+ reg_val = SET_XAPIC_DEST_FIELD(id);
reg_val = reg_val << 32;
reg_val |= low;

diff --git a/arch/x86/include/asm/apicdef.h b/arch/x86/include/asm/apicdef.h
index 5716f22f81ac..863c2cad5872 100644
--- a/arch/x86/include/asm/apicdef.h
+++ b/arch/x86/include/asm/apicdef.h
@@ -89,8 +89,8 @@
#define APIC_DM_EXTINT 0x00700
#define APIC_VECTOR_MASK 0x000FF
#define APIC_ICR2 0x310
-#define GET_APIC_DEST_FIELD(x) (((x) >> 24) & 0xFF)
-#define SET_APIC_DEST_FIELD(x) ((x) << 24)
+#define GET_XAPIC_DEST_FIELD(x) (((x) >> 24) & 0xFF)
+#define SET_XAPIC_DEST_FIELD(x) ((x) << 24)
#define APIC_LVTT 0x320
#define APIC_LVTTHMR 0x330
#define APIC_LVTPC 0x340
diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index b70344bf6600..e6b754e43ed7 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -275,7 +275,7 @@ void native_apic_icr_write(u32 low, u32 id)
unsigned long flags;

local_irq_save(flags);
- apic_write(APIC_ICR2, SET_APIC_DEST_FIELD(id));
+ apic_write(APIC_ICR2, SET_XAPIC_DEST_FIELD(id));
apic_write(APIC_ICR, low);
local_irq_restore(flags);
}
diff --git a/arch/x86/kernel/apic/ipi.c b/arch/x86/kernel/apic/ipi.c
index d1fb874fbe64..2a6509e8c840 100644
--- a/arch/x86/kernel/apic/ipi.c
+++ b/arch/x86/kernel/apic/ipi.c
@@ -99,7 +99,7 @@ void native_send_call_func_ipi(const struct cpumask *mask)

static inline int __prepare_ICR2(unsigned int mask)
{
- return SET_APIC_DEST_FIELD(mask);
+ return SET_XAPIC_DEST_FIELD(mask);
}

static inline void __xapic_wait_icr_idle(void)
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index 137c3a2f5180..8b8c4a905976 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -1326,7 +1326,7 @@ void kvm_apic_send_ipi(struct kvm_lapic *apic, u32 icr_low, u32 icr_high)
if (apic_x2apic_mode(apic))
irq.dest_id = icr_high;
else
- irq.dest_id = GET_APIC_DEST_FIELD(icr_high);
+ irq.dest_id = GET_XAPIC_DEST_FIELD(icr_high);

trace_kvm_apic_ipi(icr_low, irq.dest_id);

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 54fe03714f8a..a8f514212b87 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -328,7 +328,7 @@ static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source
if (apic_x2apic_mode(vcpu->arch.apic))
dest = icrh;
else
- dest = GET_APIC_DEST_FIELD(icrh);
+ dest = GET_XAPIC_DEST_FIELD(icrh);

/*
* Try matching the destination APIC ID with the vCPU.
@@ -364,7 +364,7 @@ static void avic_kick_target_vcpus(struct kvm *kvm, struct kvm_lapic *source,
*/
kvm_for_each_vcpu(i, vcpu, kvm) {
if (kvm_apic_match_dest(vcpu, source, icrl & APIC_SHORT_MASK,
- GET_APIC_DEST_FIELD(icrh),
+ GET_XAPIC_DEST_FIELD(icrh),
icrl & APIC_DEST_MASK)) {
vcpu->arch.apic->irr_pending = true;
svm_complete_interrupt_delivery(vcpu,
--
2.25.1


2022-06-07 09:06:35

by Jim Mattson

Subject: Re: [PATCH v6 00/17] Introducing AMD x2AVIC and hybrid-AVIC modes

On Thu, May 19, 2022 at 3:32 AM Suravee Suthikulpanit
<[email protected]> wrote:
>
> Introducing support for AMD x2APIC virtualization. This feature is
> indicated by the CPUID Fn8000_000A EDX[14], and it can be activated
> by setting bit 31 (enable AVIC) and bit 30 (x2APIC mode) of VMCB
> offset 60h.
>
> With x2AVIC support, the guest local APIC can be fully virtualized in
> both xAPIC and x2APIC modes, and the mode can be changed during runtime.
> For example, when AVIC is enabled, the hypervisor set VMCB bit 31
> to activate AVIC for each vCPU. Then, it keeps track of each vCPU's
> APIC mode, and updates VMCB bit 30 to enable/disable x2APIC
> virtualization mode accordingly.
>
> Besides setting bit VMCB bit 30 and 31, for x2AVIC, kvm_amd driver needs
> to disable interception for the x2APIC MSR range to allow AVIC hardware
> to virtualize register accesses.
>
> This series also introduce a partial APIC virtualization (hybrid-AVIC)
> mode, where APIC register accesses are trapped (i.e. not virtualized
> by hardware), but leverage AVIC doorbell for interrupt injection.
> This eliminates need to disable x2APIC in the guest on system without
> x2AVIC support. (Note: suggested by Maxim)
>
> Testing for v5:
> * Test partial AVIC mode by launching a VM with x2APIC mode
> * Tested booting a Linux VM with x2APIC physical and logical modes upto 512 vCPUs.
> * Test the following nested SVM test use cases:
>
> L0 | L1 | L2
> ----------------------------------
> AVIC | APIC | APIC
> AVIC | APIC | x2APIC
> hybrid-AVIC | x2APIC | APIC
> hybrid-AVIC | x2APIC | x2APIC
> x2AVIC | APIC | APIC
> x2AVIC | APIC | x2APIC
> x2AVIC | x2APIC | APIC
> x2AVIC | x2APIC | x2APIC
>
> Changes from v5:
> (https://lore.kernel.org/lkml/[email protected]/T/#t)
> * Re-order patch 16 to 10
> * Patch 11: Update commit message
>
> Changes from v4:
> (https://lore.kernel.org/lkml/[email protected]/T/)
> * Patch 3: Move enum_avic_modes definition to svm.h
> * Patch 10: Rename avic_set_x2apic_msr_interception to
> svm_set_x2apic_msr_interception and move it to svm.c
> to simplify the struct svm_direct_access_msrs declaration.
> * Patch 16: New from Maxim
> * Patch 17: New from Maxim
>
> Best Regards,
> Suravee
>
> Maxim Levitsky (2):
> KVM: x86: nSVM: always intercept x2apic msrs
> KVM: x86: nSVM: optimize svm_set_x2apic_msr_interception
>
> Suravee Suthikulpanit (15):
> x86/cpufeatures: Introduce x2AVIC CPUID bit
> KVM: x86: lapic: Rename [GET/SET]_APIC_DEST_FIELD to
> [GET/SET]_XAPIC_DEST_FIELD
> KVM: SVM: Detect X2APIC virtualization (x2AVIC) support
> KVM: SVM: Update max number of vCPUs supported for x2AVIC mode
> KVM: SVM: Update avic_kick_target_vcpus to support 32-bit APIC ID
> KVM: SVM: Do not support updating APIC ID when in x2APIC mode
> KVM: SVM: Adding support for configuring x2APIC MSRs interception
> KVM: x86: Deactivate APICv on vCPU with APIC disabled
> KVM: SVM: Refresh AVIC configuration when changing APIC mode
> KVM: SVM: Introduce logic to (de)activate x2AVIC mode
> KVM: SVM: Do not throw warning when calling avic_vcpu_load on a
> running vcpu
> KVM: SVM: Introduce hybrid-AVIC mode
> KVM: x86: Warning APICv inconsistency only when vcpu APIC mode is
> valid
> KVM: SVM: Use target APIC ID to complete x2AVIC IRQs when possible
> KVM: SVM: Add AVIC doorbell tracepoint
>
> arch/x86/hyperv/hv_apic.c | 2 +-
> arch/x86/include/asm/apicdef.h | 4 +-
> arch/x86/include/asm/cpufeatures.h | 1 +
> arch/x86/include/asm/kvm_host.h | 1 -
> arch/x86/include/asm/svm.h | 16 ++-
> arch/x86/kernel/apic/apic.c | 2 +-
> arch/x86/kernel/apic/ipi.c | 2 +-
> arch/x86/kvm/lapic.c | 6 +-
> arch/x86/kvm/svm/avic.c | 178 ++++++++++++++++++++++++++---
> arch/x86/kvm/svm/nested.c | 5 +
> arch/x86/kvm/svm/svm.c | 75 ++++++++----
> arch/x86/kvm/svm/svm.h | 25 +++-
> arch/x86/kvm/trace.h | 18 +++
> arch/x86/kvm/x86.c | 8 +-
> 14 files changed, 291 insertions(+), 52 deletions(-)
>
> --
> 2.25.1

When will we see this feature in silicon?

Where is the official documentation?

2022-06-24 16:26:32

by Paolo Bonzini

Subject: Re: [PATCH v6 01/17] x86/cpufeatures: Introduce x2AVIC CPUID bit

On 5/19/22 12:26, Suravee Suthikulpanit wrote:
> Introduce a new feature bit for virtualized x2APIC (x2AVIC) in
> CPUID_Fn8000000A_EDX [SVM Revision and Feature Identification].
>
> Reviewed-by: Maxim Levitsky <[email protected]>
> Signed-off-by: Suravee Suthikulpanit <[email protected]>
> ---
> arch/x86/include/asm/cpufeatures.h | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> index 1d6826eac3e6..2721bd1e8e1e 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -343,6 +343,7 @@
> #define X86_FEATURE_AVIC (15*32+13) /* Virtual Interrupt Controller */
> #define X86_FEATURE_V_VMSAVE_VMLOAD (15*32+15) /* Virtual VMSAVE VMLOAD */
> #define X86_FEATURE_VGIF (15*32+16) /* Virtual GIF */
> +#define X86_FEATURE_X2AVIC (15*32+18) /* Virtual x2apic */
> #define X86_FEATURE_V_SPEC_CTRL (15*32+20) /* Virtual SPEC_CTRL */
> #define X86_FEATURE_SVME_ADDR_CHK (15*32+28) /* "" SVME addr check */
>

Reviewed-by: Paolo Bonzini <[email protected]>

2022-06-24 16:47:21

by Paolo Bonzini

Subject: Re: [PATCH v6 02/17] KVM: x86: lapic: Rename [GET/SET]_APIC_DEST_FIELD to [GET/SET]_XAPIC_DEST_FIELD

On 5/19/22 12:26, Suravee Suthikulpanit wrote:
> To signify that the macros only support 8-bit xAPIC destination ID.
>
> Suggested-by: Maxim Levitsky <[email protected]>
> Reviewed-by: Maxim Levitsky <[email protected]>
> Reviewed-by: Pankaj Gupta <[email protected]>
> Signed-off-by: Suravee Suthikulpanit <[email protected]>
> ---
> arch/x86/hyperv/hv_apic.c | 2 +-
> arch/x86/include/asm/apicdef.h | 4 ++--
> arch/x86/kernel/apic/apic.c | 2 +-
> arch/x86/kernel/apic/ipi.c | 2 +-
> arch/x86/kvm/lapic.c | 2 +-
> arch/x86/kvm/svm/avic.c | 4 ++--
> 6 files changed, 8 insertions(+), 8 deletions(-)
>
> diff --git a/arch/x86/hyperv/hv_apic.c b/arch/x86/hyperv/hv_apic.c
> index db2d92fb44da..fb8b2c088681 100644
> --- a/arch/x86/hyperv/hv_apic.c
> +++ b/arch/x86/hyperv/hv_apic.c
> @@ -46,7 +46,7 @@ static void hv_apic_icr_write(u32 low, u32 id)
> {
> u64 reg_val;
>
> - reg_val = SET_APIC_DEST_FIELD(id);
> + reg_val = SET_XAPIC_DEST_FIELD(id);
> reg_val = reg_val << 32;
> reg_val |= low;
>
> diff --git a/arch/x86/include/asm/apicdef.h b/arch/x86/include/asm/apicdef.h
> index 5716f22f81ac..863c2cad5872 100644
> --- a/arch/x86/include/asm/apicdef.h
> +++ b/arch/x86/include/asm/apicdef.h
> @@ -89,8 +89,8 @@
> #define APIC_DM_EXTINT 0x00700
> #define APIC_VECTOR_MASK 0x000FF
> #define APIC_ICR2 0x310
> -#define GET_APIC_DEST_FIELD(x) (((x) >> 24) & 0xFF)
> -#define SET_APIC_DEST_FIELD(x) ((x) << 24)
> +#define GET_XAPIC_DEST_FIELD(x) (((x) >> 24) & 0xFF)
> +#define SET_XAPIC_DEST_FIELD(x) ((x) << 24)
> #define APIC_LVTT 0x320
> #define APIC_LVTTHMR 0x330
> #define APIC_LVTPC 0x340
> diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
> index b70344bf6600..e6b754e43ed7 100644
> --- a/arch/x86/kernel/apic/apic.c
> +++ b/arch/x86/kernel/apic/apic.c
> @@ -275,7 +275,7 @@ void native_apic_icr_write(u32 low, u32 id)
> unsigned long flags;
>
> local_irq_save(flags);
> - apic_write(APIC_ICR2, SET_APIC_DEST_FIELD(id));
> + apic_write(APIC_ICR2, SET_XAPIC_DEST_FIELD(id));
> apic_write(APIC_ICR, low);
> local_irq_restore(flags);
> }
> diff --git a/arch/x86/kernel/apic/ipi.c b/arch/x86/kernel/apic/ipi.c
> index d1fb874fbe64..2a6509e8c840 100644
> --- a/arch/x86/kernel/apic/ipi.c
> +++ b/arch/x86/kernel/apic/ipi.c
> @@ -99,7 +99,7 @@ void native_send_call_func_ipi(const struct cpumask *mask)
>
> static inline int __prepare_ICR2(unsigned int mask)
> {
> - return SET_APIC_DEST_FIELD(mask);
> + return SET_XAPIC_DEST_FIELD(mask);
> }
>
> static inline void __xapic_wait_icr_idle(void)
> diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
> index 137c3a2f5180..8b8c4a905976 100644
> --- a/arch/x86/kvm/lapic.c
> +++ b/arch/x86/kvm/lapic.c
> @@ -1326,7 +1326,7 @@ void kvm_apic_send_ipi(struct kvm_lapic *apic, u32 icr_low, u32 icr_high)
> if (apic_x2apic_mode(apic))
> irq.dest_id = icr_high;
> else
> - irq.dest_id = GET_APIC_DEST_FIELD(icr_high);
> + irq.dest_id = GET_XAPIC_DEST_FIELD(icr_high);
>
> trace_kvm_apic_ipi(icr_low, irq.dest_id);
>
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index 54fe03714f8a..a8f514212b87 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -328,7 +328,7 @@ static int avic_kick_target_vcpus_fast(struct kvm *kvm, struct kvm_lapic *source
> if (apic_x2apic_mode(vcpu->arch.apic))
> dest = icrh;
> else
> - dest = GET_APIC_DEST_FIELD(icrh);
> + dest = GET_XAPIC_DEST_FIELD(icrh);
>
> /*
> * Try matching the destination APIC ID with the vCPU.
> @@ -364,7 +364,7 @@ static void avic_kick_target_vcpus(struct kvm *kvm, struct kvm_lapic *source,
> */
> kvm_for_each_vcpu(i, vcpu, kvm) {
> if (kvm_apic_match_dest(vcpu, source, icrl & APIC_SHORT_MASK,
> - GET_APIC_DEST_FIELD(icrh),
> + GET_XAPIC_DEST_FIELD(icrh),
> icrl & APIC_DEST_MASK)) {
> vcpu->arch.apic->irr_pending = true;
> svm_complete_interrupt_delivery(vcpu,

Reviewed-by: Paolo Bonzini <[email protected]>

2022-06-24 16:51:26

by Paolo Bonzini

Subject: Re: [PATCH v6 15/17] KVM: SVM: Use target APIC ID to complete x2AVIC IRQs when possible

On 5/19/22 12:27, Suravee Suthikulpanit wrote:
> + * If the x2APIC logical ID sub-field (i.e. icrh[15:0]) contains zero
> + * or more than 1 bits, we cannot match just one vcpu to kick for
> + * fast path.
> + */
> + if (!first || (first != last))
> + return -EINVAL;
> +
> + apic = first - 1;
> + if ((apic < 0) || (apic > 15) || (cluster >= 0xfffff))
> + return -EINVAL;

Neither of these is possible: first == 0 has been checked above, and
ffs(icrh & 0xffff) - 1 cannot exceed 15. Likewise, cluster is actually
limited to 16 bits, not 20.

Plus, C is not Pascal so no parentheses. :)

Putting everything together, it can be simplified to this:

+ int cluster = (icrh & 0xffff0000) >> 16;
+ int apic = ffs(icrh & 0xffff) - 1;
+
+ /*
+ * If the x2APIC logical ID sub-field (i.e. icrh[15:0])
+ * contains anything but a single bit, we cannot use the
+ * fast path, because it is limited to a single vCPU.
+ */
+ if (apic < 0 || icrh != (1 << apic))
+ return -EINVAL;
+
+ l1_physical_id = (cluster << 4) + apic;


> + apic_id = (cluster << 4) + apic;
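
For reference, here is a standalone user-space sketch of the same check.
It assumes the single-bit comparison is meant to apply only to
icrh[15:0], as the comment says, so it masks off the cluster bits before
comparing. The function name and the use of ffs() from <strings.h> are
choices made for this illustration, not part of the patch.

```c
#include <errno.h>
#include <stdint.h>
#include <strings.h> /* ffs() */

/*
 * Map an x2APIC logical destination (cluster in bits 31:16, one-hot
 * bitmap in bits 15:0) to a single APIC ID. Returns -EINVAL when the
 * bitmap has zero bits or more than one bit set, because then the
 * destination does not name exactly one vCPU.
 */
static int sketch_x2apic_logical_to_id(uint32_t icrh)
{
	uint32_t cluster = icrh >> 16;
	uint32_t bitmap = icrh & 0xffff;
	int apic = ffs((int)bitmap) - 1; /* -1 when bitmap is empty */

	if (apic < 0 || bitmap != (1u << apic))
		return -EINVAL;

	return (int)((cluster << 4) + (uint32_t)apic);
}
```

Comparing the bitmap against (1u << apic) covers both failure cases in
one expression: an empty bitmap is caught by apic < 0, and a bitmap with
extra bits set differs from its lowest set bit.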

2022-06-24 17:27:44

by Paolo Bonzini

Subject: Re: [PATCH v6 00/17] Introducing AMD x2AVIC and hybrid-AVIC modes

On 5/19/22 12:26, Suravee Suthikulpanit wrote:
> Introducing support for AMD x2APIC virtualization. This feature is
> indicated by the CPUID Fn8000_000A EDX[14], and it can be activated
> by setting bit 31 (enable AVIC) and bit 30 (x2APIC mode) of VMCB
> offset 60h.
>
> With x2AVIC support, the guest local APIC can be fully virtualized in
> both xAPIC and x2APIC modes, and the mode can be changed during runtime.
> For example, when AVIC is enabled, the hypervisor set VMCB bit 31
> to activate AVIC for each vCPU. Then, it keeps track of each vCPU's
> APIC mode, and updates VMCB bit 30 to enable/disable x2APIC
> virtualization mode accordingly.
>
> Besides setting bit VMCB bit 30 and 31, for x2AVIC, kvm_amd driver needs
> to disable interception for the x2APIC MSR range to allow AVIC hardware
> to virtualize register accesses.
>
> This series also introduce a partial APIC virtualization (hybrid-AVIC)
> mode, where APIC register accesses are trapped (i.e. not virtualized
> by hardware), but leverage AVIC doorbell for interrupt injection.
> This eliminates need to disable x2APIC in the guest on system without
> x2AVIC support. (Note: suggested by Maxim)
>
> Testing for v5:
> * Test partial AVIC mode by launching a VM with x2APIC mode
> * Tested booting a Linux VM with x2APIC physical and logical modes upto 512 vCPUs.
> * Test the following nested SVM test use cases:
>
> L0 | L1 | L2
> ----------------------------------
> AVIC | APIC | APIC
> AVIC | APIC | x2APIC
> hybrid-AVIC | x2APIC | APIC
> hybrid-AVIC | x2APIC | x2APIC
> x2AVIC | APIC | APIC
> x2AVIC | APIC | x2APIC
> x2AVIC | x2APIC | APIC
> x2AVIC | x2APIC | x2APIC
>
> Changes from v5:
> (https://lore.kernel.org/lkml/[email protected]/T/#t)
> * Re-order patch 16 to 10
> * Patch 11: Update commit message
>
> Changes from v4:
> (https://lore.kernel.org/lkml/[email protected]/T/)
> * Patch 3: Move enum_avic_modes definition to svm.h
> * Patch 10: Rename avic_set_x2apic_msr_interception to
> svm_set_x2apic_msr_interception and move it to svm.c
> to simplify the struct svm_direct_access_msrs declaration.
> * Patch 16: New from Maxim
> * Patch 17: New from Maxim
>
> Best Regards,
> Suravee
>
> Maxim Levitsky (2):
> KVM: x86: nSVM: always intercept x2apic msrs
> KVM: x86: nSVM: optimize svm_set_x2apic_msr_interception
>
> Suravee Suthikulpanit (15):
> x86/cpufeatures: Introduce x2AVIC CPUID bit
> KVM: x86: lapic: Rename [GET/SET]_APIC_DEST_FIELD to
> [GET/SET]_XAPIC_DEST_FIELD
> KVM: SVM: Detect X2APIC virtualization (x2AVIC) support
> KVM: SVM: Update max number of vCPUs supported for x2AVIC mode
> KVM: SVM: Update avic_kick_target_vcpus to support 32-bit APIC ID
> KVM: SVM: Do not support updating APIC ID when in x2APIC mode
> KVM: SVM: Adding support for configuring x2APIC MSRs interception
> KVM: x86: Deactivate APICv on vCPU with APIC disabled
> KVM: SVM: Refresh AVIC configuration when changing APIC mode
> KVM: SVM: Introduce logic to (de)activate x2AVIC mode
> KVM: SVM: Do not throw warning when calling avic_vcpu_load on a
> running vcpu
> KVM: SVM: Introduce hybrid-AVIC mode
> KVM: x86: Warning APICv inconsistency only when vcpu APIC mode is
> valid
> KVM: SVM: Use target APIC ID to complete x2AVIC IRQs when possible
> KVM: SVM: Add AVIC doorbell tracepoint
>
> arch/x86/hyperv/hv_apic.c | 2 +-
> arch/x86/include/asm/apicdef.h | 4 +-
> arch/x86/include/asm/cpufeatures.h | 1 +
> arch/x86/include/asm/kvm_host.h | 1 -
> arch/x86/include/asm/svm.h | 16 ++-
> arch/x86/kernel/apic/apic.c | 2 +-
> arch/x86/kernel/apic/ipi.c | 2 +-
> arch/x86/kvm/lapic.c | 6 +-
> arch/x86/kvm/svm/avic.c | 178 ++++++++++++++++++++++++++---
> arch/x86/kvm/svm/nested.c | 5 +
> arch/x86/kvm/svm/svm.c | 75 ++++++++----
> arch/x86/kvm/svm/svm.h | 25 +++-
> arch/x86/kvm/trace.h | 18 +++
> arch/x86/kvm/x86.c | 8 +-
> 14 files changed, 291 insertions(+), 52 deletions(-)
>

I haven't quite finished reviewing this, but it passes both
kvm-unit-tests and selftests so I pushed it to kvm/queue.

Paolo

2022-06-27 23:21:11

by Maxim Levitsky

Subject: Re: [PATCH v6 15/17] KVM: SVM: Use target APIC ID to complete x2AVIC IRQs when possible

On Fri, 2022-06-24 at 18:41 +0200, Paolo Bonzini wrote:
> On 5/19/22 12:27, Suravee Suthikulpanit wrote:
> > + * If the x2APIC logical ID sub-field (i.e. icrh[15:0]) contains zero
> > + * or more than 1 bits, we cannot match just one vcpu to kick for
> > + * fast path.
> > + */
> > + if (!first || (first != last))
> > + return -EINVAL;
> > +
> > + apic = first - 1;
> > + if ((apic < 0) || (apic > 15) || (cluster >= 0xfffff))
> > + return -EINVAL;
>
> Neither of these is possible: first == 0 has been cheked above, and
> ffs(icrh & 0xffff) cannot exceed 15. Likewise, cluster is actually
> limited to 16 bits, not 20.
>
> Plus, C is not Pascal so no parentheses. :)
>
> Putting everything together, it can be simplified to this:
>
> + int cluster = (icrh & 0xffff0000) >> 16;
> + int apic = ffs(icrh & 0xffff) - 1;
> +
> + /*
> + * If the x2APIC logical ID sub-field (i.e. icrh[15:0])
> + * contains anything but a single bit, we cannot use the
> + * fast path, because it is limited to a single vCPU.
> + */
> + if (apic < 0 || icrh != (1 << apic))
> + return -EINVAL;
> +
> + l1_physical_id = (cluster << 4) + apic;
>
>
> > + apic_id = (cluster << 4) + apic;

Hi Paolo and Suravee Suthikulpanit!

Note that this patch is not needed anymore: I fixed the avic_kick_target_vcpus_fast function
and added support for x2apic, because it was very easy to do
(I already needed to parse the logical ID for flat and cluster modes).

Best regards,
Maxim Levitsky

2022-06-28 02:58:06

by Suthikulpanit, Suravee

Subject: Re: [PATCH v6 15/17] KVM: SVM: Use target APIC ID to complete x2AVIC IRQs when possible



On 6/28/2022 5:55 AM, Maxim Levitsky wrote:
> On Fri, 2022-06-24 at 18:41 +0200, Paolo Bonzini wrote:
>> On 5/19/22 12:27, Suravee Suthikulpanit wrote:
>>> + * If the x2APIC logical ID sub-field (i.e. icrh[15:0]) contains zero
>>> + * or more than 1 bits, we cannot match just one vcpu to kick for
>>> + * fast path.
>>> + */
>>> + if (!first || (first != last))
>>> + return -EINVAL;
>>> +
>>> + apic = first - 1;
>>> + if ((apic < 0) || (apic > 15) || (cluster >= 0xfffff))
>>> + return -EINVAL;
>>
>> Neither of these is possible: first == 0 has been cheked above, and
>> ffs(icrh & 0xffff) cannot exceed 15. Likewise, cluster is actually
>> limited to 16 bits, not 20.
>>
>> Plus, C is not Pascal so no parentheses. :)
>>
>> Putting everything together, it can be simplified to this:
>>
>> + int cluster = (icrh & 0xffff0000) >> 16;
>> + int apic = ffs(icrh & 0xffff) - 1;
>> +
>> + /*
>> + * If the x2APIC logical ID sub-field (i.e. icrh[15:0])
>> + * contains anything but a single bit, we cannot use the
>> + * fast path, because it is limited to a single vCPU.
>> + */
>> + if (apic < 0 || icrh != (1 << apic))
>> + return -EINVAL;
>> +
>> + l1_physical_id = (cluster << 4) + apic;
>>
>>
>>> + apic_id = (cluster << 4) + apic;
>
> Hi Paolo and Suravee Suthikulpanit!
>
> Note that this patch is not needed anymore, I fixed the avic_kick_target_vcpus_fast function,
> and added the support for x2apic because it was very easy to do
> (I already needed to parse logical id for flat and cluser modes)
>
> Best regards,
> Maxim Levitsky
>

Understood. I was about to send v7 to remove this patch from the series, but it is too late. I'll test the current queue branch and provide an update.

Best Regards,
Suravee

2022-06-28 09:40:40

by Maxim Levitsky

Subject: Re: [PATCH v6 15/17] KVM: SVM: Use target APIC ID to complete x2AVIC IRQs when possible

On Tue, 2022-06-28 at 09:35 +0700, Suthikulpanit, Suravee wrote:
>
> On 6/28/2022 5:55 AM, Maxim Levitsky wrote:
> > On Fri, 2022-06-24 at 18:41 +0200, Paolo Bonzini wrote:
> > > On 5/19/22 12:27, Suravee Suthikulpanit wrote:
> > > > + * If the x2APIC logical ID sub-field (i.e. icrh[15:0]) contains zero
> > > > + * or more than 1 bits, we cannot match just one vcpu to kick for
> > > > + * fast path.
> > > > + */
> > > > + if (!first || (first != last))
> > > > + return -EINVAL;
> > > > +
> > > > + apic = first - 1;
> > > > + if ((apic < 0) || (apic > 15) || (cluster >= 0xfffff))
> > > > + return -EINVAL;
> > >
> > > Neither of these is possible: first == 0 has been cheked above, and
> > > ffs(icrh & 0xffff) cannot exceed 15. Likewise, cluster is actually
> > > limited to 16 bits, not 20.
> > >
> > > Plus, C is not Pascal so no parentheses. :)
> > >
> > > Putting everything together, it can be simplified to this:
> > >
> > > + int cluster = (icrh & 0xffff0000) >> 16;
> > > + int apic = ffs(icrh & 0xffff) - 1;
> > > +
> > > + /*
> > > + * If the x2APIC logical ID sub-field (i.e. icrh[15:0])
> > > + * contains anything but a single bit, we cannot use the
> > > + * fast path, because it is limited to a single vCPU.
> > > + */
> > > + if (apic < 0 || icrh != (1 << apic))
> > > + return -EINVAL;
> > > +
> > > + l1_physical_id = (cluster << 4) + apic;
> > >
> > >
> > > > + apic_id = (cluster << 4) + apic;
> >
> > Hi Paolo and Suravee Suthikulpanit!
> >
> > Note that this patch is not needed anymore, I fixed the avic_kick_target_vcpus_fast function,
> > and added the support for x2apic because it was very easy to do
> > (I already needed to parse logical id for flat and cluser modes)
> >
> > Best regards,
> > Maxim Levitsky
> >
>
> Understood. I was about to send v7 to remove this patch from the series, but too late. I'll test the current queue branch and provide update.

Also, this really needs a KVM unit test, to avoid breaking corner cases like
sending an IPI to the 0xFF address, which was the reason I had to fix
avic_kick_target_vcpus_fast.

We do have an 'apic' test in kvm-unit-tests,
and I was already looking to extend it to cover more cases and to run it with
AVIC-compatible settings. I hope I will be able to do this this week.

Best regards,
Maxim Levitsky


>
> Best Regards,
> Suravee
>


2022-06-28 12:47:02

by Suthikulpanit, Suravee

Subject: Re: [PATCH v6 15/17] KVM: SVM: Use target APIC ID to complete x2AVIC IRQs when possible

Paolo / Maxim

On 6/28/2022 3:59 PM, Maxim Levitsky wrote:
>>> Hi Paolo and Suravee Suthikulpanit!
>>>
>>> Note that this patch is not needed anymore, I fixed the avic_kick_target_vcpus_fast function,
>>> and added the support for x2apic because it was very easy to do
>>> (I already needed to parse logical id for flat and cluster modes)
>>>
>>> Best regards,
>>> Maxim Levitsky
>>>
>> Understood. I was about to send v7 to remove this patch from the series, but too late. I'll test the current queue branch and provide update.
> Also this really needs a KVM unit test, to avoid breaking corner cases like
> sending IPI to 0xFF address, which was the reason I had to fix the
> avic_kick_target_vcpus_fast.
>
> We do have 'apic' test in kvm unit tests,
> and I was already looking to extend it to cover more cases and to run it with AVIC's
> compatible settings. I hope I will be able to do this this week.

Thanks. Would you please CC me as well once ready?

>
> Best regards,
> Maxim Levitsky

I have also submitted a patch to fix commit 603ccef42ce9 ("KVM: x86: SVM: fix avic_kick_target_vcpus_fast"),
which was queued previously.

Best Regards,
Suravee Suthikulpanit

2022-06-28 13:18:07

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [PATCH v6 15/17] KVM: SVM: Use target APIC ID to complete x2AVIC IRQs when possible

On Tue, 2022-06-28 at 19:36 +0700, Suthikulpanit, Suravee wrote:
> Paolo / Maxim
>
> On 6/28/2022 3:59 PM, Maxim Levitsky wrote:
> > > > Hi Paolo and Suravee Suthikulpanit!
> > > >
> > > > Note that this patch is not needed anymore, I fixed the avic_kick_target_vcpus_fast function,
> > > > and added the support for x2apic because it was very easy to do
> > > > (I already needed to parse logical id for flat and cluster modes)
> > > >
> > > > Best regards,
> > > > Maxim Levitsky
> > > >
> > > Understood. I was about to send v7 to remove this patch from the series, but too late. I'll test the current queue branch and provide update.
> > Also this really needs a KVM unit test, to avoid breaking corner cases like
> > sending IPI to 0xFF address, which was the reason I had to fix the
> > avic_kick_target_vcpus_fast.
> >
> > We do have 'apic' test in kvm unit tests,
> > and I was already looking to extend it to cover more cases and to run it with AVIC's
> > compatible settings. I hope I will be able to do this this week.
>
> Thanks. Would you please CC me as well once ready?

Of course!
>
> > Best regards,
> > Maxim Levitsky
>
> I have also submitted a patch to fix the 603ccef42ce9 ("KVM: x86: SVM: fix avic_kick_target_vcpus_fast"),
> which was queued previously.

Thank you very much!

Best regards,
Maxim Levitsky


>
> Best Regards,
> Suravee Suthikulpanit
>


2022-06-28 13:30:26

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: Re: [PATCH v6 00/17] Introducing AMD x2AVIC and hybrid-AVIC modes

Maxim,

On 5/19/2022 5:26 PM, Suravee Suthikulpanit wrote:
> Introducing support for AMD x2APIC virtualization. This feature is
> indicated by the CPUID Fn8000_000A EDX[14], and it can be activated
> by setting bit 31 (enable AVIC) and bit 30 (x2APIC mode) of VMCB
> offset 60h.
>
> With x2AVIC support, the guest local APIC can be fully virtualized in
> both xAPIC and x2APIC modes, and the mode can be changed during runtime.
> For example, when AVIC is enabled, the hypervisor set VMCB bit 31
> to activate AVIC for each vCPU. Then, it keeps track of each vCPU's
> APIC mode, and updates VMCB bit 30 to enable/disable x2APIC
> virtualization mode accordingly.
>
> Besides setting bit VMCB bit 30 and 31, for x2AVIC, kvm_amd driver needs
> to disable interception for the x2APIC MSR range to allow AVIC hardware
> to virtualize register accesses.
>
> This series also introduce a partial APIC virtualization (hybrid-AVIC)
> mode, where APIC register accesses are trapped (i.e. not virtualized
> by hardware), but leverage AVIC doorbell for interrupt injection.
> This eliminates need to disable x2APIC in the guest on system without
> x2AVIC support. (Note: suggested by Maxim)
>
> Testing for v5:
> * Test partial AVIC mode by launching a VM with x2APIC mode
> * Tested booting a Linux VM with x2APIC physical and logical modes upto 512 vCPUs.
> * Test the following nested SVM test use cases:
>
> L0 | L1 | L2
> ----------------------------------
> AVIC | APIC | APIC
> AVIC | APIC | x2APIC
> hybrid-AVIC | x2APIC | APIC
> hybrid-AVIC | x2APIC | x2APIC
> x2AVIC | APIC | APIC
> x2AVIC | APIC | x2APIC
> x2AVIC | x2APIC | APIC
> x2AVIC | x2APIC | x2APIC

With commit 3743c2f02517 ("KVM: x86: inhibit APICv/AVIC on changes to APIC ID or APIC base"),
APICv/AVIC is now inhibited when the guest kernel boots with the "nox2apic" or "x2apic_phys" option,
due to APICV_INHIBIT_REASON_APIC_ID_MODIFIED.

These cases used to work. In theory, we should be able to allow AVIC to work in these cases.
Is there a way to modify the logic in kvm_lapic_xapic_id_updated() to allow these use cases
to work with APICv/AVIC?

Best Regards,
Suravee

2022-06-28 13:45:21

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [PATCH v6 00/17] Introducing AMD x2AVIC and hybrid-AVIC modes

On Tue, 2022-06-28 at 20:20 +0700, Suthikulpanit, Suravee wrote:
> Maxim,
>
> On 5/19/2022 5:26 PM, Suravee Suthikulpanit wrote:
> > Introducing support for AMD x2APIC virtualization. This feature is
> > indicated by the CPUID Fn8000_000A EDX[14], and it can be activated
> > by setting bit 31 (enable AVIC) and bit 30 (x2APIC mode) of VMCB
> > offset 60h.
> >
> > With x2AVIC support, the guest local APIC can be fully virtualized in
> > both xAPIC and x2APIC modes, and the mode can be changed during runtime.
> > For example, when AVIC is enabled, the hypervisor set VMCB bit 31
> > to activate AVIC for each vCPU. Then, it keeps track of each vCPU's
> > APIC mode, and updates VMCB bit 30 to enable/disable x2APIC
> > virtualization mode accordingly.
> >
> > Besides setting bit VMCB bit 30 and 31, for x2AVIC, kvm_amd driver needs
> > to disable interception for the x2APIC MSR range to allow AVIC hardware
> > to virtualize register accesses.
> >
> > This series also introduce a partial APIC virtualization (hybrid-AVIC)
> > mode, where APIC register accesses are trapped (i.e. not virtualized
> > by hardware), but leverage AVIC doorbell for interrupt injection.
> > This eliminates need to disable x2APIC in the guest on system without
> > x2AVIC support. (Note: suggested by Maxim)
> >
> > Testing for v5:
> > * Test partial AVIC mode by launching a VM with x2APIC mode
> > * Tested booting a Linux VM with x2APIC physical and logical modes upto 512 vCPUs.
> > * Test the following nested SVM test use cases:
> >
> > L0 | L1 | L2
> > ----------------------------------
> > AVIC | APIC | APIC
> > AVIC | APIC | x2APIC
> > hybrid-AVIC | x2APIC | APIC
> > hybrid-AVIC | x2APIC | x2APIC
> > x2AVIC | APIC | APIC
> > x2AVIC | APIC | x2APIC
> > x2AVIC | x2APIC | APIC
> > x2AVIC | x2APIC | x2APIC
>
> With the commit 3743c2f02517 ("KVM: x86: inhibit APICv/AVIC on changes to APIC ID or APIC base"),
> APICV/AVIC is now inhibited when the guest kernel boots w/ option "nox2apic" or "x2apic_phys"
> due to APICV_INHIBIT_REASON_APIC_ID_MODIFIED.
>
> These cases used to work. In theory, we should be able to allow AVIC to work in this case.
> Is there a way to modify logic in kvm_lapic_xapic_id_updated() to allow these use cases
> to work w/ APICv/AVIC?
>
> Best Regards,
> Suravee
>

This seems very strange. I assume you tested today's kvm/queue,

which contains a fix for a typo I had in the list of inhibit reasons
(commit 5bdae49fc2f689b5f896b54bd9230425d3643dab - KVM: SEV: fix misplaced closing parenthesis)


Could you share more details on the test? How many vCPUs in the guest, is x2apic exposed to the guest?


Looking through the code, __x2apic_disable() touches MSR_IA32_APICBASE, so I would expect
the APICV_INHIBIT_REASON_APIC_BASE_MODIFIED inhibit to be triggered and not APICV_INHIBIT_REASON_APIC_ID_MODIFIED.


I don't yet see how x2apic_phys can trigger these inhibits.

Best regards,
Maxim Levitsky

2022-06-28 16:59:51

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: Re: [PATCH v6 00/17] Introducing AMD x2AVIC and hybrid-AVIC modes

Maxim,

On 6/28/2022 8:43 PM, Maxim Levitsky wrote:
>> With the commit 3743c2f02517 ("KVM: x86: inhibit APICv/AVIC on changes to APIC ID or APIC base"),
>> APICV/AVIC is now inhibited when the guest kernel boots w/ option "nox2apic" or "x2apic_phys"
>> due to APICV_INHIBIT_REASON_APIC_ID_MODIFIED.
>>
>> These cases used to work. In theory, we should be able to allow AVIC to work in this case.
>> Is there a way to modify logic in kvm_lapic_xapic_id_updated() to allow these use cases
>> to work w/ APICv/AVIC?
>>
>> Best Regards,
>> Suravee
>>
> This seems very strange, I assume you test the kvm/queue of today,

Yes

> which contains a fix for a typo I had in the list of inhibit reasons
> (commit 5bdae49fc2f689b5f896b54bd9230425d3643dab - KVM: SEV: fix misplaced closing parenthesis)

Yes

> Could you share more details on the test? How many vCPUs in the guest, is x2apic exposed to the guest?

The problem happens with 257 vCPUs or more (i.e. once there is a vCPU with ID 0x100).

> Looking through the code, the __x2apic_disable touches the MSR_IA32_APICBASE so I would expect
> the APICV_INHIBIT_REASON_APIC_BASE_MODIFIED inhibit to be triggered and not APICV_INHIBIT_REASON_APIC_ID_MODIFIED
>
>
> I don't see yet how the x2apic_phys can trigger these inhibits.

When I add a WARN_ON_ONCE() at the point where we set APICV_INHIBIT_REASON_APIC_ID_MODIFIED,
I get this call stack:

[ 105.470685] ------------[ cut here ]------------
[ 105.470686] WARNING: CPU: 279 PID: 11511 at arch/x86/kvm/lapic.c:2057 kvm_lapic_xapic_id_updated.cold+0x13/0x2f [kvm]
[ 105.470769] Modules linked in: kvm_amd kvm irqbypass nf_tables nfnetlink bridge stp llc squashfs loop vfat fat dm_multipath intel_rapl_msr intel_rapl_common amd64_edac edac_mce_amd wmi_bmof sg ipmi_ssif dm_mod acpi_ipmi ccp k10temp ipmi_si acpi_cpufreq sch_fq_codel ipmi_devintf ipmi_msghandler fuse ip_tables ext4 mbcache jbd2 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 linear ast i2c_algo_bit drm_vram_helper drm_ttm_helper ttm crct10dif_pclmul crc32_pclmul ses crc32c_intel drm_kms_helper enclosure ghash_clmulni_intel nvme sd_mod scsi_transport_sas syscopyarea aesni_intel sysfillrect crypto_simd nvme_core sysimgblt cryptd t10_pi fb_sys_fops tg3 uas crc64_rocksoft_generic i2c_designware_platform ptp crc64_rocksoft drm i2c_piix4 i2c_designware_core usb_storage pps_core crc64 wmi pinctrl_amd i2c_core
[ 105.470851] CPU: 279 PID: 11511 Comm: qemu-system-x86 Kdump: loaded Not tainted 5.19.0-rc1-kvm-queue-x2avic+ #38
[ 105.470856] Hardware name: AMD Corporation QUARTZ/QUARTZ, BIOS TQZ0080D 05/11/2022
[ 105.470858] RIP: 0010:kvm_lapic_xapic_id_updated.cold+0x13/0x2f [kvm]
[ 105.470906] Code: db 8f fd ff 48 c7 c7 8d e8 ca c0 e8 43 27 88 ce 31 c0 e9 f8 90 fd ff 48 c7 c6 00 6a ca c0 48 c7 c7 e5 e8 ca c0 e8 29 27 88 ce <0f> 0b 48 8b 83 90 00 00 00 ba 01 00 00 00 be 04 00 00 00 5b 48 8b
[ 105.470909] RSP: 0018:ffffb13a436d7d40 EFLAGS: 00010246
[ 105.470913] RAX: 0000000000000030 RBX: ffff9f0372c98400 RCX: 0000000000000000
[ 105.470916] RDX: 0000000000000000 RSI: ffffffff8fd59e05 RDI: 00000000ffffffff
[ 105.470918] RBP: ffffb13a436d7e40 R08: 0000000000000030 R09: 0000000000000002
[ 105.470920] R10: 000000000000000f R11: ffff9f21c5c2fc80 R12: ffff9f0372c64250
[ 105.470921] R13: ffff9f0372c64250 R14: 00007f9ac9ffa2f0 R15: ffff9f0344da7000
[ 105.470930] FS: 00007f9ac9ffb640(0000) GS:ffff9f118edc0000(0000) knlGS:0000000000000000
[ 105.470932] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 105.470934] CR2: 00007fa34c73c001 CR3: 00000001b71a2003 CR4: 0000000000770ee0
[ 105.470936] PKRU: 55555554
[ 105.470938] Call Trace:
[ 105.470942] <TASK>
[ 105.470945] kvm_apic_state_fixup+0x85/0xb0 [kvm]
[ 105.471002] kvm_arch_vcpu_ioctl+0xa01/0x14b0 [kvm]
[ 105.471080] ? __local_bh_enable_ip+0x37/0x70
[ 105.471088] ? copy_fpstate_to_sigframe+0x2f6/0x360
[ 105.471099] ? mod_objcg_state+0xd2/0x360
[ 105.471109] ? refill_obj_stock+0xb0/0x160
[ 105.471116] ? kvm_vcpu_ioctl+0x4bc/0x680 [kvm]
[ 105.471156] kvm_vcpu_ioctl+0x4bc/0x680 [kvm]
[ 105.471197] __x64_sys_ioctl+0x83/0xb0
[ 105.471206] do_syscall_64+0x3b/0x90
[ 105.471218] entry_SYSCALL_64_after_hwframe+0x46/0xb0
[ 105.471228] RIP: 0033:0x7fa356d19a2b
[ 105.471232] Code: ff ff ff 85 c0 79 8b 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d d5 f3 0f 00 f7 d8 64 89 01 48
[ 105.471235] RSP: 002b:00007f9ac9ffa248 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
[ 105.471240] RAX: ffffffffffffffda RBX: 000000008400ae8e RCX: 00007fa356d19a2b
[ 105.471243] RDX: 00007f9ac9ffa2f0 RSI: ffffffff8400ae8e RDI: 000000000000010c
[ 105.471245] RBP: 0000561ce47ee560 R08: 0000561ce2351954 R09: 0000561ce2351c5c
[ 105.471248] R10: 00007f9ab80037b0 R11: 0000000000000246 R12: 00007f9ac9ffa2f0
[ 105.471266] R13: 00007f9ab80037b0 R14: fff0000000000000 R15: 00007f9ac97fb000
[ 105.471270] </TASK>
[ 105.471272] ---[ end trace 0000000000000000 ]---

Best Regards,
Suravee

2022-06-29 07:18:55

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [PATCH v6 00/17] Introducing AMD x2AVIC and hybrid-AVIC modes

On Tue, 2022-06-28 at 23:34 +0700, Suthikulpanit, Suravee wrote:
> Maxim,
>
> On 6/28/2022 8:43 PM, Maxim Levitsky wrote:
> > > With the commit 3743c2f02517 ("KVM: x86: inhibit APICv/AVIC on changes to APIC ID or APIC base"),
> > > APICV/AVIC is now inhibited when the guest kernel boots w/ option "nox2apic" or "x2apic_phys"
> > > due to APICV_INHIBIT_REASON_APIC_ID_MODIFIED.
> > >
> > > These cases used to work. In theory, we should be able to allow AVIC to work in this case.
> > > Is there a way to modify logic in kvm_lapic_xapic_id_updated() to allow these use cases
> > > to work w/ APICv/AVIC?
> > >
> > > Best Regards,
> > > Suravee
> > >
> > This seems very strange, I assume you test the kvm/queue of today,
>
> Yes
>
> > which contains a fix for a typo I had in the list of inhibit reasons
> > (commit 5bdae49fc2f689b5f896b54bd9230425d3643dab - KVM: SEV: fix misplaced closing parenthesis)
>
> Yes
>
> > Could you share more details on the test? How many vCPUs in the guest, is x2apic exposed to the guest?
>
> The problem happens w/ 257 vCPUs or more (i.e. vcpu ID 0x100).
>
> > Looking through the code, the __x2apic_disable touches the MSR_IA32_APICBASE so I would expect
> > the APICV_INHIBIT_REASON_APIC_BASE_MODIFIED inhibit to be triggered and not APICV_INHIBIT_REASON_APIC_ID_MODIFIED
> >
> >
> > I don't see yet how the x2apic_phys can trigger these inhibits.
>
> When I add WARN_ON_ONCE at the point when we set the APICV_INHIBIT_REASON_APIC_ID_MODIFIED,
> I get this call stack.

Great, thanks for the info, now it's all clear.

For > 255 vCPUs, it is not possible to have APIC_ID == vcpu_id, so the check in kvm_lapic_xapic_id_updated()
should truncate the vcpu_id to 8 bits.

I'll send a patch to fix this very soon.

In addition, we should later check that neither AVIC (I think it doesn't crash) nor APICv crashes in this case,
i.e. when a guest still attempts to enable the APIC on a vCPU with ID > 254 (255 also can't be used as a regular APIC ID).
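
The truncation described above could look roughly like this. It is a stand-alone sketch, not the actual fix: the helper names xapic_id_from_reg() and xapic_id_modified() are hypothetical, and it assumes the xAPIC ID occupies bits 31:24 of the APIC ID register.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper: extract the 8-bit xAPIC ID from the ID register. */
static uint8_t xapic_id_from_reg(uint32_t apic_id_reg)
{
	return (uint8_t)(apic_id_reg >> 24);
}

/*
 * Hypothetical check: report a modified APIC ID only when the 8-bit
 * xAPIC ID differs from the low 8 bits of the vCPU ID. An xAPIC ID
 * cannot hold more than 8 bits, so comparing it against an untruncated
 * vcpu_id always mismatches for vcpu_id >= 0x100.
 */
static bool xapic_id_modified(uint32_t apic_id_reg, uint32_t vcpu_id)
{
	return xapic_id_from_reg(apic_id_reg) != (uint8_t)vcpu_id;
}
```

With this comparison, a vCPU with ID 0x100 whose xAPIC ID register reads 0 is not reported as modified, while a genuine change (for example vCPU 5 with xAPIC ID 6) still is.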

Thanks,
Best regards,
Maxim Levitsky

>
> [ 105.470685] ------------[ cut here ]------------
> [ 105.470686] WARNING: CPU: 279 PID: 11511 at arch/x86/kvm/lapic.c:2057 kvm_lapic_xapic_id_updated.cold+0x13/0x2f [kvm]
> [ 105.470769] Modules linked in: kvm_amd kvm irqbypass nf_tables nfnetlink bridge stp llc squashfs loop vfat fat dm_multipath intel_rapl_msr intel_rapl_common amd64_edac edac_mce_amd wmi_bmof sg ipmi_ssif dm_mod acpi_ipmi ccp k10temp ipmi_si acpi_cpufreq sch_fq_codel ipmi_devintf ipmi_msghandler fuse ip_tables ext4 mbcache jbd2 btrfs blake2b_generic zstd_compress raid10 raid456 async_raid6_recov async_memcpy async_pq async_xor async_tx xor raid6_pq libcrc32c raid1 raid0 linear ast i2c_algo_bit drm_vram_helper drm_ttm_helper ttm crct10dif_pclmul crc32_pclmul ses crc32c_intel drm_kms_helper enclosure ghash_clmulni_intel nvme sd_mod scsi_transport_sas syscopyarea aesni_intel sysfillrect crypto_simd nvme_core sysimgblt cryptd t10_pi fb_sys_fops tg3 uas crc64_rocksoft_generic i2c_designware_platform ptp crc64_rocksoft drm i2c_piix4 i2c_designware_core usb_storage pps_core crc64 wmi pinctrl_amd i2c_core
> [ 105.470851] CPU: 279 PID: 11511 Comm: qemu-system-x86 Kdump: loaded Not tainted 5.19.0-rc1-kvm-queue-x2avic+ #38
> [ 105.470856] Hardware name: AMD Corporation QUARTZ/QUARTZ, BIOS TQZ0080D 05/11/2022
> [ 105.470858] RIP: 0010:kvm_lapic_xapic_id_updated.cold+0x13/0x2f [kvm]
> [ 105.470906] Code: db 8f fd ff 48 c7 c7 8d e8 ca c0 e8 43 27 88 ce 31 c0 e9 f8 90 fd ff 48 c7 c6 00 6a ca c0 48 c7 c7 e5 e8 ca c0 e8 29 27 88 ce <0f> 0b 48 8b 83 90 00 00 00 ba 01 00 00 00 be 04 00 00 00 5b 48 8b
> [ 105.470909] RSP: 0018:ffffb13a436d7d40 EFLAGS: 00010246
> [ 105.470913] RAX: 0000000000000030 RBX: ffff9f0372c98400 RCX: 0000000000000000
> [ 105.470916] RDX: 0000000000000000 RSI: ffffffff8fd59e05 RDI: 00000000ffffffff
> [ 105.470918] RBP: ffffb13a436d7e40 R08: 0000000000000030 R09: 0000000000000002
> [ 105.470920] R10: 000000000000000f R11: ffff9f21c5c2fc80 R12: ffff9f0372c64250
> [ 105.470921] R13: ffff9f0372c64250 R14: 00007f9ac9ffa2f0 R15: ffff9f0344da7000
> [ 105.470930] FS: 00007f9ac9ffb640(0000) GS:ffff9f118edc0000(0000) knlGS:0000000000000000
> [ 105.470932] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 105.470934] CR2: 00007fa34c73c001 CR3: 00000001b71a2003 CR4: 0000000000770ee0
> [ 105.470936] PKRU: 55555554
> [ 105.470938] Call Trace:
> [ 105.470942] <TASK>
> [ 105.470945] kvm_apic_state_fixup+0x85/0xb0 [kvm]
> [ 105.471002] kvm_arch_vcpu_ioctl+0xa01/0x14b0 [kvm]
> [ 105.471080] ? __local_bh_enable_ip+0x37/0x70
> [ 105.471088] ? copy_fpstate_to_sigframe+0x2f6/0x360
> [ 105.471099] ? mod_objcg_state+0xd2/0x360
> [ 105.471109] ? refill_obj_stock+0xb0/0x160
> [ 105.471116] ? kvm_vcpu_ioctl+0x4bc/0x680 [kvm]
> [ 105.471156] kvm_vcpu_ioctl+0x4bc/0x680 [kvm]
> [ 105.471197] __x64_sys_ioctl+0x83/0xb0
> [ 105.471206] do_syscall_64+0x3b/0x90
> [ 105.471218] entry_SYSCALL_64_after_hwframe+0x46/0xb0
> [ 105.471228] RIP: 0033:0x7fa356d19a2b
> [ 105.471232] Code: ff ff ff 85 c0 79 8b 49 c7 c4 ff ff ff ff 5b 5d 4c 89 e0 41 5c c3 66 0f 1f 84 00 00 00 00 00 f3 0f 1e fa b8 10 00 00 00 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d d5 f3 0f 00 f7 d8 64 89 01 48
> [ 105.471235] RSP: 002b:00007f9ac9ffa248 EFLAGS: 00000246 ORIG_RAX: 0000000000000010
> [ 105.471240] RAX: ffffffffffffffda RBX: 000000008400ae8e RCX: 00007fa356d19a2b
> [ 105.471243] RDX: 00007f9ac9ffa2f0 RSI: ffffffff8400ae8e RDI: 000000000000010c
> [ 105.471245] RBP: 0000561ce47ee560 R08: 0000561ce2351954 R09: 0000561ce2351c5c
> [ 105.471248] R10: 00007f9ab80037b0 R11: 0000000000000246 R12: 00007f9ac9ffa2f0
> [ 105.471266] R13: 00007f9ab80037b0 R14: fff0000000000000 R15: 00007f9ac97fb000
> [ 105.471270] </TASK>
> [ 105.471272] ---[ end trace 0000000000000000 ]---
>
> Best Regards,
> Suravee
>