2022-02-21 06:40:36

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: [RFC PATCH 00/13] Introducing AMD x2APIC Virtualization (x2AVIC) support.

Previously, with AVIC, guest needs to disable x2APIC capability and
can only run in APIC mode to activate hardware-accelerated interrupt
virtualization. With x2AVIC, guest can run in x2APIC mode.
This feature is indicated by the CPUID Fn8000_000A EDX[14],
and it can be activated by setting bit 31 (enable AVIC) and
bit 30 (x2APIC mode) of VMCB offset 60h.

The mode of interrupt virtualization can dynamically change during runtime.
For example, when AVIC is enabled, the hypervisor currently keeps track of
the AVIC activation and set the VMCB bit 31 accordingly. With x2AVIC,
the guest OS can also switch between APIC and x2APIC modes during runtime.
The kvm_amd driver needs to also keep track and set the VMCB
bit 30 accordingly.

Besides, for x2AVIC, kvm_amd driver needs to disable interception for the
x2APIC MSR range to allow AVIC hardware to virtualize register accesses.

Testing:
* This series has been tested booting a Linux VM with x2APIC physical
and logical modes upto 511 vCPUs.

Regards,
Suravee


Suravee Suthikulpanit (13):
KVM: SVM: Add warning when encounter invalid APIC ID
x86/cpufeatures: Introduce x2AVIC CPUID bit
KVM: SVM: Detect X2APIC virtualization (x2AVIC) support
KVM: SVM: Only call vcpu_(un)blocking when AVIC is enabled.
KVM: SVM: Update max number of vCPUs supported for x2AVIC mode
KVM: SVM: Add logic to determine x2APIC mode
KVM: SVM: Update avic_kick_target_vcpus to support 32-bit APIC ID
KVM: SVM: Do not update logical APIC ID table when in x2APIC mode
KVM: SVM: Introduce helper function avic_get_apic_id
KVM: SVM: Adding support for configuring x2APIC MSRs interception
KVM: SVM: Add logic to switch between APIC and x2APIC virtualization
mode
KVM: SVM: Remove APICv inhibit reasone due to x2APIC
KVM: SVM: Use fastpath x2apic IPI emulation when #vmexit with x2AVIC

arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/svm.h | 15 +-
arch/x86/kvm/svm/avic.c | 216 ++++++++++++++++++++++++++---
arch/x86/kvm/svm/svm.c | 102 ++++++++------
arch/x86/kvm/svm/svm.h | 13 +-
arch/x86/kvm/x86.c | 3 +-
arch/x86/kvm/x86.h | 1 +
7 files changed, 281 insertions(+), 70 deletions(-)

--
2.25.1


2022-02-21 09:13:14

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: [RFC PATCH 08/13] KVM: SVM: Do not update logical APIC ID table when in x2APIC mode

In X2APIC mode the Logical Destination Register is read-only,
which provides a fixed mapping between the logical and physical
APIC IDs. Therefore, there is no Logical APIC ID table in X2AVIC
and the processor uses the X2APIC ID in the backing page to create
a vCPU’s logical ID.

Therefore, add logic to check x2APIC mode before updating logical
APIC ID table.

Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/kvm/svm/avic.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 215d8a7dbc1d..55b3b703b93b 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -417,6 +417,10 @@ static int avic_ldr_write(struct kvm_vcpu *vcpu, u8 g_physical_id, u32 ldr)
bool flat;
u32 *entry, new_entry;

+ /* Note: x2AVIC does not use logical APIC ID table */
+ if (apic_x2apic_mode(vcpu->arch.apic))
+ return 0;
+
flat = kvm_lapic_get_reg(vcpu->arch.apic, APIC_DFR) == APIC_DFR_FLAT;
entry = avic_get_logical_id_entry(vcpu, ldr, flat);
if (!entry)
@@ -435,8 +439,13 @@ static void avic_invalidate_logical_id_entry(struct kvm_vcpu *vcpu)
{
struct vcpu_svm *svm = to_svm(vcpu);
bool flat = svm->dfr_reg == APIC_DFR_FLAT;
- u32 *entry = avic_get_logical_id_entry(vcpu, svm->ldr_reg, flat);
+ u32 *entry;
+
+ /* Note: x2AVIC does not use logical APIC ID table */
+ if (apic_x2apic_mode(vcpu->arch.apic))
+ return;

+ entry = avic_get_logical_id_entry(vcpu, svm->ldr_reg, flat);
if (entry)
clear_bit(AVIC_LOGICAL_ID_ENTRY_VALID_BIT, (unsigned long *)entry);
}
--
2.25.1

2022-02-21 09:16:49

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: [RFC PATCH 04/13] KVM: SVM: Only call vcpu_(un)blocking when AVIC is enabled.

The kvm_x86_ops.vcpu_(un)blocking are needed by AVIC only.
Therefore, set the ops only when AVIC is enabled.

Suggested-by: Sean Christopherson <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/kvm/svm/avic.c | 12 ++++++++++--
arch/x86/kvm/svm/svm.c | 7 -------
arch/x86/kvm/svm/svm.h | 2 --
3 files changed, 10 insertions(+), 11 deletions(-)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index abde08ca23ab..0040824e4376 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -996,7 +996,7 @@ void avic_vcpu_put(struct kvm_vcpu *vcpu)
WRITE_ONCE(*(svm->avic_physical_id_cache), entry);
}

-void avic_vcpu_blocking(struct kvm_vcpu *vcpu)
+static void avic_vcpu_blocking(struct kvm_vcpu *vcpu)
{
if (!kvm_vcpu_apicv_active(vcpu))
return;
@@ -1021,7 +1021,7 @@ void avic_vcpu_blocking(struct kvm_vcpu *vcpu)
preempt_enable();
}

-void avic_vcpu_unblocking(struct kvm_vcpu *vcpu)
+static void avic_vcpu_unblocking(struct kvm_vcpu *vcpu)
{
int cpu;

@@ -1057,6 +1057,14 @@ bool avic_hardware_setup(struct kvm_x86_ops *x86_ops)
pr_info("x2AVIC enabled\n");
}

+ if (avic_mode) {
+ x86_ops->vcpu_blocking = avic_vcpu_blocking;
+ x86_ops->vcpu_unblocking = avic_vcpu_unblocking;
+ } else {
+ x86_ops->vcpu_blocking = NULL;
+ x86_ops->vcpu_unblocking = NULL;
+ }
+
amd_iommu_register_ga_log_notifier(&avic_ga_log_notifier);
return !!avic_mode;
}
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 3048f4b758d6..3687026f2859 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -4531,8 +4531,6 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
.prepare_guest_switch = svm_prepare_guest_switch,
.vcpu_load = svm_vcpu_load,
.vcpu_put = svm_vcpu_put,
- .vcpu_blocking = avic_vcpu_blocking,
- .vcpu_unblocking = avic_vcpu_unblocking,

.update_exception_bitmap = svm_update_exception_bitmap,
.get_msr_feature = svm_get_msr_feature,
@@ -4819,11 +4817,6 @@ static __init int svm_hardware_setup(void)

enable_apicv = avic = avic && avic_hardware_setup(&svm_x86_ops);

- if (!enable_apicv) {
- svm_x86_ops.vcpu_blocking = NULL;
- svm_x86_ops.vcpu_unblocking = NULL;
- }
-
if (vls) {
if (!npt_enabled ||
!boot_cpu_has(X86_FEATURE_V_VMSAVE_VMLOAD) ||
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index b53c83a44ec2..1a0bf6b853df 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -578,8 +578,6 @@ void svm_hwapic_isr_update(struct kvm_vcpu *vcpu, int max_isr);
bool svm_dy_apicv_has_pending_interrupt(struct kvm_vcpu *vcpu);
int svm_update_pi_irte(struct kvm *kvm, unsigned int host_irq,
uint32_t guest_irq, bool set);
-void avic_vcpu_blocking(struct kvm_vcpu *vcpu);
-void avic_vcpu_unblocking(struct kvm_vcpu *vcpu);
void avic_ring_doorbell(struct kvm_vcpu *vcpu);

/* sev.c */
--
2.25.1

2022-02-21 09:23:29

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: [RFC PATCH 11/13] KVM: SVM: Add logic to switch between APIC and x2APIC virtualization mode

When an AVIC-enabled guest switch from APIC to x2APIC mode during runtime,
the SVM driver needs to

1. Set the x2APIC mode bit for AVIC in VMCB along with the maximum
APIC ID support for each mode accodingly.

2. Disable x2APIC MSRs interception in order to allow the hardware
to virtualize x2APIC MSRs accesses.

This is currently handled in the svm_refresh_apicv_exec_ctrl().

Note that guest kerenel does not need to disable APIC before swtiching
to x2APIC. Therefore the WARN_ON in vcpu_load() to check if the vCPU is
currently running is no longer appropriate.

Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/kvm/svm/avic.c | 61 +++++++++++++++++++++++++++++++++++++----
1 file changed, 55 insertions(+), 6 deletions(-)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 3543b7a4514a..3306b74f1d8b 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -79,6 +79,50 @@ static inline enum avic_modes avic_get_vcpu_apic_mode(struct vcpu_svm *svm)
return AVIC_MODE_NONE;
}

+static inline void avic_set_x2apic_msr_interception(struct vcpu_svm *svm, bool disable)
+{
+ int i;
+
+ for (i = 0x800; i <= 0x8ff; i++)
+ set_msr_interception(&svm->vcpu, svm->msrpm, i,
+ !disable, !disable);
+}
+
+void avic_activate_vmcb(struct vcpu_svm *svm)
+{
+ struct vmcb *vmcb = svm->vmcb01.ptr;
+
+ vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
+
+ if (svm->x2apic_enabled) {
+ vmcb->control.int_ctl |= X2APIC_MODE_MASK;
+ vmcb->control.avic_physical_id &= ~X2AVIC_MAX_PHYSICAL_ID;
+ vmcb->control.avic_physical_id |= X2AVIC_MAX_PHYSICAL_ID;
+ /* Disabling MSR intercept for x2APIC registers */
+ avic_set_x2apic_msr_interception(svm, false);
+ } else {
+ vmcb->control.avic_physical_id &= ~AVIC_MAX_PHYSICAL_ID;
+ vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID;
+ /* Enabling MSR intercept for x2APIC registers */
+ avic_set_x2apic_msr_interception(svm, true);
+ }
+}
+
+void avic_deactivate_vmcb(struct vcpu_svm *svm)
+{
+ struct vmcb *vmcb = svm->vmcb01.ptr;
+
+ vmcb->control.int_ctl &= ~(AVIC_ENABLE_MASK | X2APIC_MODE_MASK);
+
+ if (svm->x2apic_enabled)
+ vmcb->control.avic_physical_id &= ~X2AVIC_MAX_PHYSICAL_ID;
+ else
+ vmcb->control.avic_physical_id &= ~AVIC_MAX_PHYSICAL_ID;
+
+ /* Enabling MSR intercept for x2APIC registers */
+ avic_set_x2apic_msr_interception(svm, true);
+}
+
/* Note:
* This function is called from IOMMU driver to notify
* SVM to schedule in a particular vCPU of a particular VM.
@@ -195,13 +239,12 @@ void avic_init_vmcb(struct vcpu_svm *svm)
vmcb->control.avic_backing_page = bpa & AVIC_HPA_MASK;
vmcb->control.avic_logical_id = lpa & AVIC_HPA_MASK;
vmcb->control.avic_physical_id = ppa & AVIC_HPA_MASK;
- vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID;
vmcb->control.avic_vapic_bar = APIC_DEFAULT_PHYS_BASE & VMCB_AVIC_APIC_BAR_MASK;

if (kvm_apicv_activated(svm->vcpu.kvm))
- vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
+ avic_activate_vmcb(svm);
else
- vmcb->control.int_ctl &= ~AVIC_ENABLE_MASK;
+ avic_deactivate_vmcb(svm);
}

static u64 *avic_get_physical_id_entry(struct kvm_vcpu *vcpu,
@@ -657,6 +700,13 @@ void avic_update_vapic_bar(struct vcpu_svm *svm, u64 data)
svm->x2apic_enabled ? "x2APIC" : "xAPIC");
vmcb_mark_dirty(svm->vmcb, VMCB_AVIC);
kvm_vcpu_update_apicv(&svm->vcpu);
+
+ /*
+ * The VM could be running w/ AVIC activated switching from APIC
+ * to x2APIC mode. We need to all refresh to make sure that all
+ * x2AVIC configuration are being done.
+ */
+ svm_refresh_apicv_exec_ctrl(&svm->vcpu);
}

void svm_set_virtual_apic_mode(struct kvm_vcpu *vcpu)
@@ -722,9 +772,9 @@ void svm_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
* accordingly before re-activating.
*/
avic_post_state_restore(vcpu);
- vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
+ avic_activate_vmcb(svm);
} else {
- vmcb->control.int_ctl &= ~AVIC_ENABLE_MASK;
+ avic_deactivate_vmcb(svm);
}
vmcb_mark_dirty(vmcb, VMCB_AVIC);

@@ -1019,7 +1069,6 @@ void avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
return;

entry = READ_ONCE(*(svm->avic_physical_id_cache));
- WARN_ON(entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK);

entry &= ~AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK;
entry |= (h_physical_id & AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK);
--
2.25.1

2022-02-21 09:39:40

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: [RFC PATCH 07/13] KVM: SVM: Update avic_kick_target_vcpus to support 32-bit APIC ID

In x2APIC mode, ICRH contains 32-bit destination APIC ID.
So, update the avic_kick_target_vcpus() accordingly.

Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/kvm/svm/avic.c | 13 +++++++++----
1 file changed, 9 insertions(+), 4 deletions(-)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 60f30e48d816..215d8a7dbc1d 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -307,10 +307,16 @@ void avic_ring_doorbell(struct kvm_vcpu *vcpu)
}

static void avic_kick_target_vcpus(struct kvm *kvm, struct kvm_lapic *source,
- u32 icrl, u32 icrh)
+ u32 icrl, u32 icrh, bool x2apic_enabled)
{
struct kvm_vcpu *vcpu;
unsigned long i;
+ u32 dest;
+
+ if (x2apic_enabled)
+ dest = icrh;
+ else
+ dest = GET_APIC_DEST_FIELD(icrh);

/*
* Wake any target vCPUs that are blocking, i.e. waiting for a wake
@@ -320,8 +326,7 @@ static void avic_kick_target_vcpus(struct kvm *kvm, struct kvm_lapic *source,
*/
kvm_for_each_vcpu(i, vcpu, kvm) {
if (kvm_apic_match_dest(vcpu, source, icrl & APIC_SHORT_MASK,
- GET_APIC_DEST_FIELD(icrh),
- icrl & APIC_DEST_MASK)) {
+ dest, icrl & APIC_DEST_MASK)) {
vcpu->arch.apic->irr_pending = true;
svm_complete_interrupt_delivery(vcpu,
icrl & APIC_MODE_MASK,
@@ -364,7 +369,7 @@ int avic_incomplete_ipi_interception(struct kvm_vcpu *vcpu)
* set the appropriate IRR bits on the valid target
* vcpus. So, we just need to kick the appropriate vcpu.
*/
- avic_kick_target_vcpus(vcpu->kvm, apic, icrl, icrh);
+ avic_kick_target_vcpus(vcpu->kvm, apic, icrl, icrh, svm->x2apic_enabled);
break;
case AVIC_IPI_FAILURE_INVALID_TARGET:
break;
--
2.25.1

2022-02-21 09:40:20

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: [RFC PATCH 10/13] KVM: SVM: Adding support for configuring x2APIC MSRs interception

When enabling x2APIC virtualization (x2AVIC), the interception of
x2APIC MSRs must be disabled to let the hardware virtualize guest
MSR accesses.

Current implementation keeps track of MSR interception state
for generic MSRs in the svm_direct_access_msrs array.
For x2APIC MSRs, introduce direct_access_x2apic_msrs array.

Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/kvm/svm/svm.c | 67 +++++++++++++++++++++++++++++++-----------
arch/x86/kvm/svm/svm.h | 7 +++++
2 files changed, 57 insertions(+), 17 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 4e6dc1feeac7..afca26aa1f40 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -89,7 +89,7 @@ static uint64_t osvw_len = 4, osvw_status;
static DEFINE_PER_CPU(u64, current_tsc_ratio);
#define TSC_RATIO_DEFAULT 0x0100000000ULL

-static const struct svm_direct_access_msrs {
+static struct svm_direct_access_msrs {
u32 index; /* Index of the MSR */
bool always; /* True if intercept is initially cleared */
} direct_access_msrs[MAX_DIRECT_ACCESS_MSRS] = {
@@ -117,6 +117,9 @@ static const struct svm_direct_access_msrs {
{ .index = MSR_INVALID, .always = false },
};

+static struct svm_direct_access_msrs
+direct_access_x2apic_msrs[NUM_DIRECT_ACCESS_X2APIC_MSRS + 1];
+
/*
* These 2 parameters are used to config the controls for Pause-Loop Exiting:
* pause_filter_count: On processors that support Pause filtering(indicated
@@ -609,41 +612,42 @@ static int svm_cpu_init(int cpu)

}

-static int direct_access_msr_slot(u32 msr)
+static int direct_access_msr_slot(u32 msr, struct svm_direct_access_msrs *msrs)
{
u32 i;

- for (i = 0; direct_access_msrs[i].index != MSR_INVALID; i++)
- if (direct_access_msrs[i].index == msr)
+ for (i = 0; msrs[i].index != MSR_INVALID; i++)
+ if (msrs[i].index == msr)
return i;

return -ENOENT;
}

-static void set_shadow_msr_intercept(struct kvm_vcpu *vcpu, u32 msr, int read,
- int write)
+static void set_shadow_msr_intercept(struct kvm_vcpu *vcpu,
+ struct svm_direct_access_msrs *msrs, u32 msr,
+ int read, void *read_bits,
+ int write, void *write_bits)
{
- struct vcpu_svm *svm = to_svm(vcpu);
- int slot = direct_access_msr_slot(msr);
+ int slot = direct_access_msr_slot(msr, msrs);

if (slot == -ENOENT)
return;

/* Set the shadow bitmaps to the desired intercept states */
if (read)
- set_bit(slot, svm->shadow_msr_intercept.read);
+ set_bit(slot, read_bits);
else
- clear_bit(slot, svm->shadow_msr_intercept.read);
+ clear_bit(slot, read_bits);

if (write)
- set_bit(slot, svm->shadow_msr_intercept.write);
+ set_bit(slot, write_bits);
else
- clear_bit(slot, svm->shadow_msr_intercept.write);
+ clear_bit(slot, write_bits);
}

-static bool valid_msr_intercept(u32 index)
+static bool valid_msr_intercept(u32 index, struct svm_direct_access_msrs *msrs)
{
- return direct_access_msr_slot(index) != -ENOENT;
+ return direct_access_msr_slot(index, msrs) != -ENOENT;
}

static bool msr_write_intercepted(struct kvm_vcpu *vcpu, u32 msr)
@@ -674,9 +678,12 @@ static void set_msr_interception_bitmap(struct kvm_vcpu *vcpu, u32 *msrpm,

/*
* If this warning triggers extend the direct_access_msrs list at the
- * beginning of the file
+ * beginning of the file. The direct_access_x2apic_msrs is only for
+ * x2apic MSRs.
*/
- WARN_ON(!valid_msr_intercept(msr));
+ WARN_ON(!valid_msr_intercept(msr, direct_access_msrs) &&
+ (boot_cpu_has(X86_FEATURE_X2AVIC) &&
+ !valid_msr_intercept(msr, direct_access_x2apic_msrs)));

/* Enforce non allowed MSRs to trap */
if (read && !kvm_msr_allowed(vcpu, msr, KVM_MSR_FILTER_READ))
@@ -704,7 +711,16 @@ static void set_msr_interception_bitmap(struct kvm_vcpu *vcpu, u32 *msrpm,
void set_msr_interception(struct kvm_vcpu *vcpu, u32 *msrpm, u32 msr,
int read, int write)
{
- set_shadow_msr_intercept(vcpu, msr, read, write);
+ struct vcpu_svm *svm = to_svm(vcpu);
+
+ if (msr < 0x800 || msr > 0x8ff)
+ set_shadow_msr_intercept(vcpu, direct_access_msrs, msr,
+ read, svm->shadow_msr_intercept.read,
+ write, svm->shadow_msr_intercept.write);
+ else
+ set_shadow_msr_intercept(vcpu, direct_access_x2apic_msrs, msr,
+ read, svm->shadow_x2apic_msr_intercept.read,
+ write, svm->shadow_x2apic_msr_intercept.write);
set_msr_interception_bitmap(vcpu, msrpm, msr, read, write);
}

@@ -786,6 +802,22 @@ static void add_msr_offset(u32 offset)
BUG();
}

+static void init_direct_access_x2apic_msrs(void)
+{
+ int i;
+
+ /* Initialize x2APIC direct_access_x2apic_msrs entries */
+ for (i = 0; i < NUM_DIRECT_ACCESS_X2APIC_MSRS; i++) {
+ direct_access_x2apic_msrs[i].index = boot_cpu_has(X86_FEATURE_X2AVIC) ?
+ (0x800 + i) : MSR_INVALID;
+ direct_access_x2apic_msrs[i].always = false;
+ }
+
+ /* Initialize last entry */
+ direct_access_x2apic_msrs[i].index = MSR_INVALID;
+ direct_access_x2apic_msrs[i].always = false;
+}
+
static void init_msrpm_offsets(void)
{
int i;
@@ -4752,6 +4784,7 @@ static __init int svm_hardware_setup(void)
memset(iopm_va, 0xff, PAGE_SIZE * (1 << order));
iopm_base = page_to_pfn(iopm_pages) << PAGE_SHIFT;

+ init_direct_access_x2apic_msrs();
init_msrpm_offsets();

supported_xcr0 &= ~(XFEATURE_MASK_BNDREGS | XFEATURE_MASK_BNDCSR);
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index bfbebb933da2..41514df5107e 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -29,6 +29,8 @@

#define MAX_DIRECT_ACCESS_MSRS 20
#define MSRPM_OFFSETS 16
+#define NUM_DIRECT_ACCESS_X2APIC_MSRS 0x100
+
extern u32 msrpm_offsets[MSRPM_OFFSETS] __read_mostly;
extern bool npt_enabled;
extern bool intercept_smi;
@@ -242,6 +244,11 @@ struct vcpu_svm {
DECLARE_BITMAP(write, MAX_DIRECT_ACCESS_MSRS);
} shadow_msr_intercept;

+ struct {
+ DECLARE_BITMAP(read, NUM_DIRECT_ACCESS_X2APIC_MSRS);
+ DECLARE_BITMAP(write, NUM_DIRECT_ACCESS_X2APIC_MSRS);
+ } shadow_x2apic_msr_intercept;
+
struct vcpu_sev_es_state sev_es;

bool guest_state_loaded;
--
2.25.1

2022-02-21 09:49:12

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: [RFC PATCH 01/13] KVM: SVM: Add warning when encounter invalid APIC ID

Current logic checks if avic_get_physical_id_entry() fails to
get the entry, and return error. This could silently cause
AVIC to fail to operate. Therefore, add WARN_ON to help
report this error.

Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/kvm/svm/avic.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index f07956f15d3b..472445aaaf42 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -447,8 +447,10 @@ static int avic_handle_apic_id_update(struct kvm_vcpu *vcpu)

old = avic_get_physical_id_entry(vcpu, vcpu->vcpu_id);
new = avic_get_physical_id_entry(vcpu, id);
- if (!new || !old)
+ if (!new || !old) {
+ WARN_ON(1);
return 1;
+ }

/* We need to move physical_id_entry to new offset */
*new = *old;
--
2.25.1

2022-02-21 09:54:50

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: [RFC PATCH 06/13] KVM: SVM: Add logic to determine x2APIC mode

Introduce avic_update_vapic_bar(), which checks the x2APIC enable bit
of the APIC Base register and update the new struct vcpu_svm.x2apic_enabled
to keep track of current APIC mode of each vCPU,

Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/kvm/svm/avic.c | 13 +++++++++++++
arch/x86/kvm/svm/svm.c | 4 ++++
arch/x86/kvm/svm/svm.h | 2 ++
3 files changed, 19 insertions(+)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 1999076966fd..60f30e48d816 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -609,6 +609,19 @@ void avic_post_state_restore(struct kvm_vcpu *vcpu)
avic_handle_ldr_update(vcpu);
}

+void avic_update_vapic_bar(struct vcpu_svm *svm, u64 data)
+{
+ svm->vmcb->control.avic_vapic_bar = data & VMCB_AVIC_APIC_BAR_MASK;
+
+ /* Set x2APIC mode bit if guest enable x2apic mode. */
+ svm->x2apic_enabled = (avic_mode == AVIC_MODE_X2 &&
+ kvm_apic_mode(data) == LAPIC_MODE_X2APIC);
+ pr_debug("vcpu_id:%d switch to %s\n", svm->vcpu.vcpu_id,
+ svm->x2apic_enabled ? "x2APIC" : "xAPIC");
+ vmcb_mark_dirty(svm->vmcb, VMCB_AVIC);
+ kvm_vcpu_update_apicv(&svm->vcpu);
+}
+
void svm_set_virtual_apic_mode(struct kvm_vcpu *vcpu)
{
return;
diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 3687026f2859..4e6dc1feeac7 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -2867,6 +2867,10 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
svm->msr_decfg = data;
break;
}
+ case MSR_IA32_APICBASE:
+ if (kvm_vcpu_apicv_active(vcpu))
+ avic_update_vapic_bar(to_svm(vcpu), data);
+ fallthrough;
default:
return kvm_set_msr_common(vcpu, msr);
}
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index 1a0bf6b853df..bfbebb933da2 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -225,6 +225,7 @@ struct vcpu_svm {
u32 dfr_reg;
struct page *avic_backing_page;
u64 *avic_physical_id_cache;
+ bool x2apic_enabled;

/*
* Per-vcpu list of struct amd_svm_iommu_ir:
@@ -566,6 +567,7 @@ void avic_init_vmcb(struct vcpu_svm *svm);
int avic_incomplete_ipi_interception(struct kvm_vcpu *vcpu);
int avic_unaccelerated_access_interception(struct kvm_vcpu *vcpu);
int avic_init_vcpu(struct vcpu_svm *svm);
+void avic_update_vapic_bar(struct vcpu_svm *svm, u64 data);
void avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu);
void avic_vcpu_put(struct kvm_vcpu *vcpu);
void avic_post_state_restore(struct kvm_vcpu *vcpu);
--
2.25.1

2022-02-21 09:59:38

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: [RFC PATCH 05/13] KVM: SVM: Update max number of vCPUs supported for x2AVIC mode

xAVIC and x2AVIC modes can support diffferent number of vcpus.
Update existing logics to support each mode accordingly.

Also, modify the maximum physical APIC ID for AVIC to 255 to reflect
the actual value supported by the architecture.

Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/include/asm/svm.h | 12 +++++++++---
arch/x86/kvm/svm/avic.c | 8 +++++---
2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 7a7a2297165b..681a348a9365 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -250,10 +250,16 @@ enum avic_ipi_failure_cause {


/*
- * 0xff is broadcast, so the max index allowed for physical APIC ID
- * table is 0xfe. APIC IDs above 0xff are reserved.
+ * For AVIC, the max index allowed for physical APIC ID
+ * table is 0xff (255).
*/
-#define AVIC_MAX_PHYSICAL_ID_COUNT 0xff
+#define AVIC_MAX_PHYSICAL_ID 0XFFULL
+
+/*
+ * For x2AVIC, the max index allowed for physical APIC ID
+ * table is 0x1ff (511).
+ */
+#define X2AVIC_MAX_PHYSICAL_ID 0x1FFUL

#define AVIC_HPA_MASK ~((0xFFFULL << 52) | 0xFFF)
#define VMCB_AVIC_APIC_BAR_MASK 0xFFFFFFFFFF000ULL
diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 0040824e4376..1999076966fd 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -195,7 +195,7 @@ void avic_init_vmcb(struct vcpu_svm *svm)
vmcb->control.avic_backing_page = bpa & AVIC_HPA_MASK;
vmcb->control.avic_logical_id = lpa & AVIC_HPA_MASK;
vmcb->control.avic_physical_id = ppa & AVIC_HPA_MASK;
- vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID_COUNT;
+ vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID;
vmcb->control.avic_vapic_bar = APIC_DEFAULT_PHYS_BASE & VMCB_AVIC_APIC_BAR_MASK;

if (kvm_apicv_activated(svm->vcpu.kvm))
@@ -210,7 +210,8 @@ static u64 *avic_get_physical_id_entry(struct kvm_vcpu *vcpu,
u64 *avic_physical_id_table;
struct kvm_svm *kvm_svm = to_kvm_svm(vcpu->kvm);

- if (index >= AVIC_MAX_PHYSICAL_ID_COUNT)
+ if ((avic_mode == AVIC_MODE_X1 && index > AVIC_MAX_PHYSICAL_ID) ||
+ (avic_mode == AVIC_MODE_X2 && index > X2AVIC_MAX_PHYSICAL_ID))
return NULL;

avic_physical_id_table = page_address(kvm_svm->avic_physical_id_table_page);
@@ -257,7 +258,8 @@ static int avic_init_backing_page(struct kvm_vcpu *vcpu)
int id = vcpu->vcpu_id;
struct vcpu_svm *svm = to_svm(vcpu);

- if (id >= AVIC_MAX_PHYSICAL_ID_COUNT)
+ if ((avic_mode == AVIC_MODE_X1 && id > AVIC_MAX_PHYSICAL_ID) ||
+ (avic_mode == AVIC_MODE_X2 && id > X2AVIC_MAX_PHYSICAL_ID))
return -EINVAL;

if (!vcpu->arch.apic->regs)
--
2.25.1

2022-02-21 10:03:36

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: [RFC PATCH 13/13] KVM: SVM: Use fastpath x2apic IPI emulation when #vmexit with x2AVIC

When sends IPI to a halting vCPU, the hardware generates
avic_incomplete_ipi #vmexit with the
AVIC_IPI_FAILURE_TARGET_NOT_RUNNING reason.

For x2AVIC, enable fastpath emulation.

Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/kvm/svm/avic.c | 2 ++
arch/x86/kvm/x86.c | 3 ++-
arch/x86/kvm/x86.h | 1 +
3 files changed, 5 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 874c89f8fd47..758a79ee7f99 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -428,6 +428,8 @@ int avic_incomplete_ipi_interception(struct kvm_vcpu *vcpu)
kvm_lapic_reg_write(apic, APIC_ICR, icrl);
break;
case AVIC_IPI_FAILURE_TARGET_NOT_RUNNING:
+ handle_fastpath_set_x2apic_icr_irqoff(vcpu, svm->vmcb->control.exit_info_1);
+
/*
* At this point, we expect that the AVIC HW has already
* set the appropriate IRR bits on the valid target
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 641044db415d..c293027c7c10 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -2008,7 +2008,7 @@ static inline bool kvm_vcpu_exit_request(struct kvm_vcpu *vcpu)
* from guest to host, e.g. reacquiring KVM's SRCU lock. In contrast to the
* other cases which must be called after interrupts are enabled on the host.
*/
-static int handle_fastpath_set_x2apic_icr_irqoff(struct kvm_vcpu *vcpu, u64 data)
+int handle_fastpath_set_x2apic_icr_irqoff(struct kvm_vcpu *vcpu, u64 data)
{
if (!lapic_in_kernel(vcpu) || !apic_x2apic_mode(vcpu->arch.apic))
return 1;
@@ -2028,6 +2028,7 @@ static int handle_fastpath_set_x2apic_icr_irqoff(struct kvm_vcpu *vcpu, u64 data

return 1;
}
+EXPORT_SYMBOL_GPL(handle_fastpath_set_x2apic_icr_irqoff);

static int handle_fastpath_set_tscdeadline(struct kvm_vcpu *vcpu, u64 data)
{
diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
index 767ec7f99516..035d20f83ca6 100644
--- a/arch/x86/kvm/x86.h
+++ b/arch/x86/kvm/x86.h
@@ -286,6 +286,7 @@ int x86_decode_emulated_instruction(struct kvm_vcpu *vcpu, int emulation_type,
int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
int emulation_type, void *insn, int insn_len);
fastpath_t handle_fastpath_set_msr_irqoff(struct kvm_vcpu *vcpu);
+int handle_fastpath_set_x2apic_icr_irqoff(struct kvm_vcpu *vcpu, u64 data);

extern u64 host_xcr0;
extern u64 supported_xcr0;
--
2.25.1

2022-02-22 06:04:31

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: Re: [RFC PATCH 00/13] Introducing AMD x2APIC Virtualization (x2AVIC) support.

On 2/21/2022 9:19 AM, Suravee Suthikulpanit wrote:
> Testing:
> * This series has been tested booting a Linux VM with x2APIC physical
> and logical modes upto 511 vCPUs.

Update:
* This series has been tested booting a Linux VM with x2APIC physical
and logical modes upto 512 vCPUs with the following change in QEMU:

diff --git a/hw/i386/pc_q35.c b/hw/i386/pc_q35.c
index 797e09500b..282036df98 100644
--- a/hw/i386/pc_q35.c
+++ b/hw/i386/pc_q35.c
@@ -352,7 +352,7 @@ static void pc_q35_machine_options(MachineClass *m)
machine_class_allow_dynamic_sysbus_dev(m, TYPE_INTEL_IOMMU_DEVICE);
machine_class_allow_dynamic_sysbus_dev(m, TYPE_RAMFB_DEVICE);
machine_class_allow_dynamic_sysbus_dev(m, TYPE_VMBUS_BRIDGE);
- m->max_cpus = 288;
+ m->max_cpus = 512;
}

Regards,
Suravee

2022-02-22 06:04:41

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: Re: [RFC PATCH 11/13] KVM: SVM: Add logic to switch between APIC and x2APIC virtualization mode



On 2/21/2022 9:19 AM, Suravee Suthikulpanit wrote:
> ....
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index 3543b7a4514a..3306b74f1d8b 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -79,6 +79,50 @@ static inline enum avic_modes avic_get_vcpu_apic_mode(struct vcpu_svm *svm)
> ..
> +void avic_activate_vmcb(struct vcpu_svm *svm)

This should be static void.

> +{
> + struct vmcb *vmcb = svm->vmcb01.ptr;
> +
> + vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
> +
> + if (svm->x2apic_enabled) {
> + vmcb->control.int_ctl |= X2APIC_MODE_MASK;
> + vmcb->control.avic_physical_id &= ~X2AVIC_MAX_PHYSICAL_ID;
> + vmcb->control.avic_physical_id |= X2AVIC_MAX_PHYSICAL_ID;
> + /* Disabling MSR intercept for x2APIC registers */
> + avic_set_x2apic_msr_interception(svm, false);
> + } else {
> + vmcb->control.avic_physical_id &= ~AVIC_MAX_PHYSICAL_ID;
> + vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID;
> + /* Enabling MSR intercept for x2APIC registers */
> + avic_set_x2apic_msr_interception(svm, true);
> + }
> +}
> +
> +void avic_deactivate_vmcb(struct vcpu_svm *svm)

This should be static void.

Reported-by: kernel test robot <[email protected]>

Regards,
Suravee

2022-02-24 16:57:42

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [RFC PATCH 01/13] KVM: SVM: Add warning when encounter invalid APIC ID

On Sun, 2022-02-20 at 20:19 -0600, Suravee Suthikulpanit wrote:
> Current logic checks if avic_get_physical_id_entry() fails to
> get the entry, and return error. This could silently cause
> AVIC to fail to operate. Therefore, add WARN_ON to help
> report this error.
>
> Signed-off-by: Suravee Suthikulpanit <[email protected]>
> ---
> arch/x86/kvm/svm/avic.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index f07956f15d3b..472445aaaf42 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -447,8 +447,10 @@ static int avic_handle_apic_id_update(struct kvm_vcpu *vcpu)
>
> old = avic_get_physical_id_entry(vcpu, vcpu->vcpu_id);
> new = avic_get_physical_id_entry(vcpu, id);
> - if (!new || !old)
> + if (!new || !old) {
> + WARN_ON(1);
> return 1;
> + }
>
> /* We need to move physical_id_entry to new offset */
> *new = *old;

I about to purge all of this code, and disallow setting the apicid regardles
if x2apic, or plain xapic was used at least when AVIC is enabled.
I really hope for this to be accepted, so we won't need to fix this code.

Best regards,
Maxim Levitsky

2022-02-24 16:59:06

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [RFC PATCH 04/13] KVM: SVM: Only call vcpu_(un)blocking when AVIC is enabled.

On Sun, 2022-02-20 at 20:19 -0600, Suravee Suthikulpanit wrote:
> The kvm_x86_ops.vcpu_(un)blocking are needed by AVIC only.
> Therefore, set the ops only when AVIC is enabled.
>
> Suggested-by: Sean Christopherson <[email protected]>
> Signed-off-by: Suravee Suthikulpanit <[email protected]>
> ---
> arch/x86/kvm/svm/avic.c | 12 ++++++++++--
> arch/x86/kvm/svm/svm.c | 7 -------
> arch/x86/kvm/svm/svm.h | 2 --
> 3 files changed, 10 insertions(+), 11 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index abde08ca23ab..0040824e4376 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -996,7 +996,7 @@ void avic_vcpu_put(struct kvm_vcpu *vcpu)
> WRITE_ONCE(*(svm->avic_physical_id_cache), entry);
> }
>
> -void avic_vcpu_blocking(struct kvm_vcpu *vcpu)
> +static void avic_vcpu_blocking(struct kvm_vcpu *vcpu)
> {
> if (!kvm_vcpu_apicv_active(vcpu))
> return;
> @@ -1021,7 +1021,7 @@ void avic_vcpu_blocking(struct kvm_vcpu *vcpu)
> preempt_enable();
> }
>
> -void avic_vcpu_unblocking(struct kvm_vcpu *vcpu)
> +static void avic_vcpu_unblocking(struct kvm_vcpu *vcpu)
> {
> int cpu;
>
> @@ -1057,6 +1057,14 @@ bool avic_hardware_setup(struct kvm_x86_ops *x86_ops)
> pr_info("x2AVIC enabled\n");
> }
>
> + if (avic_mode) {
> + x86_ops->vcpu_blocking = avic_vcpu_blocking;
> + x86_ops->vcpu_unblocking = avic_vcpu_unblocking;
> + } else {
> + x86_ops->vcpu_blocking = NULL;
> + x86_ops->vcpu_unblocking = NULL;
> + }
> +
> amd_iommu_register_ga_log_notifier(&avic_ga_log_notifier);
> return !!avic_mode;
> }
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 3048f4b758d6..3687026f2859 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -4531,8 +4531,6 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
> .prepare_guest_switch = svm_prepare_guest_switch,
> .vcpu_load = svm_vcpu_load,
> .vcpu_put = svm_vcpu_put,
> - .vcpu_blocking = avic_vcpu_blocking,
> - .vcpu_unblocking = avic_vcpu_unblocking,
>
> .update_exception_bitmap = svm_update_exception_bitmap,
> .get_msr_feature = svm_get_msr_feature,
> @@ -4819,11 +4817,6 @@ static __init int svm_hardware_setup(void)
>
> enable_apicv = avic = avic && avic_hardware_setup(&svm_x86_ops);
>
> - if (!enable_apicv) {
> - svm_x86_ops.vcpu_blocking = NULL;
> - svm_x86_ops.vcpu_unblocking = NULL;
> - }

Isn't this code already zeros these callbacks when avic is not enabled?

I am not sure why this patch is needed to be honest.

Best regards,
Maxim Levitsky

> -
> if (vls) {
> if (!npt_enabled ||
> !boot_cpu_has(X86_FEATURE_V_VMSAVE_VMLOAD) ||
> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index b53c83a44ec2..1a0bf6b853df 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -578,8 +578,6 @@ void svm_hwapic_isr_update(struct kvm_vcpu *vcpu, int max_isr);
> bool svm_dy_apicv_has_pending_interrupt(struct kvm_vcpu *vcpu);
> int svm_update_pi_irte(struct kvm *kvm, unsigned int host_irq,
> uint32_t guest_irq, bool set);
> -void avic_vcpu_blocking(struct kvm_vcpu *vcpu);
> -void avic_vcpu_unblocking(struct kvm_vcpu *vcpu);
> void avic_ring_doorbell(struct kvm_vcpu *vcpu);
>
> /* sev.c */


2022-02-24 18:17:01

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [RFC PATCH 08/13] KVM: SVM: Do not update logical APIC ID table when in x2APIC mode

On Sun, 2022-02-20 at 20:19 -0600, Suravee Suthikulpanit wrote:
> In X2APIC mode the Logical Destination Register is read-only,
> which provides a fixed mapping between the logical and physical
> APIC IDs. Therefore, there is no Logical APIC ID table in X2AVIC
> and the processor uses the X2APIC ID in the backing page to create
> a vCPU’s logical ID.
>
> Therefore, add logic to check x2APIC mode before updating logical
> APIC ID table.
>
> Signed-off-by: Suravee Suthikulpanit <[email protected]>
> ---
> arch/x86/kvm/svm/avic.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index 215d8a7dbc1d..55b3b703b93b 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -417,6 +417,10 @@ static int avic_ldr_write(struct kvm_vcpu *vcpu, u8 g_physical_id, u32 ldr)
> bool flat;
> u32 *entry, new_entry;
>
> + /* Note: x2AVIC does not use logical APIC ID table */
> + if (apic_x2apic_mode(vcpu->arch.apic))
> + return 0;
> +
> flat = kvm_lapic_get_reg(vcpu->arch.apic, APIC_DFR) == APIC_DFR_FLAT;
> entry = avic_get_logical_id_entry(vcpu, ldr, flat);
> if (!entry)
> @@ -435,8 +439,13 @@ static void avic_invalidate_logical_id_entry(struct kvm_vcpu *vcpu)
> {
> struct vcpu_svm *svm = to_svm(vcpu);
> bool flat = svm->dfr_reg == APIC_DFR_FLAT;
> - u32 *entry = avic_get_logical_id_entry(vcpu, svm->ldr_reg, flat);
> + u32 *entry;
> +
> + /* Note: x2AVIC does not use logical APIC ID table */
> + if (apic_x2apic_mode(vcpu->arch.apic))
> + return;
>
> + entry = avic_get_logical_id_entry(vcpu, svm->ldr_reg, flat);
> if (entry)
> clear_bit(AVIC_LOGICAL_ID_ENTRY_VALID_BIT, (unsigned long *)entry);
> }


Here actually the good apic_x2apic_mode was used.

However, shouldn't we inject #GP in avic_ldr_write to make this read realy read-only?
It might be too late to do so here, since most AVIC writes are trap like.

Thus we need to make the msr that corresponds to LDR to be write protected in the msr bitmap,
and inject #GP when write it attempted.

Then we can add WARN_ON in this function for this case instead.

Best regards,
Maxim Levitsky

2022-02-24 19:24:10

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [RFC PATCH 07/13] KVM: SVM: Update avic_kick_target_vcpus to support 32-bit APIC ID

On Sun, 2022-02-20 at 20:19 -0600, Suravee Suthikulpanit wrote:
> In x2APIC mode, ICRH contains 32-bit destination APIC ID.
> So, update the avic_kick_target_vcpus() accordingly.
>
> Signed-off-by: Suravee Suthikulpanit <[email protected]>
> ---
> arch/x86/kvm/svm/avic.c | 13 +++++++++----
> 1 file changed, 9 insertions(+), 4 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index 60f30e48d816..215d8a7dbc1d 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -307,10 +307,16 @@ void avic_ring_doorbell(struct kvm_vcpu *vcpu)
> }
>
> static void avic_kick_target_vcpus(struct kvm *kvm, struct kvm_lapic *source,
> - u32 icrl, u32 icrh)
> + u32 icrl, u32 icrh, bool x2apic_enabled)
> {
> struct kvm_vcpu *vcpu;
> unsigned long i;
> + u32 dest;
> +
> + if (x2apic_enabled)
> + dest = icrh;
> + else
> + dest = GET_APIC_DEST_FIELD(icrh);


Just use 'apic_x2apic_mode(apic)', no need for x2apic_enabled parameter
as I said in patch 6.

Also maybe rename GET_APIC_DEST_FIELD to GET_XAPIC_DEST_FIELD or something as it is
wrong for x2apic.

Best regards,
Maxim Levitsky

>
> /*
> * Wake any target vCPUs that are blocking, i.e. waiting for a wake
> @@ -320,8 +326,7 @@ static void avic_kick_target_vcpus(struct kvm *kvm, struct kvm_lapic *source,
> */
> kvm_for_each_vcpu(i, vcpu, kvm) {
> if (kvm_apic_match_dest(vcpu, source, icrl & APIC_SHORT_MASK,
> - GET_APIC_DEST_FIELD(icrh),
> - icrl & APIC_DEST_MASK)) {
> + dest, icrl & APIC_DEST_MASK)) {
> vcpu->arch.apic->irr_pending = true;
> svm_complete_interrupt_delivery(vcpu,
> icrl & APIC_MODE_MASK,
> @@ -364,7 +369,7 @@ int avic_incomplete_ipi_interception(struct kvm_vcpu *vcpu)
> * set the appropriate IRR bits on the valid target
> * vcpus. So, we just need to kick the appropriate vcpu.
> */
> - avic_kick_target_vcpus(vcpu->kvm, apic, icrl, icrh);
> + avic_kick_target_vcpus(vcpu->kvm, apic, icrl, icrh, svm->x2apic_enabled);
> break;
> case AVIC_IPI_FAILURE_INVALID_TARGET:
> break;


2022-02-24 19:52:31

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [RFC PATCH 06/13] KVM: SVM: Add logic to determine x2APIC mode

On Sun, 2022-02-20 at 20:19 -0600, Suravee Suthikulpanit wrote:
> Introduce avic_update_vapic_bar(), which checks the x2APIC enable bit
> of the APIC Base register and update the new struct vcpu_svm.x2apic_enabled
> to keep track of current APIC mode of each vCPU,
>
> Signed-off-by: Suravee Suthikulpanit <[email protected]>
> ---
> arch/x86/kvm/svm/avic.c | 13 +++++++++++++
> arch/x86/kvm/svm/svm.c | 4 ++++
> arch/x86/kvm/svm/svm.h | 2 ++
> 3 files changed, 19 insertions(+)
>
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index 1999076966fd..60f30e48d816 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -609,6 +609,19 @@ void avic_post_state_restore(struct kvm_vcpu *vcpu)
> avic_handle_ldr_update(vcpu);
> }
>
> +void avic_update_vapic_bar(struct vcpu_svm *svm, u64 data)
> +{
> + svm->vmcb->control.avic_vapic_bar = data & VMCB_AVIC_APIC_BAR_MASK;
> +
> + /* Set x2APIC mode bit if guest enable x2apic mode. */
> + svm->x2apic_enabled = (avic_mode == AVIC_MODE_X2 &&
> + kvm_apic_mode(data) == LAPIC_MODE_X2APIC);
> + pr_debug("vcpu_id:%d switch to %s\n", svm->vcpu.vcpu_id,
> + svm->x2apic_enabled ? "x2APIC" : "xAPIC");
> + vmcb_mark_dirty(svm->vmcb, VMCB_AVIC);
> + kvm_vcpu_update_apicv(&svm->vcpu);
> +}
> +
> void svm_set_virtual_apic_mode(struct kvm_vcpu *vcpu)
> {
> return;
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 3687026f2859..4e6dc1feeac7 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -2867,6 +2867,10 @@ static int svm_set_msr(struct kvm_vcpu *vcpu, struct msr_data *msr)
> svm->msr_decfg = data;
> break;
> }
> + case MSR_IA32_APICBASE:
> + if (kvm_vcpu_apicv_active(vcpu))
> + avic_update_vapic_bar(to_svm(vcpu), data);
> + fallthrough;
> default:
> return kvm_set_msr_common(vcpu, msr);
> }
> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index 1a0bf6b853df..bfbebb933da2 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -225,6 +225,7 @@ struct vcpu_svm {
> u32 dfr_reg;
> struct page *avic_backing_page;
> u64 *avic_physical_id_cache;
> + bool x2apic_enabled;
>
> /*
> * Per-vcpu list of struct amd_svm_iommu_ir:
> @@ -566,6 +567,7 @@ void avic_init_vmcb(struct vcpu_svm *svm);
> int avic_incomplete_ipi_interception(struct kvm_vcpu *vcpu);
> int avic_unaccelerated_access_interception(struct kvm_vcpu *vcpu);
> int avic_init_vcpu(struct vcpu_svm *svm);
> +void avic_update_vapic_bar(struct vcpu_svm *svm, u64 data);
> void avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu);
> void avic_vcpu_put(struct kvm_vcpu *vcpu);
> void avic_post_state_restore(struct kvm_vcpu *vcpu);


Have you looked at how this is done on Intel's APICv side?
You need to implement .set_virtual_apic_mode instead.
(look at vmx_set_virtual_apic_mode)

I also don't think you need x2apic_enabled boolean.
You can already know if a vCPU uses apic or x2apic via

kvm_get_apic_mode(vcpu);

in fact I don't think avic code should have any bookeeping in regard to x2apic/x2avic mode,
but rather kvm's apic mode (which is read directly from apic base msr (vcpu->arch.apic_base),
should enable avic, or x2avic if possible, or inhibit avic if not possible.

e.g it should drive the bits in vmcb and such.

Best regards,
Maxim Levitsky

2022-02-24 20:30:49

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [RFC PATCH 10/13] KVM: SVM: Adding support for configuring x2APIC MSRs interception

On Sun, 2022-02-20 at 20:19 -0600, Suravee Suthikulpanit wrote:
> When enabling x2APIC virtualization (x2AVIC), the interception of
> x2APIC MSRs must be disabled to let the hardware virtualize guest
> MSR accesses.
>
> Current implementation keeps track of MSR interception state
> for generic MSRs in the svm_direct_access_msrs array.
> For x2APIC MSRs, introduce direct_access_x2apic_msrs array.
>
> Signed-off-by: Suravee Suthikulpanit <[email protected]>
> ---
> arch/x86/kvm/svm/svm.c | 67 +++++++++++++++++++++++++++++++-----------
> arch/x86/kvm/svm/svm.h | 7 +++++
> 2 files changed, 57 insertions(+), 17 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 4e6dc1feeac7..afca26aa1f40 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -89,7 +89,7 @@ static uint64_t osvw_len = 4, osvw_status;
> static DEFINE_PER_CPU(u64, current_tsc_ratio);
> #define TSC_RATIO_DEFAULT 0x0100000000ULL
>
> -static const struct svm_direct_access_msrs {
> +static struct svm_direct_access_msrs {
> u32 index; /* Index of the MSR */
> bool always; /* True if intercept is initially cleared */
> } direct_access_msrs[MAX_DIRECT_ACCESS_MSRS] = {
> @@ -117,6 +117,9 @@ static const struct svm_direct_access_msrs {
> { .index = MSR_INVALID, .always = false },
> };
>
> +static struct svm_direct_access_msrs
> +direct_access_x2apic_msrs[NUM_DIRECT_ACCESS_X2APIC_MSRS + 1];
> +
> /*
> * These 2 parameters are used to config the controls for Pause-Loop Exiting:
> * pause_filter_count: On processors that support Pause filtering(indicated
> @@ -609,41 +612,42 @@ static int svm_cpu_init(int cpu)
>
> }
>
> -static int direct_access_msr_slot(u32 msr)
> +static int direct_access_msr_slot(u32 msr, struct svm_direct_access_msrs *msrs)
> {
> u32 i;
>
> - for (i = 0; direct_access_msrs[i].index != MSR_INVALID; i++)
> - if (direct_access_msrs[i].index == msr)
> + for (i = 0; msrs[i].index != MSR_INVALID; i++)
> + if (msrs[i].index == msr)
> return i;
>
> return -ENOENT;
> }
>
> -static void set_shadow_msr_intercept(struct kvm_vcpu *vcpu, u32 msr, int read,
> - int write)
> +static void set_shadow_msr_intercept(struct kvm_vcpu *vcpu,
> + struct svm_direct_access_msrs *msrs, u32 msr,
> + int read, void *read_bits,
> + int write, void *write_bits)
> {
> - struct vcpu_svm *svm = to_svm(vcpu);
> - int slot = direct_access_msr_slot(msr);
> + int slot = direct_access_msr_slot(msr, msrs);
>
> if (slot == -ENOENT)
> return;
>
> /* Set the shadow bitmaps to the desired intercept states */
> if (read)
> - set_bit(slot, svm->shadow_msr_intercept.read);
> + set_bit(slot, read_bits);
> else
> - clear_bit(slot, svm->shadow_msr_intercept.read);
> + clear_bit(slot, read_bits);
>
> if (write)
> - set_bit(slot, svm->shadow_msr_intercept.write);
> + set_bit(slot, write_bits);
> else
> - clear_bit(slot, svm->shadow_msr_intercept.write);
> + clear_bit(slot, write_bits);
> }
>
> -static bool valid_msr_intercept(u32 index)
> +static bool valid_msr_intercept(u32 index, struct svm_direct_access_msrs *msrs)
> {
> - return direct_access_msr_slot(index) != -ENOENT;
> + return direct_access_msr_slot(index, msrs) != -ENOENT;
> }
>
> static bool msr_write_intercepted(struct kvm_vcpu *vcpu, u32 msr)
> @@ -674,9 +678,12 @@ static void set_msr_interception_bitmap(struct kvm_vcpu *vcpu, u32 *msrpm,
>
> /*
> * If this warning triggers extend the direct_access_msrs list at the
> - * beginning of the file
> + * beginning of the file. The direct_access_x2apic_msrs is only for
> + * x2apic MSRs.
> */
> - WARN_ON(!valid_msr_intercept(msr));
> + WARN_ON(!valid_msr_intercept(msr, direct_access_msrs) &&
> + (boot_cpu_has(X86_FEATURE_X2AVIC) &&
> + !valid_msr_intercept(msr, direct_access_x2apic_msrs)));
>
> /* Enforce non allowed MSRs to trap */
> if (read && !kvm_msr_allowed(vcpu, msr, KVM_MSR_FILTER_READ))
> @@ -704,7 +711,16 @@ static void set_msr_interception_bitmap(struct kvm_vcpu *vcpu, u32 *msrpm,
> void set_msr_interception(struct kvm_vcpu *vcpu, u32 *msrpm, u32 msr,
> int read, int write)
> {
> - set_shadow_msr_intercept(vcpu, msr, read, write);
> + struct vcpu_svm *svm = to_svm(vcpu);
> +
> + if (msr < 0x800 || msr > 0x8ff)
> + set_shadow_msr_intercept(vcpu, direct_access_msrs, msr,
> + read, svm->shadow_msr_intercept.read,
> + write, svm->shadow_msr_intercept.write);
> + else
> + set_shadow_msr_intercept(vcpu, direct_access_x2apic_msrs, msr,
> + read, svm->shadow_x2apic_msr_intercept.read,
> + write, svm->shadow_x2apic_msr_intercept.write);
> set_msr_interception_bitmap(vcpu, msrpm, msr, read, write);
> }
>
> @@ -786,6 +802,22 @@ static void add_msr_offset(u32 offset)
> BUG();
> }
>
> +static void init_direct_access_x2apic_msrs(void)
> +{
> + int i;
> +
> + /* Initialize x2APIC direct_access_x2apic_msrs entries */
> + for (i = 0; i < NUM_DIRECT_ACCESS_X2APIC_MSRS; i++) {
> + direct_access_x2apic_msrs[i].index = boot_cpu_has(X86_FEATURE_X2AVIC) ?
> + (0x800 + i) : MSR_INVALID;
> + direct_access_x2apic_msrs[i].always = false;
> + }
> +
> + /* Initialize last entry */
> + direct_access_x2apic_msrs[i].index = MSR_INVALID;
> + direct_access_x2apic_msrs[i].always = false;
> +}
> +
> static void init_msrpm_offsets(void)
> {
> int i;
> @@ -4752,6 +4784,7 @@ static __init int svm_hardware_setup(void)
> memset(iopm_va, 0xff, PAGE_SIZE * (1 << order));
> iopm_base = page_to_pfn(iopm_pages) << PAGE_SHIFT;
>
> + init_direct_access_x2apic_msrs();
> init_msrpm_offsets();
>
> supported_xcr0 &= ~(XFEATURE_MASK_BNDREGS | XFEATURE_MASK_BNDCSR);
> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index bfbebb933da2..41514df5107e 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -29,6 +29,8 @@
>
> #define MAX_DIRECT_ACCESS_MSRS 20
> #define MSRPM_OFFSETS 16
> +#define NUM_DIRECT_ACCESS_X2APIC_MSRS 0x100
> +
> extern u32 msrpm_offsets[MSRPM_OFFSETS] __read_mostly;
> extern bool npt_enabled;
> extern bool intercept_smi;
> @@ -242,6 +244,11 @@ struct vcpu_svm {
> DECLARE_BITMAP(write, MAX_DIRECT_ACCESS_MSRS);
> } shadow_msr_intercept;
>
> + struct {
> + DECLARE_BITMAP(read, NUM_DIRECT_ACCESS_X2APIC_MSRS);
> + DECLARE_BITMAP(write, NUM_DIRECT_ACCESS_X2APIC_MSRS);
> + } shadow_x2apic_msr_intercept;
> +
> struct vcpu_sev_es_state sev_es;
>
> bool guest_state_loaded;

I only gave this a cursory look, the whole thing is a bit ugly (not your fault),
I feel like it should be refactored a bit.


Best regards,
Maxim Levitsky

2022-02-24 23:10:13

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [RFC PATCH 13/13] KVM: SVM: Use fastpath x2apic IPI emulation when #vmexit with x2AVIC

On Sun, 2022-02-20 at 20:19 -0600, Suravee Suthikulpanit wrote:
> When sends IPI to a halting vCPU, the hardware generates
> avic_incomplete_ipi #vmexit with the
> AVIC_IPI_FAILURE_TARGET_NOT_RUNNING reason.
>
> For x2AVIC, enable fastpath emulation.
>
> Signed-off-by: Suravee Suthikulpanit <[email protected]>
> ---
> arch/x86/kvm/svm/avic.c | 2 ++
> arch/x86/kvm/x86.c | 3 ++-
> arch/x86/kvm/x86.h | 1 +
> 3 files changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index 874c89f8fd47..758a79ee7f99 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -428,6 +428,8 @@ int avic_incomplete_ipi_interception(struct kvm_vcpu *vcpu)
> kvm_lapic_reg_write(apic, APIC_ICR, icrl);
> break;
> case AVIC_IPI_FAILURE_TARGET_NOT_RUNNING:
> + handle_fastpath_set_x2apic_icr_irqoff(vcpu, svm->vmcb->control.exit_info_1);

This just doesn't seem right - it sends IPI to the target, while we just need to wake it up.
avic_kick_target_vcpus already does all of this, and it really should be optimized to avoid
going over all vcpus as it does currently.

Best regards,
Maxim Levitsky



> +
> /*
> * At this point, we expect that the AVIC HW has already
> * set the appropriate IRR bits on the valid target
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 641044db415d..c293027c7c10 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -2008,7 +2008,7 @@ static inline bool kvm_vcpu_exit_request(struct kvm_vcpu *vcpu)
> * from guest to host, e.g. reacquiring KVM's SRCU lock. In contrast to the
> * other cases which must be called after interrupts are enabled on the host.
> */
> -static int handle_fastpath_set_x2apic_icr_irqoff(struct kvm_vcpu *vcpu, u64 data)
> +int handle_fastpath_set_x2apic_icr_irqoff(struct kvm_vcpu *vcpu, u64 data)
> {
> if (!lapic_in_kernel(vcpu) || !apic_x2apic_mode(vcpu->arch.apic))
> return 1;
> @@ -2028,6 +2028,7 @@ static int handle_fastpath_set_x2apic_icr_irqoff(struct kvm_vcpu *vcpu, u64 data
>
> return 1;
> }
> +EXPORT_SYMBOL_GPL(handle_fastpath_set_x2apic_icr_irqoff);
>
> static int handle_fastpath_set_tscdeadline(struct kvm_vcpu *vcpu, u64 data)
> {
> diff --git a/arch/x86/kvm/x86.h b/arch/x86/kvm/x86.h
> index 767ec7f99516..035d20f83ca6 100644
> --- a/arch/x86/kvm/x86.h
> +++ b/arch/x86/kvm/x86.h
> @@ -286,6 +286,7 @@ int x86_decode_emulated_instruction(struct kvm_vcpu *vcpu, int emulation_type,
> int x86_emulate_instruction(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
> int emulation_type, void *insn, int insn_len);
> fastpath_t handle_fastpath_set_msr_irqoff(struct kvm_vcpu *vcpu);
> +int handle_fastpath_set_x2apic_icr_irqoff(struct kvm_vcpu *vcpu, u64 data);
>
> extern u64 host_xcr0;
> extern u64 supported_xcr0;


2022-02-24 23:57:24

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [RFC PATCH 05/13] KVM: SVM: Update max number of vCPUs supported for x2AVIC mode

On Sun, 2022-02-20 at 20:19 -0600, Suravee Suthikulpanit wrote:
> xAVIC and x2AVIC modes can support diffferent number of vcpus.
> Update existing logics to support each mode accordingly.
>
> Also, modify the maximum physical APIC ID for AVIC to 255 to reflect
> the actual value supported by the architecture.
>
> Signed-off-by: Suravee Suthikulpanit <[email protected]>
> ---
> arch/x86/include/asm/svm.h | 12 +++++++++---
> arch/x86/kvm/svm/avic.c | 8 +++++---
> 2 files changed, 14 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
> index 7a7a2297165b..681a348a9365 100644
> --- a/arch/x86/include/asm/svm.h
> +++ b/arch/x86/include/asm/svm.h
> @@ -250,10 +250,16 @@ enum avic_ipi_failure_cause {
>
>
> /*
> - * 0xff is broadcast, so the max index allowed for physical APIC ID
> - * table is 0xfe. APIC IDs above 0xff are reserved.
> + * For AVIC, the max index allowed for physical APIC ID
> + * table is 0xff (255).
> */
> -#define AVIC_MAX_PHYSICAL_ID_COUNT 0xff
> +#define AVIC_MAX_PHYSICAL_ID 0XFFULL
> +
> +/*
> + * For x2AVIC, the max index allowed for physical APIC ID
> + * table is 0x1ff (511).
> + */
> +#define X2AVIC_MAX_PHYSICAL_ID 0x1FFUL

Yep, physid page can't hold more entries...

This brings the inventible question of what to do when a VM has more
that 512 vCPUs...

With AVIC, since it is xapic, it would be easy - xapic supports up to
254 CPUs.

But with x2apic, there is no such restriction on max 512 CPUs,
thus it is legal to create a VM with x2apic and more that 512 CPUs,
and x2AVIC won't work well in this case.

I guess AVIC_IPI_FAILURE_INVALID_TARGET, has to be extened to support those
cases, even with loss of performance, or we need to inhibit x2AVIC.

Best regards,
Maxim Levitsky

>
> #define AVIC_HPA_MASK ~((0xFFFULL << 52) | 0xFFF)
> #define VMCB_AVIC_APIC_BAR_MASK 0xFFFFFFFFFF000ULL
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index 0040824e4376..1999076966fd 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -195,7 +195,7 @@ void avic_init_vmcb(struct vcpu_svm *svm)
> vmcb->control.avic_backing_page = bpa & AVIC_HPA_MASK;
> vmcb->control.avic_logical_id = lpa & AVIC_HPA_MASK;
> vmcb->control.avic_physical_id = ppa & AVIC_HPA_MASK;
> - vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID_COUNT;
> + vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID;
> vmcb->control.avic_vapic_bar = APIC_DEFAULT_PHYS_BASE & VMCB_AVIC_APIC_BAR_MASK;
>
> if (kvm_apicv_activated(svm->vcpu.kvm))
> @@ -210,7 +210,8 @@ static u64 *avic_get_physical_id_entry(struct kvm_vcpu *vcpu,
> u64 *avic_physical_id_table;
> struct kvm_svm *kvm_svm = to_kvm_svm(vcpu->kvm);
>
> - if (index >= AVIC_MAX_PHYSICAL_ID_COUNT)
> + if ((avic_mode == AVIC_MODE_X1 && index > AVIC_MAX_PHYSICAL_ID) ||
> + (avic_mode == AVIC_MODE_X2 && index > X2AVIC_MAX_PHYSICAL_ID))
> return NULL;
>
> avic_physical_id_table = page_address(kvm_svm->avic_physical_id_table_page);
> @@ -257,7 +258,8 @@ static int avic_init_backing_page(struct kvm_vcpu *vcpu)
> int id = vcpu->vcpu_id;
> struct vcpu_svm *svm = to_svm(vcpu);
>
> - if (id >= AVIC_MAX_PHYSICAL_ID_COUNT)
> + if ((avic_mode == AVIC_MODE_X1 && id > AVIC_MAX_PHYSICAL_ID) ||
> + (avic_mode == AVIC_MODE_X2 && id > X2AVIC_MAX_PHYSICAL_ID))
> return -EINVAL;
>
> if (!vcpu->arch.apic->regs)


2022-02-25 05:06:41

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [RFC PATCH 11/13] KVM: SVM: Add logic to switch between APIC and x2APIC virtualization mode

On Sun, 2022-02-20 at 20:19 -0600, Suravee Suthikulpanit wrote:
> When an AVIC-enabled guest switch from APIC to x2APIC mode during runtime,
> the SVM driver needs to
>
> 1. Set the x2APIC mode bit for AVIC in VMCB along with the maximum
> APIC ID support for each mode accodingly.
>
> 2. Disable x2APIC MSRs interception in order to allow the hardware
> to virtualize x2APIC MSRs accesses.
>
> This is currently handled in the svm_refresh_apicv_exec_ctrl().
>
> Note that guest kerenel does not need to disable APIC before swtiching
> to x2APIC. Therefore the WARN_ON in vcpu_load() to check if the vCPU is
> currently running is no longer appropriate.
>
> Signed-off-by: Suravee Suthikulpanit <[email protected]>
> ---
> arch/x86/kvm/svm/avic.c | 61 +++++++++++++++++++++++++++++++++++++----
> 1 file changed, 55 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index 3543b7a4514a..3306b74f1d8b 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -79,6 +79,50 @@ static inline enum avic_modes avic_get_vcpu_apic_mode(struct vcpu_svm *svm)
> return AVIC_MODE_NONE;
> }
>
> +static inline void avic_set_x2apic_msr_interception(struct vcpu_svm *svm, bool disable)
> +{
> + int i;
> +
> + for (i = 0x800; i <= 0x8ff; i++)
> + set_msr_interception(&svm->vcpu, svm->msrpm, i,
> + !disable, !disable);
> +}
> +
> +void avic_activate_vmcb(struct vcpu_svm *svm)
> +{
> + struct vmcb *vmcb = svm->vmcb01.ptr;
> +
> + vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
> +
> + if (svm->x2apic_enabled) {
Use apic_x2apic_mode here as well

> + vmcb->control.int_ctl |= X2APIC_MODE_MASK;
> + vmcb->control.avic_physical_id &= ~X2AVIC_MAX_PHYSICAL_ID;
> + vmcb->control.avic_physical_id |= X2AVIC_MAX_PHYSICAL_ID;
Why not just use

phys_addr_t ppa = __sme_set(page_to_phys(kvm_svm->avic_physical_id_table_page));;
vmcb->control.avic_physical_id = ppa | X2AVIC_MAX_PHYSICAL_ID;

> + /* Disabling MSR intercept for x2APIC registers */
> + avic_set_x2apic_msr_interception(svm, false);
> + } else {
> + vmcb->control.avic_physical_id &= ~AVIC_MAX_PHYSICAL_ID;
> + vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID;
Same here....
> + /* Enabling MSR intercept for x2APIC registers */
> + avic_set_x2apic_msr_interception(svm, true);
> + }
> +}
> +
> +void avic_deactivate_vmcb(struct vcpu_svm *svm)
> +{
> + struct vmcb *vmcb = svm->vmcb01.ptr;
> +
> + vmcb->control.int_ctl &= ~(AVIC_ENABLE_MASK | X2APIC_MODE_MASK);
> +
> + if (svm->x2apic_enabled)
> + vmcb->control.avic_physical_id &= ~X2AVIC_MAX_PHYSICAL_ID;
> + else
> + vmcb->control.avic_physical_id &= ~AVIC_MAX_PHYSICAL_ID;
> +
> + /* Enabling MSR intercept for x2APIC registers */
> + avic_set_x2apic_msr_interception(svm, true);
> +}
> +
> /* Note:
> * This function is called from IOMMU driver to notify
> * SVM to schedule in a particular vCPU of a particular VM.
> @@ -195,13 +239,12 @@ void avic_init_vmcb(struct vcpu_svm *svm)
> vmcb->control.avic_backing_page = bpa & AVIC_HPA_MASK;
> vmcb->control.avic_logical_id = lpa & AVIC_HPA_MASK;
> vmcb->control.avic_physical_id = ppa & AVIC_HPA_MASK;
> - vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID;
> vmcb->control.avic_vapic_bar = APIC_DEFAULT_PHYS_BASE & VMCB_AVIC_APIC_BAR_MASK;
>
> if (kvm_apicv_activated(svm->vcpu.kvm))
> - vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
> + avic_activate_vmcb(svm);
Why not set AVIC_ENABLE_MASK in avic_activate_vmcb ?

> else
> - vmcb->control.int_ctl &= ~AVIC_ENABLE_MASK;
> + avic_deactivate_vmcb(svm);
> }
>
> static u64 *avic_get_physical_id_entry(struct kvm_vcpu *vcpu,
> @@ -657,6 +700,13 @@ void avic_update_vapic_bar(struct vcpu_svm *svm, u64 data)
> svm->x2apic_enabled ? "x2APIC" : "xAPIC");
> vmcb_mark_dirty(svm->vmcb, VMCB_AVIC);
> kvm_vcpu_update_apicv(&svm->vcpu);
> +
> + /*
> + * The VM could be running w/ AVIC activated switching from APIC
> + * to x2APIC mode. We need to all refresh to make sure that all
> + * x2AVIC configuration are being done.
> + */
> + svm_refresh_apicv_exec_ctrl(&svm->vcpu);


That also should be done in avic_set_virtual_apic_mode instead, but otherwise should be fine.

Also it seems that .avic_set_virtual_apic_mode will cover you on the case when x2apic is disabled
in the guest cpuid - kvm_set_apic_base checks if the guest cpuid has x2apic support and refuses
to enable it if it is not set.

But still a WARN_ON_ONCE won't hurt to see that you are not enabling x2avic when not supported.

> }
>
> void svm_set_virtual_apic_mode(struct kvm_vcpu *vcpu)
> @@ -722,9 +772,9 @@ void svm_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
> * accordingly before re-activating.
> */
> avic_post_state_restore(vcpu);
> - vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
> + avic_activate_vmcb(svm);
> } else {
> - vmcb->control.int_ctl &= ~AVIC_ENABLE_MASK;
> + avic_deactivate_vmcb(svm);
> }
> vmcb_mark_dirty(vmcb, VMCB_AVIC);
>
> @@ -1019,7 +1069,6 @@ void avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> return;
>
> entry = READ_ONCE(*(svm->avic_physical_id_cache));
> - WARN_ON(entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK);
Why?
>
> entry &= ~AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK;
> entry |= (h_physical_id & AVIC_PHYSICAL_ID_ENTRY_HOST_PHYSICAL_ID_MASK);

Best regards,
Maxim Levitsky


2022-03-01 11:49:11

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [RFC PATCH 05/13] KVM: SVM: Update max number of vCPUs supported for x2AVIC mode

On Tue, 2022-03-01 at 17:47 +0700, Suravee Suthikulpanit wrote:
> Hi Maxim,
>
> On 2/25/22 12:18 AM, Maxim Levitsky wrote:
> > On Sun, 2022-02-20 at 20:19 -0600, Suravee Suthikulpanit wrote:
> > > xAVIC and x2AVIC modes can support diffferent number of vcpus.
> > > Update existing logics to support each mode accordingly.
> > >
> > > Also, modify the maximum physical APIC ID for AVIC to 255 to reflect
> > > the actual value supported by the architecture.
> > >
> > > Signed-off-by: Suravee Suthikulpanit<[email protected]>
> > > ---
> > > arch/x86/include/asm/svm.h | 12 +++++++++---
> > > arch/x86/kvm/svm/avic.c | 8 +++++---
> > > 2 files changed, 14 insertions(+), 6 deletions(-)
> > >
> > > diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
> > > index 7a7a2297165b..681a348a9365 100644
> > > --- a/arch/x86/include/asm/svm.h
> > > +++ b/arch/x86/include/asm/svm.h
> > > @@ -250,10 +250,16 @@ enum avic_ipi_failure_cause {
> > >
> > >
> > > /*
> > > - * 0xff is broadcast, so the max index allowed for physical APIC ID
> > > - * table is 0xfe. APIC IDs above 0xff are reserved.
> > > + * For AVIC, the max index allowed for physical APIC ID
> > > + * table is 0xff (255).
> > > */
> > > -#define AVIC_MAX_PHYSICAL_ID_COUNT 0xff
> > > +#define AVIC_MAX_PHYSICAL_ID 0XFFULL
> > > +
> > > +/*
> > > + * For x2AVIC, the max index allowed for physical APIC ID
> > > + * table is 0x1ff (511).
> > > + */
> > > +#define X2AVIC_MAX_PHYSICAL_ID 0x1FFUL
> > Yep, physid page can't hold more entries...
> >
> > This brings the inventible question of what to do when a VM has more
> > that 512 vCPUs...
> >
> > With AVIC, since it is xapic, it would be easy - xapic supports up to
> > 254 CPUs.
>
> Actually, 255 vCPUs.

Sorry for off-by-one mistake - just remembered that 0xFF is reserved,
but then 255 is already 1 less that 256.

>
> > But with x2apic, there is no such restriction on max 512 CPUs,
> > thus it is legal to create a VM with x2apic and more that 512 CPUs,
> > and x2AVIC won't work well in this case.
> >
> > I guess AVIC_IPI_FAILURE_INVALID_TARGET, has to be extened to support those
> > cases, even with loss of performance, or we need to inhibit x2AVIC.
>
> In case of x2APIC-enabled guest w/ vCPU exceeding the max APIC ID (512) limit,
> the ioctl operation for KVM_CREATE_VCPU will fail. For QEMU, this would
> exit with error code. Would this be sufficient?
Yes, this is the best.


Best regards,
Maxim Levitsky


>
> Regards,
> Suravee
>
>
>


2022-03-01 12:50:13

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: Re: [RFC PATCH 04/13] KVM: SVM: Only call vcpu_(un)blocking when AVIC is enabled.

Hi Maxim,

On 2/24/22 11:54 PM, Maxim Levitsky wrote:
> On Sun, 2022-02-20 at 20:19 -0600, Suravee Suthikulpanit wrote:
>> The kvm_x86_ops.vcpu_(un)blocking are needed by AVIC only.
>> Therefore, set the ops only when AVIC is enabled.
>>
>> Suggested-by: Sean Christopherson<[email protected]>
>> Signed-off-by: Suravee Suthikulpanit<[email protected]>
>> ---
>> arch/x86/kvm/svm/avic.c | 12 ++++++++++--
>> arch/x86/kvm/svm/svm.c | 7 -------
>> arch/x86/kvm/svm/svm.h | 2 --
>> 3 files changed, 10 insertions(+), 11 deletions(-)
>>
>> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
>> index abde08ca23ab..0040824e4376 100644
>> --- a/arch/x86/kvm/svm/avic.c
>> +++ b/arch/x86/kvm/svm/avic.c
>> @@ -996,7 +996,7 @@ void avic_vcpu_put(struct kvm_vcpu *vcpu)
>> WRITE_ONCE(*(svm->avic_physical_id_cache), entry);
>> }
>>
>> -void avic_vcpu_blocking(struct kvm_vcpu *vcpu)
>> +static void avic_vcpu_blocking(struct kvm_vcpu *vcpu)
>> {
>> if (!kvm_vcpu_apicv_active(vcpu))
>> return;
>> @@ -1021,7 +1021,7 @@ void avic_vcpu_blocking(struct kvm_vcpu *vcpu)
>> preempt_enable();
>> }
>>
>> -void avic_vcpu_unblocking(struct kvm_vcpu *vcpu)
>> +static void avic_vcpu_unblocking(struct kvm_vcpu *vcpu)
>> {
>> int cpu;
>>
>> @@ -1057,6 +1057,14 @@ bool avic_hardware_setup(struct kvm_x86_ops *x86_ops)
>> pr_info("x2AVIC enabled\n");
>> }
>>
>> + if (avic_mode) {
>> + x86_ops->vcpu_blocking = avic_vcpu_blocking;
>> + x86_ops->vcpu_unblocking = avic_vcpu_unblocking;
>> + } else {
>> + x86_ops->vcpu_blocking = NULL;
>> + x86_ops->vcpu_unblocking = NULL;
>> + }
>> +
>> amd_iommu_register_ga_log_notifier(&avic_ga_log_notifier);
>> return !!avic_mode;
>> }
>> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
>> index 3048f4b758d6..3687026f2859 100644
>> --- a/arch/x86/kvm/svm/svm.c
>> +++ b/arch/x86/kvm/svm/svm.c
>> @@ -4531,8 +4531,6 @@ static struct kvm_x86_ops svm_x86_ops __initdata = {
>> .prepare_guest_switch = svm_prepare_guest_switch,
>> .vcpu_load = svm_vcpu_load,
>> .vcpu_put = svm_vcpu_put,
>> - .vcpu_blocking = avic_vcpu_blocking,
>> - .vcpu_unblocking = avic_vcpu_unblocking,
>>
>> .update_exception_bitmap = svm_update_exception_bitmap,
>> .get_msr_feature = svm_get_msr_feature,
>> @@ -4819,11 +4817,6 @@ static __init int svm_hardware_setup(void)
>>
>> enable_apicv = avic = avic && avic_hardware_setup(&svm_x86_ops);
>>
>> - if (!enable_apicv) {
>> - svm_x86_ops.vcpu_blocking = NULL;
>> - svm_x86_ops.vcpu_unblocking = NULL;
>> - }
> Isn't this code already zeros these callbacks when avic is not enabled?

Ah, right. I'll remove the setting to NULL.

> I am not sure why this patch is needed to be honest.

It's not related to x2AVIC. It was recommended by Sean earlier
in another patch series:

https://lore.kernel.org/lkml/[email protected]/

Since this series introduces the helper function avic_hardware_setup(),
and re-factor the AVIC setup code into the function. So, I am including
his recommendation in this series instead..

Regards,
Suravee

2022-03-01 15:46:03

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: Re: [RFC PATCH 05/13] KVM: SVM: Update max number of vCPUs supported for x2AVIC mode

Hi Maxim,

On 2/25/22 12:18 AM, Maxim Levitsky wrote:
> On Sun, 2022-02-20 at 20:19 -0600, Suravee Suthikulpanit wrote:
>> xAVIC and x2AVIC modes can support diffferent number of vcpus.
>> Update existing logics to support each mode accordingly.
>>
>> Also, modify the maximum physical APIC ID for AVIC to 255 to reflect
>> the actual value supported by the architecture.
>>
>> Signed-off-by: Suravee Suthikulpanit<[email protected]>
>> ---
>> arch/x86/include/asm/svm.h | 12 +++++++++---
>> arch/x86/kvm/svm/avic.c | 8 +++++---
>> 2 files changed, 14 insertions(+), 6 deletions(-)
>>
>> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
>> index 7a7a2297165b..681a348a9365 100644
>> --- a/arch/x86/include/asm/svm.h
>> +++ b/arch/x86/include/asm/svm.h
>> @@ -250,10 +250,16 @@ enum avic_ipi_failure_cause {
>>
>>
>> /*
>> - * 0xff is broadcast, so the max index allowed for physical APIC ID
>> - * table is 0xfe. APIC IDs above 0xff are reserved.
>> + * For AVIC, the max index allowed for physical APIC ID
>> + * table is 0xff (255).
>> */
>> -#define AVIC_MAX_PHYSICAL_ID_COUNT 0xff
>> +#define AVIC_MAX_PHYSICAL_ID 0XFFULL
>> +
>> +/*
>> + * For x2AVIC, the max index allowed for physical APIC ID
>> + * table is 0x1ff (511).
>> + */
>> +#define X2AVIC_MAX_PHYSICAL_ID 0x1FFUL
> Yep, physid page can't hold more entries...
>
> This brings the inventible question of what to do when a VM has more
> that 512 vCPUs...
>
> With AVIC, since it is xapic, it would be easy - xapic supports up to
> 254 CPUs.

Actually, 255 vCPUs.

> But with x2apic, there is no such restriction on max 512 CPUs,
> thus it is legal to create a VM with x2apic and more that 512 CPUs,
> and x2AVIC won't work well in this case.
>
> I guess AVIC_IPI_FAILURE_INVALID_TARGET, has to be extened to support those
> cases, even with loss of performance, or we need to inhibit x2AVIC.

In case of x2APIC-enabled guest w/ vCPU exceeding the max APIC ID (512) limit,
the ioctl operation for KVM_CREATE_VCPU will fail. For QEMU, this would
exit with error code. Would this be sufficient?

Regards,
Suravee



2022-03-03 02:12:31

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: Re: [RFC PATCH 06/13] KVM: SVM: Add logic to determine x2APIC mode

Hi Maxim,

On 2/25/2022 12:29 AM, Maxim Levitsky wrote:
>> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
>> index 1a0bf6b853df..bfbebb933da2 100644
>> --- a/arch/x86/kvm/svm/svm.h
>> +++ b/arch/x86/kvm/svm/svm.h
>> @@ -225,6 +225,7 @@ struct vcpu_svm {
>> u32 dfr_reg;
>> struct page *avic_backing_page;
>> u64 *avic_physical_id_cache;
>> + bool x2apic_enabled;
>>
>> /*
>> * Per-vcpu list of struct amd_svm_iommu_ir:
>> @@ -566,6 +567,7 @@ void avic_init_vmcb(struct vcpu_svm *svm);
>> int avic_incomplete_ipi_interception(struct kvm_vcpu *vcpu);
>> int avic_unaccelerated_access_interception(struct kvm_vcpu *vcpu);
>> int avic_init_vcpu(struct vcpu_svm *svm);
>> +void avic_update_vapic_bar(struct vcpu_svm *svm, u64 data);
>> void avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu);
>> void avic_vcpu_put(struct kvm_vcpu *vcpu);
>> void avic_post_state_restore(struct kvm_vcpu *vcpu);
>
> Have you looked at how this is done on Intel's APICv side?
> You need to implement .set_virtual_apic_mode instead.
> (look at vmx_set_virtual_apic_mode)

Actually, that would be better. I'll update this part to use svm_set_virtual_apic_mode,
which is doing nothing at the moment.

Regards,
Suravee

2022-03-03 15:09:46

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: Re: [RFC PATCH 07/13] KVM: SVM: Update avic_kick_target_vcpus to support 32-bit APIC ID

Maxim,

On 2/25/22 12:35 AM, Maxim Levitsky wrote:
> On Sun, 2022-02-20 at 20:19 -0600, Suravee Suthikulpanit wrote:
>> In x2APIC mode, ICRH contains 32-bit destination APIC ID.
>> So, update the avic_kick_target_vcpus() accordingly.
>>
>> Signed-off-by: Suravee Suthikulpanit<[email protected]>
>> ---
>> arch/x86/kvm/svm/avic.c | 13 +++++++++----
>> 1 file changed, 9 insertions(+), 4 deletions(-)
>>
>> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
>> index 60f30e48d816..215d8a7dbc1d 100644
>> --- a/arch/x86/kvm/svm/avic.c
>> +++ b/arch/x86/kvm/svm/avic.c
>> @@ -307,10 +307,16 @@ void avic_ring_doorbell(struct kvm_vcpu *vcpu)
>> }
>>
>> static void avic_kick_target_vcpus(struct kvm *kvm, struct kvm_lapic *source,
>> - u32 icrl, u32 icrh)
>> + u32 icrl, u32 icrh, bool x2apic_enabled)
>> {
>> struct kvm_vcpu *vcpu;
>> unsigned long i;
>> + u32 dest;
>> +
>> + if (x2apic_enabled)
>> + dest = icrh;
>> + else
>> + dest = GET_APIC_DEST_FIELD(icrh);
>
> Just use 'apic_x2apic_mode(apic)', no need for x2apic_enabled parameter
> as I said in patch 6.
>
> Also maybe rename GET_APIC_DEST_FIELD to GET_XAPIC_DEST_FIELD or something as it is
> wrong for x2apic.

I'll send a separate patch to rename the macros as you suggested.

Regards,
Suravee

2022-03-03 16:04:11

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: Re: [RFC PATCH 06/13] KVM: SVM: Add logic to determine x2APIC mode

Hi Maxim

On 2/25/22 12:29 AM, Maxim Levitsky wrote:
>> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
>> index 1a0bf6b853df..bfbebb933da2 100644
>> --- a/arch/x86/kvm/svm/svm.h
>> +++ b/arch/x86/kvm/svm/svm.h
>> @@ -225,6 +225,7 @@ struct vcpu_svm {
>> u32 dfr_reg;
>> struct page *avic_backing_page;
>> u64 *avic_physical_id_cache;
>> + bool x2apic_enabled;
>>
>> /*
>> * Per-vcpu list of struct amd_svm_iommu_ir:
>> @@ -566,6 +567,7 @@ void avic_init_vmcb(struct vcpu_svm *svm);
>> int avic_incomplete_ipi_interception(struct kvm_vcpu *vcpu);
>> int avic_unaccelerated_access_interception(struct kvm_vcpu *vcpu);
>> int avic_init_vcpu(struct vcpu_svm *svm);
>> +void avic_update_vapic_bar(struct vcpu_svm *svm, u64 data);
>> void avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu);
>> void avic_vcpu_put(struct kvm_vcpu *vcpu);
>> void avic_post_state_restore(struct kvm_vcpu *vcpu);
>
> ....
>
> I also don't think you need x2apic_enabled boolean.
> You can already know if a vCPU uses apic or x2apic via
>
> kvm_get_apic_mode(vcpu);
>
> in fact I don't think avic code should have any bookeeping in regard to x2apic/x2avic mode,
> but rather kvm's apic mode (which is read directly from apic base msr (vcpu->arch.apic_base),
> should enable avic, or x2avic if possible, or inhibit avic if not possible.
>
> e.g it should drive the bits in vmcb and such.

I'll also clean this up in V2.

Regards,
Suravee

2022-03-04 13:49:13

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: Re: [RFC PATCH 11/13] KVM: SVM: Add logic to switch between APIC and x2APIC virtualization mode

Maxim,

On 2/25/22 3:03 AM, Maxim Levitsky wrote:
> On Sun, 2022-02-20 at 20:19 -0600, Suravee Suthikulpanit wrote:
>> ....
>>
>> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
>> index 3543b7a4514a..3306b74f1d8b 100644
>> --- a/arch/x86/kvm/svm/avic.c
>> +++ b/arch/x86/kvm/svm/avic.c
>> @@ -79,6 +79,50 @@ static inline enum avic_modes avic_get_vcpu_apic_mode(struct vcpu_svm *svm)
>> return AVIC_MODE_NONE;
>> }
>>
>> +static inline void avic_set_x2apic_msr_interception(struct vcpu_svm *svm, bool disable)
>> +{
>> + int i;
>> +
>> + for (i = 0x800; i <= 0x8ff; i++)
>> + set_msr_interception(&svm->vcpu, svm->msrpm, i,
>> + !disable, !disable);
>> +}
>> +
>> +void avic_activate_vmcb(struct vcpu_svm *svm)
>> +{
>> + struct vmcb *vmcb = svm->vmcb01.ptr;
>> +
>> + vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
>> +
>> + if (svm->x2apic_enabled) {
> Use apic_x2apic_mode here as well

Okay

>
>> + vmcb->control.int_ctl |= X2APIC_MODE_MASK;
>> + vmcb->control.avic_physical_id &= ~X2AVIC_MAX_PHYSICAL_ID;
>> + vmcb->control.avic_physical_id |= X2AVIC_MAX_PHYSICAL_ID;
> Why not just use
>
> phys_addr_t ppa = __sme_set(page_to_phys(kvm_svm->avic_physical_id_table_page));;
> vmcb->control.avic_physical_id = ppa | X2AVIC_MAX_PHYSICAL_ID;
>

Sorry, I don't quiet understand this part. We just want to update certain bits in the VMCB register.

>> ...
>> +void avic_deactivate_vmcb(struct vcpu_svm *svm)
>> +{
>> + struct vmcb *vmcb = svm->vmcb01.ptr;
>> +
>> + vmcb->control.int_ctl &= ~(AVIC_ENABLE_MASK | X2APIC_MODE_MASK);
>> +
>> + if (svm->x2apic_enabled)
>> + vmcb->control.avic_physical_id &= ~X2AVIC_MAX_PHYSICAL_ID;
>> + else
>> + vmcb->control.avic_physical_id &= ~AVIC_MAX_PHYSICAL_ID;
>> +
>> + /* Enabling MSR intercept for x2APIC registers */
>> + avic_set_x2apic_msr_interception(svm, true);
>> +}
>> +
>> /* Note:
>> * This function is called from IOMMU driver to notify
>> * SVM to schedule in a particular vCPU of a particular VM.
>> @@ -195,13 +239,12 @@ void avic_init_vmcb(struct vcpu_svm *svm)
>> vmcb->control.avic_backing_page = bpa & AVIC_HPA_MASK;
>> vmcb->control.avic_logical_id = lpa & AVIC_HPA_MASK;
>> vmcb->control.avic_physical_id = ppa & AVIC_HPA_MASK;
>> - vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID;
>> vmcb->control.avic_vapic_bar = APIC_DEFAULT_PHYS_BASE & VMCB_AVIC_APIC_BAR_MASK;
>>
>> if (kvm_apicv_activated(svm->vcpu.kvm))
>> - vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
>> + avic_activate_vmcb(svm);
> Why not set AVIC_ENABLE_MASK in avic_activate_vmcb ?

It's already doing "vmcb->control.int_ctl |= X2APIC_MODE_MASK;" in avic_activate_vmcb().

>> else
>> - vmcb->control.int_ctl &= ~AVIC_ENABLE_MASK;
>> + avic_deactivate_vmcb(svm);
>> }
>>
>> static u64 *avic_get_physical_id_entry(struct kvm_vcpu *vcpu,
>> @@ -657,6 +700,13 @@ void avic_update_vapic_bar(struct vcpu_svm *svm, u64 data)
>> svm->x2apic_enabled ? "x2APIC" : "xAPIC");
>> vmcb_mark_dirty(svm->vmcb, VMCB_AVIC);
>> kvm_vcpu_update_apicv(&svm->vcpu);
>> +
>> + /*
>> + * The VM could be running w/ AVIC activated switching from APIC
>> + * to x2APIC mode. We need to all refresh to make sure that all
>> + * x2AVIC configuration are being done.
>> + */
>> + svm_refresh_apicv_exec_ctrl(&svm->vcpu);
>
>
> That also should be done in avic_set_virtual_apic_mode instead, but otherwise should be fine.

Agree, and will be updated to use svm_set_virtual_apic_mode() in v2.

> Also it seems that .avic_set_virtual_apic_mode will cover you on the case when x2apic is disabled
> in the guest cpuid - kvm_set_apic_base checks if the guest cpuid has x2apic support and refuses
> to enable it if it is not set.
>
> But still a WARN_ON_ONCE won't hurt to see that you are not enabling x2avic when not supported.

Not sure if we need this. The logic for activating x2AVIC in VMCB already
check if the guest x2APIC mode is set, which can only happen if x2APIC CPUID
is set.

>> }
>>
>> void svm_set_virtual_apic_mode(struct kvm_vcpu *vcpu)
>> @@ -722,9 +772,9 @@ void svm_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
>> * accordingly before re-activating.
>> */
>> avic_post_state_restore(vcpu);
>> - vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
>> + avic_activate_vmcb(svm);
>> } else {
>> - vmcb->control.int_ctl &= ~AVIC_ENABLE_MASK;
>> + avic_deactivate_vmcb(svm);
>> }
>> vmcb_mark_dirty(vmcb, VMCB_AVIC);
>>
>> @@ -1019,7 +1069,6 @@ void avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
>> return;
>>
>> entry = READ_ONCE(*(svm->avic_physical_id_cache));
>> - WARN_ON(entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK);
> Why?

For AVIC, this WARN_ON is designed to catch the scenario when the vCPU is calling
avic_vcpu_load() while it is already running. However, with x2AVIC support,
the vCPU can switch from xAPIC to x2APIC mode while in running state
(i.e. the AVIC is_running is set). This warning is currently observed due to
the call from svm_refresh_apicv_exec_ctrl().

Regards,
Suravee

2022-03-04 20:30:11

by Maxim Levitsky

[permalink] [raw]
Subject: Re: [RFC PATCH 11/13] KVM: SVM: Add logic to switch between APIC and x2APIC virtualization mode

On Fri, 2022-03-04 at 18:22 +0700, Suravee Suthikulpanit wrote:
> Maxim,
>
> On 2/25/22 3:03 AM, Maxim Levitsky wrote:
> > On Sun, 2022-02-20 at 20:19 -0600, Suravee Suthikulpanit wrote:
> > > ....
> > >
> > > diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> > > index 3543b7a4514a..3306b74f1d8b 100644
> > > --- a/arch/x86/kvm/svm/avic.c
> > > +++ b/arch/x86/kvm/svm/avic.c
> > > @@ -79,6 +79,50 @@ static inline enum avic_modes avic_get_vcpu_apic_mode(struct vcpu_svm *svm)
> > > return AVIC_MODE_NONE;
> > > }
> > >
> > > +static inline void avic_set_x2apic_msr_interception(struct vcpu_svm *svm, bool disable)
> > > +{
> > > + int i;
> > > +
> > > + for (i = 0x800; i <= 0x8ff; i++)
> > > + set_msr_interception(&svm->vcpu, svm->msrpm, i,
> > > + !disable, !disable);
> > > +}
> > > +
> > > +void avic_activate_vmcb(struct vcpu_svm *svm)
> > > +{
> > > + struct vmcb *vmcb = svm->vmcb01.ptr;
> > > +
> > > + vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
> > > +
> > > + if (svm->x2apic_enabled) {
> > Use apic_x2apic_mode here as well
>
> Okay
>
> > > + vmcb->control.int_ctl |= X2APIC_MODE_MASK;
> > > + vmcb->control.avic_physical_id &= ~X2AVIC_MAX_PHYSICAL_ID;
> > > + vmcb->control.avic_physical_id |= X2AVIC_MAX_PHYSICAL_ID;
> > Why not just use
> >
> > phys_addr_t ppa = __sme_set(page_to_phys(kvm_svm->avic_physical_id_table_page));;
> > vmcb->control.avic_physical_id = ppa | X2AVIC_MAX_PHYSICAL_ID;
> >
>
> Sorry, I don't quiet understand this part. We just want to update certain bits in the VMCB register.

It seems a bit cleaner to me to create that field again instead of erasing bits like that.
But honestly I don't mind it that much.


>
> > > ...
> > > +void avic_deactivate_vmcb(struct vcpu_svm *svm)
> > > +{
> > > + struct vmcb *vmcb = svm->vmcb01.ptr;
> > > +
> > > + vmcb->control.int_ctl &= ~(AVIC_ENABLE_MASK | X2APIC_MODE_MASK);
> > > +
> > > + if (svm->x2apic_enabled)
> > > + vmcb->control.avic_physical_id &= ~X2AVIC_MAX_PHYSICAL_ID;
> > > + else
> > > + vmcb->control.avic_physical_id &= ~AVIC_MAX_PHYSICAL_ID;
> > > +
> > > + /* Enabling MSR intercept for x2APIC registers */
> > > + avic_set_x2apic_msr_interception(svm, true);
> > > +}
> > > +
> > > /* Note:
> > > * This function is called from IOMMU driver to notify
> > > * SVM to schedule in a particular vCPU of a particular VM.
> > > @@ -195,13 +239,12 @@ void avic_init_vmcb(struct vcpu_svm *svm)
> > > vmcb->control.avic_backing_page = bpa & AVIC_HPA_MASK;
> > > vmcb->control.avic_logical_id = lpa & AVIC_HPA_MASK;
> > > vmcb->control.avic_physical_id = ppa & AVIC_HPA_MASK;
> > > - vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID;
> > > vmcb->control.avic_vapic_bar = APIC_DEFAULT_PHYS_BASE & VMCB_AVIC_APIC_BAR_MASK;
> > >
> > > if (kvm_apicv_activated(svm->vcpu.kvm))
> > > - vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
> > > + avic_activate_vmcb(svm);
> > Why not set AVIC_ENABLE_MASK in avic_activate_vmcb ?
>
> It's already doing "vmcb->control.int_ctl |= X2APIC_MODE_MASK;" in avic_activate_vmcb().
Yes, but why not also to set/clear AVIC_ENABLE_MASK there as well?

>
> > > else
> > > - vmcb->control.int_ctl &= ~AVIC_ENABLE_MASK;
> > > + avic_deactivate_vmcb(svm);
> > > }
> > >
> > > static u64 *avic_get_physical_id_entry(struct kvm_vcpu *vcpu,
> > > @@ -657,6 +700,13 @@ void avic_update_vapic_bar(struct vcpu_svm *svm, u64 data)
> > > svm->x2apic_enabled ? "x2APIC" : "xAPIC");
> > > vmcb_mark_dirty(svm->vmcb, VMCB_AVIC);
> > > kvm_vcpu_update_apicv(&svm->vcpu);
> > > +
> > > + /*
> > > + * The VM could be running w/ AVIC activated switching from APIC
> > > + * to x2APIC mode. We need to all refresh to make sure that all
> > > + * x2AVIC configuration are being done.
> > > + */
> > > + svm_refresh_apicv_exec_ctrl(&svm->vcpu);
> >
> > That also should be done in avic_set_virtual_apic_mode instead, but otherwise should be fine.
>
> Agree, and will be updated to use svm_set_virtual_apic_mode() in v2.
>
> > Also it seems that .avic_set_virtual_apic_mode will cover you on the case when x2apic is disabled
> > in the guest cpuid - kvm_set_apic_base checks if the guest cpuid has x2apic support and refuses
> > to enable it if it is not set.
> >
> > But still a WARN_ON_ONCE won't hurt to see that you are not enabling x2avic when not supported.
>
> Not sure if we need this. The logic for activating x2AVIC in VMCB already
> check if the guest x2APIC mode is set, which can only happen if x2APIC CPUID
> is set.
I don't mind that much, just a suggestion.

>
> > > }
> > >
> > > void svm_set_virtual_apic_mode(struct kvm_vcpu *vcpu)
> > > @@ -722,9 +772,9 @@ void svm_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
> > > * accordingly before re-activating.
> > > */
> > > avic_post_state_restore(vcpu);
> > > - vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
> > > + avic_activate_vmcb(svm);
> > > } else {
> > > - vmcb->control.int_ctl &= ~AVIC_ENABLE_MASK;
> > > + avic_deactivate_vmcb(svm);
> > > }
> > > vmcb_mark_dirty(vmcb, VMCB_AVIC);
> > >
> > > @@ -1019,7 +1069,6 @@ void avic_vcpu_load(struct kvm_vcpu *vcpu, int cpu)
> > > return;
> > >
> > > entry = READ_ONCE(*(svm->avic_physical_id_cache));
> > > - WARN_ON(entry & AVIC_PHYSICAL_ID_ENTRY_IS_RUNNING_MASK);
> > Why?
>
> For AVIC, this WARN_ON is designed to catch the scenario when the vCPU is calling
> avic_vcpu_load() while it is already running. However, with x2AVIC support,
> the vCPU can switch from xAPIC to x2APIC mode while in running state
> (i.e. the AVIC is_running is set). This warning is currently observed due to
> the call from svm_refresh_apicv_exec_ctrl().

Ah, understand you now!

Best regards,
Thanks,
Maxim Levitsky

>
> Regards,
> Suravee
>


2022-03-07 08:08:54

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: Re: [RFC PATCH 13/13] KVM: SVM: Use fastpath x2apic IPI emulation when #vmexit with x2AVIC

Maxim,

On 2/25/2022 3:12 AM, Maxim Levitsky wrote:
> On Sun, 2022-02-20 at 20:19 -0600, Suravee Suthikulpanit wrote:
>> When sends IPI to a halting vCPU, the hardware generates
>> avic_incomplete_ipi #vmexit with the
>> AVIC_IPI_FAILURE_TARGET_NOT_RUNNING reason.
>>
>> For x2AVIC, enable fastpath emulation.
>>
>> Signed-off-by: Suravee Suthikulpanit<[email protected]>
>> ---
>> arch/x86/kvm/svm/avic.c | 2 ++
>> arch/x86/kvm/x86.c | 3 ++-
>> arch/x86/kvm/x86.h | 1 +
>> 3 files changed, 5 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
>> index 874c89f8fd47..758a79ee7f99 100644
>> --- a/arch/x86/kvm/svm/avic.c
>> +++ b/arch/x86/kvm/svm/avic.c
>> @@ -428,6 +428,8 @@ int avic_incomplete_ipi_interception(struct kvm_vcpu *vcpu)
>> kvm_lapic_reg_write(apic, APIC_ICR, icrl);
>> break;
>> case AVIC_IPI_FAILURE_TARGET_NOT_RUNNING:
>> + handle_fastpath_set_x2apic_icr_irqoff(vcpu, svm->vmcb->control.exit_info_1);
> This just doesn't seem right - it sends IPI to the target, while we just need to wake it up.
> avic_kick_target_vcpus already does all of this, and it really should be optimized to avoid
> going over all vcpus as it does currently.

Ah, you are right. I'll remove this patch.

Regards,
Suravee

2022-03-07 19:41:03

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: Re: [RFC PATCH 10/13] KVM: SVM: Adding support for configuring x2APIC MSRs interception

Maxim,

On 2/25/2022 2:51 AM, Maxim Levitsky wrote:
> On Sun, 2022-02-20 at 20:19 -0600, Suravee Suthikulpanit wrote:
>
> I only gave this a cursory look, the whole thing is a bit ugly (not your fault),
> I feel like it should be refactored a bit.

I agree. I am not sure how to make it more friendly. Any suggestions on the refactoring?

Regards,
Suravee

2022-03-08 18:23:56

by Suthikulpanit, Suravee

[permalink] [raw]
Subject: Re: [RFC PATCH 08/13] KVM: SVM: Do not update logical APIC ID table when in x2APIC mode

Maxim,

On 2/25/2022 12:41 AM, Maxim Levitsky wrote:
> On Sun, 2022-02-20 at 20:19 -0600, Suravee Suthikulpanit wrote:
>> In X2APIC mode the Logical Destination Register is read-only,
>> which provides a fixed mapping between the logical and physical
>> APIC IDs. Therefore, there is no Logical APIC ID table in X2AVIC
>> and the processor uses the X2APIC ID in the backing page to create
>> a vCPU’s logical ID.
>>
>> Therefore, add logic to check x2APIC mode before updating logical
>> APIC ID table.
>>
>> Signed-off-by: Suravee Suthikulpanit<[email protected]>
>> ---
>> arch/x86/kvm/svm/avic.c | 11 ++++++++++-
>> 1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
>> index 215d8a7dbc1d..55b3b703b93b 100644
>> --- a/arch/x86/kvm/svm/avic.c
>> +++ b/arch/x86/kvm/svm/avic.c
>> @@ -417,6 +417,10 @@ static int avic_ldr_write(struct kvm_vcpu *vcpu, u8 g_physical_id, u32 ldr)
>> bool flat;
>> u32 *entry, new_entry;
>>
>> + /* Note: x2AVIC does not use logical APIC ID table */
>> + if (apic_x2apic_mode(vcpu->arch.apic))
>> + return 0;
>> +
>> flat = kvm_lapic_get_reg(vcpu->arch.apic, APIC_DFR) == APIC_DFR_FLAT;
>> entry = avic_get_logical_id_entry(vcpu, ldr, flat);
>> if (!entry)
>> @@ -435,8 +439,13 @@ static void avic_invalidate_logical_id_entry(struct kvm_vcpu *vcpu)
>> {
>> struct vcpu_svm *svm = to_svm(vcpu);
>> bool flat = svm->dfr_reg == APIC_DFR_FLAT;
>> - u32 *entry = avic_get_logical_id_entry(vcpu, svm->ldr_reg, flat);
>> + u32 *entry;
>> +
>> + /* Note: x2AVIC does not use logical APIC ID table */
>> + if (apic_x2apic_mode(vcpu->arch.apic))
>> + return;
>>
>> + entry = avic_get_logical_id_entry(vcpu, svm->ldr_reg, flat);
>> if (entry)
>> clear_bit(AVIC_LOGICAL_ID_ENTRY_VALID_BIT, (unsigned long *)entry);
>> }
>
> Here actually the good apic_x2apic_mode was used.
>
> However, shouldn't we inject #GP in avic_ldr_write to make this read realy read-only?
> It might be too late to do so here, since most AVIC writes are trap like.

I'm checking to see how HW would respond to LDR write in x2AVIC enabled case.

> Thus we need to make the msr that corresponds to LDR to be write protected in the msr bitmap,
> and inject #GP when write it attempted.

Actually, we can setup the MSR interception for LDR register (0x80d) to intercept
into hypervisor (i.e. not virtualized by AVIC HW), and let the current KVM
implementation handles the WRMSR emulation (i.e. inject #GP). Would that be sufficient?

Regards,
Suravee