2022-03-09 00:43:45

by Suthikulpanit, Suravee

Subject: [RFCv2 PATCH 00/12] Introducing AMD x2APIC Virtualization (x2AVIC) support.

Previously, with AVIC, a guest had to disable the x2APIC capability and
could only run in APIC mode to use hardware-accelerated interrupt
virtualization. With x2AVIC, the guest can run in x2APIC mode.
This feature is indicated by CPUID Fn8000_000A EDX[14],
and it is activated by setting bit 31 (enable AVIC) and
bit 30 (x2APIC mode) of VMCB offset 60h.

The interrupt virtualization mode can change dynamically at runtime.
For example, when AVIC is enabled, the hypervisor currently keeps track of
the AVIC activation state and sets VMCB bit 31 accordingly. With x2AVIC,
the guest OS can also switch between APIC and x2APIC modes at runtime.
The kvm_amd driver also needs to keep track of this and set VMCB
bit 30 accordingly.

Besides, for x2AVIC, the kvm_amd driver needs to disable interception of the
x2APIC MSR range to allow the AVIC hardware to virtualize register accesses.
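
As a rough illustration of the detection and activation bits described
above, the following is a minimal user-space sketch, not the kvm_amd code.
The mask names AVIC_ENABLE_MASK and X2APIC_MODE_MASK match the ones used
later in this series; the helper names cpu_has_x2avic() and avic_int_ctl()
are hypothetical and introduced only for illustration.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit positions taken from the description above (VMCB offset 60h). */
#define AVIC_ENABLE_MASK	(1u << 31)	/* bit 31: enable AVIC */
#define X2APIC_MODE_MASK	(1u << 30)	/* bit 30: x2APIC mode */

/* CPUID Fn8000_000A EDX[14] indicates x2AVIC support. */
#define X2AVIC_CPUID_EDX_BIT	14

/* Hypothetical helper: test the x2AVIC bit in a raw EDX value. */
static bool cpu_has_x2avic(uint32_t edx)
{
	return (edx >> X2AVIC_CPUID_EDX_BIT) & 1u;
}

/*
 * Hypothetical helper: return int_ctl with the AVIC bits set for the
 * requested state. Both bits are cleared first so that stale state
 * from a previous mode never leaks through.
 */
static uint32_t avic_int_ctl(uint32_t int_ctl, bool avic_active, bool x2apic)
{
	int_ctl &= ~(AVIC_ENABLE_MASK | X2APIC_MODE_MASK);
	if (!avic_active)
		return int_ctl;

	int_ctl |= AVIC_ENABLE_MASK;
	if (x2apic)
		int_ctl |= X2APIC_MODE_MASK;
	return int_ctl;
}
```

Clearing both bits before setting them keeps the x2APIC mode bit from
surviving a switch back to APIC mode or an AVIC deactivation.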

Testing:
* This series has been tested by booting a Linux VM in x2APIC physical
and logical modes with up to 512 vCPUs.

Regards,
Suravee

Changes from RFCv1 (https://lkml.org/lkml/2022/2/20/435)
* Mostly update the series based on review comments from Maxim.
* Patch 2/12 is new to the series.
* Patch 3/12 removes an unused helper function.
* Patch 6/12 updates the commit message with the expected hardware
behavior when writing to the x2APIC LDR register to address
a concern in the review comment.
* Patch 7/12 has been redesigned to return a proper error code
and move the function definition to arch/x86/kvm/lapic.c.
* Patch 9/12 moves the logic into svm_set_virtual_apic_mode().
* Patch 11/12 splits the warning removal into a separate patch
with a detailed description.
* Remove non-x2AVIC related patches, which will be sent separately.

Suravee Suthikulpanit (12):
x86/cpufeatures: Introduce x2AVIC CPUID bit
KVM: x86: lapic: Rename [GET/SET]_APIC_DEST_FIELD to
[GET/SET]_XAPIC_DEST_FIELD
KVM: SVM: Detect X2APIC virtualization (x2AVIC) support
KVM: SVM: Update max number of vCPUs supported for x2AVIC mode
KVM: SVM: Update avic_kick_target_vcpus to support 32-bit APIC ID
KVM: SVM: Do not update logical APIC ID table when in x2APIC mode
KVM: SVM: Introduce helper function kvm_get_apic_id
KVM: SVM: Adding support for configuring x2APIC MSRs interception
KVM: SVM: Refresh AVIC settings when changing APIC mode
KVM: SVM: Introduce helper functions to (de)activate AVIC and x2AVIC
KVM: SVM: Do not throw warning when calling avic_vcpu_load on a
running vcpu
KVM: SVM: Do not inhibit APICv when x2APIC is present

arch/x86/hyperv/hv_apic.c | 2 +-
arch/x86/include/asm/apicdef.h | 4 +-
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/include/asm/svm.h | 16 ++-
arch/x86/kernel/apic/apic.c | 2 +-
arch/x86/kernel/apic/ipi.c | 2 +-
arch/x86/kvm/lapic.c | 25 ++++-
arch/x86/kvm/lapic.h | 5 +-
arch/x86/kvm/svm/avic.c | 171 ++++++++++++++++++++++++++---
arch/x86/kvm/svm/svm.c | 93 +++++++++-------
arch/x86/kvm/svm/svm.h | 9 ++
11 files changed, 262 insertions(+), 68 deletions(-)

--
2.25.1


2022-03-09 01:08:12

by Suthikulpanit, Suravee

Subject: [RFCv2 PATCH 09/12] KVM: SVM: Refresh AVIC settings when changing APIC mode

When APIC mode is updated (e.g. from xAPIC to x2APIC),
KVM needs to update AVIC settings accordingly, which is
handled by svm_refresh_apicv_exec_ctrl().

Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/kvm/svm/avic.c | 19 ++++++++++++++++++-
1 file changed, 18 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 7e5a39a8e698..53559b8dfa52 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -625,7 +625,24 @@ void avic_post_state_restore(struct kvm_vcpu *vcpu)

void svm_set_virtual_apic_mode(struct kvm_vcpu *vcpu)
{
- return;
+ struct vcpu_svm *svm = to_svm(vcpu);
+
+ if (!lapic_in_kernel(vcpu) || (avic_mode == AVIC_MODE_NONE))
+ return;
+
+ if (kvm_get_apic_mode(vcpu) == LAPIC_MODE_INVALID)
+ WARN_ONCE(true, "Invalid local APIC state");
+
+ svm->vmcb->control.avic_vapic_bar = svm->vcpu.arch.apic_base &
+ VMCB_AVIC_APIC_BAR_MASK;
+ kvm_vcpu_update_apicv(&svm->vcpu);
+
+ /*
+ * The VM could be running with AVIC activated while switching from
+ * APIC to x2APIC mode. Call refresh to make sure that all x2AVIC
+ * configuration is applied.
+ */
+ svm_refresh_apicv_exec_ctrl(&svm->vcpu);
}

void svm_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr)
--
2.25.1

2022-03-09 01:18:28

by Suthikulpanit, Suravee

Subject: [RFCv2 PATCH 08/12] KVM: SVM: Adding support for configuring x2APIC MSRs interception

When enabling x2APIC virtualization (x2AVIC), the interception of
x2APIC MSRs must be disabled to let the hardware virtualize guest
MSR accesses.

The current implementation keeps track of the MSR interception state
for generic MSRs in the direct_access_msrs array.
For x2APIC MSRs, introduce the direct_access_x2apic_msrs array.

Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/kvm/svm/svm.c | 67 +++++++++++++++++++++++++++++++-----------
arch/x86/kvm/svm/svm.h | 7 +++++
2 files changed, 57 insertions(+), 17 deletions(-)

diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
index 3048f4b758d6..ce3c68a785cf 100644
--- a/arch/x86/kvm/svm/svm.c
+++ b/arch/x86/kvm/svm/svm.c
@@ -89,7 +89,7 @@ static uint64_t osvw_len = 4, osvw_status;
static DEFINE_PER_CPU(u64, current_tsc_ratio);
#define TSC_RATIO_DEFAULT 0x0100000000ULL

-static const struct svm_direct_access_msrs {
+static struct svm_direct_access_msrs {
u32 index; /* Index of the MSR */
bool always; /* True if intercept is initially cleared */
} direct_access_msrs[MAX_DIRECT_ACCESS_MSRS] = {
@@ -117,6 +117,9 @@ static const struct svm_direct_access_msrs {
{ .index = MSR_INVALID, .always = false },
};

+static struct svm_direct_access_msrs
+direct_access_x2apic_msrs[NUM_DIRECT_ACCESS_X2APIC_MSRS + 1];
+
/*
* These 2 parameters are used to config the controls for Pause-Loop Exiting:
* pause_filter_count: On processors that support Pause filtering(indicated
@@ -609,41 +612,42 @@ static int svm_cpu_init(int cpu)

}

-static int direct_access_msr_slot(u32 msr)
+static int direct_access_msr_slot(u32 msr, struct svm_direct_access_msrs *msrs)
{
u32 i;

- for (i = 0; direct_access_msrs[i].index != MSR_INVALID; i++)
- if (direct_access_msrs[i].index == msr)
+ for (i = 0; msrs[i].index != MSR_INVALID; i++)
+ if (msrs[i].index == msr)
return i;

return -ENOENT;
}

-static void set_shadow_msr_intercept(struct kvm_vcpu *vcpu, u32 msr, int read,
- int write)
+static void set_shadow_msr_intercept(struct kvm_vcpu *vcpu,
+ struct svm_direct_access_msrs *msrs, u32 msr,
+ int read, void *read_bits,
+ int write, void *write_bits)
{
- struct vcpu_svm *svm = to_svm(vcpu);
- int slot = direct_access_msr_slot(msr);
+ int slot = direct_access_msr_slot(msr, msrs);

if (slot == -ENOENT)
return;

/* Set the shadow bitmaps to the desired intercept states */
if (read)
- set_bit(slot, svm->shadow_msr_intercept.read);
+ set_bit(slot, read_bits);
else
- clear_bit(slot, svm->shadow_msr_intercept.read);
+ clear_bit(slot, read_bits);

if (write)
- set_bit(slot, svm->shadow_msr_intercept.write);
+ set_bit(slot, write_bits);
else
- clear_bit(slot, svm->shadow_msr_intercept.write);
+ clear_bit(slot, write_bits);
}

-static bool valid_msr_intercept(u32 index)
+static bool valid_msr_intercept(u32 index, struct svm_direct_access_msrs *msrs)
{
- return direct_access_msr_slot(index) != -ENOENT;
+ return direct_access_msr_slot(index, msrs) != -ENOENT;
}

static bool msr_write_intercepted(struct kvm_vcpu *vcpu, u32 msr)
@@ -674,9 +678,12 @@ static void set_msr_interception_bitmap(struct kvm_vcpu *vcpu, u32 *msrpm,

/*
* If this warning triggers extend the direct_access_msrs list at the
- * beginning of the file
+ * beginning of the file. The direct_access_x2apic_msrs is only for
+ * x2apic MSRs.
*/
- WARN_ON(!valid_msr_intercept(msr));
+ WARN_ON(!valid_msr_intercept(msr, direct_access_msrs) &&
+ (boot_cpu_has(X86_FEATURE_X2AVIC) &&
+ !valid_msr_intercept(msr, direct_access_x2apic_msrs)));

/* Enforce non allowed MSRs to trap */
if (read && !kvm_msr_allowed(vcpu, msr, KVM_MSR_FILTER_READ))
@@ -704,7 +711,16 @@ static void set_msr_interception_bitmap(struct kvm_vcpu *vcpu, u32 *msrpm,
void set_msr_interception(struct kvm_vcpu *vcpu, u32 *msrpm, u32 msr,
int read, int write)
{
- set_shadow_msr_intercept(vcpu, msr, read, write);
+ struct vcpu_svm *svm = to_svm(vcpu);
+
+ if (msr < 0x800 || msr > 0x8ff)
+ set_shadow_msr_intercept(vcpu, direct_access_msrs, msr,
+ read, svm->shadow_msr_intercept.read,
+ write, svm->shadow_msr_intercept.write);
+ else
+ set_shadow_msr_intercept(vcpu, direct_access_x2apic_msrs, msr,
+ read, svm->shadow_x2apic_msr_intercept.read,
+ write, svm->shadow_x2apic_msr_intercept.write);
set_msr_interception_bitmap(vcpu, msrpm, msr, read, write);
}

@@ -786,6 +802,22 @@ static void add_msr_offset(u32 offset)
BUG();
}

+static void init_direct_access_x2apic_msrs(void)
+{
+ int i;
+
+ /* Initialize x2APIC direct_access_x2apic_msrs entries */
+ for (i = 0; i < NUM_DIRECT_ACCESS_X2APIC_MSRS; i++) {
+ direct_access_x2apic_msrs[i].index = boot_cpu_has(X86_FEATURE_X2AVIC) ?
+ (0x800 + i) : MSR_INVALID;
+ direct_access_x2apic_msrs[i].always = false;
+ }
+
+ /* Initialize last entry */
+ direct_access_x2apic_msrs[i].index = MSR_INVALID;
+ direct_access_x2apic_msrs[i].always = false;
+}
+
static void init_msrpm_offsets(void)
{
int i;
@@ -4750,6 +4782,7 @@ static __init int svm_hardware_setup(void)
memset(iopm_va, 0xff, PAGE_SIZE * (1 << order));
iopm_base = page_to_pfn(iopm_pages) << PAGE_SHIFT;

+ init_direct_access_x2apic_msrs();
init_msrpm_offsets();

supported_xcr0 &= ~(XFEATURE_MASK_BNDREGS | XFEATURE_MASK_BNDCSR);
diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
index b53c83a44ec2..19ad40b8383b 100644
--- a/arch/x86/kvm/svm/svm.h
+++ b/arch/x86/kvm/svm/svm.h
@@ -29,6 +29,8 @@

#define MAX_DIRECT_ACCESS_MSRS 20
#define MSRPM_OFFSETS 16
+#define NUM_DIRECT_ACCESS_X2APIC_MSRS 0x100
+
extern u32 msrpm_offsets[MSRPM_OFFSETS] __read_mostly;
extern bool npt_enabled;
extern bool intercept_smi;
@@ -241,6 +243,11 @@ struct vcpu_svm {
DECLARE_BITMAP(write, MAX_DIRECT_ACCESS_MSRS);
} shadow_msr_intercept;

+ struct {
+ DECLARE_BITMAP(read, NUM_DIRECT_ACCESS_X2APIC_MSRS);
+ DECLARE_BITMAP(write, NUM_DIRECT_ACCESS_X2APIC_MSRS);
+ } shadow_x2apic_msr_intercept;
+
struct vcpu_sev_es_state sev_es;

bool guest_state_loaded;
--
2.25.1

2022-03-09 01:22:51

by Suthikulpanit, Suravee

Subject: [RFCv2 PATCH 04/12] KVM: SVM: Update max number of vCPUs supported for x2AVIC mode

xAVIC and x2AVIC modes can support different numbers of vCPUs.
Update the existing logic to support each mode accordingly.

Also, modify the maximum physical APIC ID for AVIC to 255 to reflect
the actual value supported by the architecture.

Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/include/asm/svm.h | 12 +++++++++---
arch/x86/kvm/svm/avic.c | 8 +++++---
2 files changed, 14 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 7a7a2297165b..681a348a9365 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -250,10 +250,16 @@ enum avic_ipi_failure_cause {


/*
- * 0xff is broadcast, so the max index allowed for physical APIC ID
- * table is 0xfe. APIC IDs above 0xff are reserved.
+ * For AVIC, the max index allowed for physical APIC ID
+ * table is 0xff (255).
*/
-#define AVIC_MAX_PHYSICAL_ID_COUNT 0xff
+#define AVIC_MAX_PHYSICAL_ID 0XFFULL
+
+/*
+ * For x2AVIC, the max index allowed for physical APIC ID
+ * table is 0x1ff (511).
+ */
+#define X2AVIC_MAX_PHYSICAL_ID 0x1FFUL

#define AVIC_HPA_MASK ~((0xFFFULL << 52) | 0xFFF)
#define VMCB_AVIC_APIC_BAR_MASK 0xFFFFFFFFFF000ULL
diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 49b185f0d42e..f128b0189d4a 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -183,7 +183,7 @@ void avic_init_vmcb(struct vcpu_svm *svm)
vmcb->control.avic_backing_page = bpa & AVIC_HPA_MASK;
vmcb->control.avic_logical_id = lpa & AVIC_HPA_MASK;
vmcb->control.avic_physical_id = ppa & AVIC_HPA_MASK;
- vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID_COUNT;
+ vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID;
vmcb->control.avic_vapic_bar = APIC_DEFAULT_PHYS_BASE & VMCB_AVIC_APIC_BAR_MASK;

if (kvm_apicv_activated(svm->vcpu.kvm))
@@ -198,7 +198,8 @@ static u64 *avic_get_physical_id_entry(struct kvm_vcpu *vcpu,
u64 *avic_physical_id_table;
struct kvm_svm *kvm_svm = to_kvm_svm(vcpu->kvm);

- if (index >= AVIC_MAX_PHYSICAL_ID_COUNT)
+ if ((avic_mode == AVIC_MODE_X1 && index > AVIC_MAX_PHYSICAL_ID) ||
+ (avic_mode == AVIC_MODE_X2 && index > X2AVIC_MAX_PHYSICAL_ID))
return NULL;

avic_physical_id_table = page_address(kvm_svm->avic_physical_id_table_page);
@@ -245,7 +246,8 @@ static int avic_init_backing_page(struct kvm_vcpu *vcpu)
int id = vcpu->vcpu_id;
struct vcpu_svm *svm = to_svm(vcpu);

- if (id >= AVIC_MAX_PHYSICAL_ID_COUNT)
+ if ((avic_mode == AVIC_MODE_X1 && id > AVIC_MAX_PHYSICAL_ID) ||
+ (avic_mode == AVIC_MODE_X2 && id > X2AVIC_MAX_PHYSICAL_ID))
return -EINVAL;

if (!vcpu->arch.apic->regs)
--
2.25.1

2022-03-09 01:56:03

by Suthikulpanit, Suravee

Subject: [RFCv2 PATCH 05/12] KVM: SVM: Update avic_kick_target_vcpus to support 32-bit APIC ID

In x2APIC mode, ICRH contains a 32-bit destination APIC ID.
So, update avic_kick_target_vcpus() accordingly.
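
To illustrate the difference, here is a minimal stand-alone sketch. It
assumes the xAPIC destination occupies bits 31:24 of ICRH, which is the
field GET_XAPIC_DEST_FIELD() extracts; XAPIC_DEST_FIELD() and icr_dest()
are hypothetical names used only for this example.

```c
#include <stdbool.h>
#include <stdint.h>

/* xAPIC: the destination is the 8-bit field in ICRH bits 31:24. */
#define XAPIC_DEST_FIELD(icrh)	(((icrh) >> 24) & 0xFFu)

/* x2APIC: the whole 32-bit ICRH value is the destination APIC ID. */
static uint32_t icr_dest(uint32_t icrh, bool x2apic_mode)
{
	return x2apic_mode ? icrh : XAPIC_DEST_FIELD(icrh);
}
```

With this, an ICRH value of 0x105 addresses APIC ID 0x105 in x2APIC mode,
while the same value yields destination 0 when decoded as xAPIC.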

Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/kvm/svm/avic.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index f128b0189d4a..5329b93dc4cd 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -307,9 +307,15 @@ static void avic_kick_target_vcpus(struct kvm *kvm, struct kvm_lapic *source,
* since entered the guest will have processed pending IRQs at VMRUN.
*/
kvm_for_each_vcpu(i, vcpu, kvm) {
+ u32 dest;
+
+ if (apic_x2apic_mode(vcpu->arch.apic))
+ dest = icrh;
+ else
+ dest = GET_XAPIC_DEST_FIELD(icrh);
+
if (kvm_apic_match_dest(vcpu, source, icrl & APIC_SHORT_MASK,
- GET_XAPIC_DEST_FIELD(icrh),
- icrl & APIC_DEST_MASK)) {
+ dest, icrl & APIC_DEST_MASK)) {
vcpu->arch.apic->irr_pending = true;
svm_complete_interrupt_delivery(vcpu,
icrl & APIC_MODE_MASK,
--
2.25.1

2022-03-09 01:56:27

by Suthikulpanit, Suravee

Subject: [RFCv2 PATCH 10/12] KVM: SVM: Introduce helper functions to (de)activate AVIC and x2AVIC

Refactor the current logic for (de)activating AVIC into helper functions,
and also add logic for (de)activating x2AVIC. The helper functions are used
when initializing AVIC and when switching from AVIC to x2AVIC mode
(handled by svm_refresh_apicv_exec_ctrl()).

When an AVIC-enabled guest switches from APIC to x2APIC mode during
runtime, the SVM driver needs to perform the following steps:

1. Set the x2APIC mode bit for AVIC in the VMCB along with the maximum
APIC ID supported for each mode accordingly.

2. Disable x2APIC MSR interception in order to allow the hardware
to virtualize x2APIC MSR accesses.

Reported-by: kernel test robot <[email protected]>
Signed-off-by: Suravee Suthikulpanit <[email protected]>
---
arch/x86/include/asm/svm.h | 1 +
arch/x86/kvm/svm/avic.c | 48 ++++++++++++++++++++++++++++++++++----
2 files changed, 44 insertions(+), 5 deletions(-)

diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
index 681a348a9365..f5337022104d 100644
--- a/arch/x86/include/asm/svm.h
+++ b/arch/x86/include/asm/svm.h
@@ -248,6 +248,7 @@ enum avic_ipi_failure_cause {
AVIC_IPI_FAILURE_INVALID_BACKING_PAGE,
};

+#define AVIC_PHYSICAL_MAX_INDEX_MASK GENMASK_ULL(9, 0)

/*
* For AVIC, the max index allowed for physical APIC ID
diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
index 53559b8dfa52..b8d6bf6b6ed5 100644
--- a/arch/x86/kvm/svm/avic.c
+++ b/arch/x86/kvm/svm/avic.c
@@ -66,6 +66,45 @@ struct amd_svm_iommu_ir {
void *data; /* Storing pointer to struct amd_ir_data */
};

+static inline void avic_set_x2apic_msr_interception(struct vcpu_svm *svm, bool disable)
+{
+ int i;
+
+ for (i = 0x800; i <= 0x8ff; i++)
+ set_msr_interception(&svm->vcpu, svm->msrpm, i,
+ !disable, !disable);
+}
+
+static void avic_activate_vmcb(struct vcpu_svm *svm)
+{
+ struct vmcb *vmcb = svm->vmcb01.ptr;
+
+ vmcb->control.int_ctl &= ~(AVIC_ENABLE_MASK | X2APIC_MODE_MASK);
+ vmcb->control.avic_physical_id &= ~AVIC_PHYSICAL_MAX_INDEX_MASK;
+
+ vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
+ if (apic_x2apic_mode(svm->vcpu.arch.apic)) {
+ vmcb->control.int_ctl |= X2APIC_MODE_MASK;
+ vmcb->control.avic_physical_id |= X2AVIC_MAX_PHYSICAL_ID;
+ /* Disabling MSR intercept for x2APIC registers */
+ avic_set_x2apic_msr_interception(svm, false);
+ } else {
+ vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID;
+ /* Enabling MSR intercept for x2APIC registers */
+ avic_set_x2apic_msr_interception(svm, true);
+ }
+}
+
+static void avic_deactivate_vmcb(struct vcpu_svm *svm)
+{
+ struct vmcb *vmcb = svm->vmcb01.ptr;
+
+ vmcb->control.int_ctl &= ~(AVIC_ENABLE_MASK | X2APIC_MODE_MASK);
+ vmcb->control.avic_physical_id &= ~AVIC_PHYSICAL_MAX_INDEX_MASK;
+
+ /* Enabling MSR intercept for x2APIC registers */
+ avic_set_x2apic_msr_interception(svm, true);
+}

/* Note:
* This function is called from IOMMU driver to notify
@@ -183,13 +222,12 @@ void avic_init_vmcb(struct vcpu_svm *svm)
vmcb->control.avic_backing_page = bpa & AVIC_HPA_MASK;
vmcb->control.avic_logical_id = lpa & AVIC_HPA_MASK;
vmcb->control.avic_physical_id = ppa & AVIC_HPA_MASK;
- vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID;
vmcb->control.avic_vapic_bar = APIC_DEFAULT_PHYS_BASE & VMCB_AVIC_APIC_BAR_MASK;

if (kvm_apicv_activated(svm->vcpu.kvm))
- vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
+ avic_activate_vmcb(svm);
else
- vmcb->control.int_ctl &= ~AVIC_ENABLE_MASK;
+ avic_deactivate_vmcb(svm);
}

static u64 *avic_get_physical_id_entry(struct kvm_vcpu *vcpu,
@@ -703,9 +741,9 @@ void svm_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
* accordingly before re-activating.
*/
avic_post_state_restore(vcpu);
- vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
+ avic_activate_vmcb(svm);
} else {
- vmcb->control.int_ctl &= ~AVIC_ENABLE_MASK;
+ avic_deactivate_vmcb(svm);
}
vmcb_mark_dirty(vmcb, VMCB_AVIC);

--
2.25.1

2022-03-24 12:47:38

by Maxim Levitsky

Subject: Re: [RFCv2 PATCH 04/12] KVM: SVM: Update max number of vCPUs supported for x2AVIC mode

On Tue, 2022-03-08 at 10:39 -0600, Suravee Suthikulpanit wrote:
> xAVIC and x2AVIC modes can support different numbers of vCPUs.
> Update the existing logic to support each mode accordingly.
>
> Also, modify the maximum physical APIC ID for AVIC to 255 to reflect
> the actual value supported by the architecture.
>
> Signed-off-by: Suravee Suthikulpanit <[email protected]>
> ---
> arch/x86/include/asm/svm.h | 12 +++++++++---
> arch/x86/kvm/svm/avic.c | 8 +++++---
> 2 files changed, 14 insertions(+), 6 deletions(-)
>
> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
> index 7a7a2297165b..681a348a9365 100644
> --- a/arch/x86/include/asm/svm.h
> +++ b/arch/x86/include/asm/svm.h
> @@ -250,10 +250,16 @@ enum avic_ipi_failure_cause {
>
>
> /*
> - * 0xff is broadcast, so the max index allowed for physical APIC ID
> - * table is 0xfe. APIC IDs above 0xff are reserved.
> + * For AVIC, the max index allowed for physical APIC ID
> + * table is 0xff (255).
> */
> -#define AVIC_MAX_PHYSICAL_ID_COUNT 0xff
This should be 0xFE, since index 0xFF is reserved in AVIC mode.
It used to work because the check (see below) used to be '>=',
but I do like that you switched to a '>' check instead.
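
The off-by-one can be checked with a small stand-alone sketch. It assumes,
as stated above, that index 0xFF is reserved in AVIC mode, so the largest
usable index is 0xFE; the macro and function names below are hypothetical
and exist only for this illustration.

```c
#include <stdbool.h>
#include <stdint.h>

#define OLD_MAX_PHYSICAL_ID_COUNT	0xFFu	/* a count, checked with '>=' */
#define NEW_MAX_PHYSICAL_ID		0xFFu	/* value in this patch, checked with '>' */
#define FIXED_MAX_PHYSICAL_ID		0xFEu	/* suggested value, checked with '>' */

/* Old check: rejects 0xFF because the limit is treated as a count. */
static bool old_check_rejects(uint32_t index)
{
	return index >= OLD_MAX_PHYSICAL_ID_COUNT;
}

/* Check as posted: 0xFF slips through. */
static bool new_check_rejects(uint32_t index)
{
	return index > NEW_MAX_PHYSICAL_ID;
}

/* Check with the suggested 0xFE limit: 0xFF is rejected again. */
static bool fixed_check_rejects(uint32_t index)
{
	return index > FIXED_MAX_PHYSICAL_ID;
}
```

Only the middle variant accepts index 0xFF; the other two agree on every
index, which is why switching to '>' is fine once the limit is 0xFE.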


> +#define AVIC_MAX_PHYSICAL_ID 0XFFULL
> +
> +/*
> + * For x2AVIC, the max index allowed for physical APIC ID
> + * table is 0x1ff (511).
> + */
> +#define X2AVIC_MAX_PHYSICAL_ID 0x1FFUL


>
> #define AVIC_HPA_MASK ~((0xFFFULL << 52) | 0xFFF)
> #define VMCB_AVIC_APIC_BAR_MASK 0xFFFFFFFFFF000ULL
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index 49b185f0d42e..f128b0189d4a 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -183,7 +183,7 @@ void avic_init_vmcb(struct vcpu_svm *svm)
> vmcb->control.avic_backing_page = bpa & AVIC_HPA_MASK;
> vmcb->control.avic_logical_id = lpa & AVIC_HPA_MASK;
> vmcb->control.avic_physical_id = ppa & AVIC_HPA_MASK;
> - vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID_COUNT;
> + vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID;
> vmcb->control.avic_vapic_bar = APIC_DEFAULT_PHYS_BASE & VMCB_AVIC_APIC_BAR_MASK;
>
> if (kvm_apicv_activated(svm->vcpu.kvm))
> @@ -198,7 +198,8 @@ static u64 *avic_get_physical_id_entry(struct kvm_vcpu *vcpu,
> u64 *avic_physical_id_table;
> struct kvm_svm *kvm_svm = to_kvm_svm(vcpu->kvm);
>
> - if (index >= AVIC_MAX_PHYSICAL_ID_COUNT)
This is the check I am talking about

> + if ((avic_mode == AVIC_MODE_X1 && index > AVIC_MAX_PHYSICAL_ID) ||
> + (avic_mode == AVIC_MODE_X2 && index > X2AVIC_MAX_PHYSICAL_ID))
> return NULL;

I would probably like to ask to move this check to a function,
but I see that avic_get_physical_id_entry is only used in avic_handle_apic_id_update
in addition to avic_init_backing_page which has this check,
and I will sooner or later remove the anyway broken avic_handle_apic_id_update and
probably inline avic_get_physical_id_entry, so no need to do this.

>
> avic_physical_id_table = page_address(kvm_svm->avic_physical_id_table_page);
> @@ -245,7 +246,8 @@ static int avic_init_backing_page(struct kvm_vcpu *vcpu)
> int id = vcpu->vcpu_id;
> struct vcpu_svm *svm = to_svm(vcpu);
>
> - if (id >= AVIC_MAX_PHYSICAL_ID_COUNT)
> + if ((avic_mode == AVIC_MODE_X1 && id > AVIC_MAX_PHYSICAL_ID) ||
> + (avic_mode == AVIC_MODE_X2 && id > X2AVIC_MAX_PHYSICAL_ID))
> return -EINVAL;


>
> if (!vcpu->arch.apic->regs)


So except the off-by-one error in AVIC_MAX_PHYSICAL_ID_COUNT:

Reviewed-by: Maxim Levitsky <[email protected]>

Best regards,
Maxim Levitsky

2022-03-24 14:17:13

by Maxim Levitsky

Subject: Re: [RFCv2 PATCH 05/12] KVM: SVM: Update avic_kick_target_vcpus to support 32-bit APIC ID

On Tue, 2022-03-08 at 10:39 -0600, Suravee Suthikulpanit wrote:
> In x2APIC mode, ICRH contains a 32-bit destination APIC ID.
> So, update avic_kick_target_vcpus() accordingly.
>
> Signed-off-by: Suravee Suthikulpanit <[email protected]>
> ---
> arch/x86/kvm/svm/avic.c | 10 ++++++++--
> 1 file changed, 8 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index f128b0189d4a..5329b93dc4cd 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -307,9 +307,15 @@ static void avic_kick_target_vcpus(struct kvm *kvm, struct kvm_lapic *source,
> * since entered the guest will have processed pending IRQs at VMRUN.
> */
> kvm_for_each_vcpu(i, vcpu, kvm) {
> + u32 dest;
> +
> + if (apic_x2apic_mode(vcpu->arch.apic))
> + dest = icrh;
> + else
> + dest = GET_XAPIC_DEST_FIELD(icrh);
> +
> if (kvm_apic_match_dest(vcpu, source, icrl & APIC_SHORT_MASK,
> - GET_XAPIC_DEST_FIELD(icrh),
> - icrl & APIC_DEST_MASK)) {
> + dest, icrl & APIC_DEST_MASK)) {
> vcpu->arch.apic->irr_pending = true;
> svm_complete_interrupt_delivery(vcpu,
> icrl & APIC_MODE_MASK,

Reviewed-by: Maxim Levitsky <[email protected]>

Best regards,
Maxim Levitsky

2022-03-24 17:54:50

by Maxim Levitsky

Subject: Re: [RFCv2 PATCH 08/12] KVM: SVM: Adding support for configuring x2APIC MSRs interception

On Tue, 2022-03-08 at 10:39 -0600, Suravee Suthikulpanit wrote:
> When enabling x2APIC virtualization (x2AVIC), the interception of
> x2APIC MSRs must be disabled to let the hardware virtualize guest
> MSR accesses.
>
> The current implementation keeps track of the MSR interception state
> for generic MSRs in the direct_access_msrs array.
> For x2APIC MSRs, introduce the direct_access_x2apic_msrs array.
>
> Signed-off-by: Suravee Suthikulpanit <[email protected]>
> ---
> arch/x86/kvm/svm/svm.c | 67 +++++++++++++++++++++++++++++++-----------
> arch/x86/kvm/svm/svm.h | 7 +++++
> 2 files changed, 57 insertions(+), 17 deletions(-)
>
> diff --git a/arch/x86/kvm/svm/svm.c b/arch/x86/kvm/svm/svm.c
> index 3048f4b758d6..ce3c68a785cf 100644
> --- a/arch/x86/kvm/svm/svm.c
> +++ b/arch/x86/kvm/svm/svm.c
> @@ -89,7 +89,7 @@ static uint64_t osvw_len = 4, osvw_status;
> static DEFINE_PER_CPU(u64, current_tsc_ratio);
> #define TSC_RATIO_DEFAULT 0x0100000000ULL
>
> -static const struct svm_direct_access_msrs {
> +static struct svm_direct_access_msrs {
> u32 index; /* Index of the MSR */
> bool always; /* True if intercept is initially cleared */
> } direct_access_msrs[MAX_DIRECT_ACCESS_MSRS] = {
> @@ -117,6 +117,9 @@ static const struct svm_direct_access_msrs {
> { .index = MSR_INVALID, .always = false },
> };
>
> +static struct svm_direct_access_msrs
> +direct_access_x2apic_msrs[NUM_DIRECT_ACCESS_X2APIC_MSRS + 1];
> +
> /*
> * These 2 parameters are used to config the controls for Pause-Loop Exiting:
> * pause_filter_count: On processors that support Pause filtering(indicated
> @@ -609,41 +612,42 @@ static int svm_cpu_init(int cpu)
>
> }
>
> -static int direct_access_msr_slot(u32 msr)
> +static int direct_access_msr_slot(u32 msr, struct svm_direct_access_msrs *msrs)
> {
> u32 i;
>
> - for (i = 0; direct_access_msrs[i].index != MSR_INVALID; i++)
> - if (direct_access_msrs[i].index == msr)
> + for (i = 0; msrs[i].index != MSR_INVALID; i++)
> + if (msrs[i].index == msr)
> return i;
>
> return -ENOENT;
> }
>
> -static void set_shadow_msr_intercept(struct kvm_vcpu *vcpu, u32 msr, int read,
> - int write)
> +static void set_shadow_msr_intercept(struct kvm_vcpu *vcpu,
> + struct svm_direct_access_msrs *msrs, u32 msr,
> + int read, void *read_bits,
> + int write, void *write_bits)
> {
> - struct vcpu_svm *svm = to_svm(vcpu);
> - int slot = direct_access_msr_slot(msr);
> + int slot = direct_access_msr_slot(msr, msrs);
>
> if (slot == -ENOENT)
> return;
>
> /* Set the shadow bitmaps to the desired intercept states */
> if (read)
> - set_bit(slot, svm->shadow_msr_intercept.read);
> + set_bit(slot, read_bits);
> else
> - clear_bit(slot, svm->shadow_msr_intercept.read);
> + clear_bit(slot, read_bits);
>
> if (write)
> - set_bit(slot, svm->shadow_msr_intercept.write);
> + set_bit(slot, write_bits);
> else
> - clear_bit(slot, svm->shadow_msr_intercept.write);
> + clear_bit(slot, write_bits);
> }
>
> -static bool valid_msr_intercept(u32 index)
> +static bool valid_msr_intercept(u32 index, struct svm_direct_access_msrs *msrs)
> {
> - return direct_access_msr_slot(index) != -ENOENT;
> + return direct_access_msr_slot(index, msrs) != -ENOENT;
> }
>
> static bool msr_write_intercepted(struct kvm_vcpu *vcpu, u32 msr)
> @@ -674,9 +678,12 @@ static void set_msr_interception_bitmap(struct kvm_vcpu *vcpu, u32 *msrpm,
>
> /*
> * If this warning triggers extend the direct_access_msrs list at the
> - * beginning of the file
> + * beginning of the file. The direct_access_x2apic_msrs is only for
> + * x2apic MSRs.
> */
> - WARN_ON(!valid_msr_intercept(msr));
> + WARN_ON(!valid_msr_intercept(msr, direct_access_msrs) &&
> + (boot_cpu_has(X86_FEATURE_X2AVIC) &&
> + !valid_msr_intercept(msr, direct_access_x2apic_msrs)));
>
> /* Enforce non allowed MSRs to trap */
> if (read && !kvm_msr_allowed(vcpu, msr, KVM_MSR_FILTER_READ))
> @@ -704,7 +711,16 @@ static void set_msr_interception_bitmap(struct kvm_vcpu *vcpu, u32 *msrpm,
> void set_msr_interception(struct kvm_vcpu *vcpu, u32 *msrpm, u32 msr,
> int read, int write)
> {
> - set_shadow_msr_intercept(vcpu, msr, read, write);
> + struct vcpu_svm *svm = to_svm(vcpu);
> +
> + if (msr < 0x800 || msr > 0x8ff)
> + set_shadow_msr_intercept(vcpu, direct_access_msrs, msr,
> + read, svm->shadow_msr_intercept.read,
> + write, svm->shadow_msr_intercept.write);
> + else
> + set_shadow_msr_intercept(vcpu, direct_access_x2apic_msrs, msr,
> + read, svm->shadow_x2apic_msr_intercept.read,
> + write, svm->shadow_x2apic_msr_intercept.write);
> set_msr_interception_bitmap(vcpu, msrpm, msr, read, write);
> }
>
> @@ -786,6 +802,22 @@ static void add_msr_offset(u32 offset)
> BUG();
> }
>
> +static void init_direct_access_x2apic_msrs(void)
> +{
> + int i;
> +
> + /* Initialize x2APIC direct_access_x2apic_msrs entries */
> + for (i = 0; i < NUM_DIRECT_ACCESS_X2APIC_MSRS; i++) {
> + direct_access_x2apic_msrs[i].index = boot_cpu_has(X86_FEATURE_X2AVIC) ?
> + (0x800 + i) : MSR_INVALID;
> + direct_access_x2apic_msrs[i].always = false;
> + }
> +
> + /* Initialize last entry */
> + direct_access_x2apic_msrs[i].index = MSR_INVALID;
> + direct_access_x2apic_msrs[i].always = false;
> +}
> +
> static void init_msrpm_offsets(void)
> {
> int i;
> @@ -4750,6 +4782,7 @@ static __init int svm_hardware_setup(void)
> memset(iopm_va, 0xff, PAGE_SIZE * (1 << order));
> iopm_base = page_to_pfn(iopm_pages) << PAGE_SHIFT;
>
> + init_direct_access_x2apic_msrs();
> init_msrpm_offsets();
>
> supported_xcr0 &= ~(XFEATURE_MASK_BNDREGS | XFEATURE_MASK_BNDCSR);
> diff --git a/arch/x86/kvm/svm/svm.h b/arch/x86/kvm/svm/svm.h
> index b53c83a44ec2..19ad40b8383b 100644
> --- a/arch/x86/kvm/svm/svm.h
> +++ b/arch/x86/kvm/svm/svm.h
> @@ -29,6 +29,8 @@
>
> #define MAX_DIRECT_ACCESS_MSRS 20
> #define MSRPM_OFFSETS 16
> +#define NUM_DIRECT_ACCESS_X2APIC_MSRS 0x100
> +
> extern u32 msrpm_offsets[MSRPM_OFFSETS] __read_mostly;
> extern bool npt_enabled;
> extern bool intercept_smi;
> @@ -241,6 +243,11 @@ struct vcpu_svm {
> DECLARE_BITMAP(write, MAX_DIRECT_ACCESS_MSRS);
> } shadow_msr_intercept;
>
> + struct {
> + DECLARE_BITMAP(read, NUM_DIRECT_ACCESS_X2APIC_MSRS);
> + DECLARE_BITMAP(write, NUM_DIRECT_ACCESS_X2APIC_MSRS);
> + } shadow_x2apic_msr_intercept;
> +
> struct vcpu_sev_es_state sev_es;
>
> bool guest_state_loaded;


I did some homework on this, and it looks mostly correct.

However I do wonder if we need that separation of svm_direct_access_msrs and
direct_access_x2apic_msrs. I understand that performance-wise,
direct_access_msrs would get longer otherwise (but we don't have to allow
the whole x2apic msr range, only the known x2apic registers, which aren't that many).

One of the things that I see that *is* broken (at least in theory) is nesting.

init_msrpm_offsets goes over direct_access_msrs and puts the offsets of the corresponding
bits in the hardware msr bitmap into 'msrpm_offsets'.

Then on nested VM entry, nested_svm_vmrun_msrpm uses this list to merge the nested
and host MSR bitmaps.
Without the x2apic msrs in that list, this means that if L1 chooses to allow L2 to access
its x2apic msrs, it won't work. It is not something that L1 would do often, but it is still allowed to.

Honestly we need to write-track the nested MSR bitmap to avoid updating it on each VM entry;
then with this hot path eliminated, I don't think there are other places which update
the msr interception often, and thus we could just put the x2apic msrs into
direct_access_msrs.
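
A minimal sketch of that single-table alternative, assuming only a subset
of the architecturally defined x2APIC registers is listed. The MSR numbers
follow the usual x2APIC mapping (0x800 plus the xAPIC MMIO offset shifted
right by 4); the table contents, the struct name and msr_slot() are
illustrative and not the kernel's actual definitions.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MSR_INVALID	0xffffffffu

struct direct_access_msr {
	uint32_t index;	/* MSR number */
	bool always;	/* true if the intercept is initially cleared */
};

/* One sentinel-terminated table holding generic and x2APIC MSRs. */
static const struct direct_access_msr msrs[] = {
	{ .index = 0xc0000081u,	.always = true  },	/* MSR_STAR */
	{ .index = 0x808u,	.always = false },	/* x2APIC TPR */
	{ .index = 0x80bu,	.always = false },	/* x2APIC EOI */
	{ .index = 0x830u,	.always = false },	/* x2APIC ICR */
	{ .index = 0x83fu,	.always = false },	/* x2APIC SELF_IPI */
	{ .index = MSR_INVALID,	.always = false },
};

/* Return the slot of msr in the table, or -ENOENT if it is not listed. */
static int msr_slot(uint32_t msr)
{
	int i;

	for (i = 0; msrs[i].index != MSR_INVALID; i++)
		if (msrs[i].index == msr)
			return i;

	return -ENOENT;
}
```

Because every listed MSR gets a slot in the same table, a pass that walks
the table to build the bitmap offsets would also pick up the x2APIC
entries, at the cost of a longer linear search per lookup.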

Best regards,
Maxim Levitsky



2022-03-24 21:55:40

by Maxim Levitsky

Subject: Re: [RFCv2 PATCH 09/12] KVM: SVM: Refresh AVIC settings when changing APIC mode

On Tue, 2022-03-08 at 10:39 -0600, Suravee Suthikulpanit wrote:
> When APIC mode is updated (e.g. from xAPIC to x2APIC),
> KVM needs to update AVIC settings accordingly, which is
> handled by svm_refresh_apicv_exec_ctrl().
>
> Signed-off-by: Suravee Suthikulpanit <[email protected]>
> ---
> arch/x86/kvm/svm/avic.c | 19 ++++++++++++++++++-
> 1 file changed, 18 insertions(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index 7e5a39a8e698..53559b8dfa52 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -625,7 +625,24 @@ void avic_post_state_restore(struct kvm_vcpu *vcpu)
>
> void svm_set_virtual_apic_mode(struct kvm_vcpu *vcpu)
> {
> - return;
> + struct vcpu_svm *svm = to_svm(vcpu);
> +
> + if (!lapic_in_kernel(vcpu) || (avic_mode == AVIC_MODE_NONE))
> + return;
> +
> + if (kvm_get_apic_mode(vcpu) == LAPIC_MODE_INVALID)
> + WARN_ONCE(true, "Invalid local APIC state");
> +
> + svm->vmcb->control.avic_vapic_bar = svm->vcpu.arch.apic_base &
> + VMCB_AVIC_APIC_BAR_MASK;

No need for that - APIC base relocation doesn't work when AVIC is enabled,
since the page which contains it has to be marked R/W in NPT, which we
only do for the default APIC base.

I recently removed the code from AVIC which still tried to set the
'avic_vapic_bar' like this.



> + kvm_vcpu_update_apicv(&svm->vcpu);
> +
> + /*
> + * The VM could be running w/ AVIC activated switching from APIC
> + * to x2APIC mode. We need to all refresh to make sure that all
> + * x2AVIC configuration are being done.

Why? When AVIC is un-inhibited later, svm_refresh_apicv_exec_ctrl will be called
again and will switch to x2avic mode, I think.

When AVIC is inhibited, then regardless of x2apic mode, the VMCB must not have
any avic bits set, and all x2apic msrs should be read/write intercepted.
Thus I don't think that svm_refresh_apicv_exec_ctrl should be force called.
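
The invariant I have in mind can be written down as a small standalone check.
This is only a sketch: the struct and function names are made up for
illustration, and the interception state of the whole x2apic msr range is
collapsed into one flag. The bit positions are the ones from the cover letter
(bits 31 and 30 of VMCB offset 60h).

```c
#include <stdbool.h>
#include <stdint.h>

#define AVIC_ENABLE_BIT	(1u << 31)	/* VMCB offset 60h, bit 31 */
#define X2APIC_MODE_BIT	(1u << 30)	/* VMCB offset 60h, bit 30 */

/* Hypothetical, simplified view of the per-vCPU AVIC state. */
struct avic_view {
	uint32_t int_ctl;
	bool x2apic_msrs_intercepted;
};

/* What inhibiting AVIC should leave behind, whatever the APIC mode is. */
struct avic_view apply_inhibit(struct avic_view v)
{
	v.int_ctl &= ~(AVIC_ENABLE_BIT | X2APIC_MODE_BIT);
	v.x2apic_msrs_intercepted = true;
	return v;
}

/* True if @v is a valid state for a vCPU whose AVIC is inhibited. */
bool inhibited_state_ok(struct avic_view v)
{
	return !(v.int_ctl & (AVIC_ENABLE_BIT | X2APIC_MODE_BIT)) &&
	       v.x2apic_msrs_intercepted;
}
```

If the deactivate path always establishes this state, nothing is left for a
forced refresh to fix up while AVIC stays inhibited.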


> + */
> + svm_refresh_apicv_exec_ctrl(&svm->vcpu);
> }
>
> void svm_hwapic_irr_update(struct kvm_vcpu *vcpu, int max_irr)

Best regards,
Maxim Levitsky

2022-03-25 15:54:38

by Maxim Levitsky

Subject: Re: [RFCv2 PATCH 10/12] KVM: SVM: Introduce helper functions to (de)activate AVIC and x2AVIC

On Tue, 2022-03-08 at 10:39 -0600, Suravee Suthikulpanit wrote:
> Refactor the current logic for (de)activate AVIC into helper functions,
> and also add logic for (de)activate x2AVIC. The helper function are used
> when initializing AVIC and switching from AVIC to x2AVIC mode
> (handled by svm_refresh_apicv_exec_ctrl()).
>
> When an AVIC-enabled guest switches from APIC to x2APIC mode during
> runtime, the SVM driver needs to perform the following steps:
>
> 1. Set the x2APIC mode bit for AVIC in VMCB along with the maximum
> APIC ID support for each mode accordingly.
>
> 2. Disable x2APIC MSRs interception in order to allow the hardware
> to virtualize x2APIC MSRs accesses.
>
> Reported-by: kernel test robot <[email protected]>
> Signed-off-by: Suravee Suthikulpanit <[email protected]>
> ---
> arch/x86/include/asm/svm.h | 1 +
> arch/x86/kvm/svm/avic.c | 48 ++++++++++++++++++++++++++++++++++----
> 2 files changed, 44 insertions(+), 5 deletions(-)
>
> diff --git a/arch/x86/include/asm/svm.h b/arch/x86/include/asm/svm.h
> index 681a348a9365..f5337022104d 100644
> --- a/arch/x86/include/asm/svm.h
> +++ b/arch/x86/include/asm/svm.h
> @@ -248,6 +248,7 @@ enum avic_ipi_failure_cause {
> AVIC_IPI_FAILURE_INVALID_BACKING_PAGE,
> };
>
> +#define AVIC_PHYSICAL_MAX_INDEX_MASK GENMASK_ULL(9, 0)
>
> /*
> * For AVIC, the max index allowed for physical APIC ID
> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
> index 53559b8dfa52..b8d6bf6b6ed5 100644
> --- a/arch/x86/kvm/svm/avic.c
> +++ b/arch/x86/kvm/svm/avic.c
> @@ -66,6 +66,45 @@ struct amd_svm_iommu_ir {
> void *data; /* Storing pointer to struct amd_ir_data */
> };
>
> +static inline void avic_set_x2apic_msr_interception(struct vcpu_svm *svm, bool disable)
> +{
> + int i;
> +
> + for (i = 0x800; i <= 0x8ff; i++)
> + set_msr_interception(&svm->vcpu, svm->msrpm, i,
> + !disable, !disable);
> +}
> +
> +static void avic_activate_vmcb(struct vcpu_svm *svm)
> +{
> + struct vmcb *vmcb = svm->vmcb01.ptr;
> +
> + vmcb->control.int_ctl &= ~(AVIC_ENABLE_MASK | X2APIC_MODE_MASK);
> + vmcb->control.avic_physical_id &= ~AVIC_PHYSICAL_MAX_INDEX_MASK;

This looks a bit better. I don't 100% like this, but let it be.

Honestly, I will eventually add code to calculate and update this maximum
dynamically, to avoid the microcode wastefully going over the whole table,
or worse, having the nested avic code do so.
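
Roughly what I mean, as a standalone sketch rather than a patch. The function
name is hypothetical, the 10-bit mask is the AVIC_PHYSICAL_MAX_INDEX_MASK from
this patch, and clamping an out-of-range ID to the mask is just an assumption
made to keep the example well defined.

```c
#include <stddef.h>
#include <stdint.h>

/* Low 10 bits of the avic_physical_id field hold the max index. */
#define MAX_INDEX_MASK	0x3ffull

/*
 * Return @field with its max-index bits replaced by the highest APIC ID
 * currently in use, so that the table walk stops at the last live entry.
 */
uint64_t set_max_index(uint64_t field, const uint32_t *apic_ids, size_t n)
{
	uint64_t max = 0;

	for (size_t i = 0; i < n; i++)
		if (apic_ids[i] > max)
			max = apic_ids[i];

	if (max > MAX_INDEX_MASK)	/* clamp: assumed policy for this sketch */
		max = MAX_INDEX_MASK;

	return (field & ~MAX_INDEX_MASK) | max;
}
```

For a guest whose highest APIC ID is 5, the table pointer bits are kept and the
low bits become 5 instead of the per-mode maximum.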


> +
> + vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
> + if (apic_x2apic_mode(svm->vcpu.arch.apic)) {
> + vmcb->control.int_ctl |= X2APIC_MODE_MASK;
> + vmcb->control.avic_physical_id |= X2AVIC_MAX_PHYSICAL_ID;
> + /* Disabling MSR intercept for x2APIC registers */
> + avic_set_x2apic_msr_interception(svm, false);
> + } else {
> + vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID;
> + /* Enabling MSR intercept for x2APIC registers */
> + avic_set_x2apic_msr_interception(svm, true);
> + }
> +}

> +
> +static void avic_deactivate_vmcb(struct vcpu_svm *svm)
> +{
> + struct vmcb *vmcb = svm->vmcb01.ptr;
> +
> + vmcb->control.int_ctl &= ~(AVIC_ENABLE_MASK | X2APIC_MODE_MASK);
> + vmcb->control.avic_physical_id &= ~AVIC_PHYSICAL_MAX_INDEX_MASK;
> +
> + /* Enabling MSR intercept for x2APIC registers */
> + avic_set_x2apic_msr_interception(svm, true);
> +}
Makes sense.


>
> /* Note:
> * This function is called from IOMMU driver to notify
> @@ -183,13 +222,12 @@ void avic_init_vmcb(struct vcpu_svm *svm)
> vmcb->control.avic_backing_page = bpa & AVIC_HPA_MASK;
> vmcb->control.avic_logical_id = lpa & AVIC_HPA_MASK;
> vmcb->control.avic_physical_id = ppa & AVIC_HPA_MASK;
> - vmcb->control.avic_physical_id |= AVIC_MAX_PHYSICAL_ID;
> vmcb->control.avic_vapic_bar = APIC_DEFAULT_PHYS_BASE & VMCB_AVIC_APIC_BAR_MASK;
>
> if (kvm_apicv_activated(svm->vcpu.kvm))
> - vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
> + avic_activate_vmcb(svm);
> else
> - vmcb->control.int_ctl &= ~AVIC_ENABLE_MASK;
> + avic_deactivate_vmcb(svm);
> }
>
> static u64 *avic_get_physical_id_entry(struct kvm_vcpu *vcpu,
> @@ -703,9 +741,9 @@ void svm_refresh_apicv_exec_ctrl(struct kvm_vcpu *vcpu)
> * accordingly before re-activating.
> */
> avic_post_state_restore(vcpu);
> - vmcb->control.int_ctl |= AVIC_ENABLE_MASK;
> + avic_activate_vmcb(svm);
> } else {
> - vmcb->control.int_ctl &= ~AVIC_ENABLE_MASK;
> + avic_deactivate_vmcb(svm);
> }
> vmcb_mark_dirty(vmcb, VMCB_AVIC);
>


Reviewed-by: Maxim Levitsky <[email protected]>

Best regards,
Maxim Levitsky

2022-03-31 05:15:08

by Suthikulpanit, Suravee

Subject: Re: [RFCv2 PATCH 09/12] KVM: SVM: Refresh AVIC settings when changing APIC mode

Hi Maxim,

On 3/24/22 10:35 PM, Maxim Levitsky wrote:
> On Tue, 2022-03-08 at 10:39 -0600, Suravee Suthikulpanit wrote:
>> When APIC mode is updated (e.g. from xAPIC to x2APIC),
>> KVM needs to update AVIC settings accordingly, which is
>> handled by svm_refresh_apicv_exec_ctrl().
>>
>> Signed-off-by: Suravee Suthikulpanit <[email protected]>
>> ---
>> arch/x86/kvm/svm/avic.c | 19 ++++++++++++++++++-
>> 1 file changed, 18 insertions(+), 1 deletion(-)
>>
>> diff --git a/arch/x86/kvm/svm/avic.c b/arch/x86/kvm/svm/avic.c
>> index 7e5a39a8e698..53559b8dfa52 100644
>> --- a/arch/x86/kvm/svm/avic.c
>> +++ b/arch/x86/kvm/svm/avic.c
>> @@ -625,7 +625,24 @@ void avic_post_state_restore(struct kvm_vcpu *vcpu)
>>
>> void svm_set_virtual_apic_mode(struct kvm_vcpu *vcpu)
>> {
>> - return;
>> + struct vcpu_svm *svm = to_svm(vcpu);
>> +
>> + if (!lapic_in_kernel(vcpu) || (avic_mode == AVIC_MODE_NONE))
>> + return;
>> +
>> + if (kvm_get_apic_mode(vcpu) == LAPIC_MODE_INVALID)
>> + WARN_ONCE(true, "Invalid local APIC state");
>> +
>> + svm->vmcb->control.avic_vapic_bar = svm->vcpu.arch.apic_base &
>> + VMCB_AVIC_APIC_BAR_MASK;
>
> No need for that - APIC base relocation doesn't work when AVIC is enabled,
> since the page which contains it has to be marked R/W in NPT, which we
> only do for the default APIC base.
>
> I recently removed the code from AVIC which still tried to set the
> 'avic_vapic_bar' like this.

Got it. I'll remove this part.

>
>> + kvm_vcpu_update_apicv(&svm->vcpu);
>> +
>> + /*
>> + * The VM could be running w/ AVIC activated switching from APIC
>> + * to x2APIC mode. We need to all refresh to make sure that all
>> + * x2AVIC configuration are being done.
>
> Why? When AVIC is un-inhibited later then the svm_refresh_apicv_exec_ctrl will be called
> again and switch to x2avic mode I think.

The current version does not disable AVIC when the APIC is disabled, which happens during
APIC mode switching (i.e. xAPIC -> disabled -> x2APIC). This needs to be fixed.
Then we can remove the forced refresh.
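
As a rough sketch of the intended behavior (standalone code, not the reworked
patch; the enum and function names are made up for illustration), the mode
follows the enable (bit 11) and x2APIC (bit 10) bits of the APIC base MSR, and
AVIC is simply off while the APIC is disabled or AVIC is inhibited:

```c
#include <stdbool.h>
#include <stdint.h>

#define APIC_BASE_EXTD		(1ull << 10)	/* x2APIC mode enable */
#define APIC_BASE_ENABLE	(1ull << 11)	/* APIC global enable */

enum apic_mode { APIC_OFF, APIC_XAPIC, APIC_X2APIC, APIC_INVALID };
enum avic_target { TARGET_NONE, TARGET_AVIC, TARGET_X2AVIC };

/* Decode the local APIC mode from the APIC base MSR value. */
enum apic_mode decode_apic_mode(uint64_t apic_base)
{
	bool en = apic_base & APIC_BASE_ENABLE;
	bool extd = apic_base & APIC_BASE_EXTD;

	if (!en)
		return extd ? APIC_INVALID : APIC_OFF;
	return extd ? APIC_X2APIC : APIC_XAPIC;
}

/* Which AVIC flavor, if any, should be active in the VMCB. */
enum avic_target pick_avic_target(uint64_t apic_base, bool inhibited)
{
	if (inhibited)
		return TARGET_NONE;

	switch (decode_apic_mode(apic_base)) {
	case APIC_XAPIC:
		return TARGET_AVIC;
	case APIC_X2APIC:
		return TARGET_X2AVIC;
	default:	/* disabled or invalid: keep AVIC off */
		return TARGET_NONE;
	}
}
```

During xAPIC -> disabled -> x2APIC, the middle step then yields TARGET_NONE, so
the switch to x2AVIC happens through the normal activation path.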

> When AVIC is inhibited, then regardless of x2apic mode, VMCB must not have
> any avic bits set, and all x2apic msrs should be read/write intercepted.,
> thus I don't think that svm_refresh_apicv_exec_ctrl should be force called.

The refresh is normally called only when there is an APICV update request (e.g.
kvm_request_apicv_update(APICV_INHIBIT_REASON_IRQWIN)), which may or may not happen.

However, I have reworked this part. Forcing svm_refresh_apicv_exec_ctrl()
is no longer needed.

Regards,
Suravee

2022-04-05 03:42:56

by Suthikulpanit, Suravee

Subject: Re: [RFCv2 PATCH 08/12] KVM: SVM: Adding support for configuring x2APIC MSRs interception

Hi Maxim,

On 3/24/22 10:19 PM, Maxim Levitsky wrote:
> I did some homework on this, and it looks mostly correct.
>
> However I do wonder if we need that separation of svm_direct_access_msrs and
> direct_access_x2apic_msrs. I understand the performance wise, the
> direct_access_msrs will get longer otherwise (but we don't have to allow
> all x2apic msr range, but only known x2apic registers which aren't that many).
>
> One of the things that I see that *is* broken (at least in theory) is nesting.
>
> init_msrpm_offsets goes over direct_access_msrs and puts the offsets of corresponding
> bits in the hardware msr bitmap into the 'msrpm_offsets'
>
> Then on nested VM entry the nested_svm_vmrun_msrpm uses this list to merge the nested
> and host MSR bitmaps.
> Without x2apic msrs, this means that if L1 chooses to allow L2 to access its x2apic msrs
> it won't work. It is not something that L1 would do often but still allowed to overall.
>
> Honestly we need to write track the nested MSR bitmap to avoid updating it on each VM entry,
> then with this hot path eliminated, I don't think there are other places which update
> the msr interception often, and thus we could just put the x2apic msrs into the
> direct_access_msrs.
>
> Best regards,
> Maxim Levitsky

Good point. I will fix this.

Suravee