2024-06-13 06:08:35

by Yan Zhao

[permalink] [raw]
Subject: [PATCH 0/5] Introduce a quirk to control memslot zap behavior

Today "zapping only memslot leaf SPTEs" on moving/deleting a memslot is not
done. Instead, KVM opts to invalidate all page tables and generate fresh
new ones based on the new memslot layout (referred to as "zap all" for
short). This "zap all" behavior is of low overhead for most use cases, and
is adopted primarily due to a bug which caused VM instability when a VM is
with Nvidia Geforce GPU assigned (see link in patch 1).

However, the "zap all" behavior is not desired for certain specific
scenarios. e.g.
- It's not viable for TDX,
a) TDX requires root page of private page table remains unaltered
throughout the TD life cycle.
b) TDX mandates that leaf entries in private page table must be zapped
prior to non-leaf entries.
c) TDX requires re-accepting of private pages after page dropping.
- It's not performant for scenarios involving frequent deletion and
re-adding of numerous small memslots.

This series therefore introduces the KVM_X86_QUIRK_SLOT_ZAP_ALL quirk,
enabling users to control the behavior of memslot zapping when a memslot is
moved/deleted.

The quirk is turned on by default, leading to invalidation/zapping to all
SPTEs when a memslot is moved/deleted.

Users have the option to turn off the quirk. Doing so will limit the
zapping to only leaf SPTEs within the range of memslot being moved/deleted.

This series has been tested with
- Normal VMs
w/ and w/o device assignment, and kvm selftests

- TDX guests.
Memslot deletion typically does not occur without device assignment for a
TD. Therefore, it is tested with shared device assignment.

Note: For TDX integration, the quirk is currently disabled via TDX code in
QEMU rather than being automatically disabled based on VM type in
KVM, which is not safe. A malfunctioning QEMU that fails to disable
the quirk could result in the shared EPT being invalidated while the
private EPT remains unaffected, as kvm_mmu_zap_all_fast() only
targets the shared EPT.

However, current kvm->arch.disabled_quirks is entirely
user-controlled, and there is no mechanism for users to verify if a
quirk has been disabled by the kernel.
We are therefore wondering which below options are better for TDX:

a) Add a condition for TDX VM type in kvm_arch_flush_shadow_memslot()
besides the testing of kvm_check_has_quirk(). It is similar to
"all new VM types have the quirk disabled". e.g.

static inline bool kvm_memslot_flush_zap_all(struct kvm *kvm)
{
     return kvm->arch.vm_type != KVM_X86_TDX_VM &&
         kvm_check_has_quirk(kvm, KVM_X86_QUIRK_SLOT_ZAP_ALL);
}

b) Init the disabled_quirks based on VM type in kernel, extend
disabled_quirk querying/setting interface to enforce the quirk to
be disabled for TDX.

Patch 1: KVM changes.
Patch 2-5: Selftests updates. Verify memslot move/deletion functionality
with the quirk enabled/disabled.


Yan Zhao (5):
KVM: x86/mmu: Introduce a quirk to control memslot zap behavior
KVM: selftests: Test slot move/delete with slot zap quirk
enabled/disabled
KVM: selftests: Allow slot modification stress test with quirk
disabled
KVM: selftests: Test memslot move in memslot_perf_test with quirk
disabled
KVM: selftests: Test private access to deleted memslot with quirk
disabled

Documentation/virt/kvm/api.rst | 6 ++++
arch/x86/include/asm/kvm_host.h | 3 +-
arch/x86/include/uapi/asm/kvm.h | 1 +
arch/x86/kvm/mmu/mmu.c | 36 ++++++++++++++++++-
.../kvm/memslot_modification_stress_test.c | 19 ++++++++--
.../testing/selftests/kvm/memslot_perf_test.c | 12 ++++++-
.../selftests/kvm/set_memory_region_test.c | 29 ++++++++++-----
.../kvm/x86_64/private_mem_kvm_exits_test.c | 11 ++++--
8 files changed, 102 insertions(+), 15 deletions(-)

base-commit: dd5a440a31fae6e459c0d6271dddd62825505361
--
2.43.2


2024-06-13 06:11:07

by Yan Zhao

[permalink] [raw]
Subject: [PATCH 1/5] KVM: x86/mmu: Introduce a quirk to control memslot zap behavior

Introduce a quirk KVM_X86_QUIRK_SLOT_ZAP_ALL to allow users to choose
between enabling the quirk to invalidate/zap all SPTEs when a memslot is
moved/deleted and disabling the quirk to zap precisely/slowly of only leaf
SPTEs that are within the range of moving/deleting memslot.

The quirk is turned on by default. This is to work around a bug [1] where
the changing the zapping behavior of memslot move/deletion would cause VM
instability for VMs with an Nvidia GPU assigned.

Users have the option to deactivate the quirk for specific VMs that are
unaffected by this bug. Turning off the quirk enables a more precise
zapping of SPTEs within the memory slot range, enhancing performance for
certain scenarios [2] and meeting the functional requirements for TDX.
In TDX, it is crucial that the root page of the private page table remains
unchanged, with leaf entries being zapped before non-leaf entries.
Additionally, any pages dropped in TDX would necessitate their
re-acceptance by the guest.

Previously, an attempt was made to introduce a per-VM capability [3] as a
workaround for the bug. However, this approach was not preferred because
the root cause of the bug remained unidentified. An alternative solution
involving a per-memslot flag [4] was also considered but ultimately
rejected. Sean and Paolo thereafter recommended the implementation of this
quirk and explained that it's the least bad option [5].

For the quirk disabled case, add a new function kvm_mmu_zap_memslot_leafs()
to zap leaf SPTEs within a memslot when moving/deleting a memslot. Rather
than further calling kvm_unmap_gfn_range() for the actual zapping, this
function bypasses the special handling to APIC_ACCESS_PAGE_PRIVATE_MEMSLOT.
This is based on the considerations that
1) The APIC_ACCESS_PAGE_PRIVATE_MEMSLOT cannot be created by users, nor can
it be moved. It is only deleted by KVM when APICv is permanently
inhibited.
2) kvm_vcpu_reload_apic_access_page() effectively does nothing when
APIC_ACCESS_PAGE_PRIVATE_MEMSLOT is deleted.
3) Avoid making all cpus request of KVM_REQ_APIC_PAGE_RELOAD can save on
costly IPIs.

Suggested-by: Kai Huang <[email protected]>
Suggested-by: Sean Christopherson <[email protected]>
Suggested-by: Paolo Bonzini <[email protected]>
Cc: Rick Edgecombe <[email protected]>
Link: https://patchwork.kernel.org/project/kvm/patch/[email protected] [1]
Link: https://patchwork.kernel.org/project/kvm/patch/[email protected]/#25054908 [2]
Link: https://lore.kernel.org/kvm/[email protected]/T/#mabc0119583dacf621025e9d873c85f4fbaa66d5c [3]
Link: https://lore.kernel.org/all/[email protected] [4]
Link: https://lore.kernel.org/all/[email protected] [5]
Signed-off-by: Yan Zhao <[email protected]>
---
Documentation/virt/kvm/api.rst | 6 ++++++
arch/x86/include/asm/kvm_host.h | 3 ++-
arch/x86/include/uapi/asm/kvm.h | 1 +
arch/x86/kvm/mmu/mmu.c | 36 ++++++++++++++++++++++++++++++++-
4 files changed, 44 insertions(+), 2 deletions(-)

diff --git a/Documentation/virt/kvm/api.rst b/Documentation/virt/kvm/api.rst
index ebdf88078515..37b5ecb25778 100644
--- a/Documentation/virt/kvm/api.rst
+++ b/Documentation/virt/kvm/api.rst
@@ -8146,6 +8146,12 @@ KVM_X86_QUIRK_MWAIT_NEVER_UD_FAULTS By default, KVM emulates MONITOR/MWAIT (if
guest CPUID on writes to MISC_ENABLE if
KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT is
disabled.
+
+KVM_X86_QUIRK_SLOT_ZAP_ALL By default, KVM invalidates all SPTEs in
+ fast way when a memslot is deleted.
+ When this quirk is disabled, KVM zaps only
+ leaf SPTEs that are within the range of the
+ memslot being deleted.
=================================== ============================================

7.32 KVM_CAP_MAX_VCPU_ID
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index c9499e3b5915..8152b5259435 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -2383,7 +2383,8 @@ int memslot_rmap_alloc(struct kvm_memory_slot *slot, unsigned long npages);
KVM_X86_QUIRK_OUT_7E_INC_RIP | \
KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT | \
KVM_X86_QUIRK_FIX_HYPERCALL_INSN | \
- KVM_X86_QUIRK_MWAIT_NEVER_UD_FAULTS)
+ KVM_X86_QUIRK_MWAIT_NEVER_UD_FAULTS | \
+ KVM_X86_QUIRK_SLOT_ZAP_ALL)

/*
* KVM previously used a u32 field in kvm_run to indicate the hypercall was
diff --git a/arch/x86/include/uapi/asm/kvm.h b/arch/x86/include/uapi/asm/kvm.h
index f64421c55266..c5d189f9ca34 100644
--- a/arch/x86/include/uapi/asm/kvm.h
+++ b/arch/x86/include/uapi/asm/kvm.h
@@ -438,6 +438,7 @@ struct kvm_sync_regs {
#define KVM_X86_QUIRK_MISC_ENABLE_NO_MWAIT (1 << 4)
#define KVM_X86_QUIRK_FIX_HYPERCALL_INSN (1 << 5)
#define KVM_X86_QUIRK_MWAIT_NEVER_UD_FAULTS (1 << 6)
+#define KVM_X86_QUIRK_SLOT_ZAP_ALL (1 << 7)

#define KVM_STATE_NESTED_FORMAT_VMX 0
#define KVM_STATE_NESTED_FORMAT_SVM 1
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 1fd2f8ea6fab..6269fa315888 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6987,10 +6987,44 @@ void kvm_arch_flush_shadow_all(struct kvm *kvm)
kvm_mmu_zap_all(kvm);
}

+/*
+ * Zapping leaf SPTEs with memslot range when a memslot is moved/deleted.
+ *
+ * Zapping non-leaf SPTEs, a.k.a. not-last SPTEs, isn't required, worst
+ * case scenario we'll have unused shadow pages lying around until they
+ * are recycled due to age or when the VM is destroyed.
+ */
+static void kvm_mmu_zap_memslot_leafs(struct kvm *kvm, struct kvm_memory_slot *slot)
+{
+ struct kvm_gfn_range range = {
+ .slot = slot,
+ .start = slot->base_gfn,
+ .end = slot->base_gfn + slot->npages,
+ .may_block = true,
+ };
+ bool flush = false;
+
+ write_lock(&kvm->mmu_lock);
+
+ if (kvm_memslots_have_rmaps(kvm))
+ flush = kvm_handle_gfn_range(kvm, &range, kvm_zap_rmap);
+
+ if (tdp_mmu_enabled)
+ flush = kvm_tdp_mmu_unmap_gfn_range(kvm, &range, flush);
+
+ if (flush)
+ kvm_flush_remote_tlbs_memslot(kvm, slot);
+
+ write_unlock(&kvm->mmu_lock);
+}
+
void kvm_arch_flush_shadow_memslot(struct kvm *kvm,
struct kvm_memory_slot *slot)
{
- kvm_mmu_zap_all_fast(kvm);
+ if (kvm_check_has_quirk(kvm, KVM_X86_QUIRK_SLOT_ZAP_ALL))
+ kvm_mmu_zap_all_fast(kvm);
+ else
+ kvm_mmu_zap_memslot_leafs(kvm, slot);
}

void kvm_mmu_invalidate_mmio_sptes(struct kvm *kvm, u64 gen)
--
2.43.2


2024-06-13 06:11:43

by Yan Zhao

[permalink] [raw]
Subject: [PATCH 2/5] KVM: selftests: Test slot move/delete with slot zap quirk enabled/disabled

Update set_memory_region_test to make sure memslot move and deletion
function correctly both when slot zap quirk KVM_X86_QUIRK_SLOT_ZAP_ALL is
enabled and disabled.

Signed-off-by: Yan Zhao <[email protected]>
---
.../selftests/kvm/set_memory_region_test.c | 29 ++++++++++++++-----
1 file changed, 21 insertions(+), 8 deletions(-)

diff --git a/tools/testing/selftests/kvm/set_memory_region_test.c b/tools/testing/selftests/kvm/set_memory_region_test.c
index bb8002084f52..a8267628e9ed 100644
--- a/tools/testing/selftests/kvm/set_memory_region_test.c
+++ b/tools/testing/selftests/kvm/set_memory_region_test.c
@@ -175,7 +175,7 @@ static void guest_code_move_memory_region(void)
GUEST_DONE();
}

-static void test_move_memory_region(void)
+static void test_move_memory_region(bool disable_slot_zap_quirk)
{
pthread_t vcpu_thread;
struct kvm_vcpu *vcpu;
@@ -184,6 +184,9 @@ static void test_move_memory_region(void)

vm = spawn_vm(&vcpu, &vcpu_thread, guest_code_move_memory_region);

+ if (disable_slot_zap_quirk)
+ vm_enable_cap(vm, KVM_CAP_DISABLE_QUIRKS2, KVM_X86_QUIRK_SLOT_ZAP_ALL);
+
hva = addr_gpa2hva(vm, MEM_REGION_GPA);

/*
@@ -266,7 +269,7 @@ static void guest_code_delete_memory_region(void)
GUEST_ASSERT(0);
}

-static void test_delete_memory_region(void)
+static void test_delete_memory_region(bool disable_slot_zap_quirk)
{
pthread_t vcpu_thread;
struct kvm_vcpu *vcpu;
@@ -276,6 +279,9 @@ static void test_delete_memory_region(void)

vm = spawn_vm(&vcpu, &vcpu_thread, guest_code_delete_memory_region);

+ if (disable_slot_zap_quirk)
+ vm_enable_cap(vm, KVM_CAP_DISABLE_QUIRKS2, KVM_X86_QUIRK_SLOT_ZAP_ALL);
+
/* Delete the memory region, the guest should not die. */
vm_mem_region_delete(vm, MEM_REGION_SLOT);
wait_for_vcpu();
@@ -553,7 +559,10 @@ int main(int argc, char *argv[])
{
#ifdef __x86_64__
int i, loops;
+ int j, disable_slot_zap_quirk = 0;

+ if (kvm_check_cap(KVM_CAP_DISABLE_QUIRKS2) & KVM_X86_QUIRK_SLOT_ZAP_ALL)
+ disable_slot_zap_quirk = 1;
/*
* FIXME: the zero-memslot test fails on aarch64 and s390x because
* KVM_RUN fails with ENOEXEC or EFAULT.
@@ -579,13 +588,17 @@ int main(int argc, char *argv[])
else
loops = 10;

- pr_info("Testing MOVE of in-use region, %d loops\n", loops);
- for (i = 0; i < loops; i++)
- test_move_memory_region();
+ for (j = 0; j <= disable_slot_zap_quirk; j++) {
+ pr_info("Testing MOVE of in-use region, %d loops, slot zap quirk %s\n",
+ loops, j ? "disabled" : "enabled");
+ for (i = 0; i < loops; i++)
+ test_move_memory_region(!!j);

- pr_info("Testing DELETE of in-use region, %d loops\n", loops);
- for (i = 0; i < loops; i++)
- test_delete_memory_region();
+ pr_info("Testing DELETE of in-use region, %d loops, slot zap quirk %s\n",
+ loops, j ? "disabled" : "enabled");
+ for (i = 0; i < loops; i++)
+ test_delete_memory_region(!!j);
+ }
#endif

return 0;
--
2.43.2


2024-06-13 06:13:21

by Yan Zhao

[permalink] [raw]
Subject: [PATCH 5/5] KVM: selftests: Test private access to deleted memslot with quirk disabled

Update private_mem_kvm_exits_test to make sure private access to deleted
memslot functional both when quirk KVM_X86_QUIRK_SLOT_ZAP_ALL is enabled
and disabled.

Signed-off-by: Yan Zhao <[email protected]>
---
.../selftests/kvm/x86_64/private_mem_kvm_exits_test.c | 11 +++++++++--
1 file changed, 9 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/kvm/x86_64/private_mem_kvm_exits_test.c b/tools/testing/selftests/kvm/x86_64/private_mem_kvm_exits_test.c
index 13e72fcec8dd..a4d304fce294 100644
--- a/tools/testing/selftests/kvm/x86_64/private_mem_kvm_exits_test.c
+++ b/tools/testing/selftests/kvm/x86_64/private_mem_kvm_exits_test.c
@@ -44,7 +44,7 @@ const struct vm_shape protected_vm_shape = {
.type = KVM_X86_SW_PROTECTED_VM,
};

-static void test_private_access_memslot_deleted(void)
+static void test_private_access_memslot_deleted(bool disable_slot_zap_quirk)
{
struct kvm_vm *vm;
struct kvm_vcpu *vcpu;
@@ -55,6 +55,9 @@ static void test_private_access_memslot_deleted(void)
vm = vm_create_shape_with_one_vcpu(protected_vm_shape, &vcpu,
guest_repeatedly_read);

+ if (disable_slot_zap_quirk)
+ vm_enable_cap(vm, KVM_CAP_DISABLE_QUIRKS2, KVM_X86_QUIRK_SLOT_ZAP_ALL);
+
vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS,
EXITS_TEST_GPA, EXITS_TEST_SLOT,
EXITS_TEST_NPAGES,
@@ -115,6 +118,10 @@ int main(int argc, char *argv[])
{
TEST_REQUIRE(kvm_check_cap(KVM_CAP_VM_TYPES) & BIT(KVM_X86_SW_PROTECTED_VM));

- test_private_access_memslot_deleted();
+ test_private_access_memslot_deleted(false);
+
+ if (kvm_check_cap(KVM_CAP_DISABLE_QUIRKS2) & KVM_X86_QUIRK_SLOT_ZAP_ALL)
+ test_private_access_memslot_deleted(true);
+
test_private_access_memslot_not_private();
}
--
2.43.2


2024-06-13 06:14:41

by Yan Zhao

[permalink] [raw]
Subject: [PATCH 3/5] KVM: selftests: Allow slot modification stress test with quirk disabled

Add a new user option to memslot_modification_stress_test to allow testing
with slot zap quirk KVM_X86_QUIRK_SLOT_ZAP_ALL disabled.

Signed-off-by: Yan Zhao <[email protected]>
---
.../kvm/memslot_modification_stress_test.c | 19 +++++++++++++++++--
1 file changed, 17 insertions(+), 2 deletions(-)

diff --git a/tools/testing/selftests/kvm/memslot_modification_stress_test.c b/tools/testing/selftests/kvm/memslot_modification_stress_test.c
index 05fcf902e067..c6f22ded4c96 100644
--- a/tools/testing/selftests/kvm/memslot_modification_stress_test.c
+++ b/tools/testing/selftests/kvm/memslot_modification_stress_test.c
@@ -85,6 +85,7 @@ struct test_params {
useconds_t delay;
uint64_t nr_iterations;
bool partition_vcpu_memory_access;
+ bool disable_slot_zap_quirk;
};

static void run_test(enum vm_guest_mode mode, void *arg)
@@ -95,6 +96,13 @@ static void run_test(enum vm_guest_mode mode, void *arg)
vm = memstress_create_vm(mode, nr_vcpus, guest_percpu_mem_size, 1,
VM_MEM_SRC_ANONYMOUS,
p->partition_vcpu_memory_access);
+#ifdef __x86_64__
+ if (p->disable_slot_zap_quirk)
+ vm_enable_cap(vm, KVM_CAP_DISABLE_QUIRKS2, KVM_X86_QUIRK_SLOT_ZAP_ALL);
+
+ pr_info("Memslot zap quirk %s\n", p->disable_slot_zap_quirk ?
+ "disabled" : "enabled");
+#endif

pr_info("Finished creating vCPUs\n");

@@ -113,11 +121,12 @@ static void run_test(enum vm_guest_mode mode, void *arg)
static void help(char *name)
{
puts("");
- printf("usage: %s [-h] [-m mode] [-d delay_usec]\n"
+ printf("usage: %s [-h] [-m mode] [-d delay_usec] [-q]\n"
" [-b memory] [-v vcpus] [-o] [-i iterations]\n", name);
guest_modes_help();
printf(" -d: add a delay between each iteration of adding and\n"
" deleting a memslot in usec.\n");
+ printf(" -q: Disable memslot zap quirk.\n");
printf(" -b: specify the size of the memory region which should be\n"
" accessed by each vCPU. e.g. 10M or 3G.\n"
" Default: 1G\n");
@@ -143,7 +152,7 @@ int main(int argc, char *argv[])

guest_modes_append_default();

- while ((opt = getopt(argc, argv, "hm:d:b:v:oi:")) != -1) {
+ while ((opt = getopt(argc, argv, "hm:d:qb:v:oi:")) != -1) {
switch (opt) {
case 'm':
guest_modes_cmdline(optarg);
@@ -166,6 +175,12 @@ int main(int argc, char *argv[])
case 'i':
p.nr_iterations = atoi_positive("Number of iterations", optarg);
break;
+ case 'q':
+ p.disable_slot_zap_quirk = true;
+
+ TEST_REQUIRE(kvm_check_cap(KVM_CAP_DISABLE_QUIRKS2) &
+ KVM_X86_QUIRK_SLOT_ZAP_ALL);
+ break;
case 'h':
default:
help(argv[0]);
--
2.43.2


2024-06-13 06:15:49

by Yan Zhao

[permalink] [raw]
Subject: [PATCH 4/5] KVM: selftests: Test memslot move in memslot_perf_test with quirk disabled

Add a new user option to memslot_perf_test to allow testing memslot move
with quirk KVM_X86_QUIRK_SLOT_ZAP_ALL disabled.

Signed-off-by: Yan Zhao <[email protected]>
---
tools/testing/selftests/kvm/memslot_perf_test.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)

diff --git a/tools/testing/selftests/kvm/memslot_perf_test.c b/tools/testing/selftests/kvm/memslot_perf_test.c
index 579a64f97333..893366982f77 100644
--- a/tools/testing/selftests/kvm/memslot_perf_test.c
+++ b/tools/testing/selftests/kvm/memslot_perf_test.c
@@ -113,6 +113,7 @@ static_assert(ATOMIC_BOOL_LOCK_FREE == 2, "atomic bool is not lockless");
static sem_t vcpu_ready;

static bool map_unmap_verify;
+static bool disable_slot_zap_quirk;

static bool verbose;
#define pr_info_v(...) \
@@ -578,6 +579,9 @@ static bool test_memslot_move_prepare(struct vm_data *data,
uint32_t guest_page_size = data->vm->page_size;
uint64_t movesrcgpa, movetestgpa;

+ if (disable_slot_zap_quirk)
+ vm_enable_cap(data->vm, KVM_CAP_DISABLE_QUIRKS2, KVM_X86_QUIRK_SLOT_ZAP_ALL);
+
movesrcgpa = vm_slot2gpa(data, data->nslots - 1);

if (isactive) {
@@ -896,6 +900,7 @@ static void help(char *name, struct test_args *targs)
pr_info(" -h: print this help screen.\n");
pr_info(" -v: enable verbose mode (not for benchmarking).\n");
pr_info(" -d: enable extra debug checks.\n");
+ pr_info(" -q: Disable memslot zap quirk during memslot move.\n");
pr_info(" -s: specify memslot count cap (-1 means no cap; currently: %i)\n",
targs->nslots);
pr_info(" -f: specify the first test to run (currently: %i; max %zu)\n",
@@ -954,7 +959,7 @@ static bool parse_args(int argc, char *argv[],
uint32_t max_mem_slots;
int opt;

- while ((opt = getopt(argc, argv, "hvds:f:e:l:r:")) != -1) {
+ while ((opt = getopt(argc, argv, "hvdqs:f:e:l:r:")) != -1) {
switch (opt) {
case 'h':
default:
@@ -966,6 +971,11 @@ static bool parse_args(int argc, char *argv[],
case 'd':
map_unmap_verify = true;
break;
+ case 'q':
+ disable_slot_zap_quirk = true;
+ TEST_REQUIRE(kvm_check_cap(KVM_CAP_DISABLE_QUIRKS2) &
+ KVM_X86_QUIRK_SLOT_ZAP_ALL);
+ break;
case 's':
targs->nslots = atoi_paranoid(optarg);
if (targs->nslots <= 1 && targs->nslots != -1) {
--
2.43.2


2024-06-13 20:09:04

by Edgecombe, Rick P

[permalink] [raw]
Subject: Re: [PATCH 0/5] Introduce a quirk to control memslot zap behavior

On Thu, 2024-06-13 at 14:06 +0800, Yan Zhao wrote:
>       a) Add a condition for TDX VM type in kvm_arch_flush_shadow_memslot()
>          besides the testing of kvm_check_has_quirk(). It is similar to
>          "all new VM types have the quirk disabled". e.g.
>
>          static inline bool kvm_memslot_flush_zap_all(struct kvm
> *kvm)                   
>         
> {                                                                             
>   
>               return kvm->arch.vm_type != KVM_X86_TDX_VM
> &&                              
>                      kvm_check_has_quirk(kvm,
> KVM_X86_QUIRK_SLOT_ZAP_ALL);               
>          }
>         
>       b) Init the disabled_quirks based on VM type in kernel, extend
>          disabled_quirk querying/setting interface to enforce the quirk to
>          be disabled for TDX.

I'd prefer to go with option (a) here. Because we don't have any behavior
defined yet for KVM_X86_TDX_VM, we don't really need to "disable a quirk" of it.

Instead we could just define KVM_X86_QUIRK_SLOT_ZAP_ALL to be about the behavior
of the existing vm_types. It would be a few lines of documentation to save
implementing and maintaining a whole interface with special logic for TDX. So to
me it doesn't seem worth it, unless there is some other user for a new more
complex quirk interface.