Given the high cost of NX hugepages in terms of TLB performance, it may
be desirable to disable the mitigation on a per-VM basis. In the case of public
cloud providers with many VMs on a single host, some VMs may be more trusted
than others. In order to maximize performance on critical VMs, while still
providing some protection to the host from iTLB Multihit, allow the mitigation
to be selectively disabled.
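For userspace, opting a trusted VM out of the mitigation is a single
capability enable on the VM file descriptor. Roughly, with error handling
elided (the capability requires CAP_SYS_BOOT, per the last patch in this
series):

  #include <stdio.h>
  #include <sys/ioctl.h>
  #include <linux/kvm.h>

  static void disable_nx_huge_pages(int vm_fd)
  {
          struct kvm_enable_cap cap = {
                  .cap = KVM_CAP_VM_DISABLE_NX_HUGE_PAGES,
          };

          /* Zaps the VM's MMU so mappings are rebuilt without NX. */
          if (ioctl(vm_fd, KVM_ENABLE_CAP, &cap))
                  perror("KVM_ENABLE_CAP");
  }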
Disabling NX hugepages on a VM is relatively straightforward, but I took this
as an opportunity to add some NX hugepages test coverage and clean up selftests
infrastructure a bit.
This series was tested with the new selftest and the rest of the KVM selftests
on an Intel Haswell machine.
The following tests failed, but I do not believe the failures are related to
this series:
userspace_io_test
vmx_nested_tsc_scaling_test
vmx_preemption_timer_test
Changelog:
v1->v2:
Dropped the complicated memslot refactor in favor of Ricardo Koller's
patch with a similar effect.
Incorporated David Dunn's feedback and Reviewed-by tag; shortened waits
to speed up the test.
Ben Gardon (10):
KVM: selftests: Dump VM stats in binary stats test
KVM: selftests: Test reading a single stat
KVM: selftests: Add memslot parameter to elf_load
KVM: selftests: Improve error message in vm_phy_pages_alloc
KVM: selftests: Add NX huge pages test
KVM: x86/MMU: Factor out updating NX hugepages state for a VM
KVM: x86/MMU: Track NX hugepages on a per-VM basis
KVM: x86/MMU: Allow NX huge pages to be disabled on a per-vm basis
KVM: x86: Fix errant brace in KVM capability handling
KVM: x86/MMU: Require reboot permission to disable NX hugepages
Ricardo Koller (1):
KVM: selftests: Add vm_alloc_page_table_in_memslot library function
arch/x86/include/asm/kvm_host.h | 3 +
arch/x86/kvm/mmu.h | 9 +-
arch/x86/kvm/mmu/mmu.c | 23 +-
arch/x86/kvm/mmu/spte.c | 7 +-
arch/x86/kvm/mmu/spte.h | 3 +-
arch/x86/kvm/mmu/tdp_mmu.c | 3 +-
arch/x86/kvm/x86.c | 24 +-
include/uapi/linux/kvm.h | 1 +
tools/testing/selftests/kvm/Makefile | 3 +-
.../selftests/kvm/include/kvm_util_base.h | 10 +
.../selftests/kvm/kvm_binary_stats_test.c | 6 +
tools/testing/selftests/kvm/lib/elf.c | 13 +-
tools/testing/selftests/kvm/lib/kvm_util.c | 230 +++++++++++++++++-
.../kvm/lib/x86_64/nx_huge_pages_guest.S | 45 ++++
.../selftests/kvm/x86_64/nx_huge_pages_test.c | 160 ++++++++++++
.../kvm/x86_64/nx_huge_pages_test.sh | 25 ++
16 files changed, 538 insertions(+), 27 deletions(-)
create mode 100644 tools/testing/selftests/kvm/lib/x86_64/nx_huge_pages_guest.S
create mode 100644 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
create mode 100755 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
--
2.35.1.894.gb6a874cedc-goog
Factor out the code to update the NX hugepages state for an individual
VM. This will be expanded in future commits to allow per-VM control of
NX hugepages.
No functional change intended.
Signed-off-by: Ben Gardon <[email protected]>
---
arch/x86/kvm/mmu/mmu.c | 18 +++++++++++-------
1 file changed, 11 insertions(+), 7 deletions(-)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index 3b8da8b0745e..1b59b56642f1 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6195,6 +6195,15 @@ static void __set_nx_huge_pages(bool val)
nx_huge_pages = itlb_multihit_kvm_mitigation = val;
}
+static void kvm_update_nx_huge_pages(struct kvm *kvm)
+{
+ mutex_lock(&kvm->slots_lock);
+ kvm_mmu_zap_all_fast(kvm);
+ mutex_unlock(&kvm->slots_lock);
+
+ wake_up_process(kvm->arch.nx_lpage_recovery_thread);
+}
+
static int set_nx_huge_pages(const char *val, const struct kernel_param *kp)
{
bool old_val = nx_huge_pages;
@@ -6217,13 +6226,8 @@ static int set_nx_huge_pages(const char *val, const struct kernel_param *kp)
mutex_lock(&kvm_lock);
- list_for_each_entry(kvm, &vm_list, vm_list) {
- mutex_lock(&kvm->slots_lock);
- kvm_mmu_zap_all_fast(kvm);
- mutex_unlock(&kvm->slots_lock);
-
- wake_up_process(kvm->arch.nx_lpage_recovery_thread);
- }
+ list_for_each_entry(kvm, &vm_list, vm_list)
+ kvm_set_nx_huge_pages(kvm);
mutex_unlock(&kvm_lock);
}
--
2.35.1.894.gb6a874cedc-goog
There's currently no test coverage of NX hugepages in KVM selftests, so
add a basic test to ensure that the feature works as intended.
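The wrapper script adjusts KVM module parameters and hugepage counts
under /sys, so it needs to run as root from the selftest output
directory, e.g.:

  cd tools/testing/selftests/kvm/x86_64
  sudo ./nx_huge_pages_test.sh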
Reviewed-by: David Dunn <[email protected]>
Signed-off-by: Ben Gardon <[email protected]>
---
tools/testing/selftests/kvm/Makefile | 3 +-
.../kvm/lib/x86_64/nx_huge_pages_guest.S | 45 ++++++
.../selftests/kvm/x86_64/nx_huge_pages_test.c | 133 ++++++++++++++++++
.../kvm/x86_64/nx_huge_pages_test.sh | 25 ++++
4 files changed, 205 insertions(+), 1 deletion(-)
create mode 100644 tools/testing/selftests/kvm/lib/x86_64/nx_huge_pages_guest.S
create mode 100644 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
create mode 100755 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index 04099f453b59..6ee30c0df323 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -38,7 +38,7 @@ ifeq ($(ARCH),riscv)
endif
LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/rbtree.c lib/sparsebit.c lib/test_util.c lib/guest_modes.c lib/perf_test_util.c
-LIBKVM_x86_64 = lib/x86_64/apic.c lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c lib/x86_64/ucall.c lib/x86_64/handlers.S
+LIBKVM_x86_64 = lib/x86_64/apic.c lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c lib/x86_64/ucall.c lib/x86_64/handlers.S lib/x86_64/nx_huge_pages_guest.S
LIBKVM_aarch64 = lib/aarch64/processor.c lib/aarch64/ucall.c lib/aarch64/handlers.S lib/aarch64/spinlock.c lib/aarch64/gic.c lib/aarch64/gic_v3.c lib/aarch64/vgic.c
LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c lib/s390x/diag318_test_handler.c
LIBKVM_riscv = lib/riscv/processor.c lib/riscv/ucall.c
@@ -56,6 +56,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/kvm_clock_test
TEST_GEN_PROGS_x86_64 += x86_64/kvm_pv_test
TEST_GEN_PROGS_x86_64 += x86_64/mmio_warning_test
TEST_GEN_PROGS_x86_64 += x86_64/mmu_role_test
+TEST_GEN_PROGS_x86_64 += x86_64/nx_huge_pages_test
TEST_GEN_PROGS_x86_64 += x86_64/platform_info_test
TEST_GEN_PROGS_x86_64 += x86_64/pmu_event_filter_test
TEST_GEN_PROGS_x86_64 += x86_64/set_boot_cpu_id
diff --git a/tools/testing/selftests/kvm/lib/x86_64/nx_huge_pages_guest.S b/tools/testing/selftests/kvm/lib/x86_64/nx_huge_pages_guest.S
new file mode 100644
index 000000000000..09c66b9562a3
--- /dev/null
+++ b/tools/testing/selftests/kvm/lib/x86_64/nx_huge_pages_guest.S
@@ -0,0 +1,45 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * tools/testing/selftests/kvm/lib/x86_64/nx_huge_pages_guest.S
+ *
+ * Copyright (C) 2022, Google LLC.
+ */
+
+.include "kvm_util.h"
+
+#define HPAGE_SIZE (2*1024*1024)
+#define PORT_SUCCESS 0x70
+
+.global guest_code0
+.global guest_code1
+
+.align HPAGE_SIZE
+exit_vm:
+ mov $0x1,%edi
+ mov $0x2,%esi
+ mov a_string,%edx
+ mov $0x1,%ecx
+ xor %eax,%eax
+ jmp ucall
+
+
+guest_code0:
+ mov data1, %eax
+ mov data2, %eax
+ jmp exit_vm
+
+.align HPAGE_SIZE
+guest_code1:
+ mov data1, %eax
+ mov data2, %eax
+ jmp exit_vm
+data1:
+.quad 0
+
+.align HPAGE_SIZE
+data2:
+.quad 0
+a_string:
+.string "why does the ucall function take a string argument?"
+
+
diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
new file mode 100644
index 000000000000..2bcbe4efdc6a
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
@@ -0,0 +1,133 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
+ *
+ * Usage: to be run via nx_huge_pages_test.sh, which does the necessary
+ * environment setup and teardown
+ *
+ * Copyright (C) 2022, Google LLC.
+ */
+
+#define _GNU_SOURCE
+
+#include <fcntl.h>
+#include <stdint.h>
+#include <time.h>
+
+#include <test_util.h>
+#include "kvm_util.h"
+
+#define HPAGE_SLOT 10
+#define HPAGE_PADDR_START (10*1024*1024)
+#define HPAGE_SLOT_NPAGES (100*1024*1024/4096)
+
+/* Defined in nx_huge_pages_guest.S */
+void guest_code0(void);
+void guest_code1(void);
+
+static void run_guest_code(struct kvm_vm *vm, void (*guest_code)(void))
+{
+ struct kvm_regs regs;
+
+ vcpu_regs_get(vm, 0, &regs);
+ regs.rip = (uint64_t)guest_code;
+ vcpu_regs_set(vm, 0, &regs);
+ vcpu_run(vm, 0);
+}
+
+static void check_2m_page_count(struct kvm_vm *vm, int expected_pages_2m)
+{
+ int actual_pages_2m;
+
+ actual_pages_2m = vm_get_single_stat(vm, "pages_2m");
+
+ TEST_ASSERT(actual_pages_2m == expected_pages_2m,
+ "Unexpected 2m page count. Expected %d, got %d",
+ expected_pages_2m, actual_pages_2m);
+}
+
+static void check_split_count(struct kvm_vm *vm, int expected_splits)
+{
+ int actual_splits;
+
+ actual_splits = vm_get_single_stat(vm, "nx_lpage_splits");
+
+ TEST_ASSERT(actual_splits == expected_splits,
+ "Unexpected nx lpage split count. Expected %d, got %d",
+ expected_splits, actual_splits);
+}
+
+int main(int argc, char **argv)
+{
+ struct kvm_vm *vm;
+ struct timespec ts;
+
+ vm = vm_create(VM_MODE_DEFAULT, DEFAULT_GUEST_PHY_PAGES, O_RDWR);
+
+ vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS_HUGETLB,
+ HPAGE_PADDR_START, HPAGE_SLOT,
+ HPAGE_SLOT_NPAGES, 0);
+
+ kvm_vm_elf_load_memslot(vm, program_invocation_name, HPAGE_SLOT);
+
+ vm_vcpu_add_default(vm, 0, guest_code0);
+
+ check_2m_page_count(vm, 0);
+ check_split_count(vm, 0);
+
+ /*
+ * Running guest_code0 will access data1 and data2.
+ * This should result in part of the huge page containing guest_code0,
+ * and part of the hugepage containing the ucall function being mapped
+ * at 4K. The huge pages containing data1 and data2 will be mapped
+ * at 2M.
+ */
+ run_guest_code(vm, guest_code0);
+ check_2m_page_count(vm, 2);
+ check_split_count(vm, 2);
+
+ /*
+ * guest_code1 is in the same huge page as data1, so it will cause
+ * that huge page to be remapped at 4k.
+ */
+ run_guest_code(vm, guest_code1);
+ check_2m_page_count(vm, 1);
+ check_split_count(vm, 3);
+
+ /* Run guest_code0 again to check that it has no effect. */
+ run_guest_code(vm, guest_code0);
+ check_2m_page_count(vm, 1);
+ check_split_count(vm, 3);
+
+ /*
+ * Give recovery thread time to run. The wrapper script sets
+ * recovery_period_ms to 100, so wait 1.5x that.
+ */
+ ts.tv_sec = 0;
+ ts.tv_nsec = 150000000;
+ nanosleep(&ts, NULL);
+
+ /*
+ * Now that the reclaimer has run, all the split pages should be gone.
+ */
+ check_2m_page_count(vm, 1);
+ check_split_count(vm, 0);
+
+ /*
+ * The split 2M pages should have been reclaimed, so run guest_code0
+ * again to check that pages are mapped at 2M again.
+ */
+ run_guest_code(vm, guest_code0);
+ check_2m_page_count(vm, 2);
+ check_split_count(vm, 2);
+
+ /* Pages are once again split from running guest_code1. */
+ run_guest_code(vm, guest_code1);
+ check_2m_page_count(vm, 1);
+ check_split_count(vm, 3);
+
+ kvm_vm_free(vm);
+
+ return 0;
+}
+
diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
new file mode 100755
index 000000000000..19fc95723fcb
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
@@ -0,0 +1,25 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0-only
+
+# tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
+# Copyright (C) 2022, Google LLC.
+
+NX_HUGE_PAGES=$(cat /sys/module/kvm/parameters/nx_huge_pages)
+NX_HUGE_PAGES_RECOVERY_RATIO=$(cat /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio)
+NX_HUGE_PAGES_RECOVERY_PERIOD=$(cat /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms)
+HUGE_PAGES=$(cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages)
+
+echo 1 > /sys/module/kvm/parameters/nx_huge_pages
+echo 1 > /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio
+echo 100 > /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
+echo 200 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
+
+./nx_huge_pages_test
+RET=$?
+
+echo $NX_HUGE_PAGES > /sys/module/kvm/parameters/nx_huge_pages
+echo $NX_HUGE_PAGES_RECOVERY_RATIO > /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio
+echo $NX_HUGE_PAGES_RECOVERY_PERIOD > /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
+echo $HUGE_PAGES > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
+
+exit $RET
--
2.35.1.894.gb6a874cedc-goog
The braces around the KVM_CAP_XSAVE2 block also surround the
KVM_CAP_PMU_CAPABILITY block, likely the result of a merge issue. Simply
move the curly brace back to where it belongs.
Fixes: ba7bb663f5547 ("KVM: x86: Provide per VM capability for disabling PMU virtualization")
Signed-off-by: Ben Gardon <[email protected]>
---
arch/x86/kvm/x86.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 73df90a6932b..74351cbb9b5b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4352,10 +4352,10 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
if (r < sizeof(struct kvm_xsave))
r = sizeof(struct kvm_xsave);
break;
+ }
case KVM_CAP_PMU_CAPABILITY:
r = enable_pmu ? KVM_CAP_PMU_VALID_MASK : 0;
break;
- }
case KVM_CAP_DISABLE_QUIRKS2:
r = KVM_X86_VALID_QUIRKS;
break;
--
2.35.1.894.gb6a874cedc-goog
Ensure that the userspace actor attempting to disable NX hugepages has
permission to reboot the system. Since disabling NX hugepages would
allow a guest to crash the system, it is similar to reboot permissions.
This approach is the simplest permission gating, but passing a file
descriptor opened for write for the module parameter would also work
well and be more precise.
The latter approach was suggested by Sean Christopherson.
Suggested-by: Jim Mattson <[email protected]>
Signed-off-by: Ben Gardon <[email protected]>
---
arch/x86/kvm/x86.c | 18 ++++++-
.../selftests/kvm/include/kvm_util_base.h | 2 +
tools/testing/selftests/kvm/lib/kvm_util.c | 7 +++
.../selftests/kvm/x86_64/nx_huge_pages_test.c | 49 ++++++++++++++-----
.../kvm/x86_64/nx_huge_pages_test.sh | 2 +-
5 files changed, 65 insertions(+), 13 deletions(-)
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 74351cbb9b5b..995f30667619 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4256,7 +4256,6 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_SYS_ATTRIBUTES:
case KVM_CAP_VAPIC:
case KVM_CAP_ENABLE_CAP:
- case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES:
r = 1;
break;
case KVM_CAP_EXIT_HYPERCALL:
@@ -4359,6 +4358,14 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_DISABLE_QUIRKS2:
r = KVM_X86_VALID_QUIRKS;
break;
+ case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES:
+ /*
+ * Since the risk of disabling NX hugepages is a guest crashing
+ * the system, ensure the userspace process has permission to
+ * reboot the system.
+ */
+ r = capable(CAP_SYS_BOOT);
+ break;
default:
break;
}
@@ -6050,6 +6057,15 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
mutex_unlock(&kvm->lock);
break;
case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES:
+ /*
+ * Since the risk of disabling NX hugepages is a guest crashing
+ * the system, ensure the userspace process has permission to
+ * reboot the system.
+ */
+ if (!capable(CAP_SYS_BOOT)) {
+ r = -EPERM;
+ break;
+ }
kvm->arch.disable_nx_huge_pages = true;
kvm_update_nx_huge_pages(kvm);
r = 0;
diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index 72163ba2f878..4db8251c3ce5 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -411,4 +411,6 @@ uint64_t vm_get_single_stat(struct kvm_vm *vm, const char *stat_name);
uint32_t guest_get_vcpuid(void);
+void vm_disable_nx_huge_pages(struct kvm_vm *vm);
+
#endif /* SELFTEST_KVM_UTIL_BASE_H */
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 9d72d1bb34fa..46a7fa08d3e0 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -2765,3 +2765,10 @@ uint64_t vm_get_single_stat(struct kvm_vm *vm, const char *stat_name)
return value;
}
+void vm_disable_nx_huge_pages(struct kvm_vm *vm)
+{
+ struct kvm_enable_cap cap = { 0 };
+
+ cap.cap = KVM_CAP_VM_DISABLE_NX_HUGE_PAGES;
+ vm_enable_cap(vm, &cap);
+}
diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
index 2bcbe4efdc6a..5ce98f759bc8 100644
--- a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
+++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
@@ -57,13 +57,40 @@ static void check_split_count(struct kvm_vm *vm, int expected_splits)
expected_splits, actual_splits);
}
+static void help(void)
+{
+ puts("");
+ printf("usage: nx_huge_pages_test.sh [-x]\n");
+ puts("");
+ printf(" -x: Allow executable huge pages on the VM.\n");
+ puts("");
+ exit(0);
+}
+
int main(int argc, char **argv)
{
struct kvm_vm *vm;
struct timespec ts;
+ bool disable_nx = false;
+ int opt;
+
+ while ((opt = getopt(argc, argv, "x")) != -1) {
+ switch (opt) {
+ case 'x':
+ disable_nx = true;
+ break;
+ case 'h':
+ default:
+ help();
+ break;
+ }
+ }
vm = vm_create(VM_MODE_DEFAULT, DEFAULT_GUEST_PHY_PAGES, O_RDWR);
+ if (disable_nx)
+ vm_disable_nx_huge_pages(vm);
+
vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS_HUGETLB,
HPAGE_PADDR_START, HPAGE_SLOT,
HPAGE_SLOT_NPAGES, 0);
@@ -83,21 +110,21 @@ int main(int argc, char **argv)
* at 2M.
*/
run_guest_code(vm, guest_code0);
- check_2m_page_count(vm, 2);
- check_split_count(vm, 2);
+ check_2m_page_count(vm, disable_nx ? 4 : 2);
+ check_split_count(vm, disable_nx ? 0 : 2);
/*
* guest_code1 is in the same huge page as data1, so it will cause
* that huge page to be remapped at 4k.
*/
run_guest_code(vm, guest_code1);
- check_2m_page_count(vm, 1);
- check_split_count(vm, 3);
+ check_2m_page_count(vm, disable_nx ? 4 : 1);
+ check_split_count(vm, disable_nx ? 0 : 3);
/* Run guest_code0 again to check that it has no effect. */
run_guest_code(vm, guest_code0);
- check_2m_page_count(vm, 1);
- check_split_count(vm, 3);
+ check_2m_page_count(vm, disable_nx ? 4 : 1);
+ check_split_count(vm, disable_nx ? 0 : 3);
/*
* Give recovery thread time to run. The wrapper script sets
@@ -110,7 +137,7 @@ int main(int argc, char **argv)
/*
* Now that the reclaimer has run, all the split pages should be gone.
*/
- check_2m_page_count(vm, 1);
+ check_2m_page_count(vm, disable_nx ? 4 : 1);
check_split_count(vm, 0);
/*
@@ -118,13 +145,13 @@ int main(int argc, char **argv)
* again to check that pages are mapped at 2M again.
*/
run_guest_code(vm, guest_code0);
- check_2m_page_count(vm, 2);
- check_split_count(vm, 2);
+ check_2m_page_count(vm, disable_nx ? 4 : 2);
+ check_split_count(vm, disable_nx ? 0 : 2);
/* Pages are once again split from running guest_code1. */
run_guest_code(vm, guest_code1);
- check_2m_page_count(vm, 1);
- check_split_count(vm, 3);
+ check_2m_page_count(vm, disable_nx ? 4 : 1);
+ check_split_count(vm, disable_nx ? 0 : 3);
kvm_vm_free(vm);
diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
index 19fc95723fcb..29f999f48848 100755
--- a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
+++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
@@ -14,7 +14,7 @@ echo 1 > /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio
echo 100 > /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
echo 200 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
-./nx_huge_pages_test
+./nx_huge_pages_test "${@}"
RET=$?
echo $NX_HUGE_PAGES > /sys/module/kvm/parameters/nx_huge_pages
--
2.35.1.894.gb6a874cedc-goog
In some cases, the NX hugepage mitigation for iTLB multihit is not
needed for all guests on a host. Allow disabling the mitigation on a
per-VM basis to avoid the performance hit of NX hugepages on trusted
workloads.
Signed-off-by: Ben Gardon <[email protected]>
---
arch/x86/include/asm/kvm_host.h | 1 +
arch/x86/kvm/mmu.h | 1 +
arch/x86/kvm/mmu/mmu.c | 6 ++++--
arch/x86/kvm/x86.c | 6 ++++++
include/uapi/linux/kvm.h | 1 +
5 files changed, 13 insertions(+), 2 deletions(-)
diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
index 0a0c54639dd8..04ddfc475ce0 100644
--- a/arch/x86/include/asm/kvm_host.h
+++ b/arch/x86/include/asm/kvm_host.h
@@ -1242,6 +1242,7 @@ struct kvm_arch {
#endif
bool nx_huge_pages;
+ bool disable_nx_huge_pages;
};
struct kvm_vm_stat {
diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
index dd28fe8d13ae..36d8d84ca6c6 100644
--- a/arch/x86/kvm/mmu.h
+++ b/arch/x86/kvm/mmu.h
@@ -177,6 +177,7 @@ static inline bool is_nx_huge_page_enabled(struct kvm *kvm)
{
return READ_ONCE(kvm->arch.nx_huge_pages);
}
+void kvm_update_nx_huge_pages(struct kvm *kvm);
static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
u32 err, bool prefetch)
diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
index dc9672f70468..a7d387ccfd74 100644
--- a/arch/x86/kvm/mmu/mmu.c
+++ b/arch/x86/kvm/mmu/mmu.c
@@ -6195,9 +6195,10 @@ static void __set_nx_huge_pages(bool val)
nx_huge_pages = itlb_multihit_kvm_mitigation = val;
}
-static void kvm_update_nx_huge_pages(struct kvm *kvm)
+void kvm_update_nx_huge_pages(struct kvm *kvm)
{
- kvm->arch.nx_huge_pages = nx_huge_pages;
+ kvm->arch.nx_huge_pages = nx_huge_pages &&
+ !kvm->arch.disable_nx_huge_pages;
mutex_lock(&kvm->slots_lock);
kvm_mmu_zap_all_fast(kvm);
@@ -6451,6 +6452,7 @@ int kvm_mmu_post_init_vm(struct kvm *kvm)
int err;
kvm->arch.nx_huge_pages = READ_ONCE(nx_huge_pages);
+ kvm->arch.disable_nx_huge_pages = false;
err = kvm_vm_create_worker_thread(kvm, kvm_nx_lpage_recovery_worker, 0,
"kvm-nx-lpage-recovery",
&kvm->arch.nx_lpage_recovery_thread);
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index 51106d32f04e..73df90a6932b 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -4256,6 +4256,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
case KVM_CAP_SYS_ATTRIBUTES:
case KVM_CAP_VAPIC:
case KVM_CAP_ENABLE_CAP:
+ case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES:
r = 1;
break;
case KVM_CAP_EXIT_HYPERCALL:
@@ -6048,6 +6049,11 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
}
mutex_unlock(&kvm->lock);
break;
+ case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES:
+ kvm->arch.disable_nx_huge_pages = true;
+ kvm_update_nx_huge_pages(kvm);
+ r = 0;
+ break;
default:
r = -EINVAL;
break;
diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
index ee5cc9e2a837..6f9fa7ecfd1e 100644
--- a/include/uapi/linux/kvm.h
+++ b/include/uapi/linux/kvm.h
@@ -1144,6 +1144,7 @@ struct kvm_ppc_resize_hpt {
#define KVM_CAP_S390_MEM_OP_EXTENSION 211
#define KVM_CAP_PMU_CAPABILITY 212
#define KVM_CAP_DISABLE_QUIRKS2 213
+#define KVM_CAP_VM_DISABLE_NX_HUGE_PAGES 214
#ifdef KVM_CAP_IRQ_ROUTING
--
2.35.1.894.gb6a874cedc-goog
On Mon, Mar 21, 2022 at 04:48:44PM -0700, Ben Gardon wrote:
> Ensure that the userspace actor attempting to disable NX hugepages has
> permission to reboot the system. Since disabling NX hugepages would
> allow a guest to crash the system, it is similar to reboot permissions.
>
> This approach is the simplest permission gating, but passing a file
> descriptor opened for write for the module parameter would also work
> well and be more precise.
> The latter approach was suggested by Sean Christopherson.
>
> Suggested-by: Jim Mattson <[email protected]>
> Signed-off-by: Ben Gardon <[email protected]>
> ---
> arch/x86/kvm/x86.c | 18 ++++++-
> .../selftests/kvm/include/kvm_util_base.h | 2 +
> tools/testing/selftests/kvm/lib/kvm_util.c | 7 +++
> .../selftests/kvm/x86_64/nx_huge_pages_test.c | 49 ++++++++++++++-----
> .../kvm/x86_64/nx_huge_pages_test.sh | 2 +-
> 5 files changed, 65 insertions(+), 13 deletions(-)
>
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 74351cbb9b5b..995f30667619 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4256,7 +4256,6 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> case KVM_CAP_SYS_ATTRIBUTES:
> case KVM_CAP_VAPIC:
> case KVM_CAP_ENABLE_CAP:
> - case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES:
> r = 1;
> break;
> case KVM_CAP_EXIT_HYPERCALL:
> @@ -4359,6 +4358,14 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> case KVM_CAP_DISABLE_QUIRKS2:
> r = KVM_X86_VALID_QUIRKS;
> break;
> + case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES:
> + /*
> + * Since the risk of disabling NX hugepages is a guest crashing
> + * the system, ensure the userspace process has permission to
> + * reboot the system.
> + */
> + r = capable(CAP_SYS_BOOT);
Duplicating this check and comment isn't ideal. I think it would be fine
to unconditionally return true here (KVM, after all, does support the
capability) and only check for CAP_SYS_BOOT when userspace attempts to
enable the capability.
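E.g. (untested sketch):

	case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES:
		r = 1;
		break;

with the CAP_SYS_BOOT check done only in kvm_vm_ioctl_enable_cap().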
> + break;
> default:
> break;
> }
> @@ -6050,6 +6057,15 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
> mutex_unlock(&kvm->lock);
> break;
> case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES:
> + /*
> + * Since the risk of disabling NX hugepages is a guest crashing
> + * the system, ensure the userspace process has permission to
> + * reboot the system.
> + */
> + if (!capable(CAP_SYS_BOOT)) {
> + r = -EPERM;
> + break;
> + }
> kvm->arch.disable_nx_huge_pages = true;
> kvm_update_nx_huge_pages(kvm);
> r = 0;
> diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
> index 72163ba2f878..4db8251c3ce5 100644
> --- a/tools/testing/selftests/kvm/include/kvm_util_base.h
> +++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
Can you split out the selftests changes to a separate commit? I have a
feeling you meant to :).
> @@ -411,4 +411,6 @@ uint64_t vm_get_single_stat(struct kvm_vm *vm, const char *stat_name);
>
> uint32_t guest_get_vcpuid(void);
>
> +void vm_disable_nx_huge_pages(struct kvm_vm *vm);
> +
> #endif /* SELFTEST_KVM_UTIL_BASE_H */
> diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
> index 9d72d1bb34fa..46a7fa08d3e0 100644
> --- a/tools/testing/selftests/kvm/lib/kvm_util.c
> +++ b/tools/testing/selftests/kvm/lib/kvm_util.c
> @@ -2765,3 +2765,10 @@ uint64_t vm_get_single_stat(struct kvm_vm *vm, const char *stat_name)
> return value;
> }
>
> +void vm_disable_nx_huge_pages(struct kvm_vm *vm)
> +{
> + struct kvm_enable_cap cap = { 0 };
> +
> + cap.cap = KVM_CAP_VM_DISABLE_NX_HUGE_PAGES;
> + vm_enable_cap(vm, &cap);
> +}
> diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
> index 2bcbe4efdc6a..5ce98f759bc8 100644
> --- a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
> +++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
Will you add a test to exercise the CAP_SYS_BOOT check? At minimum the
selftest should check if it has CAP_SYS_BOOT and act accordingly (e.g.
exiting with KSFT_SKIP).
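Something like this rough sketch, using raw capget(2) (KSFT_SKIP is 4,
from kselftest.h):

	#include <stdbool.h>
	#include <unistd.h>
	#include <sys/syscall.h>
	#include <linux/capability.h>

	static bool have_cap_sys_boot(void)
	{
		struct __user_cap_header_struct hdr = {
			.version = _LINUX_CAPABILITY_VERSION_3,
		};
		struct __user_cap_data_struct data[2] = {};

		if (syscall(SYS_capget, &hdr, data))
			return false;

		return data[CAP_TO_INDEX(CAP_SYS_BOOT)].effective &
		       CAP_TO_MASK(CAP_SYS_BOOT);
	}

Then main() could bail out early with exit(KSFT_SKIP) when that returns
false.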
> @@ -57,13 +57,40 @@ static void check_split_count(struct kvm_vm *vm, int expected_splits)
> expected_splits, actual_splits);
> }
>
> +static void help(void)
> +{
> + puts("");
> + printf("usage: nx_huge_pages_test.sh [-x]\n");
> + puts("");
> + printf(" -x: Allow executable huge pages on the VM.\n");
> + puts("");
> + exit(0);
> +}
> +
> int main(int argc, char **argv)
> {
> struct kvm_vm *vm;
> struct timespec ts;
> + bool disable_nx = false;
> + int opt;
> +
> + while ((opt = getopt(argc, argv, "x")) != -1) {
> + switch (opt) {
> + case 'x':
> + disable_nx = true;
> + break;
> + case 'h':
> + default:
> + help();
> + break;
> + }
> + }
>
> vm = vm_create(VM_MODE_DEFAULT, DEFAULT_GUEST_PHY_PAGES, O_RDWR);
>
> + if (disable_nx)
> + vm_disable_nx_huge_pages(vm);
> +
> vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS_HUGETLB,
> HPAGE_PADDR_START, HPAGE_SLOT,
> HPAGE_SLOT_NPAGES, 0);
> @@ -83,21 +110,21 @@ int main(int argc, char **argv)
> * at 2M.
> */
> run_guest_code(vm, guest_code0);
> - check_2m_page_count(vm, 2);
> - check_split_count(vm, 2);
> + check_2m_page_count(vm, disable_nx ? 4 : 2);
> + check_split_count(vm, disable_nx ? 0 : 2);
>
> /*
> * guest_code1 is in the same huge page as data1, so it will cause
> * that huge page to be remapped at 4k.
> */
> run_guest_code(vm, guest_code1);
> - check_2m_page_count(vm, 1);
> - check_split_count(vm, 3);
> + check_2m_page_count(vm, disable_nx ? 4 : 1);
> + check_split_count(vm, disable_nx ? 0 : 3);
>
> /* Run guest_code0 again to check that it has no effect. */
> run_guest_code(vm, guest_code0);
> - check_2m_page_count(vm, 1);
> - check_split_count(vm, 3);
> + check_2m_page_count(vm, disable_nx ? 4 : 1);
> + check_split_count(vm, disable_nx ? 0 : 3);
>
> /*
> * Give recovery thread time to run. The wrapper script sets
> @@ -110,7 +137,7 @@ int main(int argc, char **argv)
> /*
> * Now that the reclaimer has run, all the split pages should be gone.
> */
> - check_2m_page_count(vm, 1);
> + check_2m_page_count(vm, disable_nx ? 4 : 1);
> check_split_count(vm, 0);
>
> /*
> @@ -118,13 +145,13 @@ int main(int argc, char **argv)
> * again to check that pages are mapped at 2M again.
> */
> run_guest_code(vm, guest_code0);
> - check_2m_page_count(vm, 2);
> - check_split_count(vm, 2);
> + check_2m_page_count(vm, disable_nx ? 4 : 2);
> + check_split_count(vm, disable_nx ? 0 : 2);
>
> /* Pages are once again split from running guest_code1. */
> run_guest_code(vm, guest_code1);
> - check_2m_page_count(vm, 1);
> - check_split_count(vm, 3);
> + check_2m_page_count(vm, disable_nx ? 4 : 1);
> + check_split_count(vm, disable_nx ? 0 : 3);
>
> kvm_vm_free(vm);
>
> diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
> index 19fc95723fcb..29f999f48848 100755
> --- a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
> +++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
> @@ -14,7 +14,7 @@ echo 1 > /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio
> echo 100 > /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
> echo 200 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
>
> -./nx_huge_pages_test
> +./nx_huge_pages_test "${@}"
> RET=$?
>
> echo $NX_HUGE_PAGES > /sys/module/kvm/parameters/nx_huge_pages
> --
> 2.35.1.894.gb6a874cedc-goog
>
On Mon, Mar 21, 2022 at 04:48:40PM -0700, Ben Gardon wrote:
> Factor out the code to update the NX hugepages state for an individual
> VM. This will be expanded in future commits to allow per-VM control of
> NX hugepages.
>
> No functional change intended.
>
> Signed-off-by: Ben Gardon <[email protected]>
> ---
> arch/x86/kvm/mmu/mmu.c | 18 +++++++++++-------
> 1 file changed, 11 insertions(+), 7 deletions(-)
>
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index 3b8da8b0745e..1b59b56642f1 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -6195,6 +6195,15 @@ static void __set_nx_huge_pages(bool val)
> nx_huge_pages = itlb_multihit_kvm_mitigation = val;
> }
>
> +static void kvm_update_nx_huge_pages(struct kvm *kvm)
> +{
> + mutex_lock(&kvm->slots_lock);
> + kvm_mmu_zap_all_fast(kvm);
> + mutex_unlock(&kvm->slots_lock);
> +
> + wake_up_process(kvm->arch.nx_lpage_recovery_thread);
> +}
> +
> static int set_nx_huge_pages(const char *val, const struct kernel_param *kp)
> {
> bool old_val = nx_huge_pages;
> @@ -6217,13 +6226,8 @@ static int set_nx_huge_pages(const char *val, const struct kernel_param *kp)
>
> mutex_lock(&kvm_lock);
>
nit: This blank line is asymmetrical with mutex_unlock().
> - list_for_each_entry(kvm, &vm_list, vm_list) {
> - mutex_lock(&kvm->slots_lock);
> - kvm_mmu_zap_all_fast(kvm);
> - mutex_unlock(&kvm->slots_lock);
> -
> - wake_up_process(kvm->arch.nx_lpage_recovery_thread);
> - }
> + list_for_each_entry(kvm, &vm_list, vm_list)
> + kvm_set_nx_huge_pages(kvm);
This should be kvm_update_nx_huge_pages() right?
> mutex_unlock(&kvm_lock);
> }
>
> --
> 2.35.1.894.gb6a874cedc-goog
>
On Mon, Mar 21, 2022 at 04:48:42PM -0700, Ben Gardon wrote:
> In some cases, the NX hugepage mitigation for iTLB multihit is not
> needed for all guests on a host. Allow disabling the mitigation on a
> per-VM basis to avoid the performance hit of NX hugepages on trusted
> workloads.
>
> Signed-off-by: Ben Gardon <[email protected]>
> ---
> arch/x86/include/asm/kvm_host.h | 1 +
> arch/x86/kvm/mmu.h | 1 +
> arch/x86/kvm/mmu/mmu.c | 6 ++++--
> arch/x86/kvm/x86.c | 6 ++++++
> include/uapi/linux/kvm.h | 1 +
> 5 files changed, 13 insertions(+), 2 deletions(-)
>
> diff --git a/arch/x86/include/asm/kvm_host.h b/arch/x86/include/asm/kvm_host.h
> index 0a0c54639dd8..04ddfc475ce0 100644
> --- a/arch/x86/include/asm/kvm_host.h
> +++ b/arch/x86/include/asm/kvm_host.h
> @@ -1242,6 +1242,7 @@ struct kvm_arch {
> #endif
>
> bool nx_huge_pages;
> + bool disable_nx_huge_pages;
> };
>
> struct kvm_vm_stat {
> diff --git a/arch/x86/kvm/mmu.h b/arch/x86/kvm/mmu.h
> index dd28fe8d13ae..36d8d84ca6c6 100644
> --- a/arch/x86/kvm/mmu.h
> +++ b/arch/x86/kvm/mmu.h
> @@ -177,6 +177,7 @@ static inline bool is_nx_huge_page_enabled(struct kvm *kvm)
> {
> return READ_ONCE(kvm->arch.nx_huge_pages);
> }
> +void kvm_update_nx_huge_pages(struct kvm *kvm);
>
> static inline int kvm_mmu_do_page_fault(struct kvm_vcpu *vcpu, gpa_t cr2_or_gpa,
> u32 err, bool prefetch)
> diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> index dc9672f70468..a7d387ccfd74 100644
> --- a/arch/x86/kvm/mmu/mmu.c
> +++ b/arch/x86/kvm/mmu/mmu.c
> @@ -6195,9 +6195,10 @@ static void __set_nx_huge_pages(bool val)
> nx_huge_pages = itlb_multihit_kvm_mitigation = val;
> }
>
> -static void kvm_update_nx_huge_pages(struct kvm *kvm)
> +void kvm_update_nx_huge_pages(struct kvm *kvm)
> {
> - kvm->arch.nx_huge_pages = nx_huge_pages;
> + kvm->arch.nx_huge_pages = nx_huge_pages &&
> + !kvm->arch.disable_nx_huge_pages;
kvm->arch.nx_huge_pages seems like it could be dropped and
is_nx_huge_page_enabled() could just check this condition.
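I.e. something like (sketch; nx_huge_pages is currently static to mmu.c,
so it would need to be made visible to mmu.h somehow):

	static inline bool is_nx_huge_page_enabled(struct kvm *kvm)
	{
		return READ_ONCE(nx_huge_pages) &&
		       !kvm->arch.disable_nx_huge_pages;
	}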
>
> mutex_lock(&kvm->slots_lock);
> kvm_mmu_zap_all_fast(kvm);
> @@ -6451,6 +6452,7 @@ int kvm_mmu_post_init_vm(struct kvm *kvm)
> int err;
>
> kvm->arch.nx_huge_pages = READ_ONCE(nx_huge_pages);
> + kvm->arch.disable_nx_huge_pages = false;
I believe this can be omitted since kvm_arch is zero-initialized.
> err = kvm_vm_create_worker_thread(kvm, kvm_nx_lpage_recovery_worker, 0,
> "kvm-nx-lpage-recovery",
> &kvm->arch.nx_lpage_recovery_thread);
> diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
> index 51106d32f04e..73df90a6932b 100644
> --- a/arch/x86/kvm/x86.c
> +++ b/arch/x86/kvm/x86.c
> @@ -4256,6 +4256,7 @@ int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
> case KVM_CAP_SYS_ATTRIBUTES:
> case KVM_CAP_VAPIC:
> case KVM_CAP_ENABLE_CAP:
> + case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES:
Please document the new capability.
> r = 1;
> break;
> case KVM_CAP_EXIT_HYPERCALL:
> @@ -6048,6 +6049,11 @@ int kvm_vm_ioctl_enable_cap(struct kvm *kvm,
> }
> mutex_unlock(&kvm->lock);
> break;
> + case KVM_CAP_VM_DISABLE_NX_HUGE_PAGES:
> + kvm->arch.disable_nx_huge_pages = true;
> + kvm_update_nx_huge_pages(kvm);
> + r = 0;
> + break;
> default:
> r = -EINVAL;
> break;
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index ee5cc9e2a837..6f9fa7ecfd1e 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -1144,6 +1144,7 @@ struct kvm_ppc_resize_hpt {
> #define KVM_CAP_S390_MEM_OP_EXTENSION 211
> #define KVM_CAP_PMU_CAPABILITY 212
> #define KVM_CAP_DISABLE_QUIRKS2 213
> +#define KVM_CAP_VM_DISABLE_NX_HUGE_PAGES 214
>
> #ifdef KVM_CAP_IRQ_ROUTING
>
> --
> 2.35.1.894.gb6a874cedc-goog
>
On Mon, Mar 21, 2022 at 04:48:39PM -0700, Ben Gardon wrote:
> There's currently no test coverage of NX hugepages in KVM selftests, so
> add a basic test to ensure that the feature works as intended.
>
> Reviewed-by: David Dunn <[email protected]>
>
> Signed-off-by: Ben Gardon <[email protected]>
> ---
> tools/testing/selftests/kvm/Makefile | 3 +-
> .../kvm/lib/x86_64/nx_huge_pages_guest.S | 45 ++++++
> .../selftests/kvm/x86_64/nx_huge_pages_test.c | 133 ++++++++++++++++++
> .../kvm/x86_64/nx_huge_pages_test.sh | 25 ++++
> 4 files changed, 205 insertions(+), 1 deletion(-)
> create mode 100644 tools/testing/selftests/kvm/lib/x86_64/nx_huge_pages_guest.S
> create mode 100644 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
> create mode 100755 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
>
> diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
> index 04099f453b59..6ee30c0df323 100644
> --- a/tools/testing/selftests/kvm/Makefile
> +++ b/tools/testing/selftests/kvm/Makefile
> @@ -38,7 +38,7 @@ ifeq ($(ARCH),riscv)
> endif
>
> LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/rbtree.c lib/sparsebit.c lib/test_util.c lib/guest_modes.c lib/perf_test_util.c
> -LIBKVM_x86_64 = lib/x86_64/apic.c lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c lib/x86_64/ucall.c lib/x86_64/handlers.S
> +LIBKVM_x86_64 = lib/x86_64/apic.c lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c lib/x86_64/ucall.c lib/x86_64/handlers.S lib/x86_64/nx_huge_pages_guest.S
> LIBKVM_aarch64 = lib/aarch64/processor.c lib/aarch64/ucall.c lib/aarch64/handlers.S lib/aarch64/spinlock.c lib/aarch64/gic.c lib/aarch64/gic_v3.c lib/aarch64/vgic.c
> LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c lib/s390x/diag318_test_handler.c
> LIBKVM_riscv = lib/riscv/processor.c lib/riscv/ucall.c
> @@ -56,6 +56,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/kvm_clock_test
> TEST_GEN_PROGS_x86_64 += x86_64/kvm_pv_test
> TEST_GEN_PROGS_x86_64 += x86_64/mmio_warning_test
> TEST_GEN_PROGS_x86_64 += x86_64/mmu_role_test
> +TEST_GEN_PROGS_x86_64 += x86_64/nx_huge_pages_test
This will make the selftest infrastructure treat nx_huge_pages_test as
the selftest that gets run by default (e.g. if someone runs `make
kselftest`). But you actually want nx_huge_pages_test.sh to be the
selftest that gets run (nx_huge_pages_test is really just a helper
binary). Is that correct?
Take a look at [1] for how to set this up. Specifically I think you want
to move nx_huge_pages_test to TEST_GEN_PROGS_EXTENDED and add
nx_huge_pages_test.sh to TEST_PROGS.
I'd love to have the infrastructure in place for doing this because I've
been wanting to add some shell script wrappers for dirty_log_perf_test
to set up HugeTLBFS and invoke it with various different arguments.
[1] https://www.kernel.org/doc/html/latest/dev-tools/kselftest.html#contributing-new-tests-details
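For this test that would look something like (untested, plus the same
arch plumbing the KVM Makefile already does for TEST_GEN_PROGS):

	TEST_GEN_PROGS_EXTENDED += x86_64/nx_huge_pages_test
	TEST_PROGS += x86_64/nx_huge_pages_test.sh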
> TEST_GEN_PROGS_x86_64 += x86_64/platform_info_test
> TEST_GEN_PROGS_x86_64 += x86_64/pmu_event_filter_test
> TEST_GEN_PROGS_x86_64 += x86_64/set_boot_cpu_id
> diff --git a/tools/testing/selftests/kvm/lib/x86_64/nx_huge_pages_guest.S b/tools/testing/selftests/kvm/lib/x86_64/nx_huge_pages_guest.S
> new file mode 100644
> index 000000000000..09c66b9562a3
> --- /dev/null
> +++ b/tools/testing/selftests/kvm/lib/x86_64/nx_huge_pages_guest.S
> @@ -0,0 +1,45 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * tools/testing/selftests/kvm/lib/x86_64/nx_huge_pages_guest.S
> + *
> + * Copyright (C) 2022, Google LLC.
> + */
> +
> +.include "kvm_util.h"
> +
> +#define HPAGE_SIZE (2*1024*1024)
> +#define PORT_SUCCESS 0x70
> +
> +.global guest_code0
> +.global guest_code1
> +
> +.align HPAGE_SIZE
> +exit_vm:
> + mov $0x1,%edi
> + mov $0x2,%esi
> + mov a_string,%edx
> + mov $0x1,%ecx
> + xor %eax,%eax
> + jmp ucall
> +
> +
> +guest_code0:
> + mov data1, %eax
> + mov data2, %eax
> + jmp exit_vm
> +
> +.align HPAGE_SIZE
> +guest_code1:
> + mov data1, %eax
> + mov data2, %eax
> + jmp exit_vm
> +data1:
> +.quad 0
> +
> +.align HPAGE_SIZE
> +data2:
> +.quad 0
> +a_string:
> +.string "why does the ucall function take a string argument?"
> +
> +
> diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
> new file mode 100644
> index 000000000000..2bcbe4efdc6a
> --- /dev/null
> +++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
> @@ -0,0 +1,133 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
> + *
> + * Usage: to be run via nx_huge_pages_test.sh, which does the necessary
> + * environment setup and teardown
> + *
> + * Copyright (C) 2022, Google LLC.
> + */
> +
> +#define _GNU_SOURCE
> +
> +#include <fcntl.h>
> +#include <stdint.h>
> +#include <time.h>
> +
> +#include <test_util.h>
> +#include "kvm_util.h"
> +
> +#define HPAGE_SLOT 10
> +#define HPAGE_PADDR_START (10*1024*1024)
> +#define HPAGE_SLOT_NPAGES (100*1024*1024/4096)
> +
> +/* Defined in nx_huge_pages_guest.S */
> +void guest_code0(void);
> +void guest_code1(void);
> +
> +static void run_guest_code(struct kvm_vm *vm, void (*guest_code)(void))
> +{
> + struct kvm_regs regs;
> +
> + vcpu_regs_get(vm, 0, &regs);
> + regs.rip = (uint64_t)guest_code;
> + vcpu_regs_set(vm, 0, &regs);
> + vcpu_run(vm, 0);
> +}
> +
> +static void check_2m_page_count(struct kvm_vm *vm, int expected_pages_2m)
> +{
> + int actual_pages_2m;
> +
> + actual_pages_2m = vm_get_single_stat(vm, "pages_2m");
> +
> + TEST_ASSERT(actual_pages_2m == expected_pages_2m,
> + "Unexpected 2m page count. Expected %d, got %d",
> + expected_pages_2m, actual_pages_2m);
> +}
> +
> +static void check_split_count(struct kvm_vm *vm, int expected_splits)
> +{
> + int actual_splits;
> +
> + actual_splits = vm_get_single_stat(vm, "nx_lpage_splits");
> +
> + TEST_ASSERT(actual_splits == expected_splits,
> + "Unexpected nx lpage split count. Expected %d, got %d",
> + expected_splits, actual_splits);
> +}
> +
> +int main(int argc, char **argv)
> +{
> + struct kvm_vm *vm;
> + struct timespec ts;
> +
> + vm = vm_create(VM_MODE_DEFAULT, DEFAULT_GUEST_PHY_PAGES, O_RDWR);
> +
> + vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS_HUGETLB,
> + HPAGE_PADDR_START, HPAGE_SLOT,
> + HPAGE_SLOT_NPAGES, 0);
> +
> + kvm_vm_elf_load_memslot(vm, program_invocation_name, HPAGE_SLOT);
> +
> + vm_vcpu_add_default(vm, 0, guest_code0);
> +
> + check_2m_page_count(vm, 0);
> + check_split_count(vm, 0);
> +
> + /*
> + * Running guest_code0 will access data1 and data2.
> + * This should result in part of the huge page containing guest_code0,
> + * and part of the hugepage containing the ucall function being mapped
> + * at 4K. The huge pages containing data1 and data2 will be mapped
> + * at 2M.
> + */
> + run_guest_code(vm, guest_code0);
> + check_2m_page_count(vm, 2);
> + check_split_count(vm, 2);
> +
> + /*
> + * guest_code1 is in the same huge page as data1, so it will cause
> + * that huge page to be remapped at 4k.
> + */
> + run_guest_code(vm, guest_code1);
> + check_2m_page_count(vm, 1);
> + check_split_count(vm, 3);
> +
> + /* Run guest_code0 again to check that it has no effect. */
> + run_guest_code(vm, guest_code0);
> + check_2m_page_count(vm, 1);
> + check_split_count(vm, 3);
> +
> + /*
> + * Give recovery thread time to run. The wrapper script sets
> + * recovery_period_ms to 100, so wait 1.5x that.
> + */
> + ts.tv_sec = 0;
> + ts.tv_nsec = 150000000;
> + nanosleep(&ts, NULL);
> +
> + /*
> + * Now that the reclaimer has run, all the split pages should be gone.
> + */
> + check_2m_page_count(vm, 1);
> + check_split_count(vm, 0);
> +
> + /*
> + * The split 2M pages should have been reclaimed, so run guest_code0
> + * again to check that pages are mapped at 2M again.
> + */
> + run_guest_code(vm, guest_code0);
> + check_2m_page_count(vm, 2);
> + check_split_count(vm, 2);
> +
> + /* Pages are once again split from running guest_code1. */
> + run_guest_code(vm, guest_code1);
> + check_2m_page_count(vm, 1);
> + check_split_count(vm, 3);
> +
> + kvm_vm_free(vm);
> +
> + return 0;
> +}
> +
> diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
> new file mode 100755
> index 000000000000..19fc95723fcb
> --- /dev/null
> +++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
> @@ -0,0 +1,25 @@
> +#!/bin/bash
> +# SPDX-License-Identifier: GPL-2.0-only
> +
> +# tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
> +# Copyright (C) 2022, Google LLC.
> +
> +NX_HUGE_PAGES=$(cat /sys/module/kvm/parameters/nx_huge_pages)
> +NX_HUGE_PAGES_RECOVERY_RATIO=$(cat /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio)
> +NX_HUGE_PAGES_RECOVERY_PERIOD=$(cat /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms)
> +HUGE_PAGES=$(cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages)
> +
> +echo 1 > /sys/module/kvm/parameters/nx_huge_pages
> +echo 1 > /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio
> +echo 100 > /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
> +echo 200 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> +
> +./nx_huge_pages_test
> +RET=$?
> +
> +echo $NX_HUGE_PAGES > /sys/module/kvm/parameters/nx_huge_pages
> +echo $NX_HUGE_PAGES_RECOVERY_RATIO > /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio
> +echo $NX_HUGE_PAGES_RECOVERY_PERIOD > /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
> +echo $HUGE_PAGES > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> +
> +exit $RET
> --
> 2.35.1.894.gb6a874cedc-goog
>
On Mon, Mar 28, 2022 at 11:57 AM David Matlack <[email protected]> wrote:
>
> On Mon, Mar 21, 2022 at 04:48:39PM -0700, Ben Gardon wrote:
> > There's currently no test coverage of NX hugepages in KVM selftests, so
> > add a basic test to ensure that the feature works as intended.
> >
> > Reviewed-by: David Dunn <[email protected]>
> >
> > Signed-off-by: Ben Gardon <[email protected]>
> > ---
> > tools/testing/selftests/kvm/Makefile | 3 +-
> > .../kvm/lib/x86_64/nx_huge_pages_guest.S | 45 ++++++
> > .../selftests/kvm/x86_64/nx_huge_pages_test.c | 133 ++++++++++++++++++
> > .../kvm/x86_64/nx_huge_pages_test.sh | 25 ++++
> > 4 files changed, 205 insertions(+), 1 deletion(-)
> > create mode 100644 tools/testing/selftests/kvm/lib/x86_64/nx_huge_pages_guest.S
> > create mode 100644 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
> > create mode 100755 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
> >
> > diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
> > index 04099f453b59..6ee30c0df323 100644
> > --- a/tools/testing/selftests/kvm/Makefile
> > +++ b/tools/testing/selftests/kvm/Makefile
> > @@ -38,7 +38,7 @@ ifeq ($(ARCH),riscv)
> > endif
> >
> > LIBKVM = lib/assert.c lib/elf.c lib/io.c lib/kvm_util.c lib/rbtree.c lib/sparsebit.c lib/test_util.c lib/guest_modes.c lib/perf_test_util.c
> > -LIBKVM_x86_64 = lib/x86_64/apic.c lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c lib/x86_64/ucall.c lib/x86_64/handlers.S
> > +LIBKVM_x86_64 = lib/x86_64/apic.c lib/x86_64/processor.c lib/x86_64/vmx.c lib/x86_64/svm.c lib/x86_64/ucall.c lib/x86_64/handlers.S lib/x86_64/nx_huge_pages_guest.S
> > LIBKVM_aarch64 = lib/aarch64/processor.c lib/aarch64/ucall.c lib/aarch64/handlers.S lib/aarch64/spinlock.c lib/aarch64/gic.c lib/aarch64/gic_v3.c lib/aarch64/vgic.c
> > LIBKVM_s390x = lib/s390x/processor.c lib/s390x/ucall.c lib/s390x/diag318_test_handler.c
> > LIBKVM_riscv = lib/riscv/processor.c lib/riscv/ucall.c
> > @@ -56,6 +56,7 @@ TEST_GEN_PROGS_x86_64 += x86_64/kvm_clock_test
> > TEST_GEN_PROGS_x86_64 += x86_64/kvm_pv_test
> > TEST_GEN_PROGS_x86_64 += x86_64/mmio_warning_test
> > TEST_GEN_PROGS_x86_64 += x86_64/mmu_role_test
> > +TEST_GEN_PROGS_x86_64 += x86_64/nx_huge_pages_test
>
> This will make the selftest infrastructure treat nx_huge_pages_test as
> the selftest that gets run by default (e.g. if someone runs `make
> kselftest`). But you actually want nx_huge_pages_test.sh to be the
> selftest that gets run (nx_huge_pages_test is really just a helper
> binary). Is that correct?
>
> Take a look at [1] for how to set this up. Specifically I think you want
> to move nx_huge_pages_test to TEST_GEN_PROGS_EXTENDED and add
> nx_huge_pages_test.sh to TEST_PROGS.
>
> I'd love to have the infrastructure in place for doing this because I've
> been wanting to add some shell script wrappers for dirty_log_perf_test
> to set up HugeTLBFS and invoke it with various different arguments.
>
> [1] https://www.kernel.org/doc/html/latest/dev-tools/kselftest.html#contributing-new-tests-details
Oh awesome, thank you for the tip! I'll try that.
>
> > TEST_GEN_PROGS_x86_64 += x86_64/platform_info_test
> > TEST_GEN_PROGS_x86_64 += x86_64/pmu_event_filter_test
> > TEST_GEN_PROGS_x86_64 += x86_64/set_boot_cpu_id
> > diff --git a/tools/testing/selftests/kvm/lib/x86_64/nx_huge_pages_guest.S b/tools/testing/selftests/kvm/lib/x86_64/nx_huge_pages_guest.S
> > new file mode 100644
> > index 000000000000..09c66b9562a3
> > --- /dev/null
> > +++ b/tools/testing/selftests/kvm/lib/x86_64/nx_huge_pages_guest.S
> > @@ -0,0 +1,45 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * tools/testing/selftests/kvm/lib/x86_64/nx_huge_pages_guest.S
> > + *
> > + * Copyright (C) 2022, Google LLC.
> > + */
> > +
> > +.include "kvm_util.h"
> > +
> > +#define HPAGE_SIZE (2*1024*1024)
> > +#define PORT_SUCCESS 0x70
> > +
> > +.global guest_code0
> > +.global guest_code1
> > +
> > +.align HPAGE_SIZE
> > +exit_vm:
> > + mov $0x1,%edi
> > + mov $0x2,%esi
> > + mov a_string,%edx
> > + mov $0x1,%ecx
> > + xor %eax,%eax
> > + jmp ucall
> > +
> > +
> > +guest_code0:
> > + mov data1, %eax
> > + mov data2, %eax
> > + jmp exit_vm
> > +
> > +.align HPAGE_SIZE
> > +guest_code1:
> > + mov data1, %eax
> > + mov data2, %eax
> > + jmp exit_vm
> > +data1:
> > +.quad 0
> > +
> > +.align HPAGE_SIZE
> > +data2:
> > +.quad 0
> > +a_string:
> > +.string "why does the ucall function take a string argument?"
> > +
> > +
> > diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
> > new file mode 100644
> > index 000000000000..2bcbe4efdc6a
> > --- /dev/null
> > +++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
> > @@ -0,0 +1,133 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
> > + *
> > + * Usage: to be run via nx_huge_pages_test.sh, which does the necessary
> > + * environment setup and teardown
> > + *
> > + * Copyright (C) 2022, Google LLC.
> > + */
> > +
> > +#define _GNU_SOURCE
> > +
> > +#include <fcntl.h>
> > +#include <stdint.h>
> > +#include <time.h>
> > +
> > +#include <test_util.h>
> > +#include "kvm_util.h"
> > +
> > +#define HPAGE_SLOT 10
> > +#define HPAGE_PADDR_START (10*1024*1024)
> > +#define HPAGE_SLOT_NPAGES (100*1024*1024/4096)
> > +
> > +/* Defined in nx_huge_pages_guest.S */
> > +void guest_code0(void);
> > +void guest_code1(void);
> > +
> > +static void run_guest_code(struct kvm_vm *vm, void (*guest_code)(void))
> > +{
> > + struct kvm_regs regs;
> > +
> > + vcpu_regs_get(vm, 0, &regs);
> > + regs.rip = (uint64_t)guest_code;
> > + vcpu_regs_set(vm, 0, &regs);
> > + vcpu_run(vm, 0);
> > +}
> > +
> > +static void check_2m_page_count(struct kvm_vm *vm, int expected_pages_2m)
> > +{
> > + int actual_pages_2m;
> > +
> > + actual_pages_2m = vm_get_single_stat(vm, "pages_2m");
> > +
> > + TEST_ASSERT(actual_pages_2m == expected_pages_2m,
> > + "Unexpected 2m page count. Expected %d, got %d",
> > + expected_pages_2m, actual_pages_2m);
> > +}
> > +
> > +static void check_split_count(struct kvm_vm *vm, int expected_splits)
> > +{
> > + int actual_splits;
> > +
> > + actual_splits = vm_get_single_stat(vm, "nx_lpage_splits");
> > +
> > + TEST_ASSERT(actual_splits == expected_splits,
> > + "Unexpected nx lpage split count. Expected %d, got %d",
> > + expected_splits, actual_splits);
> > +}
> > +
> > +int main(int argc, char **argv)
> > +{
> > + struct kvm_vm *vm;
> > + struct timespec ts;
> > +
> > + vm = vm_create(VM_MODE_DEFAULT, DEFAULT_GUEST_PHY_PAGES, O_RDWR);
> > +
> > + vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS_HUGETLB,
> > + HPAGE_PADDR_START, HPAGE_SLOT,
> > + HPAGE_SLOT_NPAGES, 0);
> > +
> > + kvm_vm_elf_load_memslot(vm, program_invocation_name, HPAGE_SLOT);
> > +
> > + vm_vcpu_add_default(vm, 0, guest_code0);
> > +
> > + check_2m_page_count(vm, 0);
> > + check_split_count(vm, 0);
> > +
> > + /*
> > + * Running guest_code0 will access data1 and data2.
> > + * This should result in part of the huge page containing guest_code0,
> > + * and part of the hugepage containing the ucall function being mapped
> > + * at 4K. The huge pages containing data1 and data2 will be mapped
> > + * at 2M.
> > + */
> > + run_guest_code(vm, guest_code0);
> > + check_2m_page_count(vm, 2);
> > + check_split_count(vm, 2);
> > +
> > + /*
> > + * guest_code1 is in the same huge page as data1, so it will cause
> > + * that huge page to be remapped at 4k.
> > + */
> > + run_guest_code(vm, guest_code1);
> > + check_2m_page_count(vm, 1);
> > + check_split_count(vm, 3);
> > +
> > + /* Run guest_code0 again to check that it has no effect. */
> > + run_guest_code(vm, guest_code0);
> > + check_2m_page_count(vm, 1);
> > + check_split_count(vm, 3);
> > +
> > + /*
> > + * Give recovery thread time to run. The wrapper script sets
> > + * recovery_period_ms to 100, so wait 1.5x that.
> > + */
> > + ts.tv_sec = 0;
> > + ts.tv_nsec = 150000000;
> > + nanosleep(&ts, NULL);
> > +
> > + /*
> > + * Now that the reclaimer has run, all the split pages should be gone.
> > + */
> > + check_2m_page_count(vm, 1);
> > + check_split_count(vm, 0);
> > +
> > + /*
> > + * The split 2M pages should have been reclaimed, so run guest_code0
> > + * again to check that pages are mapped at 2M again.
> > + */
> > + run_guest_code(vm, guest_code0);
> > + check_2m_page_count(vm, 2);
> > + check_split_count(vm, 2);
> > +
> > + /* Pages are once again split from running guest_code1. */
> > + run_guest_code(vm, guest_code1);
> > + check_2m_page_count(vm, 1);
> > + check_split_count(vm, 3);
> > +
> > + kvm_vm_free(vm);
> > +
> > + return 0;
> > +}
> > +
> > diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
> > new file mode 100755
> > index 000000000000..19fc95723fcb
> > --- /dev/null
> > +++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
> > @@ -0,0 +1,25 @@
> > +#!/bin/bash
> > +# SPDX-License-Identifier: GPL-2.0-only
> > +
> > +# tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
> > +# Copyright (C) 2022, Google LLC.
> > +
> > +NX_HUGE_PAGES=$(cat /sys/module/kvm/parameters/nx_huge_pages)
> > +NX_HUGE_PAGES_RECOVERY_RATIO=$(cat /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio)
> > +NX_HUGE_PAGES_RECOVERY_PERIOD=$(cat /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms)
> > +HUGE_PAGES=$(cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages)
> > +
> > +echo 1 > /sys/module/kvm/parameters/nx_huge_pages
> > +echo 1 > /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio
> > +echo 100 > /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
> > +echo 200 > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> > +
> > +./nx_huge_pages_test
> > +RET=$?
> > +
> > +echo $NX_HUGE_PAGES > /sys/module/kvm/parameters/nx_huge_pages
> > +echo $NX_HUGE_PAGES_RECOVERY_RATIO > /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio
> > +echo $NX_HUGE_PAGES_RECOVERY_PERIOD > /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
> > +echo $HUGE_PAGES > /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> > +
> > +exit $RET
> > --
> > 2.35.1.894.gb6a874cedc-goog
> >
On Mon, Mar 28, 2022 at 1:18 PM David Matlack <[email protected]> wrote:
>
> On Mon, Mar 21, 2022 at 04:48:40PM -0700, Ben Gardon wrote:
> > Factor out the code to update the NX hugepages state for an individual
> > VM. This will be expanded in future commits to allow per-VM control of
> > NX hugepages.
> >
> > No functional change intended.
> >
> > Signed-off-by: Ben Gardon <[email protected]>
> > ---
> > arch/x86/kvm/mmu/mmu.c | 18 +++++++++++-------
> > 1 file changed, 11 insertions(+), 7 deletions(-)
> >
> > diff --git a/arch/x86/kvm/mmu/mmu.c b/arch/x86/kvm/mmu/mmu.c
> > index 3b8da8b0745e..1b59b56642f1 100644
> > --- a/arch/x86/kvm/mmu/mmu.c
> > +++ b/arch/x86/kvm/mmu/mmu.c
> > @@ -6195,6 +6195,15 @@ static void __set_nx_huge_pages(bool val)
> > nx_huge_pages = itlb_multihit_kvm_mitigation = val;
> > }
> >
> > +static void kvm_update_nx_huge_pages(struct kvm *kvm)
> > +{
> > + mutex_lock(&kvm->slots_lock);
> > + kvm_mmu_zap_all_fast(kvm);
> > + mutex_unlock(&kvm->slots_lock);
> > +
> > + wake_up_process(kvm->arch.nx_lpage_recovery_thread);
> > +}
> > +
> > static int set_nx_huge_pages(const char *val, const struct kernel_param *kp)
> > {
> > bool old_val = nx_huge_pages;
> > @@ -6217,13 +6226,8 @@ static int set_nx_huge_pages(const char *val, const struct kernel_param *kp)
> >
> > mutex_lock(&kvm_lock);
> >
>
> nit: This blank line is asymmetrical with mutex_unlock().
>
> > - list_for_each_entry(kvm, &vm_list, vm_list) {
> > - mutex_lock(&kvm->slots_lock);
> > - kvm_mmu_zap_all_fast(kvm);
> > - mutex_unlock(&kvm->slots_lock);
> > -
> > - wake_up_process(kvm->arch.nx_lpage_recovery_thread);
> > - }
> > + list_for_each_entry(kvm, &vm_list, vm_list)
> > + kvm_set_nx_huge_pages(kvm);
>
> This should be kvm_update_nx_huge_pages() right?
Oh woops, duh. Apparently I did not compile-test this patch individually.
>
> > mutex_unlock(&kvm_lock);
> > }
> >
> > --
> > 2.35.1.894.gb6a874cedc-goog
> >