2022-06-13 22:04:26

by Ben Gardon

[permalink] [raw]
Subject: [PATCH v9 00/10] KVM: x86: Add a cap to disable NX hugepages on a VM

Given the high cost of NX hugepages in terms of TLB performance, it may
be desirable to disable the mitigation on a per-VM basis. In the case of public
cloud providers with many VMs on a single host, some VMs may be more trusted
than others. In order to maximize performance on critical VMs, while still
providing some protection to the host from iTLB Multihit, allow the mitigation
to be selectively disabled.

Disabling NX hugepages on a VM is relatively straightforward, but I took this
as an opportunity to add some NX hugepages test coverage and clean up selftests
infrastructure a bit.

This series was tested with the new selftest and the rest of the KVM selftests
on an Intel Haswell machine.

Changelog:
v1->v2:
Dropped the complicated memslot refactor in favor of Ricardo Koller's
patch with a similar effect.
Incorporated David Dunn's feedback and reviewed by tag: shortened waits
to speed up test.
v2->v3:
Incorporated a suggestion from David on how to build the NX huge pages
test.
Fixed a build breakage identified by David.
Dropped the per-vm nx_huge_pages field in favor of simply checking the
global + per-VM disable override.
Documented the new capability
Separated out the commit to test disabling NX huge pages
Removed permission check when checking if the disable NX capability is
supported.
Added test coverage for the permission check.
v3->v4:
Collected RB's from Jing and David
Modified stat collection to reduce a memory allocation [David]
Incorporated various improvments to the NX test [David]
Changed the NX disable test to run by default [David]
Removed some now unnecessary commits
Dropped the code to dump KVM stats from the binary stats test, and
factor out parts of the existing test to library functions instead.
[David, Jing, Sean]
Dropped the improvement to a debugging log message as it's no longer
relevant to this series.
v4->v5:
Incorporated cleanup suggestions from David and Sean
Added a patch with style fixes for the binary stats test from Sean
Added a restriction that NX huge pages can only be disabled before
vCPUs are created [Sean]

v5->v6:
Scooped up David's RBs
Added a magic token to skip nx_huge_pages_test when not run via
wrapper script [Sean]
Made the call to nx_huge_pages_test in the wrapper script more
robust [Sean]
Incorportated various nits and comment / documentation suggestions from
Sean.
Improved negative testing of NX disable without reboot permissions. [Sean]

v6->v7:
Collected Peter Xu's Reviewed-by tags
Added stats metadata caching to kvm_util
Misc NX test fixups

v7->v8:
Spell out descriptors in library function names [Sean]
Reorganize stat descriptor size calculation [Sean]
Addded a get_stats_descriptor helper [Sean]
Remove misleading comment about error reporting in read_stat_data() [Sean]
Use unsigned size_t for input to pread [Sean]
Clean up read_stat_data() [Sean]
Add nx_huge_pages_test to .gitignore [Sean]
Fix organization of get_vm_stat() functions. [Sean]
Clean up #defines in NX huge pages test [Sean]
Add flag parsing and reclaim period flag to NX test [Sean]
Don't reduce hugepage allocation for NX test [Sean]
Fix error check when disabling NX huge pages [Sean]
Don't leave reboot permissions on test binary when executing as root [Sean]

v8->v9:
Rebased on top of Sean's giant selftests refactor series

Ben Gardon (9):
KVM: selftests: Remove dynamic memory allocation for stats header
KVM: selftests: Read binary stats header in lib
KVM: selftests: Read binary stats desc in lib
KVM: selftests: Read binary stat data in lib
KVM: selftests: Add NX huge pages test
KVM: x86: Fix errant brace in KVM capability handling
KVM: x86/MMU: Allow NX huge pages to be disabled on a per-vm basis
KVM: selftests: Test disabling NX hugepages on a VM
KVM: selftests: Cache binary stats metadata for duration of test

Sean Christopherson (1):
KVM: selftests: Clean up coding style in binary stats test

Documentation/virt/kvm/api.rst | 16 ++
arch/x86/include/asm/kvm_host.h | 2 +
arch/x86/kvm/mmu/mmu_internal.h | 7 +-
arch/x86/kvm/mmu/spte.c | 7 +-
arch/x86/kvm/mmu/spte.h | 3 +-
arch/x86/kvm/mmu/tdp_mmu.c | 2 +-
arch/x86/kvm/x86.c | 32 ++-
include/uapi/linux/kvm.h | 1 +
tools/testing/selftests/kvm/.gitignore | 1 +
tools/testing/selftests/kvm/Makefile | 10 +
.../selftests/kvm/include/kvm_util_base.h | 59 ++++
.../selftests/kvm/kvm_binary_stats_test.c | 138 +++++----
tools/testing/selftests/kvm/lib/kvm_util.c | 116 ++++++++
.../selftests/kvm/x86_64/nx_huge_pages_test.c | 269 ++++++++++++++++++
.../kvm/x86_64/nx_huge_pages_test.sh | 52 ++++
15 files changed, 635 insertions(+), 80 deletions(-)
create mode 100644 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
create mode 100755 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh

--
2.36.1.476.g0c4daa206d-goog


2022-06-13 22:21:47

by Ben Gardon

[permalink] [raw]
Subject: [PATCH v9 03/10] KVM: selftests: Read binary stats desc in lib

Move the code to read the binary stats descriptors to the KVM selftests
library. It will be re-used by other tests to check KVM behavior.

No functional change intended.

Reviewed-by: David Matlack <[email protected]>
Reviewed-by: Peter Xu <[email protected]>
Signed-off-by: Ben Gardon <[email protected]>
---
.../selftests/kvm/include/kvm_util_base.h | 25 ++++++++++++++
.../selftests/kvm/kvm_binary_stats_test.c | 16 +++------
tools/testing/selftests/kvm/lib/kvm_util.c | 33 +++++++++++++++++++
3 files changed, 63 insertions(+), 11 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index 3d3dd144f427..6c66c6ef485b 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -311,6 +311,31 @@ static inline void read_stats_header(int stats_fd, struct kvm_stats_header *head
TEST_ASSERT(ret == sizeof(*header), "Read stats header");
}

+struct kvm_stats_desc *read_stats_descriptors(int stats_fd,
+ struct kvm_stats_header *header);
+
+static inline ssize_t get_stats_descriptor_size(struct kvm_stats_header *header)
+{
+ /*
+ * The base size of the descriptor is defined by KVM's ABI, but the
+ * size of the name field is variable, as far as KVM's ABI is
+ * concerned. For a given instance of KVM, the name field is the same
+ * size for all stats and is provided in the overall stats header.
+ */
+ return sizeof(struct kvm_stats_desc) + header->name_size;
+}
+
+static inline struct kvm_stats_desc *get_stats_descriptor(struct kvm_stats_desc *stats,
+ int index,
+ struct kvm_stats_header *header)
+{
+ /*
+ * Note, size_desc includes the size of the name field, which is
+ * variable. i.e. this is NOT equivalent to &stats_desc[i].
+ */
+ return (void *)stats + index * get_stats_descriptor_size(header);
+}
+
void vm_create_irqchip(struct kvm_vm *vm);

void vm_set_user_memory_region(struct kvm_vm *vm, uint32_t slot, uint32_t flags,
diff --git a/tools/testing/selftests/kvm/kvm_binary_stats_test.c b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
index 64db17faacd2..9d3b9a0ce2ef 100644
--- a/tools/testing/selftests/kvm/kvm_binary_stats_test.c
+++ b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
@@ -35,7 +35,7 @@ static void stats_test(int stats_fd)
/* Read kvm stats header */
read_stats_header(stats_fd, &header);

- size_desc = sizeof(*stats_desc) + header.name_size;
+ size_desc = get_stats_descriptor_size(&header);

/* Read kvm stats id string */
id = malloc(header.name_size);
@@ -62,18 +62,12 @@ static void stats_test(int stats_fd)
header.data_offset),
"Descriptor block is overlapped with data block");

- /* Allocate memory for stats descriptors */
- stats_desc = calloc(header.num_desc, size_desc);
- TEST_ASSERT(stats_desc, "Allocate memory for stats descriptors");
/* Read kvm stats descriptors */
- ret = pread(stats_fd, stats_desc,
- size_desc * header.num_desc, header.desc_offset);
- TEST_ASSERT(ret == size_desc * header.num_desc,
- "Read KVM stats descriptors");
+ stats_desc = read_stats_descriptors(stats_fd, &header);

/* Sanity check for fields in descriptors */
for (i = 0; i < header.num_desc; ++i) {
- pdesc = (void *)stats_desc + i * size_desc;
+ pdesc = get_stats_descriptor(stats_desc, i, &header);
/* Check type,unit,base boundaries */
TEST_ASSERT((pdesc->flags & KVM_STATS_TYPE_MASK)
<= KVM_STATS_TYPE_MAX, "Unknown KVM stats type");
@@ -129,7 +123,7 @@ static void stats_test(int stats_fd)
"Data size is not correct");
/* Check stats offset */
for (i = 0; i < header.num_desc; ++i) {
- pdesc = (void *)stats_desc + i * size_desc;
+ pdesc = get_stats_descriptor(stats_desc, i, &header);
TEST_ASSERT(pdesc->offset < size_data,
"Invalid offset (%u) for stats: %s",
pdesc->offset, pdesc->name);
@@ -144,7 +138,7 @@ static void stats_test(int stats_fd)
/* Read kvm stats data one by one */
size_data = 0;
for (i = 0; i < header.num_desc; ++i) {
- pdesc = (void *)stats_desc + i * size_desc;
+ pdesc = get_stats_descriptor(stats_desc, i, &header);
ret = pread(stats_fd, stats_data,
pdesc->size * sizeof(*stats_data),
header.data_offset + size_data);
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 39f2f5f1338f..fc957a385a0a 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -1857,3 +1857,36 @@ unsigned int vm_calc_num_guest_pages(enum vm_guest_mode mode, size_t size)
n = DIV_ROUND_UP(size, vm_guest_mode_params[mode].page_size);
return vm_adjust_num_guest_pages(mode, n);
}
+
+/*
+ * Read binary stats descriptors
+ *
+ * Input Args:
+ * stats_fd - the file descriptor for the binary stats file from which to read
+ * header - the binary stats metadata header corresponding to the given FD
+ *
+ * Output Args: None
+ *
+ * Return:
+ * A pointer to a newly allocated series of stat descriptors.
+ * Caller is responsible for freeing the returned kvm_stats_desc.
+ *
+ * Read the stats descriptors from the binary stats interface.
+ */
+struct kvm_stats_desc *read_stats_descriptors(int stats_fd,
+ struct kvm_stats_header *header)
+{
+ struct kvm_stats_desc *stats_desc;
+ ssize_t desc_size, total_size, ret;
+
+ desc_size = get_stats_descriptor_size(header);
+ total_size = header->num_desc * desc_size;
+
+ stats_desc = calloc(header->num_desc, desc_size);
+ TEST_ASSERT(stats_desc, "Allocate memory for stats descriptors");
+
+ ret = pread(stats_fd, stats_desc, total_size, header->desc_offset);
+ TEST_ASSERT(ret == total_size, "Read KVM stats descriptors");
+
+ return stats_desc;
+}
--
2.36.1.476.g0c4daa206d-goog

2022-06-13 22:25:48

by Ben Gardon

[permalink] [raw]
Subject: [PATCH v9 06/10] KVM: selftests: Add NX huge pages test

There's currently no test coverage of NX hugepages in KVM selftests, so
add a basic test to ensure that the feature works as intended.

The test creates a VM with a data slot backed with huge pages. The
memory in the data slot is filled with op-codes for the return
instruction. The guest then executes a series of accesses on the memory,
some reads, some instruction fetches. After each operation, the guest
exits and the test performs some checks on the backing page counts to
ensure that NX page splitting an reclaim work as expected.

Reviewed-by: David Matlack <[email protected]>
Signed-off-by: Ben Gardon <[email protected]>
---
tools/testing/selftests/kvm/.gitignore | 1 +
tools/testing/selftests/kvm/Makefile | 10 +
.../selftests/kvm/include/kvm_util_base.h | 11 +
tools/testing/selftests/kvm/lib/kvm_util.c | 46 ++++
.../selftests/kvm/x86_64/nx_huge_pages_test.c | 229 ++++++++++++++++++
.../kvm/x86_64/nx_huge_pages_test.sh | 40 +++
6 files changed, 337 insertions(+)
create mode 100644 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
create mode 100755 tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh

diff --git a/tools/testing/selftests/kvm/.gitignore b/tools/testing/selftests/kvm/.gitignore
index dd5c88c11059..1f2e81c0f36a 100644
--- a/tools/testing/selftests/kvm/.gitignore
+++ b/tools/testing/selftests/kvm/.gitignore
@@ -28,6 +28,7 @@
/x86_64/max_vcpuid_cap_test
/x86_64/mmio_warning_test
/x86_64/mmu_role_test
+/x86_64/nx_huge_pages_test
/x86_64/platform_info_test
/x86_64/pmu_event_filter_test
/x86_64/set_boot_cpu_id
diff --git a/tools/testing/selftests/kvm/Makefile b/tools/testing/selftests/kvm/Makefile
index b52c130f7b2f..fc2299f31a85 100644
--- a/tools/testing/selftests/kvm/Makefile
+++ b/tools/testing/selftests/kvm/Makefile
@@ -70,6 +70,10 @@ LIBKVM_s390x += lib/s390x/ucall.c
LIBKVM_riscv += lib/riscv/processor.c
LIBKVM_riscv += lib/riscv/ucall.c

+# Non-compiled test targets
+TEST_PROGS_x86_64 += x86_64/nx_huge_pages_test.sh
+
+# Compiled test targets
TEST_GEN_PROGS_x86_64 = x86_64/cpuid_test
TEST_GEN_PROGS_x86_64 += x86_64/cr4_cpuid_sync_test
TEST_GEN_PROGS_x86_64 += x86_64/get_msr_index_features
@@ -134,6 +138,9 @@ TEST_GEN_PROGS_x86_64 += steal_time
TEST_GEN_PROGS_x86_64 += kvm_binary_stats_test
TEST_GEN_PROGS_x86_64 += system_counter_offset_test

+# Compiled outputs used by test targets
+TEST_GEN_PROGS_EXTENDED_x86_64 += x86_64/nx_huge_pages_test
+
TEST_GEN_PROGS_aarch64 += aarch64/arch_timer
TEST_GEN_PROGS_aarch64 += aarch64/debug-exceptions
TEST_GEN_PROGS_aarch64 += aarch64/get-reg-list
@@ -173,7 +180,9 @@ TEST_GEN_PROGS_riscv += kvm_page_table_test
TEST_GEN_PROGS_riscv += set_memory_region_test
TEST_GEN_PROGS_riscv += kvm_binary_stats_test

+TEST_PROGS += $(TEST_PROGS_$(UNAME_M))
TEST_GEN_PROGS += $(TEST_GEN_PROGS_$(UNAME_M))
+TEST_GEN_PROGS_EXTENDED += $(TEST_GEN_PROGS_EXTENDED_$(UNAME_M))
LIBKVM += $(LIBKVM_$(UNAME_M))

INSTALL_HDR_PATH = $(top_srcdir)/usr
@@ -220,6 +229,7 @@ $(LIBKVM_S_OBJ): $(OUTPUT)/%.o: %.S

x := $(shell mkdir -p $(sort $(dir $(TEST_GEN_PROGS))))
$(TEST_GEN_PROGS): $(LIBKVM_OBJS)
+$(TEST_GEN_PROGS_EXTENDED): $(LIBKVM_OBJS)

cscope: include_paths = $(LINUX_TOOL_INCLUDE) $(LINUX_HDR_PATH) include lib ..
cscope:
diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index aa7f8b681944..81ab7adfbef5 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -340,6 +340,17 @@ void read_stat_data(int stats_fd, struct kvm_stats_header *header,
struct kvm_stats_desc *desc, uint64_t *data,
size_t max_elements);

+void __vm_get_stat(struct kvm_vm *vm, const char *stat_name, uint64_t *data,
+ size_t max_elements);
+
+static inline uint64_t vm_get_stat(struct kvm_vm *vm, const char *stat_name)
+{
+ uint64_t data;
+
+ __vm_get_stat(vm, stat_name, &data, 1);
+ return data;
+}
+
void vm_create_irqchip(struct kvm_vm *vm);

void vm_set_user_memory_region(struct kvm_vm *vm, uint32_t slot, uint32_t flags,
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 5b8249a0e1de..0d97142a590e 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -1925,3 +1925,49 @@ void read_stat_data(int stats_fd, struct kvm_stats_header *header,
"pread() on stat '%s' read %ld bytes, wanted %lu bytes",
desc->name, size, ret);
}
+
+/*
+ * Read the data of the named stat
+ *
+ * Input Args:
+ * vm - the VM for which the stat should be read
+ * stat_name - the name of the stat to read
+ * max_elements - the maximum number of 8-byte values to read into data
+ *
+ * Output Args:
+ * data - the buffer into which stat data should be read
+ *
+ * Read the data values of a specified stat from the binary stats interface.
+ */
+void __vm_get_stat(struct kvm_vm *vm, const char *stat_name, uint64_t *data,
+ size_t max_elements)
+{
+ struct kvm_stats_desc *stats_desc;
+ struct kvm_stats_header header;
+ struct kvm_stats_desc *desc;
+ size_t size_desc;
+ int stats_fd;
+ int i;
+
+ stats_fd = vm_get_stats_fd(vm);
+
+ read_stats_header(stats_fd, &header);
+
+ stats_desc = read_stats_descriptors(stats_fd, &header);
+
+ size_desc = get_stats_descriptor_size(&header);
+
+ for (i = 0; i < header.num_desc; ++i) {
+ desc = (void *)stats_desc + (i * size_desc);
+
+ if (strcmp(desc->name, stat_name))
+ continue;
+
+ read_stat_data(stats_fd, &header, desc, data, max_elements);
+
+ break;
+ }
+
+ free(stats_desc);
+ close(stats_fd);
+}
diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
new file mode 100644
index 000000000000..5fa61d225787
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.c
@@ -0,0 +1,229 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * tools/testing/selftests/kvm/nx_huge_page_test.c
+ *
+ * Usage: to be run via nx_huge_page_test.sh, which does the necessary
+ * environment setup and teardown
+ *
+ * Copyright (C) 2022, Google LLC.
+ */
+
+#define _GNU_SOURCE
+
+#include <fcntl.h>
+#include <stdint.h>
+#include <time.h>
+
+#include <test_util.h>
+#include "kvm_util.h"
+#include "processor.h"
+
+#define HPAGE_SLOT 10
+#define HPAGE_GPA (4UL << 30) /* 4G prevents collision w/ slot 0 */
+#define HPAGE_GVA HPAGE_GPA /* GVA is arbitrary, so use GPA. */
+#define PAGES_PER_2MB_HUGE_PAGE 512
+#define HPAGE_SLOT_NPAGES (3 * PAGES_PER_2MB_HUGE_PAGE)
+
+/*
+ * Passed by nx_huge_pages_test.sh to provide an easy warning if this test is
+ * being run without it.
+ */
+#define MAGIC_TOKEN 887563923
+
+/*
+ * x86 opcode for the return instruction. Used to call into, and then
+ * immediately return from, memory backed with hugepages.
+ */
+#define RETURN_OPCODE 0xC3
+
+/* Call the specified memory address. */
+static void guest_do_CALL(uint64_t target)
+{
+ ((void (*)(void)) target)();
+}
+
+/*
+ * Exit the VM after each memory access so that the userspace component of the
+ * test can make assertions about the pages backing the VM.
+ *
+ * See the below for an explanation of how each access should affect the
+ * backing mappings.
+ */
+void guest_code(void)
+{
+ uint64_t hpage_1 = HPAGE_GVA;
+ uint64_t hpage_2 = hpage_1 + (PAGE_SIZE * 512);
+ uint64_t hpage_3 = hpage_2 + (PAGE_SIZE * 512);
+
+ READ_ONCE(*(uint64_t *)hpage_1);
+ GUEST_SYNC(1);
+
+ READ_ONCE(*(uint64_t *)hpage_2);
+ GUEST_SYNC(2);
+
+ guest_do_CALL(hpage_1);
+ GUEST_SYNC(3);
+
+ guest_do_CALL(hpage_3);
+ GUEST_SYNC(4);
+
+ READ_ONCE(*(uint64_t *)hpage_1);
+ GUEST_SYNC(5);
+
+ READ_ONCE(*(uint64_t *)hpage_3);
+ GUEST_SYNC(6);
+}
+
+static void check_2m_page_count(struct kvm_vm *vm, int expected_pages_2m)
+{
+ int actual_pages_2m;
+
+ actual_pages_2m = vm_get_stat(vm, "pages_2m");
+
+ TEST_ASSERT(actual_pages_2m == expected_pages_2m,
+ "Unexpected 2m page count. Expected %d, got %d",
+ expected_pages_2m, actual_pages_2m);
+}
+
+static void check_split_count(struct kvm_vm *vm, int expected_splits)
+{
+ int actual_splits;
+
+ actual_splits = vm_get_stat(vm, "nx_lpage_splits");
+
+ TEST_ASSERT(actual_splits == expected_splits,
+ "Unexpected NX huge page split count. Expected %d, got %d",
+ expected_splits, actual_splits);
+}
+
+static void wait_for_reclaim(int reclaim_period_ms)
+{
+ long reclaim_wait_ms;
+ struct timespec ts;
+
+ reclaim_wait_ms = reclaim_period_ms * 5;
+ ts.tv_sec = reclaim_wait_ms / 1000;
+ ts.tv_nsec = (reclaim_wait_ms - (ts.tv_sec * 1000)) * 1000000;
+ nanosleep(&ts, NULL);
+}
+
+static void help(char *name)
+{
+ puts("");
+ printf("usage: %s [-h] [-p period_ms] [-t token]\n", name);
+ puts("");
+ printf(" -p: The NX reclaim period in miliseconds.\n");
+ printf(" -t: The magic token to indicate environment setup is done.\n");
+ puts("");
+ exit(0);
+}
+
+int main(int argc, char **argv)
+{
+ int reclaim_period_ms = 0, token = 0, opt;
+ struct kvm_vcpu *vcpu;
+ struct kvm_vm *vm;
+ void *hva;
+
+ while ((opt = getopt(argc, argv, "hp:t:")) != -1) {
+ switch (opt) {
+ case 'p':
+ reclaim_period_ms = atoi(optarg);
+ break;
+ case 't':
+ token = atoi(optarg);
+ break;
+ case 'h':
+ default:
+ help(argv[0]);
+ break;
+ }
+ }
+
+ if (token != MAGIC_TOKEN) {
+ print_skip("This test must be run with the magic token %d.\n"
+ "This is done by nx_huge_pages_test.sh, which\n"
+ "also handles environment setup for the test.",
+ MAGIC_TOKEN);
+ exit(KSFT_SKIP);
+ }
+
+ if (!reclaim_period_ms) {
+ print_skip("The NX reclaim period must be specified and non-zero");
+ exit(KSFT_SKIP);
+ }
+
+ vm = vm_create(1);
+ vcpu = vm_vcpu_add(vm, 0, guest_code);
+
+ vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS_HUGETLB,
+ HPAGE_GPA, HPAGE_SLOT,
+ HPAGE_SLOT_NPAGES, 0);
+
+ virt_map(vm, HPAGE_GVA, HPAGE_GPA, HPAGE_SLOT_NPAGES);
+
+ hva = addr_gpa2hva(vm, HPAGE_GPA);
+ memset(hva, RETURN_OPCODE, HPAGE_SLOT_NPAGES * PAGE_SIZE);
+
+ check_2m_page_count(vm, 0);
+ check_split_count(vm, 0);
+
+ /*
+ * The guest code will first read from the first hugepage, resulting
+ * in a huge page mapping being created.
+ */
+ vcpu_run(vcpu);
+ check_2m_page_count(vm, 1);
+ check_split_count(vm, 0);
+
+ /*
+ * Then the guest code will read from the second hugepage, resulting
+ * in another huge page mapping being created.
+ */
+ vcpu_run(vcpu);
+ check_2m_page_count(vm, 2);
+ check_split_count(vm, 0);
+
+ /*
+ * Next, the guest will execute from the first huge page, causing it
+ * to be remapped at 4k.
+ */
+ vcpu_run(vcpu);
+ check_2m_page_count(vm, 1);
+ check_split_count(vm, 1);
+
+ /*
+ * Executing from the third huge page (previously unaccessed) will
+ * cause part to be mapped at 4k.
+ */
+ vcpu_run(vcpu);
+ check_2m_page_count(vm, 1);
+ check_split_count(vm, 2);
+
+ /* Reading from the first huge page again should have no effect. */
+ vcpu_run(vcpu);
+ check_2m_page_count(vm, 1);
+ check_split_count(vm, 2);
+
+ /* Give recovery thread time to run. */
+ wait_for_reclaim(reclaim_period_ms);
+
+ /*
+ * Now that the reclaimer has run, all the split pages should be gone.
+ */
+ check_2m_page_count(vm, 1);
+ check_split_count(vm, 0);
+
+ /*
+ * The 4k mapping on hpage 3 should have been removed, so check that
+ * reading from it causes a huge page mapping to be installed.
+ */
+ vcpu_run(vcpu);
+ check_2m_page_count(vm, 2);
+ check_split_count(vm, 0);
+
+ kvm_vm_free(vm);
+
+ return 0;
+}
+
diff --git a/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
new file mode 100755
index 000000000000..4e090a84f5f3
--- /dev/null
+++ b/tools/testing/selftests/kvm/x86_64/nx_huge_pages_test.sh
@@ -0,0 +1,40 @@
+#!/bin/bash
+# SPDX-License-Identifier: GPL-2.0-only */
+#
+# Wrapper script which performs setup and cleanup for nx_huge_pages_test.
+# Makes use of root privileges to set up huge pages and KVM module parameters.
+#
+# tools/testing/selftests/kvm/nx_huge_page_test.sh
+# Copyright (C) 2022, Google LLC.
+
+set -e
+
+NX_HUGE_PAGES=$(cat /sys/module/kvm/parameters/nx_huge_pages)
+NX_HUGE_PAGES_RECOVERY_RATIO=$(cat /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio)
+NX_HUGE_PAGES_RECOVERY_PERIOD=$(cat /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms)
+HUGE_PAGES=$(cat /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages)
+
+set +e
+
+function sudo_echo () {
+ echo "$1" | sudo tee -a "$2" > /dev/null
+}
+
+(
+ set -e
+
+ sudo_echo 1 /sys/module/kvm/parameters/nx_huge_pages
+ sudo_echo 1 /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio
+ sudo_echo 100 /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
+ sudo_echo "$(( $HUGE_PAGES + 3 ))" /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
+
+ "$(dirname $0)"/nx_huge_pages_test -t 887563923 -p 100
+)
+RET=$?
+
+sudo_echo "$NX_HUGE_PAGES" /sys/module/kvm/parameters/nx_huge_pages
+sudo_echo "$NX_HUGE_PAGES_RECOVERY_RATIO" /sys/module/kvm/parameters/nx_huge_pages_recovery_ratio
+sudo_echo "$NX_HUGE_PAGES_RECOVERY_PERIOD" /sys/module/kvm/parameters/nx_huge_pages_recovery_period_ms
+sudo_echo "$HUGE_PAGES" /sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
+
+exit $RET
--
2.36.1.476.g0c4daa206d-goog

2022-06-13 22:27:01

by Ben Gardon

[permalink] [raw]
Subject: [PATCH v9 05/10] KVM: selftests: Read binary stat data in lib

Move the code to read the binary stats data to the KVM selftests
library. It will be re-used by other tests to check KVM behavior.

Also opportunistically remove an unnecessary calculation with
"size_data" in stats_test.

Reviewed-by: David Matlack <[email protected]>
Reviewed-by: Peter Xu <[email protected]>
Signed-off-by: Ben Gardon <[email protected]>
---
.../selftests/kvm/include/kvm_util_base.h | 4 +++
.../selftests/kvm/kvm_binary_stats_test.c | 9 ++---
tools/testing/selftests/kvm/lib/kvm_util.c | 35 +++++++++++++++++++
3 files changed, 41 insertions(+), 7 deletions(-)

diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h
index 6c66c6ef485b..aa7f8b681944 100644
--- a/tools/testing/selftests/kvm/include/kvm_util_base.h
+++ b/tools/testing/selftests/kvm/include/kvm_util_base.h
@@ -336,6 +336,10 @@ static inline struct kvm_stats_desc *get_stats_descriptor(struct kvm_stats_desc
return (void *)stats + index * get_stats_descriptor_size(header);
}

+void read_stat_data(int stats_fd, struct kvm_stats_header *header,
+ struct kvm_stats_desc *desc, uint64_t *data,
+ size_t max_elements);
+
void vm_create_irqchip(struct kvm_vm *vm);

void vm_set_user_memory_region(struct kvm_vm *vm, uint32_t slot, uint32_t flags,
diff --git a/tools/testing/selftests/kvm/kvm_binary_stats_test.c b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
index 3002fab2bbf1..98b882ec8f98 100644
--- a/tools/testing/selftests/kvm/kvm_binary_stats_test.c
+++ b/tools/testing/selftests/kvm/kvm_binary_stats_test.c
@@ -147,15 +147,10 @@ static void stats_test(int stats_fd)
ret = pread(stats_fd, stats_data, size_data, header.data_offset);
TEST_ASSERT(ret == size_data, "Read KVM stats data");
/* Read kvm stats data one by one */
- size_data = 0;
for (i = 0; i < header.num_desc; ++i) {
pdesc = get_stats_descriptor(stats_desc, i, &header);
- ret = pread(stats_fd, stats_data,
- pdesc->size * sizeof(*stats_data),
- header.data_offset + size_data);
- TEST_ASSERT(ret == pdesc->size * sizeof(*stats_data),
- "Read data of KVM stats: %s", pdesc->name);
- size_data += pdesc->size * sizeof(*stats_data);
+ read_stat_data(stats_fd, &header, pdesc, stats_data,
+ pdesc->size);
}

free(stats_data);
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index fc957a385a0a..5b8249a0e1de 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -1890,3 +1890,38 @@ struct kvm_stats_desc *read_stats_descriptors(int stats_fd,

return stats_desc;
}
+
+/*
+ * Read stat data for a particular stat
+ *
+ * Input Args:
+ * stats_fd - the file descriptor for the binary stats file from which to read
+ * header - the binary stats metadata header corresponding to the given FD
+ * desc - the binary stat metadata for the particular stat to be read
+ * max_elements - the maximum number of 8-byte values to read into data
+ *
+ * Output Args:
+ * data - the buffer into which stat data should be read
+ *
+ * Read the data values of a specified stat from the binary stats interface.
+ */
+void read_stat_data(int stats_fd, struct kvm_stats_header *header,
+ struct kvm_stats_desc *desc, uint64_t *data,
+ size_t max_elements)
+{
+ size_t nr_elements = min_t(ssize_t, desc->size, max_elements);
+ size_t size = nr_elements * sizeof(*data);
+ ssize_t ret;
+
+ TEST_ASSERT(desc->size, "No elements in stat '%s'", desc->name);
+ TEST_ASSERT(max_elements, "Zero elements requested for stat '%s'", desc->name);
+
+ ret = pread(stats_fd, data, size,
+ header->data_offset + desc->offset);
+
+ TEST_ASSERT(ret >= 0, "pread() failed on stat '%s', errno: %i (%s)",
+ desc->name, errno, strerror(errno));
+ TEST_ASSERT(ret == size,
+ "pread() on stat '%s' read %ld bytes, wanted %lu bytes",
+ desc->name, size, ret);
+}
--
2.36.1.476.g0c4daa206d-goog