2021-01-13 02:49:28

by Ben Gardon

Subject: [PATCH 0/6] KVM: selftests: Perf test cleanups and memslot modification test

This series contains a few cleanups that didn't make it into previous
series, including some cosmetic changes and small bug fixes. The series
also lays the groundwork for a memslot modification test which stresses
the memslot update and page fault code paths in an attempt to expose races.
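
At its core, the new test pairs vCPU worker threads that fault in guest
memory with a main thread that adds and deletes a memslot in a tight loop.
A simplified sketch of that loop, built on the existing kvm_util.h helpers
(the slot index, delay parameter, and function name here are illustrative,
not the exact code from patch 6/6):

#define DUMMY_MEMSLOT_INDEX 7	/* illustrative slot number */

/*
 * Repeatedly add and delete a dummy memslot to race memslot updates
 * against the page faults taken by the vCPU worker threads.
 */
static void add_remove_memslot(struct kvm_vm *vm, useconds_t delay,
			       uint64_t nr_iterations, uint64_t gpa)
{
	uint64_t i;

	for (i = 0; i < nr_iterations; i++) {
		/* Add a one-page anonymous-backed slot... */
		vm_userspace_mem_region_add(vm, VM_MEM_SRC_ANONYMOUS, gpa,
					    DUMMY_MEMSLOT_INDEX, 1, 0);
		usleep(delay);
		/* ...then delete it, forcing a memslot update. */
		vm_mem_region_delete(vm, DUMMY_MEMSLOT_INDEX);
	}
}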

Tested: dirty_log_perf_test, memslot_modification_stress_test, and
demand_paging_test were run, with all the patches in this series
applied, on an Intel Skylake machine.

echo Y > /sys/module/kvm/parameters/tdp_mmu; \
./memslot_modification_stress_test -i 1000 -v 64 -b 1G; \
./memslot_modification_stress_test -i 1000 -v 64 -b 64M -o; \
./dirty_log_perf_test -v 64 -b 1G; \
./dirty_log_perf_test -v 64 -b 64M -o; \
./demand_paging_test -v 64 -b 1G; \
./demand_paging_test -v 64 -b 64M -o; \
echo N > /sys/module/kvm/parameters/tdp_mmu; \
./memslot_modification_stress_test -i 1000 -v 64 -b 1G; \
./memslot_modification_stress_test -i 1000 -v 64 -b 64M -o; \
./dirty_log_perf_test -v 64 -b 1G; \
./dirty_log_perf_test -v 64 -b 64M -o; \
./demand_paging_test -v 64 -b 1G; \
./demand_paging_test -v 64 -b 64M -o

The tests behaved as expected, and the series fixed the problem of the
population stage being skipped in dirty_log_perf_test. This can be seen
in the output: the population stage now takes about as long as dirty
pass 1 used to, and dirty pass 1 falls closer to the times for the
other passes.

Note that when running these tests, the -o option causes the test to take
much longer, as the work each vCPU must do increases in proportion to the
number of vCPUs.
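
To see why: with -o, every vCPU is assigned the entire test region rather
than its own slice, so the number of pages each vCPU touches grows with the
vCPU count. A rough sketch of that setup logic, paraphrasing patch 5/6
(variable names approximate):

	if (partition_vcpu_memory_access) {
		/* Default: each vCPU touches only its own slice. */
		vcpu_args->gva = guest_test_virt_mem +
				 (vcpu_id * vcpu_memory_bytes);
		vcpu_args->pages = vcpu_memory_bytes / guest_page_size;
	} else {
		/*
		 * -o: every vCPU touches the whole region, so per-vCPU
		 * work scales with the number of vCPUs.
		 */
		vcpu_args->gva = guest_test_virt_mem;
		vcpu_args->pages = (vcpus * vcpu_memory_bytes) /
				   guest_page_size;
	}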

You can view this series in Gerrit at:
https://linux-review.googlesource.com/c/linux/kernel/git/torvalds/linux/+/7216

Ben Gardon (6):
KVM: selftests: Rename timespec_diff_now to timespec_elapsed
KVM: selftests: Avoid flooding debug log while populating memory
KVM: selftests: Convert iterations to int in dirty_log_perf_test
KVM: selftests: Fix population stage in dirty_log_perf_test
KVM: selftests: Add option to overlap vCPU memory access
KVM: selftests: Add memslot modification stress test

tools/testing/selftests/kvm/.gitignore | 1 +
tools/testing/selftests/kvm/Makefile | 1 +
.../selftests/kvm/demand_paging_test.c | 40 +++-
.../selftests/kvm/dirty_log_perf_test.c | 72 +++---
.../selftests/kvm/include/perf_test_util.h | 4 +-
.../testing/selftests/kvm/include/test_util.h | 2 +-
.../selftests/kvm/lib/perf_test_util.c | 25 ++-
tools/testing/selftests/kvm/lib/test_util.c | 2 +-
.../kvm/memslot_modification_stress_test.c | 211 ++++++++++++++++++
9 files changed, 307 insertions(+), 51 deletions(-)
create mode 100644 tools/testing/selftests/kvm/memslot_modification_stress_test.c

--
2.30.0.284.gd98b1dd5eaa7-goog


2021-01-13 02:50:00

by Ben Gardon

Subject: [PATCH 3/6] KVM: selftests: Convert iterations to int in dirty_log_perf_test

In order to add an iteration -1 to indicate that the memory population
phase has not yet completed, convert the iteration counters to ints.

No functional change intended.

Reviewed-by: Jacob Xu <[email protected]>

Signed-off-by: Ben Gardon <[email protected]>
---
.../selftests/kvm/dirty_log_perf_test.c | 26 +++++++++----------
1 file changed, 13 insertions(+), 13 deletions(-)

diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
index 15a9c45bdb5f..3875f22d7283 100644
--- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
+++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
@@ -28,8 +28,8 @@ static uint64_t guest_percpu_mem_size = DEFAULT_PER_VCPU_MEM_SIZE;
/* Host variables */
static u64 dirty_log_manual_caps;
static bool host_quit;
-static uint64_t iteration;
-static uint64_t vcpu_last_completed_iteration[KVM_MAX_VCPUS];
+static int iteration;
+static int vcpu_last_completed_iteration[KVM_MAX_VCPUS];

static void *vcpu_worker(void *data)
{
@@ -48,7 +48,7 @@ static void *vcpu_worker(void *data)
run = vcpu_state(vm, vcpu_id);

while (!READ_ONCE(host_quit)) {
- uint64_t current_iteration = READ_ONCE(iteration);
+ int current_iteration = READ_ONCE(iteration);

clock_gettime(CLOCK_MONOTONIC, &start);
ret = _vcpu_run(vm, vcpu_id);
@@ -61,17 +61,17 @@ static void *vcpu_worker(void *data)

pr_debug("Got sync event from vCPU %d\n", vcpu_id);
vcpu_last_completed_iteration[vcpu_id] = current_iteration;
- pr_debug("vCPU %d updated last completed iteration to %lu\n",
+ pr_debug("vCPU %d updated last completed iteration to %d\n",
vcpu_id, vcpu_last_completed_iteration[vcpu_id]);

if (current_iteration) {
pages_count += vcpu_args->pages;
total = timespec_add(total, ts_diff);
- pr_debug("vCPU %d iteration %lu dirty memory time: %ld.%.9lds\n",
+ pr_debug("vCPU %d iteration %d dirty memory time: %ld.%.9lds\n",
vcpu_id, current_iteration, ts_diff.tv_sec,
ts_diff.tv_nsec);
} else {
- pr_debug("vCPU %d iteration %lu populate memory time: %ld.%.9lds\n",
+ pr_debug("vCPU %d iteration %d populate memory time: %ld.%.9lds\n",
vcpu_id, current_iteration, ts_diff.tv_sec,
ts_diff.tv_nsec);
}
@@ -81,7 +81,7 @@ static void *vcpu_worker(void *data)
}

avg = timespec_div(total, vcpu_last_completed_iteration[vcpu_id]);
- pr_debug("\nvCPU %d dirtied 0x%lx pages over %lu iterations in %ld.%.9lds. (Avg %ld.%.9lds/iteration)\n",
+ pr_debug("\nvCPU %d dirtied 0x%lx pages over %d iterations in %ld.%.9lds. (Avg %ld.%.9lds/iteration)\n",
vcpu_id, pages_count, vcpu_last_completed_iteration[vcpu_id],
total.tv_sec, total.tv_nsec, avg.tv_sec, avg.tv_nsec);

@@ -144,7 +144,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
}

/* Allow the vCPU to populate memory */
- pr_debug("Starting iteration %lu - Populating\n", iteration);
+ pr_debug("Starting iteration %d - Populating\n", iteration);
while (READ_ONCE(vcpu_last_completed_iteration[vcpu_id]) != iteration)
;

@@ -168,7 +168,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
clock_gettime(CLOCK_MONOTONIC, &start);
iteration++;

- pr_debug("Starting iteration %lu\n", iteration);
+ pr_debug("Starting iteration %d\n", iteration);
for (vcpu_id = 0; vcpu_id < nr_vcpus; vcpu_id++) {
while (READ_ONCE(vcpu_last_completed_iteration[vcpu_id])
!= iteration)
@@ -177,7 +177,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)

ts_diff = timespec_elapsed(start);
vcpu_dirty_total = timespec_add(vcpu_dirty_total, ts_diff);
- pr_info("Iteration %lu dirty memory time: %ld.%.9lds\n",
+ pr_info("Iteration %d dirty memory time: %ld.%.9lds\n",
iteration, ts_diff.tv_sec, ts_diff.tv_nsec);

clock_gettime(CLOCK_MONOTONIC, &start);
@@ -186,7 +186,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
ts_diff = timespec_elapsed(start);
get_dirty_log_total = timespec_add(get_dirty_log_total,
ts_diff);
- pr_info("Iteration %lu get dirty log time: %ld.%.9lds\n",
+ pr_info("Iteration %d get dirty log time: %ld.%.9lds\n",
iteration, ts_diff.tv_sec, ts_diff.tv_nsec);

if (dirty_log_manual_caps) {
@@ -197,7 +197,7 @@ static void run_test(enum vm_guest_mode mode, void *arg)
ts_diff = timespec_elapsed(start);
clear_dirty_log_total = timespec_add(clear_dirty_log_total,
ts_diff);
- pr_info("Iteration %lu clear dirty log time: %ld.%.9lds\n",
+ pr_info("Iteration %d clear dirty log time: %ld.%.9lds\n",
iteration, ts_diff.tv_sec, ts_diff.tv_nsec);
}
}
@@ -273,7 +273,7 @@ int main(int argc, char *argv[])
while ((opt = getopt(argc, argv, "hi:p:m:b:f:v:")) != -1) {
switch (opt) {
case 'i':
- p.iterations = strtol(optarg, NULL, 10);
+ p.iterations = atoi(optarg);
break;
case 'p':
p.phys_offset = strtoull(optarg, NULL, 0);
--
2.30.0.284.gd98b1dd5eaa7-goog

2021-01-13 07:44:50

by Thomas Huth

Subject: Re: [PATCH 3/6] KVM: selftests: Convert iterations to int in dirty_log_perf_test

On 12/01/2021 22.42, Ben Gardon wrote:
> In order to add an iteration -1 to indicate that the memory population
> phase has not yet completed, convert the iteration counters to ints.
>
> No functional change intended.
>
> Reviewed-by: Jacob Xu <[email protected]>
>
> Signed-off-by: Ben Gardon <[email protected]>
> ---
> .../selftests/kvm/dirty_log_perf_test.c | 26 +++++++++----------
> 1 file changed, 13 insertions(+), 13 deletions(-)
>
> diff --git a/tools/testing/selftests/kvm/dirty_log_perf_test.c b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> index 15a9c45bdb5f..3875f22d7283 100644
> --- a/tools/testing/selftests/kvm/dirty_log_perf_test.c
> +++ b/tools/testing/selftests/kvm/dirty_log_perf_test.c
> @@ -28,8 +28,8 @@ static uint64_t guest_percpu_mem_size = DEFAULT_PER_VCPU_MEM_SIZE;
> /* Host variables */
> static u64 dirty_log_manual_caps;
> static bool host_quit;
> -static uint64_t iteration;
> -static uint64_t vcpu_last_completed_iteration[KVM_MAX_VCPUS];
> +static int iteration;
> +static int vcpu_last_completed_iteration[KVM_MAX_VCPUS];

Wouldn't it be better to use signed 64-bit variables instead, i.e. "int64_t"?

Thomas

2021-01-19 04:55:19

by Paolo Bonzini

Subject: Re: [PATCH 0/6] KVM: selftests: Perf test cleanups and memslot modification test

On 12/01/21 22:42, Ben Gardon wrote:
> This series contains a few cleanups that didn't make it into previous
> series, including some cosmetic changes and small bug fixes. The series
> also lays the groundwork for a memslot modification test which stresses
> the memslot update and page fault code paths in an attempt to expose races.
>
> Tested: dirty_log_perf_test, memslot_modification_stress_test, and
> demand_paging_test were run, with all the patches in this series
> applied, on an Intel Skylake machine.
>
> echo Y > /sys/module/kvm/parameters/tdp_mmu; \
> ./memslot_modification_stress_test -i 1000 -v 64 -b 1G; \
> ./memslot_modification_stress_test -i 1000 -v 64 -b 64M -o; \
> ./dirty_log_perf_test -v 64 -b 1G; \
> ./dirty_log_perf_test -v 64 -b 64M -o; \
> ./demand_paging_test -v 64 -b 1G; \
> ./demand_paging_test -v 64 -b 64M -o; \
> echo N > /sys/module/kvm/parameters/tdp_mmu; \
> ./memslot_modification_stress_test -i 1000 -v 64 -b 1G; \
> ./memslot_modification_stress_test -i 1000 -v 64 -b 64M -o; \
> ./dirty_log_perf_test -v 64 -b 1G; \
> ./dirty_log_perf_test -v 64 -b 64M -o; \
> ./demand_paging_test -v 64 -b 1G; \
> ./demand_paging_test -v 64 -b 64M -o
>
> The tests behaved as expected, and the series fixed the problem of the
> population stage being skipped in dirty_log_perf_test. This can be seen
> in the output: the population stage now takes about as long as dirty
> pass 1 used to, and dirty pass 1 falls closer to the times for the
> other passes.
>
> Note that when running these tests, the -o option causes the test to take
> much longer, as the work each vCPU must do increases in proportion to the
> number of vCPUs.
>
> You can view this series in Gerrit at:
> https://linux-review.googlesource.com/c/linux/kernel/git/torvalds/linux/+/7216
>
> Ben Gardon (6):
> KVM: selftests: Rename timespec_diff_now to timespec_elapsed
> KVM: selftests: Avoid flooding debug log while populating memory
> KVM: selftests: Convert iterations to int in dirty_log_perf_test
> KVM: selftests: Fix population stage in dirty_log_perf_test
> KVM: selftests: Add option to overlap vCPU memory access
> KVM: selftests: Add memslot modification stress test
>
> tools/testing/selftests/kvm/.gitignore | 1 +
> tools/testing/selftests/kvm/Makefile | 1 +
> .../selftests/kvm/demand_paging_test.c | 40 +++-
> .../selftests/kvm/dirty_log_perf_test.c | 72 +++---
> .../selftests/kvm/include/perf_test_util.h | 4 +-
> .../testing/selftests/kvm/include/test_util.h | 2 +-
> .../selftests/kvm/lib/perf_test_util.c | 25 ++-
> tools/testing/selftests/kvm/lib/test_util.c | 2 +-
> .../kvm/memslot_modification_stress_test.c | 211 ++++++++++++++++++
> 9 files changed, 307 insertions(+), 51 deletions(-)
> create mode 100644 tools/testing/selftests/kvm/memslot_modification_stress_test.c
>

Queued, thanks.

Paolo