From: Peter Xu
To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: peterx@redhat.com, Sean Christopherson, "Dr.
David Alan Gilbert", Andrew Jones, Paolo Bonzini
Subject: [PATCH v11 12/13] KVM: selftests: Let dirty_log_test async for dirty ring test
Date: Wed, 8 Jul 2020 15:34:07 -0400
Message-Id: <20200708193408.242909-13-peterx@redhat.com>
In-Reply-To: <20200708193408.242909-1-peterx@redhat.com>
References: <20200708193408.242909-1-peterx@redhat.com>

Previously the dirty ring test worked synchronously: only on a vmexit
(there, the ring-full event) would we know that the hardware dirty bits
had been flushed to the dirty ring.

This patch introduces a vcpu kick mechanism using SIGUSR1, which
guarantees a vmexit and, with it, the flushing of the hardware dirty
bits.  With that in place, the vcpu's dirtying work can run
asynchronously from the whole collection procedure.

Still, we need to be careful: we can only run asynchronously while the
vcpu has not reached the soft limit (no KVM_EXIT_DIRTY_RING_FULL).
Otherwise we must collect the dirty bits before letting the vcpu
continue.

Also increase the dirty ring size to the current maximum, so that we
torture the no-ring-full case more, since that should be the major
scenario when hypervisors like QEMU use this feature.
Reviewed-by: Andrew Jones
Signed-off-by: Peter Xu
---
 tools/testing/selftests/kvm/dirty_log_test.c | 126 +++++++++++++-----
 .../testing/selftests/kvm/include/kvm_util.h |   1 +
 tools/testing/selftests/kvm/lib/kvm_util.c   |   9 ++
 3 files changed, 106 insertions(+), 30 deletions(-)

diff --git a/tools/testing/selftests/kvm/dirty_log_test.c b/tools/testing/selftests/kvm/dirty_log_test.c
index 531431cff4fc..4b404dfdc2f9 100644
--- a/tools/testing/selftests/kvm/dirty_log_test.c
+++ b/tools/testing/selftests/kvm/dirty_log_test.c
@@ -13,6 +13,9 @@
 #include <stdlib.h>
 #include <pthread.h>
 #include <semaphore.h>
+#include <sys/types.h>
+#include <signal.h>
+#include <errno.h>
 #include <linux/bitmap.h>
 #include <linux/bitops.h>
 #include <asm/barrier.h>
@@ -59,7 +62,9 @@
 # define test_and_clear_bit_le	test_and_clear_bit
 #endif
 
-#define TEST_DIRTY_RING_COUNT		1024
+#define TEST_DIRTY_RING_COUNT		65536
+
+#define SIG_IPI SIGUSR1
 
 /*
  * Guest/Host shared variables. Ensure addr_gva2hva() and/or
@@ -135,6 +140,12 @@ static uint64_t host_track_next_count;
 /* Whether dirty ring reset is requested, or finished */
 static sem_t dirty_ring_vcpu_stop;
 static sem_t dirty_ring_vcpu_cont;
+/*
+ * This is updated by the vcpu thread to tell the host whether it's a
+ * ring-full event.  It should only be read after a sem_wait() of
+ * dirty_ring_vcpu_stop and before the vcpu continues to run.
+ */
+static bool dirty_ring_vcpu_ring_full;
 
 enum log_mode_t {
 	/* Only use KVM_GET_DIRTY_LOG for logging */
@@ -156,6 +167,33 @@ enum log_mode_t {
 static enum log_mode_t host_log_mode_option = LOG_MODE_ALL;
 /* Logging mode for current run */
 static enum log_mode_t host_log_mode;
+static pthread_t vcpu_thread;
+
+/* Only way to pass this to the signal handler */
+static struct kvm_vm *current_vm;
+
+static void vcpu_sig_handler(int sig)
+{
+	TEST_ASSERT(sig == SIG_IPI, "unknown signal: %d", sig);
+}
+
+static void vcpu_kick(void)
+{
+	pthread_kill(vcpu_thread, SIG_IPI);
+}
+
+/*
+ * In our test we do signal tricks, let's use a better version of
+ * sem_wait to avoid signal interrupts
+ */
+static void sem_wait_until(sem_t *sem)
+{
+	int ret;
+
+	do
+		ret = sem_wait(sem);
+	while (ret == -1 && errno == EINTR);
+}
 
 static bool clear_log_supported(void)
 {
@@ -189,10 +227,13 @@ static void clear_log_collect_dirty_pages(struct kvm_vm *vm, int slot,
 	kvm_vm_clear_dirty_log(vm, slot, bitmap, 0, num_pages);
 }
 
-static void default_after_vcpu_run(struct kvm_vm *vm)
+static void default_after_vcpu_run(struct kvm_vm *vm, int ret, int err)
 {
 	struct kvm_run *run = vcpu_state(vm, VCPU_ID);
 
+	TEST_ASSERT(ret == 0 || (ret == -1 && err == EINTR),
+		    "vcpu run failed: errno=%d", err);
+
 	TEST_ASSERT(get_ucall(vm, VCPU_ID, NULL) == UCALL_SYNC,
 		    "Invalid guest sync status: exit_reason=%s\n",
 		    exit_reason_str(run->exit_reason));
@@ -248,27 +289,37 @@ static uint32_t dirty_ring_collect_one(struct kvm_dirty_gfn *dirty_gfns,
 	return count;
 }
 
+static void dirty_ring_wait_vcpu(void)
+{
+	/* This makes sure that hardware PML cache flushed */
+	vcpu_kick();
+	sem_wait_until(&dirty_ring_vcpu_stop);
+}
+
+static void dirty_ring_continue_vcpu(void)
+{
+	pr_info("Notifying vcpu to continue\n");
+	sem_post(&dirty_ring_vcpu_cont);
+}
+
 static void dirty_ring_collect_dirty_pages(struct kvm_vm *vm, int slot,
 					   void *bitmap, uint32_t num_pages)
 {
 	/* We only have one vcpu */
 	static uint32_t fetch_index = 0;
 	uint32_t count = 0, cleared;
+	bool continued_vcpu = false;
 
-	/*
-	 * Before fetching the dirty pages, we need a vmexit of the
-	 * worker vcpu to make sure the hardware dirty buffers were
-	 * flushed. This is not needed for dirty-log/clear-log tests
-	 * because get dirty log will natually do so.
-	 *
-	 * For now we do it in the simple way - we simply wait until
-	 * the vcpu uses up the soft dirty ring, then it'll always
-	 * do a vmexit to make sure that PML buffers will be flushed.
-	 * In real hypervisors, we probably need a vcpu kick or to
-	 * stop the vcpus (before the final sync) to make sure we'll
-	 * get all the existing dirty PFNs even cached in hardware.
-	 */
-	sem_wait(&dirty_ring_vcpu_stop);
+	dirty_ring_wait_vcpu();
+
+	if (!dirty_ring_vcpu_ring_full) {
+		/*
+		 * This is not a ring-full event, it's safe to allow
+		 * vcpu to continue
+		 */
+		dirty_ring_continue_vcpu();
+		continued_vcpu = true;
+	}
 
 	/* Only have one vcpu */
 	count = dirty_ring_collect_one(vcpu_map_dirty_ring(vm, VCPU_ID),
@@ -280,13 +331,16 @@ static void dirty_ring_collect_dirty_pages(struct kvm_vm *vm, int slot,
 	TEST_ASSERT(cleared == count, "Reset dirty pages (%u) mismatch "
 		    "with collected (%u)", cleared, count);
 
-	pr_info("Notifying vcpu to continue\n");
-	sem_post(&dirty_ring_vcpu_cont);
+	if (!continued_vcpu) {
+		TEST_ASSERT(dirty_ring_vcpu_ring_full,
+			    "Didn't continue vcpu even without ring full");
+		dirty_ring_continue_vcpu();
+	}
 
 	pr_info("Iteration %ld collected %u pages\n", iteration, count);
 }
 
-static void dirty_ring_after_vcpu_run(struct kvm_vm *vm)
+static void dirty_ring_after_vcpu_run(struct kvm_vm *vm, int ret, int err)
 {
 	struct kvm_run *run = vcpu_state(vm, VCPU_ID);
 
@@ -294,10 +348,16 @@ static void dirty_ring_after_vcpu_run(struct kvm_vm *vm)
 	if (get_ucall(vm, VCPU_ID, NULL) == UCALL_SYNC) {
 		/* We should allow this to continue */
 		;
-	} else if (run->exit_reason == KVM_EXIT_DIRTY_RING_FULL) {
+	} else if (run->exit_reason == KVM_EXIT_DIRTY_RING_FULL ||
+		   (ret == -1 && err == EINTR)) {
+		/* Update the flag first before pause */
+		WRITE_ONCE(dirty_ring_vcpu_ring_full,
+			   run->exit_reason == KVM_EXIT_DIRTY_RING_FULL);
 		sem_post(&dirty_ring_vcpu_stop);
-		pr_info("vcpu stops because dirty ring full...\n");
-		sem_wait(&dirty_ring_vcpu_cont);
+		pr_info("vcpu stops because %s...\n",
+			dirty_ring_vcpu_ring_full ?
+			"dirty ring is full" : "vcpu is kicked out");
+		sem_wait_until(&dirty_ring_vcpu_cont);
 		pr_info("vcpu continues now.\n");
 	} else {
 		TEST_ASSERT(false, "Invalid guest sync status: "
@@ -322,7 +382,7 @@ struct log_mode {
 	void (*collect_dirty_pages) (struct kvm_vm *vm, int slot,
 				     void *bitmap, uint32_t num_pages);
 	/* Hook to call when after each vcpu run */
-	void (*after_vcpu_run)(struct kvm_vm *vm);
+	void (*after_vcpu_run)(struct kvm_vm *vm, int ret, int err);
 	void (*before_vcpu_join) (void);
 } log_modes[LOG_MODE_NUM] = {
 	{
@@ -394,12 +454,12 @@ static void log_mode_collect_dirty_pages(struct kvm_vm *vm, int slot,
 	mode->collect_dirty_pages(vm, slot, bitmap, num_pages);
 }
 
-static void log_mode_after_vcpu_run(struct kvm_vm *vm)
+static void log_mode_after_vcpu_run(struct kvm_vm *vm, int ret, int err)
 {
 	struct log_mode *mode = &log_modes[host_log_mode];
 
 	if (mode->after_vcpu_run)
-		mode->after_vcpu_run(vm);
+		mode->after_vcpu_run(vm, ret, err);
 }
 
 static void log_mode_before_vcpu_join(void)
@@ -420,20 +480,27 @@ static void generate_random_array(uint64_t *guest_array, uint64_t size)
 
 static void *vcpu_worker(void *data)
 {
-	int ret;
+	int ret, vcpu_fd;
 	struct kvm_vm *vm = data;
 	uint64_t *guest_array;
 	uint64_t pages_count = 0;
+	struct sigaction sigact;
+
+	current_vm = vm;
+	vcpu_fd = vcpu_get_fd(vm, VCPU_ID);
+	memset(&sigact, 0, sizeof(sigact));
+	sigact.sa_handler = vcpu_sig_handler;
+	sigaction(SIG_IPI, &sigact, NULL);
 
 	guest_array = addr_gva2hva(vm, (vm_vaddr_t)random_array);
 
 	while (!READ_ONCE(host_quit)) {
+		/* Clear any existing kick signals */
 		generate_random_array(guest_array, TEST_PAGES_PER_LOOP);
 		pages_count += TEST_PAGES_PER_LOOP;
 		/* Let the guest dirty the random pages */
-		ret = _vcpu_run(vm, VCPU_ID);
-		TEST_ASSERT(ret == 0, "vcpu_run failed: %d\n", ret);
-		log_mode_after_vcpu_run(vm);
+		ret = ioctl(vcpu_fd, KVM_RUN, NULL);
+		log_mode_after_vcpu_run(vm, ret, errno);
 	}
 
 	pr_info("Dirtied %"PRIu64" pages\n", pages_count);
@@ -583,7 +650,6 @@ static struct kvm_vm *create_vm(enum vm_guest_mode mode, uint32_t vcpuid,
 static void run_test(enum vm_guest_mode mode, unsigned long iterations,
 		     unsigned long interval, uint64_t phys_offset)
 {
-	pthread_t vcpu_thread;
 	struct kvm_vm *vm;
 	unsigned long *bmap;
 
diff --git a/tools/testing/selftests/kvm/include/kvm_util.h b/tools/testing/selftests/kvm/include/kvm_util.h
index dce07d354a70..f3b5da383bb5 100644
--- a/tools/testing/selftests/kvm/include/kvm_util.h
+++ b/tools/testing/selftests/kvm/include/kvm_util.h
@@ -146,6 +146,7 @@ vm_paddr_t addr_gva2gpa(struct kvm_vm *vm, vm_vaddr_t gva);
 struct kvm_run *vcpu_state(struct kvm_vm *vm, uint32_t vcpuid);
 void vcpu_run(struct kvm_vm *vm, uint32_t vcpuid);
 int _vcpu_run(struct kvm_vm *vm, uint32_t vcpuid);
+int vcpu_get_fd(struct kvm_vm *vm, uint32_t vcpuid);
 void vcpu_run_complete_io(struct kvm_vm *vm, uint32_t vcpuid);
 void vcpu_set_guest_debug(struct kvm_vm *vm, uint32_t vcpuid,
 			  struct kvm_guest_debug *debug);
diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c
index 831f986674ca..4625e193074e 100644
--- a/tools/testing/selftests/kvm/lib/kvm_util.c
+++ b/tools/testing/selftests/kvm/lib/kvm_util.c
@@ -1220,6 +1220,15 @@ int _vcpu_run(struct kvm_vm *vm, uint32_t vcpuid)
 	return rc;
 }
 
+int vcpu_get_fd(struct kvm_vm *vm, uint32_t vcpuid)
+{
+	struct vcpu *vcpu = vcpu_find(vm, vcpuid);
+
+	TEST_ASSERT(vcpu != NULL, "vcpu not found, vcpuid: %u", vcpuid);
+
+	return vcpu->fd;
+}
+
 void vcpu_run_complete_io(struct kvm_vm *vm, uint32_t vcpuid)
 {
 	struct vcpu *vcpu = vcpu_find(vm, vcpuid);
-- 
2.26.2