Received: by 2002:a05:7412:31a9:b0:e2:908c:2ebd with SMTP id et41csp3795834rdb; Thu, 14 Sep 2023 02:58:54 -0700 (PDT) X-Google-Smtp-Source: AGHT+IFX3UJnF2i3oTLicpD2/IRpL6XvkqEBwl1PF2ZbkTTArX1nyYO3c6hez4L0UStsyP7wW3B5 X-Received: by 2002:a05:6358:921c:b0:13a:4855:d8f0 with SMTP id d28-20020a056358921c00b0013a4855d8f0mr5784507rwb.5.1694685534090; Thu, 14 Sep 2023 02:58:54 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1694685534; cv=none; d=google.com; s=arc-20160816; b=tdJIUx8IDBo6hOMdRI3342AP2PITJOKXMofGSdYF0oQ/fDx6DqcwFvikocu5FhBxU2 etlfoL/6am6HttmIKVUPdEMqoGItx2YOfXCoU7Lj/pbXpfmcK2jxgpRPPfUkVuau+4B8 xYWkFp6ylF+PG0/vAxlyr/E5CzVcGk2lTC3L9n5pjChpN/AVPQa4P3MCufLFsnK3WCKC e9SuzLtwRS80hnNgIHJnw5z/uq9QaP9cdHpkiO7RpGLpnw27gpdcM2rKkOujtls7Vu// syZDKOv00Sj5kecnin407LIzwNah1WjJ3DJIjFLX3/wBqtGfCpRS/rCCibAO9cxgqgRL LeTA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:from:subject:message-id:references :mime-version:in-reply-to:date:reply-to:dkim-signature; bh=OxMDE800j4aiJ8tPuJinoyPTmv5S2+AxjnM5iapuS4E=; fh=61hsfVoef5Tbbo+Rm06/Hxsz4fAtyORDF8Po5ZVRZDI=; b=wF28VhdWMDFXVoJ1pJhlaudeMCClQAMlBvgI4NRHvcI1EJumwgvrPy3yeSI/j24zmE nRg4Qe2S4bX19HoI4dk4P24HXWvXSNzqAmp/dxs0wN2/q/OKaC+jNDttLj4QJbSHcRMD lNm6AcOep0P+SEySD0PSeB1mM8FuIYIGbTef8tUOjwAlgUlPYHeHIogAT8xvPxLtD9XN gVo9TufyuuLg7orLDU4a1NEoNohY01DzWvvm4eOENdm8S5fNoNueIUugJAAfn3AN3wvf JiOxFSK8fMVOjrCsbrUtbbyKE3xNa8ySqvkfqZhoccIdWoot9hs4bp3pgZZ0d7aU9Otx JhaQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20230601 header.b=FucLK6jg; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.33 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from lipwig.vger.email (lipwig.vger.email. [23.128.96.33]) by mx.google.com with ESMTPS id b25-20020a6567d9000000b00565d8d51c20si1202899pgs.217.2023.09.14.02.58.53 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 14 Sep 2023 02:58:54 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.33 as permitted sender) client-ip=23.128.96.33; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20230601 header.b=FucLK6jg; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.33 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: from out1.vger.email (depot.vger.email [IPv6:2620:137:e000::3:0]) by lipwig.vger.email (Postfix) with ESMTP id 6ED0081BB082; Wed, 13 Sep 2023 19:01:02 -0700 (PDT) X-Virus-Status: Clean X-Virus-Scanned: clamav-milter 0.103.10 at lipwig.vger.email Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233956AbjINCAh (ORCPT + 99 others); Wed, 13 Sep 2023 22:00:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39372 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233984AbjINB7o (ORCPT ); Wed, 13 Sep 2023 21:59:44 -0400 Received: from mail-pl1-x64a.google.com (mail-pl1-x64a.google.com [IPv6:2607:f8b0:4864:20::64a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3A1643A95 for ; Wed, 13 Sep 2023 18:56:24 -0700 (PDT) Received: by mail-pl1-x64a.google.com with SMTP id d9443c01a7336-1c093862623so4349105ad.1 for ; Wed, 13 Sep 2023 18:56:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1694656584; x=1695261384; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:from:to:cc:subject:date:message-id:reply-to; bh=OxMDE800j4aiJ8tPuJinoyPTmv5S2+AxjnM5iapuS4E=; b=FucLK6jgThnbThqsRQE/A82cSy07H0MKukXuQwYUqhMUFFr3G8UzGD19mx4JChbjok r7Tmq9UT4AtW9IYWyqZxv0q7vOZaQYn4o153d9QQ11h8UxRM3+bPnYiSD9MvDEiEdli/ ICtSTjQWe4jiBIbhAZ6yJzhaUL3KPRlWPxy97+8cT+3kKkn6vzudDpNuEJFqD/BhD5Hq 88klEM0+EF7Od0556pB2azdKsQPy0laRLuQbNf4IC0evPLn19tVPY0eCrr1/Reii2fNe LVsp6eIYvn74bmVopCMYaM4/mnKMKXubDYqVH0XK6hS1+kXCaRoQ3oHFxnm/jouJW7WK Y68g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1694656584; x=1695261384; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:reply-to:x-gm-message-state:from:to:cc:subject:date:message-id :reply-to; bh=OxMDE800j4aiJ8tPuJinoyPTmv5S2+AxjnM5iapuS4E=; b=On29KpmRVEkiI/eF2F13Z2mNpCYtPNor1l4bdiWBAulF9RtrHmTpkUcgRPL8QWYTu6 q52vNUqdQ4OuatHh+yHtu04yJw1jDsaBhEnTXahqagYN/vrj6dS8sIKyZPN5sgtaInlY rhmIvgFJTzi6RnhYzOw7pSqAKSBDx8f4N0FS3jkPgFNxxoB5R+1vkwPEkN+5iso7Fh9W q23jciqvvo+id9NH/Q0TW4ARTjE/6Ab5uZsaXBLYJYr/2a4NtJz8IvfYLgTCOHmzUQPB FegZGRq3AtlGRt4JshLsb8q1oK/pvQoge7+3Y0lrSdKBVcpfG9YAgNWG8LkvtuXvk/lw q0pA== X-Gm-Message-State: AOJu0YyWoGtMxgv2xWHbjou1U9oO8/34o7CAtPEb0BioIuq0y7wRyhoF cKYU8ukOQt8nmN1GnrHy9HvG1dvRyw4= X-Received: from zagreus.c.googlers.com ([fda3:e722:ac3:cc00:7f:e700:c0a8:5c37]) (user=seanjc job=sendgmr) by 2002:a17:902:d50b:b0:1c3:29c4:c4e8 with SMTP id b11-20020a170902d50b00b001c329c4c4e8mr224068plg.4.1694656583660; Wed, 13 Sep 2023 18:56:23 -0700 (PDT) Reply-To: Sean Christopherson Date: Wed, 13 Sep 2023 18:55:22 -0700 In-Reply-To: <20230914015531.1419405-1-seanjc@google.com> Mime-Version: 1.0 References: <20230914015531.1419405-1-seanjc@google.com> X-Mailer: git-send-email 2.42.0.283.g2d96d420d3-goog Message-ID: <20230914015531.1419405-25-seanjc@google.com> Subject: [RFC PATCH v12 24/33] KVM: selftests: Add support for creating private memslots From: Sean Christopherson To: Paolo Bonzini , Marc Zyngier , Oliver Upton , Huacai Chen , Michael Ellerman , Anup Patel , Paul Walmsley , Palmer Dabbelt , Albert Ou , Sean Christopherson , "Matthew Wilcox (Oracle)" , Andrew Morton , Paul Moore , James Morris , "Serge E. Hallyn" Cc: kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, linux-mips@vger.kernel.org, linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org, linux-riscv@lists.infradead.org, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-security-module@vger.kernel.org, linux-kernel@vger.kernel.org, Chao Peng , Fuad Tabba , Jarkko Sakkinen , Anish Moorthy , Yu Zhang , Isaku Yamahata , Xu Yilun , Vlastimil Babka , Vishal Annapurve , Ackerley Tng , Maciej Szmigiero , David Hildenbrand , Quentin Perret , Michael Roth , Wang , Liam Merwick , Isaku Yamahata , "Kirill A . Shutemov" Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org X-Greylist: Sender passed SPF test, not delayed by milter-greylist-4.6.4 (lipwig.vger.email [0.0.0.0]); Wed, 13 Sep 2023 19:01:02 -0700 (PDT) X-Spam-Status: No, score=-8.4 required=5.0 tests=DKIMWL_WL_MED,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_IN_DEF_DKIM_WL autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lipwig.vger.email Add support for creating "private" memslots via KVM_CREATE_GUEST_MEMFD and KVM_SET_USER_MEMORY_REGION2. Make vm_userspace_mem_region_add() a wrapper to its effective replacement, vm_mem_add(), so that private memslots are fully opt-in, i.e. don't require update all tests that add memory regions. Pivot on the KVM_MEM_PRIVATE flag instead of the validity of the "gmem" file descriptor so that simple tests can let vm_mem_add() do the heavy lifting of creating the guest memfd, but also allow the caller to pass in an explicit fd+offset so that fancier tests can do things like back multiple memslots with a single file. If the caller passes in a fd, dup() the fd so that (a) __vm_mem_region_delete() can close the fd associated with the memory region without needing yet another flag, and (b) so that the caller can safely close its copy of the fd without having to first destroy memslots. Co-developed-by: Ackerley Tng Signed-off-by: Ackerley Tng Signed-off-by: Sean Christopherson --- .../selftests/kvm/include/kvm_util_base.h | 23 +++++ .../testing/selftests/kvm/include/test_util.h | 5 ++ tools/testing/selftests/kvm/lib/kvm_util.c | 85 ++++++++++++------- 3 files changed, 82 insertions(+), 31 deletions(-) diff --git a/tools/testing/selftests/kvm/include/kvm_util_base.h b/tools/testing/selftests/kvm/include/kvm_util_base.h index 9f144841c2ee..47ea25f9dc97 100644 --- a/tools/testing/selftests/kvm/include/kvm_util_base.h +++ b/tools/testing/selftests/kvm/include/kvm_util_base.h @@ -431,6 +431,26 @@ static inline uint64_t vm_get_stat(struct kvm_vm *vm, const char *stat_name) void vm_create_irqchip(struct kvm_vm *vm); +static inline int __vm_create_guest_memfd(struct kvm_vm *vm, uint64_t size, + uint64_t flags) +{ + struct kvm_create_guest_memfd gmem = { + .size = size, + .flags = flags, + }; + + return __vm_ioctl(vm, KVM_CREATE_GUEST_MEMFD, &gmem); +} + +static inline int vm_create_guest_memfd(struct kvm_vm *vm, uint64_t size, + uint64_t flags) +{ + int fd = __vm_create_guest_memfd(vm, size, flags); + + TEST_ASSERT(fd >= 0, KVM_IOCTL_ERROR(KVM_CREATE_GUEST_MEMFD, fd)); + return fd; +} + void vm_set_user_memory_region(struct kvm_vm *vm, uint32_t slot, uint32_t flags, uint64_t gpa, uint64_t size, void *hva); int __vm_set_user_memory_region(struct kvm_vm *vm, uint32_t slot, uint32_t flags, @@ -439,6 +459,9 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm, enum vm_mem_backing_src_type src_type, uint64_t guest_paddr, uint32_t slot, uint64_t npages, uint32_t flags); +void vm_mem_add(struct kvm_vm *vm, enum vm_mem_backing_src_type src_type, + uint64_t guest_paddr, uint32_t slot, uint64_t npages, + uint32_t flags, int gmem_fd, uint64_t gmem_offset); void vm_mem_region_set_flags(struct kvm_vm *vm, uint32_t slot, uint32_t flags); void vm_mem_region_move(struct kvm_vm *vm, uint32_t slot, uint64_t new_gpa); diff --git a/tools/testing/selftests/kvm/include/test_util.h b/tools/testing/selftests/kvm/include/test_util.h index 7e614adc6cf4..7257f2243ab9 100644 --- a/tools/testing/selftests/kvm/include/test_util.h +++ b/tools/testing/selftests/kvm/include/test_util.h @@ -142,6 +142,11 @@ static inline bool backing_src_is_shared(enum vm_mem_backing_src_type t) return vm_mem_backing_src_alias(t)->flag & MAP_SHARED; } +static inline bool backing_src_can_be_huge(enum vm_mem_backing_src_type t) +{ + return t != VM_MEM_SRC_ANONYMOUS && t != VM_MEM_SRC_SHMEM; +} + /* Aligns x up to the next multiple of size. Size must be a power of 2. */ static inline uint64_t align_up(uint64_t x, uint64_t size) { diff --git a/tools/testing/selftests/kvm/lib/kvm_util.c b/tools/testing/selftests/kvm/lib/kvm_util.c index 3676b37bea38..127f44c6c83c 100644 --- a/tools/testing/selftests/kvm/lib/kvm_util.c +++ b/tools/testing/selftests/kvm/lib/kvm_util.c @@ -669,6 +669,8 @@ static void __vm_mem_region_delete(struct kvm_vm *vm, TEST_ASSERT(!ret, __KVM_SYSCALL_ERROR("munmap()", ret)); close(region->fd); } + if (region->region.gmem_fd >= 0) + close(region->region.gmem_fd); free(region); } @@ -870,36 +872,15 @@ void vm_set_user_memory_region(struct kvm_vm *vm, uint32_t slot, uint32_t flags, errno, strerror(errno)); } -/* - * VM Userspace Memory Region Add - * - * Input Args: - * vm - Virtual Machine - * src_type - Storage source for this region. - * NULL to use anonymous memory. - * guest_paddr - Starting guest physical address - * slot - KVM region slot - * npages - Number of physical pages - * flags - KVM memory region flags (e.g. KVM_MEM_LOG_DIRTY_PAGES) - * - * Output Args: None - * - * Return: None - * - * Allocates a memory area of the number of pages specified by npages - * and maps it to the VM specified by vm, at a starting physical address - * given by guest_paddr. The region is created with a KVM region slot - * given by slot, which must be unique and < KVM_MEM_SLOTS_NUM. The - * region is created with the flags given by flags. - */ -void vm_userspace_mem_region_add(struct kvm_vm *vm, - enum vm_mem_backing_src_type src_type, - uint64_t guest_paddr, uint32_t slot, uint64_t npages, - uint32_t flags) +/* FIXME: This thing needs to be ripped apart and rewritten. */ +void vm_mem_add(struct kvm_vm *vm, enum vm_mem_backing_src_type src_type, + uint64_t guest_paddr, uint32_t slot, uint64_t npages, + uint32_t flags, int gmem_fd, uint64_t gmem_offset) { int ret; struct userspace_mem_region *region; size_t backing_src_pagesz = get_backing_src_pagesz(src_type); + size_t mem_size = npages * vm->page_size; size_t alignment; TEST_ASSERT(vm_adjust_num_guest_pages(vm->mode, npages) == npages, @@ -952,7 +933,7 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm, /* Allocate and initialize new mem region structure. */ region = calloc(1, sizeof(*region)); TEST_ASSERT(region != NULL, "Insufficient Memory"); - region->mmap_size = npages * vm->page_size; + region->mmap_size = mem_size; #ifdef __s390x__ /* On s390x, the host address must be aligned to 1M (due to PGSTEs) */ @@ -999,14 +980,47 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm, /* As needed perform madvise */ if ((src_type == VM_MEM_SRC_ANONYMOUS || src_type == VM_MEM_SRC_ANONYMOUS_THP) && thp_configured()) { - ret = madvise(region->host_mem, npages * vm->page_size, + ret = madvise(region->host_mem, mem_size, src_type == VM_MEM_SRC_ANONYMOUS ? MADV_NOHUGEPAGE : MADV_HUGEPAGE); TEST_ASSERT(ret == 0, "madvise failed, addr: %p length: 0x%lx src_type: %s", - region->host_mem, npages * vm->page_size, + region->host_mem, mem_size, vm_mem_backing_src_alias(src_type)->name); } region->backing_src_type = src_type; + + if (flags & KVM_MEM_PRIVATE) { + if (gmem_fd < 0) { + uint32_t gmem_flags = 0; + + /* + * Allow hugepages for the guest memfd backing if the + * "normal" backing is allowed/required to be huge. + */ + if (src_type != VM_MEM_SRC_ANONYMOUS && + src_type != VM_MEM_SRC_SHMEM) + gmem_flags |= KVM_GUEST_MEMFD_ALLOW_HUGEPAGE; + + TEST_ASSERT(!gmem_offset, + "Offset must be zero when creating new guest_memfd"); + gmem_fd = vm_create_guest_memfd(vm, mem_size, gmem_flags); + } else { + /* + * Install a unique fd for each memslot so that the fd + * can be closed when the region is deleted without + * needing to track if the fd is owned by the framework + * or by the caller. + */ + gmem_fd = dup(gmem_fd); + TEST_ASSERT(gmem_fd >= 0, __KVM_SYSCALL_ERROR("dup()", gmem_fd)); + } + + region->region.gmem_fd = gmem_fd; + region->region.gmem_offset = gmem_offset; + } else { + region->region.gmem_fd = -1; + } + region->unused_phy_pages = sparsebit_alloc(); sparsebit_set_num(region->unused_phy_pages, guest_paddr >> vm->page_shift, npages); @@ -1019,9 +1033,10 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm, TEST_ASSERT(ret == 0, "KVM_SET_USER_MEMORY_REGION2 IOCTL failed,\n" " rc: %i errno: %i\n" " slot: %u flags: 0x%x\n" - " guest_phys_addr: 0x%lx size: 0x%lx", + " guest_phys_addr: 0x%lx size: 0x%lx guest_memfd: %d\n", ret, errno, slot, flags, - guest_paddr, (uint64_t) region->region.memory_size); + guest_paddr, (uint64_t) region->region.memory_size, + region->region.gmem_fd); /* Add to quick lookup data structures */ vm_userspace_mem_region_gpa_insert(&vm->regions.gpa_tree, region); @@ -1042,6 +1057,14 @@ void vm_userspace_mem_region_add(struct kvm_vm *vm, } } +void vm_userspace_mem_region_add(struct kvm_vm *vm, + enum vm_mem_backing_src_type src_type, + uint64_t guest_paddr, uint32_t slot, + uint64_t npages, uint32_t flags) +{ + vm_mem_add(vm, src_type, guest_paddr, slot, npages, flags, -1, 0); +} + /* * Memslot to region * -- 2.42.0.283.g2d96d420d3-goog