From: Yi-De Wu
To: Yingshiuan Pan, Ze-Yu Wang, Yi-De Wu, Rob Herring, Krzysztof Kozlowski, Conor Dooley, Jonathan Corbet, Catalin Marinas, Will Deacon, Matthias Brugger, AngeloGioacchino Del Regno
CC: Arnd Bergmann, David Bradil, Trilok Soni, Jade Shih, Ivan Tseng, My Chuang, Shawn Hsiao, PeiLun Suei, Liju Chen, Willix Yeh, Kevenny Hsieh
Subject: [PATCH v7 14/16] virt: geniezone: Add block-based demand paging support
Date: Thu, 16 Nov 2023 23:27:54 +0800
Message-ID: <20231116152756.4250-15-yi-de.wu@mediatek.com>
In-Reply-To: <20231116152756.4250-1-yi-de.wu@mediatek.com>
References: <20231116152756.4250-1-yi-de.wu@mediatek.com>

From: "Yingshiuan Pan"

To balance memory usage and performance, GenieZone supports demand paging at a
larger granularity, called block-based demand paging. The gzvm driver uses
enable_cap to query whether the hypervisor supports block-based demand paging
at the given granularity, and it also allocates a shared buffer for passing
the physical page numbers to the hypervisor later. If the hypervisor supports
it, then every time the gzvm driver handles a guest page fault, it allocates a
larger chunk of memory in advance (2MB by default), fills those physical pages
into the shared buffer, and calls the hypervisor to map them into the guest's
memory.

The physical pages allocated for block-based demand paging do not need to be
physically contiguous, because in many cases the memory is not populated in
2MB blocks anyway. First, memory is allocated in response to the VMM's page
faults (the VMM loads the kernel image into guest memory before the VM runs),
and those pages are allocated by the host kernel at PAGE_SIZE granularity.
Second, the guest may return memory to the host via ballooning, which also
works at 4KB (or PAGE_SIZE) granularity. Therefore, we do not have to allocate
physically contiguous 2MB blocks.
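For illustration only (this sketch is not part of the patch): assuming a 4 KiB
PAGE_SIZE, one 2MB block covers 512 guest pages, so the shared buffer needs
room for just 512 u64 page frame numbers, i.e. a single 4 KiB page whose
physical address the driver hands to the hypervisor. This mirrors the buf_size
computation added to gzvm_vm.c below:

	/* sketch: sizing of the shared PFN buffer, assuming 4 KiB pages */
	#define GZVM_BLOCK_BASED_DEMAND_PAGE_SIZE	(2 * 1024 * 1024)	/* 2MB */
	/* entries = 2MB / 4KB = 512; buffer = 512 * sizeof(u64) = 4096 bytes */
	u32 buf_size = GZVM_BLOCK_BASED_DEMAND_PAGE_SIZE / PAGE_SIZE * sizeof(u64);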
Signed-off-by: Yingshiuan Pan
Signed-off-by: Yi-De Wu
Signed-off-by: Liju Chen
---
 arch/arm64/geniezone/gzvm_arch_common.h |  2 +
 arch/arm64/geniezone/vm.c               | 19 ++++++++--
 drivers/virt/geniezone/gzvm_mmu.c       | 44 +++++++++++++++++++++-
 drivers/virt/geniezone/gzvm_vm.c        | 49 +++++++++++++++++++++++++
 include/linux/gzvm_drv.h                | 16 ++++++++
 include/uapi/linux/gzvm.h               |  2 +
 6 files changed, 128 insertions(+), 4 deletions(-)

diff --git a/arch/arm64/geniezone/gzvm_arch_common.h b/arch/arm64/geniezone/gzvm_arch_common.h
index 11e2c8c4e618..6ab941f946a1 100644
--- a/arch/arm64/geniezone/gzvm_arch_common.h
+++ b/arch/arm64/geniezone/gzvm_arch_common.h
@@ -25,6 +25,7 @@ enum {
 	GZVM_FUNC_MEMREGION_PURPOSE = 15,
 	GZVM_FUNC_SET_DTB_CONFIG = 16,
 	GZVM_FUNC_MAP_GUEST = 17,
+	GZVM_FUNC_MAP_GUEST_BLOCK = 18,
 	NR_GZVM_FUNC,
 };
 
@@ -50,6 +51,7 @@ enum {
 #define MT_HVC_GZVM_MEMREGION_PURPOSE	GZVM_HCALL_ID(GZVM_FUNC_MEMREGION_PURPOSE)
 #define MT_HVC_GZVM_SET_DTB_CONFIG	GZVM_HCALL_ID(GZVM_FUNC_SET_DTB_CONFIG)
 #define MT_HVC_GZVM_MAP_GUEST		GZVM_HCALL_ID(GZVM_FUNC_MAP_GUEST)
+#define MT_HVC_GZVM_MAP_GUEST_BLOCK	GZVM_HCALL_ID(GZVM_FUNC_MAP_GUEST_BLOCK)
 
 #define GIC_V3_NR_LRS			16
 
diff --git a/arch/arm64/geniezone/vm.c b/arch/arm64/geniezone/vm.c
index efb61a01a789..73c173236580 100644
--- a/arch/arm64/geniezone/vm.c
+++ b/arch/arm64/geniezone/vm.c
@@ -318,10 +318,11 @@ static int gzvm_vm_ioctl_cap_pvm(struct gzvm *gzvm,
 		fallthrough;
 	case GZVM_CAP_PVM_SET_PROTECTED_VM:
 		/*
-		 * To improve performance for protected VM, we have to populate VM's memory
-		 * before VM booting
+		 * If the hypervisor doesn't support block-based demand paging, we
+		 * populate memory in advance to improve performance for protected VM.
 		 */
-		populate_mem_region(gzvm);
+		if (gzvm->demand_page_gran == PAGE_SIZE)
+			populate_mem_region(gzvm);
 		ret = gzvm_vm_arch_enable_cap(gzvm, cap, &res);
 		return ret;
 	case GZVM_CAP_PVM_GET_PVMFW_SIZE:
@@ -338,12 +339,16 @@ int gzvm_vm_ioctl_arch_enable_cap(struct gzvm *gzvm,
 				  struct gzvm_enable_cap *cap,
 				  void __user *argp)
 {
+	struct arm_smccc_res res = {0};
 	int ret;
 
 	switch (cap->cap) {
 	case GZVM_CAP_PROTECTED_VM:
 		ret = gzvm_vm_ioctl_cap_pvm(gzvm, cap, argp);
 		return ret;
+	case GZVM_CAP_BLOCK_BASED_DEMAND_PAGING:
+		ret = gzvm_vm_arch_enable_cap(gzvm, cap, &res);
+		return ret;
 	default:
 		break;
 	}
@@ -384,3 +389,11 @@ int gzvm_arch_map_guest(u16 vm_id, int memslot_id, u64 pfn, u64 gfn,
 	return gzvm_hypcall_wrapper(MT_HVC_GZVM_MAP_GUEST, vm_id, memslot_id,
 				    pfn, gfn, nr_pages, 0, 0, &res);
 }
+
+int gzvm_arch_map_guest_block(u16 vm_id, int memslot_id, u64 gfn, u64 nr_pages)
+{
+	struct arm_smccc_res res;
+
+	return gzvm_hypcall_wrapper(MT_HVC_GZVM_MAP_GUEST_BLOCK, vm_id,
+				    memslot_id, gfn, nr_pages, 0, 0, 0, &res);
+}
diff --git a/drivers/virt/geniezone/gzvm_mmu.c b/drivers/virt/geniezone/gzvm_mmu.c
index 6e2da38a80fd..77104df4afe5 100644
--- a/drivers/virt/geniezone/gzvm_mmu.c
+++ b/drivers/virt/geniezone/gzvm_mmu.c
@@ -107,6 +107,45 @@ int gzvm_gfn_to_pfn_memslot(struct gzvm_memslot *memslot, u64 gfn,
 	return 0;
 }
 
+static int handle_block_demand_page(struct gzvm *vm, int memslot_id, u64 gfn)
+{
+	u64 pfn, __gfn;
+	int ret, i;
+
+	u32 nr_entries = GZVM_BLOCK_BASED_DEMAND_PAGE_SIZE / PAGE_SIZE;
+	struct gzvm_memslot *memslot = &vm->memslot[memslot_id];
+	u64 start_gfn = ALIGN_DOWN(gfn, nr_entries);
+	u32 total_pages = memslot->npages;
+	u64 base_gfn = memslot->base_gfn;
+
+	/* if the demand region is less than a block, adjust the nr_entries */
+	if (start_gfn + nr_entries > base_gfn + total_pages)
+		nr_entries = base_gfn + total_pages - start_gfn;
+
+	mutex_lock(&vm->demand_paging_lock);
+	for (i = 0, __gfn = start_gfn; i < nr_entries; i++, __gfn++) {
+		ret = gzvm_gfn_to_pfn_memslot(&vm->memslot[memslot_id], __gfn,
+					      &pfn);
+		if (unlikely(ret)) {
+			ret = -ERR_FAULT;
+			goto err_unlock;
+		}
+		vm->demand_page_buffer[i] = pfn;
+	}
+
+	ret = gzvm_arch_map_guest_block(vm->vm_id, memslot_id, start_gfn,
+					nr_entries);
+	if (unlikely(ret)) {
+		ret = -EFAULT;
+		goto err_unlock;
+	}
+
+err_unlock:
+	mutex_unlock(&vm->demand_paging_lock);
+
+	return ret;
+}
+
 static int handle_single_demand_page(struct gzvm *vm, int memslot_id, u64 gfn)
 {
 	int ret;
@@ -143,5 +182,8 @@ int gzvm_handle_page_fault(struct gzvm_vcpu *vcpu)
 	if (unlikely(memslot_id < 0))
 		return -EFAULT;
 
-	return handle_single_demand_page(vm, memslot_id, gfn);
+	if (vm->demand_page_gran == PAGE_SIZE)
+		return handle_single_demand_page(vm, memslot_id, gfn);
+	else
+		return handle_block_demand_page(vm, memslot_id, gfn);
 }
diff --git a/drivers/virt/geniezone/gzvm_vm.c b/drivers/virt/geniezone/gzvm_vm.c
index 9f7e44521de5..485d1e2097aa 100644
--- a/drivers/virt/geniezone/gzvm_vm.c
+++ b/drivers/virt/geniezone/gzvm_vm.c
@@ -294,6 +294,8 @@ static long gzvm_vm_ioctl(struct file *filp, unsigned int ioctl,
 
 static void gzvm_destroy_vm(struct gzvm *gzvm)
 {
+	size_t allocated_size;
+
 	pr_debug("VM-%u is going to be destroyed\n", gzvm->vm_id);
 
 	mutex_lock(&gzvm->lock);
@@ -306,6 +308,11 @@ static void gzvm_destroy_vm(struct gzvm *gzvm)
 	list_del(&gzvm->vm_list);
 	mutex_unlock(&gzvm_list_lock);
 
+	if (gzvm->demand_page_buffer) {
+		allocated_size = GZVM_BLOCK_BASED_DEMAND_PAGE_SIZE / PAGE_SIZE * sizeof(u64);
+		free_pages_exact(gzvm->demand_page_buffer, allocated_size);
+	}
+
 	mutex_unlock(&gzvm->lock);
 
 	kfree(gzvm);
@@ -325,6 +332,46 @@ static const struct file_operations gzvm_vm_fops = {
 	.llseek = noop_llseek,
 };
 
+/**
+ * setup_vm_demand_paging - Query hypervisor suitable demand page size and set
+ * @vm: gzvm instance for setting up demand page size
+ *
+ * Return: void
+ */
+static void setup_vm_demand_paging(struct gzvm *vm)
+{
+	u32 buf_size = GZVM_BLOCK_BASED_DEMAND_PAGE_SIZE / PAGE_SIZE * sizeof(u64);
+	struct gzvm_enable_cap cap = {0};
+	void *buffer;
+	int ret;
+
+	mutex_init(&vm->demand_paging_lock);
+	buffer = alloc_pages_exact(buf_size, GFP_KERNEL);
+	if (!buffer) {
+		/* Fall back to use default page size for demand paging */
+		vm->demand_page_gran = PAGE_SIZE;
+		vm->demand_page_buffer = NULL;
+		return;
+	}
+
+	cap.cap = GZVM_CAP_BLOCK_BASED_DEMAND_PAGING;
+	cap.args[0] = GZVM_BLOCK_BASED_DEMAND_PAGE_SIZE;
+	cap.args[1] = (__u64)virt_to_phys(buffer);
+	/* demand_page_buffer is freed when destroy VM */
+	vm->demand_page_buffer = buffer;
+
+	ret = gzvm_vm_ioctl_enable_cap(vm, &cap, NULL);
+	if (ret == 0) {
+		vm->demand_page_gran = GZVM_BLOCK_BASED_DEMAND_PAGE_SIZE;
+		/* freed when destroy vm */
+		vm->demand_page_buffer = buffer;
+	} else {
+		vm->demand_page_gran = PAGE_SIZE;
+		vm->demand_page_buffer = NULL;
+		free_pages_exact(buffer, buf_size);
+	}
+}
+
 static struct gzvm *gzvm_create_vm(unsigned long vm_type)
 {
 	int ret;
@@ -358,6 +405,8 @@ static struct gzvm *gzvm_create_vm(unsigned long vm_type)
 		return ERR_PTR(ret);
 	}
 
+	setup_vm_demand_paging(gzvm);
+
 	mutex_lock(&gzvm_list_lock);
 	list_add(&gzvm->vm_list, &gzvm_list);
 	mutex_unlock(&gzvm_list_lock);
diff --git a/include/linux/gzvm_drv.h b/include/linux/gzvm_drv.h
index b9e60fe5dcde..0a8b21e1ed1a 100644
--- a/include/linux/gzvm_drv.h
+++ b/include/linux/gzvm_drv.h
@@ -44,6 +44,8 @@
 
 #define GZVM_VCPU_RUN_MAP_SIZE		(PAGE_SIZE * 2)
 
+#define GZVM_BLOCK_BASED_DEMAND_PAGE_SIZE	(2 * 1024 * 1024) /* 2MB */
+
 /* struct mem_region_addr_range - Identical to ffa memory constituent */
 struct mem_region_addr_range {
 	/* the base IPA of the constituent memory region, aligned to 4 kiB */
@@ -106,6 +108,19 @@ struct gzvm {
 	struct srcu_struct irq_srcu;
 	/* lock for irq injection */
 	struct mutex irq_lock;
+
+	/*
+	 * demand page granularity: how much memory we allocate for VM in a
+	 * single page fault
+	 */
+	u32 demand_page_gran;
+	/* the mailbox for transferring large portion pages */
+	u64 *demand_page_buffer;
+	/*
+	 * lock for preventing multiple cpu using the same demand page mailbox
+	 * at the same time
+	 */
+	struct mutex demand_paging_lock;
 };
 
 long gzvm_dev_ioctl_check_extension(struct gzvm *gzvm, unsigned long args);
@@ -126,6 +141,7 @@ int gzvm_arch_create_vm(unsigned long vm_type);
 int gzvm_arch_destroy_vm(u16 vm_id);
 int gzvm_arch_map_guest(u16 vm_id, int memslot_id, u64 pfn, u64 gfn,
 			u64 nr_pages);
+int gzvm_arch_map_guest_block(u16 vm_id, int memslot_id, u64 gfn, u64 nr_pages);
 int gzvm_vm_ioctl_arch_enable_cap(struct gzvm *gzvm,
 				  struct gzvm_enable_cap *cap,
 				  void __user *argp);
diff --git a/include/uapi/linux/gzvm.h b/include/uapi/linux/gzvm.h
index 1f134c55ac2a..f858b04bf7b9 100644
--- a/include/uapi/linux/gzvm.h
+++ b/include/uapi/linux/gzvm.h
@@ -18,6 +18,8 @@
 
 #define GZVM_CAP_VM_GPA_SIZE	0xa5
 #define GZVM_CAP_PROTECTED_VM	0xffbadab1
+/* query hypervisor supported block-based demand page */
+#define GZVM_CAP_BLOCK_BASED_DEMAND_PAGING	0x9201
 
 /* sub-commands put in args[0] for GZVM_CAP_PROTECTED_VM */
 #define GZVM_CAP_PVM_SET_PVMFW_GPA		0
-- 
2.18.0
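As a worked example of the clamping in handle_block_demand_page() above
(numbers chosen purely for illustration, assuming a 4 KiB PAGE_SIZE so
nr_entries starts at 512): for a fault at gfn 0x102c0 in a memslot with
base_gfn 0x10000 and npages 0x300, start_gfn = ALIGN_DOWN(0x102c0, 512) =
0x10200. Since 0x10200 + 512 = 0x10400 would run past the slot end at 0x10300,
nr_entries is clamped to 0x10300 - 0x10200 = 0x100, i.e. only 256 PFNs are
filled into the shared buffer and mapped for that fault.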