From: Xu Lu
Date: Thu, 7 Dec 2023 14:07:15 +0800
Subject: Re: [RFC PATCH V1 00/11] riscv: Introduce 64K base page
To: paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu,
    ardb@kernel.org, anup@brainfault.org, atishp@atishpatra.org
Cc: dengliang.1214@bytedance.com, xieyongji@bytedance.com,
    lihangjing@bytedance.com, songmuchun@bytedance.com,
    punit.agrawal@bytedance.com, linux-kernel@vger.kernel.org,
    linux-riscv@lists.infradead.org
References: <20231123065708.91345-1-luxu.kernel@bytedance.com>
In-Reply-To: <20231123065708.91345-1-luxu.kernel@bytedance.com>

A gentle ping.

On Thu, Nov 23, 2023 at 2:57 PM Xu Lu wrote:
>
> Some existing architectures, such as ARM, support base pages larger
> than 4K because their MMUs support more page sizes. Thus, besides
> hugetlb pages and transparent huge pages, these architectures have
> another way to enjoy the benefits of fewer TLB misses without worrying
> about the cost of splitting and merging huge pages. However, on
> architectures whose MMU supports only 4K pages, a larger base page is
> currently unavailable.
>
> This patch series attempts to break through this MMU limitation and
> support a larger base page on RISC-V, which currently supports only a
> 4K page size.
>
> The key idea for implementing a larger base page on top of a 4K MMU is
> to decouple the MMU page from the base page as seen by kernel mm, which
> we call the software page. In contrast to the software page, we call
> the MMU page the hardware page. The differences between these two kinds
> of pages are as follows.
>
> 1. The kernel memory management module manages, allocates and maps
>    memory at the granularity of the software page, which should not be
>    restricted by the MMU and can be larger than the hardware page.
>
> 2. Architecture page table operations should be carried out from the
>    MMU's perspective, and page table entries are encoded at the
>    granularity of the hardware page, which is 4K on the RISC-V MMU now.
>
> The main work of decoupling these two kinds of pages lies in
> architecture code. For example, we turn the pte_t struct into an array
> of page table entries so that it matches the software page, which can
> be larger than the hardware page, and adapt the page table operations
> accordingly. For a 64K software base page, the pte_t struct now
> contains 16 contiguous page table entries which point to 16 contiguous
> 4K hardware pages.
>
> To achieve the benefits of a large base page, we apply Svnapot to each
> base page's mapping. The Svnapot extension on RISC-V is like contiguous
> PTE on ARM64. It allows the ptes of a naturally aligned power-of-2
> sized memory range to be encoded in the same format to save TLB space.
>
> This patch series is the first version and is based on v6.7-rc1. This
> version supports both bare metal and virtualization scenarios.
>
> In the next versions, we will continue with the following work:
>
> 1. Reduce the memory usage of page table pages, as each one uses only
>    4K of space but occupies a whole base page.
>
> 2. When the IMSIC interrupt file is smaller than 64K, extra isolation
>    measures for the interrupt file are needed. (S)PMP and IOPMP may be
>    good choices.
>
> 3. More consideration is needed to make this patch series work better
>    with folios.
>
> 4. Support 64K base page on IOMMU.
>
> 5. Performance testing is planned to verify the actual performance
>    improvement and the decrease in TLB miss rate.
>
> Thanks in advance for comments.
>
> Xu Lu (11):
>   mm: Fix misused APIs on huge pte
>   riscv: Introduce concept of hardware base page
>   riscv: Adapt pte struct to gap between hw page and sw page
>   riscv: Adapt pte operations to gap between hw page and sw page
>   riscv: Decouple pmd operations and pte operations
>   riscv: Distinguish pmd huge pte and napot huge pte
>   riscv: Adapt satp operations to gap between hw page and sw page
>   riscv: Apply Svnapot for base page mapping
>   riscv: Adjust fix_btmap slots number to match variable page size
>   riscv: kvm: Adapt kvm to gap between hw page and sw page
>   riscv: Introduce 64K page size
>
>  arch/Kconfig                        |   1 +
>  arch/riscv/Kconfig                  |  28 +++
>  arch/riscv/include/asm/fixmap.h     |   3 +-
>  arch/riscv/include/asm/hugetlb.h    |  71 ++++++-
>  arch/riscv/include/asm/page.h       |  16 +-
>  arch/riscv/include/asm/pgalloc.h    |  21 ++-
>  arch/riscv/include/asm/pgtable-32.h |   2 +-
>  arch/riscv/include/asm/pgtable-64.h |  45 +++--
>  arch/riscv/include/asm/pgtable.h    | 282 +++++++++++++++++++++++-----
>  arch/riscv/kernel/efi.c             |   2 +-
>  arch/riscv/kernel/head.S            |   4 +-
>  arch/riscv/kernel/hibernate.c       |   3 +-
>  arch/riscv/kvm/mmu.c                | 198 +++++++++++++------
>  arch/riscv/mm/context.c             |   7 +-
>  arch/riscv/mm/fault.c               |   1 +
>  arch/riscv/mm/hugetlbpage.c         |  42 +++--
>  arch/riscv/mm/init.c                |  25 +--
>  arch/riscv/mm/kasan_init.c          |   7 +-
>  arch/riscv/mm/pageattr.c            |   2 +-
>  fs/proc/task_mmu.c                  |   2 +-
>  include/asm-generic/hugetlb.h       |   7 +
>  include/asm-generic/pgtable-nopmd.h |   1 +
>  include/linux/pgtable.h             |   6 +
>  mm/hugetlb.c                        |   2 +-
>  mm/migrate.c                        |   5 +-
>  mm/mprotect.c                       |   2 +-
>  mm/rmap.c                           |  10 +-
>  mm/vmalloc.c                        |   3 +-
>  28 files changed, 616 insertions(+), 182 deletions(-)
>
> --
> 2.20.1
>
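
For readers of the thread, the pte_t widening and the Svnapot encoding
described in the cover letter can be sketched in plain userspace C. This
is only an illustration under stated assumptions, not code from the
series: struct sw_pte, napot_encode() and sw_pte_make() are hypothetical
names, and the bit positions (V/R/W in the low bits, PPN starting at bit
10, N at bit 63, low four PPN bits set to 1000 for a 64K range) are
taken from the RISC-V privileged specification and the Svnapot
extension, not from the patches.

```c
#include <assert.h>
#include <stdint.h>

#define HW_PAGE_SHIFT   12   /* 4K hardware (MMU) page */
#define SW_PAGE_SHIFT   16   /* 64K software base page */
#define NAPOT_ORDER     (SW_PAGE_SHIFT - HW_PAGE_SHIFT)  /* log2(16) = 4 */
#define HW_PTES_PER_SW  (1u << NAPOT_ORDER)              /* 16 entries */

/* RISC-V PTE bits used by this sketch. */
#define PTE_V           (1ULL << 0)
#define PTE_R           (1ULL << 1)
#define PTE_W           (1ULL << 2)
#define PTE_PPN_SHIFT   10
#define PTE_NAPOT       (1ULL << 63)   /* Svnapot N bit */

static_assert(HW_PTES_PER_SW == 16, "a 64K page spans sixteen 4K pages");

/* Hypothetical stand-in for a pte_t widened to cover one software page. */
struct sw_pte {
	uint64_t hw[HW_PTES_PER_SW];
};

/*
 * Encode one hardware PTE as part of a 64K NAPOT range: the low
 * NAPOT_ORDER bits of the PPN are replaced by the pattern 1000b and the
 * N bit is set, so every entry in the range gets the same value.
 */
static inline uint64_t napot_encode(uint64_t hw_pfn, uint64_t prot)
{
	uint64_t low_mask = (1ULL << NAPOT_ORDER) - 1;
	uint64_t ppn = (hw_pfn & ~low_mask) | (1ULL << (NAPOT_ORDER - 1));

	return (ppn << PTE_PPN_SHIFT) | prot | PTE_NAPOT;
}

/* Fill all 16 hardware entries that back one 64K software page. */
static inline struct sw_pte sw_pte_make(uint64_t sw_pfn, uint64_t prot)
{
	struct sw_pte pte;
	uint64_t first_hw_pfn = sw_pfn << NAPOT_ORDER;

	for (unsigned int i = 0; i < HW_PTES_PER_SW; i++)
		pte.hw[i] = napot_encode(first_hw_pfn + i, prot);

	return pte;
}
```

Calling sw_pte_make(pfn, PTE_V | PTE_R | PTE_W) yields 16 identical
entries, which is what lets the TLB cover the whole 64K range with a
single entry on hardware that implements Svnapot.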