Date: Thu, 8 Jun 2023 10:25:59 +0200
Subject: Re: [PATCH v2] riscv: mm: Pre-allocate PGD entries for vmalloc/modules area
To: Björn Töpel, Paul Walmsley, Palmer Dabbelt, Albert Ou, linux-riscv@lists.infradead.org
Cc: Björn Töpel, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux@rivosinc.com, Alexandre Ghiti, Joerg Roedel
References: <20230531093817.665799-1-bjorn@kernel.org>
From: Alexandre Ghiti
In-Reply-To: <20230531093817.665799-1-bjorn@kernel.org>

Hi Björn,

On 31/05/2023 11:38, Björn Töpel wrote:
> From: Björn Töpel
>
> The RISC-V port requires that kernel PGD entries be synchronized
> between MMs. This is done via the vmalloc_fault() function, which
> simply copies the PGD entries from init_mm to the faulting one.
>
> Historically, faulting in PGD entries has been a source of both bugs
> [1] and poor performance.
>
> One way to get rid of vmalloc faults is to pre-allocate the PGD
> entries. Pre-allocating the entries potentially wastes 64 * 4K (65 on
> SV39). The pre-allocation function is pulled from Jörg Rödel's x86
> work, with the addition of 3-level page-table support (PMD
> allocations).
>
> The pmd_alloc() function needs the ptlock cache to be initialized
> (when split page-table locks are enabled), so the pre-allocation is
> done in a RISC-V-specific pgtable_cache_init() implementation.
>
> Pre-allocate the kernel PGD entries for the vmalloc/modules area, but
> only on 64-bit platforms.
>
> Link: https://lore.kernel.org/lkml/20200508144043.13893-1-joro@8bytes.org/ # [1]
> Signed-off-by: Björn Töpel
> ---
> v1->v2: Fixed broken !MMU build.
> ---
>  arch/riscv/mm/fault.c | 16 ++----------
>  arch/riscv/mm/init.c  | 58 +++++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 60 insertions(+), 14 deletions(-)
>
> diff --git a/arch/riscv/mm/fault.c b/arch/riscv/mm/fault.c
> index 8685f85a7474..b023fb311e28 100644
> --- a/arch/riscv/mm/fault.c
> +++ b/arch/riscv/mm/fault.c
> @@ -238,24 +238,12 @@ void handle_page_fault(struct pt_regs *regs)
>  	 * only copy the information from the master page table,
>  	 * nothing more.
>  	 */
> -	if (unlikely((addr >= VMALLOC_START) && (addr < VMALLOC_END))) {
> +	if ((!IS_ENABLED(CONFIG_MMU) || !IS_ENABLED(CONFIG_64BIT)) &&
> +	    unlikely(addr >= VMALLOC_START && addr < VMALLOC_END)) {
>  		vmalloc_fault(regs, code, addr);
>  		return;
>  	}
>
> -#ifdef CONFIG_64BIT
> -	/*
> -	 * Modules in 64bit kernels lie in their own virtual region which is not
> -	 * in the vmalloc region, but dealing with page faults in this region
> -	 * or the vmalloc region amounts to doing the same thing: checking that
> -	 * the mapping exists in init_mm.pgd and updating user page table, so
> -	 * just use vmalloc_fault.
> -	 */
> -	if (unlikely(addr >= MODULES_VADDR && addr < MODULES_END)) {
> -		vmalloc_fault(regs, code, addr);
> -		return;
> -	}
> -#endif
>  	/* Enable interrupts if they were enabled in the parent context. */
>  	if (!regs_irqs_disabled(regs))
>  		local_irq_enable();
> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> index 747e5b1ef02d..45ceaff5679e 100644
> --- a/arch/riscv/mm/init.c
> +++ b/arch/riscv/mm/init.c
> @@ -1363,3 +1363,61 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
>  	return vmemmap_populate_basepages(start, end, node, NULL);
>  }
>  #endif
> +
> +#if defined(CONFIG_MMU) && defined(CONFIG_64BIT)
> +/*
> + * Pre-allocates page-table pages for a specific area in the kernel
> + * page-table. Only the level which needs to be synchronized between
> + * all page-tables is allocated because the synchronization can be
> + * expensive.
> + */
> +static void __init preallocate_pgd_pages_range(unsigned long start, unsigned long end,
> +					       const char *area)
> +{
> +	unsigned long addr;
> +	const char *lvl;
> +
> +	for (addr = start; addr < end && addr >= start; addr = ALIGN(addr + 1, PGDIR_SIZE)) {
> +		pgd_t *pgd = pgd_offset_k(addr);
> +		p4d_t *p4d;
> +		pud_t *pud;
> +		pmd_t *pmd;
> +
> +		lvl = "p4d";
> +		p4d = p4d_alloc(&init_mm, pgd, addr);
> +		if (!p4d)
> +			goto failed;
> +
> +		if (pgtable_l5_enabled)
> +			continue;
> +
> +		lvl = "pud";
> +		pud = pud_alloc(&init_mm, p4d, addr);
> +		if (!pud)
> +			goto failed;
> +
> +		if (pgtable_l4_enabled)
> +			continue;
> +
> +		lvl = "pmd";
> +		pmd = pmd_alloc(&init_mm, pud, addr);
> +		if (!pmd)
> +			goto failed;
> +	}
> +	return;
> +
> +failed:
> +	/*
> +	 * The pages have to be there now or they will be missing in
> +	 * process page-tables later.
> +	 */
> +	panic("Failed to pre-allocate %s pages for %s area\n", lvl, area);
> +}
> +
> +void __init pgtable_cache_init(void)
> +{
> +	preallocate_pgd_pages_range(VMALLOC_START, VMALLOC_END, "vmalloc");
> +	if (IS_ENABLED(CONFIG_MODULES))
> +		preallocate_pgd_pages_range(MODULES_VADDR, MODULES_END, "bpf/modules");
> +}
> +#endif
>
> base-commit: ac9a78681b921877518763ba0e89202254349d1b

You can add:

Reviewed-by: Alexandre Ghiti

Thanks!

Alex
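
As a rough, standalone cross-check of the memory-cost figure quoted in the commit message ("64 * 4K"), the sketch below is plain userspace C, not kernel code: count_pgd_entries(), the 64 GiB region and the 1 GiB PGDIR_SIZE are illustrative assumptions rather than the kernel's actual sv39/sv48 constants. It walks a region in the same PGDIR_SIZE-aligned steps as preallocate_pgd_pages_range() and counts how many top-level entries, and hence next-level table pages, would be touched.

/* Userspace back-of-the-envelope sketch only -- not kernel code. */
#include <stdio.h>

#define PAGE_SIZE   4096UL
#define PGDIR_SIZE  (1UL << 30)               /* assumed: 1 GiB per PGD entry */
#define ALIGN(x, a) (((x) + (a) - 1) & ~((a) - 1))

/* Count the PGDIR_SIZE-aligned steps the pre-allocation loop would take. */
static unsigned long count_pgd_entries(unsigned long start, unsigned long end)
{
        unsigned long addr, n = 0;

        /* Same iteration shape as preallocate_pgd_pages_range() above. */
        for (addr = start; addr < end && addr >= start; addr = ALIGN(addr + 1, PGDIR_SIZE))
                n++;

        return n;
}

int main(void)
{
        /* Illustrative, PGDIR_SIZE-aligned 64 GiB "vmalloc-like" region. */
        unsigned long start = 32UL << 30;
        unsigned long end = start + (64UL << 30);
        unsigned long n = count_pgd_entries(start, end);

        printf("%lu PGD entries -> up to %lu KiB of next-level tables\n",
               n, n * PAGE_SIZE / 1024);
        return 0;
}

With those assumed numbers it prints 64 entries, i.e. up to 256 KiB of pre-allocated next-level page-table pages, which lines up with the 64 * 4K figure in the commit message; "up to" because some of those tables may already exist before the loop runs.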