From: Alexandre Ghiti
Date: Tue, 8 Aug 2023 12:16:50 +0200
Subject: Re: [PATCH 1/1] riscv: Implement arch_sync_kernel_mappings() for "preventive" TLB flush
To: Dylan Jhong, paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu, ajones@ventanamicro.com, alexghiti@rivosinc.com, anup@brainfault.org, rppt@kernel.org, samuel@sholland.org, panqinglin2020@iscas.ac.cn, sergey.matyukevich@syntacore.com, maz@kernel.org, linux-riscv@lists.infradead.org, conor.dooley@microchip.com, linux-kernel@vger.kernel.org
Cc: ycliang@andestech.com, x5710999x@gmail.com, tim609@andestech.com
Message-ID: <5681817e-2751-0166-b823-df03aebedf9f@ghiti.fr>
In-Reply-To: <20230807082305.198784-2-dylan@andestech.com>
References: <20230807082305.198784-1-dylan@andestech.com> <20230807082305.198784-2-dylan@andestech.com>

Hi Dylan,

On 07/08/2023 10:23, Dylan Jhong wrote:
> Since RISC-V is a microarchitecture that allows caching invalid entries in the TLB,
> it is necessary to issue a "preventive" SFENCE.VMA to ensure that each core obtains
> the correct kernel mapping.
>
> The patch implements TLB flushing in arch_sync_kernel_mappings(), ensuring that kernel
> page table mappings created via vmap/vmalloc() are updated before switching MM.
>
> Signed-off-by: Dylan Jhong
> ---
>  arch/riscv/include/asm/page.h |  2 ++
>  arch/riscv/mm/tlbflush.c      | 12 ++++++++++++
>  2 files changed, 14 insertions(+)
>
> diff --git a/arch/riscv/include/asm/page.h b/arch/riscv/include/asm/page.h
> index b55ba20903ec..6c86ab69687e 100644
> --- a/arch/riscv/include/asm/page.h
> +++ b/arch/riscv/include/asm/page.h
> @@ -21,6 +21,8 @@
>  #define HPAGE_MASK (~(HPAGE_SIZE - 1))
>  #define HUGETLB_PAGE_ORDER (HPAGE_SHIFT - PAGE_SHIFT)
>
> +#define ARCH_PAGE_TABLE_SYNC_MASK PGTBL_PTE_MODIFIED
> +
>  /*
>   * PAGE_OFFSET -- the first address of the first page of memory.
>   * When not using MMU this corresponds to the first free page in
> diff --git a/arch/riscv/mm/tlbflush.c b/arch/riscv/mm/tlbflush.c
> index 77be59aadc73..d63364948c85 100644
> --- a/arch/riscv/mm/tlbflush.c
> +++ b/arch/riscv/mm/tlbflush.c
> @@ -149,3 +149,15 @@ void flush_pmd_tlb_range(struct vm_area_struct *vma, unsigned long start,
>  	__flush_tlb_range(vma->vm_mm, start, end - start, PMD_SIZE);
>  }
>  #endif
> +
> +/*
> + * Since RISC-V is a microarchitecture that allows caching invalid entries in the TLB,
> + * it is necessary to issue a "preventive" SFENCE.VMA to ensure that each core obtains
> + * the correct kernel mapping. arch_sync_kernel_mappings() will ensure that kernel
> + * page table mappings created via vmap/vmalloc() are updated before switching MM.
> + */
> +void arch_sync_kernel_mappings(unsigned long start, unsigned long end)
> +{
> +	if (start < VMALLOC_END && end > VMALLOC_START)

This test is too restrictive: it should catch the range [MODULES_VADDR; MODULES_END[ too, sorry I did not notice that at first (see the rough sketch at the end of this mail).

> +		flush_tlb_all();
> +}
> \ No newline at end of file

I have to admit that I *think* both your patch and mine are wrong: one of the problems that led to the removal of vmalloc_fault() is the possibility for tracing functions to allocate vmalloc regions in the vmalloc page fault path itself, which can give rise to nested exceptions (see https://lore.kernel.org/lkml/20200508144043.13893-1-joro@8bytes.org/). Here, every time we allocate a vmalloc region, we send an IPI. If a vmalloc allocation happens in this path (because it is traced, for example), it will give rise to another IPI... and so on.

So I came to the conclusion that the only way to actually fix this issue is to resolve the vmalloc faults very early in the page fault path (by emitting a sfence.vma on uarchs that cache invalid entries), before the kernel stack is even accessed. That's the best solution since it would completely remove all the preventive sfence.vma in flush_cache_vmap()/arch_sync_kernel_mappings(); we would rely on faulting instead, which I assume should not happen often (?). A very rough sketch of that idea is also at the end of this mail.

I'm implementing this solution, but I'm pretty sure it won't be ready for 6.5. In the meantime, we need either your patch or mine to fix your issue...
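
For reference, a minimal sketch of what I mean by the broadened check, untested and only to show the shape: VMALLOC_*/MODULES_* come from asm/pgtable.h, and on rv32 modules are allocated from the vmalloc area anyway, so the first test should already cover them there:

void arch_sync_kernel_mappings(unsigned long start, unsigned long end)
{
	/* vmalloc/vmap mappings */
	if (start < VMALLOC_END && end > VMALLOC_START) {
		flush_tlb_all();
		return;
	}

#ifdef CONFIG_64BIT
	/* module mappings sit outside the vmalloc area on rv64 */
	if (start < MODULES_END && end > MODULES_VADDR)
		flush_tlb_all();
#endif
}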
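
And to make the second idea a bit more concrete, the C-level shape of the early vmalloc fault fixup would be roughly the following. This is only an illustration (the helper name is made up, and the real thing has to run before the kernel stack is touched, which pushes it into the trap entry assembly), not the implementation I'm working on:

/* Hypothetical helper, called as early as possible on a kernel fault. */
static bool fixup_vmalloc_fault(unsigned long addr)
{
	if (addr < VMALLOC_START || addr >= VMALLOC_END)
		return false;

	/*
	 * The mapping already exists in the kernel page table, the hart
	 * may simply have cached the old invalid entry, so a local
	 * sfence.vma on this address is enough to pick up the new pte.
	 */
	local_flush_tlb_page(addr);
	return true;
}

If it returns true, we would simply retry the faulting access instead of going through the regular fault handling.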