From: Anup Patel
Date: Tue, 16 Mar 2021 13:02:24 +0530
Subject: Re: [PATCH] Insert SFENCE.VMA in function set_pte_at for RISCV
To: Jiuyang Liu
Cc: Alexandre Ghiti, Andrew Waterman, Paul Walmsley, Palmer Dabbelt,
    Albert Ou, Atish Patra, Anup Patel, Andrew Morton, Mike Rapoport,
    Kefeng Wang, Zong Li, Greentime Hu, linux-riscv,
    "linux-kernel@vger.kernel.org List"
References: <20210316015328.13516-1-liu@jiuyang.me> <20210316034638.16276-1-liu@jiuyang.me>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Mar 16, 2021 at 12:27 PM Jiuyang Liu wrote:
>
> > As per my understanding, we don't need to explicitly invalidate local TLB
> > in set_pte() or set_pte_at() because generic Linux page table management
> > (/mm/*) will call the appropriate flush_tlb_xyz() function after page
> > table updates.
>
> I witnessed this bug in our micro-architecture: set_pte instruction is
> still in the store buffer, no functions are inserting SFENCE.VMA in
> the stack below, so TLB cannot witness this modification.
> Here is my call stack:
> set_pte
> set_pte_at
> map_vm_area
> __vmalloc_area_node
> __vmalloc_node_range
> __vmalloc_node
> __vmalloc_node_flags
> vzalloc
> n_tty_open
>
> I think this is an architecture specific code, so /mm/* should
> not be modified.
> And spec requires SFENCE.VMA to be inserted on each modification to
> TLB. So I added code here.

The generic linux/mm/* already calls the appropriate flush_tlb_xyz()
function defined in arch/riscv/include/asm/tlbflush.h.

It would be better to have a write barrier in set_pte().

>
> > Also, just local TLB flush is generally not sufficient because
> > a lot of page tables will be used across multiple HARTs.
>
> Yes, this is the biggest issue, in RISC-V Volume 2, Privileged Spec v.
> 20190608 page 67 gave a solution:

This is not an issue with the RISC-V privileged spec; rather, it is about
placing RISC-V fences at the right locations.

> Consequently, other harts must be notified separately when the
> memory-management data structures have been modified. One approach is
> to use
> 1) a local data fence to ensure local writes are visible globally,
> then 2) an interprocessor interrupt to the other thread,
> then 3) a local SFENCE.VMA in the interrupt handler of the remote thread,
> and finally 4) signal back to originating thread that operation is
> complete. This is, of course, the RISC-V analog to a TLB shootdown.

I would suggest trying approach #1 (the local data fence).

You can include "asm/barrier.h" here and use wmb() or __smp_wmb()
in place of the local TLB flush.

>
> In general, this patch didn't handle the G bit in PTE, kernel trap it
> to sbi_remote_sfence_vma. do you think I should use flush_tlb_all?
>
> Jiuyang
>
>
> arch/arm/mm/mmu.c
> void set_pte_at(struct mm_struct *mm, unsigned long addr,
>                 pte_t *ptep, pte_t pteval)
> {
>         unsigned long ext = 0;
>
>         if (addr < TASK_SIZE && pte_valid_user(pteval)) {
>                 if (!pte_special(pteval))
>                         __sync_icache_dcache(pteval);
>                 ext |= PTE_EXT_NG;
>         }
>
>         set_pte_ext(ptep, pteval, ext);
> }
>
> arch/mips/include/asm/pgtable.h
> static inline void set_pte_at(struct mm_struct *mm, unsigned long addr,
>                               pte_t *ptep, pte_t pteval)
> {
>
>         if (!pte_present(pteval))
>                 goto cache_sync_done;
>
>         if (pte_present(*ptep) && (pte_pfn(*ptep) == pte_pfn(pteval)))
>                 goto cache_sync_done;
>
>         __update_cache(addr, pteval);
> cache_sync_done:
>         set_pte(ptep, pteval);
> }
>
> > Also, just local TLB flush is generally not sufficient because
> > a lot of page tables will be used across multiple HARTs.
>
>
> On Tue, Mar 16, 2021 at 5:05 AM Anup Patel wrote:
> >
> > +Alex
> >
> > On Tue, Mar 16, 2021 at 9:20 AM Jiuyang Liu wrote:
> > >
> > > This patch inserts SFENCE.VMA after modifying PTE based on RISC-V
> > > specification.
> > >
> > > arch/riscv/include/asm/pgtable.h:
> > > 1. implement pte_user, pte_global and pte_leaf to check the
> > > corresponding attribute of a pte_t.
> >
> > Adding pte_user(), pte_global(), and pte_leaf() is fine.
> >
> > >
> > > 2. insert SFENCE.VMA in set_pte_at based on RISC-V Volume 2, Privileged
> > > Spec v. 20190608 page 66 and 67:
> > > If software modifies a non-leaf PTE, it should execute SFENCE.VMA with
> > > rs1=x0. If any PTE along the traversal path had its G bit set, rs2 must
> > > be x0; otherwise, rs2 should be set to the ASID for which the
> > > translation is being modified.
> > > If software modifies a leaf PTE, it should execute SFENCE.VMA with rs1
> > > set to a virtual address within the page. If any PTE along the traversal
> > > path had its G bit set, rs2 must be x0; otherwise, rs2 should be set to
> > > the ASID for which the translation is being modified.
> > >
> > > arch/riscv/include/asm/tlbflush.h:
> > > 1. implement get_current_asid to get current program asid.
> > > 2. implement local_flush_tlb_asid to flush tlb with asid.
> >
> > As per my understanding, we don't need to explicitly invalidate local TLB
> > in set_pte() or set_pte_at() because generic Linux page table management
> > (/mm/*) will call the appropriate flush_tlb_xyz() function after page
> > table updates. Also, just local TLB flush is generally not sufficient because
> > a lot of page tables will be used across multiple HARTs.
> >
> > >
> > > Signed-off-by: Jiuyang Liu
> > > ---
> > >  arch/riscv/include/asm/pgtable.h  | 27 +++++++++++++++++++++++++++
> > >  arch/riscv/include/asm/tlbflush.h | 12 ++++++++++++
> > >  2 files changed, 39 insertions(+)
> > >
> > > diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
> > > index ebf817c1bdf4..5a47c60372c1 100644
> > > --- a/arch/riscv/include/asm/pgtable.h
> > > +++ b/arch/riscv/include/asm/pgtable.h
> > > @@ -222,6 +222,16 @@ static inline int pte_write(pte_t pte)
> > >          return pte_val(pte) & _PAGE_WRITE;
> > >  }
> > >
> > > +static inline int pte_user(pte_t pte)
> > > +{
> > > +        return pte_val(pte) & _PAGE_USER;
> > > +}
> > > +
> > > +static inline int pte_global(pte_t pte)
> > > +{
> > > +        return pte_val(pte) & _PAGE_GLOBAL;
> > > +}
> > > +
> > >  static inline int pte_exec(pte_t pte)
> > >  {
> > >          return pte_val(pte) & _PAGE_EXEC;
> > > @@ -248,6 +258,11 @@ static inline int pte_special(pte_t pte)
> > >          return pte_val(pte) & _PAGE_SPECIAL;
> > >  }
> > >
> > > +static inline int pte_leaf(pte_t pte)
> > > +{
> > > +        return pte_val(pte) & (_PAGE_READ | _PAGE_WRITE | _PAGE_EXEC);
> > > +}
> > > +
> > >  /* static inline pte_t pte_rdprotect(pte_t pte) */
> > >
> > >  static inline pte_t pte_wrprotect(pte_t pte)
> > > @@ -358,6 +373,18 @@ static inline void set_pte_at(struct mm_struct *mm,
> > >                  flush_icache_pte(pteval);
> > >
> > >          set_pte(ptep, pteval);
> > > +
> > > +        if (pte_present(pteval)) {
> > > +                if (pte_leaf(pteval)) {
> > > +                        local_flush_tlb_page(addr);
> > > +                } else {
> > > +                        if (pte_global(pteval))
> > > +                                local_flush_tlb_all();
> > > +                        else
> > > +                                local_flush_tlb_asid();
> > > +
> > > +                }
> > > +        }
> > >  }
> > >
> > >  static inline void pte_clear(struct mm_struct *mm,
> > > diff --git a/arch/riscv/include/asm/tlbflush.h b/arch/riscv/include/asm/tlbflush.h
> > > index 394cfbccdcd9..1f9b62b3670b 100644
> > > --- a/arch/riscv/include/asm/tlbflush.h
> > > +++ b/arch/riscv/include/asm/tlbflush.h
> > > @@ -21,6 +21,18 @@ static inline void local_flush_tlb_page(unsigned long addr)
> > >  {
> > >          __asm__ __volatile__ ("sfence.vma %0" : : "r" (addr) : "memory");
> > >  }
> > > +
> > > +static inline unsigned long get_current_asid(void)
> > > +{
> > > +        return (csr_read(CSR_SATP) >> SATP_ASID_SHIFT) & SATP_ASID_MASK;
> > > +}
> > > +
> > > +static inline void local_flush_tlb_asid(void)
> > > +{
> > > +        unsigned long asid = get_current_asid();
> > > +        __asm__ __volatile__ ("sfence.vma x0, %0" : : "r" (asid) : "memory");
> > > +}
> > > +
> > >  #else /* CONFIG_MMU */
> > >  #define local_flush_tlb_all() do { } while (0)
> > >  #define local_flush_tlb_page(addr) do { } while (0)
> > > --
> > > 2.30.2
> > >
> > >
> > > _______________________________________________
> > > linux-riscv mailing list
> > > linux-riscv@lists.infradead.org
> > > http://lists.infradead.org/mailman/listinfo/linux-riscv
> >
> > Regards,
> > Anup

Regards,
Anup
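
For readers following the thread, the SFENCE.VMA operand rule quoted from the privileged spec in the commit message can be sketched in plain user-space C. This is only an illustrative model, not kernel code and not part of the patch: the names pte_is_leaf(), satp_asid(), choose_sfence() and struct sfence_operands are hypothetical, and the bit positions assume the RV64 Sv39/Sv48 PTE and satp layouts. It models operand selection only and does not address the remote-hart concern raised above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* PTE flag bits, assuming the RV64 Sv39/Sv48 PTE layout. */
#define PTE_V UINT64_C(0x01) /* valid */
#define PTE_R UINT64_C(0x02) /* readable */
#define PTE_W UINT64_C(0x04) /* writable */
#define PTE_X UINT64_C(0x08) /* executable */
#define PTE_G UINT64_C(0x20) /* global mapping */

/* Assumed RV64 satp layout: MODE[63:60] | ASID[59:44] | PPN[43:0]. */
#define SATP_ASID_SHIFT 44
#define SATP_ASID_MASK  UINT64_C(0xFFFF)

/* Hypothetical description of which SFENCE.VMA operands to supply. */
struct sfence_operands {
    bool use_vaddr; /* true: rs1 = virtual address, false: rs1 = x0 */
    bool use_asid;  /* true: rs2 = ASID, false: rs2 = x0 */
};

/* A valid PTE is a leaf if any of R, W or X is set. */
static inline bool pte_is_leaf(uint64_t pte)
{
    return (pte & (PTE_R | PTE_W | PTE_X)) != 0;
}

/* Extract the ASID field from a satp value (user-space model of the CSR). */
static inline uint64_t satp_asid(uint64_t satp)
{
    return (satp >> SATP_ASID_SHIFT) & SATP_ASID_MASK;
}

/*
 * Pick operands following the quoted rule:
 *  - leaf PTE     -> rs1 = address within the page
 *  - non-leaf PTE -> rs1 = x0
 *  - G bit set on this PTE or anywhere on the walk -> rs2 = x0
 *  - otherwise    -> rs2 = ASID being modified
 */
static inline struct sfence_operands choose_sfence(uint64_t pte,
                                                   bool global_on_path)
{
    struct sfence_operands op;

    assert((pte & PTE_V) != 0); /* the rule is only modelled for valid PTEs */
    op.use_vaddr = pte_is_leaf(pte);
    op.use_asid = !(global_on_path || (pte & PTE_G) != 0);
    return op;
}
```

As a usage example, choose_sfence(PTE_V | PTE_R | PTE_W, false) selects an address in rs1 and the ASID in rs2, while a non-leaf PTE with the G bit set selects x0 for both operands, which corresponds to a full local flush.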