2021-09-23 07:30:18

by Guo Ren

[permalink] [raw]
Subject: [PATCH] riscv: Add RISC-V svpbmt extension

From: Guo Ren <[email protected]>

This patch follows the standard pure RISC-V Svpbmt extension in
privilege spec to solve the non-coherent SOC dma synchronization
issues.

Here is the svpbmt PTE format:
| 63 | 62-61 | 60-8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
N MT RSW D A G U X W R V
^

Of the Reserved bits [63:54] in a leaf PTE, the high bit is already
allocated (as the N bit), so bits [62:61] are used as the MT (aka
MemType) field. This field specifies one of three memory types that
are close equivalents (or equivalent in effect) to the three main x86
and ARMv8 memory types - as shown in the following table.

RISC-V
Encoding &
MemType RISC-V Description
---------- ------------------------------------------------
00 - PMA Normal Cacheable, No change to implied PMA memory type
01 - NC Non-cacheable, idempotent, weakly-ordered Main Memory
10 - IO Non-cacheable, non-idempotent, strongly-ordered I/O memory
11 - Rsvd Reserved for future standard use

The standard protection_map [] needn't be modified because the "PMA"
type keeps the highest bits zero. The standard protection_map[] needn't
be modified. So the whole modification is in the arch/riscv pgtable and
using a global variable (__riscv_svpbmt) in the _PAGE_DMA_MASK/IO/NC
for pgprot_noncached (&writecombine). We also need _PAGE_CHG_MASK to
detect PFN than before.

Devicetree: reuse "mmu-type" of cpu_section as a user interface to
enable the feature or not:
- riscv,sv39,svpbmt
- riscv,sv48,svpbmt

Signed-off-by: Guo Ren <[email protected]>
Cc: Liu Shaohua <[email protected]>
Cc: Wei Fu <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Anup Patel <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Atish Patra <[email protected]>
Cc: Drew Fustini <[email protected]>
Cc: Wei Fu <[email protected]>
Cc: Wei Wu <[email protected]>
Cc: Chen-Yu Tsai <[email protected]>
Cc: Maxime Ripard <[email protected]>
Cc: Daniel Lustig <[email protected]>
Cc: Greg Favor <[email protected]>
Cc: Andrea Mondelli <[email protected]>
Cc: Jonathan Behrens <[email protected]>
Cc: Xinhaoqu (Freddie) <[email protected]>
Cc: Bill Huffman <[email protected]>
Cc: Nick Kossifidis <[email protected]>
Cc: Allen Baum <[email protected]>
Cc: Josh Scheid <[email protected]>
Cc: Richard Trauben <[email protected]>
---
.../devicetree/bindings/riscv/cpus.yaml | 2 +
arch/riscv/include/asm/fixmap.h | 2 +-
arch/riscv/include/asm/pgtable-64.h | 8 ++--
arch/riscv/include/asm/pgtable-bits.h | 46 ++++++++++++++++++-
arch/riscv/include/asm/pgtable.h | 39 ++++++++++++----
arch/riscv/include/asm/processor.h | 1 +
arch/riscv/kernel/cpufeature.c | 23 ++++++++++
arch/riscv/kernel/setup.c | 2 +
arch/riscv/mm/init.c | 5 ++
9 files changed, 113 insertions(+), 15 deletions(-)

diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml b/Documentation/devicetree/bindings/riscv/cpus.yaml
index e534f6a7cfa1..1825cd8db0de 100644
--- a/Documentation/devicetree/bindings/riscv/cpus.yaml
+++ b/Documentation/devicetree/bindings/riscv/cpus.yaml
@@ -56,7 +56,9 @@ properties:
enum:
- riscv,sv32
- riscv,sv39
+ - riscv,sv39,svpbmt
- riscv,sv48
+ - riscv,sv48,svpbmt
- riscv,none

riscv,isa:
diff --git a/arch/riscv/include/asm/fixmap.h b/arch/riscv/include/asm/fixmap.h
index 54cbf07fb4e9..5acd99d08e74 100644
--- a/arch/riscv/include/asm/fixmap.h
+++ b/arch/riscv/include/asm/fixmap.h
@@ -43,7 +43,7 @@ enum fixed_addresses {
__end_of_fixed_addresses
};

-#define FIXMAP_PAGE_IO PAGE_KERNEL
+#define FIXMAP_PAGE_IO PAGE_IOREMAP

#define __early_set_fixmap __set_fixmap

diff --git a/arch/riscv/include/asm/pgtable-64.h b/arch/riscv/include/asm/pgtable-64.h
index 228261aa9628..0b53ea67e91a 100644
--- a/arch/riscv/include/asm/pgtable-64.h
+++ b/arch/riscv/include/asm/pgtable-64.h
@@ -61,12 +61,14 @@ static inline void pud_clear(pud_t *pudp)

static inline pmd_t *pud_pgtable(pud_t pud)
{
- return (pmd_t *)pfn_to_virt(pud_val(pud) >> _PAGE_PFN_SHIFT);
+ return (pmd_t *)pfn_to_virt((pud_val(pud) & _PAGE_CHG_MASK)
+ >> _PAGE_PFN_SHIFT);
}

static inline struct page *pud_page(pud_t pud)
{
- return pfn_to_page(pud_val(pud) >> _PAGE_PFN_SHIFT);
+ return pfn_to_page((pud_val(pud) & _PAGE_CHG_MASK)
+ >> _PAGE_PFN_SHIFT);
}

static inline pmd_t pfn_pmd(unsigned long pfn, pgprot_t prot)
@@ -76,7 +78,7 @@ static inline pmd_t pfn_pmd(unsigned long pfn, pgprot_t prot)

static inline unsigned long _pmd_pfn(pmd_t pmd)
{
- return pmd_val(pmd) >> _PAGE_PFN_SHIFT;
+ return (pmd_val(pmd) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT;
}

#define mk_pmd(page, prot) pfn_pmd(page_to_pfn(page), prot)
diff --git a/arch/riscv/include/asm/pgtable-bits.h b/arch/riscv/include/asm/pgtable-bits.h
index 2ee413912926..041fe4fdbafa 100644
--- a/arch/riscv/include/asm/pgtable-bits.h
+++ b/arch/riscv/include/asm/pgtable-bits.h
@@ -7,7 +7,7 @@
#define _ASM_RISCV_PGTABLE_BITS_H

/*
- * PTE format:
+ * rv32 PTE format:
* | XLEN-1 10 | 9 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
* PFN reserved for SW D A G U X W R V
*/
@@ -24,6 +24,47 @@
#define _PAGE_DIRTY (1 << 7) /* Set by hardware on any write */
#define _PAGE_SOFT (1 << 8) /* Reserved for software */

+#ifndef __ASSEMBLY__
+#ifdef CONFIG_64BIT
+/*
+ * rv64 PTE format:
+ * | 63 | 62 61 | 60 54 | 53 10 | 9 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
+ * N MT RSV PFN reserved for SW D A G U X W R V
+ * [62:61] Memory Type definitions:
+ * 00 - PMA Normal Cacheable, No change to implied PMA memory type
+ * 01 - NC Non-cacheable, idempotent, weakly-ordered Main Memory
+ * 10 - IO Non-cacheable, non-idempotent, strongly-ordered I/O memory
+ * 11 - Rsvd Reserved for future standard use
+ */
+#define _PAGE_MT_MASK ((u64)0x3 << 61)
+#define _PAGE_MT_PMA ((u64)0x0 << 61)
+#define _PAGE_MT_NC ((u64)0x1 << 61)
+#define _PAGE_MT_IO ((u64)0x2 << 61)
+
+enum {
+ MT_PMA,
+ MT_NC,
+ MT_IO,
+ MT_MAX
+};
+
+extern struct __riscv_svpbmt_struct {
+ unsigned long mask;
+ unsigned long mt[MT_MAX];
+} __riscv_svpbmt;
+
+#define _PAGE_DMA_MASK __riscv_svpbmt.mask
+#define _PAGE_DMA_PMA __riscv_svpbmt.mt[MT_PMA]
+#define _PAGE_DMA_NC __riscv_svpbmt.mt[MT_NC]
+#define _PAGE_DMA_IO __riscv_svpbmt.mt[MT_IO]
+#else
+#define _PAGE_DMA_MASK 0
+#define _PAGE_DMA_PMA 0
+#define _PAGE_DMA_NC 0
+#define _PAGE_DMA_IO 0
+#endif /* CONFIG_64BIT */
+#endif /* __ASSEMBLY__ */
+
#define _PAGE_SPECIAL _PAGE_SOFT
#define _PAGE_TABLE _PAGE_PRESENT

@@ -38,7 +79,8 @@
/* Set of bits to preserve across pte_modify() */
#define _PAGE_CHG_MASK (~(unsigned long)(_PAGE_PRESENT | _PAGE_READ | \
_PAGE_WRITE | _PAGE_EXEC | \
- _PAGE_USER | _PAGE_GLOBAL))
+ _PAGE_USER | _PAGE_GLOBAL | \
+ _PAGE_DMA_MASK))
/*
* when all of R/W/X are zero, the PTE is a pointer to the next level
* of the page table; otherwise, it is a leaf PTE.
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index 39b550310ec6..d07ba586c866 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -136,7 +136,8 @@
| _PAGE_PRESENT \
| _PAGE_ACCESSED \
| _PAGE_DIRTY \
- | _PAGE_GLOBAL)
+ | _PAGE_GLOBAL \
+ | _PAGE_DMA_PMA)

#define PAGE_KERNEL __pgprot(_PAGE_KERNEL)
#define PAGE_KERNEL_READ __pgprot(_PAGE_KERNEL & ~_PAGE_WRITE)
@@ -146,11 +147,9 @@

#define PAGE_TABLE __pgprot(_PAGE_TABLE)

-/*
- * The RISC-V ISA doesn't yet specify how to query or modify PMAs, so we can't
- * change the properties of memory regions.
- */
-#define _PAGE_IOREMAP _PAGE_KERNEL
+#define _PAGE_IOREMAP ((_PAGE_KERNEL & ~_PAGE_DMA_MASK) | _PAGE_DMA_IO)
+
+#define PAGE_IOREMAP __pgprot(_PAGE_IOREMAP)

extern pgd_t swapper_pg_dir[];

@@ -230,12 +229,12 @@ static inline unsigned long _pgd_pfn(pgd_t pgd)

static inline struct page *pmd_page(pmd_t pmd)
{
- return pfn_to_page(pmd_val(pmd) >> _PAGE_PFN_SHIFT);
+ return pfn_to_page((pmd_val(pmd) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT);
}

static inline unsigned long pmd_page_vaddr(pmd_t pmd)
{
- return (unsigned long)pfn_to_virt(pmd_val(pmd) >> _PAGE_PFN_SHIFT);
+ return (unsigned long)pfn_to_virt((pmd_val(pmd) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT);
}

static inline pte_t pmd_pte(pmd_t pmd)
@@ -251,7 +250,7 @@ static inline pte_t pud_pte(pud_t pud)
/* Yields the page frame number (PFN) of a page table entry */
static inline unsigned long pte_pfn(pte_t pte)
{
- return (pte_val(pte) >> _PAGE_PFN_SHIFT);
+ return ((pte_val(pte) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT);
}

#define pte_page(x) pfn_to_page(pte_pfn(x))
@@ -490,6 +489,28 @@ static inline int ptep_clear_flush_young(struct vm_area_struct *vma,
return ptep_test_and_clear_young(vma, address, ptep);
}

+#define pgprot_noncached pgprot_noncached
+static inline pgprot_t pgprot_noncached(pgprot_t _prot)
+{
+ unsigned long prot = pgprot_val(_prot);
+
+ prot &= ~_PAGE_DMA_MASK;
+ prot |= _PAGE_DMA_IO;
+
+ return __pgprot(prot);
+}
+
+#define pgprot_writecombine pgprot_writecombine
+static inline pgprot_t pgprot_writecombine(pgprot_t _prot)
+{
+ unsigned long prot = pgprot_val(_prot);
+
+ prot &= ~_PAGE_DMA_MASK;
+ prot |= _PAGE_DMA_NC;
+
+ return __pgprot(prot);
+}
+
/*
* THP functions
*/
diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h
index 021ed64ee608..92676156cbf6 100644
--- a/arch/riscv/include/asm/processor.h
+++ b/arch/riscv/include/asm/processor.h
@@ -70,6 +70,7 @@ struct device_node;
int riscv_of_processor_hartid(struct device_node *node);
int riscv_of_parent_hartid(struct device_node *node);

+extern void riscv_svpbmt(void);
extern void riscv_fill_hwcap(void);
extern int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src);

diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
index d959d207a40d..4a2211c2c464 100644
--- a/arch/riscv/kernel/cpufeature.c
+++ b/arch/riscv/kernel/cpufeature.c
@@ -8,6 +8,7 @@

#include <linux/bitmap.h>
#include <linux/of.h>
+#include <linux/pgtable.h>
#include <asm/processor.h>
#include <asm/hwcap.h>
#include <asm/smp.h>
@@ -59,6 +60,28 @@ bool __riscv_isa_extension_available(const unsigned long *isa_bitmap, int bit)
}
EXPORT_SYMBOL_GPL(__riscv_isa_extension_available);

+void __init riscv_svpbmt(void)
+{
+#ifdef CONFIG_64BIT
+ struct device_node *node;
+ const char *str;
+
+ for_each_of_cpu_node(node) {
+ if (of_property_read_string(node, "mmu-type", &str)) {
+ continue;
+ }
+
+ if (!strncmp(str + 11, "svpbmt", 6)) {
+ __riscv_svpbmt.mask = _PAGE_MT_MASK;
+ __riscv_svpbmt.mt[MT_PMA] = _PAGE_MT_PMA;
+ __riscv_svpbmt.mt[MT_NC] = _PAGE_MT_NC;
+ __riscv_svpbmt.mt[MT_IO] = _PAGE_MT_IO;
+ break;
+ }
+ }
+#endif
+}
+
void __init riscv_fill_hwcap(void)
{
struct device_node *node;
diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
index 120b2f6f71bc..e7457113b45e 100644
--- a/arch/riscv/kernel/setup.c
+++ b/arch/riscv/kernel/setup.c
@@ -294,6 +294,8 @@ void __init setup_arch(char **cmdline_p)
setup_smp();
#endif

+ riscv_svpbmt();
+
riscv_fill_hwcap();
}

diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 7cb4f391d106..43b2e11fd3e0 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -905,3 +905,8 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
return vmemmap_populate_basepages(start, end, node, NULL);
}
#endif
+
+#ifdef CONFIG_64BIT
+struct __riscv_svpbmt_struct __riscv_svpbmt __ro_after_init;
+EXPORT_SYMBOL(__riscv_svpbmt);
+#endif
--
2.25.1


2021-09-23 09:16:33

by Anup Patel

[permalink] [raw]
Subject: Re: [PATCH] riscv: Add RISC-V svpbmt extension

On Thu, Sep 23, 2021 at 12:57 PM <[email protected]> wrote:
>
> From: Guo Ren <[email protected]>
>
> This patch follows the standard pure RISC-V Svpbmt extension in
> privilege spec to solve the non-coherent SOC dma synchronization
> issues.
>
> Here is the svpbmt PTE format:
> | 63 | 62-61 | 60-8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
> N MT RSW D A G U X W R V
> ^
>
> Of the Reserved bits [63:54] in a leaf PTE, the high bit is already
> allocated (as the N bit), so bits [62:61] are used as the MT (aka
> MemType) field. This field specifies one of three memory types that
> are close equivalents (or equivalent in effect) to the three main x86
> and ARMv8 memory types - as shown in the following table.
>
> RISC-V
> Encoding &
> MemType RISC-V Description
> ---------- ------------------------------------------------
> 00 - PMA Normal Cacheable, No change to implied PMA memory type
> 01 - NC Non-cacheable, idempotent, weakly-ordered Main Memory
> 10 - IO Non-cacheable, non-idempotent, strongly-ordered I/O memory
> 11 - Rsvd Reserved for future standard use
>
> The standard protection_map [] needn't be modified because the "PMA"
> type keeps the highest bits zero. The standard protection_map[] needn't
> be modified. So the whole modification is in the arch/riscv pgtable and
> using a global variable (__riscv_svpbmt) in the _PAGE_DMA_MASK/IO/NC
> for pgprot_noncached (&writecombine). We also need _PAGE_CHG_MASK to
> detect PFN than before.
>
> Devicetree: reuse "mmu-type" of cpu_section as a user interface to
> enable the feature or not:
> - riscv,sv39,svpbmt
> - riscv,sv48,svpbmt
>
> Signed-off-by: Guo Ren <[email protected]>
> Cc: Liu Shaohua <[email protected]>
> Cc: Wei Fu <[email protected]>
> Cc: Palmer Dabbelt <[email protected]>
> Cc: Christoph Hellwig <[email protected]>
> Cc: Anup Patel <[email protected]>
> Cc: Arnd Bergmann <[email protected]>
> Cc: Atish Patra <[email protected]>
> Cc: Drew Fustini <[email protected]>
> Cc: Wei Fu <[email protected]>
> Cc: Wei Wu <[email protected]>
> Cc: Chen-Yu Tsai <[email protected]>
> Cc: Maxime Ripard <[email protected]>
> Cc: Daniel Lustig <[email protected]>
> Cc: Greg Favor <[email protected]>
> Cc: Andrea Mondelli <[email protected]>
> Cc: Jonathan Behrens <[email protected]>
> Cc: Xinhaoqu (Freddie) <[email protected]>
> Cc: Bill Huffman <[email protected]>
> Cc: Nick Kossifidis <[email protected]>
> Cc: Allen Baum <[email protected]>
> Cc: Josh Scheid <[email protected]>
> Cc: Richard Trauben <[email protected]>
> ---
> .../devicetree/bindings/riscv/cpus.yaml | 2 +
> arch/riscv/include/asm/fixmap.h | 2 +-
> arch/riscv/include/asm/pgtable-64.h | 8 ++--
> arch/riscv/include/asm/pgtable-bits.h | 46 ++++++++++++++++++-
> arch/riscv/include/asm/pgtable.h | 39 ++++++++++++----
> arch/riscv/include/asm/processor.h | 1 +
> arch/riscv/kernel/cpufeature.c | 23 ++++++++++
> arch/riscv/kernel/setup.c | 2 +
> arch/riscv/mm/init.c | 5 ++
> 9 files changed, 113 insertions(+), 15 deletions(-)
>
> diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml b/Documentation/devicetree/bindings/riscv/cpus.yaml
> index e534f6a7cfa1..1825cd8db0de 100644
> --- a/Documentation/devicetree/bindings/riscv/cpus.yaml
> +++ b/Documentation/devicetree/bindings/riscv/cpus.yaml
> @@ -56,7 +56,9 @@ properties:
> enum:
> - riscv,sv32
> - riscv,sv39
> + - riscv,sv39,svpbmt
> - riscv,sv48
> + - riscv,sv48,svpbmt
> - riscv,none

I think augmenting the "mmu-type" DT property is a good approach but the
description of "mmu-type" DT property needs to say:

"Identifies the MMU address translation mode and page based memory
type used on ..."

Also, DT bindings change is generally a separate patch so better to make
a separate patch for this.

>
> riscv,isa:
> diff --git a/arch/riscv/include/asm/fixmap.h b/arch/riscv/include/asm/fixmap.h
> index 54cbf07fb4e9..5acd99d08e74 100644
> --- a/arch/riscv/include/asm/fixmap.h
> +++ b/arch/riscv/include/asm/fixmap.h
> @@ -43,7 +43,7 @@ enum fixed_addresses {
> __end_of_fixed_addresses
> };
>
> -#define FIXMAP_PAGE_IO PAGE_KERNEL
> +#define FIXMAP_PAGE_IO PAGE_IOREMAP
>
> #define __early_set_fixmap __set_fixmap
>
> diff --git a/arch/riscv/include/asm/pgtable-64.h b/arch/riscv/include/asm/pgtable-64.h
> index 228261aa9628..0b53ea67e91a 100644
> --- a/arch/riscv/include/asm/pgtable-64.h
> +++ b/arch/riscv/include/asm/pgtable-64.h
> @@ -61,12 +61,14 @@ static inline void pud_clear(pud_t *pudp)
>
> static inline pmd_t *pud_pgtable(pud_t pud)
> {
> - return (pmd_t *)pfn_to_virt(pud_val(pud) >> _PAGE_PFN_SHIFT);
> + return (pmd_t *)pfn_to_virt((pud_val(pud) & _PAGE_CHG_MASK)
> + >> _PAGE_PFN_SHIFT);
> }
>
> static inline struct page *pud_page(pud_t pud)
> {
> - return pfn_to_page(pud_val(pud) >> _PAGE_PFN_SHIFT);
> + return pfn_to_page((pud_val(pud) & _PAGE_CHG_MASK)
> + >> _PAGE_PFN_SHIFT);
> }
>
> static inline pmd_t pfn_pmd(unsigned long pfn, pgprot_t prot)
> @@ -76,7 +78,7 @@ static inline pmd_t pfn_pmd(unsigned long pfn, pgprot_t prot)
>
> static inline unsigned long _pmd_pfn(pmd_t pmd)
> {
> - return pmd_val(pmd) >> _PAGE_PFN_SHIFT;
> + return (pmd_val(pmd) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT;
> }
>
> #define mk_pmd(page, prot) pfn_pmd(page_to_pfn(page), prot)
> diff --git a/arch/riscv/include/asm/pgtable-bits.h b/arch/riscv/include/asm/pgtable-bits.h
> index 2ee413912926..041fe4fdbafa 100644
> --- a/arch/riscv/include/asm/pgtable-bits.h
> +++ b/arch/riscv/include/asm/pgtable-bits.h
> @@ -7,7 +7,7 @@
> #define _ASM_RISCV_PGTABLE_BITS_H
>
> /*
> - * PTE format:
> + * rv32 PTE format:
> * | XLEN-1 10 | 9 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
> * PFN reserved for SW D A G U X W R V
> */
> @@ -24,6 +24,47 @@
> #define _PAGE_DIRTY (1 << 7) /* Set by hardware on any write */
> #define _PAGE_SOFT (1 << 8) /* Reserved for software */
>
> +#ifndef __ASSEMBLY__
> +#ifdef CONFIG_64BIT
> +/*
> + * rv64 PTE format:
> + * | 63 | 62 61 | 60 54 | 53 10 | 9 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
> + * N MT RSV PFN reserved for SW D A G U X W R V
> + * [62:61] Memory Type definitions:
> + * 00 - PMA Normal Cacheable, No change to implied PMA memory type
> + * 01 - NC Non-cacheable, idempotent, weakly-ordered Main Memory
> + * 10 - IO Non-cacheable, non-idempotent, strongly-ordered I/O memory
> + * 11 - Rsvd Reserved for future standard use
> + */
> +#define _PAGE_MT_MASK ((u64)0x3 << 61)
> +#define _PAGE_MT_PMA ((u64)0x0 << 61)
> +#define _PAGE_MT_NC ((u64)0x1 << 61)
> +#define _PAGE_MT_IO ((u64)0x2 << 61)
> +
> +enum {
> + MT_PMA,
> + MT_NC,
> + MT_IO,
> + MT_MAX
> +};
> +
> +extern struct __riscv_svpbmt_struct {
> + unsigned long mask;
> + unsigned long mt[MT_MAX];
> +} __riscv_svpbmt;
> +
> +#define _PAGE_DMA_MASK __riscv_svpbmt.mask
> +#define _PAGE_DMA_PMA __riscv_svpbmt.mt[MT_PMA]
> +#define _PAGE_DMA_NC __riscv_svpbmt.mt[MT_NC]
> +#define _PAGE_DMA_IO __riscv_svpbmt.mt[MT_IO]
> +#else
> +#define _PAGE_DMA_MASK 0
> +#define _PAGE_DMA_PMA 0
> +#define _PAGE_DMA_NC 0
> +#define _PAGE_DMA_IO 0
> +#endif /* CONFIG_64BIT */
> +#endif /* __ASSEMBLY__ */
> +
> #define _PAGE_SPECIAL _PAGE_SOFT
> #define _PAGE_TABLE _PAGE_PRESENT
>
> @@ -38,7 +79,8 @@
> /* Set of bits to preserve across pte_modify() */
> #define _PAGE_CHG_MASK (~(unsigned long)(_PAGE_PRESENT | _PAGE_READ | \
> _PAGE_WRITE | _PAGE_EXEC | \
> - _PAGE_USER | _PAGE_GLOBAL))
> + _PAGE_USER | _PAGE_GLOBAL | \
> + _PAGE_DMA_MASK))
> /*
> * when all of R/W/X are zero, the PTE is a pointer to the next level
> * of the page table; otherwise, it is a leaf PTE.
> diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
> index 39b550310ec6..d07ba586c866 100644
> --- a/arch/riscv/include/asm/pgtable.h
> +++ b/arch/riscv/include/asm/pgtable.h
> @@ -136,7 +136,8 @@
> | _PAGE_PRESENT \
> | _PAGE_ACCESSED \
> | _PAGE_DIRTY \
> - | _PAGE_GLOBAL)
> + | _PAGE_GLOBAL \
> + | _PAGE_DMA_PMA)
>
> #define PAGE_KERNEL __pgprot(_PAGE_KERNEL)
> #define PAGE_KERNEL_READ __pgprot(_PAGE_KERNEL & ~_PAGE_WRITE)
> @@ -146,11 +147,9 @@
>
> #define PAGE_TABLE __pgprot(_PAGE_TABLE)
>
> -/*
> - * The RISC-V ISA doesn't yet specify how to query or modify PMAs, so we can't
> - * change the properties of memory regions.
> - */
> -#define _PAGE_IOREMAP _PAGE_KERNEL
> +#define _PAGE_IOREMAP ((_PAGE_KERNEL & ~_PAGE_DMA_MASK) | _PAGE_DMA_IO)
> +
> +#define PAGE_IOREMAP __pgprot(_PAGE_IOREMAP)
>
> extern pgd_t swapper_pg_dir[];
>
> @@ -230,12 +229,12 @@ static inline unsigned long _pgd_pfn(pgd_t pgd)
>
> static inline struct page *pmd_page(pmd_t pmd)
> {
> - return pfn_to_page(pmd_val(pmd) >> _PAGE_PFN_SHIFT);
> + return pfn_to_page((pmd_val(pmd) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT);
> }
>
> static inline unsigned long pmd_page_vaddr(pmd_t pmd)
> {
> - return (unsigned long)pfn_to_virt(pmd_val(pmd) >> _PAGE_PFN_SHIFT);
> + return (unsigned long)pfn_to_virt((pmd_val(pmd) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT);
> }
>
> static inline pte_t pmd_pte(pmd_t pmd)
> @@ -251,7 +250,7 @@ static inline pte_t pud_pte(pud_t pud)
> /* Yields the page frame number (PFN) of a page table entry */
> static inline unsigned long pte_pfn(pte_t pte)
> {
> - return (pte_val(pte) >> _PAGE_PFN_SHIFT);
> + return ((pte_val(pte) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT);
> }
>
> #define pte_page(x) pfn_to_page(pte_pfn(x))
> @@ -490,6 +489,28 @@ static inline int ptep_clear_flush_young(struct vm_area_struct *vma,
> return ptep_test_and_clear_young(vma, address, ptep);
> }
>
> +#define pgprot_noncached pgprot_noncached
> +static inline pgprot_t pgprot_noncached(pgprot_t _prot)
> +{
> + unsigned long prot = pgprot_val(_prot);
> +
> + prot &= ~_PAGE_DMA_MASK;
> + prot |= _PAGE_DMA_IO;
> +
> + return __pgprot(prot);
> +}
> +
> +#define pgprot_writecombine pgprot_writecombine
> +static inline pgprot_t pgprot_writecombine(pgprot_t _prot)
> +{
> + unsigned long prot = pgprot_val(_prot);
> +
> + prot &= ~_PAGE_DMA_MASK;
> + prot |= _PAGE_DMA_NC;
> +
> + return __pgprot(prot);
> +}
> +
> /*
> * THP functions
> */
> diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h
> index 021ed64ee608..92676156cbf6 100644
> --- a/arch/riscv/include/asm/processor.h
> +++ b/arch/riscv/include/asm/processor.h
> @@ -70,6 +70,7 @@ struct device_node;
> int riscv_of_processor_hartid(struct device_node *node);
> int riscv_of_parent_hartid(struct device_node *node);
>
> +extern void riscv_svpbmt(void);

You can drop this change.

> extern void riscv_fill_hwcap(void);
> extern int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src);
>
> diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> index d959d207a40d..4a2211c2c464 100644
> --- a/arch/riscv/kernel/cpufeature.c
> +++ b/arch/riscv/kernel/cpufeature.c
> @@ -8,6 +8,7 @@
>
> #include <linux/bitmap.h>
> #include <linux/of.h>
> +#include <linux/pgtable.h>
> #include <asm/processor.h>
> #include <asm/hwcap.h>
> #include <asm/smp.h>
> @@ -59,6 +60,28 @@ bool __riscv_isa_extension_available(const unsigned long *isa_bitmap, int bit)
> }
> EXPORT_SYMBOL_GPL(__riscv_isa_extension_available);
>
> +void __init riscv_svpbmt(void)

Make this static function and call it from riscv_fill_hwcap().

> +{
> +#ifdef CONFIG_64BIT
> + struct device_node *node;
> + const char *str;
> +
> + for_each_of_cpu_node(node) {
> + if (of_property_read_string(node, "mmu-type", &str)) {
> + continue;
> + }
> +
> + if (!strncmp(str + 11, "svpbmt", 6)) {
> + __riscv_svpbmt.mask = _PAGE_MT_MASK;
> + __riscv_svpbmt.mt[MT_PMA] = _PAGE_MT_PMA;
> + __riscv_svpbmt.mt[MT_NC] = _PAGE_MT_NC;
> + __riscv_svpbmt.mt[MT_IO] = _PAGE_MT_IO;
> + break;
> + }

I don't agree with this loop.

The __riscv_svpbmt should be updated only when all CPU nodes
have "svpmbt" in the "mmu-type" DT property.

> + }
> +#endif
> +}
> +
> void __init riscv_fill_hwcap(void)
> {
> struct device_node *node;
> diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
> index 120b2f6f71bc..e7457113b45e 100644
> --- a/arch/riscv/kernel/setup.c
> +++ b/arch/riscv/kernel/setup.c
> @@ -294,6 +294,8 @@ void __init setup_arch(char **cmdline_p)
> setup_smp();
> #endif
>
> + riscv_svpbmt();
> +

You can drop this change based on above comment.

> riscv_fill_hwcap();
> }
>
> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> index 7cb4f391d106..43b2e11fd3e0 100644
> --- a/arch/riscv/mm/init.c
> +++ b/arch/riscv/mm/init.c
> @@ -905,3 +905,8 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
> return vmemmap_populate_basepages(start, end, node, NULL);
> }
> #endif
> +
> +#ifdef CONFIG_64BIT
> +struct __riscv_svpbmt_struct __riscv_svpbmt __ro_after_init;
> +EXPORT_SYMBOL(__riscv_svpbmt);
> +#endif
> --
> 2.25.1
>

Appending "svpmb" suffix to "mmu-type" breaks print_mmu()
in arch/riscv/kernel/cpu.c. Please update that as well.

We will also need couple of Tested-by for existing boards.

Regards,
Anup

2021-09-23 09:40:30

by Anup Patel

[permalink] [raw]
Subject: Re: [PATCH] riscv: Add RISC-V svpbmt extension

On Thu, Sep 23, 2021 at 2:55 PM Nick Kossifidis <[email protected]> wrote:
>
> Hello Guo,
>
> Στις 2021-09-23 10:27, [email protected] έγραψε:
> diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml
> b/Documentation/devicetree/bindings/riscv/cpus.yaml
> index e534f6a7cfa1..1825cd8db0de 100644
> --- a/Documentation/devicetree/bindings/riscv/cpus.yaml
> +++ b/Documentation/devicetree/bindings/riscv/cpus.yaml
> @@ -56,7 +56,9 @@ properties:
> enum:
> - riscv,sv32
> - riscv,sv39
> + - riscv,sv39,svpbmt
> - riscv,sv48
> + - riscv,sv48,svpbmt
> - riscv,none
>
> Isn't svpbmt orthogonal to the mmu type ? It's a functionality that can
> be present on either sv39/48/57 so why not have another "svpbmt"
> property directly on the cpu node ?

Actually, "mmu-type" would be a good place because it's page based
memory attribute and paging can't exist without mmu translation mode.

Also, "svpmbt" is indeed a CPU property so has to be feature individual
CPU node. Hypothetically, a heterogeneous system is possible where
some CPUs have "svpmbt" and some CPUs don't have "svpmbt". For
example, a future FUxxx SoC might have a E-core and few S-cores
where S-cores have Svpmbt whereas E-core does not have Svpmbt
because it's an embedded core.

Regards,
Anup

>
> > + * rv64 PTE format:
> > + * | 63 | 62 61 | 60 54 | 53 10 | 9 8 | 7 | 6 | 5 | 4 |
> > 3 | 2 | 1 | 0
> > + * N MT RSV PFN reserved for SW D A G U
> > X W R V
> > + * [62:61] Memory Type definitions:
> > + * 00 - PMA Normal Cacheable, No change to implied PMA memory type
> > + * 01 - NC Non-cacheable, idempotent, weakly-ordered Main Memory
> > + * 10 - IO Non-cacheable, non-idempotent, strongly-ordered I/O
> > memory
> > + * 11 - Rsvd Reserved for future standard use
> > + */
> > +#define _PAGE_MT_MASK ((u64)0x3 << 61)
> > +#define _PAGE_MT_PMA ((u64)0x0 << 61)
> > +#define _PAGE_MT_NC ((u64)0x1 << 61)
> > +#define _PAGE_MT_IO ((u64)0x2 << 61)
> > +
>
> It'd be cleaner IMHO if you defined _PAGE_MT_MASK as (_PAGE_MT_PMA |
> _PAGE_MT_NC | _PAGE_MT_IO), like other masks are defined (e.g.
> _PAGE_CHG_MASK on the same file). I also suggest you use unsigned long
> instead of u64 for consistency.
>
> > +enum {
> > + MT_PMA,
> > + MT_NC,
> > + MT_IO,
> > + MT_MAX
> > +};
> > +
> > +extern struct __riscv_svpbmt_struct {
> > + unsigned long mask;
> > + unsigned long mt[MT_MAX];
> > +} __riscv_svpbmt;
> > +
> > +#define _PAGE_DMA_MASK __riscv_svpbmt.mask
> > +#define _PAGE_DMA_PMA __riscv_svpbmt.mt[MT_PMA]
> > +#define _PAGE_DMA_NC __riscv_svpbmt.mt[MT_NC]
> > +#define _PAGE_DMA_IO __riscv_svpbmt.mt[MT_IO]
> > +#else
> > +#define _PAGE_DMA_MASK 0
> > +#define _PAGE_DMA_PMA 0
> > +#define _PAGE_DMA_NC 0
> > +#define _PAGE_DMA_IO 0
> > +#endif /* CONFIG_64BIT */
> > +#endif /* __ASSEMBLY__ */
> > +
> > #define _PAGE_SPECIAL _PAGE_SOFT
> > #define _PAGE_TABLE _PAGE_PRESENT
> >
>
> This struct is not useful as part of enabling the standard Svpbmt
> extension on Linux, we can set _PAGE_DMA_* macros directly on this patch
> and introduce the struct approach later on, when we also define
> alternative values for _PAGE_DMA_* flags. Also to someone reading the
> code the struct doesn't make sense without some documentation on why
> it's needed. Finally why the enum / array ? Why not just have different
> fields on the struct ?
>
> > diff --git a/arch/riscv/include/asm/pgtable.h
> > b/arch/riscv/include/asm/pgtable.h
> > index 39b550310ec6..d07ba586c866 100644
> > --- a/arch/riscv/include/asm/pgtable.h
> > +++ b/arch/riscv/include/asm/pgtable.h
> > @@ -136,7 +136,8 @@
> > | _PAGE_PRESENT \
> > | _PAGE_ACCESSED \
> > | _PAGE_DIRTY \
> > - | _PAGE_GLOBAL)
> > + | _PAGE_GLOBAL \
> > + | _PAGE_DMA_PMA)
> >
>
> That's a bit misleading, it's like marking the kernel pages as DMAable.
>
> -/*
> - * The RISC-V ISA doesn't yet specify how to query or modify PMAs, so
> we can't
> - * change the properties of memory regions.
> - */
> -#define _PAGE_IOREMAP _PAGE_KERNEL
> +#define _PAGE_IOREMAP ((_PAGE_KERNEL & ~_PAGE_DMA_MASK) |
> _PAGE_DMA_IO)
> +
> +#define PAGE_IOREMAP __pgprot(_PAGE_IOREMAP)
>
> This isn't used anywhere.
>
> @@ -490,6 +489,28 @@ static inline int ptep_clear_flush_young(struct
> vm_area_struct *vma,
> return ptep_test_and_clear_young(vma, address, ptep);
> }
>
> +#define pgprot_noncached pgprot_noncached
> +static inline pgprot_t pgprot_noncached(pgprot_t _prot)
> +{
> + unsigned long prot = pgprot_val(_prot);
> +
> + prot &= ~_PAGE_DMA_MASK;
> + prot |= _PAGE_DMA_IO;
> +
> + return __pgprot(prot);
> +}
> +
> +#define pgprot_writecombine pgprot_writecombine
> +static inline pgprot_t pgprot_writecombine(pgprot_t _prot)
> +{
> + unsigned long prot = pgprot_val(_prot);
> +
> + prot &= ~_PAGE_DMA_MASK;
> + prot |= _PAGE_DMA_NC;
> +
> + return __pgprot(prot);
> +}
> +
>
> We also have the IO type, we should also define pgprot_device to also
> ensure ordering, or else it'll fallback to pgprot_noncached, which in
> our case won't work well due to RVWMO:
> https://elixir.bootlin.com/linux/latest/source/include/linux/pgtable.h#L930
>
> +void __init riscv_svpbmt(void)
> +{
> +#ifdef CONFIG_64BIT
> + struct device_node *node;
> + const char *str;
> +
> + for_each_of_cpu_node(node) {
> + if (of_property_read_string(node, "mmu-type", &str)) {
> + continue;
> + }
> +
> + if (!strncmp(str + 11, "svpbmt", 6)) {
> + __riscv_svpbmt.mask = _PAGE_MT_MASK;
> + __riscv_svpbmt.mt[MT_PMA] = _PAGE_MT_PMA;
> + __riscv_svpbmt.mt[MT_NC] = _PAGE_MT_NC;
> + __riscv_svpbmt.mt[MT_IO] = _PAGE_MT_IO;
> + break;
> + }
> + }
> +#endif
> +}
>
> You break; here the first time you find a cpu node with svpbmt enabled,
> shouldn't we make sure that all used cpu nodes support svpbmt before
> using the extension ?
>
> Regards,
> Nick
>
> _______________________________________________
> linux-riscv mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/linux-riscv

2021-09-23 09:44:02

by Nick Kossifidis

[permalink] [raw]
Subject: Re: [PATCH] riscv: Add RISC-V svpbmt extension

Στις 2021-09-23 12:37, Anup Patel έγραψε:
> On Thu, Sep 23, 2021 at 2:55 PM Nick Kossifidis <[email protected]>
> wrote:
>>
>> Hello Guo,
>>
>> Στις 2021-09-23 10:27, [email protected] έγραψε:
>> diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml
>> b/Documentation/devicetree/bindings/riscv/cpus.yaml
>> index e534f6a7cfa1..1825cd8db0de 100644
>> --- a/Documentation/devicetree/bindings/riscv/cpus.yaml
>> +++ b/Documentation/devicetree/bindings/riscv/cpus.yaml
>> @@ -56,7 +56,9 @@ properties:
>> enum:
>> - riscv,sv32
>> - riscv,sv39
>> + - riscv,sv39,svpbmt
>> - riscv,sv48
>> + - riscv,sv48,svpbmt
>> - riscv,none
>>
>> Isn't svpbmt orthogonal to the mmu type ? It's a functionality that
>> can
>> be present on either sv39/48/57 so why not have another "svpbmt"
>> property directly on the cpu node ?
>
> Actually, "mmu-type" would be a good place because it's page based
> memory attribute and paging can't exist without mmu translation mode.
>
> Also, "svpmbt" is indeed a CPU property so has to be feature individual
> CPU node. Hypothetically, a heterogeneous system is possible where
> some CPUs have "svpmbt" and some CPUs don't have "svpmbt". For
> example, a future FUxxx SoC might have a E-core and few S-cores
> where S-cores have Svpmbt whereas E-core does not have Svpmbt
> because it's an embedded core.
>

I should say cpuX node, not the root /cpu node. We can have an svpbmt
property in the same way we have an mmu-type property.

Regards,
Nick

2021-09-23 09:50:11

by Nick Kossifidis

[permalink] [raw]
Subject: Re: [PATCH] riscv: Add RISC-V svpbmt extension

Hello Guo,

Στις 2021-09-23 10:27, [email protected] έγραψε:
diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml
b/Documentation/devicetree/bindings/riscv/cpus.yaml
index e534f6a7cfa1..1825cd8db0de 100644
--- a/Documentation/devicetree/bindings/riscv/cpus.yaml
+++ b/Documentation/devicetree/bindings/riscv/cpus.yaml
@@ -56,7 +56,9 @@ properties:
enum:
- riscv,sv32
- riscv,sv39
+ - riscv,sv39,svpbmt
- riscv,sv48
+ - riscv,sv48,svpbmt
- riscv,none

Isn't svpbmt orthogonal to the mmu type ? It's a functionality that can
be present on either sv39/48/57 so why not have another "svpbmt"
property directly on the cpu node ?

> + * rv64 PTE format:
> + * | 63 | 62 61 | 60 54 | 53 10 | 9 8 | 7 | 6 | 5 | 4 |
> 3 | 2 | 1 | 0
> + * N MT RSV PFN reserved for SW D A G U
> X W R V
> + * [62:61] Memory Type definitions:
> + * 00 - PMA Normal Cacheable, No change to implied PMA memory type
> + * 01 - NC Non-cacheable, idempotent, weakly-ordered Main Memory
> + * 10 - IO Non-cacheable, non-idempotent, strongly-ordered I/O
> memory
> + * 11 - Rsvd Reserved for future standard use
> + */
> +#define _PAGE_MT_MASK ((u64)0x3 << 61)
> +#define _PAGE_MT_PMA ((u64)0x0 << 61)
> +#define _PAGE_MT_NC ((u64)0x1 << 61)
> +#define _PAGE_MT_IO ((u64)0x2 << 61)
> +

It'd be cleaner IMHO if you defined _PAGE_MT_MASK as (_PAGE_MT_PMA |
_PAGE_MT_NC | _PAGE_MT_IO), like other masks are defined (e.g.
_PAGE_CHG_MASK on the same file). I also suggest you use unsigned long
instead of u64 for consistency.

> +enum {
> + MT_PMA,
> + MT_NC,
> + MT_IO,
> + MT_MAX
> +};
> +
> +extern struct __riscv_svpbmt_struct {
> + unsigned long mask;
> + unsigned long mt[MT_MAX];
> +} __riscv_svpbmt;
> +
> +#define _PAGE_DMA_MASK __riscv_svpbmt.mask
> +#define _PAGE_DMA_PMA __riscv_svpbmt.mt[MT_PMA]
> +#define _PAGE_DMA_NC __riscv_svpbmt.mt[MT_NC]
> +#define _PAGE_DMA_IO __riscv_svpbmt.mt[MT_IO]
> +#else
> +#define _PAGE_DMA_MASK 0
> +#define _PAGE_DMA_PMA 0
> +#define _PAGE_DMA_NC 0
> +#define _PAGE_DMA_IO 0
> +#endif /* CONFIG_64BIT */
> +#endif /* __ASSEMBLY__ */
> +
> #define _PAGE_SPECIAL _PAGE_SOFT
> #define _PAGE_TABLE _PAGE_PRESENT
>

This struct is not useful as part of enabling the standard Svpbmt
extension on Linux, we can set _PAGE_DMA_* macros directly on this patch
and introduce the struct approach later on, when we also define
alternative values for _PAGE_DMA_* flags. Also to someone reading the
code the struct doesn't make sense without some documentation on why
it's needed. Finally why the enum / array ? Why not just have different
fields on the struct ?

> diff --git a/arch/riscv/include/asm/pgtable.h
> b/arch/riscv/include/asm/pgtable.h
> index 39b550310ec6..d07ba586c866 100644
> --- a/arch/riscv/include/asm/pgtable.h
> +++ b/arch/riscv/include/asm/pgtable.h
> @@ -136,7 +136,8 @@
> | _PAGE_PRESENT \
> | _PAGE_ACCESSED \
> | _PAGE_DIRTY \
> - | _PAGE_GLOBAL)
> + | _PAGE_GLOBAL \
> + | _PAGE_DMA_PMA)
>

That's a bit misleading, it's like marking the kernel pages as DMAable.

-/*
- * The RISC-V ISA doesn't yet specify how to query or modify PMAs, so
we can't
- * change the properties of memory regions.
- */
-#define _PAGE_IOREMAP _PAGE_KERNEL
+#define _PAGE_IOREMAP ((_PAGE_KERNEL & ~_PAGE_DMA_MASK) |
_PAGE_DMA_IO)
+
+#define PAGE_IOREMAP __pgprot(_PAGE_IOREMAP)

This isn't used anywhere.

@@ -490,6 +489,28 @@ static inline int ptep_clear_flush_young(struct
vm_area_struct *vma,
return ptep_test_and_clear_young(vma, address, ptep);
}

+#define pgprot_noncached pgprot_noncached
+static inline pgprot_t pgprot_noncached(pgprot_t _prot)
+{
+ unsigned long prot = pgprot_val(_prot);
+
+ prot &= ~_PAGE_DMA_MASK;
+ prot |= _PAGE_DMA_IO;
+
+ return __pgprot(prot);
+}
+
+#define pgprot_writecombine pgprot_writecombine
+static inline pgprot_t pgprot_writecombine(pgprot_t _prot)
+{
+ unsigned long prot = pgprot_val(_prot);
+
+ prot &= ~_PAGE_DMA_MASK;
+ prot |= _PAGE_DMA_NC;
+
+ return __pgprot(prot);
+}
+

We also have the IO type, we should also define pgprot_device to also
ensure ordering, or else it'll fallback to pgprot_noncached, which in
our case won't work well due to RVWMO:
https://elixir.bootlin.com/linux/latest/source/include/linux/pgtable.h#L930

+void __init riscv_svpbmt(void)
+{
+#ifdef CONFIG_64BIT
+ struct device_node *node;
+ const char *str;
+
+ for_each_of_cpu_node(node) {
+ if (of_property_read_string(node, "mmu-type", &str)) {
+ continue;
+ }
+
+ if (!strncmp(str + 11, "svpbmt", 6)) {
+ __riscv_svpbmt.mask = _PAGE_MT_MASK;
+ __riscv_svpbmt.mt[MT_PMA] = _PAGE_MT_PMA;
+ __riscv_svpbmt.mt[MT_NC] = _PAGE_MT_NC;
+ __riscv_svpbmt.mt[MT_IO] = _PAGE_MT_IO;
+ break;
+ }
+ }
+#endif
+}

You break; here the first time you find a cpu node with svpbmt enabled,
shouldn't we make sure that all used cpu nodes support svpbmt before
using the extension ?

Regards,
Nick

2021-09-23 09:50:40

by Nick Kossifidis

[permalink] [raw]
Subject: Re: [PATCH] riscv: Add RISC-V svpbmt extension

Στις 2021-09-23 12:42, Nick Kossifidis έγραψε:
> Στις 2021-09-23 12:37, Anup Patel έγραψε:
>> On Thu, Sep 23, 2021 at 2:55 PM Nick Kossifidis <[email protected]>
>> wrote:
>>>
>>> Hello Guo,
>>>
>>> Στις 2021-09-23 10:27, [email protected] έγραψε:
>>> diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml
>>> b/Documentation/devicetree/bindings/riscv/cpus.yaml
>>> index e534f6a7cfa1..1825cd8db0de 100644
>>> --- a/Documentation/devicetree/bindings/riscv/cpus.yaml
>>> +++ b/Documentation/devicetree/bindings/riscv/cpus.yaml
>>> @@ -56,7 +56,9 @@ properties:
>>> enum:
>>> - riscv,sv32
>>> - riscv,sv39
>>> + - riscv,sv39,svpbmt
>>> - riscv,sv48
>>> + - riscv,sv48,svpbmt
>>> - riscv,none
>>>
>>> Isn't svpbmt orthogonal to the mmu type ? It's a functionality that
>>> can
>>> be present on either sv39/48/57 so why not have another "svpbmt"
>>> property directly on the cpu node ?
>>
>> Actually, "mmu-type" would be a good place because it's page based
>> memory attribute and paging can't exist without mmu translation mode.
>>
>> Also, "svpmbt" is indeed a CPU property so has to be feature
>> individual
>> CPU node. Hypothetically, a heterogeneous system is possible where
>> some CPUs have "svpmbt" and some CPUs don't have "svpmbt". For
>> example, a future FUxxx SoC might have a E-core and few S-cores
>> where S-cores have Svpmbt whereas E-core does not have Svpmbt
>> because it's an embedded core.
>>
>
> I should say cpuX node, not the root /cpu node. We can have an svpbmt
> property in the same way we have an mmu-type property.
>

I'm also thinking of future mmu-related extensions, e.g. what about
svnapot ? Should we have mmu-type be riscv,sv39,svnapot and e.g.
riscv.sv39,svpbmt,svnapot ? It'll become messy.

Regards,
Nick

2021-09-23 09:59:31

by Anup Patel

[permalink] [raw]
Subject: Re: [PATCH] riscv: Add RISC-V svpbmt extension

On Thu, Sep 23, 2021 at 3:18 PM Nick Kossifidis <[email protected]> wrote:
>
> Στις 2021-09-23 12:42, Nick Kossifidis έγραψε:
> > Στις 2021-09-23 12:37, Anup Patel έγραψε:
> >> On Thu, Sep 23, 2021 at 2:55 PM Nick Kossifidis <[email protected]>
> >> wrote:
> >>>
> >>> Hello Guo,
> >>>
> >>> Στις 2021-09-23 10:27, [email protected] έγραψε:
> >>> diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml
> >>> b/Documentation/devicetree/bindings/riscv/cpus.yaml
> >>> index e534f6a7cfa1..1825cd8db0de 100644
> >>> --- a/Documentation/devicetree/bindings/riscv/cpus.yaml
> >>> +++ b/Documentation/devicetree/bindings/riscv/cpus.yaml
> >>> @@ -56,7 +56,9 @@ properties:
> >>> enum:
> >>> - riscv,sv32
> >>> - riscv,sv39
> >>> + - riscv,sv39,svpbmt
> >>> - riscv,sv48
> >>> + - riscv,sv48,svpbmt
> >>> - riscv,none
> >>>
> >>> Isn't svpbmt orthogonal to the mmu type ? It's a functionality that
> >>> can
> >>> be present on either sv39/48/57 so why not have another "svpbmt"
> >>> property directly on the cpu node ?
> >>
> >> Actually, "mmu-type" would be a good place because it's page based
> >> memory attribute and paging can't exist without mmu translation mode.
> >>
> >> Also, "svpmbt" is indeed a CPU property so has to be feature
> >> individual
> >> CPU node. Hypothetically, a heterogeneous system is possible where
> >> some CPUs have "svpmbt" and some CPUs don't have "svpmbt". For
> >> example, a future FUxxx SoC might have a E-core and few S-cores
> >> where S-cores have Svpmbt whereas E-core does not have Svpmbt
> >> because it's an embedded core.
> >>
> >
> > I should say cpuX node, not the root /cpu node. We can have an svpbmt
> > property in the same way we have an mmu-type property.
> >
>
> I'm also thinking of future mmu-related extensions, e.g. what about
> svnapot ? Should we have mmu-type be riscv,sv39,svnapot and e.g.
> riscv.sv39,svpbmt,svnapot ? It'll become messy.

I agree, "mmu-type" can become longer in future but I was thinking
if all MMU related features can simply be comma-separated values
of one DT property.

Alternately, we can have "riscv,svpmbt" bool DT property in each
CPU node which will keep things simpler as compared to parsing
comma-separate string in "mmu-type" DT property.

Regards,
Anup

>
> Regards,
> Nick

2021-09-23 10:12:16

by Philipp Tomsich

[permalink] [raw]
Subject: Re: [PATCH] riscv: Add RISC-V svpbmt extension

On Thu, 23 Sept 2021 at 11:48, Nick Kossifidis <[email protected]> wrote:
>
> Στις 2021-09-23 12:42, Nick Kossifidis έγραψε:
> > Στις 2021-09-23 12:37, Anup Patel έγραψε:
> >> On Thu, Sep 23, 2021 at 2:55 PM Nick Kossifidis <[email protected]>
> >> wrote:
> >>>
> >>> Hello Guo,
> >>>
> >>> Στις 2021-09-23 10:27, [email protected] έγραψε:
> >>> diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml
> >>> b/Documentation/devicetree/bindings/riscv/cpus.yaml
> >>> index e534f6a7cfa1..1825cd8db0de 100644
> >>> --- a/Documentation/devicetree/bindings/riscv/cpus.yaml
> >>> +++ b/Documentation/devicetree/bindings/riscv/cpus.yaml
> >>> @@ -56,7 +56,9 @@ properties:
> >>> enum:
> >>> - riscv,sv32
> >>> - riscv,sv39
> >>> + - riscv,sv39,svpbmt
> >>> - riscv,sv48
> >>> + - riscv,sv48,svpbmt
> >>> - riscv,none
> >>>
> >>> Isn't svpbmt orthogonal to the mmu type ? It's a functionality that
> >>> can
> >>> be present on either sv39/48/57 so why not have another "svpbmt"
> >>> property directly on the cpu node ?
> >>
> >> Actually, "mmu-type" would be a good place because it's page based
> >> memory attribute and paging can't exist without mmu translation mode.
> >>
> >> Also, "svpmbt" is indeed a CPU property so has to be feature
> >> individual
> >> CPU node. Hypothetically, a heterogeneous system is possible where
> >> some CPUs have "svpmbt" and some CPUs don't have "svpmbt". For
> >> example, a future FUxxx SoC might have a E-core and few S-cores
> >> where S-cores have Svpmbt whereas E-core does not have Svpmbt
> >> because it's an embedded core.
> >>
> >
> > I should say cpuX node, not the root /cpu node. We can have an svpbmt
> > property in the same way we have an mmu-type property.
> >
>
> I'm also thinking of future mmu-related extensions, e.g. what about
> svnapot ? Should we have mmu-type be riscv,sv39,svnapot and e.g.
> riscv.sv39,svpbmt,svnapot ? It'll become messy.

How if we expand this to a mmu subnode in cpu@x and add a booleans for
adornments like svnapot and svpbmt?

The older mmu-type could then treated to indicate a mmu w/o any adornments
specified. I am aware that this generates an additional parsing-path
that will be
maintained, but it will allow future properties to be grouped.

This could like like the following:

cpu@0 {
...
mmu {
type = "riscv,sv39";
supports-svpbmt;
}
...
}

Cheers,
Philipp.

2021-09-23 10:14:34

by Nick Kossifidis

[permalink] [raw]
Subject: Re: [PATCH] riscv: Add RISC-V svpbmt extension

Στις 2021-09-23 13:04, Philipp Tomsich έγραψε:
>
> How if we expand this to a mmu subnode in cpu@x and add a booleans for
> adornments like svnapot and svpbmt?
> The older mmu-type could then treated to indicate a mmu w/o any
> adornments specified. I am aware that this generates an additional
> parsing-path that will be maintained, but it will allow future
> properties to be grouped.
>
> cpu@0 {
> ...
> mmu {
> type = "riscv,sv39";
> supports-svpbmt;
> }
> ...
> }

I was about to propose the same thing, we can do this now without
breaking backwards compatibility, we don't really use mmu-type property
at this point, we are either sv39 or nommu.

2021-09-23 10:37:37

by Anup Patel

[permalink] [raw]
Subject: Re: [PATCH] riscv: Add RISC-V svpbmt extension

On Thu, Sep 23, 2021 at 3:38 PM Philipp Tomsich
<[email protected]> wrote:
>
> On Thu, 23 Sept 2021 at 11:48, Nick Kossifidis <[email protected]> wrote:
> >
> > Στις 2021-09-23 12:42, Nick Kossifidis έγραψε:
> > > Στις 2021-09-23 12:37, Anup Patel έγραψε:
> > >> On Thu, Sep 23, 2021 at 2:55 PM Nick Kossifidis <[email protected]>
> > >> wrote:
> > >>>
> > >>> Hello Guo,
> > >>>
> > >>> Στις 2021-09-23 10:27, [email protected] έγραψε:
> > >>> diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml
> > >>> b/Documentation/devicetree/bindings/riscv/cpus.yaml
> > >>> index e534f6a7cfa1..1825cd8db0de 100644
> > >>> --- a/Documentation/devicetree/bindings/riscv/cpus.yaml
> > >>> +++ b/Documentation/devicetree/bindings/riscv/cpus.yaml
> > >>> @@ -56,7 +56,9 @@ properties:
> > >>> enum:
> > >>> - riscv,sv32
> > >>> - riscv,sv39
> > >>> + - riscv,sv39,svpbmt
> > >>> - riscv,sv48
> > >>> + - riscv,sv48,svpbmt
> > >>> - riscv,none
> > >>>
> > >>> Isn't svpbmt orthogonal to the mmu type ? It's a functionality that
> > >>> can
> > >>> be present on either sv39/48/57 so why not have another "svpbmt"
> > >>> property directly on the cpu node ?
> > >>
> > >> Actually, "mmu-type" would be a good place because it's page based
> > >> memory attribute and paging can't exist without mmu translation mode.
> > >>
> > >> Also, "svpmbt" is indeed a CPU property so has to be feature
> > >> individual
> > >> CPU node. Hypothetically, a heterogeneous system is possible where
> > >> some CPUs have "svpmbt" and some CPUs don't have "svpmbt". For
> > >> example, a future FUxxx SoC might have a E-core and few S-cores
> > >> where S-cores have Svpmbt whereas E-core does not have Svpmbt
> > >> because it's an embedded core.
> > >>
> > >
> > > I should say cpuX node, not the root /cpu node. We can have an svpbmt
> > > property in the same way we have an mmu-type property.
> > >
> >
> > I'm also thinking of future mmu-related extensions, e.g. what about
> > svnapot ? Should we have mmu-type be riscv,sv39,svnapot and e.g.
> > riscv.sv39,svpbmt,svnapot ? It'll become messy.
>
> How if we expand this to a mmu subnode in cpu@x and add a booleans for
> adornments like svnapot and svpbmt?
>
> The older mmu-type could then treated to indicate a mmu w/o any adornments
> specified. I am aware that this generates an additional parsing-path
> that will be
> maintained, but it will allow future properties to be grouped.
>
> This could like like the following:
>
> cpu@0 {
> ...
> mmu {
> type = "riscv,sv39";
> supports-svpbmt;
> }
> ...
> }

This is better but we will have to support the old "mmu-type" DT
property as well because we can't break compatibility in DT bindings.

Regards,
Anup

2021-09-23 11:53:25

by Alexandre Ghiti

[permalink] [raw]
Subject: Re: [PATCH] riscv: Add RISC-V svpbmt extension

On Thu, Sep 23, 2021 at 12:11 PM Nick Kossifidis <[email protected]> wrote:
>
> Στις 2021-09-23 13:04, Philipp Tomsich έγραψε:
> >
> > How if we expand this to a mmu subnode in cpu@x and add a booleans for
> > adornments like svnapot and svpbmt?
> > The older mmu-type could then treated to indicate a mmu w/o any
> > adornments specified. I am aware that this generates an additional
> > parsing-path that will be maintained, but it will allow future
> > properties to be grouped.
> >
> > cpu@0 {
> > ...
> > mmu {
> > type = "riscv,sv39";
> > supports-svpbmt;
> > }
> > ...
> > }
>
> I was about to propose the same thing, we can do this now without
> breaking backwards compatibility, we don't really use mmu-type property
> at this point, we are either sv39 or nommu.

Indeed, this property is only informative and not useful since we can
directly "ask" the hw what it supports (cf sv48 patchset). And it
cannot actually be used to force a certain svXX since reading the
device tree happens way too late in the boot process (I have this
issue with my sv48 patchset where I used to read the device tree to
set the size of the address space, but it actually breaks KASAN).

Isn't there a way to know if it supports svPBMT at runtime?

Alex

>
> _______________________________________________
> linux-riscv mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/linux-riscv

2021-09-23 11:55:34

by Guo Ren

[permalink] [raw]
Subject: Re: [PATCH] riscv: Add RISC-V svpbmt extension

On Thu, Sep 23, 2021 at 5:13 PM Anup Patel <[email protected]> wrote:
>
> On Thu, Sep 23, 2021 at 12:57 PM <[email protected]> wrote:
> >
> > From: Guo Ren <[email protected]>
> >
> > This patch follows the standard pure RISC-V Svpbmt extension in
> > privilege spec to solve the non-coherent SOC dma synchronization
> > issues.
> >
> > Here is the svpbmt PTE format:
> > | 63 | 62-61 | 60-8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
> > N MT RSW D A G U X W R V
> > ^
> >
> > Of the Reserved bits [63:54] in a leaf PTE, the high bit is already
> > allocated (as the N bit), so bits [62:61] are used as the MT (aka
> > MemType) field. This field specifies one of three memory types that
> > are close equivalents (or equivalent in effect) to the three main x86
> > and ARMv8 memory types - as shown in the following table.
> >
> > RISC-V
> > Encoding &
> > MemType RISC-V Description
> > ---------- ------------------------------------------------
> > 00 - PMA Normal Cacheable, No change to implied PMA memory type
> > 01 - NC Non-cacheable, idempotent, weakly-ordered Main Memory
> > 10 - IO Non-cacheable, non-idempotent, strongly-ordered I/O memory
> > 11 - Rsvd Reserved for future standard use
> >
> > The standard protection_map [] needn't be modified because the "PMA"
> > type keeps the highest bits zero. The standard protection_map[] needn't
> > be modified. So the whole modification is in the arch/riscv pgtable and
> > using a global variable (__riscv_svpbmt) in the _PAGE_DMA_MASK/IO/NC
> > for pgprot_noncached (&writecombine). We also need _PAGE_CHG_MASK to
> > detect PFN than before.
> >
> > Devicetree: reuse "mmu-type" of cpu_section as a user interface to
> > enable the feature or not:
> > - riscv,sv39,svpbmt
> > - riscv,sv48,svpbmt
> >
> > Signed-off-by: Guo Ren <[email protected]>
> > Cc: Liu Shaohua <[email protected]>
> > Cc: Wei Fu <[email protected]>
> > Cc: Palmer Dabbelt <[email protected]>
> > Cc: Christoph Hellwig <[email protected]>
> > Cc: Anup Patel <[email protected]>
> > Cc: Arnd Bergmann <[email protected]>
> > Cc: Atish Patra <[email protected]>
> > Cc: Drew Fustini <[email protected]>
> > Cc: Wei Fu <[email protected]>
> > Cc: Wei Wu <[email protected]>
> > Cc: Chen-Yu Tsai <[email protected]>
> > Cc: Maxime Ripard <[email protected]>
> > Cc: Daniel Lustig <[email protected]>
> > Cc: Greg Favor <[email protected]>
> > Cc: Andrea Mondelli <[email protected]>
> > Cc: Jonathan Behrens <[email protected]>
> > Cc: Xinhaoqu (Freddie) <[email protected]>
> > Cc: Bill Huffman <[email protected]>
> > Cc: Nick Kossifidis <[email protected]>
> > Cc: Allen Baum <[email protected]>
> > Cc: Josh Scheid <[email protected]>
> > Cc: Richard Trauben <[email protected]>
> > ---
> > .../devicetree/bindings/riscv/cpus.yaml | 2 +
> > arch/riscv/include/asm/fixmap.h | 2 +-
> > arch/riscv/include/asm/pgtable-64.h | 8 ++--
> > arch/riscv/include/asm/pgtable-bits.h | 46 ++++++++++++++++++-
> > arch/riscv/include/asm/pgtable.h | 39 ++++++++++++----
> > arch/riscv/include/asm/processor.h | 1 +
> > arch/riscv/kernel/cpufeature.c | 23 ++++++++++
> > arch/riscv/kernel/setup.c | 2 +
> > arch/riscv/mm/init.c | 5 ++
> > 9 files changed, 113 insertions(+), 15 deletions(-)
> >
> > diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml b/Documentation/devicetree/bindings/riscv/cpus.yaml
> > index e534f6a7cfa1..1825cd8db0de 100644
> > --- a/Documentation/devicetree/bindings/riscv/cpus.yaml
> > +++ b/Documentation/devicetree/bindings/riscv/cpus.yaml
> > @@ -56,7 +56,9 @@ properties:
> > enum:
> > - riscv,sv32
> > - riscv,sv39
> > + - riscv,sv39,svpbmt
> > - riscv,sv48
> > + - riscv,sv48,svpbmt
> > - riscv,none
>
> I think augmenting the "mmu-type" DT property is a good approach but the
> description of "mmu-type" DT property needs to say:
>
> "Identifies the MMU address translation mode and page based memory
> type used on ..."
Okay

>
> Also, DT bindings change is generally a separate patch so better to make
> a separate patch for this.
Okay

>
> >
> > riscv,isa:
> > diff --git a/arch/riscv/include/asm/fixmap.h b/arch/riscv/include/asm/fixmap.h
> > index 54cbf07fb4e9..5acd99d08e74 100644
> > --- a/arch/riscv/include/asm/fixmap.h
> > +++ b/arch/riscv/include/asm/fixmap.h
> > @@ -43,7 +43,7 @@ enum fixed_addresses {
> > __end_of_fixed_addresses
> > };
> >
> > -#define FIXMAP_PAGE_IO PAGE_KERNEL
> > +#define FIXMAP_PAGE_IO PAGE_IOREMAP
> >
> > #define __early_set_fixmap __set_fixmap
> >
> > diff --git a/arch/riscv/include/asm/pgtable-64.h b/arch/riscv/include/asm/pgtable-64.h
> > index 228261aa9628..0b53ea67e91a 100644
> > --- a/arch/riscv/include/asm/pgtable-64.h
> > +++ b/arch/riscv/include/asm/pgtable-64.h
> > @@ -61,12 +61,14 @@ static inline void pud_clear(pud_t *pudp)
> >
> > static inline pmd_t *pud_pgtable(pud_t pud)
> > {
> > - return (pmd_t *)pfn_to_virt(pud_val(pud) >> _PAGE_PFN_SHIFT);
> > + return (pmd_t *)pfn_to_virt((pud_val(pud) & _PAGE_CHG_MASK)
> > + >> _PAGE_PFN_SHIFT);
> > }
> >
> > static inline struct page *pud_page(pud_t pud)
> > {
> > - return pfn_to_page(pud_val(pud) >> _PAGE_PFN_SHIFT);
> > + return pfn_to_page((pud_val(pud) & _PAGE_CHG_MASK)
> > + >> _PAGE_PFN_SHIFT);
> > }
> >
> > static inline pmd_t pfn_pmd(unsigned long pfn, pgprot_t prot)
> > @@ -76,7 +78,7 @@ static inline pmd_t pfn_pmd(unsigned long pfn, pgprot_t prot)
> >
> > static inline unsigned long _pmd_pfn(pmd_t pmd)
> > {
> > - return pmd_val(pmd) >> _PAGE_PFN_SHIFT;
> > + return (pmd_val(pmd) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT;
> > }
> >
> > #define mk_pmd(page, prot) pfn_pmd(page_to_pfn(page), prot)
> > diff --git a/arch/riscv/include/asm/pgtable-bits.h b/arch/riscv/include/asm/pgtable-bits.h
> > index 2ee413912926..041fe4fdbafa 100644
> > --- a/arch/riscv/include/asm/pgtable-bits.h
> > +++ b/arch/riscv/include/asm/pgtable-bits.h
> > @@ -7,7 +7,7 @@
> > #define _ASM_RISCV_PGTABLE_BITS_H
> >
> > /*
> > - * PTE format:
> > + * rv32 PTE format:
> > * | XLEN-1 10 | 9 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
> > * PFN reserved for SW D A G U X W R V
> > */
> > @@ -24,6 +24,47 @@
> > #define _PAGE_DIRTY (1 << 7) /* Set by hardware on any write */
> > #define _PAGE_SOFT (1 << 8) /* Reserved for software */
> >
> > +#ifndef __ASSEMBLY__
> > +#ifdef CONFIG_64BIT
> > +/*
> > + * rv64 PTE format:
> > + * | 63 | 62 61 | 60 54 | 53 10 | 9 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
> > + * N MT RSV PFN reserved for SW D A G U X W R V
> > + * [62:61] Memory Type definitions:
> > + * 00 - PMA Normal Cacheable, No change to implied PMA memory type
> > + * 01 - NC Non-cacheable, idempotent, weakly-ordered Main Memory
> > + * 10 - IO Non-cacheable, non-idempotent, strongly-ordered I/O memory
> > + * 11 - Rsvd Reserved for future standard use
> > + */
> > +#define _PAGE_MT_MASK ((u64)0x3 << 61)
> > +#define _PAGE_MT_PMA ((u64)0x0 << 61)
> > +#define _PAGE_MT_NC ((u64)0x1 << 61)
> > +#define _PAGE_MT_IO ((u64)0x2 << 61)
> > +
> > +enum {
> > + MT_PMA,
> > + MT_NC,
> > + MT_IO,
> > + MT_MAX
> > +};
> > +
> > +extern struct __riscv_svpbmt_struct {
> > + unsigned long mask;
> > + unsigned long mt[MT_MAX];
> > +} __riscv_svpbmt;
> > +
> > +#define _PAGE_DMA_MASK __riscv_svpbmt.mask
> > +#define _PAGE_DMA_PMA __riscv_svpbmt.mt[MT_PMA]
> > +#define _PAGE_DMA_NC __riscv_svpbmt.mt[MT_NC]
> > +#define _PAGE_DMA_IO __riscv_svpbmt.mt[MT_IO]
> > +#else
> > +#define _PAGE_DMA_MASK 0
> > +#define _PAGE_DMA_PMA 0
> > +#define _PAGE_DMA_NC 0
> > +#define _PAGE_DMA_IO 0
> > +#endif /* CONFIG_64BIT */
> > +#endif /* __ASSEMBLY__ */
> > +
> > #define _PAGE_SPECIAL _PAGE_SOFT
> > #define _PAGE_TABLE _PAGE_PRESENT
> >
> > @@ -38,7 +79,8 @@
> > /* Set of bits to preserve across pte_modify() */
> > #define _PAGE_CHG_MASK (~(unsigned long)(_PAGE_PRESENT | _PAGE_READ | \
> > _PAGE_WRITE | _PAGE_EXEC | \
> > - _PAGE_USER | _PAGE_GLOBAL))
> > + _PAGE_USER | _PAGE_GLOBAL | \
> > + _PAGE_DMA_MASK))
> > /*
> > * when all of R/W/X are zero, the PTE is a pointer to the next level
> > * of the page table; otherwise, it is a leaf PTE.
> > diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
> > index 39b550310ec6..d07ba586c866 100644
> > --- a/arch/riscv/include/asm/pgtable.h
> > +++ b/arch/riscv/include/asm/pgtable.h
> > @@ -136,7 +136,8 @@
> > | _PAGE_PRESENT \
> > | _PAGE_ACCESSED \
> > | _PAGE_DIRTY \
> > - | _PAGE_GLOBAL)
> > + | _PAGE_GLOBAL \
> > + | _PAGE_DMA_PMA)
> >
> > #define PAGE_KERNEL __pgprot(_PAGE_KERNEL)
> > #define PAGE_KERNEL_READ __pgprot(_PAGE_KERNEL & ~_PAGE_WRITE)
> > @@ -146,11 +147,9 @@
> >
> > #define PAGE_TABLE __pgprot(_PAGE_TABLE)
> >
> > -/*
> > - * The RISC-V ISA doesn't yet specify how to query or modify PMAs, so we can't
> > - * change the properties of memory regions.
> > - */
> > -#define _PAGE_IOREMAP _PAGE_KERNEL
> > +#define _PAGE_IOREMAP ((_PAGE_KERNEL & ~_PAGE_DMA_MASK) | _PAGE_DMA_IO)
> > +
> > +#define PAGE_IOREMAP __pgprot(_PAGE_IOREMAP)
> >
> > extern pgd_t swapper_pg_dir[];
> >
> > @@ -230,12 +229,12 @@ static inline unsigned long _pgd_pfn(pgd_t pgd)
> >
> > static inline struct page *pmd_page(pmd_t pmd)
> > {
> > - return pfn_to_page(pmd_val(pmd) >> _PAGE_PFN_SHIFT);
> > + return pfn_to_page((pmd_val(pmd) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT);
> > }
> >
> > static inline unsigned long pmd_page_vaddr(pmd_t pmd)
> > {
> > - return (unsigned long)pfn_to_virt(pmd_val(pmd) >> _PAGE_PFN_SHIFT);
> > + return (unsigned long)pfn_to_virt((pmd_val(pmd) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT);
> > }
> >
> > static inline pte_t pmd_pte(pmd_t pmd)
> > @@ -251,7 +250,7 @@ static inline pte_t pud_pte(pud_t pud)
> > /* Yields the page frame number (PFN) of a page table entry */
> > static inline unsigned long pte_pfn(pte_t pte)
> > {
> > - return (pte_val(pte) >> _PAGE_PFN_SHIFT);
> > + return ((pte_val(pte) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT);
> > }
> >
> > #define pte_page(x) pfn_to_page(pte_pfn(x))
> > @@ -490,6 +489,28 @@ static inline int ptep_clear_flush_young(struct vm_area_struct *vma,
> > return ptep_test_and_clear_young(vma, address, ptep);
> > }
> >
> > +#define pgprot_noncached pgprot_noncached
> > +static inline pgprot_t pgprot_noncached(pgprot_t _prot)
> > +{
> > + unsigned long prot = pgprot_val(_prot);
> > +
> > + prot &= ~_PAGE_DMA_MASK;
> > + prot |= _PAGE_DMA_IO;
> > +
> > + return __pgprot(prot);
> > +}
> > +
> > +#define pgprot_writecombine pgprot_writecombine
> > +static inline pgprot_t pgprot_writecombine(pgprot_t _prot)
> > +{
> > + unsigned long prot = pgprot_val(_prot);
> > +
> > + prot &= ~_PAGE_DMA_MASK;
> > + prot |= _PAGE_DMA_NC;
> > +
> > + return __pgprot(prot);
> > +}
> > +
> > /*
> > * THP functions
> > */
> > diff --git a/arch/riscv/include/asm/processor.h b/arch/riscv/include/asm/processor.h
> > index 021ed64ee608..92676156cbf6 100644
> > --- a/arch/riscv/include/asm/processor.h
> > +++ b/arch/riscv/include/asm/processor.h
> > @@ -70,6 +70,7 @@ struct device_node;
> > int riscv_of_processor_hartid(struct device_node *node);
> > int riscv_of_parent_hartid(struct device_node *node);
> >
> > +extern void riscv_svpbmt(void);
>
> You can drop this change.
>
> > extern void riscv_fill_hwcap(void);
> > extern int arch_dup_task_struct(struct task_struct *dst, struct task_struct *src);
> >
> > diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
> > index d959d207a40d..4a2211c2c464 100644
> > --- a/arch/riscv/kernel/cpufeature.c
> > +++ b/arch/riscv/kernel/cpufeature.c
> > @@ -8,6 +8,7 @@
> >
> > #include <linux/bitmap.h>
> > #include <linux/of.h>
> > +#include <linux/pgtable.h>
> > #include <asm/processor.h>
> > #include <asm/hwcap.h>
> > #include <asm/smp.h>
> > @@ -59,6 +60,28 @@ bool __riscv_isa_extension_available(const unsigned long *isa_bitmap, int bit)
> > }
> > EXPORT_SYMBOL_GPL(__riscv_isa_extension_available);
> >
> > +void __init riscv_svpbmt(void)
>
> Make this static function and call it from riscv_fill_hwcap().
Okay

>
> > +{
> > +#ifdef CONFIG_64BIT
> > + struct device_node *node;
> > + const char *str;
> > +
> > + for_each_of_cpu_node(node) {
> > + if (of_property_read_string(node, "mmu-type", &str)) {
> > + continue;
> > + }
> > +
> > + if (!strncmp(str + 11, "svpbmt", 6)) {
> > + __riscv_svpbmt.mask = _PAGE_MT_MASK;
> > + __riscv_svpbmt.mt[MT_PMA] = _PAGE_MT_PMA;
> > + __riscv_svpbmt.mt[MT_NC] = _PAGE_MT_NC;
> > + __riscv_svpbmt.mt[MT_IO] = _PAGE_MT_IO;
> > + break;
> > + }
>
> I don't agree with this loop.
>
> The __riscv_svpbmt should be updated only when all CPU nodes
> have "svpmbt" in the "mmu-type" DT property.
Okay

>
> > + }
> > +#endif
> > +}
> > +
> > void __init riscv_fill_hwcap(void)
> > {
> > struct device_node *node;
> > diff --git a/arch/riscv/kernel/setup.c b/arch/riscv/kernel/setup.c
> > index 120b2f6f71bc..e7457113b45e 100644
> > --- a/arch/riscv/kernel/setup.c
> > +++ b/arch/riscv/kernel/setup.c
> > @@ -294,6 +294,8 @@ void __init setup_arch(char **cmdline_p)
> > setup_smp();
> > #endif
> >
> > + riscv_svpbmt();
> > +
>
> You can drop this change based on above comment.
>
> > riscv_fill_hwcap();
> > }
> >
> > diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> > index 7cb4f391d106..43b2e11fd3e0 100644
> > --- a/arch/riscv/mm/init.c
> > +++ b/arch/riscv/mm/init.c
> > @@ -905,3 +905,8 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
> > return vmemmap_populate_basepages(start, end, node, NULL);
> > }
> > #endif
> > +
> > +#ifdef CONFIG_64BIT
> > +struct __riscv_svpbmt_struct __riscv_svpbmt __ro_after_init;
> > +EXPORT_SYMBOL(__riscv_svpbmt);
> > +#endif
> > --
> > 2.25.1
> >
>
> Appending "svpmb" suffix to "mmu-type" breaks print_mmu()
> in arch/riscv/kernel/cpu.c. Please update that as well.
Okay

>
> We will also need couple of Tested-by for existing boards.
>
> Regards,
> Anup

--
Best Regards
Guo Ren

ML: https://lore.kernel.org/linux-csky/

2021-09-23 12:09:02

by Guo Ren

[permalink] [raw]
Subject: Re: [PATCH] riscv: Add RISC-V svpbmt extension

On Thu, Sep 23, 2021 at 5:48 PM Nick Kossifidis <[email protected]> wrote:
>
> Hello Guo,
>
> Στις 2021-09-23 10:27, [email protected] έγραψε:
> diff --git a/Documentation/devicetree/bindings/riscv/cpus.yaml
> b/Documentation/devicetree/bindings/riscv/cpus.yaml
> index e534f6a7cfa1..1825cd8db0de 100644
> --- a/Documentation/devicetree/bindings/riscv/cpus.yaml
> +++ b/Documentation/devicetree/bindings/riscv/cpus.yaml
> @@ -56,7 +56,9 @@ properties:
> enum:
> - riscv,sv32
> - riscv,sv39
> + - riscv,sv39,svpbmt
> - riscv,sv48
> + - riscv,sv48,svpbmt
> - riscv,none
>
> Isn't svpbmt orthogonal to the mmu type ? It's a functionality that can
> be present on either sv39/48/57 so why not have another "svpbmt"
> property directly on the cpu node ?
Agree, But that's another patch to redesign mmu-type in dts.

>
> > + * rv64 PTE format:
> > + * | 63 | 62 61 | 60 54 | 53 10 | 9 8 | 7 | 6 | 5 | 4 |
> > 3 | 2 | 1 | 0
> > + * N MT RSV PFN reserved for SW D A G U
> > X W R V
> > + * [62:61] Memory Type definitions:
> > + * 00 - PMA Normal Cacheable, No change to implied PMA memory type
> > + * 01 - NC Non-cacheable, idempotent, weakly-ordered Main Memory
> > + * 10 - IO Non-cacheable, non-idempotent, strongly-ordered I/O
> > memory
> > + * 11 - Rsvd Reserved for future standard use
> > + */
> > +#define _PAGE_MT_MASK ((u64)0x3 << 61)
> > +#define _PAGE_MT_PMA ((u64)0x0 << 61)
> > +#define _PAGE_MT_NC ((u64)0x1 << 61)
> > +#define _PAGE_MT_IO ((u64)0x2 << 61)
> > +
>
> It'd be cleaner IMHO if you defined _PAGE_MT_MASK as (_PAGE_MT_PMA |
> _PAGE_MT_NC | _PAGE_MT_IO), like other masks are defined (e.g.
> _PAGE_CHG_MASK on the same file). I also suggest you use unsigned long
> instead of u64 for consistency.
Okay

>
> > +enum {
> > + MT_PMA,
> > + MT_NC,
> > + MT_IO,
> > + MT_MAX
> > +};
> > +
> > +extern struct __riscv_svpbmt_struct {
> > + unsigned long mask;
> > + unsigned long mt[MT_MAX];
> > +} __riscv_svpbmt;
> > +
> > +#define _PAGE_DMA_MASK __riscv_svpbmt.mask
> > +#define _PAGE_DMA_PMA __riscv_svpbmt.mt[MT_PMA]
> > +#define _PAGE_DMA_NC __riscv_svpbmt.mt[MT_NC]
> > +#define _PAGE_DMA_IO __riscv_svpbmt.mt[MT_IO]
> > +#else
> > +#define _PAGE_DMA_MASK 0
> > +#define _PAGE_DMA_PMA 0
> > +#define _PAGE_DMA_NC 0
> > +#define _PAGE_DMA_IO 0
> > +#endif /* CONFIG_64BIT */
> > +#endif /* __ASSEMBLY__ */
> > +
> > #define _PAGE_SPECIAL _PAGE_SOFT
> > #define _PAGE_TABLE _PAGE_PRESENT
> >
>
> This struct is not useful as part of enabling the standard Svpbmt
> extension on Linux, we can set _PAGE_DMA_* macros directly on this patch
> and introduce the struct approach later on, when we also define
> alternative values for _PAGE_DMA_* flags. Also to someone reading the
> code the struct doesn't make sense without some documentation on why
> it's needed. Finally why the enum / array ? Why not just have different
> fields on the struct ?
Okay, I'll use different fields on the struct.

>
> > diff --git a/arch/riscv/include/asm/pgtable.h
> > b/arch/riscv/include/asm/pgtable.h
> > index 39b550310ec6..d07ba586c866 100644
> > --- a/arch/riscv/include/asm/pgtable.h
> > +++ b/arch/riscv/include/asm/pgtable.h
> > @@ -136,7 +136,8 @@
> > | _PAGE_PRESENT \
> > | _PAGE_ACCESSED \
> > | _PAGE_DIRTY \
> > - | _PAGE_GLOBAL)
> > + | _PAGE_GLOBAL \
> > + | _PAGE_DMA_PMA)
> >
>
> That's a bit misleading, it's like marking the kernel pages as DMAable.
>
> -/*
> - * The RISC-V ISA doesn't yet specify how to query or modify PMAs, so
> we can't
> - * change the properties of memory regions.
> - */
> -#define _PAGE_IOREMAP _PAGE_KERNEL
> +#define _PAGE_IOREMAP ((_PAGE_KERNEL & ~_PAGE_DMA_MASK) |
> _PAGE_DMA_IO)
> +
> +#define PAGE_IOREMAP __pgprot(_PAGE_IOREMAP)
>
> This isn't used anywhere.

See:

arch/riscv/include/asm/fixmap.h
@@ -43,7 +43,7 @@ enum fixed_addresses {
__end_of_fixed_addresses
};

-#define FIXMAP_PAGE_IO PAGE_KERNEL
+#define FIXMAP_PAGE_IO PAGE_IOREMAP

>
> @@ -490,6 +489,28 @@ static inline int ptep_clear_flush_young(struct
> vm_area_struct *vma,
> return ptep_test_and_clear_young(vma, address, ptep);
> }
>
> +#define pgprot_noncached pgprot_noncached
> +static inline pgprot_t pgprot_noncached(pgprot_t _prot)
> +{
> + unsigned long prot = pgprot_val(_prot);
> +
> + prot &= ~_PAGE_DMA_MASK;
> + prot |= _PAGE_DMA_IO;
> +
> + return __pgprot(prot);
> +}
> +
> +#define pgprot_writecombine pgprot_writecombine
> +static inline pgprot_t pgprot_writecombine(pgprot_t _prot)
> +{
> + unsigned long prot = pgprot_val(_prot);
> +
> + prot &= ~_PAGE_DMA_MASK;
> + prot |= _PAGE_DMA_NC;
> +
> + return __pgprot(prot);
> +}
> +
>
> We also have the IO type, we should also define pgprot_device to also
> ensure ordering, or else it'll fallback to pgprot_noncached, which in
> our case won't work well due to RVWMO:
> https://elixir.bootlin.com/linux/latest/source/include/linux/pgtable.h#L930
Current is:
pgprot_noncached = pgprot_device = IO type
writecombine = NC type

I think it's proper.

>
> +void __init riscv_svpbmt(void)
> +{
> +#ifdef CONFIG_64BIT
> + struct device_node *node;
> + const char *str;
> +
> + for_each_of_cpu_node(node) {
> + if (of_property_read_string(node, "mmu-type", &str)) {
> + continue;
> + }
> +
> + if (!strncmp(str + 11, "svpbmt", 6)) {
> + __riscv_svpbmt.mask = _PAGE_MT_MASK;
> + __riscv_svpbmt.mt[MT_PMA] = _PAGE_MT_PMA;
> + __riscv_svpbmt.mt[MT_NC] = _PAGE_MT_NC;
> + __riscv_svpbmt.mt[MT_IO] = _PAGE_MT_IO;
> + break;
> + }
> + }
> +#endif
> +}
>
> You break; here the first time you find a cpu node with svpbmt enabled,
> shouldn't we make sure that all used cpu nodes support svpbmt before
> using the extension ?
Okay

>
> Regards,
> Nick



--
Best Regards
Guo Ren

ML: https://lore.kernel.org/linux-csky/

2021-09-23 12:09:05

by Guo Ren

[permalink] [raw]
Subject: Re: [PATCH] riscv: Add RISC-V svpbmt extension

On Thu, Sep 23, 2021 at 6:09 PM Nick Kossifidis <[email protected]> wrote:
>
> Στις 2021-09-23 13:04, Philipp Tomsich έγραψε:
> >
> > How if we expand this to a mmu subnode in cpu@x and add a booleans for
> > adornments like svnapot and svpbmt?
> > The older mmu-type could then treated to indicate a mmu w/o any
> > adornments specified. I am aware that this generates an additional
> > parsing-path that will be maintained, but it will allow future
> > properties to be grouped.
> >
> > cpu@0 {
> > ...
> > mmu {
> > type = "riscv,sv39";
> > supports-svpbmt;
> > }
> > ...
> > }
>
> I was about to propose the same thing, we can do this now without
> breaking backwards compatibility, we don't really use mmu-type property
> at this point, we are either sv39 or nommu.

It should be another patch.

--
Best Regards
Guo Ren

ML: https://lore.kernel.org/linux-csky/

2021-09-23 12:09:18

by Anup Patel

[permalink] [raw]
Subject: Re: [PATCH] riscv: Add RISC-V svpbmt extension

On Thu, Sep 23, 2021 at 5:21 PM Alexandre Ghiti
<[email protected]> wrote:
>
> On Thu, Sep 23, 2021 at 12:11 PM Nick Kossifidis <[email protected]> wrote:
> >
> > Στις 2021-09-23 13:04, Philipp Tomsich έγραψε:
> > >
> > > How if we expand this to a mmu subnode in cpu@x and add a booleans for
> > > adornments like svnapot and svpbmt?
> > > The older mmu-type could then treated to indicate a mmu w/o any
> > > adornments specified. I am aware that this generates an additional
> > > parsing-path that will be maintained, but it will allow future
> > > properties to be grouped.
> > >
> > > cpu@0 {
> > > ...
> > > mmu {
> > > type = "riscv,sv39";
> > > supports-svpbmt;
> > > }
> > > ...
> > > }
> >
> > I was about to propose the same thing, we can do this now without
> > breaking backwards compatibility, we don't really use mmu-type property
> > at this point, we are either sv39 or nommu.
>
> Indeed, this property is only informative and not useful since we can
> directly "ask" the hw what it supports (cf sv48 patchset). And it
> cannot actually be used to force a certain svXX since reading the
> device tree happens way too late in the boot process (I have this
> issue with my sv48 patchset where I used to read the device tree to
> set the size of the address space, but it actually breaks KASAN).
>
> Isn't there a way to know if it supports svPBMT at runtime?

Unfortunately, we can't detect Svpmbt through CSR read/write
or traps.

A DT property seems to be the only way for Svpmbt.

Regards,
Anup

>
> Alex
>
> >
> > _______________________________________________
> > linux-riscv mailing list
> > [email protected]
> > http://lists.infradead.org/mailman/listinfo/linux-riscv