From: Guo Ren <[email protected]>
These patches are a continuation of "riscv: Add DMA_COHERENT support for
Allwinner D1". Compared with the previous factory-defined hardware
implementation, it now follows the PBMT extension proposal.
V2: https://lore.kernel.org/linux-riscv/[email protected]/
V1: https://lore.kernel.org/linux-riscv/[email protected]/
Changes since v2:
- Change to PBMT extension proposal
- Add pbmt in dts
- Using img hdr for custom memory types moification
Changes since v1:
- Rebase on linux-5.13-rc4
- Support defconfig for different PTE attributes
Guo Ren (2):
riscv: pgtable: Add custom protection_map init
riscv: pgtable: Add "PBMT" extension supported
arch/riscv/Kconfig | 4 +++
arch/riscv/include/asm/image.h | 6 ++--
arch/riscv/include/asm/pgtable-64.h | 8 +++--
arch/riscv/include/asm/pgtable-bits.h | 55 ++++++++++++++++++++++++++--
arch/riscv/include/asm/pgtable.h | 17 ++++-----
arch/riscv/kernel/head.S | 6 ++++
arch/riscv/mm/init.c | 68 +++++++++++++++++++++++++++++++++++
mm/mmap.c | 4 +++
8 files changed, 149 insertions(+), 19 deletions(-)
--
2.7.4
From: Guo Ren <[email protected]>
Some RISC-V CPU vendors have defined their own PTE attributes to
solve non-coherent DMA bus problems. That makes _P/SXXX definitions
contain global variables which could be initialized at the early
boot stage before setup_vm.
This patch is similar to 316d097c4cd4 (x86/pti: Filter at
vma->vm_page_prot population) which give a choice for arch custom
implementation.
Signed-off-by: Guo Ren <[email protected]>
Cc: Andrew Morton <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
---
arch/riscv/Kconfig | 4 ++++
arch/riscv/mm/init.c | 22 ++++++++++++++++++++++
mm/mmap.c | 4 ++++
3 files changed, 30 insertions(+)
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index be3d7e2..5327f6f 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -29,6 +29,7 @@ config RISCV
select ARCH_HAS_GIGANTIC_PAGE
select ARCH_HAS_KCOV
select ARCH_HAS_MMIOWB
+ select ARCH_HAS_PROTECTION_MAP_INIT
select ARCH_HAS_PTE_SPECIAL
select ARCH_HAS_SET_DIRECT_MAP
select ARCH_HAS_SET_MEMORY
@@ -207,6 +208,9 @@ config GENERIC_HWEIGHT
config FIX_EARLYCON_MEM
def_bool MMU
+config ARCH_HAS_PROTECTION_MAP_INIT
+ def_bool y
+
config PGTABLE_LEVELS
int
default 3 if 64BIT
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 9c4eabb..4b55046 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -554,6 +554,26 @@ static void __init create_kernel_page_table(pgd_t *pgdir, uintptr_t map_size, bo
}
#endif
+static void __init setup_protection_map(void)
+{
+ protection_map[0] = __P000;
+ protection_map[1] = __P001;
+ protection_map[2] = __P010;
+ protection_map[3] = __P011;
+ protection_map[4] = __P100;
+ protection_map[5] = __P101;
+ protection_map[6] = __P110;
+ protection_map[7] = __P111;
+ protection_map[8] = __S000;
+ protection_map[9] = __S001;
+ protection_map[10] = __S010;
+ protection_map[11] = __S011;
+ protection_map[12] = __S100;
+ protection_map[13] = __S101;
+ protection_map[14] = __S110;
+ protection_map[15] = __S111;
+}
+
asmlinkage void __init setup_vm(uintptr_t dtb_pa)
{
uintptr_t __maybe_unused pa;
@@ -562,6 +582,8 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
pmd_t fix_bmap_spmd, fix_bmap_epmd;
#endif
+ setup_protection_map();
+
#ifdef CONFIG_XIP_KERNEL
xiprom = (uintptr_t)CONFIG_XIP_PHYS_ADDR;
xiprom_sz = (uintptr_t)(&_exiprom) - (uintptr_t)(&_xiprom);
diff --git a/mm/mmap.c b/mm/mmap.c
index 0584e54..0a86059 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -100,10 +100,14 @@ static void unmap_region(struct mm_struct *mm,
* w: (no) no
* x: (yes) yes
*/
+#ifdef CONFIG_ARCH_HAS_PROTECTION_MAP_INIT
+pgprot_t protection_map[16] __ro_after_init;
+#else
pgprot_t protection_map[16] __ro_after_init = {
__P000, __P001, __P010, __P011, __P100, __P101, __P110, __P111,
__S000, __S001, __S010, __S011, __S100, __S101, __S110, __S111
};
+#endif
#ifndef CONFIG_ARCH_HAS_FILTER_PGPROT
static inline pgprot_t arch_filter_pgprot(pgprot_t prot)
--
2.7.4
From: Guo Ren <[email protected]>
"PBMT" - Page-Based Memory Types (see Link for detail), current it
has defined 3 memory types [62:61] in PTE:
- WB 00 "Cacheable 'main memory'"
- NC 01 "Noncacheable 'main memory'"
- IO 11 "Non-cacheable non-idempotent 'I/O'"
The patch not only implements the current PBMT extension but also
considers future scalability. It uses 3 words of image header to
store 8 memory types' values plus a mask value. That means there
are still 5 memory types reserved for future scalability.
This patch does not introduce any manufacturer-defined attribute
codes, because RISC-V requires manufacturers to follow the PBMT
extension specification. However, if the manufacturer implements
a custom PBMT code, it can be compatible by modifying
image_hdr.pbmt[3] during the startup phase.
The patch is trying to keep both below works together:
- "Linux Keep real hardware work" (Allwinner D1 needs custom mt)
- "riscv spec acceptance policy" (Svpbmt extension)
Link: https://lists.riscv.org/g/tech-virt-mem/topic/simplified_latest_pbmt/83389883?p=,,,20,0,0,0::recentpostdate%2Fsticky,,,20,2,0,83389883
Signed-off-by: Guo Ren <[email protected]>
Signed-off-by: Liu Shaohua <[email protected]>
Cc: Palmer Dabbelt <[email protected]>
Cc: Christoph Hellwig <[email protected]>
Cc: Anup Patel <[email protected]>
Cc: Arnd Bergmann <[email protected]>
Cc: Drew Fustini <[email protected]>
Cc: Wei Fu <[email protected]>
Cc: Wei Wu <[email protected]>
Cc: Chen-Yu Tsai <[email protected]>
Cc: Maxime Ripard <[email protected]>
Cc: Daniel Lustig <[email protected]>
Cc: Greg Favor <[email protected]>
Cc: Andrea Mondelli <[email protected]>
Cc: Jonathan Behrens <[email protected]>
Cc: Xinhaoqu (Freddie) <[email protected]>
Cc: Bill Huffman <[email protected]>
Cc: Nick Kossifidis <[email protected]>
Cc: Allen Baum <[email protected]>
Cc: Josh Scheid <[email protected]>
Cc: Richard Trauben <[email protected]>
---
arch/riscv/include/asm/image.h | 6 ++--
arch/riscv/include/asm/pgtable-64.h | 8 +++--
arch/riscv/include/asm/pgtable-bits.h | 55 +++++++++++++++++++++++++++++++++--
arch/riscv/include/asm/pgtable.h | 17 +++++------
arch/riscv/kernel/head.S | 6 ++++
arch/riscv/mm/init.c | 46 +++++++++++++++++++++++++++++
6 files changed, 119 insertions(+), 19 deletions(-)
diff --git a/arch/riscv/include/asm/image.h b/arch/riscv/include/asm/image.h
index e0b319a..15e13de 100644
--- a/arch/riscv/include/asm/image.h
+++ b/arch/riscv/include/asm/image.h
@@ -38,8 +38,7 @@
* @image_size: Effective Image size (little endian)
* @flags: kernel flags (little endian)
* @version: version
- * @res1: reserved
- * @res2: reserved
+ * @pbmt[3]: Page-Based Memory Types (Encode within 3 words)
* @magic: Magic number (RISC-V specific; deprecated)
* @magic2: Magic number 2 (to match the ARM64 'magic' field pos)
* @res3: reserved (will be used for PE COFF offset)
@@ -55,8 +54,7 @@ struct riscv_image_header {
u64 image_size;
u64 flags;
u32 version;
- u32 res1;
- u64 res2;
+ u32 pbmt[3];
u64 magic;
u32 magic2;
u32 res3;
diff --git a/arch/riscv/include/asm/pgtable-64.h b/arch/riscv/include/asm/pgtable-64.h
index e3b7c5d..ecf10bc 100644
--- a/arch/riscv/include/asm/pgtable-64.h
+++ b/arch/riscv/include/asm/pgtable-64.h
@@ -61,12 +61,14 @@ static inline void pud_clear(pud_t *pudp)
static inline unsigned long pud_page_vaddr(pud_t pud)
{
- return (unsigned long)pfn_to_virt(pud_val(pud) >> _PAGE_PFN_SHIFT);
+ return (unsigned long)pfn_to_virt(
+ (pud_val(pud) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT);
}
static inline struct page *pud_page(pud_t pud)
{
- return pfn_to_page(pud_val(pud) >> _PAGE_PFN_SHIFT);
+ return pfn_to_page(
+ (pud_val(pud) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT);
}
static inline pmd_t pfn_pmd(unsigned long pfn, pgprot_t prot)
@@ -76,7 +78,7 @@ static inline pmd_t pfn_pmd(unsigned long pfn, pgprot_t prot)
static inline unsigned long _pmd_pfn(pmd_t pmd)
{
- return pmd_val(pmd) >> _PAGE_PFN_SHIFT;
+ return (pmd_val(pmd) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT;
}
#define pmd_ERROR(e) \
diff --git a/arch/riscv/include/asm/pgtable-bits.h b/arch/riscv/include/asm/pgtable-bits.h
index 2ee4139..1aadd00 100644
--- a/arch/riscv/include/asm/pgtable-bits.h
+++ b/arch/riscv/include/asm/pgtable-bits.h
@@ -7,7 +7,7 @@
#define _ASM_RISCV_PGTABLE_BITS_H
/*
- * PTE format:
+ * rv32 PTE format:
* | XLEN-1 10 | 9 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
* PFN reserved for SW D A G U X W R V
*/
@@ -24,6 +24,49 @@
#define _PAGE_DIRTY (1 << 7) /* Set by hardware on any write */
#define _PAGE_SOFT (1 << 8) /* Reserved for software */
+#ifdef CONFIG_64BIT
+/*
+ * rv64 PTE format:
+ * | 63 | 62 61 | 60 54 | 53 10 | 9 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
+ * N MT RSV PFN reserved for SW D A G U X W R V
+ * [62:61] Memory Type definitions:
+ * - WB: 00
+ * - NC: 01
+ * - IO: 11
+ */
+#define _PAGE_MT_MASK (0x3 << 61)
+#define _PAGE_MT_WB (0x0 << 61)
+#define _PAGE_MT_NC (0x1 << 61)
+#define _PAGE_MT_IO (0x2 << 61)
+
+/*
+ * Using 96 bits of image header to encode memory types. Every types occupy 10
+ * bits and every word contains 3 elements.
+ * - word0: dma + mt[0] + mt[1]
+ * - word1: mt[2] + mt[3] + mt[4]
+ * - word2: mt[5] + mt[6] + mt[7]
+ */
+#define _IMG_HDR_MT_WORD0 (_PAGE_MT_MASK >> 54) | \
+ (_PAGE_MT_WB >> 44) | \
+ (_PAGE_MT_NC >> 34)
+#define _IMG_HDR_MT_WORD1 (_PAGE_MT_IO >> 54) | \
+ (0 >> 44) | \
+ (0 >> 34)
+#define _IMG_HDR_MT_WORD2 (0 >> 54) | \
+ (0 >> 44) | \
+ (0 >> 34)
+
+#define _PAGE_DMA_MASK __riscv_pbmt.mask
+#define _PAGE_DMA_WB __riscv_pbmt.mt[0]
+#define _PAGE_DMA_NC __riscv_pbmt.mt[1]
+#define _PAGE_DMA_IO __riscv_pbmt.mt[2]
+#else
+#define _PAGE_DMA_MASK 0
+#define _PAGE_DMA_WB 0
+#define _PAGE_DMA_NC 0
+#define _PAGE_DMA_IO 0
+#endif
+
#define _PAGE_SPECIAL _PAGE_SOFT
#define _PAGE_TABLE _PAGE_PRESENT
@@ -35,10 +78,18 @@
#define _PAGE_PFN_SHIFT 10
+#ifndef __ASSEMBLY__
+extern struct __riscv_pbmt_struct {
+ unsigned long mask;
+ unsigned long mt[8];
+} __riscv_pbmt;
+#endif
+
/* Set of bits to preserve across pte_modify() */
#define _PAGE_CHG_MASK (~(unsigned long)(_PAGE_PRESENT | _PAGE_READ | \
_PAGE_WRITE | _PAGE_EXEC | \
- _PAGE_USER | _PAGE_GLOBAL))
+ _PAGE_USER | _PAGE_GLOBAL | \
+ _PAGE_DMA_MASK))
/*
* when all of R/W/X are zero, the PTE is a pointer to the next level
* of the page table; otherwise, it is a leaf PTE.
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index 46a8aa6..d59c1d3 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -116,7 +116,7 @@
#define USER_PTRS_PER_PGD (TASK_SIZE / PGDIR_SIZE)
/* Page protection bits */
-#define _PAGE_BASE (_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_USER)
+#define _PAGE_BASE (_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_USER | _PAGE_DMA_WB)
#define PAGE_NONE __pgprot(_PAGE_PROT_NONE)
#define PAGE_READ __pgprot(_PAGE_BASE | _PAGE_READ)
@@ -137,7 +137,8 @@
| _PAGE_PRESENT \
| _PAGE_ACCESSED \
| _PAGE_DIRTY \
- | _PAGE_GLOBAL)
+ | _PAGE_GLOBAL \
+ | _PAGE_DMA_WB)
#define PAGE_KERNEL __pgprot(_PAGE_KERNEL)
#define PAGE_KERNEL_READ __pgprot(_PAGE_KERNEL & ~_PAGE_WRITE)
@@ -147,11 +148,7 @@
#define PAGE_TABLE __pgprot(_PAGE_TABLE)
-/*
- * The RISC-V ISA doesn't yet specify how to query or modify PMAs, so we can't
- * change the properties of memory regions.
- */
-#define _PAGE_IOREMAP _PAGE_KERNEL
+#define _PAGE_IOREMAP ((_PAGE_KERNEL & ~_PAGE_DMA_MASK) | _PAGE_DMA_IO)
extern pgd_t swapper_pg_dir[];
@@ -231,12 +228,12 @@ static inline unsigned long _pgd_pfn(pgd_t pgd)
static inline struct page *pmd_page(pmd_t pmd)
{
- return pfn_to_page(pmd_val(pmd) >> _PAGE_PFN_SHIFT);
+ return pfn_to_page((pmd_val(pmd) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT);
}
static inline unsigned long pmd_page_vaddr(pmd_t pmd)
{
- return (unsigned long)pfn_to_virt(pmd_val(pmd) >> _PAGE_PFN_SHIFT);
+ return (unsigned long)pfn_to_virt((pmd_val(pmd) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT);
}
static inline pte_t pmd_pte(pmd_t pmd)
@@ -252,7 +249,7 @@ static inline pte_t pud_pte(pud_t pud)
/* Yields the page frame number (PFN) of a page table entry */
static inline unsigned long pte_pfn(pte_t pte)
{
- return (pte_val(pte) >> _PAGE_PFN_SHIFT);
+ return ((pte_val(pte) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT);
}
#define pte_page(x) pfn_to_page(pte_pfn(x))
diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
index 89cc58a..6037df9 100644
--- a/arch/riscv/kernel/head.S
+++ b/arch/riscv/kernel/head.S
@@ -63,8 +63,14 @@ ENTRY(_start)
.dword _end - _start
.dword __HEAD_FLAGS
.word RISCV_HEADER_VERSION
+#ifdef CONFIG_64BIT
+ .word _IMG_HDR_MT_WORD0
+ .word _IMG_HDR_MT_WORD1
+ .word _IMG_HDR_MT_WORD2
+#else
.word 0
.dword 0
+#endif
.ascii RISCV_IMAGE_MAGIC
.balign 4
.ascii RISCV_IMAGE_MAGIC2
diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
index 4b55046..8c1978a 100644
--- a/arch/riscv/mm/init.c
+++ b/arch/riscv/mm/init.c
@@ -21,6 +21,7 @@
#include <linux/crash_dump.h>
#include <asm/fixmap.h>
+#include <asm/image.h>
#include <asm/tlbflush.h>
#include <asm/sections.h>
#include <asm/soc.h>
@@ -554,6 +555,27 @@ static void __init create_kernel_page_table(pgd_t *pgdir, uintptr_t map_size, bo
}
#endif
+#ifdef CONFIG_64BIT
+#define PBMT_HDR_TO_MT(val, off) ((unsigned long)((val >> off) & 0x3ff) << 54)
+
+static void __init setup_pbmt(void)
+{
+ unsigned int *pbmt = ((struct riscv_image_header *)(&_start))->pbmt;
+
+ __riscv_pbmt.mask = PBMT_HDR_TO_MT(pbmt[0], 0);
+ __riscv_pbmt.mt[0] = PBMT_HDR_TO_MT(pbmt[0], 10);
+ __riscv_pbmt.mt[1] = PBMT_HDR_TO_MT(pbmt[0], 20);
+
+ __riscv_pbmt.mt[2] = PBMT_HDR_TO_MT(pbmt[1], 0);
+ __riscv_pbmt.mt[3] = PBMT_HDR_TO_MT(pbmt[1], 10);
+ __riscv_pbmt.mt[4] = PBMT_HDR_TO_MT(pbmt[1], 20);
+
+ __riscv_pbmt.mt[5] = PBMT_HDR_TO_MT(pbmt[2], 0);
+ __riscv_pbmt.mt[6] = PBMT_HDR_TO_MT(pbmt[2], 10);
+ __riscv_pbmt.mt[7] = PBMT_HDR_TO_MT(pbmt[2], 20);
+}
+#endif
+
static void __init setup_protection_map(void)
{
protection_map[0] = __P000;
@@ -582,6 +604,9 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
pmd_t fix_bmap_spmd, fix_bmap_epmd;
#endif
+#ifdef CONFIG_64BIT
+ setup_pbmt();
+#endif
setup_protection_map();
#ifdef CONFIG_XIP_KERNEL
@@ -915,6 +940,19 @@ void __init paging_init(void)
setup_vm_final();
}
+#ifdef CONFIG_64BIT
+void __init pbmt_init(void)
+{
+ struct device_node *cpu;
+
+ cpu = of_find_node_by_path("/cpus");
+ if (!cpu || of_property_read_bool(cpu, "pbmt-extension"))
+ return;
+
+ memset(&__riscv_pbmt, 0, sizeof(__riscv_pbmt));
+}
+#endif
+
void __init misc_mem_init(void)
{
early_memtest(min_low_pfn << PAGE_SHIFT, max_low_pfn << PAGE_SHIFT);
@@ -925,6 +963,9 @@ void __init misc_mem_init(void)
reserve_crashkernel();
#endif
memblock_dump_all();
+#ifdef CONFIG_64BIT
+ pbmt_init();
+#endif
}
#ifdef CONFIG_SPARSEMEM_VMEMMAP
@@ -934,3 +975,8 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
return vmemmap_populate_basepages(start, end, node, NULL);
}
#endif
+
+#ifdef CONFIG_64BIT
+struct __riscv_pbmt_struct __riscv_pbmt __ro_after_init;
+EXPORT_SYMBOL(__riscv_pbmt);
+#endif
--
2.7.4
On Mon, Jun 14, 2021 at 11:22 PM <[email protected]> wrote:
>
> From: Guo Ren <[email protected]>
>
> "PBMT" - Page-Based Memory Types (see Link for detail), current it
> has defined 3 memory types [62:61] in PTE:
> - WB 00 "Cacheable 'main memory'"
> - NC 01 "Noncacheable 'main memory'"
> - IO 11 "Non-cacheable non-idempotent 'I/O'"
>
> The patch not only implements the current PBMT extension but also
> considers future scalability. It uses 3 words of image header to
> store 8 memory types' values plus a mask value. That means there
> are still 5 memory types reserved for future scalability.
This is the worst work-around to the Linux RISC-V patch acceptance
policy.
Passing PTE attributes in the Linux Image header means boot-loaders
will have to update the image header before jumping to Linux kernel.
Basically, this is changing the Linux boot-process by adding platform
specific image header updation step.
Further, this patch is doing too many things in one-go. I needs to be
broken down into smaller fine-grained patches.
I totally disapprove of this patch in it's current shape since the
Linux boot-protocol should not have any platform specific part.
Also, please don't CC RVI mailing list for Linux patches because
the people can post to RVI mailing list only by joining it.
Regards,
Anup
>
> This patch does not introduce any manufacturer-defined attribute
> codes, because RISC-V requires manufacturers to follow the PBMT
> extension specification. However, if the manufacturer implements
> a custom PBMT code, it can be compatible by modifying
> image_hdr.pbmt[3] during the startup phase.
>
> The patch is trying to keep both below works together:
> - "Linux Keep real hardware work" (Allwinner D1 needs custom mt)
> - "riscv spec acceptance policy" (Svpbmt extension)
>
> Link: https://lists.riscv.org/g/tech-virt-mem/topic/simplified_latest_pbmt/83389883?p=,,,20,0,0,0::recentpostdate%2Fsticky,,,20,2,0,83389883
> Signed-off-by: Guo Ren <[email protected]>
> Signed-off-by: Liu Shaohua <[email protected]>
> Cc: Palmer Dabbelt <[email protected]>
> Cc: Christoph Hellwig <[email protected]>
> Cc: Anup Patel <[email protected]>
> Cc: Arnd Bergmann <[email protected]>
> Cc: Drew Fustini <[email protected]>
> Cc: Wei Fu <[email protected]>
> Cc: Wei Wu <[email protected]>
> Cc: Chen-Yu Tsai <[email protected]>
> Cc: Maxime Ripard <[email protected]>
> Cc: Daniel Lustig <[email protected]>
> Cc: Greg Favor <[email protected]>
> Cc: Andrea Mondelli <[email protected]>
> Cc: Jonathan Behrens <[email protected]>
> Cc: Xinhaoqu (Freddie) <[email protected]>
> Cc: Bill Huffman <[email protected]>
> Cc: Nick Kossifidis <[email protected]>
> Cc: Allen Baum <[email protected]>
> Cc: Josh Scheid <[email protected]>
> Cc: Richard Trauben <[email protected]>
> ---
> arch/riscv/include/asm/image.h | 6 ++--
> arch/riscv/include/asm/pgtable-64.h | 8 +++--
> arch/riscv/include/asm/pgtable-bits.h | 55 +++++++++++++++++++++++++++++++++--
> arch/riscv/include/asm/pgtable.h | 17 +++++------
> arch/riscv/kernel/head.S | 6 ++++
> arch/riscv/mm/init.c | 46 +++++++++++++++++++++++++++++
> 6 files changed, 119 insertions(+), 19 deletions(-)
>
> diff --git a/arch/riscv/include/asm/image.h b/arch/riscv/include/asm/image.h
> index e0b319a..15e13de 100644
> --- a/arch/riscv/include/asm/image.h
> +++ b/arch/riscv/include/asm/image.h
> @@ -38,8 +38,7 @@
> * @image_size: Effective Image size (little endian)
> * @flags: kernel flags (little endian)
> * @version: version
> - * @res1: reserved
> - * @res2: reserved
> + * @pbmt[3]: Page-Based Memory Types (Encode within 3 words)
> * @magic: Magic number (RISC-V specific; deprecated)
> * @magic2: Magic number 2 (to match the ARM64 'magic' field pos)
> * @res3: reserved (will be used for PE COFF offset)
> @@ -55,8 +54,7 @@ struct riscv_image_header {
> u64 image_size;
> u64 flags;
> u32 version;
> - u32 res1;
> - u64 res2;
> + u32 pbmt[3];
> u64 magic;
> u32 magic2;
> u32 res3;
> diff --git a/arch/riscv/include/asm/pgtable-64.h b/arch/riscv/include/asm/pgtable-64.h
> index e3b7c5d..ecf10bc 100644
> --- a/arch/riscv/include/asm/pgtable-64.h
> +++ b/arch/riscv/include/asm/pgtable-64.h
> @@ -61,12 +61,14 @@ static inline void pud_clear(pud_t *pudp)
>
> static inline unsigned long pud_page_vaddr(pud_t pud)
> {
> - return (unsigned long)pfn_to_virt(pud_val(pud) >> _PAGE_PFN_SHIFT);
> + return (unsigned long)pfn_to_virt(
> + (pud_val(pud) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT);
> }
>
> static inline struct page *pud_page(pud_t pud)
> {
> - return pfn_to_page(pud_val(pud) >> _PAGE_PFN_SHIFT);
> + return pfn_to_page(
> + (pud_val(pud) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT);
> }
>
> static inline pmd_t pfn_pmd(unsigned long pfn, pgprot_t prot)
> @@ -76,7 +78,7 @@ static inline pmd_t pfn_pmd(unsigned long pfn, pgprot_t prot)
>
> static inline unsigned long _pmd_pfn(pmd_t pmd)
> {
> - return pmd_val(pmd) >> _PAGE_PFN_SHIFT;
> + return (pmd_val(pmd) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT;
> }
>
> #define pmd_ERROR(e) \
> diff --git a/arch/riscv/include/asm/pgtable-bits.h b/arch/riscv/include/asm/pgtable-bits.h
> index 2ee4139..1aadd00 100644
> --- a/arch/riscv/include/asm/pgtable-bits.h
> +++ b/arch/riscv/include/asm/pgtable-bits.h
> @@ -7,7 +7,7 @@
> #define _ASM_RISCV_PGTABLE_BITS_H
>
> /*
> - * PTE format:
> + * rv32 PTE format:
> * | XLEN-1 10 | 9 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
> * PFN reserved for SW D A G U X W R V
> */
> @@ -24,6 +24,49 @@
> #define _PAGE_DIRTY (1 << 7) /* Set by hardware on any write */
> #define _PAGE_SOFT (1 << 8) /* Reserved for software */
>
> +#ifdef CONFIG_64BIT
> +/*
> + * rv64 PTE format:
> + * | 63 | 62 61 | 60 54 | 53 10 | 9 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
> + * N MT RSV PFN reserved for SW D A G U X W R V
> + * [62:61] Memory Type definitions:
> + * - WB: 00
> + * - NC: 01
> + * - IO: 11
> + */
> +#define _PAGE_MT_MASK (0x3 << 61)
> +#define _PAGE_MT_WB (0x0 << 61)
> +#define _PAGE_MT_NC (0x1 << 61)
> +#define _PAGE_MT_IO (0x2 << 61)
> +
> +/*
> + * Using 96 bits of image header to encode memory types. Every types occupy 10
> + * bits and every word contains 3 elements.
> + * - word0: dma + mt[0] + mt[1]
> + * - word1: mt[2] + mt[3] + mt[4]
> + * - word2: mt[5] + mt[6] + mt[7]
> + */
> +#define _IMG_HDR_MT_WORD0 (_PAGE_MT_MASK >> 54) | \
> + (_PAGE_MT_WB >> 44) | \
> + (_PAGE_MT_NC >> 34)
> +#define _IMG_HDR_MT_WORD1 (_PAGE_MT_IO >> 54) | \
> + (0 >> 44) | \
> + (0 >> 34)
> +#define _IMG_HDR_MT_WORD2 (0 >> 54) | \
> + (0 >> 44) | \
> + (0 >> 34)
> +
> +#define _PAGE_DMA_MASK __riscv_pbmt.mask
> +#define _PAGE_DMA_WB __riscv_pbmt.mt[0]
> +#define _PAGE_DMA_NC __riscv_pbmt.mt[1]
> +#define _PAGE_DMA_IO __riscv_pbmt.mt[2]
> +#else
> +#define _PAGE_DMA_MASK 0
> +#define _PAGE_DMA_WB 0
> +#define _PAGE_DMA_NC 0
> +#define _PAGE_DMA_IO 0
> +#endif
> +
> #define _PAGE_SPECIAL _PAGE_SOFT
> #define _PAGE_TABLE _PAGE_PRESENT
>
> @@ -35,10 +78,18 @@
>
> #define _PAGE_PFN_SHIFT 10
>
> +#ifndef __ASSEMBLY__
> +extern struct __riscv_pbmt_struct {
> + unsigned long mask;
> + unsigned long mt[8];
> +} __riscv_pbmt;
> +#endif
> +
> /* Set of bits to preserve across pte_modify() */
> #define _PAGE_CHG_MASK (~(unsigned long)(_PAGE_PRESENT | _PAGE_READ | \
> _PAGE_WRITE | _PAGE_EXEC | \
> - _PAGE_USER | _PAGE_GLOBAL))
> + _PAGE_USER | _PAGE_GLOBAL | \
> + _PAGE_DMA_MASK))
> /*
> * when all of R/W/X are zero, the PTE is a pointer to the next level
> * of the page table; otherwise, it is a leaf PTE.
> diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
> index 46a8aa6..d59c1d3 100644
> --- a/arch/riscv/include/asm/pgtable.h
> +++ b/arch/riscv/include/asm/pgtable.h
> @@ -116,7 +116,7 @@
> #define USER_PTRS_PER_PGD (TASK_SIZE / PGDIR_SIZE)
>
> /* Page protection bits */
> -#define _PAGE_BASE (_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_USER)
> +#define _PAGE_BASE (_PAGE_PRESENT | _PAGE_ACCESSED | _PAGE_USER | _PAGE_DMA_WB)
>
> #define PAGE_NONE __pgprot(_PAGE_PROT_NONE)
> #define PAGE_READ __pgprot(_PAGE_BASE | _PAGE_READ)
> @@ -137,7 +137,8 @@
> | _PAGE_PRESENT \
> | _PAGE_ACCESSED \
> | _PAGE_DIRTY \
> - | _PAGE_GLOBAL)
> + | _PAGE_GLOBAL \
> + | _PAGE_DMA_WB)
>
> #define PAGE_KERNEL __pgprot(_PAGE_KERNEL)
> #define PAGE_KERNEL_READ __pgprot(_PAGE_KERNEL & ~_PAGE_WRITE)
> @@ -147,11 +148,7 @@
>
> #define PAGE_TABLE __pgprot(_PAGE_TABLE)
>
> -/*
> - * The RISC-V ISA doesn't yet specify how to query or modify PMAs, so we can't
> - * change the properties of memory regions.
> - */
> -#define _PAGE_IOREMAP _PAGE_KERNEL
> +#define _PAGE_IOREMAP ((_PAGE_KERNEL & ~_PAGE_DMA_MASK) | _PAGE_DMA_IO)
>
> extern pgd_t swapper_pg_dir[];
>
> @@ -231,12 +228,12 @@ static inline unsigned long _pgd_pfn(pgd_t pgd)
>
> static inline struct page *pmd_page(pmd_t pmd)
> {
> - return pfn_to_page(pmd_val(pmd) >> _PAGE_PFN_SHIFT);
> + return pfn_to_page((pmd_val(pmd) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT);
> }
>
> static inline unsigned long pmd_page_vaddr(pmd_t pmd)
> {
> - return (unsigned long)pfn_to_virt(pmd_val(pmd) >> _PAGE_PFN_SHIFT);
> + return (unsigned long)pfn_to_virt((pmd_val(pmd) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT);
> }
>
> static inline pte_t pmd_pte(pmd_t pmd)
> @@ -252,7 +249,7 @@ static inline pte_t pud_pte(pud_t pud)
> /* Yields the page frame number (PFN) of a page table entry */
> static inline unsigned long pte_pfn(pte_t pte)
> {
> - return (pte_val(pte) >> _PAGE_PFN_SHIFT);
> + return ((pte_val(pte) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT);
> }
>
> #define pte_page(x) pfn_to_page(pte_pfn(x))
> diff --git a/arch/riscv/kernel/head.S b/arch/riscv/kernel/head.S
> index 89cc58a..6037df9 100644
> --- a/arch/riscv/kernel/head.S
> +++ b/arch/riscv/kernel/head.S
> @@ -63,8 +63,14 @@ ENTRY(_start)
> .dword _end - _start
> .dword __HEAD_FLAGS
> .word RISCV_HEADER_VERSION
> +#ifdef CONFIG_64BIT
> + .word _IMG_HDR_MT_WORD0
> + .word _IMG_HDR_MT_WORD1
> + .word _IMG_HDR_MT_WORD2
> +#else
> .word 0
> .dword 0
> +#endif
> .ascii RISCV_IMAGE_MAGIC
> .balign 4
> .ascii RISCV_IMAGE_MAGIC2
> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
> index 4b55046..8c1978a 100644
> --- a/arch/riscv/mm/init.c
> +++ b/arch/riscv/mm/init.c
> @@ -21,6 +21,7 @@
> #include <linux/crash_dump.h>
>
> #include <asm/fixmap.h>
> +#include <asm/image.h>
> #include <asm/tlbflush.h>
> #include <asm/sections.h>
> #include <asm/soc.h>
> @@ -554,6 +555,27 @@ static void __init create_kernel_page_table(pgd_t *pgdir, uintptr_t map_size, bo
> }
> #endif
>
> +#ifdef CONFIG_64BIT
> +#define PBMT_HDR_TO_MT(val, off) ((unsigned long)((val >> off) & 0x3ff) << 54)
> +
> +static void __init setup_pbmt(void)
> +{
> + unsigned int *pbmt = ((struct riscv_image_header *)(&_start))->pbmt;
> +
> + __riscv_pbmt.mask = PBMT_HDR_TO_MT(pbmt[0], 0);
> + __riscv_pbmt.mt[0] = PBMT_HDR_TO_MT(pbmt[0], 10);
> + __riscv_pbmt.mt[1] = PBMT_HDR_TO_MT(pbmt[0], 20);
> +
> + __riscv_pbmt.mt[2] = PBMT_HDR_TO_MT(pbmt[1], 0);
> + __riscv_pbmt.mt[3] = PBMT_HDR_TO_MT(pbmt[1], 10);
> + __riscv_pbmt.mt[4] = PBMT_HDR_TO_MT(pbmt[1], 20);
> +
> + __riscv_pbmt.mt[5] = PBMT_HDR_TO_MT(pbmt[2], 0);
> + __riscv_pbmt.mt[6] = PBMT_HDR_TO_MT(pbmt[2], 10);
> + __riscv_pbmt.mt[7] = PBMT_HDR_TO_MT(pbmt[2], 20);
> +}
> +#endif
> +
> static void __init setup_protection_map(void)
> {
> protection_map[0] = __P000;
> @@ -582,6 +604,9 @@ asmlinkage void __init setup_vm(uintptr_t dtb_pa)
> pmd_t fix_bmap_spmd, fix_bmap_epmd;
> #endif
>
> +#ifdef CONFIG_64BIT
> + setup_pbmt();
> +#endif
> setup_protection_map();
>
> #ifdef CONFIG_XIP_KERNEL
> @@ -915,6 +940,19 @@ void __init paging_init(void)
> setup_vm_final();
> }
>
> +#ifdef CONFIG_64BIT
> +void __init pbmt_init(void)
> +{
> + struct device_node *cpu;
> +
> + cpu = of_find_node_by_path("/cpus");
> + if (!cpu || of_property_read_bool(cpu, "pbmt-extension"))
> + return;
> +
> + memset(&__riscv_pbmt, 0, sizeof(__riscv_pbmt));
> +}
> +#endif
> +
> void __init misc_mem_init(void)
> {
> early_memtest(min_low_pfn << PAGE_SHIFT, max_low_pfn << PAGE_SHIFT);
> @@ -925,6 +963,9 @@ void __init misc_mem_init(void)
> reserve_crashkernel();
> #endif
> memblock_dump_all();
> +#ifdef CONFIG_64BIT
> + pbmt_init();
> +#endif
> }
>
> #ifdef CONFIG_SPARSEMEM_VMEMMAP
> @@ -934,3 +975,8 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
> return vmemmap_populate_basepages(start, end, node, NULL);
> }
> #endif
> +
> +#ifdef CONFIG_64BIT
> +struct __riscv_pbmt_struct __riscv_pbmt __ro_after_init;
> +EXPORT_SYMBOL(__riscv_pbmt);
> +#endif
> --
> 2.7.4
>
On Tue, Jun 15, 2021 at 12:11 PM Anup Patel <[email protected]> wrote:
>
> On Mon, Jun 14, 2021 at 11:22 PM <[email protected]> wrote:
> >
> > From: Guo Ren <[email protected]>
> >
> > "PBMT" - Page-Based Memory Types (see Link for detail), current it
> > has defined 3 memory types [62:61] in PTE:
> > - WB 00 "Cacheable 'main memory'"
> > - NC 01 "Noncacheable 'main memory'"
> > - IO 11 "Non-cacheable non-idempotent 'I/O'"
> >
> > The patch not only implements the current PBMT extension but also
> > considers future scalability. It uses 3 words of image header to
> > store 8 memory types' values plus a mask value. That means there
> > are still 5 memory types reserved for future scalability.
>
> This is the worst work-around to the Linux RISC-V patch acceptance
> policy.
>
> Passing PTE attributes in the Linux Image header means boot-loaders
> will have to update the image header before jumping to Linux kernel.
> Basically, this is changing the Linux boot-process by adding platform
> specific image header updation step.
>
> Further, this patch is doing too many things in one-go. I needs to be
> broken down into smaller fine-grained patches.
Next time, I'll separate it into below patches:
- riscv: pgtable: Fixup _PAGE_CHG_MASK usage
- riscv: pgtable: Add "PBMT" extension supported
- riscv: pgtable: Add vendor custom "PBMT" definitions interface
How?
>
> I totally disapprove of this patch in it's current shape since the
> Linux boot-protocol should not have any platform specific part.
I think you mean we shouldn't modify the uImage header.
How about parsing dtb before setup_vm (It's a very early stage before
mmu enabled)? eg:
cpus {
...
rv64-pbmt-extension;
rv64-pbmt-custom-remapping = <[mask] [type0] [type1] [type2]>;
>
> Also, please don't CC RVI mailing list for Linux patches because
> the people can post to RVI mailing list only by joining it.
Okay, I'll remove the RVI mailing list. If I needed to, I would send a
notification email to RVI separately.
--
Best Regards
Guo Ren
ML: https://lore.kernel.org/linux-csky/