2022-08-30 04:50:38

by Anup Patel

[permalink] [raw]
Subject: [PATCH v2 0/4] Add PMEM support for RISC-V

The Linux NVDIMM PEM drivers require arch support to map and access the
persistent memory device. This series adds RISC-V PMEM support using
recently added Svpbmt and Zicbom support.

These patches can also be found in riscv_pmem_v2 branch at:
https://github.com/avpatel/linux.git

Changes since v1:
- Fix error reported by test bot
https://lore.kernel.org/all/[email protected]/

Anup Patel (4):
RISC-V: Fix ioremap_cache() and ioremap_wc() for systems with Svpbmt
RISC-V: Move riscv_init_cbom_blocksize() to cacheflush.c
RISC-V: Implement arch specific PMEM APIs
RISC-V: Enable PMEM drivers

arch/riscv/Kconfig | 1 +
arch/riscv/configs/defconfig | 1 +
arch/riscv/include/asm/cacheflush.h | 2 ++
arch/riscv/include/asm/io.h | 10 ++++++++
arch/riscv/include/asm/pgtable.h | 2 ++
arch/riscv/mm/Makefile | 1 +
arch/riscv/mm/cacheflush.c | 39 +++++++++++++++++++++++++++++
arch/riscv/mm/dma-noncoherent.c | 38 ----------------------------
arch/riscv/mm/pmem.c | 21 ++++++++++++++++
9 files changed, 77 insertions(+), 38 deletions(-)
create mode 100644 arch/riscv/mm/pmem.c

--
2.34.1


2022-08-30 04:50:44

by Anup Patel

[permalink] [raw]
Subject: [PATCH v2 2/4] RISC-V: Move riscv_init_cbom_blocksize() to cacheflush.c

The riscv_cbom_block_size parsing from DT belongs to cacheflush.c which
is home for all cache maintenance related stuff so let us move the
riscv_init_cbom_blocksize() and riscv_cbom_block_size to cacheflush.c.

Co-developed-by: Mayuresh Chitale <[email protected]>
Signed-off-by: Mayuresh Chitale <[email protected]>
Signed-off-by: Anup Patel <[email protected]>
---
arch/riscv/include/asm/cacheflush.h | 2 ++
arch/riscv/mm/cacheflush.c | 39 +++++++++++++++++++++++++++++
arch/riscv/mm/dma-noncoherent.c | 38 ----------------------------
3 files changed, 41 insertions(+), 38 deletions(-)

diff --git a/arch/riscv/include/asm/cacheflush.h b/arch/riscv/include/asm/cacheflush.h
index a60acaecfeda..de55d6b8deeb 100644
--- a/arch/riscv/include/asm/cacheflush.h
+++ b/arch/riscv/include/asm/cacheflush.h
@@ -42,6 +42,8 @@ void flush_icache_mm(struct mm_struct *mm, bool local);

#endif /* CONFIG_SMP */

+extern unsigned int riscv_cbom_block_size;
+
#ifdef CONFIG_RISCV_ISA_ZICBOM
void riscv_init_cbom_blocksize(void);
#else
diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
index 6cb7d96ad9c7..336c5deea870 100644
--- a/arch/riscv/mm/cacheflush.c
+++ b/arch/riscv/mm/cacheflush.c
@@ -3,6 +3,8 @@
* Copyright (C) 2017 SiFive
*/

+#include <linux/of.h>
+#include <linux/of_device.h>
#include <asm/cacheflush.h>

#ifdef CONFIG_SMP
@@ -86,3 +88,40 @@ void flush_icache_pte(pte_t pte)
flush_icache_all();
}
#endif /* CONFIG_MMU */
+
+unsigned int riscv_cbom_block_size = L1_CACHE_BYTES;
+
+#ifdef CONFIG_RISCV_ISA_ZICBOM
+void riscv_init_cbom_blocksize(void)
+{
+ struct device_node *node;
+ int ret;
+ u32 val;
+
+ for_each_of_cpu_node(node) {
+ unsigned long hartid;
+ int cbom_hartid;
+
+ ret = riscv_of_processor_hartid(node, &hartid);
+ if (ret)
+ continue;
+
+ if (hartid < 0)
+ continue;
+
+ /* set block-size for cbom extension if available */
+ ret = of_property_read_u32(node, "riscv,cbom-block-size", &val);
+ if (ret)
+ continue;
+
+ if (!riscv_cbom_block_size) {
+ riscv_cbom_block_size = val;
+ cbom_hartid = hartid;
+ } else {
+ if (riscv_cbom_block_size != val)
+ pr_warn("cbom-block-size mismatched between harts %d and %lu\n",
+ cbom_hartid, hartid);
+ }
+ }
+}
+#endif
diff --git a/arch/riscv/mm/dma-noncoherent.c b/arch/riscv/mm/dma-noncoherent.c
index cd2225304c82..3f502a1a68b1 100644
--- a/arch/riscv/mm/dma-noncoherent.c
+++ b/arch/riscv/mm/dma-noncoherent.c
@@ -8,11 +8,8 @@
#include <linux/dma-direct.h>
#include <linux/dma-map-ops.h>
#include <linux/mm.h>
-#include <linux/of.h>
-#include <linux/of_device.h>
#include <asm/cacheflush.h>

-static unsigned int riscv_cbom_block_size = L1_CACHE_BYTES;
static bool noncoherent_supported;

void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
@@ -75,41 +72,6 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
dev->dma_coherent = coherent;
}

-#ifdef CONFIG_RISCV_ISA_ZICBOM
-void riscv_init_cbom_blocksize(void)
-{
- struct device_node *node;
- int ret;
- u32 val;
-
- for_each_of_cpu_node(node) {
- unsigned long hartid;
- int cbom_hartid;
-
- ret = riscv_of_processor_hartid(node, &hartid);
- if (ret)
- continue;
-
- if (hartid < 0)
- continue;
-
- /* set block-size for cbom extension if available */
- ret = of_property_read_u32(node, "riscv,cbom-block-size", &val);
- if (ret)
- continue;
-
- if (!riscv_cbom_block_size) {
- riscv_cbom_block_size = val;
- cbom_hartid = hartid;
- } else {
- if (riscv_cbom_block_size != val)
- pr_warn("cbom-block-size mismatched between harts %d and %lu\n",
- cbom_hartid, hartid);
- }
- }
-}
-#endif
-
void riscv_noncoherent_supported(void)
{
noncoherent_supported = true;
--
2.34.1

2022-08-30 04:50:44

by Anup Patel

[permalink] [raw]
Subject: [PATCH v2 1/4] RISC-V: Fix ioremap_cache() and ioremap_wc() for systems with Svpbmt

Currently, all flavors of ioremap_xyz() function maps to the generic
ioremap() which means any ioremap_xyz() call will always map the
target memory as IO using _PAGE_IOREMAP page attributes. This breaks
ioremap_cache() and ioremap_wc() on systems with Svpbmt because memory
remapped using ioremap_cache() and ioremap_wc() will use _PAGE_IOREMAP
page attributes.

To address above (just like other architectures), we implement RISC-V
specific ioremap_cache() and ioremap_wc() which maps memory using page
attributes as defined by the Svpbmt specification.

Fixes: ff689fd21cb1 ("riscv: add RISC-V Svpbmt extension support")
Co-developed-by: Mayuresh Chitale <[email protected]>
Signed-off-by: Mayuresh Chitale <[email protected]>
Signed-off-by: Anup Patel <[email protected]>
---
arch/riscv/include/asm/io.h | 10 ++++++++++
arch/riscv/include/asm/pgtable.h | 2 ++
2 files changed, 12 insertions(+)

diff --git a/arch/riscv/include/asm/io.h b/arch/riscv/include/asm/io.h
index 69605a474270..07ac63999575 100644
--- a/arch/riscv/include/asm/io.h
+++ b/arch/riscv/include/asm/io.h
@@ -133,6 +133,16 @@ __io_writes_outs(outs, u64, q, __io_pbr(), __io_paw())
#define outsq(addr, buffer, count) __outsq((void __iomem *)addr, buffer, count)
#endif

+#ifdef CONFIG_MMU
+#define ioremap_wc(addr, size) \
+ ioremap_prot((addr), (size), _PAGE_IOREMAP_WC)
+#endif
+
#include <asm-generic/io.h>

+#ifdef CONFIG_MMU
+#define ioremap_cache(addr, size) \
+ ioremap_prot((addr), (size), _PAGE_KERNEL)
+#endif
+
#endif /* _ASM_RISCV_IO_H */
diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
index 7ec936910a96..346b7c1a3eeb 100644
--- a/arch/riscv/include/asm/pgtable.h
+++ b/arch/riscv/include/asm/pgtable.h
@@ -182,6 +182,8 @@ extern struct pt_alloc_ops pt_ops __initdata;
#define PAGE_TABLE __pgprot(_PAGE_TABLE)

#define _PAGE_IOREMAP ((_PAGE_KERNEL & ~_PAGE_MTMASK) | _PAGE_IO)
+#define _PAGE_IOREMAP_WC ((_PAGE_KERNEL & ~_PAGE_MTMASK) | \
+ _PAGE_NOCACHE)
#define PAGE_KERNEL_IO __pgprot(_PAGE_IOREMAP)

extern pgd_t swapper_pg_dir[];
--
2.34.1

2022-08-30 05:04:12

by Anup Patel

[permalink] [raw]
Subject: [PATCH v2 4/4] RISC-V: Enable PMEM drivers

We now have PMEM arch support available in RISC-V kernel so let us
enable relevant drivers in defconfig.

Co-developed-by: Mayuresh Chitale <[email protected]>
Signed-off-by: Mayuresh Chitale <[email protected]>
Signed-off-by: Anup Patel <[email protected]>
---
arch/riscv/configs/defconfig | 1 +
1 file changed, 1 insertion(+)

diff --git a/arch/riscv/configs/defconfig b/arch/riscv/configs/defconfig
index aed332a9d4ea..010b673ebd11 100644
--- a/arch/riscv/configs/defconfig
+++ b/arch/riscv/configs/defconfig
@@ -159,6 +159,7 @@ CONFIG_VIRTIO_MMIO=y
CONFIG_RPMSG_CHAR=y
CONFIG_RPMSG_CTRL=y
CONFIG_RPMSG_VIRTIO=y
+CONFIG_LIBNVDIMM=y
CONFIG_EXT4_FS=y
CONFIG_EXT4_FS_POSIX_ACL=y
CONFIG_EXT4_FS_SECURITY=y
--
2.34.1

2022-08-30 05:06:14

by Anup Patel

[permalink] [raw]
Subject: [PATCH v2 3/4] RISC-V: Implement arch specific PMEM APIs

The NVDIMM PMEM driver expects arch specific APIs for cache maintenance
and if arch does not provide these APIs then NVDIMM PMEM driver will
always use MEMREMAP_WT to map persistent memory which in-turn maps as
UC memory type defined by the RISC-V Svpbmt specification.

Now that the Svpbmt and Zicbom support is available in RISC-V kernel,
we implement PMEM APIs using ALT_CMO_OP() macros so that the NVDIMM
PMEM driver can use MEMREMAP_WB to map persistent memory.

Co-developed-by: Mayuresh Chitale <[email protected]>
Signed-off-by: Mayuresh Chitale <[email protected]>
Signed-off-by: Anup Patel <[email protected]>
---
arch/riscv/Kconfig | 1 +
arch/riscv/mm/Makefile | 1 +
arch/riscv/mm/pmem.c | 21 +++++++++++++++++++++
3 files changed, 23 insertions(+)
create mode 100644 arch/riscv/mm/pmem.c

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 0ebd8da388d8..37d6370d29c3 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -25,6 +25,7 @@ config RISCV
select ARCH_HAS_GIGANTIC_PAGE
select ARCH_HAS_KCOV
select ARCH_HAS_MMIOWB
+ select ARCH_HAS_PMEM_API
select ARCH_HAS_PTE_SPECIAL
select ARCH_HAS_SET_DIRECT_MAP if MMU
select ARCH_HAS_SET_MEMORY if MMU
diff --git a/arch/riscv/mm/Makefile b/arch/riscv/mm/Makefile
index d76aabf4b94d..3b368e547f83 100644
--- a/arch/riscv/mm/Makefile
+++ b/arch/riscv/mm/Makefile
@@ -31,3 +31,4 @@ endif

obj-$(CONFIG_DEBUG_VIRTUAL) += physaddr.o
obj-$(CONFIG_RISCV_DMA_NONCOHERENT) += dma-noncoherent.o
+obj-$(CONFIG_ARCH_HAS_PMEM_API) += pmem.o
diff --git a/arch/riscv/mm/pmem.c b/arch/riscv/mm/pmem.c
new file mode 100644
index 000000000000..089df92ae876
--- /dev/null
+++ b/arch/riscv/mm/pmem.c
@@ -0,0 +1,21 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * Copyright (C) 2022 Ventana Micro Systems Inc.
+ */
+
+#include <linux/export.h>
+#include <linux/libnvdimm.h>
+
+#include <asm/cacheflush.h>
+
+void arch_wb_cache_pmem(void *addr, size_t size)
+{
+ ALT_CMO_OP(clean, addr, size, riscv_cbom_block_size);
+}
+EXPORT_SYMBOL_GPL(arch_wb_cache_pmem);
+
+void arch_invalidate_pmem(void *addr, size_t size)
+{
+ ALT_CMO_OP(inval, addr, size, riscv_cbom_block_size);
+}
+EXPORT_SYMBOL_GPL(arch_invalidate_pmem);
--
2.34.1

2022-09-01 15:42:48

by Heiko Stuebner

[permalink] [raw]
Subject: Re: [PATCH v2 2/4] RISC-V: Move riscv_init_cbom_blocksize() to cacheflush.c

Hi,

Am Dienstag, 30. August 2022, 06:46:40 CEST schrieb Anup Patel:
> The riscv_cbom_block_size parsing from DT belongs to cacheflush.c which
> is home for all cache maintenance related stuff so let us move the
> riscv_init_cbom_blocksize() and riscv_cbom_block_size to cacheflush.c.
>
> Co-developed-by: Mayuresh Chitale <[email protected]>
> Signed-off-by: Mayuresh Chitale <[email protected]>
> Signed-off-by: Anup Patel <[email protected]>

Makes a lot of sense to keep stuff together.

Reviewed-by: Heiko Stuebner <[email protected]>
Tested-by: Heiko Stuebner <[email protected]>


Also, can we handle this as fix patch?

I.e. Currently the t-head code somewhat relies on the default value
set to L1_CACHE_BYTES. The cache-block-size is static there.

Palmers upcoming patch reworking the parsing [0], will remove that default,
so having the riscv_cbom_block_size defined in the cacheflush header
will allow an easy fix by setting that value from the t-head errata init
for those cores.


Heiko

[0] https://lore.kernel.org/r/[email protected]

> ---
> arch/riscv/include/asm/cacheflush.h | 2 ++
> arch/riscv/mm/cacheflush.c | 39 +++++++++++++++++++++++++++++
> arch/riscv/mm/dma-noncoherent.c | 38 ----------------------------
> 3 files changed, 41 insertions(+), 38 deletions(-)
>
> diff --git a/arch/riscv/include/asm/cacheflush.h b/arch/riscv/include/asm/cacheflush.h
> index a60acaecfeda..de55d6b8deeb 100644
> --- a/arch/riscv/include/asm/cacheflush.h
> +++ b/arch/riscv/include/asm/cacheflush.h
> @@ -42,6 +42,8 @@ void flush_icache_mm(struct mm_struct *mm, bool local);
>
> #endif /* CONFIG_SMP */
>
> +extern unsigned int riscv_cbom_block_size;
> +
> #ifdef CONFIG_RISCV_ISA_ZICBOM
> void riscv_init_cbom_blocksize(void);
> #else
> diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
> index 6cb7d96ad9c7..336c5deea870 100644
> --- a/arch/riscv/mm/cacheflush.c
> +++ b/arch/riscv/mm/cacheflush.c
> @@ -3,6 +3,8 @@
> * Copyright (C) 2017 SiFive
> */
>
> +#include <linux/of.h>
> +#include <linux/of_device.h>
> #include <asm/cacheflush.h>
>
> #ifdef CONFIG_SMP
> @@ -86,3 +88,40 @@ void flush_icache_pte(pte_t pte)
> flush_icache_all();
> }
> #endif /* CONFIG_MMU */
> +
> +unsigned int riscv_cbom_block_size = L1_CACHE_BYTES;
> +
> +#ifdef CONFIG_RISCV_ISA_ZICBOM
> +void riscv_init_cbom_blocksize(void)
> +{
> + struct device_node *node;
> + int ret;
> + u32 val;
> +
> + for_each_of_cpu_node(node) {
> + unsigned long hartid;
> + int cbom_hartid;
> +
> + ret = riscv_of_processor_hartid(node, &hartid);
> + if (ret)
> + continue;
> +
> + if (hartid < 0)
> + continue;
> +
> + /* set block-size for cbom extension if available */
> + ret = of_property_read_u32(node, "riscv,cbom-block-size", &val);
> + if (ret)
> + continue;
> +
> + if (!riscv_cbom_block_size) {
> + riscv_cbom_block_size = val;
> + cbom_hartid = hartid;
> + } else {
> + if (riscv_cbom_block_size != val)
> + pr_warn("cbom-block-size mismatched between harts %d and %lu\n",
> + cbom_hartid, hartid);
> + }
> + }
> +}
> +#endif
> diff --git a/arch/riscv/mm/dma-noncoherent.c b/arch/riscv/mm/dma-noncoherent.c
> index cd2225304c82..3f502a1a68b1 100644
> --- a/arch/riscv/mm/dma-noncoherent.c
> +++ b/arch/riscv/mm/dma-noncoherent.c
> @@ -8,11 +8,8 @@
> #include <linux/dma-direct.h>
> #include <linux/dma-map-ops.h>
> #include <linux/mm.h>
> -#include <linux/of.h>
> -#include <linux/of_device.h>
> #include <asm/cacheflush.h>
>
> -static unsigned int riscv_cbom_block_size = L1_CACHE_BYTES;
> static bool noncoherent_supported;
>
> void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
> @@ -75,41 +72,6 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
> dev->dma_coherent = coherent;
> }
>
> -#ifdef CONFIG_RISCV_ISA_ZICBOM
> -void riscv_init_cbom_blocksize(void)
> -{
> - struct device_node *node;
> - int ret;
> - u32 val;
> -
> - for_each_of_cpu_node(node) {
> - unsigned long hartid;
> - int cbom_hartid;
> -
> - ret = riscv_of_processor_hartid(node, &hartid);
> - if (ret)
> - continue;
> -
> - if (hartid < 0)
> - continue;
> -
> - /* set block-size for cbom extension if available */
> - ret = of_property_read_u32(node, "riscv,cbom-block-size", &val);
> - if (ret)
> - continue;
> -
> - if (!riscv_cbom_block_size) {
> - riscv_cbom_block_size = val;
> - cbom_hartid = hartid;
> - } else {
> - if (riscv_cbom_block_size != val)
> - pr_warn("cbom-block-size mismatched between harts %d and %lu\n",
> - cbom_hartid, hartid);
> - }
> - }
> -}
> -#endif
> -
> void riscv_noncoherent_supported(void)
> {
> noncoherent_supported = true;
>




2022-09-01 16:31:20

by Heiko Stuebner

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] RISC-V: Fix ioremap_cache() and ioremap_wc() for systems with Svpbmt

Am Dienstag, 30. August 2022, 06:46:39 CEST schrieb Anup Patel:
> Currently, all flavors of ioremap_xyz() function maps to the generic
> ioremap() which means any ioremap_xyz() call will always map the
> target memory as IO using _PAGE_IOREMAP page attributes. This breaks
> ioremap_cache() and ioremap_wc() on systems with Svpbmt because memory
> remapped using ioremap_cache() and ioremap_wc() will use _PAGE_IOREMAP
> page attributes.
>
> To address above (just like other architectures), we implement RISC-V
> specific ioremap_cache() and ioremap_wc() which maps memory using page
> attributes as defined by the Svpbmt specification.
>
> Fixes: ff689fd21cb1 ("riscv: add RISC-V Svpbmt extension support")
> Co-developed-by: Mayuresh Chitale <[email protected]>
> Signed-off-by: Mayuresh Chitale <[email protected]>
> Signed-off-by: Anup Patel <[email protected]>

Reviewed-by: Heiko Stuebner <[email protected]>

And at least on a qemu
Tested-by: Heiko Stuebner <[email protected]>

But it was running there before as well of course.

Heiko

> ---
> arch/riscv/include/asm/io.h | 10 ++++++++++
> arch/riscv/include/asm/pgtable.h | 2 ++
> 2 files changed, 12 insertions(+)
>
> diff --git a/arch/riscv/include/asm/io.h b/arch/riscv/include/asm/io.h
> index 69605a474270..07ac63999575 100644
> --- a/arch/riscv/include/asm/io.h
> +++ b/arch/riscv/include/asm/io.h
> @@ -133,6 +133,16 @@ __io_writes_outs(outs, u64, q, __io_pbr(), __io_paw())
> #define outsq(addr, buffer, count) __outsq((void __iomem *)addr, buffer, count)
> #endif
>
> +#ifdef CONFIG_MMU
> +#define ioremap_wc(addr, size) \
> + ioremap_prot((addr), (size), _PAGE_IOREMAP_WC)
> +#endif
> +
> #include <asm-generic/io.h>
>
> +#ifdef CONFIG_MMU
> +#define ioremap_cache(addr, size) \
> + ioremap_prot((addr), (size), _PAGE_KERNEL)
> +#endif
> +
> #endif /* _ASM_RISCV_IO_H */
> diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
> index 7ec936910a96..346b7c1a3eeb 100644
> --- a/arch/riscv/include/asm/pgtable.h
> +++ b/arch/riscv/include/asm/pgtable.h
> @@ -182,6 +182,8 @@ extern struct pt_alloc_ops pt_ops __initdata;
> #define PAGE_TABLE __pgprot(_PAGE_TABLE)
>
> #define _PAGE_IOREMAP ((_PAGE_KERNEL & ~_PAGE_MTMASK) | _PAGE_IO)
> +#define _PAGE_IOREMAP_WC ((_PAGE_KERNEL & ~_PAGE_MTMASK) | \
> + _PAGE_NOCACHE)
> #define PAGE_KERNEL_IO __pgprot(_PAGE_IOREMAP)
>
> extern pgd_t swapper_pg_dir[];
>




2022-09-01 16:38:48

by Conor Dooley

[permalink] [raw]
Subject: Re: [PATCH v2 4/4] RISC-V: Enable PMEM drivers

On 30/08/2022 05:46, Anup Patel wrote:
> EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
>
> We now have PMEM arch support available in RISC-V kernel so let us
> enable relevant drivers in defconfig.
>
> Co-developed-by: Mayuresh Chitale <[email protected]>
> Signed-off-by: Mayuresh Chitale <[email protected]>
> Signed-off-by: Anup Patel <[email protected]>

Bunch of new build time warnings related to pmem w/ this patch
but they all seem to be problems in the nvdimm core code that've
just been exposed here...

Here's another name to add to your single-line defconfig change:
Reviewed-by: Conor Dooley <[email protected]>

> ---
> arch/riscv/configs/defconfig | 1 +
> 1 file changed, 1 insertion(+)
>
> diff --git a/arch/riscv/configs/defconfig b/arch/riscv/configs/defconfig
> index aed332a9d4ea..010b673ebd11 100644
> --- a/arch/riscv/configs/defconfig
> +++ b/arch/riscv/configs/defconfig
> @@ -159,6 +159,7 @@ CONFIG_VIRTIO_MMIO=y
> CONFIG_RPMSG_CHAR=y
> CONFIG_RPMSG_CTRL=y
> CONFIG_RPMSG_VIRTIO=y
> +CONFIG_LIBNVDIMM=y
> CONFIG_EXT4_FS=y
> CONFIG_EXT4_FS_POSIX_ACL=y
> CONFIG_EXT4_FS_SECURITY=y
> --
> 2.34.1
>

2022-09-01 16:45:14

by Conor Dooley

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] RISC-V: Fix ioremap_cache() and ioremap_wc() for systems with Svpbmt

On 30/08/2022 05:46, Anup Patel wrote:
> Currently, all flavors of ioremap_xyz() function maps to the generic
> ioremap() which means any ioremap_xyz() call will always map the
> target memory as IO using _PAGE_IOREMAP page attributes. This breaks
> ioremap_cache() and ioremap_wc() on systems with Svpbmt because memory
> remapped using ioremap_cache() and ioremap_wc() will use _PAGE_IOREMAP
> page attributes.
>
> To address above (just like other architectures), we implement RISC-V
> specific ioremap_cache() and ioremap_wc() which maps memory using page
> attributes as defined by the Svpbmt specification.
>
> Fixes: ff689fd21cb1 ("riscv: add RISC-V Svpbmt extension support")
> Co-developed-by: Mayuresh Chitale <[email protected]>
> Signed-off-by: Mayuresh Chitale <[email protected]>
> Signed-off-by: Anup Patel <[email protected]>

Seems sane to me too :)
Reviewed-by: Conor Dooley <[email protected]>

> ---
> arch/riscv/include/asm/io.h | 10 ++++++++++
> arch/riscv/include/asm/pgtable.h | 2 ++
> 2 files changed, 12 insertions(+)
>
> diff --git a/arch/riscv/include/asm/io.h b/arch/riscv/include/asm/io.h
> index 69605a474270..07ac63999575 100644
> --- a/arch/riscv/include/asm/io.h
> +++ b/arch/riscv/include/asm/io.h
> @@ -133,6 +133,16 @@ __io_writes_outs(outs, u64, q, __io_pbr(), __io_paw())
> #define outsq(addr, buffer, count) __outsq((void __iomem *)addr, buffer, count)
> #endif
>
> +#ifdef CONFIG_MMU
> +#define ioremap_wc(addr, size) \
> + ioremap_prot((addr), (size), _PAGE_IOREMAP_WC)
> +#endif
> +
> #include <asm-generic/io.h>
>
> +#ifdef CONFIG_MMU
> +#define ioremap_cache(addr, size) \
> + ioremap_prot((addr), (size), _PAGE_KERNEL)
> +#endif
> +
> #endif /* _ASM_RISCV_IO_H */
> diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
> index 7ec936910a96..346b7c1a3eeb 100644
> --- a/arch/riscv/include/asm/pgtable.h
> +++ b/arch/riscv/include/asm/pgtable.h
> @@ -182,6 +182,8 @@ extern struct pt_alloc_ops pt_ops __initdata;
> #define PAGE_TABLE __pgprot(_PAGE_TABLE)
>
> #define _PAGE_IOREMAP ((_PAGE_KERNEL & ~_PAGE_MTMASK) | _PAGE_IO)
> +#define _PAGE_IOREMAP_WC ((_PAGE_KERNEL & ~_PAGE_MTMASK) | \
> + _PAGE_NOCACHE)
> #define PAGE_KERNEL_IO __pgprot(_PAGE_IOREMAP)
>
> extern pgd_t swapper_pg_dir[];

2022-09-01 16:49:14

by Heiko Stuebner

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] RISC-V: Implement arch specific PMEM APIs

Hi Anup,

Am Dienstag, 30. August 2022, 06:46:41 CEST schrieb Anup Patel:
> The NVDIMM PMEM driver expects arch specific APIs for cache maintenance
> and if arch does not provide these APIs then NVDIMM PMEM driver will
> always use MEMREMAP_WT to map persistent memory which in-turn maps as
> UC memory type defined by the RISC-V Svpbmt specification.
>
> Now that the Svpbmt and Zicbom support is available in RISC-V kernel,
> we implement PMEM APIs using ALT_CMO_OP() macros so that the NVDIMM
> PMEM driver can use MEMREMAP_WB to map persistent memory.

Zicbom is detected at runtime, though that kconfig setting changes the
behaviour for the memremap-type at compile-time. So what happens on
systems not using zicbom (or another cmo-variant) ?


Heiko

> Co-developed-by: Mayuresh Chitale <[email protected]>
> Signed-off-by: Mayuresh Chitale <[email protected]>
> Signed-off-by: Anup Patel <[email protected]>
> ---
> arch/riscv/Kconfig | 1 +
> arch/riscv/mm/Makefile | 1 +
> arch/riscv/mm/pmem.c | 21 +++++++++++++++++++++
> 3 files changed, 23 insertions(+)
> create mode 100644 arch/riscv/mm/pmem.c
>
> diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> index 0ebd8da388d8..37d6370d29c3 100644
> --- a/arch/riscv/Kconfig
> +++ b/arch/riscv/Kconfig
> @@ -25,6 +25,7 @@ config RISCV
> select ARCH_HAS_GIGANTIC_PAGE
> select ARCH_HAS_KCOV
> select ARCH_HAS_MMIOWB
> + select ARCH_HAS_PMEM_API
> select ARCH_HAS_PTE_SPECIAL
> select ARCH_HAS_SET_DIRECT_MAP if MMU
> select ARCH_HAS_SET_MEMORY if MMU
> diff --git a/arch/riscv/mm/Makefile b/arch/riscv/mm/Makefile
> index d76aabf4b94d..3b368e547f83 100644
> --- a/arch/riscv/mm/Makefile
> +++ b/arch/riscv/mm/Makefile
> @@ -31,3 +31,4 @@ endif
>
> obj-$(CONFIG_DEBUG_VIRTUAL) += physaddr.o
> obj-$(CONFIG_RISCV_DMA_NONCOHERENT) += dma-noncoherent.o
> +obj-$(CONFIG_ARCH_HAS_PMEM_API) += pmem.o
> diff --git a/arch/riscv/mm/pmem.c b/arch/riscv/mm/pmem.c
> new file mode 100644
> index 000000000000..089df92ae876
> --- /dev/null
> +++ b/arch/riscv/mm/pmem.c
> @@ -0,0 +1,21 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * Copyright (C) 2022 Ventana Micro Systems Inc.
> + */
> +
> +#include <linux/export.h>
> +#include <linux/libnvdimm.h>
> +
> +#include <asm/cacheflush.h>
> +
> +void arch_wb_cache_pmem(void *addr, size_t size)
> +{
> + ALT_CMO_OP(clean, addr, size, riscv_cbom_block_size);
> +}
> +EXPORT_SYMBOL_GPL(arch_wb_cache_pmem);
> +
> +void arch_invalidate_pmem(void *addr, size_t size)
> +{
> + ALT_CMO_OP(inval, addr, size, riscv_cbom_block_size);
> +}
> +EXPORT_SYMBOL_GPL(arch_invalidate_pmem);
>




2022-09-01 17:11:01

by Conor Dooley

[permalink] [raw]
Subject: Re: [PATCH v2 2/4] RISC-V: Move riscv_init_cbom_blocksize() to cacheflush.c

On 01/09/2022 16:29, Heiko Stübner wrote:
> Hi,
>
> Am Dienstag, 30. August 2022, 06:46:40 CEST schrieb Anup Patel:
>> The riscv_cbom_block_size parsing from DT belongs to cacheflush.c which
>> is home for all cache maintenance related stuff so let us move the
>> riscv_init_cbom_blocksize() and riscv_cbom_block_size to cacheflush.c.
>>
>> Co-developed-by: Mayuresh Chitale <[email protected]>
>> Signed-off-by: Mayuresh Chitale <[email protected]>
>> Signed-off-by: Anup Patel <[email protected]>
>
> Makes a lot of sense to keep stuff together.
>
> Reviewed-by: Heiko Stuebner <[email protected]>
> Tested-by: Heiko Stuebner <[email protected]>
>
>
> Also, can we handle this as fix patch?
>
> I.e. Currently the t-head code somewhat relies on the default value
> set to L1_CACHE_BYTES. The cache-block-size is static there.
>
> Palmers upcoming patch reworking the parsing [0], will remove that default,
> so having the riscv_cbom_block_size defined in the cacheflush header
> will allow an easy fix by setting that value from the t-head errata init
> for those cores.

@Palmer what is the status of that fix?
Since I have a clang toolchain set up, I could squash my fixup-for-clang into
your patch and respin it as a real patch for Anup to base on?

Anyway, I too like this cleanup:
Reviewed-by: Conor Dooley <[email protected]>

>
>
> Heiko
>
> [0] https://lore.kernel.org/r/[email protected]
>
>> ---
>> arch/riscv/include/asm/cacheflush.h | 2 ++
>> arch/riscv/mm/cacheflush.c | 39 +++++++++++++++++++++++++++++
>> arch/riscv/mm/dma-noncoherent.c | 38 ----------------------------
>> 3 files changed, 41 insertions(+), 38 deletions(-)
>>
>> diff --git a/arch/riscv/include/asm/cacheflush.h b/arch/riscv/include/asm/cacheflush.h
>> index a60acaecfeda..de55d6b8deeb 100644
>> --- a/arch/riscv/include/asm/cacheflush.h
>> +++ b/arch/riscv/include/asm/cacheflush.h
>> @@ -42,6 +42,8 @@ void flush_icache_mm(struct mm_struct *mm, bool local);
>>
>> #endif /* CONFIG_SMP */
>>
>> +extern unsigned int riscv_cbom_block_size;
>> +
>> #ifdef CONFIG_RISCV_ISA_ZICBOM
>> void riscv_init_cbom_blocksize(void);
>> #else
>> diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
>> index 6cb7d96ad9c7..336c5deea870 100644
>> --- a/arch/riscv/mm/cacheflush.c
>> +++ b/arch/riscv/mm/cacheflush.c
>> @@ -3,6 +3,8 @@
>> * Copyright (C) 2017 SiFive
>> */
>>
>> +#include <linux/of.h>
>> +#include <linux/of_device.h>
>> #include <asm/cacheflush.h>
>>
>> #ifdef CONFIG_SMP
>> @@ -86,3 +88,40 @@ void flush_icache_pte(pte_t pte)
>> flush_icache_all();
>> }
>> #endif /* CONFIG_MMU */
>> +
>> +unsigned int riscv_cbom_block_size = L1_CACHE_BYTES;
>> +
>> +#ifdef CONFIG_RISCV_ISA_ZICBOM
>> +void riscv_init_cbom_blocksize(void)
>> +{
>> + struct device_node *node;
>> + int ret;
>> + u32 val;
>> +
>> + for_each_of_cpu_node(node) {
>> + unsigned long hartid;
>> + int cbom_hartid;
>> +
>> + ret = riscv_of_processor_hartid(node, &hartid);
>> + if (ret)
>> + continue;
>> +
>> + if (hartid < 0)
>> + continue;
>> +
>> + /* set block-size for cbom extension if available */
>> + ret = of_property_read_u32(node, "riscv,cbom-block-size", &val);
>> + if (ret)
>> + continue;
>> +
>> + if (!riscv_cbom_block_size) {
>> + riscv_cbom_block_size = val;
>> + cbom_hartid = hartid;
>> + } else {
>> + if (riscv_cbom_block_size != val)
>> + pr_warn("cbom-block-size mismatched between harts %d and %lu\n",
>> + cbom_hartid, hartid);
>> + }
>> + }
>> +}
>> +#endif
>> diff --git a/arch/riscv/mm/dma-noncoherent.c b/arch/riscv/mm/dma-noncoherent.c
>> index cd2225304c82..3f502a1a68b1 100644
>> --- a/arch/riscv/mm/dma-noncoherent.c
>> +++ b/arch/riscv/mm/dma-noncoherent.c
>> @@ -8,11 +8,8 @@
>> #include <linux/dma-direct.h>
>> #include <linux/dma-map-ops.h>
>> #include <linux/mm.h>
>> -#include <linux/of.h>
>> -#include <linux/of_device.h>
>> #include <asm/cacheflush.h>
>>
>> -static unsigned int riscv_cbom_block_size = L1_CACHE_BYTES;
>> static bool noncoherent_supported;
>>
>> void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
>> @@ -75,41 +72,6 @@ void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
>> dev->dma_coherent = coherent;
>> }
>>
>> -#ifdef CONFIG_RISCV_ISA_ZICBOM
>> -void riscv_init_cbom_blocksize(void)
>> -{
>> - struct device_node *node;
>> - int ret;
>> - u32 val;
>> -
>> - for_each_of_cpu_node(node) {
>> - unsigned long hartid;
>> - int cbom_hartid;
>> -
>> - ret = riscv_of_processor_hartid(node, &hartid);
>> - if (ret)
>> - continue;
>> -
>> - if (hartid < 0)
>> - continue;
>> -
>> - /* set block-size for cbom extension if available */
>> - ret = of_property_read_u32(node, "riscv,cbom-block-size", &val);
>> - if (ret)
>> - continue;
>> -
>> - if (!riscv_cbom_block_size) {
>> - riscv_cbom_block_size = val;
>> - cbom_hartid = hartid;
>> - } else {
>> - if (riscv_cbom_block_size != val)
>> - pr_warn("cbom-block-size mismatched between harts %d and %lu\n",
>> - cbom_hartid, hartid);
>> - }
>> - }
>> -}
>> -#endif
>> -
>> void riscv_noncoherent_supported(void)
>> {
>> noncoherent_supported = true;
>>
>
>
>
>
>
> _______________________________________________
> linux-riscv mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/linux-riscv

2022-09-03 16:15:21

by Anup Patel

[permalink] [raw]
Subject: Re: [PATCH v2 3/4] RISC-V: Implement arch specific PMEM APIs

On Thu, Sep 1, 2022 at 9:08 PM Heiko Stübner <[email protected]> wrote:
>
> Hi Anup,
>
> Am Dienstag, 30. August 2022, 06:46:41 CEST schrieb Anup Patel:
> > The NVDIMM PMEM driver expects arch specific APIs for cache maintenance
> > and if arch does not provide these APIs then NVDIMM PMEM driver will
> > always use MEMREMAP_WT to map persistent memory which in-turn maps as
> > UC memory type defined by the RISC-V Svpbmt specification.
> >
> > Now that the Svpbmt and Zicbom support is available in RISC-V kernel,
> > we implement PMEM APIs using ALT_CMO_OP() macros so that the NVDIMM
> > PMEM driver can use MEMREMAP_WB to map persistent memory.
>
> Zicbom is detected at runtime, though that kconfig setting changes the
> behaviour for the memremap-type at compile-time. So what happens on
> systems not using zicbom (or another cmo-variant) ?

On a system without Zicbom (or some other cmo-variant), the PMEM read
will always work but PMEM writes will not be reliable.

Currently, the generic PMEM driver has no provision to allow arch code to
disable cacheable mapping at boot-time.

Maybe we can add WARN_ONCE() for the case when arch_xyz_pmem()
is called on a system not having Zicbom ?

Regards,
Anup

>
>
> Heiko
>
> > Co-developed-by: Mayuresh Chitale <[email protected]>
> > Signed-off-by: Mayuresh Chitale <[email protected]>
> > Signed-off-by: Anup Patel <[email protected]>
> > ---
> > arch/riscv/Kconfig | 1 +
> > arch/riscv/mm/Makefile | 1 +
> > arch/riscv/mm/pmem.c | 21 +++++++++++++++++++++
> > 3 files changed, 23 insertions(+)
> > create mode 100644 arch/riscv/mm/pmem.c
> >
> > diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
> > index 0ebd8da388d8..37d6370d29c3 100644
> > --- a/arch/riscv/Kconfig
> > +++ b/arch/riscv/Kconfig
> > @@ -25,6 +25,7 @@ config RISCV
> > select ARCH_HAS_GIGANTIC_PAGE
> > select ARCH_HAS_KCOV
> > select ARCH_HAS_MMIOWB
> > + select ARCH_HAS_PMEM_API
> > select ARCH_HAS_PTE_SPECIAL
> > select ARCH_HAS_SET_DIRECT_MAP if MMU
> > select ARCH_HAS_SET_MEMORY if MMU
> > diff --git a/arch/riscv/mm/Makefile b/arch/riscv/mm/Makefile
> > index d76aabf4b94d..3b368e547f83 100644
> > --- a/arch/riscv/mm/Makefile
> > +++ b/arch/riscv/mm/Makefile
> > @@ -31,3 +31,4 @@ endif
> >
> > obj-$(CONFIG_DEBUG_VIRTUAL) += physaddr.o
> > obj-$(CONFIG_RISCV_DMA_NONCOHERENT) += dma-noncoherent.o
> > +obj-$(CONFIG_ARCH_HAS_PMEM_API) += pmem.o
> > diff --git a/arch/riscv/mm/pmem.c b/arch/riscv/mm/pmem.c
> > new file mode 100644
> > index 000000000000..089df92ae876
> > --- /dev/null
> > +++ b/arch/riscv/mm/pmem.c
> > @@ -0,0 +1,21 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * Copyright (C) 2022 Ventana Micro Systems Inc.
> > + */
> > +
> > +#include <linux/export.h>
> > +#include <linux/libnvdimm.h>
> > +
> > +#include <asm/cacheflush.h>
> > +
> > +void arch_wb_cache_pmem(void *addr, size_t size)
> > +{
> > + ALT_CMO_OP(clean, addr, size, riscv_cbom_block_size);
> > +}
> > +EXPORT_SYMBOL_GPL(arch_wb_cache_pmem);
> > +
> > +void arch_invalidate_pmem(void *addr, size_t size)
> > +{
> > + ALT_CMO_OP(inval, addr, size, riscv_cbom_block_size);
> > +}
> > +EXPORT_SYMBOL_GPL(arch_invalidate_pmem);
> >
>
>
>
>

2022-09-09 08:12:48

by Anup Patel

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] RISC-V: Fix ioremap_cache() and ioremap_wc() for systems with Svpbmt

Hi Palmer,

On Tue, Aug 30, 2022 at 10:17 AM Anup Patel <[email protected]> wrote:
>
> Currently, all flavors of ioremap_xyz() function maps to the generic
> ioremap() which means any ioremap_xyz() call will always map the
> target memory as IO using _PAGE_IOREMAP page attributes. This breaks
> ioremap_cache() and ioremap_wc() on systems with Svpbmt because memory
> remapped using ioremap_cache() and ioremap_wc() will use _PAGE_IOREMAP
> page attributes.
>
> To address above (just like other architectures), we implement RISC-V
> specific ioremap_cache() and ioremap_wc() which maps memory using page
> attributes as defined by the Svpbmt specification.
>
> Fixes: ff689fd21cb1 ("riscv: add RISC-V Svpbmt extension support")
> Co-developed-by: Mayuresh Chitale <[email protected]>
> Signed-off-by: Mayuresh Chitale <[email protected]>
> Signed-off-by: Anup Patel <[email protected]>

Can you please consider this patch for Linux-6.0-rc5 ?

Regards,
Anup

> ---
> arch/riscv/include/asm/io.h | 10 ++++++++++
> arch/riscv/include/asm/pgtable.h | 2 ++
> 2 files changed, 12 insertions(+)
>
> diff --git a/arch/riscv/include/asm/io.h b/arch/riscv/include/asm/io.h
> index 69605a474270..07ac63999575 100644
> --- a/arch/riscv/include/asm/io.h
> +++ b/arch/riscv/include/asm/io.h
> @@ -133,6 +133,16 @@ __io_writes_outs(outs, u64, q, __io_pbr(), __io_paw())
> #define outsq(addr, buffer, count) __outsq((void __iomem *)addr, buffer, count)
> #endif
>
> +#ifdef CONFIG_MMU
> +#define ioremap_wc(addr, size) \
> + ioremap_prot((addr), (size), _PAGE_IOREMAP_WC)
> +#endif
> +
> #include <asm-generic/io.h>
>
> +#ifdef CONFIG_MMU
> +#define ioremap_cache(addr, size) \
> + ioremap_prot((addr), (size), _PAGE_KERNEL)
> +#endif
> +
> #endif /* _ASM_RISCV_IO_H */
> diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
> index 7ec936910a96..346b7c1a3eeb 100644
> --- a/arch/riscv/include/asm/pgtable.h
> +++ b/arch/riscv/include/asm/pgtable.h
> @@ -182,6 +182,8 @@ extern struct pt_alloc_ops pt_ops __initdata;
> #define PAGE_TABLE __pgprot(_PAGE_TABLE)
>
> #define _PAGE_IOREMAP ((_PAGE_KERNEL & ~_PAGE_MTMASK) | _PAGE_IO)
> +#define _PAGE_IOREMAP_WC ((_PAGE_KERNEL & ~_PAGE_MTMASK) | \
> + _PAGE_NOCACHE)
> #define PAGE_KERNEL_IO __pgprot(_PAGE_IOREMAP)
>
> extern pgd_t swapper_pg_dir[];
> --
> 2.34.1
>

2022-09-16 02:52:40

by Anup Patel

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] RISC-V: Fix ioremap_cache() and ioremap_wc() for systems with Svpbmt

Hi Palmer,

On Tue, Aug 30, 2022 at 10:17 AM Anup Patel <[email protected]> wrote:
>
> Currently, all flavors of ioremap_xyz() function maps to the generic
> ioremap() which means any ioremap_xyz() call will always map the
> target memory as IO using _PAGE_IOREMAP page attributes. This breaks
> ioremap_cache() and ioremap_wc() on systems with Svpbmt because memory
> remapped using ioremap_cache() and ioremap_wc() will use _PAGE_IOREMAP
> page attributes.
>
> To address above (just like other architectures), we implement RISC-V
> specific ioremap_cache() and ioremap_wc() which maps memory using page
> attributes as defined by the Svpbmt specification.
>
> Fixes: ff689fd21cb1 ("riscv: add RISC-V Svpbmt extension support")
> Co-developed-by: Mayuresh Chitale <[email protected]>
> Signed-off-by: Mayuresh Chitale <[email protected]>
> Signed-off-by: Anup Patel <[email protected]>

This is a crucial RC fix. Can you please take this ?

Regards,
Anup

> ---
> arch/riscv/include/asm/io.h | 10 ++++++++++
> arch/riscv/include/asm/pgtable.h | 2 ++
> 2 files changed, 12 insertions(+)
>
> diff --git a/arch/riscv/include/asm/io.h b/arch/riscv/include/asm/io.h
> index 69605a474270..07ac63999575 100644
> --- a/arch/riscv/include/asm/io.h
> +++ b/arch/riscv/include/asm/io.h
> @@ -133,6 +133,16 @@ __io_writes_outs(outs, u64, q, __io_pbr(), __io_paw())
> #define outsq(addr, buffer, count) __outsq((void __iomem *)addr, buffer, count)
> #endif
>
> +#ifdef CONFIG_MMU
> +#define ioremap_wc(addr, size) \
> + ioremap_prot((addr), (size), _PAGE_IOREMAP_WC)
> +#endif
> +
> #include <asm-generic/io.h>
>
> +#ifdef CONFIG_MMU
> +#define ioremap_cache(addr, size) \
> + ioremap_prot((addr), (size), _PAGE_KERNEL)
> +#endif
> +
> #endif /* _ASM_RISCV_IO_H */
> diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
> index 7ec936910a96..346b7c1a3eeb 100644
> --- a/arch/riscv/include/asm/pgtable.h
> +++ b/arch/riscv/include/asm/pgtable.h
> @@ -182,6 +182,8 @@ extern struct pt_alloc_ops pt_ops __initdata;
> #define PAGE_TABLE __pgprot(_PAGE_TABLE)
>
> #define _PAGE_IOREMAP ((_PAGE_KERNEL & ~_PAGE_MTMASK) | _PAGE_IO)
> +#define _PAGE_IOREMAP_WC ((_PAGE_KERNEL & ~_PAGE_MTMASK) | \
> + _PAGE_NOCACHE)
> #define PAGE_KERNEL_IO __pgprot(_PAGE_IOREMAP)
>
> extern pgd_t swapper_pg_dir[];
> --
> 2.34.1
>

2022-09-22 16:39:10

by Palmer Dabbelt

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] RISC-V: Fix ioremap_cache() and ioremap_wc() for systems with Svpbmt

On Thu, 15 Sep 2022 19:24:55 PDT (-0700), [email protected] wrote:
> Hi Palmer,
>
> On Tue, Aug 30, 2022 at 10:17 AM Anup Patel <[email protected]> wrote:
>>
>> Currently, all flavors of ioremap_xyz() function maps to the generic
>> ioremap() which means any ioremap_xyz() call will always map the
>> target memory as IO using _PAGE_IOREMAP page attributes. This breaks
>> ioremap_cache() and ioremap_wc() on systems with Svpbmt because memory
>> remapped using ioremap_cache() and ioremap_wc() will use _PAGE_IOREMAP
>> page attributes.
>>
>> To address above (just like other architectures), we implement RISC-V
>> specific ioremap_cache() and ioremap_wc() which maps memory using page
>> attributes as defined by the Svpbmt specification.
>>
>> Fixes: ff689fd21cb1 ("riscv: add RISC-V Svpbmt extension support")
>> Co-developed-by: Mayuresh Chitale <[email protected]>
>> Signed-off-by: Mayuresh Chitale <[email protected]>
>> Signed-off-by: Anup Patel <[email protected]>
>
> This is a crucial RC fix. Can you please take this ?

Sorry I missed this, I thought it was just part of the rest of this
patch set. That said, I'm not actually sure this is a critical fix:
sure it's a performance problem, and if some driver is expecting
ioremap_cache() to go fast then possibly a pretty big one, but the only
Svpmbt hardware that exists is the D1 and that was just supported this
release so it's not a regression. Maybe that's a bit pedantic, but all
this travel has kind of made things a mess and I'm trying to make sure
nothing goes off the rails.

Probably a pointless distinction as it'll just get backported anyway,
but I'm going to hold off on this for now -- it generally looks OK, but
I don't get back until this weekend and I'm super tired so I'm trying to
avoid screwing anything up.

>
> Regards,
> Anup
>
>> ---
>> arch/riscv/include/asm/io.h | 10 ++++++++++
>> arch/riscv/include/asm/pgtable.h | 2 ++
>> 2 files changed, 12 insertions(+)
>>
>> diff --git a/arch/riscv/include/asm/io.h b/arch/riscv/include/asm/io.h
>> index 69605a474270..07ac63999575 100644
>> --- a/arch/riscv/include/asm/io.h
>> +++ b/arch/riscv/include/asm/io.h
>> @@ -133,6 +133,16 @@ __io_writes_outs(outs, u64, q, __io_pbr(), __io_paw())
>> #define outsq(addr, buffer, count) __outsq((void __iomem *)addr, buffer, count)
>> #endif
>>
>> +#ifdef CONFIG_MMU
>> +#define ioremap_wc(addr, size) \
>> + ioremap_prot((addr), (size), _PAGE_IOREMAP_WC)
>> +#endif
>> +
>> #include <asm-generic/io.h>
>>
>> +#ifdef CONFIG_MMU
>> +#define ioremap_cache(addr, size) \
>> + ioremap_prot((addr), (size), _PAGE_KERNEL)
>> +#endif
>> +
>> #endif /* _ASM_RISCV_IO_H */
>> diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
>> index 7ec936910a96..346b7c1a3eeb 100644
>> --- a/arch/riscv/include/asm/pgtable.h
>> +++ b/arch/riscv/include/asm/pgtable.h
>> @@ -182,6 +182,8 @@ extern struct pt_alloc_ops pt_ops __initdata;
>> #define PAGE_TABLE __pgprot(_PAGE_TABLE)
>>
>> #define _PAGE_IOREMAP ((_PAGE_KERNEL & ~_PAGE_MTMASK) | _PAGE_IO)
>> +#define _PAGE_IOREMAP_WC ((_PAGE_KERNEL & ~_PAGE_MTMASK) | \
>> + _PAGE_NOCACHE)
>> #define PAGE_KERNEL_IO __pgprot(_PAGE_IOREMAP)
>>
>> extern pgd_t swapper_pg_dir[];
>> --
>> 2.34.1
>>

2022-09-23 11:06:02

by Palmer Dabbelt

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] RISC-V: Fix ioremap_cache() and ioremap_wc() for systems with Svpbmt

On Fri, 23 Sep 2022 03:35:50 PDT (-0700), Arnd Bergmann wrote:
> On Thu, Sep 22, 2022, at 6:35 PM, Palmer Dabbelt wrote:
>> On Thu, 15 Sep 2022 19:24:55 PDT (-0700), [email protected] wrote:
>>>
>>> On Tue, Aug 30, 2022 at 10:17 AM Anup Patel <[email protected]> wrote:
>>>>
>>>> Currently, all flavors of ioremap_xyz() function maps to the generic
>>>> ioremap() which means any ioremap_xyz() call will always map the
>>>> target memory as IO using _PAGE_IOREMAP page attributes. This breaks
>>>> ioremap_cache() and ioremap_wc() on systems with Svpbmt because memory
>>>> remapped using ioremap_cache() and ioremap_wc() will use _PAGE_IOREMAP
>>>> page attributes.
>>>>
>>>> To address above (just like other architectures), we implement RISC-V
>>>> specific ioremap_cache() and ioremap_wc() which maps memory using page
>>>> attributes as defined by the Svpbmt specification.
>>>>
>>>> Fixes: ff689fd21cb1 ("riscv: add RISC-V Svpbmt extension support")
>>>> Co-developed-by: Mayuresh Chitale <[email protected]>
>>>> Signed-off-by: Mayuresh Chitale <[email protected]>
>>>> Signed-off-by: Anup Patel <[email protected]>
>>>
>>> This is a crucial RC fix. Can you please take this ?
>>
>> Sorry I missed this, I thought it was just part of the rest of this
>> patch set. That said, I'm not actually sure this is a critical fix:
>> sure it's a performance problem, and if some driver is expecting
>> ioremap_cache() to go fast then possibly a pretty big one, but the only
>> Svpmbt hardware that exists is the D1 and that was just supported this
>> release so it's not a regression. Maybe that's a bit pedantic, but all
>> this travel has kind of made things a mess and I'm trying to make sure
>> nothing goes off the rails.
>
> I think generally speaking any use of ioremap_cache() in a driver
> is a mistake. The few users that exist are usually from historic
> x86 specific code and are hard to kill off.

Should we just add some sort of CONFIG_ARCH_HAS_IOREMAP_CACHE and then
ban those drivers from everywhere else?

2022-09-23 11:42:58

by Arnd Bergmann

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] RISC-V: Fix ioremap_cache() and ioremap_wc() for systems with Svpbmt

On Thu, Sep 22, 2022, at 6:35 PM, Palmer Dabbelt wrote:
> On Thu, 15 Sep 2022 19:24:55 PDT (-0700), [email protected] wrote:
>>
>> On Tue, Aug 30, 2022 at 10:17 AM Anup Patel <[email protected]> wrote:
>>>
>>> Currently, all flavors of ioremap_xyz() function maps to the generic
>>> ioremap() which means any ioremap_xyz() call will always map the
>>> target memory as IO using _PAGE_IOREMAP page attributes. This breaks
>>> ioremap_cache() and ioremap_wc() on systems with Svpbmt because memory
>>> remapped using ioremap_cache() and ioremap_wc() will use _PAGE_IOREMAP
>>> page attributes.
>>>
>>> To address above (just like other architectures), we implement RISC-V
>>> specific ioremap_cache() and ioremap_wc() which maps memory using page
>>> attributes as defined by the Svpbmt specification.
>>>
>>> Fixes: ff689fd21cb1 ("riscv: add RISC-V Svpbmt extension support")
>>> Co-developed-by: Mayuresh Chitale <[email protected]>
>>> Signed-off-by: Mayuresh Chitale <[email protected]>
>>> Signed-off-by: Anup Patel <[email protected]>
>>
>> This is a crucial RC fix. Can you please take this ?
>
> Sorry I missed this, I thought it was just part of the rest of this
> patch set. That said, I'm not actually sure this is a critical fix:
> sure it's a performance problem, and if some driver is expecting
> ioremap_cache() to go fast then possibly a pretty big one, but the only
> Svpmbt hardware that exists is the D1 and that was just supported this
> release so it's not a regression. Maybe that's a bit pedantic, but all
> this travel has kind of made things a mess and I'm trying to make sure
> nothing goes off the rails.

I think generally speaking any use of ioremap_cache() in a driver
is a mistake. The few users that exist are usually from historic
x86 specific code and are hard to kill off.

Arnd

2022-09-28 12:58:17

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] RISC-V: Fix ioremap_cache() and ioremap_wc() for systems with Svpbmt

On Thu, Sep 22, 2022 at 09:35:56AM -0700, Palmer Dabbelt wrote:
> Sorry I missed this, I thought it was just part of the rest of this patch
> set. That said, I'm not actually sure this is a critical fix: sure it's a
> performance problem, and if some driver is expecting ioremap_cache() to go
> fast then possibly a pretty big one, but the only Svpmbt hardware that

More importantly nothing should be using ioremap_cache at all. It is
an API that has long been deprecated in favor of memremap.

2022-10-07 04:48:50

by Palmer Dabbelt

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] RISC-V: Fix ioremap_cache() and ioremap_wc() for systems with Svpbmt

On Wed, 28 Sep 2022 05:14:30 PDT (-0700), Christoph Hellwig wrote:
> On Thu, Sep 22, 2022 at 09:35:56AM -0700, Palmer Dabbelt wrote:
>> Sorry I missed this, I thought it was just part of the rest of this patch
>> set. That said, I'm not actually sure this is a critical fix: sure it's a
>> performance problem, and if some driver is expecting ioremap_cache() to go
>> fast then possibly a pretty big one, but the only Svpmbt hardware that
>
> More importantly nothing should be using ioremap_cache at all. It is
> an API that has long been deprecated in favor of memremap.

There's a few uses of it, I just send along a patch to make it
arch-specific and make the users depend on it. That should let us stop
adding this to ports just to get the drivers building.

2022-10-07 05:57:34

by Anup Patel

[permalink] [raw]
Subject: Re: [PATCH v2 1/4] RISC-V: Fix ioremap_cache() and ioremap_wc() for systems with Svpbmt

On Fri, Oct 7, 2022 at 9:20 AM Palmer Dabbelt <[email protected]> wrote:
>
> On Wed, 28 Sep 2022 05:14:30 PDT (-0700), Christoph Hellwig wrote:
> > On Thu, Sep 22, 2022 at 09:35:56AM -0700, Palmer Dabbelt wrote:
> >> Sorry I missed this, I thought it was just part of the rest of this patch
> >> set. That said, I'm not actually sure this is a critical fix: sure it's a
> >> performance problem, and if some driver is expecting ioremap_cache() to go
> >> fast then possibly a pretty big one, but the only Svpmbt hardware that
> >
> > More importantly nothing should be using ioremap_cache at all. It is
> > an API that has long been deprecated in favor of memremap.
>
> There's a few uses of it, I just send along a patch to make it
> arch-specific and make the users depend on it. That should let us stop
> adding this to ports just to get the drivers building.

I agree, ioremap_cache() should not be used by drivers.

I encountered this issue with the PMEM driver which uses devm_memremap()
which in-turn calls memremap() (kernel/iomem.c). The kernel memremap()
still expects arch-specific ioremap_xyz() functions/macros to implement various
MEMREMAP_xyz flags which is why we need these RISC-V specific ioremap_xyz().

An alternative way is to implement RISC-V specific arch_memremap_wb()
but this will still look similar to ioremap_cache() added by this patch. Also,
only 32bit ARM implements arch_memremap_wb() whereas all other archs
(ARM64, x86_64, etc) implement ioremap_xyz() functions/macros.

Regards,
Anup