From: Lad Prabhakar <[email protected]>
Hi All,
non-coherent DMA support for AX45MP
====================================
On the Andes AX45MP core, cache coherency is a specification option so it
may not be supported. In this case DMA will fail. To get around with this
issue this patch series does the below:
1] Andes alternative ports is implemented as errata which checks if the IOCP
is missing and only then applies to CMO errata. One vendor specific SBI EXT
(ANDES_SBI_EXT_IOCP_SW_WORKAROUND) is implemented as part of errata.
Below are the configs which Andes port provides (and are selected by RZ/Five):
- ERRATA_ANDES
- ERRATA_ANDES_CMO
OpenSBI patch supporting ANDES_SBI_EXT_IOCP_SW_WORKAROUND SBI can be found here,
https://patchwork.ozlabs.org/project/opensbi/patch/[email protected]/
2] Andes AX45MP core has a Programmable Physical Memory Attributes (PMA)
block that allows dynamic adjustment of memory attributes in the runtime.
It contains a configurable amount of PMA entries implemented as CSR
registers to control the attributes of memory locations in interest.
OpenSBI configures the PMA regions as required and creates a reserve memory
node and propagates it to the higher boot stack.
Currently OpenSBI (upstream) configures the required PMA region and passes
this a shared DMA pool to Linux.
reserved-memory {
#address-cells = <2>;
#size-cells = <2>;
ranges;
pma_resv0@58000000 {
compatible = "shared-dma-pool";
reg = <0x0 0x58000000 0x0 0x08000000>;
no-map;
linux,dma-default;
};
};
The above shared DMA pool gets appended to Linux DTB so the DMA memory
requests go through this region.
3] We provide callbacks to synchronize specific content between memory and
cache.
4] RZ/Five SoC selects the below configs
- AX45MP_L2_CACHE
- DMA_GLOBAL_POOL
- ERRATA_ANDES
- ERRATA_ANDES_CMO
----------x---------------------x--------------------x---------------x--------------
Note,
- Ive used GCC 12.2.0 for compilation
- Tested all the IP blocks on RZ/Five which use DMA
- Patch series is dependent on the series from Arnd,
https://patchwork.kernel.org/project/linux-riscv/cover/[email protected]/.
(Ive rebased Arnd's series on v6.4-rc-1)
- Patches applies on top of palmer/for-next (255b34d799dd)
- Ive pushed the complete tree here https://github.com/prabhakarlad/linux/tree/rzfive-cmo-v9
- Previously the function pointer approach was NAKed by Christoph Hellwig but based on the discussion
on #riscv Ive implemented this approach.
v8 -> v9
* Dropped adding ALTERNATIVE_3
* Implemented function pointer support for nonstandard noncoherent systems
* Added a new config option CONFIG_RISCV_NONSTANDARD_CACHE_OPS
* Updated Andes errata code to drop patching the calls as we no more use
ALTERNATIVE_X() macro.
* Updated Andes CMO code to use function pointer for doing cache management.
v7 -> v8
* Dropped using function pointers and switched to ALTERNATIVE_X()
* Added new patches (#1, #2)
v6 -> v7
* Reworked the code based on Arnd's work
* Fixed review comments pointed by Arnd
* Fixed review comments pointed by Conor
v5.1 -> v6
* Dropped use of ALTERNATIVE_x() macro
* Now switched to used function pointers for CMO
* Moved driver to drivers/cache folder
v5 -> v5.1
* https://patchwork.kernel.org/project/linux-riscv/list/?series=708610&state=%2A&archive=both
v4 -> v5
* Rebased ALTERNATIVE_3() macro on top of Andrew's patches
* Rebased the changes on top of Heiko's alternative call patches
* Dropped configuring the PMA from Linux
* Dropped configuring the L2 cache from Linux and dropped the binding for same
* Now using runtime patching mechanism instead of compile time config
RFC v3 -> v4
* Implemented ALTERNATIVE_3() macro
* Now using runtime patching mechanism instead of compile time config
* Added Andes CMO as and errata
* Fixed comments pointed by Geert
RFC v2-> RFC v3
* Fixed review comments pointed by Conor
* Move DT binding into cache folder
* Fixed DT binding check issue
* Added andestech,ax45mp-cache.h header file
* Now passing the flags for the PMA setup as part of andestech,pma-regions
property.
* Added andestech,inst/data-prefetch and andestech,tag/data-ram-ctl
properties to configure the L2 cache.
* Registered the cache driver as platform driver
RFC v1-> RFC v2
* Moved out the code from arc/riscv to drivers/soc/renesas
* Now handling the PMA setup as part of the L2 cache
* Now making use of dma-noncoherent.c instead SoC specific implementation.
* Dropped arch_dma_alloc() and arch_dma_free()
* Switched to RISCV_DMA_NONCOHERENT
* Included DT binding doc
RFC v2: https://patchwork.kernel.org/project/linux-renesas-soc/cover/[email protected]/
RFC v1: https://patchwork.kernel.org/project/linux-renesas-soc/cover/[email protected]/
Cheers,
Prabhakar
Lad Prabhakar (6):
riscv: asm: vendorid_list: Add Andes Technology to the vendors list
riscv: errata: Add Andes alternative ports
riscv: mm: dma-noncoherent: nonstandard cache operations support
dt-bindings: cache: andestech,ax45mp-cache: Add DT binding
documentation for L2 cache controller
cache: Add L2 cache management for Andes AX45MP RISC-V core
soc: renesas: Kconfig: Select the required configs for RZ/Five SoC
.../cache/andestech,ax45mp-cache.yaml | 81 +++++++
MAINTAINERS | 7 +
arch/riscv/Kconfig | 7 +
arch/riscv/Kconfig.errata | 21 ++
arch/riscv/errata/Makefile | 1 +
arch/riscv/errata/andes/Makefile | 1 +
arch/riscv/errata/andes/errata.c | 66 +++++
arch/riscv/include/asm/alternative.h | 3 +
arch/riscv/include/asm/dma-noncoherent.h | 28 +++
arch/riscv/include/asm/errata_list.h | 5 +
arch/riscv/include/asm/vendorid_list.h | 1 +
arch/riscv/kernel/alternative.c | 5 +
arch/riscv/mm/dma-noncoherent.c | 43 ++++
arch/riscv/mm/pmem.c | 13 +
drivers/Kconfig | 2 +
drivers/Makefile | 1 +
drivers/cache/Kconfig | 11 +
drivers/cache/Makefile | 3 +
drivers/cache/ax45mp_cache.c | 229 ++++++++++++++++++
drivers/soc/renesas/Kconfig | 4 +
20 files changed, 532 insertions(+)
create mode 100644 Documentation/devicetree/bindings/cache/andestech,ax45mp-cache.yaml
create mode 100644 arch/riscv/errata/andes/Makefile
create mode 100644 arch/riscv/errata/andes/errata.c
create mode 100644 arch/riscv/include/asm/dma-noncoherent.h
create mode 100644 drivers/cache/Kconfig
create mode 100644 drivers/cache/Makefile
create mode 100644 drivers/cache/ax45mp_cache.c
--
2.25.1
From: Lad Prabhakar <[email protected]>
Add Andes Technology to the vendors list.
Signed-off-by: Lad Prabhakar <[email protected]>
Reviewed-by: Heiko Stuebner <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
Reviewed-by: Geert Uytterhoeven <[email protected]>
---
v8 -> v9
* Included RB tag from Geert
v7 -> v8
* No change
v6 -> v7
* No change
v5 -> v6
* No change
v4 -> v5
* Included RB tags
RFC v3 -> v4
* New patch
---
arch/riscv/include/asm/vendorid_list.h | 1 +
1 file changed, 1 insertion(+)
diff --git a/arch/riscv/include/asm/vendorid_list.h b/arch/riscv/include/asm/vendorid_list.h
index cb89af3f0704..e55407ace0c3 100644
--- a/arch/riscv/include/asm/vendorid_list.h
+++ b/arch/riscv/include/asm/vendorid_list.h
@@ -5,6 +5,7 @@
#ifndef ASM_VENDOR_LIST_H
#define ASM_VENDOR_LIST_H
+#define ANDESTECH_VENDOR_ID 0x31e
#define SIFIVE_VENDOR_ID 0x489
#define THEAD_VENDOR_ID 0x5b7
--
2.25.1
From: Lad Prabhakar <[email protected]>
Add required ports of the Alternative scheme for Andes CPU cores.
I/O Coherence Port (IOCP) provides an AXI interface for connecting external
non-caching masters, such as DMA controllers. IOCP is a specification
option and is disabled on the Renesas RZ/Five SoC due to this reason cache
management needs a software workaround.
Signed-off-by: Lad Prabhakar <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
---
v8 -> v9
* Rebased to code on Palmer's for/next
* Dropped calling patch_text_nosync() as dont use
patch_text_nosync() call
v7 -> v8
* Now patching the code using patch_text_nosync() and riscv_alternative_fix_offsets()
v6 -> v7
* Renamed RZFIVE_SBI_EXT_IOCP_SW_WORKAROUND -> ANDES_SBI_EXT_IOCP_SW_WORKAROUND
* Dropped "depends on !XIP_KERNEL" for ERRATA_ANDES config
v5 -> v6
* Dropped patching alternative and now just probing IOCP
v4 -> v5
* Sorted the Kconfig/Makefile/Switch based on Core name
* Added a comments
* Introduced RZFIVE_SBI_EXT_IOCP_SW_WORKAROUND SBI EXT ID to check if
CMO needs to be applied. Is there a way we can access the DTB while patching
as we can drop this SBI EXT ID and add a DT property instead for cmo?
RFC v3 -> v4
* New patch
---
arch/riscv/Kconfig.errata | 21 +++++++++
arch/riscv/errata/Makefile | 1 +
arch/riscv/errata/andes/Makefile | 1 +
arch/riscv/errata/andes/errata.c | 66 ++++++++++++++++++++++++++++
arch/riscv/include/asm/alternative.h | 3 ++
arch/riscv/include/asm/errata_list.h | 5 +++
arch/riscv/kernel/alternative.c | 5 +++
7 files changed, 102 insertions(+)
create mode 100644 arch/riscv/errata/andes/Makefile
create mode 100644 arch/riscv/errata/andes/errata.c
diff --git a/arch/riscv/Kconfig.errata b/arch/riscv/Kconfig.errata
index 0c8f4652cd82..92c779764b27 100644
--- a/arch/riscv/Kconfig.errata
+++ b/arch/riscv/Kconfig.errata
@@ -1,5 +1,26 @@
menu "CPU errata selection"
+config ERRATA_ANDES
+ bool "Andes AX45MP errata"
+ depends on RISCV_ALTERNATIVE
+ help
+ All Andes errata Kconfig depend on this Kconfig. Disabling
+ this Kconfig will disable all Andes errata. Please say "Y"
+ here if your platform uses Andes CPU cores.
+
+ Otherwise, please say "N" here to avoid unnecessary overhead.
+
+config ERRATA_ANDES_CMO
+ bool "Apply Andes cache management errata"
+ depends on ERRATA_ANDES && MMU && ARCH_R9A07G043
+ select RISCV_DMA_NONCOHERENT
+ default y
+ help
+ This will apply the cache management errata to handle the
+ non-standard handling on non-coherent operations on Andes cores.
+
+ If you don't know what to do here, say "Y".
+
config ERRATA_SIFIVE
bool "SiFive errata"
depends on RISCV_ALTERNATIVE
diff --git a/arch/riscv/errata/Makefile b/arch/riscv/errata/Makefile
index 7b2637c8c332..8a2739485123 100644
--- a/arch/riscv/errata/Makefile
+++ b/arch/riscv/errata/Makefile
@@ -2,5 +2,6 @@ ifdef CONFIG_RELOCATABLE
KBUILD_CFLAGS += -fno-pie
endif
+obj-$(CONFIG_ERRATA_ANDES) += andes/
obj-$(CONFIG_ERRATA_SIFIVE) += sifive/
obj-$(CONFIG_ERRATA_THEAD) += thead/
diff --git a/arch/riscv/errata/andes/Makefile b/arch/riscv/errata/andes/Makefile
new file mode 100644
index 000000000000..2d644e19caef
--- /dev/null
+++ b/arch/riscv/errata/andes/Makefile
@@ -0,0 +1 @@
+obj-y += errata.o
diff --git a/arch/riscv/errata/andes/errata.c b/arch/riscv/errata/andes/errata.c
new file mode 100644
index 000000000000..197db68cc8da
--- /dev/null
+++ b/arch/riscv/errata/andes/errata.c
@@ -0,0 +1,66 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Erratas to be applied for Andes CPU cores
+ *
+ * Copyright (C) 2023 Renesas Electronics Corporation.
+ *
+ * Author: Lad Prabhakar <[email protected]>
+ */
+
+#include <linux/memory.h>
+#include <linux/module.h>
+
+#include <asm/alternative.h>
+#include <asm/cacheflush.h>
+#include <asm/errata_list.h>
+#include <asm/patch.h>
+#include <asm/processor.h>
+#include <asm/sbi.h>
+#include <asm/vendorid_list.h>
+
+#define ANDESTECH_AX45MP_MARCHID 0x8000000000008a45UL
+#define ANDESTECH_AX45MP_MIMPID 0x500UL
+#define ANDESTECH_SBI_EXT_ANDES 0x0900031E
+
+#define ANDES_SBI_EXT_IOCP_SW_WORKAROUND 1
+
+static long ax45mp_iocp_sw_workaround(void)
+{
+ struct sbiret ret;
+
+ /*
+ * ANDES_SBI_EXT_IOCP_SW_WORKAROUND SBI EXT checks if the IOCP is missing and
+ * cache is controllable only then CMO will be applied to the platform.
+ */
+ ret = sbi_ecall(ANDESTECH_SBI_EXT_ANDES, ANDES_SBI_EXT_IOCP_SW_WORKAROUND,
+ 0, 0, 0, 0, 0, 0);
+
+ return ret.error ? 0 : ret.value;
+}
+
+static bool errata_probe_iocp(unsigned int stage, unsigned long arch_id, unsigned long impid)
+{
+ if (!IS_ENABLED(CONFIG_ERRATA_ANDES_CMO))
+ return false;
+
+ if (arch_id != ANDESTECH_AX45MP_MARCHID || impid != ANDESTECH_AX45MP_MIMPID)
+ return false;
+
+ if (!ax45mp_iocp_sw_workaround())
+ return false;
+
+ /* Set this just to make core cbo code happy */
+ riscv_cbom_block_size = 1;
+ riscv_noncoherent_supported();
+
+ return true;
+}
+
+void __init_or_module andes_errata_patch_func(struct alt_entry *begin, struct alt_entry *end,
+ unsigned long archid, unsigned long impid,
+ unsigned int stage)
+{
+ errata_probe_iocp(stage, archid, impid);
+
+ /* we have nothing to patch here ATM so just return back */
+}
diff --git a/arch/riscv/include/asm/alternative.h b/arch/riscv/include/asm/alternative.h
index 6a41537826a7..f6cfca939c92 100644
--- a/arch/riscv/include/asm/alternative.h
+++ b/arch/riscv/include/asm/alternative.h
@@ -46,6 +46,9 @@ struct alt_entry {
u32 patch_id; /* The patch ID (erratum ID or cpufeature ID) */
};
+void andes_errata_patch_func(struct alt_entry *begin, struct alt_entry *end,
+ unsigned long archid, unsigned long impid,
+ unsigned int stage);
void sifive_errata_patch_func(struct alt_entry *begin, struct alt_entry *end,
unsigned long archid, unsigned long impid,
unsigned int stage);
diff --git a/arch/riscv/include/asm/errata_list.h b/arch/riscv/include/asm/errata_list.h
index fb1a810f3d8c..e2ecd01bfac7 100644
--- a/arch/riscv/include/asm/errata_list.h
+++ b/arch/riscv/include/asm/errata_list.h
@@ -11,6 +11,11 @@
#include <asm/hwcap.h>
#include <asm/vendorid_list.h>
+#ifdef CONFIG_ERRATA_ANDES
+#define ERRATA_ANDESTECH_NO_IOCP 0
+#define ERRATA_ANDESTECH_NUMBER 1
+#endif
+
#ifdef CONFIG_ERRATA_SIFIVE
#define ERRATA_SIFIVE_CIP_453 0
#define ERRATA_SIFIVE_CIP_1200 1
diff --git a/arch/riscv/kernel/alternative.c b/arch/riscv/kernel/alternative.c
index 6b75788c18e6..b0345992a35e 100644
--- a/arch/riscv/kernel/alternative.c
+++ b/arch/riscv/kernel/alternative.c
@@ -45,6 +45,11 @@ static void riscv_fill_cpu_mfr_info(struct cpu_manufacturer_info_t *cpu_mfr_info
cpu_mfr_info->feature_probe_func = NULL;
switch (cpu_mfr_info->vendor_id) {
+#ifdef CONFIG_ERRATA_ANDES
+ case ANDESTECH_VENDOR_ID:
+ cpu_mfr_info->patch_func = andes_errata_patch_func;
+ break;
+#endif
#ifdef CONFIG_ERRATA_SIFIVE
case SIFIVE_VENDOR_ID:
cpu_mfr_info->patch_func = sifive_errata_patch_func;
--
2.25.1
From: Lad Prabhakar <[email protected]>
Introduce support for nonstandard noncoherent systems in the RISC-V
architecture. It enables function pointer support to handle cache
management in such systems.
This patch adds a new configuration option called
"RISCV_NONSTANDARD_CACHE_OPS." This option is a boolean flag that
depends on "RISCV_DMA_NONCOHERENT" and enables the function pointer
support for cache management in nonstandard noncoherent systems.
Signed-off-by: Lad Prabhakar <[email protected]>
---
v8 -> v9
* New patch
---
arch/riscv/Kconfig | 7 ++++
arch/riscv/include/asm/dma-noncoherent.h | 28 +++++++++++++++
arch/riscv/mm/dma-noncoherent.c | 43 ++++++++++++++++++++++++
arch/riscv/mm/pmem.c | 13 +++++++
4 files changed, 91 insertions(+)
create mode 100644 arch/riscv/include/asm/dma-noncoherent.h
diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 628aad4fb6e2..325ab2124f0a 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -261,6 +261,13 @@ config RISCV_DMA_NONCOHERENT
select ARCH_HAS_SYNC_DMA_FOR_DEVICE
select DMA_DIRECT_REMAP
+config RISCV_NONSTANDARD_CACHE_OPS
+ bool
+ depends on RISCV_DMA_NONCOHERENT
+ help
+ This enables function pointer support for non-standard noncoherent
+ systems to handle cache management.
+
config AS_HAS_INSN
def_bool $(as-instr,.insn r 51$(comma) 0$(comma) 0$(comma) t0$(comma) t0$(comma) zero)
diff --git a/arch/riscv/include/asm/dma-noncoherent.h b/arch/riscv/include/asm/dma-noncoherent.h
new file mode 100644
index 000000000000..f4e9bb2d3800
--- /dev/null
+++ b/arch/riscv/include/asm/dma-noncoherent.h
@@ -0,0 +1,28 @@
+/* SPDX-License-Identifier: GPL-2.0-only */
+/*
+ * Copyright (C) 2023 Renesas Electronics Corp.
+ */
+
+#ifndef __ASM_DMA_NONCOHERENT_H
+#define __ASM_DMA_NONCOHERENT_H
+
+#include <linux/dma-direct.h>
+
+/*
+ * struct riscv_cache_ops - Structure for CMO function pointers
+ *
+ * @clean: Function pointer for clean cache
+ * @inval: Function pointer for invalidate cache
+ * @flush: Function pointer for flushing the cache
+ */
+struct riscv_cache_ops {
+ void (*clean)(phys_addr_t paddr, unsigned long size);
+ void (*inval)(phys_addr_t paddr, unsigned long size);
+ void (*flush)(phys_addr_t paddr, unsigned long size);
+};
+
+extern struct riscv_cache_ops noncoherent_cache_ops;
+
+void riscv_noncoherent_register_cache_ops(const struct riscv_cache_ops *ops);
+
+#endif /* __ASM_DMA_NONCOHERENT_H */
diff --git a/arch/riscv/mm/dma-noncoherent.c b/arch/riscv/mm/dma-noncoherent.c
index b9a9f57e02be..4cdaa879839a 100644
--- a/arch/riscv/mm/dma-noncoherent.c
+++ b/arch/riscv/mm/dma-noncoherent.c
@@ -9,13 +9,26 @@
#include <linux/dma-map-ops.h>
#include <linux/mm.h>
#include <asm/cacheflush.h>
+#include <asm/dma-noncoherent.h>
static bool noncoherent_supported;
+struct riscv_cache_ops noncoherent_cache_ops = {
+ .clean = NULL,
+ .inval = NULL,
+ .flush = NULL,
+};
+
static inline void arch_dma_cache_wback(phys_addr_t paddr, size_t size)
{
void *vaddr = phys_to_virt(paddr);
+#ifdef CONFIG_RISCV_NONSTANDARD_CACHE_OPS
+ if (unlikely(noncoherent_cache_ops.clean)) {
+ noncoherent_cache_ops.clean(paddr, size);
+ return;
+ }
+#endif
ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
}
@@ -23,6 +36,13 @@ static inline void arch_dma_cache_inv(phys_addr_t paddr, size_t size)
{
void *vaddr = phys_to_virt(paddr);
+#ifdef CONFIG_RISCV_NONSTANDARD_CACHE_OPS
+ if (unlikely(noncoherent_cache_ops.inval)) {
+ noncoherent_cache_ops.inval(paddr, size);
+ return;
+ }
+#endif
+
ALT_CMO_OP(inval, vaddr, size, riscv_cbom_block_size);
}
@@ -30,6 +50,13 @@ static inline void arch_dma_cache_wback_inv(phys_addr_t paddr, size_t size)
{
void *vaddr = phys_to_virt(paddr);
+#ifdef CONFIG_RISCV_NONSTANDARD_CACHE_OPS
+ if (unlikely(noncoherent_cache_ops.flush)) {
+ noncoherent_cache_ops.flush(paddr, size);
+ return;
+ }
+#endif
+
ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
}
@@ -50,6 +77,13 @@ void arch_dma_prep_coherent(struct page *page, size_t size)
{
void *flush_addr = page_address(page);
+#ifdef CONFIG_RISCV_NONSTANDARD_CACHE_OPS
+ if (unlikely(noncoherent_cache_ops.flush)) {
+ noncoherent_cache_ops.flush(page_to_phys(page), size);
+ return;
+ }
+#endif
+
ALT_CMO_OP(flush, flush_addr, size, riscv_cbom_block_size);
}
@@ -75,3 +109,12 @@ void riscv_noncoherent_supported(void)
"Non-coherent DMA support enabled without a block size\n");
noncoherent_supported = true;
}
+
+void riscv_noncoherent_register_cache_ops(const struct riscv_cache_ops *ops)
+{
+ if (!ops)
+ return;
+
+ noncoherent_cache_ops = *ops;
+}
+EXPORT_SYMBOL_GPL(riscv_noncoherent_register_cache_ops);
diff --git a/arch/riscv/mm/pmem.c b/arch/riscv/mm/pmem.c
index 089df92ae876..fb481f5b930a 100644
--- a/arch/riscv/mm/pmem.c
+++ b/arch/riscv/mm/pmem.c
@@ -7,15 +7,28 @@
#include <linux/libnvdimm.h>
#include <asm/cacheflush.h>
+#include <asm/dma-noncoherent.h>
void arch_wb_cache_pmem(void *addr, size_t size)
{
+#ifdef CONFIG_RISCV_NONSTANDARD_CACHE_OPS
+ if (unlikely(noncoherent_cache_ops.clean)) {
+ noncoherent_cache_ops.clean(virt_to_phys(addr), size);
+ return;
+ }
+#endif
ALT_CMO_OP(clean, addr, size, riscv_cbom_block_size);
}
EXPORT_SYMBOL_GPL(arch_wb_cache_pmem);
void arch_invalidate_pmem(void *addr, size_t size)
{
+#ifdef CONFIG_RISCV_NONSTANDARD_CACHE_OPS
+ if (unlikely(noncoherent_cache_ops.inval)) {
+ noncoherent_cache_ops.inval(virt_to_phys(addr), size);
+ return;
+ }
+#endif
ALT_CMO_OP(inval, addr, size, riscv_cbom_block_size);
}
EXPORT_SYMBOL_GPL(arch_invalidate_pmem);
--
2.25.1
From: Lad Prabhakar <[email protected]>
I/O Coherence Port (IOCP) provides an AXI interface for connecting
external non-caching masters, such as DMA controllers. The accesses
from IOCP are coherent with D-Caches and L2 Cache.
IOCP is a specification option and is disabled on the Renesas RZ/Five
SoC due to this reason IP blocks using DMA will fail.
The Andes AX45MP core has a Programmable Physical Memory Attributes (PMA)
block that allows dynamic adjustment of memory attributes in the runtime.
It contains a configurable amount of PMA entries implemented as CSR
registers to control the attributes of memory locations in interest.
Below are the memory attributes supported:
* Device, Non-bufferable
* Device, bufferable
* Memory, Non-cacheable, Non-bufferable
* Memory, Non-cacheable, Bufferable
* Memory, Write-back, No-allocate
* Memory, Write-back, Read-allocate
* Memory, Write-back, Write-allocate
* Memory, Write-back, Read and Write-allocate
More info about PMA (section 10.3):
Link: http://www.andestech.com/wp-content/uploads/AX45MP-1C-Rev.-5.0.0-Datasheet.pdf
As a workaround for SoCs with IOCP disabled CMO needs to be handled by
software. Firstly OpenSBI configures the memory region as
"Memory, Non-cacheable, Bufferable" and passes this region as a global
shared dma pool as a DT node. With DMA_GLOBAL_POOL enabled all DMA
allocations happen from this region and synchronization callbacks are
implemented to synchronize when doing DMA transactions.
Example PMA region passes as a DT node from OpenSBI:
reserved-memory {
#address-cells = <2>;
#size-cells = <2>;
ranges;
pma_resv0@58000000 {
compatible = "shared-dma-pool";
reg = <0x0 0x58000000 0x0 0x08000000>;
no-map;
linux,dma-default;
};
};
Signed-off-by: Lad Prabhakar <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
---
v8 -> v9
* Dropped exporting CMO functions as we no more used ALTERNATIVE_X() macro
* Now using the riscv_noncoherent_register_cache_ops() for registering
CMO ops
* Added RB tag from Conor
v7 -> v8
* Dropped function pointer usage
* Now exporting the functions for clean/inval/flush
* Switched to using early_initcall instead of arch_initcall
* Dropped entry for "include/cache" from MAINTAINERS
* Dropped dependency of RISCV on AX45MP_L2_CACHE
* Returning error in case of cache line mismatch
* Renamed clean/inval/flush functions
v6 -> v7
* Implemented flush callback
* Dropped using riscv_dma_noncoherent_cmo_ops
v5 -> v6
* Moved driver to cache folder
* Switched to new API for CMO
v4 -> v5
* Dropped code for configuring L2 cache
* Dropped code for configuring PMA
* Updated commit message
* Added comments
* Changed static branch enable/disable order
RFC v3 -> v4
* Made use of runtime patching instead of compile time
* Now just exposing single function ax45mp_no_iocp_cmo() for CMO handling
* Added a check to make sure cache line size is always 64 bytes
* Renamed folder rzf -> rzfive
* Improved Kconfig description
* Dropped L2 cache configuration
* Dropped unnecessary casts
* Fixed comments pointed by Geert.
---
MAINTAINERS | 7 ++
drivers/Kconfig | 2 +
drivers/Makefile | 1 +
drivers/cache/Kconfig | 11 ++
drivers/cache/Makefile | 3 +
drivers/cache/ax45mp_cache.c | 229 +++++++++++++++++++++++++++++++++++
6 files changed, 253 insertions(+)
create mode 100644 drivers/cache/Kconfig
create mode 100644 drivers/cache/Makefile
create mode 100644 drivers/cache/ax45mp_cache.c
diff --git a/MAINTAINERS b/MAINTAINERS
index 55ac73793856..899452038a5b 100644
--- a/MAINTAINERS
+++ b/MAINTAINERS
@@ -20073,6 +20073,13 @@ S: Supported
T: git git://git.kernel.org/pub/scm/linux/kernel/git/gregkh/staging.git
F: drivers/staging/
+STANDALONE CACHE CONTROLLER DRIVERS
+M: Conor Dooley <[email protected]>
+L: [email protected]
+S: Maintained
+T: git https://git.kernel.org/pub/scm/linux/kernel/git/conor/linux.git/
+F: drivers/cache
+
STARFIRE/DURALAN NETWORK DRIVER
M: Ion Badulescu <[email protected]>
S: Odd Fixes
diff --git a/drivers/Kconfig b/drivers/Kconfig
index 514ae6b24cb2..2ae1b6707c2c 100644
--- a/drivers/Kconfig
+++ b/drivers/Kconfig
@@ -15,6 +15,8 @@ source "drivers/base/Kconfig"
source "drivers/bus/Kconfig"
+source "drivers/cache/Kconfig"
+
source "drivers/connector/Kconfig"
source "drivers/firmware/Kconfig"
diff --git a/drivers/Makefile b/drivers/Makefile
index 7241d80a7b29..23eb201fe18a 100644
--- a/drivers/Makefile
+++ b/drivers/Makefile
@@ -11,6 +11,7 @@ ifdef building_out_of_srctree
MAKEFLAGS += --include-dir=$(srctree)
endif
+obj-y += cache/
obj-y += irqchip/
obj-y += bus/
diff --git a/drivers/cache/Kconfig b/drivers/cache/Kconfig
new file mode 100644
index 000000000000..a57677f908f3
--- /dev/null
+++ b/drivers/cache/Kconfig
@@ -0,0 +1,11 @@
+# SPDX-License-Identifier: GPL-2.0
+menu "Cache Drivers"
+
+config AX45MP_L2_CACHE
+ bool "Andes Technology AX45MP L2 Cache controller"
+ depends on RISCV_DMA_NONCOHERENT
+ select RISCV_NONSTANDARD_CACHE_OPS
+ help
+ Support for the L2 cache controller on Andes Technology AX45MP platforms.
+
+endmenu
diff --git a/drivers/cache/Makefile b/drivers/cache/Makefile
new file mode 100644
index 000000000000..2012e7fb978d
--- /dev/null
+++ b/drivers/cache/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+
+obj-$(CONFIG_AX45MP_L2_CACHE) += ax45mp_cache.o
diff --git a/drivers/cache/ax45mp_cache.c b/drivers/cache/ax45mp_cache.c
new file mode 100644
index 000000000000..df868872aabf
--- /dev/null
+++ b/drivers/cache/ax45mp_cache.c
@@ -0,0 +1,229 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * non-coherent cache functions for Andes AX45MP
+ *
+ * Copyright (C) 2023 Renesas Electronics Corp.
+ */
+
+#include <linux/cacheflush.h>
+#include <linux/cacheinfo.h>
+#include <linux/dma-direction.h>
+#include <linux/of_address.h>
+#include <linux/of_platform.h>
+
+#include <asm/dma-noncoherent.h>
+
+/* L2 cache registers */
+#define AX45MP_L2C_REG_CTL_OFFSET 0x8
+
+#define AX45MP_L2C_REG_C0_CMD_OFFSET 0x40
+#define AX45MP_L2C_REG_C0_ACC_OFFSET 0x48
+#define AX45MP_L2C_REG_STATUS_OFFSET 0x80
+
+/* D-cache operation */
+#define AX45MP_CCTL_L1D_VA_INVAL 0 /* Invalidate an L1 cache entry */
+#define AX45MP_CCTL_L1D_VA_WB 1 /* Write-back an L1 cache entry */
+
+/* L2 CCTL status */
+#define AX45MP_CCTL_L2_STATUS_IDLE 0
+
+/* L2 CCTL status cores mask */
+#define AX45MP_CCTL_L2_STATUS_C0_MASK 0xf
+
+/* L2 cache operation */
+#define AX45MP_CCTL_L2_PA_INVAL 0x8 /* Invalidate an L2 cache entry */
+#define AX45MP_CCTL_L2_PA_WB 0x9 /* Write-back an L2 cache entry */
+
+#define AX45MP_L2C_REG_PER_CORE_OFFSET 0x10
+#define AX45MP_CCTL_L2_STATUS_PER_CORE_OFFSET 4
+
+#define AX45MP_L2C_REG_CN_CMD_OFFSET(n) \
+ (AX45MP_L2C_REG_C0_CMD_OFFSET + ((n) * AX45MP_L2C_REG_PER_CORE_OFFSET))
+#define AX45MP_L2C_REG_CN_ACC_OFFSET(n) \
+ (AX45MP_L2C_REG_C0_ACC_OFFSET + ((n) * AX45MP_L2C_REG_PER_CORE_OFFSET))
+#define AX45MP_CCTL_L2_STATUS_CN_MASK(n) \
+ (AX45MP_CCTL_L2_STATUS_C0_MASK << ((n) * AX45MP_CCTL_L2_STATUS_PER_CORE_OFFSET))
+
+#define AX45MP_CCTL_REG_UCCTLBEGINADDR_NUM 0x80b
+#define AX45MP_CCTL_REG_UCCTLCOMMAND_NUM 0x80c
+
+#define AX45MP_CACHE_LINE_SIZE 64
+
+struct ax45mp_priv {
+ void __iomem *l2c_base;
+ u32 ax45mp_cache_line_size;
+};
+
+static struct ax45mp_priv ax45mp_priv;
+
+/* L2 Cache operations */
+static inline uint32_t ax45mp_cpu_l2c_get_cctl_status(void)
+{
+ return readl(ax45mp_priv.l2c_base + AX45MP_L2C_REG_STATUS_OFFSET);
+}
+
+static void ax45mp_cpu_cache_operation(unsigned long start, unsigned long end,
+ unsigned long line_size, unsigned int l1_op,
+ unsigned int l2_op)
+{
+ void __iomem *base = ax45mp_priv.l2c_base;
+ int mhartid = smp_processor_id();
+ unsigned long pa;
+
+ while (end > start) {
+ csr_write(AX45MP_CCTL_REG_UCCTLBEGINADDR_NUM, start);
+ csr_write(AX45MP_CCTL_REG_UCCTLCOMMAND_NUM, l1_op);
+
+ pa = virt_to_phys((void *)start);
+ writel(pa, base + AX45MP_L2C_REG_CN_ACC_OFFSET(mhartid));
+ writel(l2_op, base + AX45MP_L2C_REG_CN_CMD_OFFSET(mhartid));
+ while ((ax45mp_cpu_l2c_get_cctl_status() &
+ AX45MP_CCTL_L2_STATUS_CN_MASK(mhartid)) !=
+ AX45MP_CCTL_L2_STATUS_IDLE)
+ ;
+
+ start += line_size;
+ }
+}
+
+/* Write-back L1 and L2 cache entry */
+static inline void ax45mp_cpu_dcache_wb_range(unsigned long start, unsigned long end,
+ unsigned long line_size)
+{
+ ax45mp_cpu_cache_operation(start, end, line_size,
+ AX45MP_CCTL_L1D_VA_WB,
+ AX45MP_CCTL_L2_PA_WB);
+}
+
+/* Invalidate the L1 and L2 cache entry */
+static inline void ax45mp_cpu_dcache_inval_range(unsigned long start, unsigned long end,
+ unsigned long line_size)
+{
+ ax45mp_cpu_cache_operation(start, end, line_size,
+ AX45MP_CCTL_L1D_VA_INVAL,
+ AX45MP_CCTL_L2_PA_INVAL);
+}
+
+static void ax45mp_dma_cache_inv(phys_addr_t paddr, unsigned long size)
+{
+ unsigned long start = (unsigned long)phys_to_virt(paddr);
+ char cache_buf[2][AX45MP_CACHE_LINE_SIZE];
+ unsigned long end = start + size;
+ unsigned long old_start = start;
+ unsigned long old_end = end;
+ unsigned long line_size;
+ unsigned long flags;
+
+ if (unlikely(start == end))
+ return;
+
+ line_size = ax45mp_priv.ax45mp_cache_line_size;
+
+ memset(&cache_buf, 0x0, sizeof(cache_buf));
+ start = start & (~(line_size - 1));
+ end = ((end + line_size - 1) & (~(line_size - 1)));
+
+ local_irq_save(flags);
+ if (unlikely(start != old_start))
+ memcpy(&cache_buf[0][0], (void *)start, line_size);
+
+ if (unlikely(end != old_end))
+ memcpy(&cache_buf[1][0], (void *)(old_end & (~(line_size - 1))), line_size);
+
+ ax45mp_cpu_dcache_inval_range(start, end, line_size);
+
+ if (unlikely(start != old_start))
+ memcpy((void *)start, &cache_buf[0][0], (old_start & (line_size - 1)));
+
+ local_irq_restore(flags);
+}
+
+static void ax45mp_dma_cache_wback(phys_addr_t paddr, unsigned long size)
+{
+ unsigned long start = (unsigned long)phys_to_virt(paddr);
+ unsigned long end = start + size;
+ unsigned long line_size;
+ unsigned long flags;
+
+ line_size = ax45mp_priv.ax45mp_cache_line_size;
+ start = start & (~(line_size - 1));
+ local_irq_save(flags);
+ ax45mp_cpu_dcache_wb_range(start, end, line_size);
+ local_irq_restore(flags);
+}
+
+static void ax45mp_dma_cache_wback_inv(phys_addr_t paddr, unsigned long size)
+{
+ ax45mp_dma_cache_wback(paddr, size);
+ ax45mp_dma_cache_inv(paddr, size);
+}
+
+static int ax45mp_get_l2_line_size(struct device_node *np)
+{
+ int ret;
+
+ ret = of_property_read_u32(np, "cache-line-size", &ax45mp_priv.ax45mp_cache_line_size);
+ if (ret) {
+ pr_err("Failed to get cache-line-size, defaulting to 64 bytes\n");
+ return ret;
+ }
+
+ if (ax45mp_priv.ax45mp_cache_line_size != AX45MP_CACHE_LINE_SIZE) {
+ pr_err("Expected cache-line-size to be 64 bytes (found:%u)\n",
+ ax45mp_priv.ax45mp_cache_line_size);
+ return -EINVAL;
+ }
+
+ return 0;
+}
+
+static const struct riscv_cache_ops ax45mp_cmo_ops = {
+ .clean = &ax45mp_dma_cache_wback,
+ .inval = &ax45mp_dma_cache_inv,
+ .flush = &ax45mp_dma_cache_wback_inv,
+};
+
+static const struct of_device_id ax45mp_cache_ids[] = {
+ { .compatible = "andestech,ax45mp-cache" },
+ { /* sentinel */ }
+};
+
+static int __init ax45mp_cache_init(void)
+{
+ struct device_node *np;
+ struct resource res;
+ int ret;
+
+ np = of_find_matching_node(NULL, ax45mp_cache_ids);
+ if (!of_device_is_available(np))
+ return -ENODEV;
+
+ ret = of_address_to_resource(np, 0, &res);
+ if (ret)
+ return ret;
+
+ /*
+ * If IOCP is present on the Andes AX45MP core riscv_cbom_block_size
+ * will be 0 for sure, so we can definitely rely on it. If
+ * riscv_cbom_block_size = 0 we don't need to handle CMO using SW any
+ * more so we just return success here and only if its being set we
+ * continue further in the probe path.
+ */
+ if (!riscv_cbom_block_size)
+ return 0;
+
+ ax45mp_priv.l2c_base = ioremap(res.start, resource_size(&res));
+ if (!ax45mp_priv.l2c_base)
+ return -ENOMEM;
+
+ ret = ax45mp_get_l2_line_size(np);
+ if (ret) {
+ iounmap(ax45mp_priv.l2c_base);
+ return ret;
+ }
+
+ riscv_noncoherent_register_cache_ops(&ax45mp_cmo_ops);
+
+ return 0;
+}
+early_initcall(ax45mp_cache_init);
--
2.25.1
From: Lad Prabhakar <[email protected]>
Add DT binding documentation for L2 cache controller found on RZ/Five SoC.
The Renesas RZ/Five microprocessor includes a RISC-V CPU Core (AX45MP
Single) from Andes. The AX45MP core has an L2 cache controller, this patch
describes the L2 cache block.
Signed-off-by: Lad Prabhakar <[email protected]>
Reviewed-by: Rob Herring <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
---
v8 -> v9
* No Change
v7 -> v8
* Updated commit header message
v6 -> v7
* No Change
v5 -> v6
* Included RB tag from Rob
v4 -> v5
* Dropped L2 cache configuration properties
* Dropped PMA configuration properties
* Ordered the required list to match the properties list
RFC v3 -> v4
* Dropped l2 cache configuration parameters
* s/larger/large
* Added minItems/maxItems for andestech,pma-regions
---
.../cache/andestech,ax45mp-cache.yaml | 81 +++++++++++++++++++
1 file changed, 81 insertions(+)
create mode 100644 Documentation/devicetree/bindings/cache/andestech,ax45mp-cache.yaml
diff --git a/Documentation/devicetree/bindings/cache/andestech,ax45mp-cache.yaml b/Documentation/devicetree/bindings/cache/andestech,ax45mp-cache.yaml
new file mode 100644
index 000000000000..9ab5f0c435d4
--- /dev/null
+++ b/Documentation/devicetree/bindings/cache/andestech,ax45mp-cache.yaml
@@ -0,0 +1,81 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+# Copyright (C) 2023 Renesas Electronics Corp.
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/cache/andestech,ax45mp-cache.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Andestech AX45MP L2 Cache Controller
+
+maintainers:
+ - Lad Prabhakar <[email protected]>
+
+description:
+ A level-2 cache (L2C) is used to improve the system performance by providing
+ a large amount of cache line entries and reasonable access delays. The L2C
+ is shared between cores, and a non-inclusive non-exclusive policy is used.
+
+select:
+ properties:
+ compatible:
+ contains:
+ enum:
+ - andestech,ax45mp-cache
+
+ required:
+ - compatible
+
+properties:
+ compatible:
+ items:
+ - const: andestech,ax45mp-cache
+ - const: cache
+
+ reg:
+ maxItems: 1
+
+ interrupts:
+ maxItems: 1
+
+ cache-line-size:
+ const: 64
+
+ cache-level:
+ const: 2
+
+ cache-sets:
+ const: 1024
+
+ cache-size:
+ enum: [131072, 262144, 524288, 1048576, 2097152]
+
+ cache-unified: true
+
+ next-level-cache: true
+
+additionalProperties: false
+
+required:
+ - compatible
+ - reg
+ - interrupts
+ - cache-line-size
+ - cache-level
+ - cache-sets
+ - cache-size
+ - cache-unified
+
+examples:
+ - |
+ #include <dt-bindings/interrupt-controller/irq.h>
+
+ cache-controller@2010000 {
+ compatible = "andestech,ax45mp-cache", "cache";
+ reg = <0x13400000 0x100000>;
+ interrupts = <508 IRQ_TYPE_LEVEL_HIGH>;
+ cache-line-size = <64>;
+ cache-level = <2>;
+ cache-sets = <1024>;
+ cache-size = <262144>;
+ cache-unified;
+ };
--
2.25.1
On Wed, Jun 14, 2023, at 12:47, Prabhakar wrote:
> From: Lad Prabhakar <[email protected]>
>
> Introduce support for nonstandard noncoherent systems in the RISC-V
> architecture. It enables function pointer support to handle cache
> management in such systems.
>
> This patch adds a new configuration option called
> "RISCV_NONSTANDARD_CACHE_OPS." This option is a boolean flag that
> depends on "RISCV_DMA_NONCOHERENT" and enables the function pointer
> support for cache management in nonstandard noncoherent systems.
>
> Signed-off-by: Lad Prabhakar <[email protected]>
I understand that Christoph will still not like this, but I think this
is as good as it gets, making the standard variant the fast path,
and using the function pointers only for the nonstandard cases.
> #include <asm/cacheflush.h>
> +#include <asm/dma-noncoherent.h>
>
> static bool noncoherent_supported;
>
> +struct riscv_cache_ops noncoherent_cache_ops = {
> + .clean = NULL,
> + .inval = NULL,
> + .flush = NULL,
> +};
This could be marked __read_mostly or __ro_after_init as
a micro-optimization, if anyone cares.
Arnd
Hi Prabhakar,
On Wed, Jun 14, 2023 at 12:48 PM Prabhakar <[email protected]> wrote:
> From: Lad Prabhakar <[email protected]>
>
> Introduce support for nonstandard noncoherent systems in the RISC-V
> architecture. It enables function pointer support to handle cache
> management in such systems.
>
> This patch adds a new configuration option called
> "RISCV_NONSTANDARD_CACHE_OPS." This option is a boolean flag that
> depends on "RISCV_DMA_NONCOHERENT" and enables the function pointer
> support for cache management in nonstandard noncoherent systems.
>
> Signed-off-by: Lad Prabhakar <[email protected]>
> ---
> v8 -> v9
> * New patch
Thanks for your patch!
> --- /dev/null
> +++ b/arch/riscv/include/asm/dma-noncoherent.h
> @@ -0,0 +1,28 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Copyright (C) 2023 Renesas Electronics Corp.
> + */
> +
> +#ifndef __ASM_DMA_NONCOHERENT_H
> +#define __ASM_DMA_NONCOHERENT_H
> +
> +#include <linux/dma-direct.h>
> +
> +/*
> + * struct riscv_cache_ops - Structure for CMO function pointers
> + *
> + * @clean: Function pointer for clean cache
> + * @inval: Function pointer for invalidate cache
> + * @flush: Function pointer for flushing the cache
> + */
> +struct riscv_cache_ops {
> + void (*clean)(phys_addr_t paddr, unsigned long size);
> + void (*inval)(phys_addr_t paddr, unsigned long size);
> + void (*flush)(phys_addr_t paddr, unsigned long size);
> +};
I guess the naming can be improved?
.clean() is used by arch_dma_cache_wback() / arch_wb_cache_pmem(),
.inval() is used by arch_dma_cache_inv() / arch_invalidate_pmem(),
.flush() is used by arch_dma_cache_wback_inv() / arch_dma_prep_coherent().
Perhaps .wback(), .inv(), .wback_inv() are more clear?
I understand this is subject to bikeshedding...
But hey, how many innocent bits of data have already been lost due
to cache semantic mismatches?
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]
In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds
On Wed, Jun 14, 2023, at 12:47, Prabhakar wrote:
> +static void ax45mp_dma_cache_inv(phys_addr_t paddr, unsigned long size)
> +{
> + unsigned long start = (unsigned long)phys_to_virt(paddr);
> + char cache_buf[2][AX45MP_CACHE_LINE_SIZE];
> + unsigned long end = start + size;
> + unsigned long old_start = start;
> + unsigned long old_end = end;
> + unsigned long line_size;
> + unsigned long flags;
> +
> + if (unlikely(start == end))
> + return;
> +
> + line_size = ax45mp_priv.ax45mp_cache_line_size;
> +
> + memset(&cache_buf, 0x0, sizeof(cache_buf));
> + start = start & (~(line_size - 1));
> + end = ((end + line_size - 1) & (~(line_size - 1)));
> +
> + local_irq_save(flags);
> + if (unlikely(start != old_start))
> + memcpy(&cache_buf[0][0], (void *)start, line_size);
> +
> + if (unlikely(end != old_end))
> + memcpy(&cache_buf[1][0], (void *)(old_end & (~(line_size - 1))),
> line_size);
> +
> + ax45mp_cpu_dcache_inval_range(start, end, line_size);
> +
> + if (unlikely(start != old_start))
> + memcpy((void *)start, &cache_buf[0][0], (old_start & (line_size -
> 1)));
> +
> + local_irq_restore(flags);
> +}
I'm not quite sure what this does, maybe some comments would help.
This looks like a novel method of preserving partial cache lines
at the beginning (but not the end?) of an unaligned area.
As far as I can tell, most dma_mapping implementations don't
even try to do that at all, but the ones that do take a different
approach by calling _wback_inv() on partial cache lines instead
of calling _inv().
I'd say this does not belong into the low-level operations
here, especially since the ZICBOM variant doesn't do this either.
Arnd
On Wed, Jun 14, 2023 at 02:53:26PM +0200, Geert Uytterhoeven wrote:
> Hi Prabhakar,
>
> On Wed, Jun 14, 2023 at 12:48 PM Prabhakar <[email protected]> wrote:
> > From: Lad Prabhakar <[email protected]>
> >
> > Introduce support for nonstandard noncoherent systems in the RISC-V
> > architecture. It enables function pointer support to handle cache
> > management in such systems.
> >
> > This patch adds a new configuration option called
> > "RISCV_NONSTANDARD_CACHE_OPS." This option is a boolean flag that
> > depends on "RISCV_DMA_NONCOHERENT" and enables the function pointer
> > support for cache management in nonstandard noncoherent systems.
> >
> > Signed-off-by: Lad Prabhakar <[email protected]>
> > ---
> > v8 -> v9
> > * New patch
>
> Thanks for your patch!
>
> > --- /dev/null
> > +++ b/arch/riscv/include/asm/dma-noncoherent.h
> > @@ -0,0 +1,28 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Copyright (C) 2023 Renesas Electronics Corp.
> > + */
> > +
> > +#ifndef __ASM_DMA_NONCOHERENT_H
> > +#define __ASM_DMA_NONCOHERENT_H
> > +
> > +#include <linux/dma-direct.h>
> > +
> > +/*
> > + * struct riscv_cache_ops - Structure for CMO function pointers
> > + *
> > + * @clean: Function pointer for clean cache
> > + * @inval: Function pointer for invalidate cache
> > + * @flush: Function pointer for flushing the cache
> > + */
> > +struct riscv_cache_ops {
> > + void (*clean)(phys_addr_t paddr, unsigned long size);
> > + void (*inval)(phys_addr_t paddr, unsigned long size);
> > + void (*flush)(phys_addr_t paddr, unsigned long size);
> > +};
>
> I guess the naming can be improved?
>
> .clean() is used by arch_dma_cache_wback() / arch_wb_cache_pmem(),
> .inval() is used by arch_dma_cache_inv() / arch_invalidate_pmem(),
> .flush() is used by arch_dma_cache_wback_inv() / arch_dma_prep_coherent().
>
> Perhaps .wback(), .inv(), .wback_inv() are more clear?
>
> I understand this is subject to bikeshedding...
> But hey, how many innocent bits of data have already been lost due
> to cache semantic mismatches?
Given this is based on Arnd's stuff, +1 on the bikeshed. The names have
been chosen I guess to match the CBOM extensions naming.
Otherwise, I had it in my head that the next revision of this was going
to take patches 8 & 9 from Arnd's series, to align the semantics. Not
that it really bothers me, just means this will have to wait for the
cross-arch series, when pretty sure Arnd suggested not depending on that
any more... Am I missing something Prabhakar?
Other than the bikeshedding, no objections.
Reviewed-by: Conor Dooley <[email protected]>
On Wed, Jun 14, 2023 at 11:47:53AM +0100, Prabhakar wrote:
> From: Lad Prabhakar <[email protected]>
>
> Hi All,
>
> non-coherent DMA support for AX45MP
> ====================================
>
> On the Andes AX45MP core, cache coherency is a specification option so it
> may not be supported. In this case DMA will fail. To get around with this
> issue this patch series does the below:
>
> 1] Andes alternative ports is implemented as errata which checks if the IOCP
> is missing and only then applies to CMO errata. One vendor specific SBI EXT
> (ANDES_SBI_EXT_IOCP_SW_WORKAROUND) is implemented as part of errata.
>
> Below are the configs which Andes port provides (and are selected by RZ/Five):
> - ERRATA_ANDES
> - ERRATA_ANDES_CMO
>
> OpenSBI patch supporting ANDES_SBI_EXT_IOCP_SW_WORKAROUND SBI can be found here,
> https://patchwork.ozlabs.org/project/opensbi/patch/[email protected]/
>
> 2] Andes AX45MP core has a Programmable Physical Memory Attributes (PMA)
> block that allows dynamic adjustment of memory attributes in the runtime.
> It contains a configurable amount of PMA entries implemented as CSR
> registers to control the attributes of memory locations in interest.
> OpenSBI configures the PMA regions as required and creates a reserve memory
> node and propagates it to the higher boot stack.
>
> Currently OpenSBI (upstream) configures the required PMA region and passes
> this a shared DMA pool to Linux.
>
> reserved-memory {
> #address-cells = <2>;
> #size-cells = <2>;
> ranges;
>
> pma_resv0@58000000 {
> compatible = "shared-dma-pool";
> reg = <0x0 0x58000000 0x0 0x08000000>;
> no-map;
> linux,dma-default;
> };
> };
>
> The above shared DMA pool gets appended to Linux DTB so the DMA memory
> requests go through this region.
>
> 3] We provide callbacks to synchronize specific content between memory and
> cache.
>
> 4] RZ/Five SoC selects the below configs
> - AX45MP_L2_CACHE
> - DMA_GLOBAL_POOL
> - ERRATA_ANDES
> - ERRATA_ANDES_CMO
>
> ----------x---------------------x--------------------x---------------x--------------
>
> Note,
> - Ive used GCC 12.2.0 for compilation
> - Tested all the IP blocks on RZ/Five which use DMA
> - Patch series is dependent on the series from Arnd,
> https://patchwork.kernel.org/project/linux-riscv/cover/[email protected]/.
> (Ive rebased Arnd's series on v6.4-rc-1)
> - Patches applies on top of palmer/for-next (255b34d799dd)
> - Ive pushed the complete tree here https://github.com/prabhakarlad/linux/tree/rzfive-cmo-v9
> - Previously the function pointer approach was NAKed by Christoph Hellwig but based on the discussion
> on #riscv Ive implemented this approach.
Last time around you wanted someone to try this on a d1. I have done &
seems to work just as well as it did before. For where it is relevant:
Tested-by: Conor Dooley <[email protected]> # tyre-kicking on a d1
Cheers,
Conor.
On Wed, Jun 14, 2023 at 02:35:44PM +0200, Arnd Bergmann wrote:
> I understand that Christoph will still not like this, but I think this
> is as good as it gets, making the standard variant the fast path,
> and using the function pointers only for the nonstandard cases.
Yes. And I really do not want to make the non-standard cases easy.
The extension to do cache maintenance have been ratified for a while
and under discussion for much longer. Adding random crap because
vendors suck is a bad idea, and we should not be adding this crap.
Hi Arnd,
Thank you for the review.
On Wed, Jun 14, 2023 at 1:36 PM Arnd Bergmann <[email protected]> wrote:
>
> On Wed, Jun 14, 2023, at 12:47, Prabhakar wrote:
> > From: Lad Prabhakar <[email protected]>
> >
> > Introduce support for nonstandard noncoherent systems in the RISC-V
> > architecture. It enables function pointer support to handle cache
> > management in such systems.
> >
> > This patch adds a new configuration option called
> > "RISCV_NONSTANDARD_CACHE_OPS." This option is a boolean flag that
> > depends on "RISCV_DMA_NONCOHERENT" and enables the function pointer
> > support for cache management in nonstandard noncoherent systems.
> >
> > Signed-off-by: Lad Prabhakar <[email protected]>
>
> I understand that Christoph will still not like this, but I think this
> is as good as it gets, making the standard variant the fast path,
> and using the function pointers only for the nonstandard cases.
>
>
> > #include <asm/cacheflush.h>
> > +#include <asm/dma-noncoherent.h>
> >
> > static bool noncoherent_supported;
> >
> > +struct riscv_cache_ops noncoherent_cache_ops = {
> > + .clean = NULL,
> > + .inval = NULL,
> > + .flush = NULL,
> > +};
>
> This could be marked __read_mostly or __ro_after_init as
> a micro-optimization, if anyone cares.
>
Ok, I will do that in the next version.
Cheers,
Prabhakar
Hi Geert,
Thank you for the review.
On Wed, Jun 14, 2023 at 1:53 PM Geert Uytterhoeven <[email protected]> wrote:
>
> Hi Prabhakar,
>
> On Wed, Jun 14, 2023 at 12:48 PM Prabhakar <[email protected]> wrote:
> > From: Lad Prabhakar <[email protected]>
> >
> > Introduce support for nonstandard noncoherent systems in the RISC-V
> > architecture. It enables function pointer support to handle cache
> > management in such systems.
> >
> > This patch adds a new configuration option called
> > "RISCV_NONSTANDARD_CACHE_OPS." This option is a boolean flag that
> > depends on "RISCV_DMA_NONCOHERENT" and enables the function pointer
> > support for cache management in nonstandard noncoherent systems.
> >
> > Signed-off-by: Lad Prabhakar <[email protected]>
> > ---
> > v8 -> v9
> > * New patch
>
> Thanks for your patch!
>
> > --- /dev/null
> > +++ b/arch/riscv/include/asm/dma-noncoherent.h
> > @@ -0,0 +1,28 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Copyright (C) 2023 Renesas Electronics Corp.
> > + */
> > +
> > +#ifndef __ASM_DMA_NONCOHERENT_H
> > +#define __ASM_DMA_NONCOHERENT_H
> > +
> > +#include <linux/dma-direct.h>
> > +
> > +/*
> > + * struct riscv_cache_ops - Structure for CMO function pointers
> > + *
> > + * @clean: Function pointer for clean cache
> > + * @inval: Function pointer for invalidate cache
> > + * @flush: Function pointer for flushing the cache
> > + */
> > +struct riscv_cache_ops {
> > + void (*clean)(phys_addr_t paddr, unsigned long size);
> > + void (*inval)(phys_addr_t paddr, unsigned long size);
> > + void (*flush)(phys_addr_t paddr, unsigned long size);
> > +};
>
> I guess the naming can be improved?
>
> .clean() is used by arch_dma_cache_wback() / arch_wb_cache_pmem(),
> .inval() is used by arch_dma_cache_inv() / arch_invalidate_pmem(),
> .flush() is used by arch_dma_cache_wback_inv() / arch_dma_prep_coherent().
>
> Perhaps .wback(), .inv(), .wback_inv() are more clear?
>
Ok I will update pointer names in the next version.
Cheers,
Prabhakar
Hi Arnd,
Thank you for the review.
On Wed, Jun 14, 2023 at 2:11 PM Arnd Bergmann <[email protected]> wrote:
>
> On Wed, Jun 14, 2023, at 12:47, Prabhakar wrote:
>
> > +static void ax45mp_dma_cache_inv(phys_addr_t paddr, unsigned long size)
> > +{
> > + unsigned long start = (unsigned long)phys_to_virt(paddr);
> > + char cache_buf[2][AX45MP_CACHE_LINE_SIZE];
> > + unsigned long end = start + size;
> > + unsigned long old_start = start;
> > + unsigned long old_end = end;
> > + unsigned long line_size;
> > + unsigned long flags;
> > +
> > + if (unlikely(start == end))
> > + return;
> > +
> > + line_size = ax45mp_priv.ax45mp_cache_line_size;
> > +
> > + memset(&cache_buf, 0x0, sizeof(cache_buf));
> > + start = start & (~(line_size - 1));
> > + end = ((end + line_size - 1) & (~(line_size - 1)));
> > +
> > + local_irq_save(flags);
> > + if (unlikely(start != old_start))
> > + memcpy(&cache_buf[0][0], (void *)start, line_size);
> > +
> > + if (unlikely(end != old_end))
> > + memcpy(&cache_buf[1][0], (void *)(old_end & (~(line_size - 1))),
> > line_size);
> > +
> > + ax45mp_cpu_dcache_inval_range(start, end, line_size);
> > +
> > + if (unlikely(start != old_start))
> > + memcpy((void *)start, &cache_buf[0][0], (old_start & (line_size -
> > 1)));
> > +
> > + local_irq_restore(flags);
> > +}
>
> I'm not quite sure what this does, maybe some comments would help.
>
> This looks like a novel method of preserving partial cache lines
> at the beginning (but not the end?) of an unaligned area.
>
I missed this earlier, (I removed the preserving of cache line from
then end and left the beginning part. Samuel pointed to earlier on
these patches if the cache-line-size is 64 we dont need to do this).
Preserving cache lines is not needed at all, Ive verified this and
just doing a _inval() is sufficient. I'll update this and send a new
version.
Cheers,
Prabhakar
> As far as I can tell, most dma_mapping implementations don't
> even try to do that at all, but the ones that do take a different
> approach by calling _wback_inv() on partial cache lines instead
> of calling _inv().
>
> I'd say this does not belong into the low-level operations
> here, especially since the ZICBOM variant doesn't do this either.
>
> Arnd