2022-11-24 17:34:32

by Lad, Prabhakar

Subject: [PATCH v4 0/7] AX45MP: Add support to non-coherent DMA

From: Lad Prabhakar <[email protected]>

Hi All,

On the Andes AX45MP core, cache coherency is a specification option, so it
may not be supported. In that case DMA will fail. To work around this
issue, this patch series does the following:

1] The Andes AX45MP core has a Programmable Physical Memory Attributes (PMA)
block that allows memory attributes to be adjusted dynamically at runtime.
It contains a configurable number of PMA entries, implemented as CSR
registers, that control the attributes of the memory locations of interest.
The PMA regions are passed from the l2 node and are configured as
non-cacheable + bufferable with the SBI call.

l2cache: [email protected] {
....
andestech,pma-regions = <0x58000000 0x08000000
(AX45MP_PMACFG_ETYP_NAPOT |
AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF)>;
....
};
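
The AX45MP_PMACFG_ETYP_NAPOT flag used above suggests that each PMA region
is expected to be a naturally aligned power-of-two range. Assuming that
reading is correct, a region can be sanity-checked as in the minimal sketch
below. The helper name pma_region_is_napot() is hypothetical and is not
part of this series; any additional constraints the hardware may impose
(for example a minimum region size) are not modelled.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the series): returns true when
 * [base, base + size) is a naturally aligned power-of-two region,
 * i.e. size is a non-zero power of two and base is a multiple of size.
 */
bool pma_region_is_napot(uint64_t base, uint64_t size)
{
	/* Size must be a non-zero power of two. */
	if (size == 0 || (size & (size - 1)) != 0)
		return false;

	/* Base must be aligned to the region size. */
	return (base & (size - 1)) == 0;
}
```

With the values from the example node, base 0x58000000 and size 0x08000000
pass this check, since 0x58000000 is a multiple of 0x08000000.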

2] We provide callbacks to synchronize specific content between memory and
cache.

- arch_sync_dma_for_device()
- arch_sync_dma_for_cpu()
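
These callbacks end up operating on whole cache lines. The rounding used by
the driver in patch 7/7 (start rounded down, end rounded up to the line
size) can be sketched in plain C as follows. The struct and function names
are illustrative only and do not appear in the series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Illustrative type: a byte range expanded to cache-line boundaries. */
struct line_range {
	uintptr_t start;	/* first byte of the first line touched */
	uintptr_t end;		/* one past the last byte of the last line */
};

/*
 * Round [addr, addr + size) outwards to multiples of line_size.
 * line_size must be a non-zero power of two (64 on RZ/Five per the series).
 */
struct line_range cmo_round_to_lines(uintptr_t addr, size_t size,
				     uintptr_t line_size)
{
	struct line_range r;

	assert(line_size != 0 && (line_size & (line_size - 1)) == 0);

	r.start = addr & ~(line_size - 1);
	r.end = (addr + size + line_size - 1) & ~(line_size - 1);
	return r;
}
```

For example, a 0x20-byte buffer at 0x1010 with 64-byte lines expands to the
single line 0x1000..0x1040.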

Below are the configs that are enabled:

- DMA_GLOBAL_POOL
- RISCV_DMA_NONCOHERENT

3] We reserve the shared DMA pool, so the DMA memory requests go through
this pool:

reserved-memory {
#address-cells = <2>;
#size-cells = <2>;
ranges;

reserved: linux,[email protected] {
compatible = "shared-dma-pool";
no-map;
linux,dma-default;
reg = <0x0 0x58000000 0x0 0x08000000>;
};
};
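
In the examples in this cover letter the shared DMA pool and the PMA region
use the same base and size (0x58000000, 0x08000000). Assuming the intent is
that the pool must lie entirely inside a region marked non-cacheable, the
relationship can be checked with a small containment test. This is only a
sketch; region_contains() is a hypothetical name, not something the series
provides.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: true when [inner_base, inner_base + inner_size)
 * lies entirely within [outer_base, outer_base + outer_size).
 * Written with subtractions so it cannot overflow.
 */
bool region_contains(uint64_t outer_base, uint64_t outer_size,
		     uint64_t inner_base, uint64_t inner_size)
{
	if (inner_base < outer_base || inner_size > outer_size)
		return false;

	/* Offset of inner within outer must leave room for inner_size. */
	return (inner_base - outer_base) <= (outer_size - inner_size);
}
```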


Below is the L2 cache DT node:

l2cache: [email protected] {
compatible = "andestech,ax45mp-cache", "cache";
cache-size = <0x40000>;
cache-line-size = <64>;
cache-sets = <1024>;
cache-unified;
reg = <0x0 0x13400000 0x0 0x100000>;
andestech,pma-regions = <0x0 0x58000000 0x0 0x08000000 0x0
(AX45MP_PMACFG_ETYP_NAPOT |
AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF)>;
interrupts = <SOC_PERIPHERAL_IRQ(476, IRQ_TYPE_LEVEL_HIGH)>;
};

This approach requires custom SBI calls, which have been implemented. The
above implementation is in preparation for adding support for the Renesas
RZ/Five SoC, which uses the AX45MP core. Note that with this approach the
kernel image might not be generic enough to be used on other platforms.

The OpenSBI implementation isn't upstreamed yet; a public repo is
available at [0].

[0] https://github.com/renesas-rz/rz_opensbi/tree/work/OpenSBI-PMA

Note:
- This series requires testing on cores with Zicbom and on T-Head SoCs
- I've used GCC 9.4.0 for compilation
- Tested all the IP blocks on RZ/Five which use DMA

RFC v3 -> v4
* Implemented ALTERNATIVE_3() macro
* Now using runtime patching mechanism instead of compile time config
* Added Andes CMO as an errata
* Addressed comments pointed out by Geert

RFC v2-> RFC v3
* Addressed review comments pointed out by Conor
* Move DT binding into cache folder
* Fixed DT binding check issue
* Added andestech,ax45mp-cache.h header file
* Now passing the flags for the PMA setup as part of andestech,pma-regions
property.
* Added andestech,inst/data-prefetch and andestech,tag/data-ram-ctl
properties to configure the L2 cache.
* Registered the cache driver as platform driver

RFC v1-> RFC v2
* Moved the code out of arch/riscv to drivers/soc/renesas
* Now handling the PMA setup as part of the L2 cache
* Now making use of dma-noncoherent.c instead of an SoC-specific implementation.
* Dropped arch_dma_alloc() and arch_dma_free()
* Switched to RISCV_DMA_NONCOHERENT
* Included DT binding doc

RFC v2: https://patchwork.kernel.org/project/linux-renesas-soc/cover/[email protected]/
RFC v1: https://patchwork.kernel.org/project/linux-renesas-soc/cover/[email protected]/

Cheers,
Prabhakar

Lad Prabhakar (7):
riscv: asm: alternative-macros: Introduce ALTERNATIVE_3() macro
riscv: asm: vendorid_list: Add Andes Technology to the vendors list
riscv: errata: Add Andes alternative ports
riscv: errata: andes: Fix auipc-jalr addresses in patched alternatives
riscv: mm: dma-noncoherent: Pass direction and operation to
ALT_CMO_OP()
dt-bindings: cache: r9a07g043f-l2-cache: Add DT binding documentation
for L2 cache controller
soc: renesas: Add L2 cache management for RZ/Five SoC

.../cache/andestech,ax45mp-cache.yaml | 93 ++++
arch/riscv/Kconfig.erratas | 22 +
arch/riscv/errata/Makefile | 1 +
arch/riscv/errata/andes/Makefile | 1 +
arch/riscv/errata/andes/errata.c | 139 ++++++
arch/riscv/include/asm/alternative-macros.h | 94 ++++
arch/riscv/include/asm/alternative.h | 3 +
arch/riscv/include/asm/cacheflush.h | 12 +
arch/riscv/include/asm/errata_list.h | 45 +-
arch/riscv/include/asm/vendorid_list.h | 1 +
arch/riscv/kernel/alternative.c | 5 +
arch/riscv/mm/dma-noncoherent.c | 15 +-
drivers/soc/renesas/Kconfig | 7 +
drivers/soc/renesas/Makefile | 2 +
drivers/soc/renesas/rzfive/Kconfig | 6 +
drivers/soc/renesas/rzfive/Makefile | 3 +
drivers/soc/renesas/rzfive/ax45mp_cache.c | 415 ++++++++++++++++++
drivers/soc/renesas/rzfive/ax45mp_sbi.h | 29 ++
.../cache/andestech,ax45mp-cache.h | 38 ++
19 files changed, 918 insertions(+), 13 deletions(-)
create mode 100644 Documentation/devicetree/bindings/cache/andestech,ax45mp-cache.yaml
create mode 100644 arch/riscv/errata/andes/Makefile
create mode 100644 arch/riscv/errata/andes/errata.c
create mode 100644 drivers/soc/renesas/rzfive/Kconfig
create mode 100644 drivers/soc/renesas/rzfive/Makefile
create mode 100644 drivers/soc/renesas/rzfive/ax45mp_cache.c
create mode 100644 drivers/soc/renesas/rzfive/ax45mp_sbi.h
create mode 100644 include/dt-bindings/cache/andestech,ax45mp-cache.h

--
2.25.1


2022-11-24 17:34:36

by Lad, Prabhakar

Subject: [PATCH v4 5/7] riscv: mm: dma-noncoherent: Pass direction and operation to ALT_CMO_OP()

From: Lad Prabhakar <[email protected]>

Pass direction and operation to ALT_CMO_OP() macro.

This is in preparation for adding errata for the Andes CPU core.

Signed-off-by: Lad Prabhakar <[email protected]>
---
RFC v3 -> v4
* New patch
---
arch/riscv/include/asm/cacheflush.h | 4 ++++
arch/riscv/include/asm/errata_list.h | 8 ++++++--
arch/riscv/mm/dma-noncoherent.c | 15 ++++++++++-----
3 files changed, 20 insertions(+), 7 deletions(-)

diff --git a/arch/riscv/include/asm/cacheflush.h b/arch/riscv/include/asm/cacheflush.h
index f6fbe7042f1c..4a04d1be7c67 100644
--- a/arch/riscv/include/asm/cacheflush.h
+++ b/arch/riscv/include/asm/cacheflush.h
@@ -8,6 +8,10 @@

#include <linux/mm.h>

+#define NON_COHERENT_SYNC_DMA_FOR_DEVICE 0
+#define NON_COHERENT_SYNC_DMA_FOR_CPU 1
+#define NON_COHERENT_DMA_PREP 2
+
static inline void local_flush_icache_all(void)
{
asm volatile ("fence.i" ::: "memory");
diff --git a/arch/riscv/include/asm/errata_list.h b/arch/riscv/include/asm/errata_list.h
index 2ba7e6e74540..48e899a8e7a9 100644
--- a/arch/riscv/include/asm/errata_list.h
+++ b/arch/riscv/include/asm/errata_list.h
@@ -124,7 +124,7 @@ asm volatile(ALTERNATIVE( \
#define THEAD_flush_A0 ".long 0x0275000b"
#define THEAD_SYNC_S ".long 0x0190000b"

-#define ALT_CMO_OP(_op, _start, _size, _cachesize) \
+#define ALT_CMO_OP(_op, _start, _size, _cachesize, _dir, _ops) \
asm volatile(ALTERNATIVE_2( \
__nops(6), \
"mv a0, %1\n\t" \
@@ -146,7 +146,11 @@ asm volatile(ALTERNATIVE_2( \
ERRATA_THEAD_CMO, CONFIG_ERRATA_THEAD_CMO) \
: : "r"(_cachesize), \
"r"((unsigned long)(_start) & ~((_cachesize) - 1UL)), \
- "r"((unsigned long)(_start) + (_size)) \
+ "r"((unsigned long)(_start) + (_size)), \
+ "r"((unsigned long)(_start)), \
+ "r"((unsigned long)(_size)), \
+ "r"((unsigned long)(_dir)), \
+ "r"((unsigned long)(_ops)) \
: "a0")

#define THEAD_C9XX_RV_IRQ_PMU 17
diff --git a/arch/riscv/mm/dma-noncoherent.c b/arch/riscv/mm/dma-noncoherent.c
index d919efab6eba..e2b82034f504 100644
--- a/arch/riscv/mm/dma-noncoherent.c
+++ b/arch/riscv/mm/dma-noncoherent.c
@@ -19,13 +19,16 @@ void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,

switch (dir) {
case DMA_TO_DEVICE:
- ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
+ ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size,
+ dir, NON_COHERENT_SYNC_DMA_FOR_DEVICE);
break;
case DMA_FROM_DEVICE:
- ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
+ ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size,
+ dir, NON_COHERENT_SYNC_DMA_FOR_DEVICE);
break;
case DMA_BIDIRECTIONAL:
- ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
+ ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size,
+ dir, NON_COHERENT_SYNC_DMA_FOR_DEVICE);
break;
default:
break;
@@ -42,7 +45,8 @@ void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
break;
case DMA_FROM_DEVICE:
case DMA_BIDIRECTIONAL:
- ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
+ ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size,
+ dir, NON_COHERENT_SYNC_DMA_FOR_CPU);
break;
default:
break;
@@ -53,7 +57,8 @@ void arch_dma_prep_coherent(struct page *page, size_t size)
{
void *flush_addr = page_address(page);

- ALT_CMO_OP(flush, flush_addr, size, riscv_cbom_block_size);
+ ALT_CMO_OP(flush, flush_addr, size, riscv_cbom_block_size,
+ 0, NON_COHERENT_DMA_PREP);
}

void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
--
2.25.1

2022-11-24 17:35:21

by Lad, Prabhakar

Subject: [PATCH v4 1/7] riscv: asm: alternative-macros: Introduce ALTERNATIVE_3() macro

From: Lad Prabhakar <[email protected]>

Introduce ALTERNATIVE_3() macro.

Signed-off-by: Lad Prabhakar <[email protected]>
---
RFC v3 -> v4
* New patch
---
arch/riscv/include/asm/alternative-macros.h | 94 +++++++++++++++++++++
1 file changed, 94 insertions(+)

diff --git a/arch/riscv/include/asm/alternative-macros.h b/arch/riscv/include/asm/alternative-macros.h
index ec2f3f1b836f..1caf4306b3d6 100644
--- a/arch/riscv/include/asm/alternative-macros.h
+++ b/arch/riscv/include/asm/alternative-macros.h
@@ -69,6 +69,34 @@
new_c_2, vendor_id_2, errata_id_2, \
IS_ENABLED(CONFIG_k_2)

+.macro __ALTERNATIVE_CFG_3 old_c, new_c_1, vendor_id_1, errata_id_1, enable_1, \
+ new_c_2, vendor_id_2, errata_id_2, enable_2, \
+ new_c_3, vendor_id_3, errata_id_3, enable_3
+886 :
+ .option push
+ .option norvc
+ .option norelax
+ \old_c
+ .option pop
+887 :
+ ALT_NEW_CONTENT \vendor_id_1, \errata_id_1, \enable_1, \new_c_1
+ ALT_NEW_CONTENT \vendor_id_2, \errata_id_2, \enable_2, \new_c_2
+ ALT_NEW_CONTENT \vendor_id_3, \errata_id_3, \enable_3, \new_c_3
+.endm
+
+#define _ALTERNATIVE_CFG_3(old_c, new_c_1, vendor_id_1, errata_id_1, \
+ CONFIG_k_1, \
+ new_c_2, vendor_id_2, errata_id_2, \
+ CONFIG_k_2, \
+ new_c_3, vendor_id_3, errata_id_3, \
+ CONFIG_k_3) \
+ __ALTERNATIVE_CFG_3 old_c, new_c_1, vendor_id_1, errata_id_1, \
+ IS_ENABLED(CONFIG_k_1), \
+ new_c_2, vendor_id_2, errata_id_2, \
+ IS_ENABLED(CONFIG_k_2), \
+ new_c_3, vendor_id_3, errata_id_3, \
+ IS_ENABLED(CONFIG_k_3)
+
#else /* !__ASSEMBLY__ */

#include <asm/asm.h>
@@ -135,6 +163,36 @@
new_c_2, vendor_id_2, errata_id_2, \
IS_ENABLED(CONFIG_k_2))

+#define __ALTERNATIVE_CFG_3(old_c, new_c_1, vendor_id_1, errata_id_1, \
+ enable_1, \
+ new_c_2, vendor_id_2, errata_id_2, \
+ enable_2, \
+ new_c_3, vendor_id_3, errata_id_3, \
+ enable_3) \
+ "886 :\n" \
+ ".option push\n" \
+ ".option norvc\n" \
+ ".option norelax\n" \
+ old_c "\n" \
+ ".option pop\n" \
+ "887 :\n" \
+ ALT_NEW_CONTENT(vendor_id_1, errata_id_1, enable_1, new_c_1) \
+ ALT_NEW_CONTENT(vendor_id_2, errata_id_2, enable_2, new_c_2) \
+ ALT_NEW_CONTENT(vendor_id_3, errata_id_3, enable_3, new_c_3)
+
+#define _ALTERNATIVE_CFG_3(old_c, new_c_1, vendor_id_1, errata_id_1, \
+ CONFIG_k_1, \
+ new_c_2, vendor_id_2, errata_id_2, \
+ CONFIG_k_2, \
+ new_c_3, vendor_id_3, errata_id_3, \
+ CONFIG_k_3) \
+ __ALTERNATIVE_CFG_3(old_c, new_c_1, vendor_id_1, errata_id_1, \
+ IS_ENABLED(CONFIG_k_1), \
+ new_c_2, vendor_id_2, errata_id_2, \
+ IS_ENABLED(CONFIG_k_2), \
+ new_c_3, vendor_id_3, errata_id_3, \
+ IS_ENABLED(CONFIG_k_3))
+
#endif /* __ASSEMBLY__ */

#else /* CONFIG_RISCV_ALTERNATIVE */
@@ -153,6 +211,14 @@
CONFIG_k_2) \
__ALTERNATIVE_CFG old_c

+#define _ALTERNATIVE_CFG_3(old_c, new_c_1, vendor_id_1, errata_id_1, \
+ CONFIG_k_1, \
+ new_c_2, vendor_id_2, errata_id_2, \
+ CONFIG_k_2, \
+ new_c_3, vendor_id_3, errata_id_3, \
+ CONFIG_k_3) \
+ __ALTERNATIVE_CFG old_c
+
#else /* !__ASSEMBLY__ */

#define __ALTERNATIVE_CFG(old_c) \
@@ -167,6 +233,14 @@
CONFIG_k_2) \
__ALTERNATIVE_CFG(old_c)

+#define _ALTERNATIVE_CFG_3(old_c, new_c_1, vendor_id_1, errata_id_1, \
+ CONFIG_k_1, \
+ new_c_2, vendor_id_2, errata_id_2, \
+ CONFIG_k_2, \
+ new_c_3, vendor_id_3, errata_id_3, \
+ CONFIG_k_3) \
+ __ALTERNATIVE_CFG(old_c)
+
#endif /* __ASSEMBLY__ */
#endif /* CONFIG_RISCV_ALTERNATIVE */

@@ -202,4 +276,24 @@
new_content_2, vendor_id_2, \
errata_id_2, CONFIG_k_2)

+/*
+ * A vendor wants to replace an old_content, but another vendor has used
+ * ALTERNATIVE_2() to patch its customized content at the same location. In
+ * this case, this vendor can create a new macro ALTERNATIVE_3() based
+ * on the following sample code and then replace ALTERNATIVE_2() with
+ * ALTERNATIVE_3() to append its customized content.
+ */
+#define ALTERNATIVE_3(old_content, new_content_1, vendor_id_1, \
+ errata_id_1, CONFIG_k_1, \
+ new_content_2, vendor_id_2, \
+ errata_id_2, CONFIG_k_2, \
+ new_content_3, vendor_id_3, \
+ errata_id_3, CONFIG_k_3) \
+ _ALTERNATIVE_CFG_3(old_content, new_content_1, vendor_id_1, \
+ errata_id_1, CONFIG_k_1, \
+ new_content_2, vendor_id_2, \
+ errata_id_2, CONFIG_k_2, \
+ new_content_3, vendor_id_3, \
+ errata_id_3, CONFIG_k_3)
+
#endif
--
2.25.1

2022-11-24 17:35:40

by Lad, Prabhakar

Subject: [PATCH v4 7/7] soc: renesas: Add L2 cache management for RZ/Five SoC

From: Lad Prabhakar <[email protected]>

On the AX45MP core, cache coherency is a specification option, so it may
not be supported. In that case DMA will fail. As a workaround, we first
allocate a global DMA coherent pool from which DMA allocations are taken;
the pool is marked as non-cacheable + bufferable using the PMA region
specified in the device tree. Synchronization callbacks are implemented to
keep memory and cache in sync when doing DMA transactions.
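
The decision made by ax45mp_no_iocp_cmo() in this patch (which cache
operation to run for a given direction and call site) can be modelled in
userspace as below. This is a sketch for illustration only: the SKETCH_*
names are invented here, the direction values merely stand in for the
kernel's enum dma_data_direction, and the real function performs the cache
operations rather than returning an action.

```c
/* Stand-ins for the kernel's DMA directions (illustrative names). */
enum sketch_dir {
	SKETCH_DMA_BIDIRECTIONAL,
	SKETCH_DMA_TO_DEVICE,
	SKETCH_DMA_FROM_DEVICE,
};

/* Mirrors the NON_COHERENT_* values added to cacheflush.h in patch 5/7. */
enum sketch_ops {
	SKETCH_SYNC_DMA_FOR_DEVICE = 0,
	SKETCH_SYNC_DMA_FOR_CPU = 1,
	SKETCH_DMA_PREP = 2,
};

enum sketch_action {
	SKETCH_NOTHING,
	SKETCH_WRITEBACK,
	SKETCH_INVALIDATE,
};

/* Return the cache operation the driver would perform for (dir, ops). */
enum sketch_action sketch_cmo_action(enum sketch_dir dir, enum sketch_ops ops)
{
	/* arch_dma_prep_coherent(): nothing to do on this core. */
	if (ops == SKETCH_DMA_PREP)
		return SKETCH_NOTHING;

	if (ops == SKETCH_SYNC_DMA_FOR_DEVICE) {
		switch (dir) {
		case SKETCH_DMA_FROM_DEVICE:
			return SKETCH_INVALIDATE;
		case SKETCH_DMA_TO_DEVICE:
		case SKETCH_DMA_BIDIRECTIONAL:
			return SKETCH_WRITEBACK;
		}
		return SKETCH_NOTHING;
	}

	/* Sync for CPU: only data coming from the device needs invalidating. */
	if (dir == SKETCH_DMA_BIDIRECTIONAL || dir == SKETCH_DMA_FROM_DEVICE)
		return SKETCH_INVALIDATE;

	return SKETCH_NOTHING;
}
```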

The Andes AX45MP core has a Programmable Physical Memory Attributes (PMA)
block that allows memory attributes to be adjusted dynamically at runtime.
It contains a configurable number of PMA entries, implemented as CSR
registers, that control the attributes of the memory locations of interest.

Below are the memory attributes supported:
* Device, Non-bufferable
* Device, Bufferable
* Memory, Non-cacheable, Non-bufferable
* Memory, Non-cacheable, Bufferable
* Memory, Write-back, No-allocate
* Memory, Write-back, Read-allocate
* Memory, Write-back, Write-allocate
* Memory, Write-back, Read and Write-allocate

This patch adds support to configure the memory attributes of the memory
regions as passed from the l2 cache node and exposes the cache management
ops.

More info about PMA (section 10.3):
Link: http://www.andestech.com/wp-content/uploads/AX45MP-1C-Rev.-5.0.0-Datasheet.pdf

Signed-off-by: Lad Prabhakar <[email protected]>
---
RFC v3 -> v4
* Made use of runtime patching instead of compile time
* Now just exposing single function ax45mp_no_iocp_cmo() for CMO handling
* Added a check to make sure cache line size is always 64 bytes
* Renamed folder rzf -> rzfive
* Improved Kconfig description
* Dropped L2 cache configuration
* Dropped unnecessary casts
* Addressed comments pointed out by Geert, apart from the use of PTR_ALIGN_XYZ() macros.
---
arch/riscv/include/asm/cacheflush.h | 8 +
arch/riscv/include/asm/errata_list.h | 32 +-
drivers/soc/renesas/Kconfig | 7 +
drivers/soc/renesas/Makefile | 2 +
drivers/soc/renesas/rzfive/Kconfig | 6 +
drivers/soc/renesas/rzfive/Makefile | 3 +
drivers/soc/renesas/rzfive/ax45mp_cache.c | 415 ++++++++++++++++++++++
drivers/soc/renesas/rzfive/ax45mp_sbi.h | 29 ++
8 files changed, 496 insertions(+), 6 deletions(-)
create mode 100644 drivers/soc/renesas/rzfive/Kconfig
create mode 100644 drivers/soc/renesas/rzfive/Makefile
create mode 100644 drivers/soc/renesas/rzfive/ax45mp_cache.c
create mode 100644 drivers/soc/renesas/rzfive/ax45mp_sbi.h

diff --git a/arch/riscv/include/asm/cacheflush.h b/arch/riscv/include/asm/cacheflush.h
index 4a04d1be7c67..3226f3aceafe 100644
--- a/arch/riscv/include/asm/cacheflush.h
+++ b/arch/riscv/include/asm/cacheflush.h
@@ -61,6 +61,14 @@ static inline void riscv_noncoherent_supported(void) {}
#define SYS_RISCV_FLUSH_ICACHE_LOCAL 1UL
#define SYS_RISCV_FLUSH_ICACHE_ALL (SYS_RISCV_FLUSH_ICACHE_LOCAL)

+#ifdef CONFIG_AX45MP_L2_CACHE
+extern asmlinkage void ax45mp_no_iocp_cmo(unsigned int cache_size, void *vaddr,
+ size_t size, int dir, int ops);
+#else
+static inline void ax45mp_no_iocp_cmo(unsigned int cache_size, void *vaddr,
+ size_t size, int dir, int ops) {}
+#endif
+
#include <asm-generic/cacheflush.h>

#endif /* _ASM_RISCV_CACHEFLUSH_H */
diff --git a/arch/riscv/include/asm/errata_list.h b/arch/riscv/include/asm/errata_list.h
index 48e899a8e7a9..300fed3bfd80 100644
--- a/arch/riscv/include/asm/errata_list.h
+++ b/arch/riscv/include/asm/errata_list.h
@@ -125,8 +125,8 @@ asm volatile(ALTERNATIVE( \
#define THEAD_SYNC_S ".long 0x0190000b"

#define ALT_CMO_OP(_op, _start, _size, _cachesize, _dir, _ops) \
-asm volatile(ALTERNATIVE_2( \
- __nops(6), \
+asm volatile(ALTERNATIVE_3( \
+ __nops(14), \
"mv a0, %1\n\t" \
"j 2f\n\t" \
"3:\n\t" \
@@ -134,7 +134,7 @@ asm volatile(ALTERNATIVE_2( \
"add a0, a0, %0\n\t" \
"2:\n\t" \
"bltu a0, %2, 3b\n\t" \
- "nop", 0, CPUFEATURE_ZICBOM, CONFIG_RISCV_ISA_ZICBOM, \
+ __nops(8), 0, CPUFEATURE_ZICBOM, CONFIG_RISCV_ISA_ZICBOM, \
"mv a0, %1\n\t" \
"j 2f\n\t" \
"3:\n\t" \
@@ -142,8 +142,28 @@ asm volatile(ALTERNATIVE_2( \
"add a0, a0, %0\n\t" \
"2:\n\t" \
"bltu a0, %2, 3b\n\t" \
- THEAD_SYNC_S, THEAD_VENDOR_ID, \
- ERRATA_THEAD_CMO, CONFIG_ERRATA_THEAD_CMO) \
+ THEAD_SYNC_S "\n\t" \
+ __nops(8), THEAD_VENDOR_ID, \
+ ERRATA_THEAD_CMO, CONFIG_ERRATA_THEAD_CMO, \
+ ".option push\n\t\n\t" \
+ ".option norvc\n\t" \
+ ".option norelax\n\t" \
+ "addi sp,sp,-16\n\t" \
+ "sd s0,0(sp)\n\t" \
+ "sd ra,8(sp)\n\t" \
+ "addi s0,sp,16\n\t" \
+ "mv a4,%6\n\t" \
+ "mv a3,%5\n\t" \
+ "mv a2,%4\n\t" \
+ "mv a1,%3\n\t" \
+ "mv a0,%0\n\t" \
+ "call ax45mp_no_iocp_cmo\n\t" \
+ "ld ra,8(sp)\n\t" \
+ "ld s0,0(sp)\n\t" \
+ "addi sp,sp,16\n\t" \
+ ".option pop\n\t", \
+ ANDESTECH_VENDOR_ID, ERRATA_ANDESTECH_NO_IOCP, \
+ CONFIG_ERRATA_ANDES_CMO) \
: : "r"(_cachesize), \
"r"((unsigned long)(_start) & ~((_cachesize) - 1UL)), \
"r"((unsigned long)(_start) + (_size)), \
@@ -151,7 +171,7 @@ asm volatile(ALTERNATIVE_2( \
"r"((unsigned long)(_size)), \
"r"((unsigned long)(_dir)), \
"r"((unsigned long)(_ops)) \
- : "a0")
+ : "a0", "a1", "a2", "a3", "a4", "memory")

#define THEAD_C9XX_RV_IRQ_PMU 17
#define THEAD_C9XX_CSR_SCOUNTEROF 0x5c5
diff --git a/drivers/soc/renesas/Kconfig b/drivers/soc/renesas/Kconfig
index 660498252ec5..e7810256c60d 100644
--- a/drivers/soc/renesas/Kconfig
+++ b/drivers/soc/renesas/Kconfig
@@ -340,9 +340,16 @@ if RISCV
config ARCH_R9A07G043
bool "RISC-V Platform support for RZ/Five"
select ARCH_RZG2L
+ select AX45MP_L2_CACHE
+ select DMA_GLOBAL_POOL
+ select ERRATA_ANDES
+ select ERRATA_ANDES_CMO
+ select RISCV_DMA_NONCOHERENT
help
This enables support for the Renesas RZ/Five SoC.

+source "drivers/soc/renesas/rzfive/Kconfig"
+
endif # RISCV

config RST_RCAR
diff --git a/drivers/soc/renesas/Makefile b/drivers/soc/renesas/Makefile
index 535868c9c7e4..9df9f759a039 100644
--- a/drivers/soc/renesas/Makefile
+++ b/drivers/soc/renesas/Makefile
@@ -31,6 +31,8 @@ ifdef CONFIG_SMP
obj-$(CONFIG_ARCH_R9A06G032) += r9a06g032-smp.o
endif

+obj-$(CONFIG_RISCV) += rzfive/
+
# Family
obj-$(CONFIG_RST_RCAR) += rcar-rst.o
obj-$(CONFIG_SYSC_RCAR) += rcar-sysc.o
diff --git a/drivers/soc/renesas/rzfive/Kconfig b/drivers/soc/renesas/rzfive/Kconfig
new file mode 100644
index 000000000000..b6bc00337d99
--- /dev/null
+++ b/drivers/soc/renesas/rzfive/Kconfig
@@ -0,0 +1,6 @@
+# SPDX-License-Identifier: GPL-2.0
+
+config AX45MP_L2_CACHE
+ bool "Andes Technology AX45MP L2 Cache controller"
+ help
+ Support for the L2 cache controller on Andes Technology AX45MP platforms.
diff --git a/drivers/soc/renesas/rzfive/Makefile b/drivers/soc/renesas/rzfive/Makefile
new file mode 100644
index 000000000000..2012e7fb978d
--- /dev/null
+++ b/drivers/soc/renesas/rzfive/Makefile
@@ -0,0 +1,3 @@
+# SPDX-License-Identifier: GPL-2.0
+
+obj-$(CONFIG_AX45MP_L2_CACHE) += ax45mp_cache.o
diff --git a/drivers/soc/renesas/rzfive/ax45mp_cache.c b/drivers/soc/renesas/rzfive/ax45mp_cache.c
new file mode 100644
index 000000000000..4e0d0545d3af
--- /dev/null
+++ b/drivers/soc/renesas/rzfive/ax45mp_cache.c
@@ -0,0 +1,415 @@
+// SPDX-License-Identifier: GPL-2.0
+/*
+ * PMA setup and non-coherent cache functions for Andes AX45MP
+ *
+ * Copyright (C) 2022 Renesas Electronics Corp.
+ */
+
+#include <linux/cacheflush.h>
+#include <linux/cacheinfo.h>
+#include <linux/dma-direction.h>
+#include <linux/of_address.h>
+#include <linux/of_platform.h>
+
+#include <asm/cacheflush.h>
+#include <asm/sbi.h>
+
+#include "ax45mp_sbi.h"
+
+/* L2 cache registers */
+#define AX45MP_L2C_REG_CTL_OFFSET 0x8
+
+#define AX45MP_L2C_REG_C0_CMD_OFFSET 0x40
+#define AX45MP_L2C_REG_C0_ACC_OFFSET 0x48
+#define AX45MP_L2C_REG_STATUS_OFFSET 0x80
+
+/* D-cache operation */
+#define AX45MP_CCTL_L1D_VA_INVAL 0
+#define AX45MP_CCTL_L1D_VA_WB 1
+
+/* L2 cache */
+#define AX45MP_L2_CACHE_CTL_CEN_MASK 1
+
+/* L2 CCTL status */
+#define AX45MP_CCTL_L2_STATUS_IDLE 0
+
+/* L2 CCTL status cores mask */
+#define AX45MP_CCTL_L2_STATUS_C0_MASK 0xf
+
+/* L2 cache operation */
+#define AX45MP_CCTL_L2_PA_INVAL 0x8
+#define AX45MP_CCTL_L2_PA_WB 0x9
+
+#define AX45MP_L2C_HPM_PER_CORE_OFFSET 0x8
+#define AX45MP_L2C_REG_PER_CORE_OFFSET 0x10
+#define AX45MP_CCTL_L2_STATUS_PER_CORE_OFFSET 4
+
+#define AX45MP_L2C_REG_CN_CMD_OFFSET(n) \
+ (AX45MP_L2C_REG_C0_CMD_OFFSET + ((n) * AX45MP_L2C_REG_PER_CORE_OFFSET))
+#define AX45MP_L2C_REG_CN_ACC_OFFSET(n) \
+ (AX45MP_L2C_REG_C0_ACC_OFFSET + ((n) * AX45MP_L2C_REG_PER_CORE_OFFSET))
+#define AX45MP_CCTL_L2_STATUS_CN_MASK(n) \
+ (AX45MP_CCTL_L2_STATUS_C0_MASK << ((n) * AX45MP_CCTL_L2_STATUS_PER_CORE_OFFSET))
+
+#define AX45MP_MICM_CFG_ISZ_OFFSET 6
+#define AX45MP_MICM_CFG_ISZ_MASK (0x7 << AX45MP_MICM_CFG_ISZ_OFFSET)
+
+#define AX45MP_MDCM_CFG_DSZ_OFFSET 6
+#define AX45MP_MDCM_CFG_DSZ_MASK (0x7 << AX45MP_MDCM_CFG_DSZ_OFFSET)
+
+#define AX45MP_CCTL_REG_UCCTLBEGINADDR_NUM 0x80b
+#define AX45MP_CCTL_REG_UCCTLCOMMAND_NUM 0x80c
+
+#define AX45MP_MCACHE_CTL_CCTL_SUEN_OFFSET 8
+#define AX45MP_MMSC_CFG_CCTLCSR_OFFSET 16
+#define AX45MP_MISA_20_OFFSET 20
+
+#define AX45MP_MCACHE_CTL_CCTL_SUEN_MASK (0x1 << AX45MP_MCACHE_CTL_CCTL_SUEN_OFFSET)
+#define AX45MP_MMSC_CFG_CCTLCSR_MASK (0x1 << AX45MP_MMSC_CFG_CCTLCSR_OFFSET)
+#define AX45MP_MISA_20_MASK (0x1 << AX45MP_MISA_20_OFFSET)
+
+#define AX45MP_MAX_CACHE_LINE_SIZE 256
+
+#define AX45MP_MAX_PMA_REGIONS 16
+
+struct ax45mp_priv {
+ void __iomem *l2c_base;
+ u32 ax45mp_cache_line_size;
+ bool l2cache_enabled;
+ bool ucctl_ok;
+};
+
+static struct ax45mp_priv *ax45mp_priv;
+static DEFINE_STATIC_KEY_FALSE(ax45mp_l2c_configured);
+
+/* PMA setup */
+static long ax45mp_sbi_set_pma(unsigned long start,
+ unsigned long size,
+ unsigned long flags,
+ unsigned int entry_id)
+{
+ struct sbiret ret;
+
+ ret = sbi_ecall(SBI_EXT_ANDES, AX45MP_SBI_EXT_SET_PMA,
+ start, size, entry_id, flags, 0, 0);
+
+ return ret.value;
+}
+
+static int ax45mp_configure_pma_regions(struct device_node *np)
+{
+ const char *propname = "andestech,pma-regions";
+ u32 start, size, flags;
+ unsigned int entry_id;
+ unsigned int i;
+ int count;
+ int ret;
+
+ count = of_property_count_elems_of_size(np, propname, sizeof(u32) * 3);
+ if (count < 0)
+ return count;
+
+ if (count > AX45MP_MAX_PMA_REGIONS)
+ return -EINVAL;
+
+ for (i = 0, entry_id = 0 ; entry_id < count ; i += 3, entry_id++) {
+ of_property_read_u32_index(np, propname, i, &start);
+ of_property_read_u32_index(np, propname, i + 1, &size);
+ of_property_read_u32_index(np, propname, i + 2, &flags);
+ ret = ax45mp_sbi_set_pma(start, size, flags, entry_id);
+ if (!ret)
+ pr_err("Failed to setup PMA region 0x%x - 0x%x flags: 0x%x",
+ start, start + size, flags);
+ }
+
+ return 0;
+}
+
+/* L2 Cache operations */
+static uint32_t ax45mp_cpu_get_mcache_ctl_status(void)
+{
+ struct sbiret ret;
+
+ ret = sbi_ecall(SBI_EXT_ANDES, AX45MP_SBI_EXT_GET_MCACHE_CTL_STATUS,
+ 0, 0, 0, 0, 0, 0);
+ return ret.value;
+}
+
+static uint32_t ax45mp_cpu_get_micm_cfg_status(void)
+{
+ struct sbiret ret;
+
+ ret = sbi_ecall(SBI_EXT_ANDES, AX45MP_SBI_EXT_GET_MICM_CTL_STATUS,
+ 0, 0, 0, 0, 0, 0);
+ return ret.value;
+}
+
+static uint32_t ax45mp_cpu_get_mdcm_cfg_status(void)
+{
+ struct sbiret ret;
+
+ ret = sbi_ecall(SBI_EXT_ANDES, AX45MP_SBI_EXT_GET_MDCM_CTL_STATUS,
+ 0, 0, 0, 0, 0, 0);
+ return ret.value;
+}
+
+static uint32_t ax45mp_cpu_get_mmsc_cfg_status(void)
+{
+ struct sbiret ret;
+
+ ret = sbi_ecall(SBI_EXT_ANDES, AX45MP_SBI_EXT_GET_MMSC_CTL_STATUS,
+ 0, 0, 0, 0, 0, 0);
+ return ret.value;
+}
+
+static uint32_t ax45mp_cpu_get_misa_cfg_status(void)
+{
+ struct sbiret ret;
+
+ ret = sbi_ecall(SBI_EXT_ANDES, AX45MP_SBI_EXT_GET_MISA_CTL_STATUS,
+ 0, 0, 0, 0, 0, 0);
+ return ret.value;
+}
+
+static inline uint32_t ax45mp_cpu_l2c_get_cctl_status(void)
+{
+ return readl(ax45mp_priv->l2c_base + AX45MP_L2C_REG_STATUS_OFFSET);
+}
+
+static inline uint32_t ax45mp_cpu_l2c_ctl_status(void)
+{
+ return readl(ax45mp_priv->l2c_base + AX45MP_L2C_REG_CTL_OFFSET);
+}
+
+static bool ax45mp_cpu_cache_controlable(void)
+{
+ return (((ax45mp_cpu_get_micm_cfg_status() & AX45MP_MICM_CFG_ISZ_MASK) ||
+ (ax45mp_cpu_get_mdcm_cfg_status() & AX45MP_MDCM_CFG_DSZ_MASK)) &&
+ (ax45mp_cpu_get_misa_cfg_status() & AX45MP_MISA_20_MASK) &&
+ (ax45mp_cpu_get_mmsc_cfg_status() & AX45MP_MMSC_CFG_CCTLCSR_MASK) &&
+ (ax45mp_cpu_get_mcache_ctl_status() & AX45MP_MCACHE_CTL_CCTL_SUEN_MASK));
+}
+
+static void ax45mp_cpu_dcache_wb_range(void *start, void *end, int line_size)
+{
+ void __iomem *base = ax45mp_priv->l2c_base;
+ unsigned long pa;
+ int mhartid = 0;
+#ifdef CONFIG_SMP
+ mhartid = smp_processor_id();
+#endif
+
+ while (end > start) {
+ if (ax45mp_priv->ucctl_ok) {
+ csr_write(AX45MP_CCTL_REG_UCCTLBEGINADDR_NUM, start);
+ csr_write(AX45MP_CCTL_REG_UCCTLCOMMAND_NUM, AX45MP_CCTL_L1D_VA_WB);
+ }
+
+ if (ax45mp_priv->l2cache_enabled) {
+ pa = virt_to_phys(start);
+ writel(pa, base + AX45MP_L2C_REG_CN_ACC_OFFSET(mhartid));
+ writel(AX45MP_CCTL_L2_PA_WB,
+ base + AX45MP_L2C_REG_CN_CMD_OFFSET(mhartid));
+ while ((ax45mp_cpu_l2c_get_cctl_status() &
+ AX45MP_CCTL_L2_STATUS_CN_MASK(mhartid)) !=
+ AX45MP_CCTL_L2_STATUS_IDLE)
+ ;
+ }
+
+ start += line_size;
+ }
+}
+
+static void ax45mp_cpu_dcache_inval_range(void *start, void *end, int line_size)
+{
+ void __iomem *base = ax45mp_priv->l2c_base;
+ unsigned long pa;
+ int mhartid = 0;
+#ifdef CONFIG_SMP
+ mhartid = smp_processor_id();
+#endif
+
+ while (end > start) {
+ if (ax45mp_priv->ucctl_ok) {
+ csr_write(AX45MP_CCTL_REG_UCCTLBEGINADDR_NUM, start);
+ csr_write(AX45MP_CCTL_REG_UCCTLCOMMAND_NUM, AX45MP_CCTL_L1D_VA_INVAL);
+ }
+
+ if (ax45mp_priv->l2cache_enabled) {
+ pa = virt_to_phys(start);
+ writel(pa, base + AX45MP_L2C_REG_CN_ACC_OFFSET(mhartid));
+ writel(AX45MP_CCTL_L2_PA_INVAL,
+ base + AX45MP_L2C_REG_CN_CMD_OFFSET(mhartid));
+ while ((ax45mp_cpu_l2c_get_cctl_status() &
+ AX45MP_CCTL_L2_STATUS_CN_MASK(mhartid)) !=
+ AX45MP_CCTL_L2_STATUS_IDLE)
+ ;
+ }
+
+ start += line_size;
+ }
+}
+
+static void ax45mp_cpu_dma_inval_range(void *vaddr, size_t size)
+{
+ char cache_buf[2][AX45MP_MAX_CACHE_LINE_SIZE];
+ unsigned long start = (unsigned long)vaddr;
+ unsigned long end = start + size;
+ unsigned long old_start = start;
+ unsigned long old_end = end;
+ unsigned long line_size;
+ unsigned long flags;
+
+ if (static_branch_unlikely(&ax45mp_l2c_configured) && !ax45mp_priv)
+ return;
+
+ if (unlikely(start == end))
+ return;
+
+ line_size = ax45mp_priv->ax45mp_cache_line_size;
+
+ memset(&cache_buf, 0x0, sizeof(cache_buf));
+ start = start & (~(line_size - 1));
+ end = ((end + line_size - 1) & (~(line_size - 1)));
+
+ local_irq_save(flags);
+ if (unlikely(start != old_start))
+ memcpy(&cache_buf[0][0], (void *)start, line_size);
+
+ if (unlikely(end != old_end))
+ memcpy(&cache_buf[1][0], (void *)(old_end & (~(line_size - 1))), line_size);
+
+ ax45mp_cpu_dcache_inval_range(vaddr, (void *)end, line_size);
+
+ if (unlikely(start != old_start))
+ memcpy((void *)start, &cache_buf[0][0], (old_start & (line_size - 1)));
+
+ if (unlikely(end != old_end))
+ memcpy((void *)(old_end + 1),
+ &cache_buf[1][(old_end & (line_size - 1)) + 1],
+ end - old_end - 1);
+
+ local_irq_restore(flags);
+}
+
+static void ax45mp_cpu_dma_wb_range(void *vaddr, size_t size)
+{
+ unsigned long start = (unsigned long)vaddr;
+ unsigned long end = start + size;
+ unsigned long line_size;
+ unsigned long flags;
+
+ if (static_branch_unlikely(&ax45mp_l2c_configured) && !ax45mp_priv)
+ return;
+
+ line_size = ax45mp_priv->ax45mp_cache_line_size;
+ local_irq_save(flags);
+ start = start & (~(line_size - 1));
+ ax45mp_cpu_dcache_wb_range(vaddr, (void *)end, line_size);
+ local_irq_restore(flags);
+}
+
+void ax45mp_no_iocp_cmo(unsigned int cache_size, void *vaddr, size_t size, int dir, int ops)
+{
+ if (ops == NON_COHERENT_DMA_PREP)
+ return;
+
+ if (ops == NON_COHERENT_SYNC_DMA_FOR_DEVICE) {
+ switch (dir) {
+ case DMA_FROM_DEVICE:
+ ax45mp_cpu_dma_inval_range(vaddr, size);
+ break;
+ case DMA_TO_DEVICE:
+ case DMA_BIDIRECTIONAL:
+ ax45mp_cpu_dma_wb_range(vaddr, size);
+ break;
+ default:
+ break;
+ }
+ return;
+ }
+
+ /* op == NON_COHERENT_SYNC_DMA_FOR_CPU */
+ if (dir == DMA_BIDIRECTIONAL || dir == DMA_FROM_DEVICE)
+ ax45mp_cpu_dma_inval_range(vaddr, size);
+}
+EXPORT_SYMBOL(ax45mp_no_iocp_cmo);
+
+static int ax45mp_configure_l2_cache(struct device_node *np)
+{
+ int ret;
+
+ ret = of_property_read_u32(np, "cache-line-size", &ax45mp_priv->ax45mp_cache_line_size);
+ if (ret) {
+ pr_err("Failed to get cache-line-size defaulting to 64 bytes\n");
+ ax45mp_priv->ax45mp_cache_line_size = SZ_64;
+ }
+
+ if (ax45mp_priv->ax45mp_cache_line_size != SZ_64) {
+ pr_err("Expected cache-line-size to 64 bytes (found:%u). Defaulting to 64 bytes\n",
+ ax45mp_priv->ax45mp_cache_line_size);
+ ax45mp_priv->ax45mp_cache_line_size = SZ_64;
+ }
+
+ ax45mp_priv->ucctl_ok = ax45mp_cpu_cache_controlable();
+ ax45mp_priv->l2cache_enabled = ax45mp_cpu_l2c_ctl_status() & AX45MP_L2_CACHE_CTL_CEN_MASK;
+
+ return 0;
+}
+
+static int ax45mp_l2c_probe(struct platform_device *pdev)
+{
+ struct device_node *np = pdev->dev.of_node;
+ int ret;
+
+ ax45mp_priv = devm_kzalloc(&pdev->dev, sizeof(*ax45mp_priv), GFP_KERNEL);
+ if (!ax45mp_priv)
+ return -ENOMEM;
+
+ ax45mp_priv->l2c_base = devm_of_iomap(&pdev->dev, pdev->dev.of_node, 0, NULL);
+ if (!ax45mp_priv->l2c_base) {
+ ret = -ENOMEM;
+ goto l2c_err;
+ }
+
+ ret = ax45mp_configure_l2_cache(np);
+ if (ret)
+ goto l2c_err;
+
+ ret = ax45mp_configure_pma_regions(np);
+ if (ret)
+ goto l2c_err;
+
+ static_branch_disable(&ax45mp_l2c_configured);
+
+ return 0;
+
+l2c_err:
+ devm_kfree(&pdev->dev, ax45mp_priv);
+ ax45mp_priv = NULL;
+ return ret;
+}
+
+static const struct of_device_id ax45mp_cache_ids[] = {
+ { .compatible = "andestech,ax45mp-cache" },
+ { /* sentinel */ }
+};
+
+static struct platform_driver ax45mp_l2c_driver = {
+ .driver = {
+ .name = "ax45mp-l2c",
+ .of_match_table = ax45mp_cache_ids,
+ },
+ .probe = ax45mp_l2c_probe,
+};
+
+static int __init ax45mp_cache_init(void)
+{
+ static_branch_enable(&ax45mp_l2c_configured);
+ return platform_driver_register(&ax45mp_l2c_driver);
+}
+arch_initcall(ax45mp_cache_init);
+
+MODULE_AUTHOR("Lad Prabhakar <[email protected]>");
+MODULE_DESCRIPTION("Andes AX45MP L2 cache driver");
+MODULE_LICENSE("GPL");
diff --git a/drivers/soc/renesas/rzfive/ax45mp_sbi.h b/drivers/soc/renesas/rzfive/ax45mp_sbi.h
new file mode 100644
index 000000000000..1604874954d0
--- /dev/null
+++ b/drivers/soc/renesas/rzfive/ax45mp_sbi.h
@@ -0,0 +1,29 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+
+#ifndef __AX45MP_SBI_H
+#define __AX45MP_SBI_H
+
+#define SBI_EXT_ANDES 0x0900031E
+
+enum ax45mp_sbi_ext_fid {
+ AX45MP_SBI_EXT_GET_MCACHE_CTL_STATUS = 0,
+ AX45MP_SBI_EXT_GET_MMISC_CTL_STATUS,
+ AX45MP_SBI_EXT_SET_MCACHE_CTL,
+ AX45MP_SBI_EXT_SET_MMISC_CTL,
+ AX45MP_SBI_EXT_ICACHE_OP,
+ AX45MP_SBI_EXT_DCACHE_OP,
+ AX45MP_SBI_EXT_L1CACHE_I_PREFETCH,
+ AX45MP_SBI_EXT_L1CACHE_D_PREFETCH,
+ AX45MP_SBI_EXT_NON_BLOCKING_LOAD_STORE,
+ AX45MP_SBI_EXT_WRITE_AROUND,
+ AX45MP_SBI_EXT_SET_PMA,
+ AX45MP_SBI_EXT_FREE_PMA,
+ AX45MP_SBI_EXT_PROBE_PMA,
+ AX45MP_SBI_EXT_DCACHE_WBINVAL_ALL,
+ AX45MP_SBI_EXT_GET_MICM_CTL_STATUS,
+ AX45MP_SBI_EXT_GET_MDCM_CTL_STATUS,
+ AX45MP_SBI_EXT_GET_MMSC_CTL_STATUS,
+ AX45MP_SBI_EXT_GET_MISA_CTL_STATUS,
+};
+
+#endif
--
2.25.1

2022-11-24 17:35:58

by Lad, Prabhakar

Subject: [PATCH v4 3/7] riscv: errata: Add Andes alternative ports

From: Lad Prabhakar <[email protected]>

Add required ports of the Alternative scheme for Andes CPU cores.

Signed-off-by: Lad Prabhakar <[email protected]>
---
RFC v3 -> v4
* New patch
---
arch/riscv/Kconfig.erratas | 22 +++++++++
arch/riscv/errata/Makefile | 1 +
arch/riscv/errata/andes/Makefile | 1 +
arch/riscv/errata/andes/errata.c | 68 ++++++++++++++++++++++++++++
arch/riscv/include/asm/alternative.h | 3 ++
arch/riscv/include/asm/errata_list.h | 5 ++
arch/riscv/kernel/alternative.c | 5 ++
7 files changed, 105 insertions(+)
create mode 100644 arch/riscv/errata/andes/Makefile
create mode 100644 arch/riscv/errata/andes/errata.c

diff --git a/arch/riscv/Kconfig.erratas b/arch/riscv/Kconfig.erratas
index 69621ae6d647..74b44e5dd710 100644
--- a/arch/riscv/Kconfig.erratas
+++ b/arch/riscv/Kconfig.erratas
@@ -1,5 +1,27 @@
menu "CPU errata selection"

+config ERRATA_ANDES
+ bool "Andes AX45MP errata"
+ depends on !XIP_KERNEL
+ select RISCV_ALTERNATIVE
+ help
+ All Andes errata Kconfig depend on this Kconfig. Disabling
+ this Kconfig will disable all Andes errata. Please say "Y"
+ here if your platform uses Andes CPU cores.
+
+ Otherwise, please say "N" here to avoid unnecessary overhead.
+
+config ERRATA_ANDES_CMO
+ bool "Apply Andes cache management errata"
+ depends on ERRATA_ANDES && MMU
+ select RISCV_DMA_NONCOHERENT
+ default y
+ help
+ This will apply the cache management errata to handle the
+ non-standard handling on non-coherent operations on Andes cores.
+
+ If you don't know what to do here, say "Y".
+
config ERRATA_SIFIVE
bool "SiFive errata"
depends on !XIP_KERNEL
diff --git a/arch/riscv/errata/Makefile b/arch/riscv/errata/Makefile
index a1055965fbee..81828e80f6dc 100644
--- a/arch/riscv/errata/Makefile
+++ b/arch/riscv/errata/Makefile
@@ -1,2 +1,3 @@
obj-$(CONFIG_ERRATA_SIFIVE) += sifive/
obj-$(CONFIG_ERRATA_THEAD) += thead/
+obj-$(CONFIG_ERRATA_ANDES) += andes/
diff --git a/arch/riscv/errata/andes/Makefile b/arch/riscv/errata/andes/Makefile
new file mode 100644
index 000000000000..2d644e19caef
--- /dev/null
+++ b/arch/riscv/errata/andes/Makefile
@@ -0,0 +1 @@
+obj-y += errata.o
diff --git a/arch/riscv/errata/andes/errata.c b/arch/riscv/errata/andes/errata.c
new file mode 100644
index 000000000000..ec3e052ca8c7
--- /dev/null
+++ b/arch/riscv/errata/andes/errata.c
@@ -0,0 +1,68 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * Erratas to be applied for Andes CPU cores
+ *
+ * Copyright (C) 2022 Renesas Electronics Corporation.
+ *
+ * Author: Lad Prabhakar <[email protected]>
+ */
+
+#include <linux/kernel.h>
+#include <linux/module.h>
+
+#include <asm/alternative.h>
+#include <asm/cacheflush.h>
+#include <asm/errata_list.h>
+#include <asm/patch.h>
+#include <asm/vendorid_list.h>
+
+static bool errata_probe_iocp(unsigned int stage, unsigned long arch_id, unsigned long impid)
+{
+ if (!IS_ENABLED(CONFIG_ERRATA_ANDES_CMO))
+ return false;
+
+ if (arch_id != 0x8000000000008a45 || impid != 0x500)
+ return false;
+
+ riscv_cbom_block_size = 1;
+ riscv_noncoherent_supported();
+
+ return true;
+}
+
+static u32 andes_errata_probe(unsigned int stage, unsigned long archid, unsigned long impid)
+{
+ u32 cpu_req_errata = 0;
+
+ if (errata_probe_iocp(stage, archid, impid))
+ cpu_req_errata |= BIT(ERRATA_ANDESTECH_NO_IOCP);
+
+ return cpu_req_errata;
+}
+
+void __init_or_module andes_errata_patch_func(struct alt_entry *begin, struct alt_entry *end,
+ unsigned long archid, unsigned long impid,
+ unsigned int stage)
+{
+ u32 cpu_req_errata = andes_errata_probe(stage, archid, impid);
+ struct alt_entry *alt;
+ u32 tmp;
+
+ if (stage == RISCV_ALTERNATIVES_EARLY_BOOT)
+ return;
+
+ for (alt = begin; alt < end; alt++) {
+ if (alt->vendor_id != ANDESTECH_VENDOR_ID)
+ continue;
+ if (alt->errata_id >= ERRATA_ANDESTECH_NUMBER)
+ continue;
+
+ tmp = (1U << alt->errata_id);
+ if (cpu_req_errata & tmp) {
+ patch_text_nosync(alt->old_ptr, alt->alt_ptr, alt->alt_len);
+
+ riscv_alternative_fix_auipc_jalr(alt->old_ptr, alt->alt_len,
+ alt->old_ptr - alt->alt_ptr);
+ }
+ }
+}
diff --git a/arch/riscv/include/asm/alternative.h b/arch/riscv/include/asm/alternative.h
index 6511dd73e812..d8012af30cbd 100644
--- a/arch/riscv/include/asm/alternative.h
+++ b/arch/riscv/include/asm/alternative.h
@@ -46,6 +46,9 @@ void sifive_errata_patch_func(struct alt_entry *begin, struct alt_entry *end,
void thead_errata_patch_func(struct alt_entry *begin, struct alt_entry *end,
unsigned long archid, unsigned long impid,
unsigned int stage);
+void andes_errata_patch_func(struct alt_entry *begin, struct alt_entry *end,
+ unsigned long archid, unsigned long impid,
+ unsigned int stage);

void riscv_cpufeature_patch_func(struct alt_entry *begin, struct alt_entry *end,
unsigned int stage);
diff --git a/arch/riscv/include/asm/errata_list.h b/arch/riscv/include/asm/errata_list.h
index 4180312d2a70..2ba7e6e74540 100644
--- a/arch/riscv/include/asm/errata_list.h
+++ b/arch/riscv/include/asm/errata_list.h
@@ -9,6 +9,11 @@
#include <asm/csr.h>
#include <asm/vendorid_list.h>

+#ifdef CONFIG_ERRATA_ANDES
+#define ERRATA_ANDESTECH_NO_IOCP 0
+#define ERRATA_ANDESTECH_NUMBER 1
+#endif
+
#ifdef CONFIG_ERRATA_SIFIVE
#define ERRATA_SIFIVE_CIP_453 0
#define ERRATA_SIFIVE_CIP_1200 1
diff --git a/arch/riscv/kernel/alternative.c b/arch/riscv/kernel/alternative.c
index a7d26a00beea..4ded3e9aa3bc 100644
--- a/arch/riscv/kernel/alternative.c
+++ b/arch/riscv/kernel/alternative.c
@@ -47,6 +47,11 @@ static void __init_or_module riscv_fill_cpu_mfr_info(struct cpu_manufacturer_inf
case THEAD_VENDOR_ID:
cpu_mfr_info->patch_func = thead_errata_patch_func;
break;
+#endif
+#ifdef CONFIG_ERRATA_ANDES
+ case ANDESTECH_VENDOR_ID:
+ cpu_mfr_info->patch_func = andes_errata_patch_func;
+ break;
#endif
default:
cpu_mfr_info->patch_func = NULL;
--
2.25.1
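
The selection logic in andes_errata_patch_func() above can be modelled in userspace as a rough sketch. Everything with a _MODEL/_model suffix is a hypothetical stand-in (including the vendor ID value, which is only a placeholder here), and the real patch_text_nosync() call is replaced by a counter:

```c
#include <stddef.h>
#include <stdint.h>

#define ERRATA_NO_IOCP_MODEL	0	/* mirrors ERRATA_ANDESTECH_NO_IOCP */
#define ERRATA_NUMBER_MODEL	1	/* mirrors ERRATA_ANDESTECH_NUMBER */
#define VENDOR_MODEL		0x31e	/* placeholder vendor ID for this model */

/* Reduced stand-in for struct alt_entry: only the fields the filter looks at. */
struct alt_entry_model {
	uint16_t vendor_id;
	uint32_t errata_id;
};

/* Count the entries that would be patched for a given required-errata bitmask. */
static size_t count_patched_model(const struct alt_entry_model *begin,
				  const struct alt_entry_model *end,
				  uint32_t cpu_req_errata)
{
	size_t n = 0;

	for (const struct alt_entry_model *alt = begin; alt < end; alt++) {
		/* Entries belonging to other vendors are skipped. */
		if (alt->vendor_id != VENDOR_MODEL)
			continue;
		/* Errata IDs outside the known range are skipped. */
		if (alt->errata_id >= ERRATA_NUMBER_MODEL)
			continue;
		/* Patch only if the probe step requested this erratum. */
		if (cpu_req_errata & (1U << alt->errata_id))
			n++;
	}
	return n;
}
```

With a table holding one matching entry, one entry from another vendor and one out-of-range ID, only the first is counted when bit 0 is set in the mask, and none when the mask is 0.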

2022-11-24 17:36:36

by Lad, Prabhakar

Subject: [PATCH v4 6/7] dt-bindings: cache: r9a07g043f-l2-cache: Add DT binding documentation for L2 cache controller

From: Lad Prabhakar <[email protected]>

Add DT binding documentation for L2 cache controller found on RZ/Five SoC.

The Renesas RZ/Five microprocessor includes a RISC-V CPU Core (AX45MP
Single) from Andes. The AX45MP core has an L2 cache controller; this patch
describes the L2 cache block.

Signed-off-by: Lad Prabhakar <[email protected]>
---
RFC v3 -> v4
* Dropped l2 cache configuration parameters
* s/larger/large
* Added minItems/maxItems for andestech,pma-regions
---
.../cache/andestech,ax45mp-cache.yaml | 93 +++++++++++++++++++
.../cache/andestech,ax45mp-cache.h | 38 ++++++++
2 files changed, 131 insertions(+)
create mode 100644 Documentation/devicetree/bindings/cache/andestech,ax45mp-cache.yaml
create mode 100644 include/dt-bindings/cache/andestech,ax45mp-cache.h

diff --git a/Documentation/devicetree/bindings/cache/andestech,ax45mp-cache.yaml b/Documentation/devicetree/bindings/cache/andestech,ax45mp-cache.yaml
new file mode 100644
index 000000000000..bf255b177d0a
--- /dev/null
+++ b/Documentation/devicetree/bindings/cache/andestech,ax45mp-cache.yaml
@@ -0,0 +1,93 @@
+# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
+# Copyright (C) 2022 Renesas Electronics Corp.
+%YAML 1.2
+---
+$id: http://devicetree.org/schemas/cache/andestech,ax45mp-cache.yaml#
+$schema: http://devicetree.org/meta-schemas/core.yaml#
+
+title: Andestech AX45MP L2 Cache Controller
+
+maintainers:
+ - Lad Prabhakar <[email protected]>
+
+description:
+ A level-2 cache (L2C) is used to improve the system performance by providing
+ a large amount of cache line entries and reasonable access delays. The L2C
+ is shared between cores, and a non-inclusive non-exclusive policy is used.
+
+select:
+ properties:
+ compatible:
+ contains:
+ enum:
+ - andestech,ax45mp-cache
+
+ required:
+ - compatible
+
+properties:
+ compatible:
+ items:
+ - const: andestech,ax45mp-cache
+ - const: cache
+
+ reg:
+ maxItems: 1
+
+ interrupts:
+ maxItems: 1
+
+ cache-line-size:
+ const: 64
+
+ cache-level:
+ const: 2
+
+ cache-sets:
+ const: 1024
+
+ cache-size:
+ enum: [131072, 262144, 524288, 1048576, 2097152]
+
+ cache-unified: true
+
+ next-level-cache: true
+
+ andestech,pma-regions:
+ $ref: /schemas/types.yaml#/definitions/uint32-matrix
+ minItems: 1
+ maxItems: 16
+ items:
+ minItems: 3
+ maxItems: 3
+ description: Optional array of memory regions to be set in the PMA.
+
+additionalProperties: false
+
+required:
+ - compatible
+ - cache-line-size
+ - cache-level
+ - cache-sets
+ - cache-size
+ - cache-unified
+ - interrupts
+ - reg
+
+examples:
+ - |
+ #include <dt-bindings/interrupt-controller/irq.h>
+ #include <dt-bindings/cache/andestech,ax45mp-cache.h>
+
+ [email protected] {
+ reg = <0x13400000 0x100000>;
+ compatible = "andestech,ax45mp-cache", "cache";
+ interrupts = <508 IRQ_TYPE_LEVEL_HIGH>;
+ cache-line-size = <64>;
+ cache-level = <2>;
+ cache-sets = <1024>;
+ cache-size = <262144>;
+ cache-unified;
+ andestech,pma-regions = <0x58000000 0x08000000
+ (AX45MP_PMACFG_ETYP_NAPOT | AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF)>;
+ };
diff --git a/include/dt-bindings/cache/andestech,ax45mp-cache.h b/include/dt-bindings/cache/andestech,ax45mp-cache.h
new file mode 100644
index 000000000000..aa1cad24075d
--- /dev/null
+++ b/include/dt-bindings/cache/andestech,ax45mp-cache.h
@@ -0,0 +1,38 @@
+/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) */
+/*
+ * This header provides constants for Andes AX45MP PMA configuration
+ *
+ * Copyright (C) 2022 Renesas Electronics Corp.
+ */
+
+#ifndef __DT_BINDINGS_ANDESTECH_AX45MP_CACHE_H
+#define __DT_BINDINGS_ANDESTECH_AX45MP_CACHE_H
+
+/* OFF: PMA entry is disabled */
+#define AX45MP_PMACFG_ETYP_DISABLED 0
+/* Naturally aligned power of 2 region */
+#define AX45MP_PMACFG_ETYP_NAPOT 3
+
+/* Device, Non-bufferable */
+#define AX45MP_PMACFG_MTYP_DEV_NON_BUF (0 << 2)
+/* Device, bufferable */
+#define AX45MP_PMACFG_MTYP_DEV_BUF (1 << 2)
+/* Memory, Non-cacheable, Non-bufferable */
+#define AX45MP_PMACFG_MTYP_MEM_NON_CACHE_NON_BUF (2 << 2)
+/* Memory, Non-cacheable, Bufferable */
+#define AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF (3 << 2)
+/* Memory, Write-back, No-allocate */
+#define AX45MP_PMACFG_MTYP_MEM_WB_NA (8 << 2)
+/* Memory, Write-back, Read-allocate */
+#define AX45MP_PMACFG_MTYP_MEM_WB_RA (9 << 2)
+/* Memory, Write-back, Write-allocate */
+#define AX45MP_PMACFG_MTYP_MEM_WB_WA (10 << 2)
+/* Memory, Write-back, Read and Write-allocate */
+#define AX45MP_PMACFG_MTYP_MEM_WB_R_WA (11 << 2)
+
+/* AMO instructions are supported */
+#define AX45MP_PMACFG_NAMO_AMO_SUPPORT (0 << 6)
+/* AMO instructions are not supported */
+#define AX45MP_PMACFG_NAMO_AMO_NO_SUPPORT (1 << 6)
+
+#endif /* __DT_BINDINGS_ANDESTECH_AX45MP_CACHE_H */
--
2.25.1
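
As a worked example of the constants above, the flags cell used in the binding example and the "naturally aligned power of 2" property of the example region can be checked with a small userspace sketch. The two helper functions are hypothetical and not part of the patch; only the two macro values are copied from the header:

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the dt-bindings header in this patch. */
#define AX45MP_PMACFG_ETYP_NAPOT		3
#define AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF	(3 << 2)

/*
 * Hypothetical helper: a NAPOT region has a power-of-two size and a base
 * address aligned to that size.
 */
static bool pma_region_is_napot(uint64_t base, uint64_t size)
{
	bool pow2 = size != 0 && (size & (size - 1)) == 0;

	return pow2 && (base & (size - 1)) == 0;
}

/* Hypothetical helper: the flags cell from the binding example. */
static uint32_t pma_example_flags(void)
{
	return AX45MP_PMACFG_ETYP_NAPOT | AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF;
}
```

Here the flags evaluate to 3 | (3 << 2) = 0xf, and the example region (base 0x58000000, size 0x08000000) passes the check because 0x58000000 is a multiple of 0x08000000.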

2022-11-24 18:33:18

by Heiko Stübner

Subject: Re: [PATCH v4 1/7] riscv: asm: alternative-macros: Introduce ALTERNATIVE_3() macro

Am Donnerstag, 24. November 2022, 18:22:01 CET schrieb Prabhakar:
> From: Lad Prabhakar <[email protected]>
>
> Introduce ALTERNATIVE_3() macro.
>
> Signed-off-by: Lad Prabhakar <[email protected]>

Reviewed-by: Heiko Stuebner <[email protected]>

> ---
> RFC v3 -> v4
> * New patch
> ---
> arch/riscv/include/asm/alternative-macros.h | 94 +++++++++++++++++++++
> 1 file changed, 94 insertions(+)
>
> diff --git a/arch/riscv/include/asm/alternative-macros.h b/arch/riscv/include/asm/alternative-macros.h
> index ec2f3f1b836f..1caf4306b3d6 100644
> --- a/arch/riscv/include/asm/alternative-macros.h
> +++ b/arch/riscv/include/asm/alternative-macros.h
> @@ -69,6 +69,34 @@
> new_c_2, vendor_id_2, errata_id_2, \
> IS_ENABLED(CONFIG_k_2)
>
> +.macro __ALTERNATIVE_CFG_3 old_c, new_c_1, vendor_id_1, errata_id_1, enable_1, \
> + new_c_2, vendor_id_2, errata_id_2, enable_2, \
> + new_c_3, vendor_id_3, errata_id_3, enable_3
> +886 :
> + .option push
> + .option norvc
> + .option norelax
> + \old_c
> + .option pop
> +887 :
> + ALT_NEW_CONTENT \vendor_id_1, \errata_id_1, \enable_1, \new_c_1
> + ALT_NEW_CONTENT \vendor_id_2, \errata_id_2, \enable_2, \new_c_2
> + ALT_NEW_CONTENT \vendor_id_3, \errata_id_3, \enable_3, \new_c_3
> +.endm
> +
> +#define _ALTERNATIVE_CFG_3(old_c, new_c_1, vendor_id_1, errata_id_1, \
> + CONFIG_k_1, \
> + new_c_2, vendor_id_2, errata_id_2, \
> + CONFIG_k_2, \
> + new_c_3, vendor_id_3, errata_id_3, \
> + CONFIG_k_3) \
> + __ALTERNATIVE_CFG_3 old_c, new_c_1, vendor_id_1, errata_id_1, \
> + IS_ENABLED(CONFIG_k_1), \
> + new_c_2, vendor_id_2, errata_id_2, \
> + IS_ENABLED(CONFIG_k_2), \
> + new_c_3, vendor_id_3, errata_id_3, \
> + IS_ENABLED(CONFIG_k_3)
> +
> #else /* !__ASSEMBLY__ */
>
> #include <asm/asm.h>
> @@ -135,6 +163,36 @@
> new_c_2, vendor_id_2, errata_id_2, \
> IS_ENABLED(CONFIG_k_2))
>
> +#define __ALTERNATIVE_CFG_3(old_c, new_c_1, vendor_id_1, errata_id_1, \
> + enable_1, \
> + new_c_2, vendor_id_2, errata_id_2, \
> + enable_2, \
> + new_c_3, vendor_id_3, errata_id_3, \
> + enable_3) \
> + "886 :\n" \
> + ".option push\n" \
> + ".option norvc\n" \
> + ".option norelax\n" \
> + old_c "\n" \
> + ".option pop\n" \
> + "887 :\n" \
> + ALT_NEW_CONTENT(vendor_id_1, errata_id_1, enable_1, new_c_1) \
> + ALT_NEW_CONTENT(vendor_id_2, errata_id_2, enable_2, new_c_2) \
> + ALT_NEW_CONTENT(vendor_id_3, errata_id_3, enable_3, new_c_3)
> +
> +#define _ALTERNATIVE_CFG_3(old_c, new_c_1, vendor_id_1, errata_id_1, \
> + CONFIG_k_1, \
> + new_c_2, vendor_id_2, errata_id_2, \
> + CONFIG_k_2, \
> + new_c_3, vendor_id_3, errata_id_3, \
> + CONFIG_k_3) \
> + __ALTERNATIVE_CFG_3(old_c, new_c_1, vendor_id_1, errata_id_1, \
> + IS_ENABLED(CONFIG_k_1), \
> + new_c_2, vendor_id_2, errata_id_2, \
> + IS_ENABLED(CONFIG_k_2), \
> + new_c_3, vendor_id_3, errata_id_3, \
> + IS_ENABLED(CONFIG_k_3))
> +
> #endif /* __ASSEMBLY__ */
>
> #else /* CONFIG_RISCV_ALTERNATIVE */
> @@ -153,6 +211,14 @@
> CONFIG_k_2) \
> __ALTERNATIVE_CFG old_c
>
> +#define _ALTERNATIVE_CFG_3(old_c, new_c_1, vendor_id_1, errata_id_1, \
> + CONFIG_k_1, \
> + new_c_2, vendor_id_2, errata_id_2, \
> + CONFIG_k_2, \
> + new_c_3, vendor_id_3, errata_id_3, \
> + CONFIG_k_3) \
> + __ALTERNATIVE_CFG old_c
> +
> #else /* !__ASSEMBLY__ */
>
> #define __ALTERNATIVE_CFG(old_c) \
> @@ -167,6 +233,14 @@
> CONFIG_k_2) \
> __ALTERNATIVE_CFG(old_c)
>
> +#define _ALTERNATIVE_CFG_3(old_c, new_c_1, vendor_id_1, errata_id_1, \
> + CONFIG_k_1, \
> + new_c_2, vendor_id_2, errata_id_2, \
> + CONFIG_k_2, \
> + new_c_3, vendor_id_3, errata_id_3, \
> + CONFIG_k_3) \
> + __ALTERNATIVE_CFG(old_c)
> +
> #endif /* __ASSEMBLY__ */
> #endif /* CONFIG_RISCV_ALTERNATIVE */
>
> @@ -202,4 +276,24 @@
> new_content_2, vendor_id_2, \
> errata_id_2, CONFIG_k_2)
>
> +/*
> + * A vendor wants to replace an old_content, but another vendor has used
> + * ALTERNATIVE_2() to patch its customized content at the same location. In
> + * this case, this vendor can create a new macro ALTERNATIVE_3() based
> + * on the following sample code and then replace ALTERNATIVE_2() with
> + * ALTERNATIVE_3() to append its customized content.
> + */
> +#define ALTERNATIVE_3(old_content, new_content_1, vendor_id_1, \
> + errata_id_1, CONFIG_k_1, \
> + new_content_2, vendor_id_2, \
> + errata_id_2, CONFIG_k_2, \
> + new_content_3, vendor_id_3, \
> + errata_id_3, CONFIG_k_3) \
> + _ALTERNATIVE_CFG_3(old_content, new_content_1, vendor_id_1, \
> + errata_id_1, CONFIG_k_1, \
> + new_content_2, vendor_id_2, \
> + errata_id_2, CONFIG_k_2, \
> + new_content_3, vendor_id_3, \
> + errata_id_3, CONFIG_k_3)
> +
> #endif
>




2022-11-24 18:39:31

by Heiko Stübner

Subject: Re: [PATCH v4 5/7] riscv: mm: dma-noncoherent: Pass direction and operation to ALT_CMO_OP()

Am Donnerstag, 24. November 2022, 18:22:05 CET schrieb Prabhakar:
> From: Lad Prabhakar <[email protected]>
>
> Pass direction and operation to ALT_CMO_OP() macro.
>
> This is in preparation for adding errata for the Andes CPU core.

can you provide more explanation why that is necessary please?
I guess you want to use different cache operations for some cases?


Thanks
Heiko

> Signed-off-by: Lad Prabhakar <[email protected]>
> ---
> RFC v3 -> v4
> * New patch
> ---
> arch/riscv/include/asm/cacheflush.h | 4 ++++
> arch/riscv/include/asm/errata_list.h | 8 ++++++--
> arch/riscv/mm/dma-noncoherent.c | 15 ++++++++++-----
> 3 files changed, 20 insertions(+), 7 deletions(-)
>
> diff --git a/arch/riscv/include/asm/cacheflush.h b/arch/riscv/include/asm/cacheflush.h
> index f6fbe7042f1c..4a04d1be7c67 100644
> --- a/arch/riscv/include/asm/cacheflush.h
> +++ b/arch/riscv/include/asm/cacheflush.h
> @@ -8,6 +8,10 @@
>
> #include <linux/mm.h>
>
> +#define NON_COHERENT_SYNC_DMA_FOR_DEVICE 0
> +#define NON_COHERENT_SYNC_DMA_FOR_CPU 1
> +#define NON_COHERENT_DMA_PREP 2
> +
> static inline void local_flush_icache_all(void)
> {
> asm volatile ("fence.i" ::: "memory");
> diff --git a/arch/riscv/include/asm/errata_list.h b/arch/riscv/include/asm/errata_list.h
> index 2ba7e6e74540..48e899a8e7a9 100644
> --- a/arch/riscv/include/asm/errata_list.h
> +++ b/arch/riscv/include/asm/errata_list.h
> @@ -124,7 +124,7 @@ asm volatile(ALTERNATIVE( \
> #define THEAD_flush_A0 ".long 0x0275000b"
> #define THEAD_SYNC_S ".long 0x0190000b"
>
> -#define ALT_CMO_OP(_op, _start, _size, _cachesize) \
> +#define ALT_CMO_OP(_op, _start, _size, _cachesize, _dir, _ops) \
> asm volatile(ALTERNATIVE_2( \
> __nops(6), \
> "mv a0, %1\n\t" \
> @@ -146,7 +146,11 @@ asm volatile(ALTERNATIVE_2( \
> ERRATA_THEAD_CMO, CONFIG_ERRATA_THEAD_CMO) \
> : : "r"(_cachesize), \
> "r"((unsigned long)(_start) & ~((_cachesize) - 1UL)), \
> - "r"((unsigned long)(_start) + (_size)) \
> + "r"((unsigned long)(_start) + (_size)), \
> + "r"((unsigned long)(_start)), \
> + "r"((unsigned long)(_size)), \
> + "r"((unsigned long)(_dir)), \
> + "r"((unsigned long)(_ops)) \
> : "a0")
>
> #define THEAD_C9XX_RV_IRQ_PMU 17
> diff --git a/arch/riscv/mm/dma-noncoherent.c b/arch/riscv/mm/dma-noncoherent.c
> index d919efab6eba..e2b82034f504 100644
> --- a/arch/riscv/mm/dma-noncoherent.c
> +++ b/arch/riscv/mm/dma-noncoherent.c
> @@ -19,13 +19,16 @@ void arch_sync_dma_for_device(phys_addr_t paddr, size_t size,
>
> switch (dir) {
> case DMA_TO_DEVICE:
> - ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
> + ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size,
> + dir, NON_COHERENT_SYNC_DMA_FOR_DEVICE);
> break;
> case DMA_FROM_DEVICE:
> - ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size);
> + ALT_CMO_OP(clean, vaddr, size, riscv_cbom_block_size,
> + dir, NON_COHERENT_SYNC_DMA_FOR_DEVICE);
> break;
> case DMA_BIDIRECTIONAL:
> - ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
> + ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size,
> + dir, NON_COHERENT_SYNC_DMA_FOR_DEVICE);
> break;
> default:
> break;
> @@ -42,7 +45,8 @@ void arch_sync_dma_for_cpu(phys_addr_t paddr, size_t size,
> break;
> case DMA_FROM_DEVICE:
> case DMA_BIDIRECTIONAL:
> - ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size);
> + ALT_CMO_OP(flush, vaddr, size, riscv_cbom_block_size,
> + dir, NON_COHERENT_SYNC_DMA_FOR_CPU);
> break;
> default:
> break;
> @@ -53,7 +57,8 @@ void arch_dma_prep_coherent(struct page *page, size_t size)
> {
> void *flush_addr = page_address(page);
>
> - ALT_CMO_OP(flush, flush_addr, size, riscv_cbom_block_size);
> + ALT_CMO_OP(flush, flush_addr, size, riscv_cbom_block_size,
> + 0, NON_COHERENT_DMA_PREP);
> }
>
> void arch_setup_dma_ops(struct device *dev, u64 dma_base, u64 size,
>




2022-11-24 18:40:35

by Heiko Stübner

Subject: Re: [PATCH v4 7/7] soc: renesas: Add L2 cache management for RZ/Five SoC

Am Donnerstag, 24. November 2022, 18:22:07 CET schrieb Prabhakar:
> From: Lad Prabhakar <[email protected]>
>
> On the AX45MP core, cache coherency is a specification option so it may
> not be supported. In this case DMA will fail. As a workaround, firstly we
> allocate a global dma coherent pool from which DMA allocations are taken
> and marked as non-cacheable + bufferable using the PMA region as specified
> in the device tree. Synchronization callbacks are implemented to
> synchronize when doing DMA transactions.
>
> The Andes AX45MP core has a Programmable Physical Memory Attributes (PMA)
> block that allows dynamic adjustment of memory attributes in the runtime.
> It contains a configurable amount of PMA entries implemented as CSR
> registers to control the attributes of memory locations in interest.
>
> Below are the memory attributes supported:
> * Device, Non-bufferable
> * Device, bufferable
> * Memory, Non-cacheable, Non-bufferable
> * Memory, Non-cacheable, Bufferable
> * Memory, Write-back, No-allocate
> * Memory, Write-back, Read-allocate
> * Memory, Write-back, Write-allocate
> * Memory, Write-back, Read and Write-allocate
>
> This patch adds support to configure the memory attributes of the memory
> regions as passed from the l2 cache node and exposes the cache management
> ops.
>
> More info about PMA (section 10.3):
> Link: http://www.andestech.com/wp-content/uploads/AX45MP-1C-Rev.-5.0.0-Datasheet.pdf
>
> Signed-off-by: Lad Prabhakar <[email protected]>
> ---
> RFC v3 -> v4
> * Made use of runtime patching instead of compile time
> * Now just exposing single function ax45mp_no_iocp_cmo() for CMO handling
> * Added a check to make sure cache line size is always 64 bytes
> * Renamed folder rzf -> rzfive
> * Improved Kconfig description
> * Dropped L2 cache configuration
> * Dropped unnecessary casts
> * Fixed comments pointed by Geert, apart from use of PTR_ALIGN_XYZ() macros.
> ---
> arch/riscv/include/asm/cacheflush.h | 8 +
> arch/riscv/include/asm/errata_list.h | 32 +-
> drivers/soc/renesas/Kconfig | 7 +
> drivers/soc/renesas/Makefile | 2 +
> drivers/soc/renesas/rzfive/Kconfig | 6 +
> drivers/soc/renesas/rzfive/Makefile | 3 +
> drivers/soc/renesas/rzfive/ax45mp_cache.c | 415 ++++++++++++++++++++++
> drivers/soc/renesas/rzfive/ax45mp_sbi.h | 29 ++
> 8 files changed, 496 insertions(+), 6 deletions(-)
> create mode 100644 drivers/soc/renesas/rzfive/Kconfig
> create mode 100644 drivers/soc/renesas/rzfive/Makefile
> create mode 100644 drivers/soc/renesas/rzfive/ax45mp_cache.c
> create mode 100644 drivers/soc/renesas/rzfive/ax45mp_sbi.h
>
> diff --git a/arch/riscv/include/asm/cacheflush.h b/arch/riscv/include/asm/cacheflush.h
> index 4a04d1be7c67..3226f3aceafe 100644
> --- a/arch/riscv/include/asm/cacheflush.h
> +++ b/arch/riscv/include/asm/cacheflush.h
> @@ -61,6 +61,14 @@ static inline void riscv_noncoherent_supported(void) {}
> #define SYS_RISCV_FLUSH_ICACHE_LOCAL 1UL
> #define SYS_RISCV_FLUSH_ICACHE_ALL (SYS_RISCV_FLUSH_ICACHE_LOCAL)
>
> +#ifdef CONFIG_AX45MP_L2_CACHE
> +extern asmlinkage void ax45mp_no_iocp_cmo(unsigned int cache_size, void *vaddr,
> + size_t size, int dir, int ops);
> +#else
> +inline void ax45mp_no_iocp_cmo(unsigned int cache_size, void *vaddr,
> + size_t size, int dir, int ops) {}
> +#endif
> +
> #include <asm-generic/cacheflush.h>
>
> #endif /* _ASM_RISCV_CACHEFLUSH_H */
> diff --git a/arch/riscv/include/asm/errata_list.h b/arch/riscv/include/asm/errata_list.h
> index 48e899a8e7a9..300fed3bfd80 100644
> --- a/arch/riscv/include/asm/errata_list.h
> +++ b/arch/riscv/include/asm/errata_list.h
> @@ -125,8 +125,8 @@ asm volatile(ALTERNATIVE( \
> #define THEAD_SYNC_S ".long 0x0190000b"
>
> #define ALT_CMO_OP(_op, _start, _size, _cachesize, _dir, _ops) \
> -asm volatile(ALTERNATIVE_2( \
> - __nops(6), \
> +asm volatile(ALTERNATIVE_3( \
> + __nops(14), \
> "mv a0, %1\n\t" \
> "j 2f\n\t" \
> "3:\n\t" \
> @@ -134,7 +134,7 @@ asm volatile(ALTERNATIVE_2( \
> "add a0, a0, %0\n\t" \
> "2:\n\t" \
> "bltu a0, %2, 3b\n\t" \
> - "nop", 0, CPUFEATURE_ZICBOM, CONFIG_RISCV_ISA_ZICBOM, \
> + __nops(8), 0, CPUFEATURE_ZICBOM, CONFIG_RISCV_ISA_ZICBOM, \
> "mv a0, %1\n\t" \
> "j 2f\n\t" \
> "3:\n\t" \
> @@ -142,8 +142,28 @@ asm volatile(ALTERNATIVE_2( \
> "add a0, a0, %0\n\t" \
> "2:\n\t" \
> "bltu a0, %2, 3b\n\t" \
> - THEAD_SYNC_S, THEAD_VENDOR_ID, \
> - ERRATA_THEAD_CMO, CONFIG_ERRATA_THEAD_CMO) \
> + THEAD_SYNC_S "\n\t" \
> + __nops(8), THEAD_VENDOR_ID, \
> + ERRATA_THEAD_CMO, CONFIG_ERRATA_THEAD_CMO, \
> + ".option push\n\t\n\t" \
> + ".option norvc\n\t" \
> + ".option norelax\n\t" \

Alternatives already apply the norvc + norelax options to both the old and
the new instructions, so I guess the .option lines here shouldn't be necessary?


> + "addi sp,sp,-16\n\t" \
> + "sd s0,0(sp)\n\t" \
> + "sd ra,8(sp)\n\t" \
> + "addi s0,sp,16\n\t" \
> + "mv a4,%6\n\t" \
> + "mv a3,%5\n\t" \
> + "mv a2,%4\n\t" \
> + "mv a1,%3\n\t" \
> + "mv a0,%0\n\t" \
> + "call ax45mp_no_iocp_cmo\n\t" \
> + "ld ra,8(sp)\n\t" \
> + "ld s0,0(sp)\n\t" \
> + "addi sp,sp,16\n\t" \
> + ".option pop\n\t", \
> + ANDESTECH_VENDOR_ID, ERRATA_ANDESTECH_NO_IOCP, \
> + CONFIG_ERRATA_ANDES_CMO) \
> : : "r"(_cachesize), \
> "r"((unsigned long)(_start) & ~((_cachesize) - 1UL)), \
> "r"((unsigned long)(_start) + (_size)), \
> @@ -151,7 +171,7 @@ asm volatile(ALTERNATIVE_2( \
> "r"((unsigned long)(_size)), \
> "r"((unsigned long)(_dir)), \
> "r"((unsigned long)(_ops)) \
> - : "a0")
> + : "a0", "a1", "a2", "a3", "a4", "memory")
>
> #define THEAD_C9XX_RV_IRQ_PMU 17
> #define THEAD_C9XX_CSR_SCOUNTEROF 0x5c5

[...]

> +static int ax45mp_configure_l2_cache(struct device_node *np)
> +{
> + int ret;
> +
> + ret = of_property_read_u32(np, "cache-line-size", &ax45mp_priv->ax45mp_cache_line_size);
> + if (ret) {
> + pr_err("Failed to get cache-line-size defaulting to 64 bytes\n");
> + ax45mp_priv->ax45mp_cache_line_size = SZ_64;
> + }
> +
> + if (ax45mp_priv->ax45mp_cache_line_size != SZ_64) {
> + pr_err("Expected cache-line-size to 64 bytes (found:%u). Defaulting to 64 bytes\n",
> + ax45mp_priv->ax45mp_cache_line_size);
> + ax45mp_priv->ax45mp_cache_line_size = SZ_64;
> + }
> +
> + ax45mp_priv->ucctl_ok = ax45mp_cpu_cache_controlable();
> + ax45mp_priv->l2cache_enabled = ax45mp_cpu_l2c_ctl_status() & AX45MP_L2_CACHE_CTL_CEN_MASK;
> +
> + return 0;
> +}
> +
> +static int ax45mp_l2c_probe(struct platform_device *pdev)
> +{
> + struct device_node *np = pdev->dev.of_node;
> + int ret;
> +
> + ax45mp_priv = devm_kzalloc(&pdev->dev, sizeof(*ax45mp_priv), GFP_KERNEL);
> + if (!ax45mp_priv)
> + return -ENOMEM;
> +
> + ax45mp_priv->l2c_base = devm_of_iomap(&pdev->dev, pdev->dev.of_node, 0, NULL);
> + if (!ax45mp_priv->l2c_base) {
> + ret = -ENOMEM;
> + goto l2c_err;
> + }
> +
> + ret = ax45mp_configure_l2_cache(np);
> + if (ret)
> + goto l2c_err;
> +
> + ret = ax45mp_configure_pma_regions(np);
> + if (ret)
> + goto l2c_err;
> +
> + static_branch_disable(&ax45mp_l2c_configured);
> +
> + return 0;
> +
> +l2c_err:
> + devm_kfree(&pdev->dev, ax45mp_priv);
> + ax45mp_priv = NULL;
> + return ret;
> +}
> +
> +static const struct of_device_id ax45mp_cache_ids[] = {
> + { .compatible = "andestech,ax45mp-cache" },
> + { /* sentinel */ }
> +};
> +
> +static struct platform_driver ax45mp_l2c_driver = {
> + .driver = {
> + .name = "ax45mp-l2c",
> + .of_match_table = ax45mp_cache_ids,
> + },
> + .probe = ax45mp_l2c_probe,
> +};
> +
> +static int __init ax45mp_cache_init(void)
> +{
> + static_branch_enable(&ax45mp_l2c_configured);
> + return platform_driver_register(&ax45mp_l2c_driver);

I think the ordering is racy.

I.e. the function called from the cmo operations (ax45mp*_range)
needs to access ax45mp_priv and its line-size element.

But at the point where you enable the static branch, the driver is not yet
registered and, even more importantly, not yet probed.

So I guess the static-branch-enable should live at the end of
ax45mp_l2c_probe().


Heiko
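
The ordering concern raised above can be illustrated with a small userspace model. This is only a sketch: all names ending in _model are hypothetical, a plain bool stands in for the static key, and the real driver code is not reproduced. The point is that the gate is opened only as the last step of probe, so a CMO entry point never dereferences the private data before it exists:

```c
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the driver's private data (hypothetical, reduced to one field). */
struct l2c_priv_model {
	unsigned int cache_line_size;
};

static struct l2c_priv_model *priv_model;	/* NULL until probe has run */
static bool l2c_configured_model;		/* models the static branch */

/* Models a CMO entry point: it touches priv_model only when the gate is open. */
static int cmo_line_size_model(void)
{
	if (!l2c_configured_model)
		return 0;	/* driver not ready yet: do nothing */
	return (int)priv_model->cache_line_size;
}

/* Models probe with the gate opened last, as suggested in the review. */
static int probe_model(struct l2c_priv_model *storage)
{
	storage->cache_line_size = 64;
	priv_model = storage;
	l2c_configured_model = true;	/* only after everything is set up */
	return 0;
}
```

Calling cmo_line_size_model() before probe_model() returns 0 without touching priv_model; after probe it returns the configured line size. If the flag were set at init time instead, the first call would dereference a NULL pointer.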


2022-11-24 19:06:46

by Heiko Stübner

Subject: Re: [PATCH v4 3/7] riscv: errata: Add Andes alternative ports

Am Donnerstag, 24. November 2022, 18:22:03 CET schrieb Prabhakar:
> From: Lad Prabhakar <[email protected]>
>
> Add required ports of the Alternative scheme for Andes CPU cores.
>
> Signed-off-by: Lad Prabhakar <[email protected]>
> ---
> RFC v3 -> v4
> * New patch
> ---

> diff --git a/arch/riscv/errata/Makefile b/arch/riscv/errata/Makefile
> index a1055965fbee..81828e80f6dc 100644
> --- a/arch/riscv/errata/Makefile
> +++ b/arch/riscv/errata/Makefile
> @@ -1,2 +1,3 @@
> obj-$(CONFIG_ERRATA_SIFIVE) += sifive/
> obj-$(CONFIG_ERRATA_THEAD) += thead/
> +obj-$(CONFIG_ERRATA_ANDES) += andes/

alphabetical sorting please


> diff --git a/arch/riscv/errata/andes/errata.c b/arch/riscv/errata/andes/errata.c
> new file mode 100644
> index 000000000000..ec3e052ca8c7
> --- /dev/null
> +++ b/arch/riscv/errata/andes/errata.c
> @@ -0,0 +1,68 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Erratas to be applied for Andes CPU cores
> + *
> + * Copyright (C) 2022 Renesas Electronics Corporation.
> + *
> + * Author: Lad Prabhakar <[email protected]>
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +
> +#include <asm/alternative.h>
> +#include <asm/cacheflush.h>
> +#include <asm/errata_list.h>
> +#include <asm/patch.h>
> +#include <asm/vendorid_list.h>
> +
> +static bool errata_probe_iocp(unsigned int stage, unsigned long arch_id, unsigned long impid)
> +{
> + if (!IS_ENABLED(CONFIG_ERRATA_ANDES_CMO))
> + return false;
> +
> + if (arch_id != 0x8000000000008a45 || impid != 0x500)
> + return false;
> +
> + riscv_cbom_block_size = 1;

as this is mainly to make the core cbo code happy, maybe add a comment
above that line to explain.


> + riscv_noncoherent_supported();
> +
> + return true;
> +}
> +

> diff --git a/arch/riscv/include/asm/alternative.h b/arch/riscv/include/asm/alternative.h
> index 6511dd73e812..d8012af30cbd 100644
> --- a/arch/riscv/include/asm/alternative.h
> +++ b/arch/riscv/include/asm/alternative.h
> @@ -46,6 +46,9 @@ void sifive_errata_patch_func(struct alt_entry *begin, struct alt_entry *end,
> void thead_errata_patch_func(struct alt_entry *begin, struct alt_entry *end,
> unsigned long archid, unsigned long impid,
> unsigned int stage);
> +void andes_errata_patch_func(struct alt_entry *begin, struct alt_entry *end,
> + unsigned long archid, unsigned long impid,
> + unsigned int stage);

again alphabetical please (i.e. above sifive)


> diff --git a/arch/riscv/include/asm/errata_list.h b/arch/riscv/include/asm/errata_list.h
> index 4180312d2a70..2ba7e6e74540 100644
> --- a/arch/riscv/include/asm/errata_list.h
> +++ b/arch/riscv/include/asm/errata_list.h
> @@ -9,6 +9,11 @@
> #include <asm/csr.h>
> #include <asm/vendorid_list.h>
>
> +#ifdef CONFIG_ERRATA_ANDES
> +#define ERRATA_ANDESTECH_NO_IOCP 0
> +#define ERRATA_ANDESTECH_NUMBER 1
> +#endif
> +
> #ifdef CONFIG_ERRATA_SIFIVE
> #define ERRATA_SIFIVE_CIP_453 0
> #define ERRATA_SIFIVE_CIP_1200 1
> diff --git a/arch/riscv/kernel/alternative.c b/arch/riscv/kernel/alternative.c
> index a7d26a00beea..4ded3e9aa3bc 100644
> --- a/arch/riscv/kernel/alternative.c
> +++ b/arch/riscv/kernel/alternative.c
> @@ -47,6 +47,11 @@ static void __init_or_module riscv_fill_cpu_mfr_info(struct cpu_manufacturer_inf
> case THEAD_VENDOR_ID:
> cpu_mfr_info->patch_func = thead_errata_patch_func;
> break;
> +#endif
> +#ifdef CONFIG_ERRATA_ANDES
> + case ANDESTECH_VENDOR_ID:
> + cpu_mfr_info->patch_func = andes_errata_patch_func;
> + break;

and again alphabetical please


Thanks
Heiko


2022-11-24 19:34:50

by Lad, Prabhakar

Subject: Re: [PATCH v4 5/7] riscv: mm: dma-noncoherent: Pass direction and operation to ALT_CMO_OP()

Hi Heiko,

Thank you for the review.

On Thu, Nov 24, 2022 at 6:29 PM Heiko Stübner <[email protected]> wrote:
>
> Am Donnerstag, 24. November 2022, 18:22:05 CET schrieb Prabhakar:
> > From: Lad Prabhakar <[email protected]>
> >
> > Pass direction and operation to ALT_CMO_OP() macro.
> >
> > This is in preparation for adding errata for the Andes CPU core.
>
> can you provide more explanation why that is necessary please?
> I guess you want to use different cache operations for some cases?
>
Yes, basically so that different cache operations can be called based
on the direction and the operation (this also allows exporting just one
function to handle the errata). I'll update the commit message in the
next version.
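
To make the idea concrete, here is a minimal userspace sketch (not the
patch itself) of how a single entry point could pick a cache operation
from the DMA direction and the sync type. The direction values mirror
enum dma_data_direction; every sketch_* name and the mapping itself are
assumptions for illustration only, and the real series may choose
differently.

```c
/* Direction values mirror the kernel's enum dma_data_direction. */
enum sketch_dir {
	SKETCH_BIDIRECTIONAL = 0,
	SKETCH_TO_DEVICE = 1,
	SKETCH_FROM_DEVICE = 2,
};

/* Hypothetical: which arch_sync_dma_for_*() callback we are in. */
enum sketch_sync {
	SKETCH_FOR_DEVICE,
	SKETCH_FOR_CPU,
};

/* Hypothetical set of cache maintenance operations. */
enum sketch_cmo {
	SKETCH_CMO_NONE,
	SKETCH_CMO_CLEAN,	/* write dirty lines back */
	SKETCH_CMO_INVAL,	/* drop lines */
	SKETCH_CMO_FLUSH,	/* write back, then drop */
};

/* Assumed mapping; one function decides for all callers. */
static enum sketch_cmo sketch_pick_cmo(enum sketch_dir dir,
				       enum sketch_sync sync)
{
	if (sync == SKETCH_FOR_DEVICE) {
		switch (dir) {
		case SKETCH_TO_DEVICE:
			return SKETCH_CMO_CLEAN;
		case SKETCH_FROM_DEVICE:
			return SKETCH_CMO_INVAL;
		case SKETCH_BIDIRECTIONAL:
			return SKETCH_CMO_FLUSH;
		}
	} else {
		switch (dir) {
		case SKETCH_TO_DEVICE:
			return SKETCH_CMO_NONE;
		case SKETCH_FROM_DEVICE:
		case SKETCH_BIDIRECTIONAL:
			return SKETCH_CMO_INVAL;
		}
	}
	return SKETCH_CMO_NONE;
}
```

With both values passed down, the vendor code only needs to export one
function and can do the dispatch internally.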

Cheers,
Prabhakar

2022-11-24 19:37:48

by Lad, Prabhakar

Subject: Re: [PATCH v4 3/7] riscv: errata: Add Andes alternative ports

Hi Heiko,

Thank you for the review.

On Thu, Nov 24, 2022 at 6:25 PM Heiko Stübner <[email protected]> wrote:
>
> Am Donnerstag, 24. November 2022, 18:22:03 CET schrieb Prabhakar:
> > From: Lad Prabhakar <[email protected]>
> >
> > Add required ports of the Alternative scheme for Andes CPU cores.
> >
> > Signed-off-by: Lad Prabhakar <[email protected]>
> > ---
> > RFC v3 -> v4
> > * New patch
> > ---
>
> > diff --git a/arch/riscv/errata/Makefile b/arch/riscv/errata/Makefile
> > index a1055965fbee..81828e80f6dc 100644
> > --- a/arch/riscv/errata/Makefile
> > +++ b/arch/riscv/errata/Makefile
> > @@ -1,2 +1,3 @@
> > obj-$(CONFIG_ERRATA_SIFIVE) += sifive/
> > obj-$(CONFIG_ERRATA_THEAD) += thead/
> > +obj-$(CONFIG_ERRATA_ANDES) += andes/
>
> alphabetical sorting please
>
>
> > diff --git a/arch/riscv/errata/andes/errata.c b/arch/riscv/errata/andes/errata.c
> > new file mode 100644
> > index 000000000000..ec3e052ca8c7
> > --- /dev/null
> > +++ b/arch/riscv/errata/andes/errata.c
> > @@ -0,0 +1,68 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * Erratas to be applied for Andes CPU cores
> > + *
> > + * Copyright (C) 2022 Renesas Electronics Corporation.
> > + *
> > + * Author: Lad Prabhakar <[email protected]>
> > + */
> > +
> > +#include <linux/kernel.h>
> > +#include <linux/module.h>
> > +
> > +#include <asm/alternative.h>
> > +#include <asm/cacheflush.h>
> > +#include <asm/errata_list.h>
> > +#include <asm/patch.h>
> > +#include <asm/vendorid_list.h>
> > +
> > +static bool errata_probe_iocp(unsigned int stage, unsigned long arch_id, unsigned long impid)
> > +{
> > + if (!IS_ENABLED(CONFIG_ERRATA_ANDES_CMO))
> > + return false;
> > +
> > + if (arch_id != 0x8000000000008a45 || impid != 0x500)
> > + return false;
> > +
> > + riscv_cbom_block_size = 1;
>
> as this is mainly to make the core cbo code happy, maybe add a comment
> above that line to explain.
>
Agreed, I'll add a comment here.

>
> > + riscv_noncoherent_supported();
> > +
> > + return true;
> > +}
> > +
>
> > diff --git a/arch/riscv/include/asm/alternative.h b/arch/riscv/include/asm/alternative.h
> > index 6511dd73e812..d8012af30cbd 100644
> > --- a/arch/riscv/include/asm/alternative.h
> > +++ b/arch/riscv/include/asm/alternative.h
> > @@ -46,6 +46,9 @@ void sifive_errata_patch_func(struct alt_entry *begin, struct alt_entry *end,
> > void thead_errata_patch_func(struct alt_entry *begin, struct alt_entry *end,
> > unsigned long archid, unsigned long impid,
> > unsigned int stage);
> > +void andes_errata_patch_func(struct alt_entry *begin, struct alt_entry *end,
> > + unsigned long archid, unsigned long impid,
> > + unsigned int stage);
>
> again alphabetical please (i.e. above sifive)
>
>
> > diff --git a/arch/riscv/include/asm/errata_list.h b/arch/riscv/include/asm/errata_list.h
> > index 4180312d2a70..2ba7e6e74540 100644
> > --- a/arch/riscv/include/asm/errata_list.h
> > +++ b/arch/riscv/include/asm/errata_list.h
> > @@ -9,6 +9,11 @@
> > #include <asm/csr.h>
> > #include <asm/vendorid_list.h>
> >
> > +#ifdef CONFIG_ERRATA_ANDES
> > +#define ERRATA_ANDESTECH_NO_IOCP 0
> > +#define ERRATA_ANDESTECH_NUMBER 1
> > +#endif
> > +
> > #ifdef CONFIG_ERRATA_SIFIVE
> > #define ERRATA_SIFIVE_CIP_453 0
> > #define ERRATA_SIFIVE_CIP_1200 1
> > diff --git a/arch/riscv/kernel/alternative.c b/arch/riscv/kernel/alternative.c
> > index a7d26a00beea..4ded3e9aa3bc 100644
> > --- a/arch/riscv/kernel/alternative.c
> > +++ b/arch/riscv/kernel/alternative.c
> > @@ -47,6 +47,11 @@ static void __init_or_module riscv_fill_cpu_mfr_info(struct cpu_manufacturer_inf
> > case THEAD_VENDOR_ID:
> > cpu_mfr_info->patch_func = thead_errata_patch_func;
> > break;
> > +#endif
> > +#ifdef CONFIG_ERRATA_ANDES
> > + case ANDESTECH_VENDOR_ID:
> > + cpu_mfr_info->patch_func = andes_errata_patch_func;
> > + break;
>
> and again alphabetical please
>
Oops I missed that, I'll sort this and all the above.

Cheers,
Prabhakar

2022-11-24 19:48:09

by Conor Dooley

Subject: Re: [PATCH v4 0/7] AX45MP: Add support to non-coherent DMA

Hey!

On Thu, Nov 24, 2022 at 05:22:00PM +0000, Prabhakar wrote:
> From: Lad Prabhakar <[email protected]>
>
> Hi All,
>
> On the Andes AX45MP core, cache coherency is a specification option so it
> may not be supported. In this case DMA will fail. To work around this
> issue, this patch series does the following:
>
> 1] Andes AX45MP core has a Programmable Physical Memory Attributes (PMA)
> block that allows dynamic adjustment of memory attributes in the runtime.
> It contains a configurable amount of PMA entries implemented as CSR
> registers to control the attributes of memory locations in interest. PMA
> regions are passed from the l2 node which are configured as
> non-cacheable + bufferable with the SBI call.
>
> l2cache: [email protected] {
> ....
> andestech,pma-regions = <0x58000000 0x08000000
> (AX45MP_PMACFG_ETYP_NAPOT |
> AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF)>;
> ....
> };
>
> 2] We provide callbacks to synchronize specific content between memory and
> cache.
>
> - arch_sync_dma_for_device()
> - arch_sync_dma_for_cpu()
>
> Below are the configs that are enabled:
>
> - DMA_GLOBAL_POOL
> - RISCV_DMA_NONCOHERENT
>
> 3] We reserve the shared DMA pool, so the DMA memory requests go through
> this pool:
>
> reserved-memory {
> #address-cells = <2>;
> #size-cells = <2>;
> ranges;
>
> reserved: linux,[email protected] {
> compatible = "shared-dma-pool";
> no-map;
> linux,dma-default;
> reg = <0x0 0x58000000 0x0 0x08000000>;
> };
> };
>
>
> Below is the L2 cache DT node:
>
> l2cache: [email protected] {
> compatible = "andestech,ax45mp-cache", "cache";
> cache-size = <0x40000>;
> cache-line-size = <64>;
> cache-sets = <1024>;
> cache-unified;
> reg = <0x0 0x13400000 0x0 0x100000>;
> andestech,pma-regions = <0x0 0x58000000 0x0 0x08000000 0x0
> (AX45MP_PMACFG_ETYP_NAPOT |
> AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF)>;
> interrupts = <SOC_PERIPHERAL_IRQ(476, IRQ_TYPE_LEVEL_HIGH)>;
> };
>
> Due to the above approach custom SBI calls have been implemented. The
> above implementation is in preparation for adding support for Renesas
> RZ/Five SoC which uses the AX45MP core. As with the above approach the
> kernel image might not be generic so that it can be used on other
> platforms.
>
> OpenSBI implementation isn't upstreamed yet, public repo for access is
> available at [0].
>
> [0] https://github.com/renesas-rz/rz_opensbi/tree/work/OpenSBI-PMA
>
> Note,
> - This series requires testing on cores with Zicbom and on T-Head SoCs
> - I've used GCC 9.4.0 for compilation

Just dumping the following, which I saw with gcc 12.1 & binutils 2.39
while building allmodconfig. Perhaps it is worth you upgrading to a
recent toolchain for testing purposes. FWIW, I applied your patches on
top of 20221122.

/stuff/linux/arch/riscv/mm/dma-noncoherent.c: Assembler messages:
/stuff/linux/arch/riscv/mm/dma-noncoherent.c:62: Error: attempt to move .org backwards
/stuff/linux/arch/riscv/mm/dma-noncoherent.c:66: Error: attempt to move .org backwards
/stuff/linux/arch/riscv/mm/dma-noncoherent.c:84: Error: attempt to move .org backwards
/stuff/linux/arch/riscv/mm/dma-noncoherent.c:96: Error: attempt to move .org backwards


In file included from /stuff/linux/arch/riscv/errata/andes/errata.c:16:
/stuff/linux/arch/riscv/errata/andes/errata.c: In function 'is_auipc_insn':
/stuff/linux/arch/riscv/errata/andes/errata.c:25:34: error: 'MASK_AUIPC' undeclared (first use in this function)
25 | DECLARE_INSN(auipc, MATCH_AUIPC, MASK_AUIPC)
| ^~~~~~~~~~
/stuff/linux/arch/riscv/include/asm/parse_asm.h:175:25: note: in definition of macro 'DECLARE_INSN'
175 | return (insn & (INSN_MASK)) == (INSN_MATCH); \
| ^~~~~~~~~
/stuff/linux/arch/riscv/errata/andes/errata.c:25:34: note: each undeclared identifier is reported only once for each function it appears in
25 | DECLARE_INSN(auipc, MATCH_AUIPC, MASK_AUIPC)
| ^~~~~~~~~~
/stuff/linux/arch/riscv/include/asm/parse_asm.h:175:25: note: in definition of macro 'DECLARE_INSN'
175 | return (insn & (INSN_MASK)) == (INSN_MATCH); \
| ^~~~~~~~~
/stuff/linux/arch/riscv/errata/andes/errata.c:25:21: error: 'MATCH_AUIPC' undeclared (first use in this function); did you mean 'OPC_AUIPC'?
25 | DECLARE_INSN(auipc, MATCH_AUIPC, MASK_AUIPC)
| ^~~~~~~~~~~
/stuff/linux/arch/riscv/include/asm/parse_asm.h:175:41: note: in definition of macro 'DECLARE_INSN'
175 | return (insn & (INSN_MASK)) == (INSN_MATCH); \
| ^~~~~~~~~~
/stuff/linux/arch/riscv/errata/andes/errata.c: In function 'riscv_alternative_fix_auipc_jalr':
/stuff/linux/arch/riscv/errata/andes/errata.c:64:23: error: implicit declaration of function 'EXTRACT_RD_REG' [-Werror=implicit-function-declaration]
64 | rd1 = EXTRACT_RD_REG(*(alt_ptr + i));
| ^~~~~~~~~~~~~~
/stuff/linux/arch/riscv/errata/andes/errata.c:69:24: error: implicit declaration of function 'EXTRACT_UTYPE_IMM'; did you mean 'EXTRACT_BTYPE_IMM'? [-Werror=implicit-function-declaration]
69 | imm1 = EXTRACT_UTYPE_IMM(*(alt_ptr + i));
| ^~~~~~~~~~~~~~~~~
| EXTRACT_BTYPE_IMM
/stuff/linux/arch/riscv/errata/andes/errata.c:78:30: error: 'U_IMM_31_12_MASK' undeclared (first use in this function); did you mean 'J_IMM_19_12_MASK'?
78 | call[0] &= ~(U_IMM_31_12_MASK);
| ^~~~~~~~~~~~~~~~
| J_IMM_19_12_MASK
/stuff/linux/arch/riscv/errata/andes/errata.c: In function 'is_auipc_insn':
/stuff/linux/arch/riscv/include/asm/parse_asm.h:176:1: error: control reaches end of non-void function [-Werror=return-type]
176 | }
| ^
/stuff/linux/arch/riscv/errata/andes/errata.c:25:1: note: in expansion of macro 'DECLARE_INSN'
25 | DECLARE_INSN(auipc, MATCH_AUIPC, MASK_AUIPC)
| ^~~~~~~~~~~~
cc1: all warnings being treated as errors
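
For reference, the identifiers the compiler complains about boil down
to simple bit operations on the instruction word. Below is a minimal
sketch, assuming the standard RISC-V base encoding (opcode in bits 6:0,
rd in bits 11:7, U-type immediate in bits 31:12). The sketch_* and
SKETCH_* names are made up here and are not the definitions that
parse_asm.h is expected to provide.

```c
#include <stdbool.h>
#include <stdint.h>

/* AUIPC is identified by its 7-bit major opcode, 0x17. */
#define SKETCH_MASK_AUIPC	0x7fu
#define SKETCH_MATCH_AUIPC	0x17u

/* U-type immediate occupies the upper 20 bits of the word. */
#define SKETCH_U_IMM_31_12_MASK	0xfffff000u

/* True if the 32-bit instruction word encodes AUIPC. */
static bool sketch_is_auipc_insn(uint32_t insn)
{
	return (insn & SKETCH_MASK_AUIPC) == SKETCH_MATCH_AUIPC;
}

/* Destination register index, bits 11:7. */
static unsigned int sketch_extract_rd_reg(uint32_t insn)
{
	return (insn >> 7) & 0x1fu;
}

/* U-type immediate, left in place (already shifted by 12). */
static uint32_t sketch_extract_utype_imm(uint32_t insn)
{
	return insn & SKETCH_U_IMM_31_12_MASK;
}
```

For example, the word 0x12345297 is "auipc t0, 0x12345": rd is 5 and
the immediate field is 0x12345000.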


> - Tested all the IP blocks on RZ/Five which use DMA
>
> RFC v3 -> v4
> * Implemented ALTERNATIVE_3() macro
> * Now using runtime patching mechanism instead of compile time config
> * Added Andes CMO as an erratum
> * Fixed comments pointed by Geert
>
> RFC v2-> RFC v3
> * Fixed review comments pointed by Conor
> * Move DT binding into cache folder
> * Fixed DT binding check issue
> * Added andestech,ax45mp-cache.h header file
> * Now passing the flags for the PMA setup as part of andestech,pma-regions
> property.
> * Added andestech,inst/data-prefetch and andestech,tag/data-ram-ctl
> properties to configure the L2 cache.
> * Registered the cache driver as platform driver
>
> RFC v1-> RFC v2
> * Moved out the code from arch/riscv to drivers/soc/renesas
> * Now handling the PMA setup as part of the L2 cache
> * Now making use of dma-noncoherent.c instead of an SoC-specific implementation.
> * Dropped arch_dma_alloc() and arch_dma_free()
> * Switched to RISCV_DMA_NONCOHERENT
> * Included DT binding doc
>
> RFC v2: https://patchwork.kernel.org/project/linux-renesas-soc/cover/[email protected]/
> RFC v1: https://patchwork.kernel.org/project/linux-renesas-soc/cover/[email protected]/
>
> Cheers,
> Prabhakar
>
> Lad Prabhakar (7):
> riscv: asm: alternative-macros: Introduce ALTERNATIVE_3() macro
> riscv: asm: vendorid_list: Add Andes Technology to the vendors list
> riscv: errata: Add Andes alternative ports
> riscv: errata: andes: Fix auipc-jalr addresses in patched alternatives
> riscv: mm: dma-noncoherent: Pass direction and operation to
> ALT_CMO_OP()
> dt-bindings: cache: r9a07g043f-l2-cache: Add DT binding documentation
> for L2 cache controller
> soc: renesas: Add L2 cache management for RZ/Five SoC
>
> .../cache/andestech,ax45mp-cache.yaml | 93 ++++
> arch/riscv/Kconfig.erratas | 22 +
> arch/riscv/errata/Makefile | 1 +
> arch/riscv/errata/andes/Makefile | 1 +
> arch/riscv/errata/andes/errata.c | 139 ++++++
> arch/riscv/include/asm/alternative-macros.h | 94 ++++
> arch/riscv/include/asm/alternative.h | 3 +
> arch/riscv/include/asm/cacheflush.h | 12 +
> arch/riscv/include/asm/errata_list.h | 45 +-
> arch/riscv/include/asm/vendorid_list.h | 1 +
> arch/riscv/kernel/alternative.c | 5 +
> arch/riscv/mm/dma-noncoherent.c | 15 +-
> drivers/soc/renesas/Kconfig | 7 +
> drivers/soc/renesas/Makefile | 2 +
> drivers/soc/renesas/rzfive/Kconfig | 6 +
> drivers/soc/renesas/rzfive/Makefile | 3 +
> drivers/soc/renesas/rzfive/ax45mp_cache.c | 415 ++++++++++++++++++
> drivers/soc/renesas/rzfive/ax45mp_sbi.h | 29 ++
> .../cache/andestech,ax45mp-cache.h | 38 ++
> 19 files changed, 918 insertions(+), 13 deletions(-)
> create mode 100644 Documentation/devicetree/bindings/cache/andestech,ax45mp-cache.yaml
> create mode 100644 arch/riscv/errata/andes/Makefile
> create mode 100644 arch/riscv/errata/andes/errata.c
> create mode 100644 drivers/soc/renesas/rzfive/Kconfig
> create mode 100644 drivers/soc/renesas/rzfive/Makefile
> create mode 100644 drivers/soc/renesas/rzfive/ax45mp_cache.c
> create mode 100644 drivers/soc/renesas/rzfive/ax45mp_sbi.h
> create mode 100644 include/dt-bindings/cache/andestech,ax45mp-cache.h
>
> --
> 2.25.1
>

2022-11-24 20:05:37

by Lad, Prabhakar

Subject: Re: [PATCH v4 7/7] soc: renesas: Add L2 cache management for RZ/Five SoC

Hi Heiko,

Thank you for the review.

On Thu, Nov 24, 2022 at 6:30 PM Heiko Stübner <[email protected]> wrote:
>
> Am Donnerstag, 24. November 2022, 18:22:07 CET schrieb Prabhakar:
> > From: Lad Prabhakar <[email protected]>
> >
> > On the AX45MP core, cache coherency is a specification option so it may
> > not be supported. In this case DMA will fail. As a workaround, firstly we
> > allocate a global dma coherent pool from which DMA allocations are taken
> > and marked as non-cacheable + bufferable using the PMA region as specified
> > in the device tree. Synchronization callbacks are implemented to
> > synchronize when doing DMA transactions.
> >
> > The Andes AX45MP core has a Programmable Physical Memory Attributes (PMA)
> > block that allows dynamic adjustment of memory attributes in the runtime.
> > It contains a configurable amount of PMA entries implemented as CSR
> > registers to control the attributes of memory locations in interest.
> >
> > Below are the memory attributes supported:
> > * Device, Non-bufferable
> > * Device, bufferable
> > * Memory, Non-cacheable, Non-bufferable
> > * Memory, Non-cacheable, Bufferable
> > * Memory, Write-back, No-allocate
> > * Memory, Write-back, Read-allocate
> > * Memory, Write-back, Write-allocate
> > * Memory, Write-back, Read and Write-allocate
> >
> > This patch adds support to configure the memory attributes of the memory
> > regions as passed from the l2 cache node and exposes the cache management
> > ops.
> >
> > More info about PMA (section 10.3):
> > Link: http://www.andestech.com/wp-content/uploads/AX45MP-1C-Rev.-5.0.0-Datasheet.pdf
> >
> > Signed-off-by: Lad Prabhakar <[email protected]>
> > ---
> > RFC v3 -> v4
> > * Made use of runtime patching instead of compile time
> > * Now just exposing single function ax45mp_no_iocp_cmo() for CMO handling
> > * Added a check to make sure cache line size is always 64 bytes
> > * Renamed folder rzf -> rzfive
> > * Improved Kconfig description
> > * Dropped L2 cache configuration
> > * Dropped unnecessary casts
> > * Fixed comments pointed by Geert, apart from use of PTR_ALIGN_XYZ() macros.
> > ---
> > arch/riscv/include/asm/cacheflush.h | 8 +
> > arch/riscv/include/asm/errata_list.h | 32 +-
> > drivers/soc/renesas/Kconfig | 7 +
> > drivers/soc/renesas/Makefile | 2 +
> > drivers/soc/renesas/rzfive/Kconfig | 6 +
> > drivers/soc/renesas/rzfive/Makefile | 3 +
> > drivers/soc/renesas/rzfive/ax45mp_cache.c | 415 ++++++++++++++++++++++
> > drivers/soc/renesas/rzfive/ax45mp_sbi.h | 29 ++
> > 8 files changed, 496 insertions(+), 6 deletions(-)
> > create mode 100644 drivers/soc/renesas/rzfive/Kconfig
> > create mode 100644 drivers/soc/renesas/rzfive/Makefile
> > create mode 100644 drivers/soc/renesas/rzfive/ax45mp_cache.c
> > create mode 100644 drivers/soc/renesas/rzfive/ax45mp_sbi.h
> >
> > diff --git a/arch/riscv/include/asm/cacheflush.h b/arch/riscv/include/asm/cacheflush.h
> > index 4a04d1be7c67..3226f3aceafe 100644
> > --- a/arch/riscv/include/asm/cacheflush.h
> > +++ b/arch/riscv/include/asm/cacheflush.h
> > @@ -61,6 +61,14 @@ static inline void riscv_noncoherent_supported(void) {}
> > #define SYS_RISCV_FLUSH_ICACHE_LOCAL 1UL
> > #define SYS_RISCV_FLUSH_ICACHE_ALL (SYS_RISCV_FLUSH_ICACHE_LOCAL)
> >
> > +#ifdef CONFIG_AX45MP_L2_CACHE
> > +extern asmlinkage void ax45mp_no_iocp_cmo(unsigned int cache_size, void *vaddr,
> > + size_t size, int dir, int ops);
> > +#else
> > +inline void ax45mp_no_iocp_cmo(unsigned int cache_size, void *vaddr,
> > + size_t size, int dir, int ops) {}
> > +#endif
> > +
> > #include <asm-generic/cacheflush.h>
> >
> > #endif /* _ASM_RISCV_CACHEFLUSH_H */
> > diff --git a/arch/riscv/include/asm/errata_list.h b/arch/riscv/include/asm/errata_list.h
> > index 48e899a8e7a9..300fed3bfd80 100644
> > --- a/arch/riscv/include/asm/errata_list.h
> > +++ b/arch/riscv/include/asm/errata_list.h
> > @@ -125,8 +125,8 @@ asm volatile(ALTERNATIVE( \
> > #define THEAD_SYNC_S ".long 0x0190000b"
> >
> > #define ALT_CMO_OP(_op, _start, _size, _cachesize, _dir, _ops) \
> > -asm volatile(ALTERNATIVE_2( \
> > - __nops(6), \
> > +asm volatile(ALTERNATIVE_3( \
> > + __nops(14), \
> > "mv a0, %1\n\t" \
> > "j 2f\n\t" \
> > "3:\n\t" \
> > @@ -134,7 +134,7 @@ asm volatile(ALTERNATIVE_2( \
> > "add a0, a0, %0\n\t" \
> > "2:\n\t" \
> > "bltu a0, %2, 3b\n\t" \
> > - "nop", 0, CPUFEATURE_ZICBOM, CONFIG_RISCV_ISA_ZICBOM, \
> > + __nops(8), 0, CPUFEATURE_ZICBOM, CONFIG_RISCV_ISA_ZICBOM, \
> > "mv a0, %1\n\t" \
> > "j 2f\n\t" \
> > "3:\n\t" \
> > @@ -142,8 +142,28 @@ asm volatile(ALTERNATIVE_2( \
> > "add a0, a0, %0\n\t" \
> > "2:\n\t" \
> > "bltu a0, %2, 3b\n\t" \
> > - THEAD_SYNC_S, THEAD_VENDOR_ID, \
> > - ERRATA_THEAD_CMO, CONFIG_ERRATA_THEAD_CMO) \
> > + THEAD_SYNC_S "\n\t" \
> > + __nops(8), THEAD_VENDOR_ID, \
> > + ERRATA_THEAD_CMO, CONFIG_ERRATA_THEAD_CMO, \
> > + ".option push\n\t\n\t" \
> > + ".option norvc\n\t" \
> > + ".option norelax\n\t" \
>
> alternatives already do the norvc + norelax options anyway for old and new instructions,
> so the .option stuff shouldn't be necessary I guess?
>
I did a quick run with .option stuff and all seems to be OK. I'll do
some rigorous testing and get rid of it in the next version.
>
> > + "addi sp,sp,-16\n\t" \
> > + "sd s0,0(sp)\n\t" \
> > + "sd ra,8(sp)\n\t" \
> > + "addi s0,sp,16\n\t" \
> > + "mv a4,%6\n\t" \
> > + "mv a3,%5\n\t" \
> > + "mv a2,%4\n\t" \
> > + "mv a1,%3\n\t" \
> > + "mv a0,%0\n\t" \
> > + "call ax45mp_no_iocp_cmo\n\t" \
> > + "ld ra,8(sp)\n\t" \
> > + "ld s0,0(sp)\n\t" \
> > + "addi sp,sp,16\n\t" \
> > + ".option pop\n\t", \
> > + ANDESTECH_VENDOR_ID, ERRATA_ANDESTECH_NO_IOCP, \
> > + CONFIG_ERRATA_ANDES_CMO) \
> > : : "r"(_cachesize), \
> > "r"((unsigned long)(_start) & ~((_cachesize) - 1UL)), \
> > "r"((unsigned long)(_start) + (_size)), \
> > @@ -151,7 +171,7 @@ asm volatile(ALTERNATIVE_2( \
> > "r"((unsigned long)(_size)), \
> > "r"((unsigned long)(_dir)), \
> > "r"((unsigned long)(_ops)) \
> > - : "a0")
> > + : "a0", "a1", "a2", "a3", "a4", "memory")
> >
> > #define THEAD_C9XX_RV_IRQ_PMU 17
> > #define THEAD_C9XX_CSR_SCOUNTEROF 0x5c5
<snip>
> > +static int ax45mp_l2c_probe(struct platform_device *pdev)
> > +{
> > + struct device_node *np = pdev->dev.of_node;
> > + int ret;
> > +
> > + ax45mp_priv = devm_kzalloc(&pdev->dev, sizeof(*ax45mp_priv), GFP_KERNEL);
> > + if (!ax45mp_priv)
> > + return -ENOMEM;
> > +
> > + ax45mp_priv->l2c_base = devm_of_iomap(&pdev->dev, pdev->dev.of_node, 0, NULL);
> > + if (!ax45mp_priv->l2c_base) {
> > + ret = -ENOMEM;
> > + goto l2c_err;
> > + }
> > +
> > + ret = ax45mp_configure_l2_cache(np);
> > + if (ret)
> > + goto l2c_err;
> > +
> > + ret = ax45mp_configure_pma_regions(np);
> > + if (ret)
> > + goto l2c_err;
> > +
> > + static_branch_disable(&ax45mp_l2c_configured);
> > +
> > + return 0;
> > +
> > +l2c_err:
> > + devm_kfree(&pdev->dev, ax45mp_priv);
> > + ax45mp_priv = NULL;
> > + return ret;
> > +}
> > +
> > +static const struct of_device_id ax45mp_cache_ids[] = {
> > + { .compatible = "andestech,ax45mp-cache" },
> > + { /* sentinel */ }
> > +};
> > +
> > +static struct platform_driver ax45mp_l2c_driver = {
> > + .driver = {
> > + .name = "ax45mp-l2c",
> > + .of_match_table = ax45mp_cache_ids,
> > + },
> > + .probe = ax45mp_l2c_probe,
> > +};
> > +
> > +static int __init ax45mp_cache_init(void)
> > +{
> > + static_branch_enable(&ax45mp_l2c_configured);
> > + return platform_driver_register(&ax45mp_l2c_driver);
>
> the ordering is racy I think.
>
> I.e. in the function called from the cmo operations (ax45mp*_range)
> you need to access ax45mp_priv and its line-size element.
>
> But when you enable the static branch the driver is not yet registered
> but even more important, also not probed yet.
>
> So I guess the static-branch-enable should be living at the end of
> ax45mp_l2c_probe()
>
Hmm so my understanding is incorrect.

static_branch_unlikely() - evaluates to false when
static_branch_enable() is called
static_branch_unlikely() - evaluates to true when
static_branch_disable() is called

Is that what you meant?

Cheers,
Prabhakar
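
For reference, with the static key API a key that has been enabled via
static_branch_enable() makes static_branch_unlikely(&key) evaluate to
true, and a disabled key makes it evaluate to false; the likely/unlikely
suffix only influences code layout. The ordering concern raised above
can be modelled in plain C as below. This is a userspace sketch with a
bool standing in for the static key; model_probe(), model_cmo() and
struct cache_priv are hypothetical names, not the driver's own.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in for the driver's private data. */
struct cache_priv {
	unsigned int line_size;
};

static struct cache_priv priv_storage;
static struct cache_priv *priv;		/* NULL until probe succeeds */
static bool l2c_configured;		/* models the static key, default off */

/* Models probe(): set everything up first, flip the key last. */
static int model_probe(unsigned int line_size)
{
	if (line_size != 64)		/* series only accepts 64-byte lines */
		return -1;

	priv_storage.line_size = line_size;
	priv = &priv_storage;
	l2c_configured = true;		/* enable only once priv is valid */
	return 0;
}

/* Models the CMO path: returns the number of cache lines touched. */
static size_t model_cmo(size_t size)
{
	if (!l2c_configured)		/* key off: do nothing, priv unused */
		return 0;

	assert(priv != NULL);		/* holds because the key is set last */
	return (size + priv->line_size - 1) / priv->line_size;
}
```

If the flag were set in the init function before the driver is probed,
model_cmo() could run with priv still NULL, which is the race being
pointed out.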

2022-11-24 20:06:07

by Conor Dooley

Subject: Re: [PATCH v4 1/7] riscv: asm: alternative-macros: Introduce ALTERNATIVE_3() macro

On Thu, Nov 24, 2022 at 05:22:01PM +0000, Prabhakar wrote:
> From: Lad Prabhakar <[email protected]>
>
> Introduce ALTERNATIVE_3() macro.

Bit perfunctory I think! There's a lovely comment down below that would
make for a better commit message if you were to yoink it.
Content looks about what I'd expect to see though.

> Signed-off-by: Lad Prabhakar <[email protected]>
> ---
> RFC v3 -> v4
> * New patch
> ---
> arch/riscv/include/asm/alternative-macros.h | 94 +++++++++++++++++++++
> 1 file changed, 94 insertions(+)
>
> diff --git a/arch/riscv/include/asm/alternative-macros.h b/arch/riscv/include/asm/alternative-macros.h
> index ec2f3f1b836f..1caf4306b3d6 100644
> --- a/arch/riscv/include/asm/alternative-macros.h
> +++ b/arch/riscv/include/asm/alternative-macros.h
> @@ -69,6 +69,34 @@
> new_c_2, vendor_id_2, errata_id_2, \
> IS_ENABLED(CONFIG_k_2)
>
> +.macro __ALTERNATIVE_CFG_3 old_c, new_c_1, vendor_id_1, errata_id_1, enable_1, \
> + new_c_2, vendor_id_2, errata_id_2, enable_2, \
> + new_c_3, vendor_id_3, errata_id_3, enable_3
> +886 :
> + .option push
> + .option norvc
> + .option norelax
> + \old_c
> + .option pop
> +887 :
> + ALT_NEW_CONTENT \vendor_id_1, \errata_id_1, \enable_1, \new_c_1
> + ALT_NEW_CONTENT \vendor_id_2, \errata_id_2, \enable_2, \new_c_2
> + ALT_NEW_CONTENT \vendor_id_3, \errata_id_3, \enable_3, \new_c_3
> +.endm
> +
> +#define _ALTERNATIVE_CFG_3(old_c, new_c_1, vendor_id_1, errata_id_1, \
> + CONFIG_k_1, \
> + new_c_2, vendor_id_2, errata_id_2, \
> + CONFIG_k_2, \
> + new_c_3, vendor_id_3, errata_id_3, \
> + CONFIG_k_3) \
> + __ALTERNATIVE_CFG_3 old_c, new_c_1, vendor_id_1, errata_id_1, \
> + IS_ENABLED(CONFIG_k_1), \
> + new_c_2, vendor_id_2, errata_id_2, \
> + IS_ENABLED(CONFIG_k_2), \
> + new_c_3, vendor_id_3, errata_id_3, \
> + IS_ENABLED(CONFIG_k_3)
> +
> #else /* !__ASSEMBLY__ */
>
> #include <asm/asm.h>
> @@ -135,6 +163,36 @@
> new_c_2, vendor_id_2, errata_id_2, \
> IS_ENABLED(CONFIG_k_2))
>
> +#define __ALTERNATIVE_CFG_3(old_c, new_c_1, vendor_id_1, errata_id_1, \
> + enable_1, \
> + new_c_2, vendor_id_2, errata_id_2, \
> + enable_2, \
> + new_c_3, vendor_id_3, errata_id_3, \
> + enable_3) \
> + "886 :\n" \
> + ".option push\n" \
> + ".option norvc\n" \
> + ".option norelax\n" \
> + old_c "\n" \
> + ".option pop\n" \
> + "887 :\n" \
> + ALT_NEW_CONTENT(vendor_id_1, errata_id_1, enable_1, new_c_1) \
> + ALT_NEW_CONTENT(vendor_id_2, errata_id_2, enable_2, new_c_2) \
> + ALT_NEW_CONTENT(vendor_id_3, errata_id_3, enable_3, new_c_3)
> +
> +#define _ALTERNATIVE_CFG_3(old_c, new_c_1, vendor_id_1, errata_id_1, \
> + CONFIG_k_1, \
> + new_c_2, vendor_id_2, errata_id_2, \
> + CONFIG_k_2, \
> + new_c_3, vendor_id_3, errata_id_3, \
> + CONFIG_k_3) \
> + __ALTERNATIVE_CFG_3(old_c, new_c_1, vendor_id_1, errata_id_1, \
> + IS_ENABLED(CONFIG_k_1), \
> + new_c_2, vendor_id_2, errata_id_2, \
> + IS_ENABLED(CONFIG_k_2), \
> + new_c_3, vendor_id_3, errata_id_3, \
> + IS_ENABLED(CONFIG_k_3))
> +
> #endif /* __ASSEMBLY__ */
>
> #else /* CONFIG_RISCV_ALTERNATIVE */
> @@ -153,6 +211,14 @@
> CONFIG_k_2) \
> __ALTERNATIVE_CFG old_c
>
> +#define _ALTERNATIVE_CFG_3(old_c, new_c_1, vendor_id_1, errata_id_1, \
> + CONFIG_k_1, \
> + new_c_2, vendor_id_2, errata_id_2, \
> + CONFIG_k_2, \
> + new_c_3, vendor_id_3, errata_id_3, \
> + CONFIG_k_3) \
> + __ALTERNATIVE_CFG old_c
> +
> #else /* !__ASSEMBLY__ */
>
> #define __ALTERNATIVE_CFG(old_c) \
> @@ -167,6 +233,14 @@
> CONFIG_k_2) \
> __ALTERNATIVE_CFG(old_c)
>
> +#define _ALTERNATIVE_CFG_3(old_c, new_c_1, vendor_id_1, errata_id_1, \
> + CONFIG_k_1, \
> + new_c_2, vendor_id_2, errata_id_2, \
> + CONFIG_k_2, \
> + new_c_3, vendor_id_3, errata_id_3, \
> + CONFIG_k_3) \
> + __ALTERNATIVE_CFG(old_c)
> +
> #endif /* __ASSEMBLY__ */
> #endif /* CONFIG_RISCV_ALTERNATIVE */
>
> @@ -202,4 +276,24 @@
> new_content_2, vendor_id_2, \
> errata_id_2, CONFIG_k_2)
>
> +/*
> + * A vendor wants to replace an old_content, but another vendor has used
> + * ALTERNATIVE_2() to patch its customized content at the same location. In
> + * this case, this vendor can create a new macro ALTERNATIVE_3() based
> + * on the following sample code and then replace ALTERNATIVE_2() with
> + * ALTERNATIVE_3() to append its customized content.
> + */
> +#define ALTERNATIVE_3(old_content, new_content_1, vendor_id_1, \
> + errata_id_1, CONFIG_k_1, \
> + new_content_2, vendor_id_2, \
> + errata_id_2, CONFIG_k_2, \
> + new_content_3, vendor_id_3, \
> + errata_id_3, CONFIG_k_3) \
> + _ALTERNATIVE_CFG_3(old_content, new_content_1, vendor_id_1, \
> + errata_id_1, CONFIG_k_1, \
> + new_content_2, vendor_id_2, \
> + errata_id_2, CONFIG_k_2, \
> + new_content_3, vendor_id_3, \
> + errata_id_3, CONFIG_k_3)
> +
> #endif
> --
> 2.25.1
>

2022-11-24 20:22:25

by Conor Dooley

Subject: Re: [PATCH v4 1/7] riscv: asm: alternative-macros: Introduce ALTERNATIVE_3() macro

On Thu, Nov 24, 2022 at 08:58:41PM +0100, Heiko Stübner wrote:
> Am Donnerstag, 24. November 2022, 20:52:33 CET schrieb Conor Dooley:
> > On Thu, Nov 24, 2022 at 05:22:01PM +0000, Prabhakar wrote:
> > > From: Lad Prabhakar <[email protected]>
> > >
> > > Introduce ALTERNATIVE_3() macro.
> >
> > Bit perfunctory I think! There's a lovely comment down below that would
> > make for a better commit message if you were to yoink it.
> > Content looks about what I'd expect to see though.
>
> Also both the comment on the original ALTERNATIVE_2 and the new ALTERNATIVE_3
> should probably be merged into a single comment explaining this once for all
> ALTERNATIVE_x variants.
>
> Especially with the dma stuff, I'm pretty sure we'll get at least an ALTERNATIVE_4
> if not even more ;-) . So we definitely don't want to repeat this multiple times.

Oh I can promise you that there'll be a #4 ;) I do find the comment's
wording to be quite odd though..

> + * A vendor wants to replace an old_content, but another vendor has used
> + * ALTERNATIVE_2() to patch its customized content at the same location. In

In particular this bit about "at the same location" does not make all
that much sense. What "at the same location" means in this context
should be expanded on imo. Effectively it boils down to someone else is
already replacing the same things you want to replace - it's just the
word "location" that might make sense if you're an old hand but not
otherwise?
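
To illustrate what "the same location" amounts to: one patch site
carries the old content plus several candidate replacements, each
tagged with a vendor ID and an errata ID, and at boot at most one of
them ends up applied. A minimal userspace model follows. The names
struct alt_model and select_content() are made up for this sketch and
the vendor IDs a caller passes in are placeholders; this is not the
kernel's alt_entry machinery.

```c
#include <stddef.h>

/* One candidate replacement for a patch site (hypothetical model). */
struct alt_model {
	unsigned long vendor_id;
	unsigned int errata_id;
	int enabled;		/* stands in for IS_ENABLED(CONFIG_...) */
	const char *content;
};

/*
 * Pick the content that ends up at the patch site: the first enabled
 * candidate whose vendor matches the CPU and whose errata bit is set,
 * otherwise the old content stays.
 */
static const char *select_content(const char *old_content,
				  const struct alt_model *alts, size_t n,
				  unsigned long cpu_vendor,
				  unsigned long cpu_errata_mask)
{
	for (size_t i = 0; i < n; i++) {
		if (!alts[i].enabled)
			continue;
		if (alts[i].vendor_id != cpu_vendor)
			continue;
		if (!(cpu_errata_mask & (1UL << alts[i].errata_id)))
			continue;
		return alts[i].content;
	}
	return old_content;
}
```

Seen this way, ALTERNATIVE_2() and ALTERNATIVE_3() differ only in how
many candidates share one site, which is why a third vendor touching
that site has to extend the existing macro rather than add a second
patch site.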

> + * this case, this vendor can create a new macro ALTERNATIVE_3() based

Also, using the word "can". Is it not a "must" rather than a "can",
since this stuff needs to be multiplatform?

> + * on the following sample code and then replace ALTERNATIVE_2() with
> + * ALTERNATIVE_3() to append its customized content.


2022-11-24 20:23:05

by Heiko Stübner

Subject: Re: [PATCH v4 1/7] riscv: asm: alternative-macros: Introduce ALTERNATIVE_3() macro

Am Donnerstag, 24. November 2022, 20:52:33 CET schrieb Conor Dooley:
> On Thu, Nov 24, 2022 at 05:22:01PM +0000, Prabhakar wrote:
> > From: Lad Prabhakar <[email protected]>
> >
> > Introduce ALTERNATIVE_3() macro.
>
> Bit perfunctory I think! There's a lovely comment down below that would
> make for a better commit message if you were to yoink it.
> Content looks about what I'd expect to see though.

Also both the comment on the original ALTERNATIVE_2 and the new ALTERNATIVE_3
should probably be merged into a single comment explaining this once for all
ALTERNATIVE_x variants.

Especially with the dma stuff, I'm pretty sure we'll get at least an ALTERNATIVE_4
if not even more ;-) . So we definitely don't want to repeat this multiple times.


Heiko

> > Signed-off-by: Lad Prabhakar <[email protected]>
> > ---
> > RFC v3 -> v4
> > * New patch
> > ---
> > arch/riscv/include/asm/alternative-macros.h | 94 +++++++++++++++++++++
> > 1 file changed, 94 insertions(+)
> >
> > diff --git a/arch/riscv/include/asm/alternative-macros.h b/arch/riscv/include/asm/alternative-macros.h
> > index ec2f3f1b836f..1caf4306b3d6 100644
> > --- a/arch/riscv/include/asm/alternative-macros.h
> > +++ b/arch/riscv/include/asm/alternative-macros.h
> > @@ -69,6 +69,34 @@
> > new_c_2, vendor_id_2, errata_id_2, \
> > IS_ENABLED(CONFIG_k_2)
> >
> > +.macro __ALTERNATIVE_CFG_3 old_c, new_c_1, vendor_id_1, errata_id_1, enable_1, \
> > + new_c_2, vendor_id_2, errata_id_2, enable_2, \
> > + new_c_3, vendor_id_3, errata_id_3, enable_3
> > +886 :
> > + .option push
> > + .option norvc
> > + .option norelax
> > + \old_c
> > + .option pop
> > +887 :
> > + ALT_NEW_CONTENT \vendor_id_1, \errata_id_1, \enable_1, \new_c_1
> > + ALT_NEW_CONTENT \vendor_id_2, \errata_id_2, \enable_2, \new_c_2
> > + ALT_NEW_CONTENT \vendor_id_3, \errata_id_3, \enable_3, \new_c_3
> > +.endm
> > +
> > +#define _ALTERNATIVE_CFG_3(old_c, new_c_1, vendor_id_1, errata_id_1, \
> > + CONFIG_k_1, \
> > + new_c_2, vendor_id_2, errata_id_2, \
> > + CONFIG_k_2, \
> > + new_c_3, vendor_id_3, errata_id_3, \
> > + CONFIG_k_3) \
> > + __ALTERNATIVE_CFG_3 old_c, new_c_1, vendor_id_1, errata_id_1, \
> > + IS_ENABLED(CONFIG_k_1), \
> > + new_c_2, vendor_id_2, errata_id_2, \
> > + IS_ENABLED(CONFIG_k_2), \
> > + new_c_3, vendor_id_3, errata_id_3, \
> > + IS_ENABLED(CONFIG_k_3)
> > +
> > #else /* !__ASSEMBLY__ */
> >
> > #include <asm/asm.h>
> > @@ -135,6 +163,36 @@
> > new_c_2, vendor_id_2, errata_id_2, \
> > IS_ENABLED(CONFIG_k_2))
> >
> > +#define __ALTERNATIVE_CFG_3(old_c, new_c_1, vendor_id_1, errata_id_1, \
> > + enable_1, \
> > + new_c_2, vendor_id_2, errata_id_2, \
> > + enable_2, \
> > + new_c_3, vendor_id_3, errata_id_3, \
> > + enable_3) \
> > + "886 :\n" \
> > + ".option push\n" \
> > + ".option norvc\n" \
> > + ".option norelax\n" \
> > + old_c "\n" \
> > + ".option pop\n" \
> > + "887 :\n" \
> > + ALT_NEW_CONTENT(vendor_id_1, errata_id_1, enable_1, new_c_1) \
> > + ALT_NEW_CONTENT(vendor_id_2, errata_id_2, enable_2, new_c_2) \
> > + ALT_NEW_CONTENT(vendor_id_3, errata_id_3, enable_3, new_c_3)
> > +
> > +#define _ALTERNATIVE_CFG_3(old_c, new_c_1, vendor_id_1, errata_id_1, \
> > + CONFIG_k_1, \
> > + new_c_2, vendor_id_2, errata_id_2, \
> > + CONFIG_k_2, \
> > + new_c_3, vendor_id_3, errata_id_3, \
> > + CONFIG_k_3) \
> > + __ALTERNATIVE_CFG_3(old_c, new_c_1, vendor_id_1, errata_id_1, \
> > + IS_ENABLED(CONFIG_k_1), \
> > + new_c_2, vendor_id_2, errata_id_2, \
> > + IS_ENABLED(CONFIG_k_2), \
> > + new_c_3, vendor_id_3, errata_id_3, \
> > + IS_ENABLED(CONFIG_k_3))
> > +
> > #endif /* __ASSEMBLY__ */
> >
> > #else /* CONFIG_RISCV_ALTERNATIVE */
> > @@ -153,6 +211,14 @@
> > CONFIG_k_2) \
> > __ALTERNATIVE_CFG old_c
> >
> > +#define _ALTERNATIVE_CFG_3(old_c, new_c_1, vendor_id_1, errata_id_1, \
> > + CONFIG_k_1, \
> > + new_c_2, vendor_id_2, errata_id_2, \
> > + CONFIG_k_2, \
> > + new_c_3, vendor_id_3, errata_id_3, \
> > + CONFIG_k_3) \
> > + __ALTERNATIVE_CFG old_c
> > +
> > #else /* !__ASSEMBLY__ */
> >
> > #define __ALTERNATIVE_CFG(old_c) \
> > @@ -167,6 +233,14 @@
> > CONFIG_k_2) \
> > __ALTERNATIVE_CFG(old_c)
> >
> > +#define _ALTERNATIVE_CFG_3(old_c, new_c_1, vendor_id_1, errata_id_1, \
> > + CONFIG_k_1, \
> > + new_c_2, vendor_id_2, errata_id_2, \
> > + CONFIG_k_2, \
> > + new_c_3, vendor_id_3, errata_id_3, \
> > + CONFIG_k_3) \
> > + __ALTERNATIVE_CFG(old_c)
> > +
> > #endif /* __ASSEMBLY__ */
> > #endif /* CONFIG_RISCV_ALTERNATIVE */
> >
> > @@ -202,4 +276,24 @@
> > new_content_2, vendor_id_2, \
> > errata_id_2, CONFIG_k_2)
> >
> > +/*
> > + * A vendor wants to replace an old_content, but another vendor has used
> > + * ALTERNATIVE_2() to patch its customized content at the same location. In
> > + * this case, this vendor can create a new macro ALTERNATIVE_3() based
> > + * on the following sample code and then replace ALTERNATIVE_2() with
> > + * ALTERNATIVE_3() to append its customized content.
> > + */
> > +#define ALTERNATIVE_3(old_content, new_content_1, vendor_id_1, \
> > + errata_id_1, CONFIG_k_1, \
> > + new_content_2, vendor_id_2, \
> > + errata_id_2, CONFIG_k_2, \
> > + new_content_3, vendor_id_3, \
> > + errata_id_3, CONFIG_k_3) \
> > + _ALTERNATIVE_CFG_3(old_content, new_content_1, vendor_id_1, \
> > + errata_id_1, CONFIG_k_1, \
> > + new_content_2, vendor_id_2, \
> > + errata_id_2, CONFIG_k_2, \
> > + new_content_3, vendor_id_3, \
> > + errata_id_3, CONFIG_k_3)
> > +
> > #endif
>




2022-11-24 20:24:38

by Lad, Prabhakar

Subject: Re: [PATCH v4 0/7] AX45MP: Add support to non-coherent DMA

Hi Conor,

Thank you for the quick test.

On Thu, Nov 24, 2022 at 7:41 PM Conor Dooley <[email protected]> wrote:
>
> Hey!
>
> On Thu, Nov 24, 2022 at 05:22:00PM +0000, Prabhakar wrote:
> > From: Lad Prabhakar <[email protected]>
> >
> > Hi All,
> >
> > On the Andes AX45MP core, cache coherency is a specification option so it
> > may not be supported. In this case DMA will fail. To get around with this
> > issue this patch series does the below:
> >
> > 1] Andes AX45MP core has a Programmable Physical Memory Attributes (PMA)
> > block that allows dynamic adjustment of memory attributes in the runtime.
> > It contains a configurable amount of PMA entries implemented as CSR
> > registers to control the attributes of memory locations in interest. PMA
> > regions are passed from the l2 node which are configured as
> > non-cacheable + bufferable with the SBI call.
> >
> > l2cache: [email protected] {
> > ....
> > andestech,pma-regions = <0x58000000 0x08000000
> > (AX45MP_PMACFG_ETYP_NAPOT |
> > AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF)>;
> > ....
> > };
> >
> > 2] We provide callbacks to synchronize specific content between memory and
> > cache.
> >
> > - arch_sync_dma_for_device()
> > - arch_sync_dma_for_cpu()
> >
> > Below are the configs that are enabled:
> >
> > - DMA_GLOBAL_POOL
> > - RISCV_DMA_NONCOHERENT
> >
> > 3] We reserve the shared DMA pool, so the DMA memory requests go through
> > this pool:
> >
> > reserved-memory {
> > #address-cells = <2>;
> > #size-cells = <2>;
> > ranges;
> >
> > reserved: linux,[email protected] {
> > compatible = "shared-dma-pool";
> > no-map;
> > linux,dma-default;
> > reg = <0x0 0x58000000 0x0 0x08000000>;
> > };
> > };
> >
> >
> > Below is the L2 cache DT node:
> >
> > l2cache: [email protected] {
> > compatible = "andestech,ax45mp-cache", "cache";
> > cache-size = <0x40000>;
> > cache-line-size = <64>;
> > cache-sets = <1024>;
> > cache-unified;
> > reg = <0x0 0x13400000 0x0 0x100000>;
> > andestech,pma-regions = <0x0 0x58000000 0x0 0x08000000 0x0
> > (AX45MP_PMACFG_ETYP_NAPOT |
> > AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF)>;
> > interrupts = <SOC_PERIPHERAL_IRQ(476, IRQ_TYPE_LEVEL_HIGH)>;
> > };
> >
> > Due to the above approach custom SBI calls have been implemented. The
> > above implementation is in preparation for adding support for Renesas
> > RZ/Five SoC which uses the AX45MP core. As with the above approach the
> > kernel image might not be generic so that it can be used on other
> > platforms.
> >
> > OpenSBI implementation isn't upstreamed yet, public repo for access is
> > available at [0].
> >
> > [0] https://github.com/renesas-rz/rz_opensbi/tree/work/OpenSBI-PMA
> >
> > Note,
> > - This series requires testing on Cores with zibcom and T-Head SoCs
> > - Ive used GCC 9.4.0 for compilation
>
> Just dumping the following, which I saw with gcc 12.1 & binutils 2.39
> while building allmodconfig. Perhaps it is worth you upgrading to a
> recent toolchain for testing purposes. FWIW, I applied your patches on
> top of 20221122.
>
> /stuff/linux/arch/riscv/mm/dma-noncoherent.c: Assembler messages:
> /stuff/linux/arch/riscv/mm/dma-noncoherent.c:62: Error: attempt to move .org backwards
> /stuff/linux/arch/riscv/mm/dma-noncoherent.c:66: Error: attempt to move .org backwards
> /stuff/linux/arch/riscv/mm/dma-noncoherent.c:84: Error: attempt to move .org backwards
> /stuff/linux/arch/riscv/mm/dma-noncoherent.c:96: Error: attempt to move .org backwards
>
Hmm, that looks interesting! I'll give that a shot with the latest toolchain.

BTW is there a link to get the latest toolchain?
>
> In file included from /stuff/linux/arch/riscv/errata/andes/errata.c:16:
> /stuff/linux/arch/riscv/errata/andes/errata.c: In function 'is_auipc_insn':
> /stuff/linux/arch/riscv/errata/andes/errata.c:25:34: error: 'MASK_AUIPC' undeclared (first use in this function)
> 25 | DECLARE_INSN(auipc, MATCH_AUIPC, MASK_AUIPC)
> | ^~~~~~~~~~
> /stuff/linux/arch/riscv/include/asm/parse_asm.h:175:25: note: in definition of macro 'DECLARE_INSN'
> 175 | return (insn & (INSN_MASK)) == (INSN_MATCH); \
> | ^~~~~~~~~
> /stuff/linux/arch/riscv/errata/andes/errata.c:25:34: note: each undeclared identifier is reported only once for each function it appears in
> 25 | DECLARE_INSN(auipc, MATCH_AUIPC, MASK_AUIPC)
> | ^~~~~~~~~~
> /stuff/linux/arch/riscv/include/asm/parse_asm.h:175:25: note: in definition of macro 'DECLARE_INSN'
> 175 | return (insn & (INSN_MASK)) == (INSN_MATCH); \
> | ^~~~~~~~~
> /stuff/linux/arch/riscv/errata/andes/errata.c:25:21: error: 'MATCH_AUIPC' undeclared (first use in this function); did you mean 'OPC_AUIPC'?
> 25 | DECLARE_INSN(auipc, MATCH_AUIPC, MASK_AUIPC)
> | ^~~~~~~~~~~
> /stuff/linux/arch/riscv/include/asm/parse_asm.h:175:41: note: in definition of macro 'DECLARE_INSN'
> 175 | return (insn & (INSN_MASK)) == (INSN_MATCH); \
> | ^~~~~~~~~~
> /stuff/linux/arch/riscv/errata/andes/errata.c: In function 'riscv_alternative_fix_auipc_jalr':
> /stuff/linux/arch/riscv/errata/andes/errata.c:64:23: error: implicit declaration of function 'EXTRACT_RD_REG' [-Werror=implicit-function-declaration]
> 64 | rd1 = EXTRACT_RD_REG(*(alt_ptr + i));
> | ^~~~~~~~~~~~~~
> /stuff/linux/arch/riscv/errata/andes/errata.c:69:24: error: implicit declaration of function 'EXTRACT_UTYPE_IMM'; did you mean 'EXTRACT_BTYPE_IMM'? [-Werror=implicit-function-declaration]
> 69 | imm1 = EXTRACT_UTYPE_IMM(*(alt_ptr + i));
> | ^~~~~~~~~~~~~~~~~
> | EXTRACT_BTYPE_IMM
> /stuff/linux/arch/riscv/errata/andes/errata.c:78:30: error: 'U_IMM_31_12_MASK' undeclared (first use in this function); did you mean 'J_IMM_19_12_MASK'?
> 78 | call[0] &= ~(U_IMM_31_12_MASK);
> | ^~~~~~~~~~~~~~~~
> | J_IMM_19_12_MASK
> /stuff/linux/arch/riscv/errata/andes/errata.c: In function 'is_auipc_insn':
> /stuff/linux/arch/riscv/include/asm/parse_asm.h:176:1: error: control reaches end of non-void function [-Werror=return-type]
> 176 | }
> | ^
> /stuff/linux/arch/riscv/errata/andes/errata.c:25:1: note: in expansion of macro 'DECLARE_INSN'
> 25 | DECLARE_INSN(auipc, MATCH_AUIPC, MASK_AUIPC)
> | ^~~~~~~~~~~~
> cc1: all warnings being treated as errors
>
>
Oops, I forgot to mention the dependency here: we need the patches from
[0]. Just patches 1-5 should be sufficient for this build (as including
patch 6/7 gave me a build issue).

[0] https://patchwork.kernel.org/project/linux-riscv/cover/[email protected]/

Cheers,
Prabhakar

2022-11-24 20:49:57

by Conor Dooley

Subject: Re: [PATCH v4 0/7] AX45MP: Add support to non-coherent DMA

On Thu, Nov 24, 2022 at 07:52:28PM +0000, Lad, Prabhakar wrote:

> Hmm, that looks interesting! I'll give that a shot with the latest toolchain.
>
> BTW is there a link to get the latest toolchain?

Uhh, I just build mine from https://github.com/riscv-collab/riscv-gnu-toolchain
Arnd puts toolchains here: https://mirrors.edge.kernel.org/pub/tools/crosstool/
I've never used one though!

> Oops, I forgot to mention the dependency here: we need the patches from
> [0]. Just patches 1-5 should be sufficient for this build (as including
> patch 6/7 gave me a build issue).
>
> [0] https://patchwork.kernel.org/project/linux-riscv/cover/[email protected]/

Ah right. I kinda figured that with the "don't review this" patch that
you'd pulled all of the deps into it.

2022-11-24 20:50:27

by Conor Dooley

Subject: Re: [PATCH v4 3/7] riscv: errata: Add Andes alternative ports

On Thu, Nov 24, 2022 at 05:22:03PM +0000, Prabhakar wrote:
> From: Lad Prabhakar <[email protected]>
>
> Add required ports of the Alternative scheme for Andes CPU cores.

You've got a lot of nice info in your cover letter that would be nice in
the git history. Could you add some of the commentary about why the
Andes cache needs special handling from there to this commit message
please?

> Signed-off-by: Lad Prabhakar <[email protected]>
> ---
> RFC v3 -> v4
> * New patch
> ---
> arch/riscv/Kconfig.erratas | 22 +++++++++
> arch/riscv/errata/Makefile | 1 +
> arch/riscv/errata/andes/Makefile | 1 +
> arch/riscv/errata/andes/errata.c | 68 ++++++++++++++++++++++++++++
> arch/riscv/include/asm/alternative.h | 3 ++
> arch/riscv/include/asm/errata_list.h | 5 ++
> arch/riscv/kernel/alternative.c | 5 ++
> 7 files changed, 105 insertions(+)
> create mode 100644 arch/riscv/errata/andes/Makefile
> create mode 100644 arch/riscv/errata/andes/errata.c

> diff --git a/arch/riscv/errata/andes/errata.c b/arch/riscv/errata/andes/errata.c
> new file mode 100644
> index 000000000000..ec3e052ca8c7
> --- /dev/null
> +++ b/arch/riscv/errata/andes/errata.c
> @@ -0,0 +1,68 @@
> +// SPDX-License-Identifier: GPL-2.0-only
> +/*
> + * Erratas to be applied for Andes CPU cores
> + *
> + * Copyright (C) 2022 Renesas Electronics Corporation.
> + *
> + * Author: Lad Prabhakar <[email protected]>
> + */
> +
> +#include <linux/kernel.h>
> +#include <linux/module.h>
> +
> +#include <asm/alternative.h>
> +#include <asm/cacheflush.h>
> +#include <asm/errata_list.h>
> +#include <asm/patch.h>
> +#include <asm/vendorid_list.h>
> +
> +static bool errata_probe_iocp(unsigned int stage, unsigned long arch_id, unsigned long impid)

To the lay reader, what's an "iocp" when it's at home? "I/O coherency
port"? Again, commit message would be a good place for the introduction
of that term :)

> +{
> + if (!IS_ENABLED(CONFIG_ERRATA_ANDES_CMO))
> + return false;
> +
> + if (arch_id != 0x8000000000008a45 || impid != 0x500)

Can you #define these?
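
Something along these lines is what I have in mind. This is only a sketch, assuming a 64-bit unsigned long as on RV64; the macro names and the helper are hypothetical, and only the two values are taken from the patch:

```c
#include <stdbool.h>

/* Hypothetical names; the values are the ones hard-coded in the patch. */
#define ANDESTECH_AX45MP_MARCHID	0x8000000000008a45UL
#define ANDESTECH_AX45MP_MIMPID		0x500UL

/* Stand-alone version of the ID comparison done in errata_probe_iocp(). */
bool ax45mp_ids_match(unsigned long arch_id, unsigned long impid)
{
	return arch_id == ANDESTECH_AX45MP_MARCHID &&
	       impid == ANDESTECH_AX45MP_MIMPID;
}
```

With named constants the condition reads as "is this an AX45MP at this implementation revision" rather than a pair of magic numbers.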

> + return false;
> +
> + riscv_cbom_block_size = 1;
> + riscv_noncoherent_supported();
> +
> + return true;
> +}
> +
> +static u32 andes_errata_probe(unsigned int stage, unsigned long archid, unsigned long impid)
> +{
> + u32 cpu_req_errata = 0;
> +

I read some code and when it does the opposite of what I'd expect, I
feel inclined to add a comment. In this case, you're probing for the
presence of the port `probe_iocp()`, but the interesting case is when
you don't find it. You can leave it uncommented if you like, but even
something like the below I think fits.

/*
* In the absence of the I/O Coherency Port, access to certain peripherals
* requires vendor specific DMA handling.
*/
> + if (errata_probe_iocp(stage, archid, impid))
> + cpu_req_errata |= BIT(ERRATA_ANDESTECH_NO_IOCP);
> +
> + return cpu_req_errata;
> +}
> +
> +void __init_or_module andes_errata_patch_func(struct alt_entry *begin, struct alt_entry *end,
> + unsigned long archid, unsigned long impid,
> + unsigned int stage)
> +{
> + u32 cpu_req_errata = andes_errata_probe(stage, archid, impid);
> + struct alt_entry *alt;
> + u32 tmp;
> +
> + if (stage == RISCV_ALTERNATIVES_EARLY_BOOT)
> + return;
> +
> + for (alt = begin; alt < end; alt++) {
> + if (alt->vendor_id != ANDESTECH_VENDOR_ID)
> + continue;
> + if (alt->errata_id >= ERRATA_ANDESTECH_NUMBER)
> + continue;
> +
> + tmp = (1U << alt->errata_id);

Is this not BIT(alt->errata_id)?
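
For reference, a small user-space sketch of the equivalence. BIT() here is a local stand-in for the kernel macro, and errata_required() is a hypothetical helper that only mirrors the mask test in the loop above:

```c
/* User-space stand-in for the kernel's BIT() helper. */
#define BIT(nr)	(1UL << (nr))

/* Mirrors the mask test in andes_errata_patch_func(): is this erratum set? */
int errata_required(unsigned int cpu_req_errata, unsigned int errata_id)
{
	return (cpu_req_errata & BIT(errata_id)) != 0;
}
```

Using BIT() on both the probe side and the patch side keeps the two in step, and the temporary variable can go away.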

> + if (cpu_req_errata & tmp) {
> + patch_text_nosync(alt->old_ptr, alt->alt_ptr, alt->alt_len);
> +
> + riscv_alternative_fix_auipc_jalr(alt->old_ptr, alt->alt_len,
> + alt->old_ptr - alt->alt_ptr);
> + }
> + }
> +}
> diff --git a/arch/riscv/include/asm/alternative.h b/arch/riscv/include/asm/alternative.h
> index 6511dd73e812..d8012af30cbd 100644
> --- a/arch/riscv/include/asm/alternative.h
> +++ b/arch/riscv/include/asm/alternative.h
> @@ -46,6 +46,9 @@ void sifive_errata_patch_func(struct alt_entry *begin, struct alt_entry *end,
> void thead_errata_patch_func(struct alt_entry *begin, struct alt_entry *end,
> unsigned long archid, unsigned long impid,
> unsigned int stage);
> +void andes_errata_patch_func(struct alt_entry *begin, struct alt_entry *end,
> + unsigned long archid, unsigned long impid,
> + unsigned int stage);
>
> void riscv_cpufeature_patch_func(struct alt_entry *begin, struct alt_entry *end,
> unsigned int stage);
> diff --git a/arch/riscv/include/asm/errata_list.h b/arch/riscv/include/asm/errata_list.h
> index 4180312d2a70..2ba7e6e74540 100644
> --- a/arch/riscv/include/asm/errata_list.h
> +++ b/arch/riscv/include/asm/errata_list.h
> @@ -9,6 +9,11 @@
> #include <asm/csr.h>
> #include <asm/vendorid_list.h>
>
> +#ifdef CONFIG_ERRATA_ANDES
> +#define ERRATA_ANDESTECH_NO_IOCP 0
> +#define ERRATA_ANDESTECH_NUMBER 1
> +#endif

Not a question for you, but I wonder why we even bother wrapping these
defines.

> +
> #ifdef CONFIG_ERRATA_SIFIVE
> #define ERRATA_SIFIVE_CIP_453 0
> #define ERRATA_SIFIVE_CIP_1200 1
> diff --git a/arch/riscv/kernel/alternative.c b/arch/riscv/kernel/alternative.c
> index a7d26a00beea..4ded3e9aa3bc 100644
> --- a/arch/riscv/kernel/alternative.c
> +++ b/arch/riscv/kernel/alternative.c
> @@ -47,6 +47,11 @@ static void __init_or_module riscv_fill_cpu_mfr_info(struct cpu_manufacturer_inf
> case THEAD_VENDOR_ID:
> cpu_mfr_info->patch_func = thead_errata_patch_func;
> break;
> +#endif
> +#ifdef CONFIG_ERRATA_ANDES
> + case ANDESTECH_VENDOR_ID:
> + cpu_mfr_info->patch_func = andes_errata_patch_func;
> + break;
> #endif
> default:
> cpu_mfr_info->patch_func = NULL;
> --
> 2.25.1
>

2022-11-24 21:01:18

by Heiko Stübner

Subject: Re: [PATCH v4 7/7] soc: renesas: Add L2 cache management for RZ/Five SoC

Am Donnerstag, 24. November 2022, 20:56:39 CET schrieb Lad, Prabhakar:
> Hi Heiko,
>
> Thank you for the review.
>
> On Thu, Nov 24, 2022 at 6:30 PM Heiko Stübner <[email protected]> wrote:
> >
> > Am Donnerstag, 24. November 2022, 18:22:07 CET schrieb Prabhakar:
> > > From: Lad Prabhakar <[email protected]>
> > >
> > > On the AX45MP core, cache coherency is a specification option so it may
> > > not be supported. In this case DMA will fail. As a workaround, firstly we
> > > allocate a global dma coherent pool from which DMA allocations are taken
> > > and marked as non-cacheable + bufferable using the PMA region as specified
> > > in the device tree. Synchronization callbacks are implemented to
> > > synchronize when doing DMA transactions.
> > >
> > > The Andes AX45MP core has a Programmable Physical Memory Attributes (PMA)
> > > block that allows dynamic adjustment of memory attributes in the runtime.
> > > It contains a configurable amount of PMA entries implemented as CSR
> > > registers to control the attributes of memory locations in interest.
> > >
> > > Below are the memory attributes supported:
> > > * Device, Non-bufferable
> > > * Device, bufferable
> > > * Memory, Non-cacheable, Non-bufferable
> > > * Memory, Non-cacheable, Bufferable
> > > * Memory, Write-back, No-allocate
> > > * Memory, Write-back, Read-allocate
> > > * Memory, Write-back, Write-allocate
> > > * Memory, Write-back, Read and Write-allocate
> > >
> > > This patch adds support to configure the memory attributes of the memory
> > > regions as passed from the l2 cache node and exposes the cache management
> > > ops.
> > >
> > > More info about PMA (section 10.3):
> > > Link: http://www.andestech.com/wp-content/uploads/AX45MP-1C-Rev.-5.0.0-Datasheet.pdf
> > >
> > > Signed-off-by: Lad Prabhakar <[email protected]>
> > > ---
> > > RFC v3 -> v4
> > > * Made use of runtime patching instead of compile time
> > > * Now just exposing single function ax45mp_no_iocp_cmo() for CMO handling
> > > * Added a check to make sure cache line size is always 64 bytes
> > > * Renamed folder rzf -> rzfive
> > > * Improved Kconfig description
> > > * Dropped L2 cache configuration
> > > * Dropped unnecessary casts
> > > * Fixed comments pointed by Geert, apart from use of PTR_ALIGN_XYZ() macros.
> > > ---
> > > arch/riscv/include/asm/cacheflush.h | 8 +
> > > arch/riscv/include/asm/errata_list.h | 32 +-
> > > drivers/soc/renesas/Kconfig | 7 +
> > > drivers/soc/renesas/Makefile | 2 +
> > > drivers/soc/renesas/rzfive/Kconfig | 6 +
> > > drivers/soc/renesas/rzfive/Makefile | 3 +
> > > drivers/soc/renesas/rzfive/ax45mp_cache.c | 415 ++++++++++++++++++++++
> > > drivers/soc/renesas/rzfive/ax45mp_sbi.h | 29 ++
> > > 8 files changed, 496 insertions(+), 6 deletions(-)
> > > create mode 100644 drivers/soc/renesas/rzfive/Kconfig
> > > create mode 100644 drivers/soc/renesas/rzfive/Makefile
> > > create mode 100644 drivers/soc/renesas/rzfive/ax45mp_cache.c
> > > create mode 100644 drivers/soc/renesas/rzfive/ax45mp_sbi.h
> > >
> > > diff --git a/arch/riscv/include/asm/cacheflush.h b/arch/riscv/include/asm/cacheflush.h
> > > index 4a04d1be7c67..3226f3aceafe 100644
> > > --- a/arch/riscv/include/asm/cacheflush.h
> > > +++ b/arch/riscv/include/asm/cacheflush.h
> > > @@ -61,6 +61,14 @@ static inline void riscv_noncoherent_supported(void) {}
> > > #define SYS_RISCV_FLUSH_ICACHE_LOCAL 1UL
> > > #define SYS_RISCV_FLUSH_ICACHE_ALL (SYS_RISCV_FLUSH_ICACHE_LOCAL)
> > >
> > > +#ifdef CONFIG_AX45MP_L2_CACHE
> > > +extern asmlinkage void ax45mp_no_iocp_cmo(unsigned int cache_size, void *vaddr,
> > > + size_t size, int dir, int ops);
> > > +#else
> > > +inline void ax45mp_no_iocp_cmo(unsigned int cache_size, void *vaddr,
> > > + size_t size, int dir, int ops) {}
> > > +#endif
> > > +
> > > #include <asm-generic/cacheflush.h>
> > >
> > > #endif /* _ASM_RISCV_CACHEFLUSH_H */
> > > diff --git a/arch/riscv/include/asm/errata_list.h b/arch/riscv/include/asm/errata_list.h
> > > index 48e899a8e7a9..300fed3bfd80 100644
> > > --- a/arch/riscv/include/asm/errata_list.h
> > > +++ b/arch/riscv/include/asm/errata_list.h
> > > @@ -125,8 +125,8 @@ asm volatile(ALTERNATIVE( \
> > > #define THEAD_SYNC_S ".long 0x0190000b"
> > >
> > > #define ALT_CMO_OP(_op, _start, _size, _cachesize, _dir, _ops) \
> > > -asm volatile(ALTERNATIVE_2( \
> > > - __nops(6), \
> > > +asm volatile(ALTERNATIVE_3( \
> > > + __nops(14), \
> > > "mv a0, %1\n\t" \
> > > "j 2f\n\t" \
> > > "3:\n\t" \
> > > @@ -134,7 +134,7 @@ asm volatile(ALTERNATIVE_2( \
> > > "add a0, a0, %0\n\t" \
> > > "2:\n\t" \
> > > "bltu a0, %2, 3b\n\t" \
> > > - "nop", 0, CPUFEATURE_ZICBOM, CONFIG_RISCV_ISA_ZICBOM, \
> > > + __nops(8), 0, CPUFEATURE_ZICBOM, CONFIG_RISCV_ISA_ZICBOM, \
> > > "mv a0, %1\n\t" \
> > > "j 2f\n\t" \
> > > "3:\n\t" \
> > > @@ -142,8 +142,28 @@ asm volatile(ALTERNATIVE_2( \
> > > "add a0, a0, %0\n\t" \
> > > "2:\n\t" \
> > > "bltu a0, %2, 3b\n\t" \
> > > - THEAD_SYNC_S, THEAD_VENDOR_ID, \
> > > - ERRATA_THEAD_CMO, CONFIG_ERRATA_THEAD_CMO) \
> > > + THEAD_SYNC_S "\n\t" \
> > > + __nops(8), THEAD_VENDOR_ID, \
> > > + ERRATA_THEAD_CMO, CONFIG_ERRATA_THEAD_CMO, \
> > > + ".option push\n\t\n\t" \
> > > + ".option norvc\n\t" \
> > > + ".option norelax\n\t" \
> >
> > alternatives already do the norvc + norelax options anyway for old and new instructions,
> > so the .option stuff shouldn't be necessary I guess?
> >
> I did a quick run with .option stuff and all seems to be OK. I'll do
> some rigorous testing and get rid of it in the next version.
> >
> > > + "addi sp,sp,-16\n\t" \
> > > + "sd s0,0(sp)\n\t" \
> > > + "sd ra,8(sp)\n\t" \
> > > + "addi s0,sp,16\n\t" \
> > > + "mv a4,%6\n\t" \
> > > + "mv a3,%5\n\t" \
> > > + "mv a2,%4\n\t" \
> > > + "mv a1,%3\n\t" \
> > > + "mv a0,%0\n\t" \
> > > + "call ax45mp_no_iocp_cmo\n\t" \
> > > + "ld ra,8(sp)\n\t" \
> > > + "ld s0,0(sp)\n\t" \
> > > + "addi sp,sp,16\n\t" \
> > > + ".option pop\n\t", \
> > > + ANDESTECH_VENDOR_ID, ERRATA_ANDESTECH_NO_IOCP, \
> > > + CONFIG_ERRATA_ANDES_CMO) \
> > > : : "r"(_cachesize), \
> > > "r"((unsigned long)(_start) & ~((_cachesize) - 1UL)), \
> > > "r"((unsigned long)(_start) + (_size)), \
> > > @@ -151,7 +171,7 @@ asm volatile(ALTERNATIVE_2( \
> > > "r"((unsigned long)(_size)), \
> > > "r"((unsigned long)(_dir)), \
> > > "r"((unsigned long)(_ops)) \
> > > - : "a0")
> > > + : "a0", "a1", "a2", "a3", "a4", "memory")
> > >
> > > #define THEAD_C9XX_RV_IRQ_PMU 17
> > > #define THEAD_C9XX_CSR_SCOUNTEROF 0x5c5
> <snip>
> > > +static int ax45mp_l2c_probe(struct platform_device *pdev)
> > > +{
> > > + struct device_node *np = pdev->dev.of_node;
> > > + int ret;
> > > +
> > > + ax45mp_priv = devm_kzalloc(&pdev->dev, sizeof(*ax45mp_priv), GFP_KERNEL);
> > > + if (!ax45mp_priv)
> > > + return -ENOMEM;
> > > +
> > > + ax45mp_priv->l2c_base = devm_of_iomap(&pdev->dev, pdev->dev.of_node, 0, NULL);
> > > + if (!ax45mp_priv->l2c_base) {
> > > + ret = -ENOMEM;
> > > + goto l2c_err;
> > > + }
> > > +
> > > + ret = ax45mp_configure_l2_cache(np);
> > > + if (ret)
> > > + goto l2c_err;
> > > +
> > > + ret = ax45mp_configure_pma_regions(np);
> > > + if (ret)
> > > + goto l2c_err;
> > > +
> > > + static_branch_disable(&ax45mp_l2c_configured);
> > > +
> > > + return 0;
> > > +
> > > +l2c_err:
> > > + devm_kfree(&pdev->dev, ax45mp_priv);
> > > + ax45mp_priv = NULL;
> > > + return ret;
> > > +}
> > > +
> > > +static const struct of_device_id ax45mp_cache_ids[] = {
> > > + { .compatible = "andestech,ax45mp-cache" },
> > > + { /* sentinel */ }
> > > +};
> > > +
> > > +static struct platform_driver ax45mp_l2c_driver = {
> > > + .driver = {
> > > + .name = "ax45mp-l2c",
> > > + .of_match_table = ax45mp_cache_ids,
> > > + },
> > > + .probe = ax45mp_l2c_probe,
> > > +};
> > > +
> > > +static int __init ax45mp_cache_init(void)
> > > +{
> > > + static_branch_enable(&ax45mp_l2c_configured);
> > > + return platform_driver_register(&ax45mp_l2c_driver);
> >
> > the ordering is racy I think.
> >
> > I.e. in the function called from the cmo operations (ax45mp*_range)
> > you need to access ax45mp_priv and its line-size element.
> >
> > But when you enable the static branch the driver is not yet registered
> > but even more important, also not probed yet.
> >
> > So I guess the static-branch-enable should be living at the end of
> > ax45mp_l2c_probe()
> >
> Hmm so my understanding is incorrect.
>
> static_branch_unlikely() - evaluates to false when
> static_branch_enable() is called
> static_branch_unlikely() - evaluates to true when
> static_branch_disable() is called
>
> Is that what you meant?

That is an issue as well :-)

I.e. static_branch_likely() and static_branch_unlikely() both evaluate to
true when the key is enabled and false when it is disabled. The likely /
unlikely suffix is purely a runtime performance hint: unlikely means that
you expect the static key to be false in _most_ cases, whereas likely
means you expect it to be true in most cases.

But also

- arch_initcall(ax45mp_cache_init);
  - does static-branch enable
  - registers platform_driver
- platform_driver probes
  - kzalloc(priv)
  ... [1]
  - priv->line_size from dt

At [1] your condition could already be fulfilled,
but you don't have a cache-line-size yet.
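
To illustrate the ordering, here is a minimal user-space sketch. It is a single-threaded model only: a plain bool stands in for the static key, and all names (model_probe(), model_cmo(), the struct) are hypothetical rather than taken from the driver. The point is just that the key is published as the very last step of probe, after the line size is known:

```c
#include <stdbool.h>
#include <stddef.h>

/* Simplified stand-in for the driver state; not the real kernel type. */
struct ax45mp_priv_model {
	unsigned int cache_line_size;
};

static struct ax45mp_priv_model priv_storage;
static struct ax45mp_priv_model *priv;	/* NULL until probe sets it up */
static bool l2c_configured;		/* models the static key */

/* Models the CMO entry point: only touch priv when the key is enabled. */
int model_cmo(void)
{
	if (!l2c_configured)
		return 0;			/* key disabled: nothing to do */
	return (int)priv->cache_line_size;	/* valid only once probe is done */
}

/* Models probe(): enable the key last, once everything is populated. */
int model_probe(unsigned int line_size_from_dt)
{
	priv = &priv_storage;
	priv->cache_line_size = line_size_from_dt;
	if (priv->cache_line_size != 64)
		return -1;		/* error path leaves the key disabled */
	l2c_configured = true;
	return 0;
}
```

With this ordering a CMO call that arrives before or during probe simply sees the key disabled, instead of dereferencing a half-initialised priv.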


Heiko


2022-11-24 21:02:16

by Conor Dooley

Subject: Re: [PATCH v4 1/7] riscv: asm: alternative-macros: Introduce ALTERNATIVE_3() macro

On 24/11/2022 20:05, Conor Dooley wrote:
> On Thu, Nov 24, 2022 at 08:58:41PM +0100, Heiko Stübner wrote:
>> Am Donnerstag, 24. November 2022, 20:52:33 CET schrieb Conor Dooley:
>>> On Thu, Nov 24, 2022 at 05:22:01PM +0000, Prabhakar wrote:
>>>> From: Lad Prabhakar <[email protected]>
>>>>
>>>> Introduce ALTERNATIVE_3() macro.
>>>
>>> Bit perfunctory I think! There's a lovely comment down below that would
>>> make for a better commit message if you were to yoink it.
>>> Content looks about what I'd expect to see though.
>>
>> Also both the comment on the original ALTERNATIVE_2 and the new ALTERNATIVE_3
>> should probably be merged into a single comment explaining this once for all
>> ALTERNATIVE_x variants.
>>
>> Especially with the dma stuff, I'm pretty sure we'll get at least an ALTERNATIVE_4
>> if not even more ;-) . So we definitely don't want to repeat this multiple times.
>
> Oh I can promise you that there'll be a #4 ;) I do find the comment's
> wording to be quite odd though..
>
>> + * A vendor wants to replace an old_content, but another vendor has used
>> + * ALTERNATIVE_2() to patch its customized content at the same location. In
>
> In particular this bit about "at the same location" does not make all
> that much sense. What "at the same location" means in this context
> should be expanded on imo. Effectively it boils down to someone else is
> already replacing the same things you want to replace - it's just the
> word "location" that might make sense if you're an old hand but not
> otherwise?

Or maybe I am just biased because I tried to explain this to someone
recently and the language in the comments didn't make sense to them,
and anyone meddling with this code should be able to understand it?

>> + * this case, this vendor can create a new macro ALTERNATIVE_3() based
>
> Also, using the word "can". Is it not a "must" rather than a "can",
> since this stuff needs to be multiplatform?
>
>> + * on the following sample code and then replace ALTERNATIVE_2() with
>> + * ALTERNATIVE_3() to append its customized content.
>
>

2022-11-24 21:22:00

by Heiko Stübner

Subject: Re: [PATCH v4 1/7] riscv: asm: alternative-macros: Introduce ALTERNATIVE_3() macro

Am Donnerstag, 24. November 2022, 21:08:17 CET schrieb Conor Dooley:
> On 24/11/2022 20:05, Conor Dooley wrote:
> > On Thu, Nov 24, 2022 at 08:58:41PM +0100, Heiko Stübner wrote:
> >> Am Donnerstag, 24. November 2022, 20:52:33 CET schrieb Conor Dooley:
> >>> On Thu, Nov 24, 2022 at 05:22:01PM +0000, Prabhakar wrote:
> >>>> From: Lad Prabhakar <[email protected]>
> >>>>
> >>>> Introduce ALTERNATIVE_3() macro.
> >>>
> >>> Bit perfunctory I think! There's a lovely comment down below that would
> >>> make for a better commit message if you were to yoink it.
> >>> Content looks about what I'd expect to see though.
> >>
> >> Also both the comment on the original ALTERNATIVE_2 and the new ALTERNATIVE_3
> >> should probably be merged into a single comment explaining this once for all
> >> ALTERNATIVE_x variants.
> >>
> >> Especially with the dma stuff, I'm pretty sure we'll get at least an ALTERNATIVE_4
> >> if not even more ;-) . So we definitely don't want to repeat this multiple times.
> >
> > Oh I can promise you that there'll be a #4 ;) I do find the comment's
> > wording to be quite odd though..
> >
> >> + * A vendor wants to replace an old_content, but another vendor has used
> >> + * ALTERNATIVE_2() to patch its customized content at the same location. In
> >
> > In particular this bit about "at the same location" does not make all
> > that much sense. What "at the same location" means in this context
> > should be expanded on imo. Effectively it boils down to someone else is
> > already replacing the same things you want to replace - it's just the
> > word "location" that might make sense if you're an old hand but not
> > otherwise?
>
> Or maybe I am just biased because I tried to explain this to someone
> recently and the language in the comments didn't make sense to them,
> and anyone meddling with this code should be able to understand it?

When I first looked at the alternatives / patching mechanism, the whole
thing looked like dark magic to me ;-) .

But yeah, both the comment here and the original one above ALTERNATIVE_2
could use improvements to better explain what they are trying to do.
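
For what it's worth, the way I would explain "the same location" is with a toy model like the one below. This is a deliberately simplified user-space sketch, not the kernel's struct alt_entry or its patch functions, and the names are made up for illustration: one patch site carries the old content plus N candidate replacements, and at boot the candidate whose vendor and erratum match the running CPU wins, otherwise the old content stays.

```c
#include <stddef.h>

/* Simplified model of one alternative entry for a single patch site. */
struct alt_model {
	unsigned int vendor_id;
	unsigned int errata_id;
	const char *new_content;
};

/*
 * Return what ends up at the patch site: the first candidate whose vendor
 * matches the CPU and whose erratum bit is set, else the old content.
 */
const char *resolve_site(const char *old_content,
			 const struct alt_model *alts, size_t n,
			 unsigned int cpu_vendor, unsigned int cpu_req_errata)
{
	for (size_t i = 0; i < n; i++) {
		if (alts[i].vendor_id != cpu_vendor)
			continue;
		if (cpu_req_errata & (1U << alts[i].errata_id))
			return alts[i].new_content;
	}
	return old_content;
}
```

In this picture ALTERNATIVE_2() is a site with two candidates and ALTERNATIVE_3() a site with three, which is why a new vendor touching an already-patched site has to extend the existing macro rather than add a second one.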


> >> + * this case, this vendor can create a new macro ALTERNATIVE_3() based
> >
> > Also, using the word "can". Is it not a "must" rather than a "can",
> > since this stuff needs to be multiplatform?
> >
> >> + * on the following sample code and then replace ALTERNATIVE_2() with
> >> + * ALTERNATIVE_3() to append its customized content.
> >
> >
>
>




2022-11-24 22:09:13

by Conor Dooley

Subject: Re: [PATCH v4 7/7] soc: renesas: Add L2 cache management for RZ/Five SoC

On Thu, Nov 24, 2022 at 09:31:42PM +0000, Conor Dooley wrote:
> On Thu, Nov 24, 2022 at 05:22:07PM +0000, Prabhakar wrote:
> > From: Lad Prabhakar <[email protected]>

> > + ax45mp_priv->ucctl_ok = ax45mp_cpu_cache_controlable();

> That function name is a typo, should be called ax45mp_cpu_cache_cache_controllable().


And so is my suggestion! s/cache_cache/cache/

2022-11-24 22:13:41

by Conor Dooley

Subject: Re: [PATCH v4 7/7] soc: renesas: Add L2 cache management for RZ/Five SoC

On Thu, Nov 24, 2022 at 05:22:07PM +0000, Prabhakar wrote:
> From: Lad Prabhakar <[email protected]>
>
> On the AX45MP core, cache coherency is a specification option so it may

How about:
"Cache coherency is an optional feature of the AX45MP core, so it may not
be supported."

I keep finding that sentence kinda hard..

> In this case DMA will fail.
"The AX45MP predates the standard extensions for cache management, so an
alternate approach is required to support non-coherent DMA for SoCs
where this feature is not available, such as the Renesas RZ/Five."

(you've gotta explain somewhere why this is in drivers/soc/renesas lol)

> As a workaround, firstly we

How about:
" Since the cache management instructions cannot be used, we instead
allocate..."

> allocate a global dma coherent pool from which DMA allocations are taken
> and marked as non-cacheable + bufferable using the PMA region as specified
> in the device tree. Synchronization callbacks are implemented to
> synchronize when doing DMA transactions.
>
> The Andes AX45MP core has a Programmable Physical Memory Attributes (PMA)
> block that allows dynamic adjustment of memory attributes in the runtime.
> It contains a configurable amount of PMA entries implemented as CSR
> registers to control the attributes of memory locations in interest.
>
> Below are the memory attributes supported:
> * Device, Non-bufferable
> * Device, bufferable
> * Memory, Non-cacheable, Non-bufferable
> * Memory, Non-cacheable, Bufferable
> * Memory, Write-back, No-allocate
> * Memory, Write-back, Read-allocate
> * Memory, Write-back, Write-allocate
> * Memory, Write-back, Read and Write-allocate
>
> This patch adds support to configure the memory attributes of the memory
> regions as passed from the l2 cache node and exposes the cache management
> ops.
>
> More info about PMA (section 10.3):
> Link: http://www.andestech.com/wp-content/uploads/AX45MP-1C-Rev.-5.0.0-Datasheet.pdf

But yeah, this is basically the sort of stuff that'd be nice to have in
the previous patch!

> Signed-off-by: Lad Prabhakar <[email protected]>
> ---
> RFC v3 -> v4
> * Made use of runtime patching instead of compile time
> * Now just exposing single function ax45mp_no_iocp_cmo() for CMO handling
> * Added a check to make sure cache line size is always 64 bytes
> * Renamed folder rzf -> rzfive
> * Improved Kconfig description
> * Dropped L2 cache configuration
> * Dropped unnecessary casts
> * Fixed comments pointed by Geert, apart from use of PTR_ALIGN_XYZ() macros.
> ---
> arch/riscv/include/asm/cacheflush.h | 8 +
> arch/riscv/include/asm/errata_list.h | 32 +-
> drivers/soc/renesas/Kconfig | 7 +
> drivers/soc/renesas/Makefile | 2 +
> drivers/soc/renesas/rzfive/Kconfig | 6 +
> drivers/soc/renesas/rzfive/Makefile | 3 +
> drivers/soc/renesas/rzfive/ax45mp_cache.c | 415 ++++++++++++++++++++++
> drivers/soc/renesas/rzfive/ax45mp_sbi.h | 29 ++
> 8 files changed, 496 insertions(+), 6 deletions(-)
> create mode 100644 drivers/soc/renesas/rzfive/Kconfig
> create mode 100644 drivers/soc/renesas/rzfive/Makefile
> create mode 100644 drivers/soc/renesas/rzfive/ax45mp_cache.c
> create mode 100644 drivers/soc/renesas/rzfive/ax45mp_sbi.h

> diff --git a/drivers/soc/renesas/Kconfig b/drivers/soc/renesas/Kconfig
> index 660498252ec5..e7810256c60d 100644
> --- a/drivers/soc/renesas/Kconfig
> +++ b/drivers/soc/renesas/Kconfig
> @@ -340,9 +340,16 @@ if RISCV
> config ARCH_R9A07G043
> bool "RISC-V Platform support for RZ/Five"
> select ARCH_RZG2L
> + select AX45MP_L2_CACHE
> + select DMA_GLOBAL_POOL
> + select ERRATA_ANDES
> + select ERRATA_ANDES_CMO
> + select RISCV_DMA_NONCOHERENT

Is this not redundant due to the select by ERRATA_ANDES_CMO?

> help
> This enables support for the Renesas RZ/Five SoC.
>
> +source "drivers/soc/renesas/rzfive/Kconfig"
> +
> endif # RISCV
>
> config RST_RCAR

> diff --git a/drivers/soc/renesas/rzfive/Makefile b/drivers/soc/renesas/rzfive/Makefile
> new file mode 100644
> index 000000000000..2012e7fb978d
> --- /dev/null
> +++ b/drivers/soc/renesas/rzfive/Makefile
> @@ -0,0 +1,3 @@
> +# SPDX-License-Identifier: GPL-2.0
> +
> +obj-$(CONFIG_AX45MP_L2_CACHE) += ax45mp_cache.o
> diff --git a/drivers/soc/renesas/rzfive/ax45mp_cache.c b/drivers/soc/renesas/rzfive/ax45mp_cache.c
> new file mode 100644
> index 000000000000..4e0d0545d3af

Mainly just whizzing through the driver itself..

> +static int ax45mp_configure_pma_regions(struct device_node *np)
> +{
> + const char *propname = "andestech,pma-regions";
> + u32 start, size, flags;
> + unsigned int entry_id;
> + unsigned int i;
> + int count;
> + int ret;
> +
> + count = of_property_count_elems_of_size(np, propname, sizeof(u32) * 3);
> + if (count < 0)
> + return count;
> +
> + if (count > AX45MP_MAX_PMA_REGIONS)
> + return -EINVAL;
> +
> + for (i = 0, entry_id = 0 ; entry_id < count ; i += 3, entry_id++) {
> + of_property_read_u32_index(np, propname, i, &start);
> + of_property_read_u32_index(np, propname, i + 1, &size);
> + of_property_read_u32_index(np, propname, i + 2, &flags);
> + ret = ax45mp_sbi_set_pma(start, size, flags, entry_id);
> + if (!ret)
> + pr_err("Failed to setup PMA region 0x%x - 0x%x flags: 0x%x",
> + start, start + size, flags);

I have to ask - is it okay to just continue here if a PMA region setup
fails?

> + }
> +
> + return 0;
> +}
> +
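
One way to answer the question above is to stop at the first failing entry and hand the error back to the caller. The following is a minimal standalone sketch of that idea, not kernel code: mock_set_pma() and its NAPOT checks are assumptions standing in for ax45mp_sbi_set_pma(), whose return convention is not shown in this thread, and MAX_PMA_REGIONS is a hypothetical limit.

```c
#include <errno.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_PMA_REGIONS 16 /* hypothetical limit, mirrors maxItems in the binding */

/*
 * Hypothetical stand-in for the SBI call: returns 0 on success and a
 * negative errno on failure. The checks model a NAPOT region only.
 */
static int mock_set_pma(uint32_t start, uint32_t size, uint32_t flags,
			unsigned int entry)
{
	(void)flags;
	(void)entry;

	/* A NAPOT region needs a non-zero power-of-two size... */
	if (size == 0 || (size & (size - 1)))
		return -EINVAL;
	/* ...and a start address aligned to that size. */
	if (start & (size - 1))
		return -EINVAL;

	return 0;
}

/* Walk (start, size, flags) triples and stop at the first failure. */
static int configure_pma_regions(const uint32_t *cells, unsigned int count)
{
	unsigned int entry;
	int ret;

	if (count > MAX_PMA_REGIONS)
		return -EINVAL;

	for (entry = 0; entry < count; entry++) {
		const uint32_t *c = &cells[entry * 3];

		ret = mock_set_pma(c[0], c[1], c[2], entry);
		if (ret) {
			fprintf(stderr,
				"PMA entry %u failed: start 0x%x size 0x%x flags 0x%x\n",
				entry, (unsigned int)c[0], (unsigned int)c[1],
				(unsigned int)c[2]);
			return ret; /* propagate instead of continuing */
		}
	}

	return 0;
}
```

Whether a partial setup should be rolled back before returning is a separate design choice that this sketch does not attempt.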

> +static bool ax45mp_cpu_cache_controlable(void)
> +{
> + return (((ax45mp_cpu_get_micm_cfg_status() & AX45MP_MICM_CFG_ISZ_MASK) ||
> + (ax45mp_cpu_get_mdcm_cfg_status() & AX45MP_MDCM_CFG_DSZ_MASK)) &&
> + (ax45mp_cpu_get_misa_cfg_status() & AX45MP_MISA_20_MASK) &&
> + (ax45mp_cpu_get_mmsc_cfg_status() & AX45MP_MMSC_CFG_CCTLCSR_MASK) &&
> + (ax45mp_cpu_get_mcache_ctl_status() & AX45MP_MCACHE_CTL_CCTL_SUEN_MASK));

That's a bit of a mouthful lol!


> +static void ax45mp_cpu_dma_inval_range(void *vaddr, size_t size)

Not mine to look after so /shrug but this looks like the sort of thing
that could do with a comment or two explaining the invalidation process.

> +{
> + char cache_buf[2][AX45MP_MAX_CACHE_LINE_SIZE];
> + unsigned long start = (unsigned long)vaddr;
> + unsigned long end = start + size;
> + unsigned long old_start = start;
> + unsigned long old_end = end;
> + unsigned long line_size;
> + unsigned long flags;
> +
> + if (static_branch_unlikely(&ax45mp_l2c_configured) && !ax45mp_priv)
> + return;
> +
> + if (unlikely(start == end))
> + return;
> +
> + line_size = ax45mp_priv->ax45mp_cache_line_size;
> +
> + memset(&cache_buf, 0x0, sizeof(cache_buf));
> + start = start & (~(line_size - 1));
> + end = ((end + line_size - 1) & (~(line_size - 1)));
> +
> + local_irq_save(flags);
> + if (unlikely(start != old_start))
> + memcpy(&cache_buf[0][0], (void *)start, line_size);
> +
> + if (unlikely(end != old_end))
> + memcpy(&cache_buf[1][0], (void *)(old_end & (~(line_size - 1))), line_size);
> +
> + ax45mp_cpu_dcache_inval_range(vaddr, (void *)end, line_size);
> +
> + if (unlikely(start != old_start))
> + memcpy((void *)start, &cache_buf[0][0], (old_start & (line_size - 1)));
> +
> + if (unlikely(end != old_end))
> + memcpy((void *)(old_end + 1),
> + &cache_buf[1][(old_end & (line_size - 1)) + 1],
> + end - old_end - 1);
> +
> + local_irq_restore(flags);
> +}
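
As a starting point for such a comment, the address arithmetic can be isolated from the rest. The sketch below is standalone C and only illustrates the rounding: the start is rounded down and the end rounded up to a cache line boundary, and the bytes that share the first and last line with the buffer are the ones the quoted function saves and restores around the invalidate. The struct and function names are hypothetical, and the 64-byte line size is taken from the check in this patch.

```c
#include <stddef.h>
#include <stdint.h>

#define CACHE_LINE_SIZE 64UL /* the driver insists on 64-byte lines */

struct line_span {
	uintptr_t start;   /* buffer start rounded down to a line boundary */
	uintptr_t end;     /* buffer end rounded up to a line boundary */
	size_t head_bytes; /* bytes in the first line that precede the buffer */
	size_t tail_bytes; /* bytes in the last line that follow the buffer */
};

/* Compute the cache-line-aligned span covering [addr, addr + size). */
static struct line_span span_for(uintptr_t addr, size_t size)
{
	const uintptr_t mask = CACHE_LINE_SIZE - 1;
	struct line_span s;

	s.start = addr & ~mask;
	s.end = (addr + size + mask) & ~mask;
	s.head_bytes = addr - s.start;
	s.tail_bytes = s.end - (addr + size);

	return s;
}
```

A non-zero head_bytes or tail_bytes means the invalidate touches data outside the DMA buffer, which is the case the two memcpy() pairs in the quoted function are there to handle.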

> +static int ax45mp_configure_l2_cache(struct device_node *np)
> +{
> + int ret;
> +
> + ret = of_property_read_u32(np, "cache-line-size", &ax45mp_priv->ax45mp_cache_line_size);
> + if (ret) {
> + pr_err("Failed to get cache-line-size defaulting to 64 bytes\n");
^
Looks like you need a comma here...

> + ax45mp_priv->ax45mp_cache_line_size = SZ_64;
> + }
> +
> + if (ax45mp_priv->ax45mp_cache_line_size != SZ_64) {
> + pr_err("Expected cache-line-size to 64 bytes (found:%u). Defaulting to 64 bytes\n",
^
...and a "be" here.

Would you also benefit from a pr_fmt here since you have no device? Or
else you could save the dev to your ax45mp_priv and avail of dev_err
here?

> + ax45mp_priv->ax45mp_cache_line_size);
> + ax45mp_priv->ax45mp_cache_line_size = SZ_64;
> + }
> +
> + ax45mp_priv->ucctl_ok = ax45mp_cpu_cache_controlable();
^
That function name is a typo, should be called ax45mp_cpu_cache_cache_controllable().

> + ax45mp_priv->l2cache_enabled = ax45mp_cpu_l2c_ctl_status() & AX45MP_L2_CACHE_CTL_CEN_MASK;
> +
> + return 0;
> +}
> +
> +static int ax45mp_l2c_probe(struct platform_device *pdev)
> +{
> + struct device_node *np = pdev->dev.of_node;
> + int ret;
> +
> + ax45mp_priv = devm_kzalloc(&pdev->dev, sizeof(*ax45mp_priv), GFP_KERNEL);
> + if (!ax45mp_priv)
> + return -ENOMEM;
> +
> + ax45mp_priv->l2c_base = devm_of_iomap(&pdev->dev, pdev->dev.of_node, 0, NULL);
> + if (!ax45mp_priv->l2c_base) {
> + ret = -ENOMEM;
> + goto l2c_err;
> + }
> +
> + ret = ax45mp_configure_l2_cache(np);
> + if (ret)
> + goto l2c_err;
> +
> + ret = ax45mp_configure_pma_regions(np);
> + if (ret)
> + goto l2c_err;
> +
> + static_branch_disable(&ax45mp_l2c_configured);
> +
> + return 0;
> +
> +l2c_err:
> + devm_kfree(&pdev->dev, ax45mp_priv);
> + ax45mp_priv = NULL;
> + return ret;
> +}

> diff --git a/drivers/soc/renesas/rzfive/ax45mp_sbi.h b/drivers/soc/renesas/rzfive/ax45mp_sbi.h
> new file mode 100644
> index 000000000000..1604874954d0
> --- /dev/null
> +++ b/drivers/soc/renesas/rzfive/ax45mp_sbi.h
> @@ -0,0 +1,29 @@
> +/* SPDX-License-Identifier: GPL-2.0+ */
> +
> +#ifndef __AX45MP_SBI_H
> +#define __AX45MP_SBI_H
> +
> +#define SBI_EXT_ANDES 0x0900031E
> +
> +enum ax45mp_sbi_ext_fid {
> + AX45MP_SBI_EXT_GET_MCACHE_CTL_STATUS = 0,

Is that zero not implied?

> + AX45MP_SBI_EXT_GET_MMISC_CTL_STATUS,
> + AX45MP_SBI_EXT_SET_MCACHE_CTL,
> + AX45MP_SBI_EXT_SET_MMISC_CTL,
> + AX45MP_SBI_EXT_ICACHE_OP,
> + AX45MP_SBI_EXT_DCACHE_OP,
> + AX45MP_SBI_EXT_L1CACHE_I_PREFETCH,
> + AX45MP_SBI_EXT_L1CACHE_D_PREFETCH,
> + AX45MP_SBI_EXT_NON_BLOCKING_LOAD_STORE,
> + AX45MP_SBI_EXT_WRITE_AROUND,
> + AX45MP_SBI_EXT_SET_PMA,
> + AX45MP_SBI_EXT_FREE_PMA,
> + AX45MP_SBI_EXT_PROBE_PMA,
> + AX45MP_SBI_EXT_DCACHE_WBINVAL_ALL,
> + AX45MP_SBI_EXT_GET_MICM_CTL_STATUS,
> + AX45MP_SBI_EXT_GET_MDCM_CTL_STATUS,
> + AX45MP_SBI_EXT_GET_MMSC_CTL_STATUS,
> + AX45MP_SBI_EXT_GET_MISA_CTL_STATUS,
> +};
> +
> +#endif
> --
> 2.25.1
>

2022-11-25 08:39:45

by Krzysztof Kozlowski

Subject: Re: [PATCH v4 6/7] dt-bindings: cache: r9a07g043f-l2-cache: Add DT binding documentation for L2 cache controller

On 24/11/2022 18:22, Prabhakar wrote:
> From: Lad Prabhakar <[email protected]>
>
> Add DT binding documentation for L2 cache controller found on RZ/Five SoC.
>
> The Renesas RZ/Five microprocessor includes a RISC-V CPU Core (AX45MP
> Single) from Andes. The AX45MP core has an L2 cache controller, this patch
> describes the L2 cache block.
>
> Signed-off-by: Lad Prabhakar <[email protected]>
> ---
> RFC v3 -> v4
> * Dropped l2 cache configuration parameters
> * s/larger/large
> * Added minItems/maxItems for andestech,pma-regions
> ---
> .../cache/andestech,ax45mp-cache.yaml | 93 +++++++++++++++++++
> .../cache/andestech,ax45mp-cache.h | 38 ++++++++
> 2 files changed, 131 insertions(+)
> create mode 100644 Documentation/devicetree/bindings/cache/andestech,ax45mp-cache.yaml
> create mode 100644 include/dt-bindings/cache/andestech,ax45mp-cache.h
>
> diff --git a/Documentation/devicetree/bindings/cache/andestech,ax45mp-cache.yaml b/Documentation/devicetree/bindings/cache/andestech,ax45mp-cache.yaml
> new file mode 100644
> index 000000000000..bf255b177d0a
> --- /dev/null
> +++ b/Documentation/devicetree/bindings/cache/andestech,ax45mp-cache.yaml
> @@ -0,0 +1,93 @@
> +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
> +# Copyright (C) 2022 Renesas Electronics Corp.
> +%YAML 1.2
> +---
> +$id: http://devicetree.org/schemas/cache/andestech,ax45mp-cache.yaml#
> +$schema: http://devicetree.org/meta-schemas/core.yaml#
> +
> +title: Andestech AX45MP L2 Cache Controller
> +
> +maintainers:
> + - Lad Prabhakar <[email protected]>
> +
> +description:
> + A level-2 cache (L2C) is used to improve the system performance by providing
> + a large amount of cache line entries and reasonable access delays. The L2C
> + is shared between cores, and a non-inclusive non-exclusive policy is used.
> +
> +select:
> + properties:
> + compatible:
> + contains:
> + enum:
> + - andestech,ax45mp-cache
> +
> + required:
> + - compatible
> +
> +properties:
> + compatible:
> + items:
> + - const: andestech,ax45mp-cache
> + - const: cache
> +
> + reg:
> + maxItems: 1
> +
> + interrupts:
> + maxItems: 1
> +
> + cache-line-size:
> + const: 64
> +
> + cache-level:
> + const: 2
> +
> + cache-sets:
> + const: 1024
> +
> + cache-size:
> + enum: [131072, 262144, 524288, 1048576, 2097152]
> +
> + cache-unified: true
> +
> + next-level-cache: true
> +
> + andestech,pma-regions:
> + $ref: /schemas/types.yaml#/definitions/uint32-matrix
> + minItems: 1
> + maxItems: 16
> + items:
> + minItems: 3
> + maxItems: 3

Instead:
items:
items:
- description: Explain
- description: what is
- description: here

> + description: Optional array of memory regions to be set in the PMA.
> +
> +additionalProperties: false
> +
> +required:
> + - compatible
> + - cache-line-size
> + - cache-level
> + - cache-sets
> + - cache-size
> + - cache-unified
> + - interrupts
> + - reg

Keep the same order as properties appear in the "properties:"

> +
> +examples:
> + - |
> + #include <dt-bindings/interrupt-controller/irq.h>
> + #include <dt-bindings/cache/andestech,ax45mp-cache.h>
> +
> + [email protected] {
> + reg = <0x13400000 0x100000>;
> + compatible = "andestech,ax45mp-cache", "cache";
> + interrupts = <508 IRQ_TYPE_LEVEL_HIGH>;
> + cache-line-size = <64>;
> + cache-level = <2>;
> + cache-sets = <1024>;
> + cache-size = <262144>;
> + cache-unified;
> + andestech,pma-regions = <0x58000000 0x08000000
> + (AX45MP_PMACFG_ETYP_NAPOT | AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF)>;
> + };
> diff --git a/include/dt-bindings/cache/andestech,ax45mp-cache.h b/include/dt-bindings/cache/andestech,ax45mp-cache.h
> new file mode 100644
> index 000000000000..aa1cad24075d
> --- /dev/null
> +++ b/include/dt-bindings/cache/andestech,ax45mp-cache.h
> @@ -0,0 +1,38 @@
> +/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) */
> +/*
> + * This header provides constants for Andes AX45MP PMA configuration
> + *
> + * Copyright (C) 2022 Renesas Electronics Corp.
> + */
> +
> +#ifndef __DT_BINDINGS_ANDESTECH_AX45MP_CACHE_H
> +#define __DT_BINDINGS_ANDESTECH_AX45MP_CACHE_H
> +
> +/* OFF: PMA entry is disabled */
> +#define AX45MP_PMACFG_ETYP_DISABLED 0
> +/* Naturally aligned power of 2 region */
> +#define AX45MP_PMACFG_ETYP_NAPOT 3
> +
> +/* Device, Non-bufferable */
> +#define AX45MP_PMACFG_MTYP_DEV_NON_BUF (0 << 2)
> +/* Device, bufferable */
> +#define AX45MP_PMACFG_MTYP_DEV_BUF (1 << 2)
> +/* Memory, Non-cacheable, Non-bufferable */
> +#define AX45MP_PMACFG_MTYP_MEM_NON_CACHE_NON_BUF (2 << 2)
> +/* Memory, Non-cacheable, Bufferable */
> +#define AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF (3 << 2)

What are all these? They don't look like flags, because 3 = 1 | 2...
they don't look like constants, because we do not use shifts in
constants. Are these some register values? I also do not see the header
being used in the code, so why having a bindings header if it is not
used (DTS is not usage...)?


> +/* Memory, Write-back, No-allocate */
> +#define AX45MP_PMACFG_MTYP_MEM_WB_NA (8 << 2)
> +/* Memory, Write-back, Read-allocate */
> +#define AX45MP_PMACFG_MTYP_MEM_WB_RA (9 << 2)
> +/* Memory, Write-back, Write-allocate */
> +#define AX45MP_PMACFG_MTYP_MEM_WB_WA (10 << 2)
> +/* Memory, Write-back, Read and Write-allocate */
> +#define AX45MP_PMACFG_MTYP_MEM_WB_R_WA (11 << 2)
> +
> +/* AMO instructions are supported */
> +#define AX45MP_PMACFG_NAMO_AMO_SUPPORT (0 << 6)
> +/* AMO instructions are not supported */
> +#define AX45MP_PMACFG_NAMO_AMO_NO_SUPPORT (1 << 6)
> +
> +#endif /* __DT_BINDINGS_ANDESTECH_AX45MP_CACHE_H */
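
Reading only the shifts in the quoted header, the macros look like pre-shifted field values of a single configuration word rather than independent flag bits. The sketch below decodes such a word under that assumption: ETYP in bits [1:0], MTYP in bits [5:2] and NAMO in bit 6. The mask names and helper functions are hypothetical, and the layout has not been checked against the datasheet.

```c
#include <stdint.h>

/* Values copied from the quoted header. */
#define AX45MP_PMACFG_ETYP_NAPOT		3
#define AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF	(3 << 2)
#define AX45MP_PMACFG_NAMO_AMO_NO_SUPPORT	(1 << 6)

/* Hypothetical field masks, inferred from the shifts above. */
#define PMACFG_ETYP_MASK	0x03u /* bits [1:0] */
#define PMACFG_MTYP_MASK	0x3cu /* bits [5:2] */
#define PMACFG_NAMO_MASK	0x40u /* bit 6 */

/* Entry address matching type. */
static unsigned int pmacfg_etyp(uint32_t cfg)
{
	return cfg & PMACFG_ETYP_MASK;
}

/* Memory type, as the unshifted field value. */
static unsigned int pmacfg_mtyp(uint32_t cfg)
{
	return (cfg & PMACFG_MTYP_MASK) >> 2;
}

/* 1 if AMO instructions are marked as not supported. */
static unsigned int pmacfg_namo(uint32_t cfg)
{
	return (cfg & PMACFG_NAMO_MASK) >> 6;
}
```

Under this reading, the value used in the binding example, AX45MP_PMACFG_ETYP_NAPOT | AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF, is 0xf and decodes to ETYP 3, MTYP 3, NAMO 0.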

Best regards,
Krzysztof

2022-11-25 09:29:03

by Geert Uytterhoeven

Subject: Re: [PATCH v4 0/7] AX45MP: Add support to non-coherent DMA

Hi Prabhakar,

On Thu, Nov 24, 2022 at 6:24 PM Prabhakar <[email protected]> wrote:
> Due to the above approach custom SBI calls have been implemented. The
> above implementation is in preparation for adding support for Renesas
> RZ/Five SoC which uses the AX45MP core. As with the above approach the
> kernel image might not be generic so that it can be used on other

might be generic?

> platforms.

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2022-11-25 10:52:11

by Lad, Prabhakar

Subject: Re: [PATCH v4 1/7] riscv: asm: alternative-macros: Introduce ALTERNATIVE_3() macro

Hi Heiko,

On Fri, Nov 25, 2022 at 10:20 AM Heiko Stübner <[email protected]> wrote:
>
> Am Freitag, 25. November 2022, 11:02:21 CET schrieb Lad, Prabhakar:
> > Hi Heiko,
> >
> > On Thu, Nov 24, 2022 at 7:58 PM Heiko Stübner <[email protected]> wrote:
> > >
> > > Am Donnerstag, 24. November 2022, 20:52:33 CET schrieb Conor Dooley:
> > > > On Thu, Nov 24, 2022 at 05:22:01PM +0000, Prabhakar wrote:
> > > > > From: Lad Prabhakar <[email protected]>
> > > > >
> > > > > Introduce ALTERNATIVE_3() macro.
> > > >
> > > > Bit perfunctory I think! There's a lovely comment down below that would
> > > > make for a better commit message if you were to yoink it.
> > > > Content looks about what I'd expect to see though.
> > >
> > > Also both the comment on the original ALTERNATIVE_2 and the new ALTERNATIVE_3
> > > should probably be merged into a single comment explaining this once for all
> > > ALTERNATIVE_x variants.
> > >
> > > Especially with the dma stuff, I'm pretty sure we'll get at least an ALTERNATIVE_4
> > > if not even more ;-) . So we definitely don't want to repeat this multiple times.
> > >
> > Do agree. How about the below?
> >
> > /*
> > * Similar to what ALTERNATIVE_2() macro does but with an additional
> > * vendor content.
> > */
> >
> > So the other ALTERNATIVE_2+() macros will keep on building on it.
>
> My idea was more like having _one_ comment block of something like
>
> -----
> /*
> * ALTERNATIVE_x macros allow providing multiple replacement options
> * for an ALTERNATIVE code section. This is helpful if multiple
> * implementation variants for the same functionality exist for
> * different cpu cores.
> *
> * Usage:
> * ALTERNATIVE_x(old_content,
> * new_content1, vendor_id1, errata_id1, CONFIG_k1,
> * new_content2, vendor_id2, errata_id2, CONFIG_k2,
> * ...
> * new_contentx, vendor_idx, errata_idx, CONFIG_kx)
> */
>
> #define ALTERNATIVE_2(...)
> #define ALTERNATIVE_3(...)
> etc
> -----
>
LGTM, I'll include the above in the next version.

> So this would include dropping the old comment over ALTERNATIVE_2
>
Agreed, I'll drop it in the same patch.

Cheers,
Prabhakar

2022-11-25 11:20:23

by Lad, Prabhakar

Subject: Re: [PATCH v4 7/7] soc: renesas: Add L2 cache management for RZ/Five SoC

Hi Conor,

Thank you for the review.

On Thu, Nov 24, 2022 at 9:31 PM Conor Dooley <[email protected]> wrote:
>
> On Thu, Nov 24, 2022 at 05:22:07PM +0000, Prabhakar wrote:
> > From: Lad Prabhakar <[email protected]>
> >
> > On the AX45MP core, cache coherency is a specification option so it may
>
> How about:
> "Cache coherency is an optional feature of the AX45MP core, so it may not
> be supported."
>
Sure, I'll update that.

> I keep finding that sentence kinda hard..
>
> > In this case DMA will fail.
> "The AX45MP predates the standard extensions for cache management, so an
> alternate approach is required to support non-coherent DMA for SoCs
> where this feature is not available, such as the Renesas RZ/Five."
>
> (you've gotta explain somewhere why this is in drivers/soc/renesas lol)
>
I don't want to touch that fiery topic ;)

> > As a workaround, firstly we
>
> How about:
> " Since the cache management instructions cannot be used, we instead
> allocate..."
>
Sure, I'll update that.

> > allocate a global dma coherent pool from which DMA allocations are taken
> > and marked as non-cacheable + bufferable using the PMA region as specified
> > in the device tree. Synchronization callbacks are implemented to
> > synchronize when doing DMA transactions.
> >
> > The Andes AX45MP core has a Programmable Physical Memory Attributes (PMA)
> > block that allows dynamic adjustment of memory attributes in the runtime.
> > It contains a configurable amount of PMA entries implemented as CSR
> > registers to control the attributes of memory locations in interest.
> >
> > Below are the memory attributes supported:
> > * Device, Non-bufferable
> > * Device, bufferable
> > * Memory, Non-cacheable, Non-bufferable
> > * Memory, Non-cacheable, Bufferable
> > * Memory, Write-back, No-allocate
> > * Memory, Write-back, Read-allocate
> > * Memory, Write-back, Write-allocate
> > * Memory, Write-back, Read and Write-allocate
> >
> > This patch adds support to configure the memory attributes of the memory
> > regions as passed from the l2 cache node and exposes the cache management
> > ops.
> >
> > More info about PMA (section 10.3):
> > Link: http://www.andestech.com/wp-content/uploads/AX45MP-1C-Rev.-5.0.0-Datasheet.pdf
>
> But yeah, this is basically the sort of stuff that'd be nice to have in
> the previous patch!
>
Agreed, I'll include this in the binding patch too.

> > Signed-off-by: Lad Prabhakar <[email protected]>
> > ---
> > RFC v3 -> v4
> > * Made use of runtime patching instead of compile time
> > * Now just exposing single function ax45mp_no_iocp_cmo() for CMO handling
> > * Added a check to make sure cache line size is always 64 bytes
> > * Renamed folder rzf -> rzfive
> > * Improved Kconfig description
> > * Dropped L2 cache configuration
> > * Dropped unnecessary casts
> > * Fixed comments pointed by Geert, apart from use of PTR_ALIGN_XYZ() macros.
> > ---
> > arch/riscv/include/asm/cacheflush.h | 8 +
> > arch/riscv/include/asm/errata_list.h | 32 +-
> > drivers/soc/renesas/Kconfig | 7 +
> > drivers/soc/renesas/Makefile | 2 +
> > drivers/soc/renesas/rzfive/Kconfig | 6 +
> > drivers/soc/renesas/rzfive/Makefile | 3 +
> > drivers/soc/renesas/rzfive/ax45mp_cache.c | 415 ++++++++++++++++++++++
> > drivers/soc/renesas/rzfive/ax45mp_sbi.h | 29 ++
> > 8 files changed, 496 insertions(+), 6 deletions(-)
> > create mode 100644 drivers/soc/renesas/rzfive/Kconfig
> > create mode 100644 drivers/soc/renesas/rzfive/Makefile
> > create mode 100644 drivers/soc/renesas/rzfive/ax45mp_cache.c
> > create mode 100644 drivers/soc/renesas/rzfive/ax45mp_sbi.h
>
> > diff --git a/drivers/soc/renesas/Kconfig b/drivers/soc/renesas/Kconfig
> > index 660498252ec5..e7810256c60d 100644
> > --- a/drivers/soc/renesas/Kconfig
> > +++ b/drivers/soc/renesas/Kconfig
> > @@ -340,9 +340,16 @@ if RISCV
> > config ARCH_R9A07G043
> > bool "RISC-V Platform support for RZ/Five"
> > select ARCH_RZG2L
> > + select AX45MP_L2_CACHE
> > + select DMA_GLOBAL_POOL
> > + select ERRATA_ANDES
> > + select ERRATA_ANDES_CMO
> > + select RISCV_DMA_NONCOHERENT
>
> Is this not redundant due to the select by ERRATA_ANDES_CMO?
>
Indeed, I'll drop it.

> > help
> > This enables support for the Renesas RZ/Five SoC.
> >
> > +source "drivers/soc/renesas/rzfive/Kconfig"
> > +
> > endif # RISCV
> >
> > config RST_RCAR
>
> > diff --git a/drivers/soc/renesas/rzfive/Makefile b/drivers/soc/renesas/rzfive/Makefile
> > new file mode 100644
> > index 000000000000..2012e7fb978d
> > --- /dev/null
> > +++ b/drivers/soc/renesas/rzfive/Makefile
> > @@ -0,0 +1,3 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > +
> > +obj-$(CONFIG_AX45MP_L2_CACHE) += ax45mp_cache.o
> > diff --git a/drivers/soc/renesas/rzfive/ax45mp_cache.c b/drivers/soc/renesas/rzfive/ax45mp_cache.c
> > new file mode 100644
> > index 000000000000..4e0d0545d3af
>
> Mainly just whizzing through the driver itself..
>
> > +static int ax45mp_configure_pma_regions(struct device_node *np)
> > +{
> > + const char *propname = "andestech,pma-regions";
> > + u32 start, size, flags;
> > + unsigned int entry_id;
> > + unsigned int i;
> > + int count;
> > + int ret;
> > +
> > + count = of_property_count_elems_of_size(np, propname, sizeof(u32) * 3);
> > + if (count < 0)
> > + return count;
> > +
> > + if (count > AX45MP_MAX_PMA_REGIONS)
> > + return -EINVAL;
> > +
> > + for (i = 0, entry_id = 0 ; entry_id < count ; i += 3, entry_id++) {
> > + of_property_read_u32_index(np, propname, i, &start);
> > + of_property_read_u32_index(np, propname, i + 1, &size);
> > + of_property_read_u32_index(np, propname, i + 2, &flags);
> > + ret = ax45mp_sbi_set_pma(start, size, flags, entry_id);
> > + if (!ret)
> > + pr_err("Failed to setup PMA region 0x%x - 0x%x flags: 0x%x",
> > + start, start + size, flags);
>
> I have to ask - is it okay to just continue here if a PMA region setup
> fails?
>
OK, so in case of failure "ebreak" is called, so continuing doesn't
cause any harm, I guess.

> > + }
> > +
> > + return 0;
> > +}
> > +
>
> > +static bool ax45mp_cpu_cache_controlable(void)
> > +{
> > + return (((ax45mp_cpu_get_micm_cfg_status() & AX45MP_MICM_CFG_ISZ_MASK) ||
> > + (ax45mp_cpu_get_mdcm_cfg_status() & AX45MP_MDCM_CFG_DSZ_MASK)) &&
> > + (ax45mp_cpu_get_misa_cfg_status() & AX45MP_MISA_20_MASK) &&
> > + (ax45mp_cpu_get_mmsc_cfg_status() & AX45MP_MMSC_CFG_CCTLCSR_MASK) &&
> > + (ax45mp_cpu_get_mcache_ctl_status() & AX45MP_MCACHE_CTL_CCTL_SUEN_MASK));
>
> That's a bit of a mouthful lol!
>
>
> > +static void ax45mp_cpu_dma_inval_range(void *vaddr, size_t size)
>
> Not mine to look after so /shrug but this looks like the sort of thing
> that could do with a comment or two explaining the invalidation process.
>
Sure, I'll add some comments.

> > +{
> > + char cache_buf[2][AX45MP_MAX_CACHE_LINE_SIZE];
> > + unsigned long start = (unsigned long)vaddr;
> > + unsigned long end = start + size;
> > + unsigned long old_start = start;
> > + unsigned long old_end = end;
> > + unsigned long line_size;
> > + unsigned long flags;
> > +
> > + if (static_branch_unlikely(&ax45mp_l2c_configured) && !ax45mp_priv)
> > + return;
> > +
> > + if (unlikely(start == end))
> > + return;
> > +
> > + line_size = ax45mp_priv->ax45mp_cache_line_size;
> > +
> > + memset(&cache_buf, 0x0, sizeof(cache_buf));
> > + start = start & (~(line_size - 1));
> > + end = ((end + line_size - 1) & (~(line_size - 1)));
> > +
> > + local_irq_save(flags);
> > + if (unlikely(start != old_start))
> > + memcpy(&cache_buf[0][0], (void *)start, line_size);
> > +
> > + if (unlikely(end != old_end))
> > + memcpy(&cache_buf[1][0], (void *)(old_end & (~(line_size - 1))), line_size);
> > +
> > + ax45mp_cpu_dcache_inval_range(vaddr, (void *)end, line_size);
> > +
> > + if (unlikely(start != old_start))
> > + memcpy((void *)start, &cache_buf[0][0], (old_start & (line_size - 1)));
> > +
> > + if (unlikely(end != old_end))
> > + memcpy((void *)(old_end + 1),
> > + &cache_buf[1][(old_end & (line_size - 1)) + 1],
> > + end - old_end - 1);
> > +
> > + local_irq_restore(flags);
> > +}
>
> > +static int ax45mp_configure_l2_cache(struct device_node *np)
> > +{
> > + int ret;
> > +
> > + ret = of_property_read_u32(np, "cache-line-size", &ax45mp_priv->ax45mp_cache_line_size);
> > + if (ret) {
> > + pr_err("Failed to get cache-line-size defaulting to 64 bytes\n");
> ^
> Looks like you need a comma here...
>
OK.

> > + ax45mp_priv->ax45mp_cache_line_size = SZ_64;
> > + }
> > +
> > + if (ax45mp_priv->ax45mp_cache_line_size != SZ_64) {
> > + pr_err("Expected cache-line-size to 64 bytes (found:%u). Defaulting to 64 bytes\n",
> ^
> ...and a "be" here.
>
> Would you also benefit from a pr_fmt here since you have no device? Or
> else you could save the dev to your ax45mp_priv and avail of dev_err
> here?
>
Sure, I'll switch to dev_err()

> > + ax45mp_priv->ax45mp_cache_line_size);
> > + ax45mp_priv->ax45mp_cache_line_size = SZ_64;
> > + }
> > +
> > + ax45mp_priv->ucctl_ok = ax45mp_cpu_cache_controlable();
> ^
> That function name is a typo, should be called ax45mp_cpu_cache_cache_controllable().
>
Ok, I'll rename it to ax45mp_cpu_cache_controllable().

> > + ax45mp_priv->l2cache_enabled = ax45mp_cpu_l2c_ctl_status() & AX45MP_L2_CACHE_CTL_CEN_MASK;
> > +
> > + return 0;
> > +}
> > +
> > +static int ax45mp_l2c_probe(struct platform_device *pdev)
> > +{
> > + struct device_node *np = pdev->dev.of_node;
> > + int ret;
> > +
> > + ax45mp_priv = devm_kzalloc(&pdev->dev, sizeof(*ax45mp_priv), GFP_KERNEL);
> > + if (!ax45mp_priv)
> > + return -ENOMEM;
> > +
> > + ax45mp_priv->l2c_base = devm_of_iomap(&pdev->dev, pdev->dev.of_node, 0, NULL);
> > + if (!ax45mp_priv->l2c_base) {
> > + ret = -ENOMEM;
> > + goto l2c_err;
> > + }
> > +
> > + ret = ax45mp_configure_l2_cache(np);
> > + if (ret)
> > + goto l2c_err;
> > +
> > + ret = ax45mp_configure_pma_regions(np);
> > + if (ret)
> > + goto l2c_err;
> > +
> > + static_branch_disable(&ax45mp_l2c_configured);
> > +
> > + return 0;
> > +
> > +l2c_err:
> > + devm_kfree(&pdev->dev, ax45mp_priv);
> > + ax45mp_priv = NULL;
> > + return ret;
> > +}
>
> > diff --git a/drivers/soc/renesas/rzfive/ax45mp_sbi.h b/drivers/soc/renesas/rzfive/ax45mp_sbi.h
> > new file mode 100644
> > index 000000000000..1604874954d0
> > --- /dev/null
> > +++ b/drivers/soc/renesas/rzfive/ax45mp_sbi.h
> > @@ -0,0 +1,29 @@
> > +/* SPDX-License-Identifier: GPL-2.0+ */
> > +
> > +#ifndef __AX45MP_SBI_H
> > +#define __AX45MP_SBI_H
> > +
> > +#define SBI_EXT_ANDES 0x0900031E
> > +
> > +enum ax45mp_sbi_ext_fid {
> > + AX45MP_SBI_EXT_GET_MCACHE_CTL_STATUS = 0,
>
> Is that zero not implied?
>
I had a feeling the Linux coding style required us to specify it
explicitly.
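
As far as the C language itself goes, the first enumerator is 0 when no initializer is given, and each following one is the previous value plus one. A small standalone check, with hypothetical enum names and assuming a C11 compiler for _Static_assert:

```c
/* No initializer on the first enumerator: the language assigns 0. */
enum example_fid {
	EXAMPLE_FID_FIRST,
	EXAMPLE_FID_SECOND, /* previous + 1 */
	EXAMPLE_FID_THIRD,
};

/* Same values, with the leading zero spelled out. */
enum example_fid_explicit {
	EXAMPLE_FID_EXPLICIT_FIRST = 0,
	EXAMPLE_FID_EXPLICIT_SECOND,
	EXAMPLE_FID_EXPLICIT_THIRD,
};

/* Both spellings yield identical numbering. */
_Static_assert(EXAMPLE_FID_FIRST == 0, "first enumerator defaults to 0");
_Static_assert((int)EXAMPLE_FID_THIRD == (int)EXAMPLE_FID_EXPLICIT_THIRD,
	       "explicit zero does not change later values");
```

Keeping the explicit "= 0" is therefore a readability choice, which can still be worthwhile when the values are part of an ABI such as SBI function IDs.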

Cheers,
Prabhakar

2022-11-25 11:20:42

by Lad, Prabhakar

Subject: Re: [PATCH v4 1/7] riscv: asm: alternative-macros: Introduce ALTERNATIVE_3() macro

Hi Heiko,

On Thu, Nov 24, 2022 at 7:58 PM Heiko Stübner <[email protected]> wrote:
>
> Am Donnerstag, 24. November 2022, 20:52:33 CET schrieb Conor Dooley:
> > On Thu, Nov 24, 2022 at 05:22:01PM +0000, Prabhakar wrote:
> > > From: Lad Prabhakar <[email protected]>
> > >
> > > Introduce ALTERNATIVE_3() macro.
> >
> > Bit perfunctory I think! There's a lovely comment down below that would
> > make for a better commit message if you were to yoink it.
> > Content looks about what I'd expect to see though.
>
> Also both the comment on the original ALTERNATIVE_2 and the new ALTERNATIVE_3
> should probably be merged into a single comment explaining this once for all
> ALTERNATIVE_x variants.
>
> Especially with the dma stuff, I'm pretty sure we'll get at least an ALTERNATIVE_4
> if not even more ;-) . So we definitely don't want to repeat this multiple times.
>
Do agree. How about the below?

/*
* Similar to what ALTERNATIVE_2() macro does but with an additional
* vendor content.
*/

So the other ALTERNATIVE_2+() macros will keep on building on it.

Cheers,
Prabhakar

2022-11-25 11:22:34

by Lad, Prabhakar

Subject: Re: [PATCH v4 0/7] AX45MP: Add support to non-coherent DMA

Hi Geert,

On Fri, Nov 25, 2022 at 9:04 AM Geert Uytterhoeven <[email protected]> wrote:
>
> Hi Prabhakar,
>
> On Thu, Nov 24, 2022 at 6:24 PM Prabhakar <[email protected]> wrote:
> > Due to the above approach custom SBI calls have been implemented. The
> > above implementation is in preparation for adding support for Renesas
> > RZ/Five SoC which uses the AX45MP core. As with the above approach the
> > kernel image might not be generic so that it can be used on other
>
> might be generic?
>
Oops I missed updating this part of the cover letter. Indeed it should
be generic now.

Cheers,
Prabhakar

2022-11-25 11:24:23

by Lad, Prabhakar

Subject: Re: [PATCH v4 3/7] riscv: errata: Add Andes alternative ports

Hi Conor,

Thank you for the review.

On Thu, Nov 24, 2022 at 8:22 PM Conor Dooley <[email protected]> wrote:
>
> On Thu, Nov 24, 2022 at 05:22:03PM +0000, Prabhakar wrote:
> > From: Lad Prabhakar <[email protected]>
> >
> > Add required ports of the Alternative scheme for Andes CPU cores.
>
> You've got a lot of nice info in your cover letter that would be nice in
> the git history. Could you add some of the commentary about why the
> Andes cache needs special handling from there to this commit message
> please?
>
Sure, I'll update the commit message here.

> > Signed-off-by: Lad Prabhakar <[email protected]>
> > ---
> > RFC v3 -> v4
> > * New patch
> > ---
> > arch/riscv/Kconfig.erratas | 22 +++++++++
> > arch/riscv/errata/Makefile | 1 +
> > arch/riscv/errata/andes/Makefile | 1 +
> > arch/riscv/errata/andes/errata.c | 68 ++++++++++++++++++++++++++++
> > arch/riscv/include/asm/alternative.h | 3 ++
> > arch/riscv/include/asm/errata_list.h | 5 ++
> > arch/riscv/kernel/alternative.c | 5 ++
> > 7 files changed, 105 insertions(+)
> > create mode 100644 arch/riscv/errata/andes/Makefile
> > create mode 100644 arch/riscv/errata/andes/errata.c
>
> > diff --git a/arch/riscv/errata/andes/errata.c b/arch/riscv/errata/andes/errata.c
> > new file mode 100644
> > index 000000000000..ec3e052ca8c7
> > --- /dev/null
> > +++ b/arch/riscv/errata/andes/errata.c
> > @@ -0,0 +1,68 @@
> > +// SPDX-License-Identifier: GPL-2.0-only
> > +/*
> > + * Erratas to be applied for Andes CPU cores
> > + *
> > + * Copyright (C) 2022 Renesas Electronics Corporation.
> > + *
> > + * Author: Lad Prabhakar <[email protected]>
> > + */
> > +
> > +#include <linux/kernel.h>
> > +#include <linux/module.h>
> > +
> > +#include <asm/alternative.h>
> > +#include <asm/cacheflush.h>
> > +#include <asm/errata_list.h>
> > +#include <asm/patch.h>
> > +#include <asm/vendorid_list.h>
> > +
> > +static bool errata_probe_iocp(unsigned int stage, unsigned long arch_id, unsigned long impid)
>
> To the lay reader, what's an "iocp" when it's at home? "I/O coherency
> port"? Again, commit message would be a good place for the introduction
> of that term :)
>
Agree, I'll update that.

> > +{
> > + if (!IS_ENABLED(CONFIG_ERRATA_ANDES_CMO))
> > + return false;
> > +
> > + if (arch_id != 0x8000000000008a45 || impid != 0x500)
>
> Can you #define these?
>
> > + return false;
> > +
> > + riscv_cbom_block_size = 1;
> > + riscv_noncoherent_supported();
> > +
> > + return true;
> > +}
> > +
> > +static u32 andes_errata_probe(unsigned int stage, unsigned long archid, unsigned long impid)
> > +{
> > + u32 cpu_req_errata = 0;
> > +
>
> I read some code and when it does the opposite of what I'd expect, I
> feel inclined to add a comment. In this case, you're probing for the
> presence of the port `probe_iocp()`, but the interesting case is when
> you don't find it. You can leave it uncommented if you like, but even
> something like the below I think fits.
>
> /*
> * In the absence of the I/O Coherency Port, access to certain peripherals
> * requires vendor specific DMA handling.
> */
Makes sense, I'll include the above.

> > + if (errata_probe_iocp(stage, archid, impid))
> > + cpu_req_errata |= BIT(ERRATA_ANDESTECH_NO_IOCP);
> > +
> > + return cpu_req_errata;
> > +}
> > +
> > +void __init_or_module andes_errata_patch_func(struct alt_entry *begin, struct alt_entry *end,
> > + unsigned long archid, unsigned long impid,
> > + unsigned int stage)
> > +{
> > + u32 cpu_req_errata = andes_errata_probe(stage, archid, impid);
> > + struct alt_entry *alt;
> > + u32 tmp;
> > +
> > + if (stage == RISCV_ALTERNATIVES_EARLY_BOOT)
> > + return;
> > +
> > + for (alt = begin; alt < end; alt++) {
> > + if (alt->vendor_id != ANDESTECH_VENDOR_ID)
> > + continue;
> > + if (alt->errata_id >= ERRATA_ANDESTECH_NUMBER)
> > + continue;
> > +
> > + tmp = (1U << alt->errata_id);
>
> Is this not BIT(alt->errata_id)?
>
Yep, I will switch to BIT().
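
A rough sketch of how the probe could look with both review comments applied (named IDs and BIT()). The macro names ANDESTECH_AX45MP_MARCHID and ANDESTECH_AX45MP_MIMPID are placeholders, the value 0 for ERRATA_ANDESTECH_NO_IOCP is an assumption, and BIT() is redefined locally only so the snippet builds outside the kernel:

```c
#include <stdbool.h>
#include <stdint.h>

/* Stand-in for the kernel's BIT() so the sketch builds in userspace. */
#define BIT(nr) (1UL << (nr))

/* Placeholder names; the values are the ones used in the patch. */
#define ANDESTECH_AX45MP_MARCHID 0x8000000000008a45ULL
#define ANDESTECH_AX45MP_MIMPID  0x500ULL

/* Assumed errata index, for illustration only. */
#define ERRATA_ANDESTECH_NO_IOCP 0

/* True when the IDs match the AX45MP revision the errata targets. */
static bool andes_is_ax45mp(uint64_t arch_id, uint64_t impid)
{
	return arch_id == ANDESTECH_AX45MP_MARCHID &&
	       impid == ANDESTECH_AX45MP_MIMPID;
}

/* Builds the errata mask in the same shape as andes_errata_probe(). */
static uint32_t andes_errata_mask(uint64_t arch_id, uint64_t impid)
{
	uint32_t cpu_req_errata = 0;

	if (andes_is_ax45mp(arch_id, impid))
		cpu_req_errata |= BIT(ERRATA_ANDESTECH_NO_IOCP);

	return cpu_req_errata;
}
```

The CONFIG_ERRATA_ANDES_CMO check and the riscv_cbom_block_size setup from the patch are left out here to keep the sketch self-contained.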

Cheers,
Prabhakar

2022-11-25 11:59:07

by Lad, Prabhakar

Subject: Re: [PATCH v4 6/7] dt-bindings: cache: r9a07g043f-l2-cache: Add DT binding documentation for L2 cache controller

Hi Geert,

On Fri, Nov 25, 2022 at 11:18 AM Geert Uytterhoeven
<[email protected]> wrote:
>
> Hi Prabhakar,
>
> On Fri, Nov 25, 2022 at 11:34 AM Lad, Prabhakar
> <[email protected]> wrote:
> > On Fri, Nov 25, 2022 at 8:16 AM Krzysztof Kozlowski
> > <[email protected]> wrote:
> > > On 24/11/2022 18:22, Prabhakar wrote:
> > > > From: Lad Prabhakar <[email protected]>
> > > >
> > > > Add DT binding documentation for L2 cache controller found on RZ/Five SoC.
> > > >
> > > > The Renesas RZ/Five microprocessor includes a RISC-V CPU Core (AX45MP
> > > > Single) from Andes. The AX45MP core has an L2 cache controller, this patch
> > > > describes the L2 cache block.
> > > >
> > > > Signed-off-by: Lad Prabhakar <[email protected]>
> > > > ---
> > > > RFC v3 -> v4
> > > > * Dropped l2 cache configuration parameters
> > > > * s/larger/large
> > > > * Added minItems/maxItems for andestech,pma-regions
>
> > > > --- /dev/null
> > > > +++ b/Documentation/devicetree/bindings/cache/andestech,ax45mp-cache.yaml
>
> > > > +examples:
> > > > + - |
> > > > + #include <dt-bindings/interrupt-controller/irq.h>
> > > > + #include <dt-bindings/cache/andestech,ax45mp-cache.h>
> > > > +
> > > > + [email protected] {
> > > > + reg = <0x13400000 0x100000>;
> > > > + compatible = "andestech,ax45mp-cache", "cache";
> > > > + interrupts = <508 IRQ_TYPE_LEVEL_HIGH>;
> > > > + cache-line-size = <64>;
> > > > + cache-level = <2>;
> > > > + cache-sets = <1024>;
> > > > + cache-size = <262144>;
> > > > + cache-unified;
> > > > + andestech,pma-regions = <0x58000000 0x08000000
> > > > + (AX45MP_PMACFG_ETYP_NAPOT | AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF)>;
> > > > + };
> > > > diff --git a/include/dt-bindings/cache/andestech,ax45mp-cache.h b/include/dt-bindings/cache/andestech,ax45mp-cache.h
> > > > new file mode 100644
> > > > index 000000000000..aa1cad24075d
> > > > --- /dev/null
> > > > +++ b/include/dt-bindings/cache/andestech,ax45mp-cache.h
> > > > @@ -0,0 +1,38 @@
> > > > +/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) */
> > > > +/*
> > > > + * This header provides constants for Andes AX45MP PMA configuration
> > > > + *
> > > > + * Copyright (C) 2022 Renesas Electronics Corp.
> > > > + */
> > > > +
> > > > +#ifndef __DT_BINDINGS_ANDESTECH_AX45MP_CACHE_H
> > > > +#define __DT_BINDINGS_ANDESTECH_AX45MP_CACHE_H
> > > > +
> > > > +/* OFF: PMA entry is disabled */
> > > > +#define AX45MP_PMACFG_ETYP_DISABLED 0
> > > > +/* Naturally aligned power of 2 region */
> > > > +#define AX45MP_PMACFG_ETYP_NAPOT 3
> > > > +
> > > > +/* Device, Non-bufferable */
> > > > +#define AX45MP_PMACFG_MTYP_DEV_NON_BUF (0 << 2)
> > > > +/* Device, bufferable */
> > > > +#define AX45MP_PMACFG_MTYP_DEV_BUF (1 << 2)
> > > > +/* Memory, Non-cacheable, Non-bufferable */
> > > > +#define AX45MP_PMACFG_MTYP_MEM_NON_CACHE_NON_BUF (2 << 2)
> > > > +/* Memory, Non-cacheable, Bufferable */
> > > > +#define AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF (3 << 2)
> > >
> > > What are all these? They don't look like flags, because 3 = 1 | 2...
> > > they don't look like constants, because we do not use shifts in
> > > constants. Are these some register values? I also do not see the header
> > > being used in the code, so why having a bindings header if it is not
> > > used (DTS is not usage...)?
> > >
> > These are register bit values for the MTYP[5:2] field. The DTS example
> > in the binding doc (above) uses these macros. I haven't included the
> > DTS/I patches with this patchset yet do think I should?
>
> I think the main objection from Rob is that these look too much like
> raw register values to be written unchanged to registers, which is
> frowned upon in DT.
>
> Now, can we make this more generic?
>
> 1. Do you need AX45MP_PMACFG_ETYP_DISABLED, i.e. will it ever be
> specified in DTS, or is this a pure software thing?
> 2. Obviously you can let the driver decide if AX45MP_PMACFG_ETYP_NAPOT
> can be set, based on address/size?
> 3. If the two above are removed, the shifts can be handled in the
> driver instead,
Yes, we can get rid of AX45MP_PMACFG_ETYP_DISABLED and
AX45MP_PMACFG_ETYP_NAPOT. Whenever we set up a PMA region, its size
always has to be a power of 2, so AX45MP_PMACFG_ETYP_NAPOT does not
need to come from the DT; either the driver or OpenSBI can just OR it
in while setting up the PMA.
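
A minimal sketch of that idea, assuming the DT only supplies base, size and memory type. The helper names pma_region_is_napot() and pma_build_cfg() are hypothetical, and any minimum region size the hardware may impose is not modelled:

```c
#include <stdbool.h>
#include <stdint.h>

/* Values copied from the posted binding header. */
#define AX45MP_PMACFG_ETYP_NAPOT             3
#define AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF (3 << 2)

/* A NAPOT region needs a power-of-2 size and a base aligned to that size. */
static bool pma_region_is_napot(uint64_t base, uint64_t size)
{
	if (size == 0 || (size & (size - 1)))
		return false;

	return (base & (size - 1)) == 0;
}

/*
 * Hypothetical helper: takes only the memory type and adds the ETYP bits
 * itself. Returns the config value, or -1 if the region cannot be NAPOT.
 */
static int pma_build_cfg(uint64_t base, uint64_t size, unsigned int mtyp)
{
	if (!pma_region_is_napot(base, size))
		return -1;

	return (int)(mtyp | AX45MP_PMACFG_ETYP_NAPOT);
}
```

With the region from the binding example (0x58000000, size 0x08000000) the check passes; a misaligned base such as 0xdeadbeef is rejected.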


> 4. Are there existing (more generic) definitions that can be used
> instead?
>
You mean for the MTYP flags? I haven't come across any in the kernel.

> BTW, what's the difference between non-bufferable and non-cacheable?
>
non-cacheable, from the Andes manual: Accessing to non-cacheable
memory and device will bypass I-Cache, D-Cache and L2-Cache no matter
the data is cached or not. The cache states are not changed by these
accesses.

TBH I don't have a clear answer for non-bufferable, and the Andes HW
manual doesn't give one either. I'll get the details from Andes.

Cheers,
Prabhakar

2022-11-25 11:59:17

by Andrew Jones

Subject: Re: [PATCH v4 1/7] riscv: asm: alternative-macros: Introduce ALTERNATIVE_3() macro

On Thu, Nov 24, 2022 at 08:05:40PM +0000, Conor Dooley wrote:
> On Thu, Nov 24, 2022 at 08:58:41PM +0100, Heiko Stübner wrote:
> > Am Donnerstag, 24. November 2022, 20:52:33 CET schrieb Conor Dooley:
> > > On Thu, Nov 24, 2022 at 05:22:01PM +0000, Prabhakar wrote:
> > > > From: Lad Prabhakar <[email protected]>
> > > >
> > > > Introduce ALTERNATIVE_3() macro.
> > >
> > > Bit perfunctory I think! There's a lovely comment down below that would
> > > make for a better commit message if you were to yoink it.
> > > Content looks about what I'd expect to see though.
> >
> > Also both the comment on the original ALTERNATIVE_2 and the new ALTERNATIVE_3
> > should probably be merged into a single comment explaining this once for all
> > ALTERNATIVE_x variants.
> >
> > Especially with the dma stuff, I'm pretty sure we'll get at least an ALTERNATIVE_4
> > if not even more ;-) . So we definitely don't want to repeat this multiple times.
>
> Oh I can promise you that there'll be a #4 ;)

I took a stab[*] at cleaning up alternative-macros.h a bit in order to
prepare for _3, _4, ..., _42. The idea was to try and find a way to
reduce as much duplication as possible, both in the current code and
in the new macros.

[*] https://lore.kernel.org/all/[email protected]/

Thanks,
drew

2022-11-25 11:59:31

by Lad, Prabhakar

Subject: Re: [PATCH v4 6/7] dt-bindings: cache: r9a07g043f-l2-cache: Add DT binding documentation for L2 cache controller

Hi Krzysztof,

Thank you for the review.

On Fri, Nov 25, 2022 at 8:16 AM Krzysztof Kozlowski
<[email protected]> wrote:
>
> On 24/11/2022 18:22, Prabhakar wrote:
> > From: Lad Prabhakar <[email protected]>
> >
> > Add DT binding documentation for L2 cache controller found on RZ/Five SoC.
> >
> > The Renesas RZ/Five microprocessor includes a RISC-V CPU Core (AX45MP
> > Single) from Andes. The AX45MP core has an L2 cache controller, this patch
> > describes the L2 cache block.
> >
> > Signed-off-by: Lad Prabhakar <[email protected]>
> > ---
> > RFC v3 -> v4
> > * Dropped l2 cache configuration parameters
> > * s/larger/large
> > * Added minItems/maxItems for andestech,pma-regions
> > ---
> > .../cache/andestech,ax45mp-cache.yaml | 93 +++++++++++++++++++
> > .../cache/andestech,ax45mp-cache.h | 38 ++++++++
> > 2 files changed, 131 insertions(+)
> > create mode 100644 Documentation/devicetree/bindings/cache/andestech,ax45mp-cache.yaml
> > create mode 100644 include/dt-bindings/cache/andestech,ax45mp-cache.h
> >
> > diff --git a/Documentation/devicetree/bindings/cache/andestech,ax45mp-cache.yaml b/Documentation/devicetree/bindings/cache/andestech,ax45mp-cache.yaml
> > new file mode 100644
> > index 000000000000..bf255b177d0a
> > --- /dev/null
> > +++ b/Documentation/devicetree/bindings/cache/andestech,ax45mp-cache.yaml
> > @@ -0,0 +1,93 @@
> > +# SPDX-License-Identifier: (GPL-2.0 OR BSD-2-Clause)
> > +# Copyright (C) 2022 Renesas Electronics Corp.
> > +%YAML 1.2
> > +---
> > +$id: http://devicetree.org/schemas/cache/andestech,ax45mp-cache.yaml#
> > +$schema: http://devicetree.org/meta-schemas/core.yaml#
> > +
> > +title: Andestech AX45MP L2 Cache Controller
> > +
> > +maintainers:
> > + - Lad Prabhakar <[email protected]>
> > +
> > +description:
> > + A level-2 cache (L2C) is used to improve the system performance by providing
> > + a large amount of cache line entries and reasonable access delays. The L2C
> > + is shared between cores, and a non-inclusive non-exclusive policy is used.
> > +
> > +select:
> > + properties:
> > + compatible:
> > + contains:
> > + enum:
> > + - andestech,ax45mp-cache
> > +
> > + required:
> > + - compatible
> > +
> > +properties:
> > + compatible:
> > + items:
> > + - const: andestech,ax45mp-cache
> > + - const: cache
> > +
> > + reg:
> > + maxItems: 1
> > +
> > + interrupts:
> > + maxItems: 1
> > +
> > + cache-line-size:
> > + const: 64
> > +
> > + cache-level:
> > + const: 2
> > +
> > + cache-sets:
> > + const: 1024
> > +
> > + cache-size:
> > + enum: [131072, 262144, 524288, 1048576, 2097152]
> > +
> > + cache-unified: true
> > +
> > + next-level-cache: true
> > +
> > + andestech,pma-regions:
> > + $ref: /schemas/types.yaml#/definitions/uint32-matrix
> > + minItems: 1
> > + maxItems: 16
> > + items:
> > + minItems: 3
> > + maxItems: 3
>
> Instead:
> items:
> items:
> - description: Explain
> - description: what is
> - description: here
>
Ok, I will do that in the next version.

- description: Memory region offset to be set up in the PMA
- description: Size of the PMA region
- description: Flags indicating how the region should be set up in the
PMA. (ETYP[1:0] | MTYP[5:2]) use the macros
defined in include/dt-bindings/cache/andestech,ax45mp-cache.h.
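
For illustration, a small sketch of how a consumer could split such a flags cell back into its two fields, assuming the ETYP[1:0] | MTYP[5:2] layout described above. The PMACFG_* mask names and the accessor names are made up for this sketch:

```c
#include <stdint.h>

/* Field layout as described above: ETYP in bits [1:0], MTYP in bits [5:2]. */
#define PMACFG_ETYP_MASK  0x03u
#define PMACFG_MTYP_SHIFT 2
#define PMACFG_MTYP_MASK  (0x0fu << PMACFG_MTYP_SHIFT)

/* Entry type, e.g. 3 for a NAPOT region. */
static inline unsigned int pmacfg_etyp(uint32_t flags)
{
	return flags & PMACFG_ETYP_MASK;
}

/* Memory type, already shifted down, e.g. 3 for non-cacheable bufferable memory. */
static inline unsigned int pmacfg_mtyp(uint32_t flags)
{
	return (flags & PMACFG_MTYP_MASK) >> PMACFG_MTYP_SHIFT;
}
```

For the example region, the flags cell (AX45MP_PMACFG_ETYP_NAPOT | AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF) evaluates to 3 | (3 << 2).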

> > + description: Optional array of memory regions to be set in the PMA.
> > +
> > +additionalProperties: false
> > +
> > +required:
> > + - compatible
> > + - cache-line-size
> > + - cache-level
> > + - cache-sets
> > + - cache-size
> > + - cache-unified
> > + - interrupts
> > + - reg
>
> Keep the same order as properties appear in the "properties:"
>
Agreed, will do.


> > +
> > +examples:
> > + - |
> > + #include <dt-bindings/interrupt-controller/irq.h>
> > + #include <dt-bindings/cache/andestech,ax45mp-cache.h>
> > +
> > + [email protected] {
> > + reg = <0x13400000 0x100000>;
> > + compatible = "andestech,ax45mp-cache", "cache";
> > + interrupts = <508 IRQ_TYPE_LEVEL_HIGH>;
> > + cache-line-size = <64>;
> > + cache-level = <2>;
> > + cache-sets = <1024>;
> > + cache-size = <262144>;
> > + cache-unified;
> > + andestech,pma-regions = <0x58000000 0x08000000
> > + (AX45MP_PMACFG_ETYP_NAPOT | AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF)>;
> > + };
> > diff --git a/include/dt-bindings/cache/andestech,ax45mp-cache.h b/include/dt-bindings/cache/andestech,ax45mp-cache.h
> > new file mode 100644
> > index 000000000000..aa1cad24075d
> > --- /dev/null
> > +++ b/include/dt-bindings/cache/andestech,ax45mp-cache.h
> > @@ -0,0 +1,38 @@
> > +/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) */
> > +/*
> > + * This header provides constants for Andes AX45MP PMA configuration
> > + *
> > + * Copyright (C) 2022 Renesas Electronics Corp.
> > + */
> > +
> > +#ifndef __DT_BINDINGS_ANDESTECH_AX45MP_CACHE_H
> > +#define __DT_BINDINGS_ANDESTECH_AX45MP_CACHE_H
> > +
> > +/* OFF: PMA entry is disabled */
> > +#define AX45MP_PMACFG_ETYP_DISABLED 0
> > +/* Naturally aligned power of 2 region */
> > +#define AX45MP_PMACFG_ETYP_NAPOT 3
> > +
> > +/* Device, Non-bufferable */
> > +#define AX45MP_PMACFG_MTYP_DEV_NON_BUF (0 << 2)
> > +/* Device, bufferable */
> > +#define AX45MP_PMACFG_MTYP_DEV_BUF (1 << 2)
> > +/* Memory, Non-cacheable, Non-bufferable */
> > +#define AX45MP_PMACFG_MTYP_MEM_NON_CACHE_NON_BUF (2 << 2)
> > +/* Memory, Non-cacheable, Bufferable */
> > +#define AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF (3 << 2)
>
> What are all these? They don't look like flags, because 3 = 1 | 2...
> they don't look like constants, because we do not use shifts in
> constants. Are these some register values? I also do not see the header
> being used in the code, so why having a bindings header if it is not
> used (DTS is not usage...)?
>
These are register bit values for the MTYP[5:2] field. The DTS example
in the binding doc (above) uses these macros. I haven't included the
DTS/I patches with this patchset yet; do you think I should?

Cheers,
Prabhakar

2022-11-25 12:02:42

by Geert Uytterhoeven

Subject: Re: [PATCH v4 6/7] dt-bindings: cache: r9a07g043f-l2-cache: Add DT binding documentation for L2 cache controller

Hi Prabhakar,

On Fri, Nov 25, 2022 at 11:34 AM Lad, Prabhakar
<[email protected]> wrote:
> On Fri, Nov 25, 2022 at 8:16 AM Krzysztof Kozlowski
> <[email protected]> wrote:
> > On 24/11/2022 18:22, Prabhakar wrote:
> > > From: Lad Prabhakar <[email protected]>
> > >
> > > Add DT binding documentation for L2 cache controller found on RZ/Five SoC.
> > >
> > > The Renesas RZ/Five microprocessor includes a RISC-V CPU Core (AX45MP
> > > Single) from Andes. The AX45MP core has an L2 cache controller, this patch
> > > describes the L2 cache block.
> > >
> > > Signed-off-by: Lad Prabhakar <[email protected]>
> > > ---
> > > RFC v3 -> v4
> > > * Dropped l2 cache configuration parameters
> > > * s/larger/large
> > > * Added minItems/maxItems for andestech,pma-regions

> > > --- /dev/null
> > > +++ b/Documentation/devicetree/bindings/cache/andestech,ax45mp-cache.yaml

> > > +examples:
> > > + - |
> > > + #include <dt-bindings/interrupt-controller/irq.h>
> > > + #include <dt-bindings/cache/andestech,ax45mp-cache.h>
> > > +
> > > + [email protected] {
> > > + reg = <0x13400000 0x100000>;
> > > + compatible = "andestech,ax45mp-cache", "cache";
> > > + interrupts = <508 IRQ_TYPE_LEVEL_HIGH>;
> > > + cache-line-size = <64>;
> > > + cache-level = <2>;
> > > + cache-sets = <1024>;
> > > + cache-size = <262144>;
> > > + cache-unified;
> > > + andestech,pma-regions = <0x58000000 0x08000000
> > > + (AX45MP_PMACFG_ETYP_NAPOT | AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF)>;
> > > + };
> > > diff --git a/include/dt-bindings/cache/andestech,ax45mp-cache.h b/include/dt-bindings/cache/andestech,ax45mp-cache.h
> > > new file mode 100644
> > > index 000000000000..aa1cad24075d
> > > --- /dev/null
> > > +++ b/include/dt-bindings/cache/andestech,ax45mp-cache.h
> > > @@ -0,0 +1,38 @@
> > > +/* SPDX-License-Identifier: (GPL-2.0-only OR BSD-2-Clause) */
> > > +/*
> > > + * This header provides constants for Andes AX45MP PMA configuration
> > > + *
> > > + * Copyright (C) 2022 Renesas Electronics Corp.
> > > + */
> > > +
> > > +#ifndef __DT_BINDINGS_ANDESTECH_AX45MP_CACHE_H
> > > +#define __DT_BINDINGS_ANDESTECH_AX45MP_CACHE_H
> > > +
> > > +/* OFF: PMA entry is disabled */
> > > +#define AX45MP_PMACFG_ETYP_DISABLED 0
> > > +/* Naturally aligned power of 2 region */
> > > +#define AX45MP_PMACFG_ETYP_NAPOT 3
> > > +
> > > +/* Device, Non-bufferable */
> > > +#define AX45MP_PMACFG_MTYP_DEV_NON_BUF (0 << 2)
> > > +/* Device, bufferable */
> > > +#define AX45MP_PMACFG_MTYP_DEV_BUF (1 << 2)
> > > +/* Memory, Non-cacheable, Non-bufferable */
> > > +#define AX45MP_PMACFG_MTYP_MEM_NON_CACHE_NON_BUF (2 << 2)
> > > +/* Memory, Non-cacheable, Bufferable */
> > > +#define AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF (3 << 2)
> >
> > What are all these? They don't look like flags, because 3 = 1 | 2...
> > they don't look like constants, because we do not use shifts in
> > constants. Are these some register values? I also do not see the header
> > being used in the code, so why having a bindings header if it is not
> > used (DTS is not usage...)?
> >
> These are register bit values for the MTYP[5:2] field. The DTS example
> in the binding doc (above) uses these macros. I haven't included the
> DTS/I patches with this patchset yet do think I should?

I think the main objection from Rob is that these look too much like
raw register values to be written unchanged to registers, which is
frowned upon in DT.

Now, can we make this more generic?

1. Do you need AX45MP_PMACFG_ETYP_DISABLED, i.e. will it ever be
specified in DTS, or is this a pure software thing?
2. Obviously you can let the driver decide if AX45MP_PMACFG_ETYP_NAPOT
can be set, based on address/size?
3. If the two above are removed, the shifts can be handled in the
driver instead,
4. Are there existing (more generic) definitions that can be used
instead?

BTW, what's the difference between non-bufferable and non-cacheable?

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2022-11-25 12:25:48

by Krzysztof Kozlowski

Subject: Re: [PATCH v4 6/7] dt-bindings: cache: r9a07g043f-l2-cache: Add DT binding documentation for L2 cache controller

On 25/11/2022 11:34, Lad, Prabhakar wrote:
>>> +/* Device, Non-bufferable */
>>> +#define AX45MP_PMACFG_MTYP_DEV_NON_BUF (0 << 2)
>>> +/* Device, bufferable */
>>> +#define AX45MP_PMACFG_MTYP_DEV_BUF (1 << 2)
>>> +/* Memory, Non-cacheable, Non-bufferable */
>>> +#define AX45MP_PMACFG_MTYP_MEM_NON_CACHE_NON_BUF (2 << 2)
>>> +/* Memory, Non-cacheable, Bufferable */
>>> +#define AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF (3 << 2)
>>
>> What are all these? They don't look like flags, because 3 = 1 | 2...
>> they don't look like constants, because we do not use shifts in
>> constants. Are these some register values? I also do not see the header
>> being used in the code, so why having a bindings header if it is not
>> used (DTS is not usage...)?
>>
> These are register bit values for the MTYP[5:2] field. The DTS example
> in the binding doc (above) uses these macros. I haven't included the
> DTS/I patches with this patchset yet do think I should?

Then why store it as bindings? Bindings headers describe the interface
implemented by drivers and used by DTS, but this is not implemented by
drivers.

Best regards,
Krzysztof

2022-11-25 12:29:06

by Heiko Stübner

Subject: Re: [PATCH v4 1/7] riscv: asm: alternative-macros: Introduce ALTERNATIVE_3() macro

Am Freitag, 25. November 2022, 11:02:21 CET schrieb Lad, Prabhakar:
> Hi Heiko,
>
> On Thu, Nov 24, 2022 at 7:58 PM Heiko Stübner <[email protected]> wrote:
> >
> > Am Donnerstag, 24. November 2022, 20:52:33 CET schrieb Conor Dooley:
> > > On Thu, Nov 24, 2022 at 05:22:01PM +0000, Prabhakar wrote:
> > > > From: Lad Prabhakar <[email protected]>
> > > >
> > > > Introduce ALTERNATIVE_3() macro.
> > >
> > > Bit perfunctory I think! There's a lovely comment down below that would
> > > make for a better commit message if you were to yoink it.
> > > Content looks about what I'd expect to see though.
> >
> > Also both the comment on the original ALTERNATIVE_2 and the new ALTERNATIVE_3
> > should probably be merged into a single comment explaining this once for all
> > ALTERNATIVE_x variants.
> >
> > Especially with the dma stuff, I'm pretty sure we'll get at least an ALTERNATIVE_4
> > if not even more ;-) . So we definitely don't want to repeat this multiple times.
> >
> Do agree. How about the below?
>
> /*
> * Similar to what ALTERNATIVE_2() macro does but with an additional
> * vendor content.
> */
>
> So the other ALTERNATIVE_2+() macros will keep on building on it.

My idea was more like having _one_ comment block of something like

-----
/*
* ALTERNATIVE_x macros allow providing multiple replacement options
* for an ALTERNATIVE code section. This is helpful if multiple
* implementation variants for the same functionality exist for
* different cpu cores.
*
* Usage:
* ALTERNATIVE_x(old_content,
* new_content1, vendor_id1, errata_id1, CONFIG_k1,
* new_content2, vendor_id2, errata_id2, CONFIG_k2,
* ...
* new_contentx, vendor_idx, errata_idx, CONFIG_kx)
*/

#define ALTERNATIVE_2(...)
#define ALTERNATIVE_3(...)
etc
-----

So this would include dropping the old comment over ALTERNATIVE_2.


Heiko


2022-11-25 12:47:56

by Conor Dooley

Subject: Re: [PATCH v4 7/7] soc: renesas: Add L2 cache management for RZ/Five SoC

On Fri, Nov 25, 2022 at 10:50:01AM +0000, Lad, Prabhakar wrote:
> Hi Conor,
>
> Thank you for the review.
>
> On Thu, Nov 24, 2022 at 9:31 PM Conor Dooley <[email protected]> wrote:

> > But yeah, this is basically the sort of stuff that'd be nice to have in
> > the previous patch!
> >
> Agreed, I'll include this in the binding patch too.

I said "the previous patch" but I meant "the previous patch that I
commented on the commit message for". I can hardly expect telepathy, so
sorry for the poor wording.

2022-11-25 13:07:54

by Conor Dooley

Subject: Re: [PATCH v4 6/7] dt-bindings: cache: r9a07g043f-l2-cache: Add DT binding documentation for L2 cache controller

On Fri, Nov 25, 2022 at 01:12:18PM +0100, Krzysztof Kozlowski wrote:
> On 25/11/2022 11:34, Lad, Prabhakar wrote:
> >>> +/* Device, Non-bufferable */
> >>> +#define AX45MP_PMACFG_MTYP_DEV_NON_BUF (0 << 2)
> >>> +/* Device, bufferable */
> >>> +#define AX45MP_PMACFG_MTYP_DEV_BUF (1 << 2)
> >>> +/* Memory, Non-cacheable, Non-bufferable */
> >>> +#define AX45MP_PMACFG_MTYP_MEM_NON_CACHE_NON_BUF (2 << 2)
> >>> +/* Memory, Non-cacheable, Bufferable */
> >>> +#define AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF (3 << 2)
> >>
> >> What are all these? They don't look like flags, because 3 = 1 | 2...
> >> they don't look like constants, because we do not use shifts in
> >> constants. Are these some register values? I also do not see the header
> >> being used in the code, so why having a bindings header if it is not
> >> used (DTS is not usage...)?
> >>
> > These are register bit values for the MTYP[5:2] field. The DTS example
> > in the binding doc (above) uses these macros. I haven't included the
> > DTS/I patches with this patchset yet do think I should?
>
> Then why storing it as bindings? Bindings headers describe the interface
> implemented by drivers and used by DTS, but this is not implemented by
> drivers.

IIUC, some of these properties are non-discoverable attributes of the
cache controller. I see two things that could be done here that are
"better" than #defining bits:
- add an RZ/Five specific compatible and use match data to set the
attributes which is only possible if the pma-regions are set on a
per SoC basis
- make pma-regions into a child node, in which andestech,non-cacheable
andestech,non-bufferable etc are properties of the child node

Prabhakar, does that make sense or am I off with my understanding of the
attributes?

Thanks,
Conor.

2022-11-25 13:08:30

by Lad, Prabhakar

Subject: Re: [PATCH v4 6/7] dt-bindings: cache: r9a07g043f-l2-cache: Add DT binding documentation for L2 cache controller

Hi Conor,

On Fri, Nov 25, 2022 at 12:25 PM Conor Dooley
<[email protected]> wrote:
>
> On Fri, Nov 25, 2022 at 01:12:18PM +0100, Krzysztof Kozlowski wrote:
> > On 25/11/2022 11:34, Lad, Prabhakar wrote:
> > >>> +/* Device, Non-bufferable */
> > >>> +#define AX45MP_PMACFG_MTYP_DEV_NON_BUF (0 << 2)
> > >>> +/* Device, bufferable */
> > >>> +#define AX45MP_PMACFG_MTYP_DEV_BUF (1 << 2)
> > >>> +/* Memory, Non-cacheable, Non-bufferable */
> > >>> +#define AX45MP_PMACFG_MTYP_MEM_NON_CACHE_NON_BUF (2 << 2)
> > >>> +/* Memory, Non-cacheable, Bufferable */
> > >>> +#define AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF (3 << 2)
> > >>
> > >> What are all these? They don't look like flags, because 3 = 1 | 2...
> > >> they don't look like constants, because we do not use shifts in
> > >> constants. Are these some register values? I also do not see the header
> > >> being used in the code, so why having a bindings header if it is not
> > >> used (DTS is not usage...)?
> > >>
> > > These are register bit values for the MTYP[5:2] field. The DTS example
> > > in the binding doc (above) uses these macros. I haven't included the
> > > DTS/I patches with this patchset yet do think I should?
> >
> > Then why storing it as bindings? Bindings headers describe the interface
> > implemented by drivers and used by DTS, but this is not implemented by
> > drivers.
>
> IIUC, some of these properties are non-discoverable attributes of the
> cache controller. I see two things that could be done here that are
> "better" than #defining bits:
> - add an RZ/Five specific compatible and use match data to set the
> attributes which is only possible if the pma-regions are set on a
> per SoC basis
> - make pma-regions into a child node, in which andestech,non-cacheable
> andestech,non-bufferable etc are properties of the child node
>
For now the only way to get DMA working without IOCP is to have
AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF. But for future purposes I have
introduced the other available flags.

So maybe for now we could just have this flag
andestech,mem-non-cacheable-bufferable in the binding doc.

[email protected] {
reg = <0x13400000 0x100000>;
compatible = "andestech,ax45mp-cache", "cache";
interrupts = <508 IRQ_TYPE_LEVEL_HIGH>;
cache-line-size = <64>;
cache-level = <2>;
cache-sets = <1024>;
cache-size = <262144>;
cache-unified;
andestech,[email protected] {
reg = <0x58000000 0x08000000>;
andestech,mem-non-cacheable-bufferable;
};
andestech,[email protected] {
reg = <0xdeadbeef 0x08000000>;
andestech,mem-non-cacheable-bufferable;
};
....
};

Did I chime in this time?

Cheers,
Prabhakar

2022-11-25 14:19:19

by Conor Dooley

Subject: Re: [PATCH v4 6/7] dt-bindings: cache: r9a07g043f-l2-cache: Add DT binding documentation for L2 cache controller

On Fri, Nov 25, 2022 at 12:51:34PM +0000, Lad, Prabhakar wrote:
> Hi Conor,
>
> On Fri, Nov 25, 2022 at 12:25 PM Conor Dooley
> <[email protected]> wrote:
> >
> > On Fri, Nov 25, 2022 at 01:12:18PM +0100, Krzysztof Kozlowski wrote:
> > > On 25/11/2022 11:34, Lad, Prabhakar wrote:
> > > >>> +/* Device, Non-bufferable */
> > > >>> +#define AX45MP_PMACFG_MTYP_DEV_NON_BUF (0 << 2)
> > > >>> +/* Device, bufferable */
> > > >>> +#define AX45MP_PMACFG_MTYP_DEV_BUF (1 << 2)
> > > >>> +/* Memory, Non-cacheable, Non-bufferable */
> > > >>> +#define AX45MP_PMACFG_MTYP_MEM_NON_CACHE_NON_BUF (2 << 2)
> > > >>> +/* Memory, Non-cacheable, Bufferable */
> > > >>> +#define AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF (3 << 2)
> > > >>
> > > >> What are all these? They don't look like flags, because 3 = 1 | 2...
> > > >> they don't look like constants, because we do not use shifts in
> > > >> constants. Are these some register values? I also do not see the header
> > > >> being used in the code, so why having a bindings header if it is not
> > > >> used (DTS is not usage...)?
> > > >>
> > > > These are register bit values for the MTYP[5:2] field. The DTS example
> > > > in the binding doc (above) uses these macros. I haven't included the
> > > > DTS/I patches with this patchset yet do think I should?
> > >
> > > Then why storing it as bindings? Bindings headers describe the interface
> > > implemented by drivers and used by DTS, but this is not implemented by
> > > drivers.
> >
> > IIUC, some of these properties are non-discoverable attributes of the
> > cache controller. I see two things that could be done here that are
> > "better" than #defining bits:
> > - add an RZ/Five specific compatible and use match data to set the
> > attributes which is only possible if the pma-regions are set on a
> > per SoC basis
> > - make pma-regions into a child node, in which andestech,non-cacheable
> > andestech,non-bufferable etc are properties of the child node
> >
> For now the only way to get DMA working without IOCP is to have
> AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF. But for future purposes I have
> introduced the other available flags.
>
> So maybe for now we could just have this flag
> andestech,mem-non-cacheable-bufferable in the binding doc.
>
> [email protected] {
> reg = <0x13400000 0x100000>;
> compatible = "andestech,ax45mp-cache", "cache";
> interrupts = <508 IRQ_TYPE_LEVEL_HIGH>;
> cache-line-size = <64>;
> cache-level = <2>;
> cache-sets = <1024>;
> cache-size = <262144>;
> cache-unified;
> andestech,[email protected] {
> reg = <0x58000000 0x08000000>;
> andestech,mem-non-cacheable-bufferable;

Yah, that's about what I would expect - except splitting the properties
up. I think split up makes more sense from a property description point
of view, rather than needing some sort of
oneOf:
- non-cacheable-bufferable
- cacheable-non-bufferable
- non-cacheable-non-bufferable
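
To make the split concrete, here is a hedged sketch of how a driver might turn two independent boolean properties into the MTYP bits. It assumes the two flags would come from something like of_property_read_bool() on andestech,non-cacheable and andestech,non-bufferable; the helper name pma_mtyp_from_props() is hypothetical, and only the memory encodings from the posted header are covered:

```c
#include <stdbool.h>

/* Values copied from the posted binding header. */
#define AX45MP_PMACFG_MTYP_MEM_NON_CACHE_NON_BUF (2 << 2)
#define AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF     (3 << 2)

/*
 * Hypothetical helper: maps the two child-node booleans to MTYP bits.
 * Returns -1 for combinations the posted header has no encoding for.
 */
static int pma_mtyp_from_props(bool non_cacheable, bool non_bufferable)
{
	/* The posted header only lists non-cacheable memory types. */
	if (!non_cacheable)
		return -1;

	return non_bufferable ? AX45MP_PMACFG_MTYP_MEM_NON_CACHE_NON_BUF
			      : AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF;
}
```

With separate properties the binding only has to describe each boolean once, and the valid combinations are enforced in code rather than in the schema.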


> };
> andestech,[email protected] {
> reg = <0xdeadbeef 0x08000000>;
> andestech,mem-non-cacheable-bufferable;
> };
> ....
> };

2022-11-25 16:20:27

by Krzysztof Kozlowski

Subject: Re: [PATCH v4 6/7] dt-bindings: cache: r9a07g043f-l2-cache: Add DT binding documentation for L2 cache controller

On 25/11/2022 13:25, Conor Dooley wrote:
> On Fri, Nov 25, 2022 at 01:12:18PM +0100, Krzysztof Kozlowski wrote:
>> On 25/11/2022 11:34, Lad, Prabhakar wrote:
>>>>> +/* Device, Non-bufferable */
>>>>> +#define AX45MP_PMACFG_MTYP_DEV_NON_BUF (0 << 2)
>>>>> +/* Device, bufferable */
>>>>> +#define AX45MP_PMACFG_MTYP_DEV_BUF (1 << 2)
>>>>> +/* Memory, Non-cacheable, Non-bufferable */
>>>>> +#define AX45MP_PMACFG_MTYP_MEM_NON_CACHE_NON_BUF (2 << 2)
>>>>> +/* Memory, Non-cacheable, Bufferable */
>>>>> +#define AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF (3 << 2)
>>>>
>>>> What are all these? They don't look like flags, because 3 = 1 | 2...
>>>> they don't look like constants, because we do not use shifts in
>>>> constants. Are these some register values? I also do not see the header
>>>> being used in the code, so why having a bindings header if it is not
>>>> used (DTS is not usage...)?
>>>>
>>> These are register bit values for the MTYP[5:2] field. The DTS example
>>> in the binding doc (above) uses these macros. I haven't included the
>>> DTS/I patches with this patchset yet do think I should?
>>
>> Then why storing it as bindings? Bindings headers describe the interface
>> implemented by drivers and used by DTS, but this is not implemented by
>> drivers.
>
> IIUC, some of these properties are non-discoverable attributes of the
> cache controller. I see two things that could be done here that are
> "better" than #defining bits:

I did not comment on the properties. I am commenting on the constants.
Why are register values/offsets/addresses suitable for binding headers
in this particular case?

> - add an RZ/Five specific compatible and use match data to set the
> attributes which is only possible if the pma-regions are set on a
> per SoC basis
> - make pma-regions into a child node, in which andestech,non-cacheable
> andestech,non-bufferable etc are properties of the child node
>
> Prabhakar, does that make sense or am I off with my understanding of the
> attributes?


Best regards,
Krzysztof

2022-11-25 18:19:38

by Conor Dooley

Subject: Re: [PATCH v4 6/7] dt-bindings: cache: r9a07g043f-l2-cache: Add DT binding documentation for L2 cache controller

On Fri, Nov 25, 2022 at 04:55:11PM +0100, Krzysztof Kozlowski wrote:
> On 25/11/2022 13:25, Conor Dooley wrote:
> > On Fri, Nov 25, 2022 at 01:12:18PM +0100, Krzysztof Kozlowski wrote:
> >> On 25/11/2022 11:34, Lad, Prabhakar wrote:
> >>>>> +/* Device, Non-bufferable */
> >>>>> +#define AX45MP_PMACFG_MTYP_DEV_NON_BUF (0 << 2)
> >>>>> +/* Device, bufferable */
> >>>>> +#define AX45MP_PMACFG_MTYP_DEV_BUF (1 << 2)
> >>>>> +/* Memory, Non-cacheable, Non-bufferable */
> >>>>> +#define AX45MP_PMACFG_MTYP_MEM_NON_CACHE_NON_BUF (2 << 2)
> >>>>> +/* Memory, Non-cacheable, Bufferable */
> >>>>> +#define AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF (3 << 2)
> >>>>
> >>>> What are all these? They don't look like flags, because 3 = 1 | 2...
> >>>> they don't look like constants, because we do not use shifts in
> >>>> constants. Are these some register values? I also do not see the header
> >>>> being used in the code, so why having a bindings header if it is not
> >>>> used (DTS is not usage...)?
> >>>>
> >>> These are register bit values for the MTYP[5:2] field. The DTS example
> >>> in the binding doc (above) uses these macros. I haven't included the
> >>> DTS/I patches with this patchset yet do think I should?
> >>
> >> Then why storing it as bindings? Bindings headers describe the interface
> >> implemented by drivers and used by DTS, but this is not implemented by
> >> drivers.
> >
> > IIUC, some of these properties are non-discoverable attributes of the
> > cache controller. I see two things that could be done here that are
> > "better" than #defining bits:
>
> I did not comment on the properties. I am commenting on the constants.
> Why are register values/offsets/addresses suitable for binding headers
> in this particular case?

I don't think we disagree here. I'm not in favour of the defines here
either. Perhaps I confused you by accidentally not adding Prabhakar to
the To: field.

The dt needs to convey his particular cache implementation's bufferable
and/or coherent regions so I was suggesting alternatives for conveying
this information, without resorting to defines.

> > - add an RZ/Five specific compatible and use match data to set the
> > attributes which is only possible if the pma-regions are set on a
> > per SoC basis
> > - make pma-regions into a child node, in which andestech,non-cacheable
> > andestech,non-bufferable etc are properties of the child node

2022-11-25 18:38:21

by Lad, Prabhakar

Subject: Re: [PATCH v4 6/7] dt-bindings: cache: r9a07g043f-l2-cache: Add DT binding documentation for L2 cache controller

Hi Krzysztof,

On Fri, Nov 25, 2022 at 12:12 PM Krzysztof Kozlowski
<[email protected]> wrote:
>
> On 25/11/2022 11:34, Lad, Prabhakar wrote:
> >>> +/* Device, Non-bufferable */
> >>> +#define AX45MP_PMACFG_MTYP_DEV_NON_BUF (0 << 2)
> >>> +/* Device, bufferable */
> >>> +#define AX45MP_PMACFG_MTYP_DEV_BUF (1 << 2)
> >>> +/* Memory, Non-cacheable, Non-bufferable */
> >>> +#define AX45MP_PMACFG_MTYP_MEM_NON_CACHE_NON_BUF (2 << 2)
> >>> +/* Memory, Non-cacheable, Bufferable */
> >>> +#define AX45MP_PMACFG_MTYP_MEM_NON_CACHE_BUF (3 << 2)
> >>
> >> What are all these? They don't look like flags, because 3 = 1 | 2...
> >> they don't look like constants, because we do not use shifts in
> >> constants. Are these some register values? I also do not see the header
> >> being used in the code, so why having a bindings header if it is not
> >> used (DTS is not usage...)?
> >>
> > These are register bit values for the MTYP[5:2] field. The DTS example
> > in the binding doc (above) uses these macros. I haven't included the
> > DTS/I patches with this patchset yet do think I should?
>
> Then why storing it as bindings? Bindings headers describe the interface
> implemented by drivers and used by DTS, but this is not implemented by
> drivers.
>
I got your point. I'll make use of the header in the driver in the
next version and address the comments you pointed out previously.

Cheers,
Prabhakar

2022-11-25 19:11:21

by Samuel Holland

Subject: Re: [PATCH v4 5/7] riscv: mm: dma-noncoherent: Pass direction and operation to ALT_CMO_OP()

On 11/24/22 13:18, Lad, Prabhakar wrote:
> Hi Heiko,
>
> Thank you for the review.
>
> On Thu, Nov 24, 2022 at 6:29 PM Heiko Stübner <[email protected]> wrote:
>>
>> Am Donnerstag, 24. November 2022, 18:22:05 CET schrieb Prabhakar:
>>> From: Lad Prabhakar <[email protected]>
>>>
>>> Pass direction and operation to ALT_CMO_OP() macro.
>>>
>>> This is in preparation for adding errata for the Andes CPU core.
>>
>> can you provide more explanation why that is necessary please?
>> I guess you want to use different cache operations for some cases?
>>
> Yes basically to call different cache operations based on the dir and
> operations (and also this allows to export just one function to handle
> the errata). I'll update the commit message in the next version.

This makes things less efficient, because it requires more instructions
and registers inside the alternative section, and your function
duplicates the logic from arch_sync_dma_for_device(). The alternative is
already passed the operation (clean/flush/invalidate) as a token, so you
can construct the function name with token pasting.

Regards,
Samuel

2022-11-25 20:02:09

by Samuel Holland

Subject: Re: [PATCH v4 7/7] soc: renesas: Add L2 cache management for RZ/Five SoC

Hi Prabhakar,

On 11/24/22 11:22, Prabhakar wrote:
> From: Lad Prabhakar <[email protected]>
>
> On the AX45MP core, cache coherency is a specification option so it may
> not be supported. In this case DMA will fail. As a workaround, firstly we
> allocate a global dma coherent pool from which DMA allocations are taken
> and marked as non-cacheable + bufferable using the PMA region as specified
> in the device tree. Synchronization callbacks are implemented to
> synchronize when doing DMA transactions.
>
> The Andes AX45MP core has a Programmable Physical Memory Attributes (PMA)
> block that allows dynamic adjustment of memory attributes in the runtime.
> It contains a configurable amount of PMA entries implemented as CSR
> registers to control the attributes of memory locations in interest.
>
> Below are the memory attributes supported:
> * Device, Non-bufferable
> * Device, bufferable
> * Memory, Non-cacheable, Non-bufferable
> * Memory, Non-cacheable, Bufferable
> * Memory, Write-back, No-allocate
> * Memory, Write-back, Read-allocate
> * Memory, Write-back, Write-allocate
> * Memory, Write-back, Read and Write-allocate
>
> This patch adds support to configure the memory attributes of the memory
> regions as passed from the l2 cache node and exposes the cache management
> ops.

Forgive my ignorance, but why do you need both a DMA pool and explicit
cache maintenance? Wouldn't the purpose of marking a memory region as
permanently non-cacheable be to avoid cache maintenance? And likewise,
if you are doing cache maintenance anyway, why does it matter if/how the
memory is cacheable?

> More info about PMA (section 10.3):
> Link: http://www.andestech.com/wp-content/uploads/AX45MP-1C-Rev.-5.0.0-Datasheet.pdf
>
> Signed-off-by: Lad Prabhakar <[email protected]>
> ---
> RFC v3 -> v4
> * Made use of runtime patching instead of compile time
> * Now just exposing single function ax45mp_no_iocp_cmo() for CMO handling
> * Added a check to make sure cache line size is always 64 bytes
> * Renamed folder rzf -> rzfive
> * Improved Kconfig description
> * Dropped L2 cache configuration
> * Dropped unnecessary casts
> * Fixed comments pointed by Geert, apart from use of PTR_ALIGN_XYZ() macros.
> ---
> arch/riscv/include/asm/cacheflush.h | 8 +
> arch/riscv/include/asm/errata_list.h | 32 +-
> drivers/soc/renesas/Kconfig | 7 +
> drivers/soc/renesas/Makefile | 2 +
> drivers/soc/renesas/rzfive/Kconfig | 6 +
> drivers/soc/renesas/rzfive/Makefile | 3 +
> drivers/soc/renesas/rzfive/ax45mp_cache.c | 415 ++++++++++++++++++++++
> drivers/soc/renesas/rzfive/ax45mp_sbi.h | 29 ++
> 8 files changed, 496 insertions(+), 6 deletions(-)
> create mode 100644 drivers/soc/renesas/rzfive/Kconfig
> create mode 100644 drivers/soc/renesas/rzfive/Makefile
> create mode 100644 drivers/soc/renesas/rzfive/ax45mp_cache.c
> create mode 100644 drivers/soc/renesas/rzfive/ax45mp_sbi.h
>
> diff --git a/arch/riscv/include/asm/cacheflush.h b/arch/riscv/include/asm/cacheflush.h
> index 4a04d1be7c67..3226f3aceafe 100644
> --- a/arch/riscv/include/asm/cacheflush.h
> +++ b/arch/riscv/include/asm/cacheflush.h
> @@ -61,6 +61,14 @@ static inline void riscv_noncoherent_supported(void) {}
> #define SYS_RISCV_FLUSH_ICACHE_LOCAL 1UL
> #define SYS_RISCV_FLUSH_ICACHE_ALL (SYS_RISCV_FLUSH_ICACHE_LOCAL)
>
> +#ifdef CONFIG_AX45MP_L2_CACHE
> +extern asmlinkage void ax45mp_no_iocp_cmo(unsigned int cache_size, void *vaddr,
> + size_t size, int dir, int ops);
> +#else
> +inline void ax45mp_no_iocp_cmo(unsigned int cache_size, void *vaddr,
> + size_t size, int dir, int ops) {}
> +#endif
> +
> #include <asm-generic/cacheflush.h>
>
> #endif /* _ASM_RISCV_CACHEFLUSH_H */
> diff --git a/arch/riscv/include/asm/errata_list.h b/arch/riscv/include/asm/errata_list.h
> index 48e899a8e7a9..300fed3bfd80 100644
> --- a/arch/riscv/include/asm/errata_list.h
> +++ b/arch/riscv/include/asm/errata_list.h
> @@ -125,8 +125,8 @@ asm volatile(ALTERNATIVE( \
> #define THEAD_SYNC_S ".long 0x0190000b"
>
> #define ALT_CMO_OP(_op, _start, _size, _cachesize, _dir, _ops) \
> -asm volatile(ALTERNATIVE_2( \
> - __nops(6), \
> +asm volatile(ALTERNATIVE_3( \
> + __nops(14), \
> "mv a0, %1\n\t" \
> "j 2f\n\t" \
> "3:\n\t" \
> @@ -134,7 +134,7 @@ asm volatile(ALTERNATIVE_2( \
> "add a0, a0, %0\n\t" \
> "2:\n\t" \
> "bltu a0, %2, 3b\n\t" \
> - "nop", 0, CPUFEATURE_ZICBOM, CONFIG_RISCV_ISA_ZICBOM, \
> + __nops(8), 0, CPUFEATURE_ZICBOM, CONFIG_RISCV_ISA_ZICBOM, \
> "mv a0, %1\n\t" \
> "j 2f\n\t" \
> "3:\n\t" \
> @@ -142,8 +142,28 @@ asm volatile(ALTERNATIVE_2( \
> "add a0, a0, %0\n\t" \
> "2:\n\t" \
> "bltu a0, %2, 3b\n\t" \
> - THEAD_SYNC_S, THEAD_VENDOR_ID, \
> - ERRATA_THEAD_CMO, CONFIG_ERRATA_THEAD_CMO) \
> + THEAD_SYNC_S "\n\t" \
> + __nops(8), THEAD_VENDOR_ID, \
> + ERRATA_THEAD_CMO, CONFIG_ERRATA_THEAD_CMO, \
> + ".option push\n\t\n\t" \
> + ".option norvc\n\t" \
> + ".option norelax\n\t" \
> + "addi sp,sp,-16\n\t" \
> + "sd s0,0(sp)\n\t" \
> + "sd ra,8(sp)\n\t" \
> + "addi s0,sp,16\n\t" \
> + "mv a4,%6\n\t" \
> + "mv a3,%5\n\t" \
> + "mv a2,%4\n\t" \
> + "mv a1,%3\n\t" \
> + "mv a0,%0\n\t" \
> + "call ax45mp_no_iocp_cmo\n\t" \
> + "ld ra,8(sp)\n\t" \
> + "ld s0,0(sp)\n\t" \
> + "addi sp,sp,16\n\t" \
> + ".option pop\n\t", \
> + ANDESTECH_VENDOR_ID, ERRATA_ANDESTECH_NO_IOCP, \
> + CONFIG_ERRATA_ANDES_CMO) \
> : : "r"(_cachesize), \
> "r"((unsigned long)(_start) & ~((_cachesize) - 1UL)), \
> "r"((unsigned long)(_start) + (_size)), \
> @@ -151,7 +171,7 @@ asm volatile(ALTERNATIVE_2( \
> "r"((unsigned long)(_size)), \
> "r"((unsigned long)(_dir)), \
> "r"((unsigned long)(_ops)) \
> - : "a0")
> + : "a0", "a1", "a2", "a3", "a4", "memory")
>
> #define THEAD_C9XX_RV_IRQ_PMU 17
> #define THEAD_C9XX_CSR_SCOUNTEROF 0x5c5
> diff --git a/drivers/soc/renesas/Kconfig b/drivers/soc/renesas/Kconfig
> index 660498252ec5..e7810256c60d 100644
> --- a/drivers/soc/renesas/Kconfig
> +++ b/drivers/soc/renesas/Kconfig
> @@ -340,9 +340,16 @@ if RISCV
> config ARCH_R9A07G043
> bool "RISC-V Platform support for RZ/Five"
> select ARCH_RZG2L
> + select AX45MP_L2_CACHE
> + select DMA_GLOBAL_POOL
> + select ERRATA_ANDES
> + select ERRATA_ANDES_CMO
> + select RISCV_DMA_NONCOHERENT
> help
> This enables support for the Renesas RZ/Five SoC.
>
> +source "drivers/soc/renesas/rzfive/Kconfig"
> +
> endif # RISCV
>
> config RST_RCAR
> diff --git a/drivers/soc/renesas/Makefile b/drivers/soc/renesas/Makefile
> index 535868c9c7e4..9df9f759a039 100644
> --- a/drivers/soc/renesas/Makefile
> +++ b/drivers/soc/renesas/Makefile
> @@ -31,6 +31,8 @@ ifdef CONFIG_SMP
> obj-$(CONFIG_ARCH_R9A06G032) += r9a06g032-smp.o
> endif
>
> +obj-$(CONFIG_RISCV) += rzfive/
> +
> # Family
> obj-$(CONFIG_RST_RCAR) += rcar-rst.o
> obj-$(CONFIG_SYSC_RCAR) += rcar-sysc.o
> diff --git a/drivers/soc/renesas/rzfive/Kconfig b/drivers/soc/renesas/rzfive/Kconfig
> new file mode 100644
> index 000000000000..b6bc00337d99
> --- /dev/null
> +++ b/drivers/soc/renesas/rzfive/Kconfig
> @@ -0,0 +1,6 @@
> +# SPDX-License-Identifier: GPL-2.0
> +
> +config AX45MP_L2_CACHE
> + bool "Andes Technology AX45MP L2 Cache controller"
> + help
> + Support for the L2 cache controller on Andes Technology AX45MP platforms.
> diff --git a/drivers/soc/renesas/rzfive/Makefile b/drivers/soc/renesas/rzfive/Makefile
> new file mode 100644
> index 000000000000..2012e7fb978d
> --- /dev/null
> +++ b/drivers/soc/renesas/rzfive/Makefile
> @@ -0,0 +1,3 @@
> +# SPDX-License-Identifier: GPL-2.0
> +
> +obj-$(CONFIG_AX45MP_L2_CACHE) += ax45mp_cache.o
> diff --git a/drivers/soc/renesas/rzfive/ax45mp_cache.c b/drivers/soc/renesas/rzfive/ax45mp_cache.c
> new file mode 100644
> index 000000000000..4e0d0545d3af
> --- /dev/null
> +++ b/drivers/soc/renesas/rzfive/ax45mp_cache.c
> @@ -0,0 +1,415 @@
> +// SPDX-License-Identifier: GPL-2.0
> +/*
> + * PMA setup and non-coherent cache functions for Andes AX45MP
> + *
> + * Copyright (C) 2022 Renesas Electronics Corp.
> + */
> +
> +#include <linux/cacheflush.h>
> +#include <linux/cacheinfo.h>
> +#include <linux/dma-direction.h>
> +#include <linux/of_address.h>
> +#include <linux/of_platform.h>
> +
> +#include <asm/cacheflush.h>
> +#include <asm/sbi.h>
> +
> +#include "ax45mp_sbi.h"
> +
> +/* L2 cache registers */
> +#define AX45MP_L2C_REG_CTL_OFFSET 0x8
> +
> +#define AX45MP_L2C_REG_C0_CMD_OFFSET 0x40
> +#define AX45MP_L2C_REG_C0_ACC_OFFSET 0x48
> +#define AX45MP_L2C_REG_STATUS_OFFSET 0x80
> +
> +/* D-cache operation */
> +#define AX45MP_CCTL_L1D_VA_INVAL 0
> +#define AX45MP_CCTL_L1D_VA_WB 1
> +
> +/* L2 cache */
> +#define AX45MP_L2_CACHE_CTL_CEN_MASK 1
> +
> +/* L2 CCTL status */
> +#define AX45MP_CCTL_L2_STATUS_IDLE 0
> +
> +/* L2 CCTL status cores mask */
> +#define AX45MP_CCTL_L2_STATUS_C0_MASK 0xf
> +
> +/* L2 cache operation */
> +#define AX45MP_CCTL_L2_PA_INVAL 0x8
> +#define AX45MP_CCTL_L2_PA_WB 0x9
> +
> +#define AX45MP_L2C_HPM_PER_CORE_OFFSET 0x8
> +#define AX45MP_L2C_REG_PER_CORE_OFFSET 0x10
> +#define AX45MP_CCTL_L2_STATUS_PER_CORE_OFFSET 4
> +
> +#define AX45MP_L2C_REG_CN_CMD_OFFSET(n) \
> + (AX45MP_L2C_REG_C0_CMD_OFFSET + ((n) * AX45MP_L2C_REG_PER_CORE_OFFSET))
> +#define AX45MP_L2C_REG_CN_ACC_OFFSET(n) \
> + (AX45MP_L2C_REG_C0_ACC_OFFSET + ((n) * AX45MP_L2C_REG_PER_CORE_OFFSET))
> +#define AX45MP_CCTL_L2_STATUS_CN_MASK(n) \
> + (AX45MP_CCTL_L2_STATUS_C0_MASK << ((n) * AX45MP_CCTL_L2_STATUS_PER_CORE_OFFSET))
> +
> +#define AX45MP_MICM_CFG_ISZ_OFFSET 6
> +#define AX45MP_MICM_CFG_ISZ_MASK (0x7 << AX45MP_MICM_CFG_ISZ_OFFSET)
> +
> +#define AX45MP_MDCM_CFG_DSZ_OFFSET 6
> +#define AX45MP_MDCM_CFG_DSZ_MASK (0x7 << AX45MP_MDCM_CFG_DSZ_OFFSET)
> +
> +#define AX45MP_CCTL_REG_UCCTLBEGINADDR_NUM 0x80b
> +#define AX45MP_CCTL_REG_UCCTLCOMMAND_NUM 0x80c
> +
> +#define AX45MP_MCACHE_CTL_CCTL_SUEN_OFFSET 8
> +#define AX45MP_MMSC_CFG_CCTLCSR_OFFSET 16
> +#define AX45MP_MISA_20_OFFSET 20
> +
> +#define AX45MP_MCACHE_CTL_CCTL_SUEN_MASK (0x1 << AX45MP_MCACHE_CTL_CCTL_SUEN_OFFSET)
> +#define AX45MP_MMSC_CFG_CCTLCSR_MASK (0x1 << AX45MP_MMSC_CFG_CCTLCSR_OFFSET)
> +#define AX45MP_MISA_20_MASK (0x1 << AX45MP_MISA_20_OFFSET)
> +
> +#define AX45MP_MAX_CACHE_LINE_SIZE 256
> +
> +#define AX45MP_MAX_PMA_REGIONS 16
> +
> +struct ax45mp_priv {
> + void __iomem *l2c_base;
> + u32 ax45mp_cache_line_size;
> + bool l2cache_enabled;
> + bool ucctl_ok;
> +};
> +
> +static struct ax45mp_priv *ax45mp_priv;
> +static DEFINE_STATIC_KEY_FALSE(ax45mp_l2c_configured);
> +
> +/* PMA setup */
> +static long ax45mp_sbi_set_pma(unsigned long start,
> + unsigned long size,
> + unsigned long flags,
> + unsigned int entry_id)
> +{
> + struct sbiret ret;
> +
> + ret = sbi_ecall(SBI_EXT_ANDES, AX45MP_SBI_EXT_SET_PMA,
> + start, size, entry_id, flags, 0, 0);
> +
> + return ret.value;
> +}
> +
> +static int ax45mp_configure_pma_regions(struct device_node *np)
> +{
> + const char *propname = "andestech,pma-regions";
> + u32 start, size, flags;
> + unsigned int entry_id;
> + unsigned int i;
> + int count;
> + int ret;
> +
> + count = of_property_count_elems_of_size(np, propname, sizeof(u32) * 3);
> + if (count < 0)
> + return count;
> +
> + if (count > AX45MP_MAX_PMA_REGIONS)
> + return -EINVAL;
> +
> + for (i = 0, entry_id = 0 ; entry_id < count ; i += 3, entry_id++) {
> + of_property_read_u32_index(np, propname, i, &start);
> + of_property_read_u32_index(np, propname, i + 1, &size);
> + of_property_read_u32_index(np, propname, i + 2, &flags);
> + ret = ax45mp_sbi_set_pma(start, size, flags, entry_id);
> + if (!ret)
> + pr_err("Failed to setup PMA region 0x%x - 0x%x flags: 0x%x",
> + start, start + size, flags);
> + }
> +
> + return 0;
> +}

If firmware support is required to set up these PMA regions, why is
Linux doing this at all? The firmware has access to the devicetree as
well. It can set this up before entering S-mode, and then you don't need
to expose this capability via an SBI extension. In fact, firmware could
generate the reserved-memory node based on these regions at runtime (or
vice versa).

> +
> +/* L2 Cache operations */
> +static uint32_t ax45mp_cpu_get_mcache_ctl_status(void)
> +{
> + struct sbiret ret;
> +
> + ret = sbi_ecall(SBI_EXT_ANDES, AX45MP_SBI_EXT_GET_MCACHE_CTL_STATUS,
> + 0, 0, 0, 0, 0, 0);
> + return ret.value;
> +}
> +
> +static uint32_t ax45mp_cpu_get_micm_cfg_status(void)
> +{
> + struct sbiret ret;
> +
> + ret = sbi_ecall(SBI_EXT_ANDES, AX45MP_SBI_EXT_GET_MICM_CTL_STATUS,
> + 0, 0, 0, 0, 0, 0);
> + return ret.value;
> +}
> +
> +static uint32_t ax45mp_cpu_get_mdcm_cfg_status(void)
> +{
> + struct sbiret ret;
> +
> + ret = sbi_ecall(SBI_EXT_ANDES, AX45MP_SBI_EXT_GET_MDCM_CTL_STATUS,
> + 0, 0, 0, 0, 0, 0);
> + return ret.value;
> +}
> +
> +static uint32_t ax45mp_cpu_get_mmsc_cfg_status(void)
> +{
> + struct sbiret ret;
> +
> + ret = sbi_ecall(SBI_EXT_ANDES, AX45MP_SBI_EXT_GET_MMSC_CTL_STATUS,
> + 0, 0, 0, 0, 0, 0);
> + return ret.value;
> +}
> +
> +static uint32_t ax45mp_cpu_get_misa_cfg_status(void)
> +{
> + struct sbiret ret;
> +
> + ret = sbi_ecall(SBI_EXT_ANDES, AX45MP_SBI_EXT_GET_MISA_CTL_STATUS,
> + 0, 0, 0, 0, 0, 0);
> + return ret.value;
> +}
> +
> +static inline uint32_t ax45mp_cpu_l2c_get_cctl_status(void)
> +{
> + return readl(ax45mp_priv->l2c_base + AX45MP_L2C_REG_STATUS_OFFSET);
> +}
> +
> +static inline uint32_t ax45mp_cpu_l2c_ctl_status(void)
> +{
> + return readl(ax45mp_priv->l2c_base + AX45MP_L2C_REG_CTL_OFFSET);
> +}
> +
> +static bool ax45mp_cpu_cache_controlable(void)
> +{
> + return (((ax45mp_cpu_get_micm_cfg_status() & AX45MP_MICM_CFG_ISZ_MASK) ||
> + (ax45mp_cpu_get_mdcm_cfg_status() & AX45MP_MDCM_CFG_DSZ_MASK)) &&
> + (ax45mp_cpu_get_misa_cfg_status() & AX45MP_MISA_20_MASK) &&
> + (ax45mp_cpu_get_mmsc_cfg_status() & AX45MP_MMSC_CFG_CCTLCSR_MASK) &&
> + (ax45mp_cpu_get_mcache_ctl_status() & AX45MP_MCACHE_CTL_CCTL_SUEN_MASK));
> +}
> +
> +static void ax45mp_cpu_dcache_wb_range(void *start, void *end, int line_size)
> +{
> + void __iomem *base = ax45mp_priv->l2c_base;
> + unsigned long pa;
> + int mhartid = 0;
> +#ifdef CONFIG_SMP
> + mhartid = smp_processor_id();
> +#endif

This doesn't need an #ifdef. smp_processor_id() already returns zero
when SMP is disabled.

> +
> + while (end > start) {
> + if (ax45mp_priv->ucctl_ok) {
> + csr_write(AX45MP_CCTL_REG_UCCTLBEGINADDR_NUM, start);
> + csr_write(AX45MP_CCTL_REG_UCCTLCOMMAND_NUM, AX45MP_CCTL_L1D_VA_WB);
> + }
> +
> + if (ax45mp_priv->l2cache_enabled) {
> + pa = virt_to_phys(start);
> + writel(pa, base + AX45MP_L2C_REG_CN_ACC_OFFSET(mhartid));
> + writel(AX45MP_CCTL_L2_PA_WB,
> + base + AX45MP_L2C_REG_CN_CMD_OFFSET(mhartid));
> + while ((ax45mp_cpu_l2c_get_cctl_status() &
> + AX45MP_CCTL_L2_STATUS_CN_MASK(mhartid)) !=
> + AX45MP_CCTL_L2_STATUS_IDLE)
> + ;
> + }
> +
> + start += line_size;
> + }
> +}
> +
> +static void ax45mp_cpu_dcache_inval_range(void *start, void *end, int line_size)
> +{
> + void __iomem *base = ax45mp_priv->l2c_base;
> + unsigned long pa;
> + int mhartid = 0;
> +#ifdef CONFIG_SMP
> + mhartid = smp_processor_id();
> +#endif
> +
> + while (end > start) {
> + if (ax45mp_priv->ucctl_ok) {
> + csr_write(AX45MP_CCTL_REG_UCCTLBEGINADDR_NUM, start);
> + csr_write(AX45MP_CCTL_REG_UCCTLCOMMAND_NUM, AX45MP_CCTL_L1D_VA_INVAL);
> + }
> +
> + if (ax45mp_priv->l2cache_enabled) {
> + pa = virt_to_phys(start);
> + writel(pa, base + AX45MP_L2C_REG_CN_ACC_OFFSET(mhartid));
> + writel(AX45MP_CCTL_L2_PA_INVAL,
> + base + AX45MP_L2C_REG_CN_CMD_OFFSET(mhartid));
> + while ((ax45mp_cpu_l2c_get_cctl_status() &
> + AX45MP_CCTL_L2_STATUS_CN_MASK(mhartid)) !=
> + AX45MP_CCTL_L2_STATUS_IDLE)
> + ;
> + }
> +
> + start += line_size;
> + }
> +}
> +
> +static void ax45mp_cpu_dma_inval_range(void *vaddr, size_t size)
> +{
> + char cache_buf[2][AX45MP_MAX_CACHE_LINE_SIZE];
> + unsigned long start = (unsigned long)vaddr;
> + unsigned long end = start + size;
> + unsigned long old_start = start;
> + unsigned long old_end = end;
> + unsigned long line_size;
> + unsigned long flags;
> +
> + if (static_branch_unlikely(&ax45mp_l2c_configured) && !ax45mp_priv)
> + return;
> +
> + if (unlikely(start == end))
> + return;
> +
> + line_size = ax45mp_priv->ax45mp_cache_line_size;
> +
> + memset(&cache_buf, 0x0, sizeof(cache_buf));
> + start = start & (~(line_size - 1));
> + end = ((end + line_size - 1) & (~(line_size - 1)));
> +
> + local_irq_save(flags);
> + if (unlikely(start != old_start))
> + memcpy(&cache_buf[0][0], (void *)start, line_size);
> +
> + if (unlikely(end != old_end))
> + memcpy(&cache_buf[1][0], (void *)(old_end & (~(line_size - 1))), line_size);

The memcpy dance is only required if ax45mp_cache_line_size is larger
than ARCH_DMA_MINALIGN. Is that actually the case in practice? If not,
you could verify this in the probe function, and remove this logic.

> +
> + ax45mp_cpu_dcache_inval_range(vaddr, (void *)end, line_size);
> +
> + if (unlikely(start != old_start))
> + memcpy((void *)start, &cache_buf[0][0], (old_start & (line_size - 1)));
> +
> + if (unlikely(end != old_end))
> + memcpy((void *)(old_end + 1),
> + &cache_buf[1][(old_end & (line_size - 1)) + 1],
> + end - old_end - 1);
> +
> + local_irq_restore(flags);
> +}
> +
> +static void ax45mp_cpu_dma_wb_range(void *vaddr, size_t size)
> +{
> + unsigned long start = (unsigned long)vaddr;
> + unsigned long end = start + size;
> + unsigned long line_size;
> + unsigned long flags;
> +
> + if (static_branch_unlikely(&ax45mp_l2c_configured) && !ax45mp_priv)
> + return;
> +
> + line_size = ax45mp_priv->ax45mp_cache_line_size;
> + local_irq_save(flags);
> + start = start & (~(line_size - 1));
> + ax45mp_cpu_dcache_wb_range(vaddr, (void *)end, line_size);
> + local_irq_restore(flags);
> +}
> +
> +void ax45mp_no_iocp_cmo(unsigned int cache_size, void *vaddr, size_t size, int dir, int ops)
> +{
> + if (ops == NON_COHERENT_DMA_PREP)
> + return;
> +
> + if (ops == NON_COHERENT_SYNC_DMA_FOR_DEVICE) {
> + switch (dir) {
> + case DMA_FROM_DEVICE:
> + ax45mp_cpu_dma_inval_range(vaddr, size);
> + break;
> + case DMA_TO_DEVICE:
> + case DMA_BIDIRECTIONAL:
> + ax45mp_cpu_dma_wb_range(vaddr, size);
> + break;
> + default:
> + break;
> + }
> + return;
> + }
> +
> + /* op == NON_COHERENT_SYNC_DMA_FOR_CPU */
> + if (dir == DMA_BIDIRECTIONAL || dir == DMA_FROM_DEVICE)
> + ax45mp_cpu_dma_inval_range(vaddr, size);
> +}
> +EXPORT_SYMBOL(ax45mp_no_iocp_cmo);
> +
> +static int ax45mp_configure_l2_cache(struct device_node *np)
> +{
> + int ret;
> +
> + ret = of_property_read_u32(np, "cache-line-size", &ax45mp_priv->ax45mp_cache_line_size);
> + if (ret) {
> + pr_err("Failed to get cache-line-size defaulting to 64 bytes\n");
> + ax45mp_priv->ax45mp_cache_line_size = SZ_64;
> + }
> +
> + if (ax45mp_priv->ax45mp_cache_line_size != SZ_64) {
> + pr_err("Expected cache-line-size to 64 bytes (found:%u). Defaulting to 64 bytes\n",
> + ax45mp_priv->ax45mp_cache_line_size);
> + ax45mp_priv->ax45mp_cache_line_size = SZ_64;
> + }

Ah, so you already do this. And SZ_64 == ARCH_DMA_MINALIGN. So you do
not need the memcpy logic.

> +
> + ax45mp_priv->ucctl_ok = ax45mp_cpu_cache_controlable();
> + ax45mp_priv->l2cache_enabled = ax45mp_cpu_l2c_ctl_status() & AX45MP_L2_CACHE_CTL_CEN_MASK;
> +
> + return 0;
> +}
> +
> +static int ax45mp_l2c_probe(struct platform_device *pdev)
> +{
> + struct device_node *np = pdev->dev.of_node;
> + int ret;
> +
> + ax45mp_priv = devm_kzalloc(&pdev->dev, sizeof(*ax45mp_priv), GFP_KERNEL);
> + if (!ax45mp_priv)
> + return -ENOMEM;
> +
> + ax45mp_priv->l2c_base = devm_of_iomap(&pdev->dev, pdev->dev.of_node, 0, NULL);

devm_platform_ioremap_resource()

> + if (!ax45mp_priv->l2c_base) {
> + ret = -ENOMEM;
> + goto l2c_err;
> + }
> +
> + ret = ax45mp_configure_l2_cache(np);
> + if (ret)
> + goto l2c_err;
> +
> + ret = ax45mp_configure_pma_regions(np);
> + if (ret)
> + goto l2c_err;
> +
> + static_branch_disable(&ax45mp_l2c_configured);

Instead of enabling this before the probe function, and disabling it
afterward, just enable it once here, in the success case. Then you can
drop the !ax45mp_priv check in the functions above.

And none of the functions would get called anyway if the alternative is
not applied. I suppose it's not possible to do some of this probe logic
in the alternative check function?

> +
> + return 0;
> +
> +l2c_err:
> + devm_kfree(&pdev->dev, ax45mp_priv);
> + ax45mp_priv = NULL;

None of this cleanup is necessary.

Regards,
Samuel

> + return ret;
> +}
> +
> +static const struct of_device_id ax45mp_cache_ids[] = {
> + { .compatible = "andestech,ax45mp-cache" },
> + { /* sentinel */ }
> +};
> +
> +static struct platform_driver ax45mp_l2c_driver = {
> + .driver = {
> + .name = "ax45mp-l2c",
> + .of_match_table = ax45mp_cache_ids,
> + },
> + .probe = ax45mp_l2c_probe,
> +};
> +
> +static int __init ax45mp_cache_init(void)
> +{
> + static_branch_enable(&ax45mp_l2c_configured);
> + return platform_driver_register(&ax45mp_l2c_driver);
> +}
> +arch_initcall(ax45mp_cache_init);
> +
> +MODULE_AUTHOR("Lad Prabhakar <[email protected]>");
> +MODULE_DESCRIPTION("Andes AX45MP L2 cache driver");
> +MODULE_LICENSE("GPL");
> diff --git a/drivers/soc/renesas/rzfive/ax45mp_sbi.h b/drivers/soc/renesas/rzfive/ax45mp_sbi.h
> new file mode 100644
> index 000000000000..1604874954d0
> --- /dev/null
> +++ b/drivers/soc/renesas/rzfive/ax45mp_sbi.h
> @@ -0,0 +1,29 @@
> +/* SPDX-License-Identifier: GPL-2.0+ */
> +
> +#ifndef __AX45MP_SBI_H
> +#define __AX45MP_SBI_H
> +
> +#define SBI_EXT_ANDES 0x0900031E
> +
> +enum ax45mp_sbi_ext_fid {
> + AX45MP_SBI_EXT_GET_MCACHE_CTL_STATUS = 0,
> + AX45MP_SBI_EXT_GET_MMISC_CTL_STATUS,
> + AX45MP_SBI_EXT_SET_MCACHE_CTL,
> + AX45MP_SBI_EXT_SET_MMISC_CTL,
> + AX45MP_SBI_EXT_ICACHE_OP,
> + AX45MP_SBI_EXT_DCACHE_OP,
> + AX45MP_SBI_EXT_L1CACHE_I_PREFETCH,
> + AX45MP_SBI_EXT_L1CACHE_D_PREFETCH,
> + AX45MP_SBI_EXT_NON_BLOCKING_LOAD_STORE,
> + AX45MP_SBI_EXT_WRITE_AROUND,
> + AX45MP_SBI_EXT_SET_PMA,
> + AX45MP_SBI_EXT_FREE_PMA,
> + AX45MP_SBI_EXT_PROBE_PMA,
> + AX45MP_SBI_EXT_DCACHE_WBINVAL_ALL,
> + AX45MP_SBI_EXT_GET_MICM_CTL_STATUS,
> + AX45MP_SBI_EXT_GET_MDCM_CTL_STATUS,
> + AX45MP_SBI_EXT_GET_MMSC_CTL_STATUS,
> + AX45MP_SBI_EXT_GET_MISA_CTL_STATUS,
> +};
> +
> +#endif

2022-11-25 21:42:42

by Lad, Prabhakar

Subject: Re: [PATCH v4 5/7] riscv: mm: dma-noncoherent: Pass direction and operation to ALT_CMO_OP()

Hi Samuel,

Thank you for the review.

On Fri, Nov 25, 2022 at 6:49 PM Samuel Holland <[email protected]> wrote:
>
> On 11/24/22 13:18, Lad, Prabhakar wrote:
> > Hi Heiko,
> >
> > Thank you for the review.
> >
> > On Thu, Nov 24, 2022 at 6:29 PM Heiko Stübner <[email protected]> wrote:
> >>
> >> Am Donnerstag, 24. November 2022, 18:22:05 CET schrieb Prabhakar:
> >>> From: Lad Prabhakar <[email protected]>
> >>>
> >>> Pass direction and operation to ALT_CMO_OP() macro.
> >>>
> >>> This is in preparation for adding errata for the Andes CPU core.
> >>
> >> can you provide more explanation why that is necessary please?
> >> I guess you want to use different cache operations for some cases?
> >>
> > Yes basically to call different cache operations based on the dir and
> > operations (and also this allows to export just one function to handle
> > the errata). I'll update the commit message in the next version.
>
> This makes things less efficient, because it requires more instructions
> and registers inside the alternative section, and your function
> duplicates the logic from arch_sync_dma_for_device(). The alternative is
> already passed the operation (clean/flush/invalidate) as a token, so you
> can construct the function name with token pasting.
>
I did think about it, but that didn't help. For example,
arch_dma_prep_coherent() passes the flush token, but on RZ/Five
arch_dma_prep_coherent() has to do nothing.

Cheers,
Prabhakar

2022-11-26 21:46:14

by Lad, Prabhakar

Subject: Re: [PATCH v4 7/7] soc: renesas: Add L2 cache management for RZ/Five SoC

Hi Samuel,

Thank you for the review.

On Fri, Nov 25, 2022 at 7:43 PM Samuel Holland <[email protected]> wrote:
>
> Hi Prabhakar,
>
> On 11/24/22 11:22, Prabhakar wrote:
> > From: Lad Prabhakar <[email protected]>
> >
> > On the AX45MP core, cache coherency is a specification option so it may
> > not be supported. In this case DMA will fail. As a workaround, firstly we
> > allocate a global dma coherent pool from which DMA allocations are taken
> > and marked as non-cacheable + bufferable using the PMA region as specified
> > in the device tree. Synchronization callbacks are implemented to
> > synchronize when doing DMA transactions.
> >
> > The Andes AX45MP core has a Programmable Physical Memory Attributes (PMA)
> > block that allows dynamic adjustment of memory attributes in the runtime.
> > It contains a configurable amount of PMA entries implemented as CSR
> > registers to control the attributes of memory locations in interest.
> >
> > Below are the memory attributes supported:
> > * Device, Non-bufferable
> > * Device, bufferable
> > * Memory, Non-cacheable, Non-bufferable
> > * Memory, Non-cacheable, Bufferable
> > * Memory, Write-back, No-allocate
> > * Memory, Write-back, Read-allocate
> > * Memory, Write-back, Write-allocate
> > * Memory, Write-back, Read and Write-allocate
> >
> > This patch adds support to configure the memory attributes of the memory
> > regions as passed from the l2 cache node and exposes the cache management
> > ops.
>
> Forgive my ignorance, but why do you need both a DMA pool and explicit
> cache maintenance? Wouldn't the purpose of marking a memory region as
> permanently non-cacheable be to avoid cache maintenance? And likewise,
> if you are doing cache maintenance anyway, why does it matter if/how the
> memory is cacheable?
>
"Memory, Non-cacheable, Bufferable" raises an AXI signal for
transactions, so cache maintenance still needs a SW implementation.

> > More info about PMA (section 10.3):
> > Link: http://www.andestech.com/wp-content/uploads/AX45MP-1C-Rev.-5.0.0-Datasheet.pdf
> >
> > Signed-off-by: Lad Prabhakar <[email protected]>
> > ---
> > RFC v3 -> v4
> > * Made use of runtime patching instead of compile time
> > * Now just exposing single function ax45mp_no_iocp_cmo() for CMO handling
> > * Added a check to make sure cache line size is always 64 bytes
> > * Renamed folder rzf -> rzfive
> > * Improved Kconfig description
> > * Dropped L2 cache configuration
> > * Dropped unnecessary casts
> > * Fixed comments pointed by Geert, apart from use of PTR_ALIGN_XYZ() macros.
> > ---
> > arch/riscv/include/asm/cacheflush.h | 8 +
> > arch/riscv/include/asm/errata_list.h | 32 +-
> > drivers/soc/renesas/Kconfig | 7 +
> > drivers/soc/renesas/Makefile | 2 +
> > drivers/soc/renesas/rzfive/Kconfig | 6 +
> > drivers/soc/renesas/rzfive/Makefile | 3 +
> > drivers/soc/renesas/rzfive/ax45mp_cache.c | 415 ++++++++++++++++++++++
> > drivers/soc/renesas/rzfive/ax45mp_sbi.h | 29 ++
> > 8 files changed, 496 insertions(+), 6 deletions(-)
> > create mode 100644 drivers/soc/renesas/rzfive/Kconfig
> > create mode 100644 drivers/soc/renesas/rzfive/Makefile
> > create mode 100644 drivers/soc/renesas/rzfive/ax45mp_cache.c
> > create mode 100644 drivers/soc/renesas/rzfive/ax45mp_sbi.h
> >
> > diff --git a/arch/riscv/include/asm/cacheflush.h b/arch/riscv/include/asm/cacheflush.h
> > index 4a04d1be7c67..3226f3aceafe 100644
> > --- a/arch/riscv/include/asm/cacheflush.h
> > +++ b/arch/riscv/include/asm/cacheflush.h
> > @@ -61,6 +61,14 @@ static inline void riscv_noncoherent_supported(void) {}
> > #define SYS_RISCV_FLUSH_ICACHE_LOCAL 1UL
> > #define SYS_RISCV_FLUSH_ICACHE_ALL (SYS_RISCV_FLUSH_ICACHE_LOCAL)
> >
> > +#ifdef CONFIG_AX45MP_L2_CACHE
> > +extern asmlinkage void ax45mp_no_iocp_cmo(unsigned int cache_size, void *vaddr,
> > + size_t size, int dir, int ops);
> > +#else
> > +inline void ax45mp_no_iocp_cmo(unsigned int cache_size, void *vaddr,
> > + size_t size, int dir, int ops) {}
> > +#endif
> > +
> > #include <asm-generic/cacheflush.h>
> >
> > #endif /* _ASM_RISCV_CACHEFLUSH_H */
> > diff --git a/arch/riscv/include/asm/errata_list.h b/arch/riscv/include/asm/errata_list.h
> > index 48e899a8e7a9..300fed3bfd80 100644
> > --- a/arch/riscv/include/asm/errata_list.h
> > +++ b/arch/riscv/include/asm/errata_list.h
> > @@ -125,8 +125,8 @@ asm volatile(ALTERNATIVE( \
> > #define THEAD_SYNC_S ".long 0x0190000b"
> >
> > #define ALT_CMO_OP(_op, _start, _size, _cachesize, _dir, _ops) \
> > -asm volatile(ALTERNATIVE_2( \
> > - __nops(6), \
> > +asm volatile(ALTERNATIVE_3( \
> > + __nops(14), \
> > "mv a0, %1\n\t" \
> > "j 2f\n\t" \
> > "3:\n\t" \
> > @@ -134,7 +134,7 @@ asm volatile(ALTERNATIVE_2( \
> > "add a0, a0, %0\n\t" \
> > "2:\n\t" \
> > "bltu a0, %2, 3b\n\t" \
> > - "nop", 0, CPUFEATURE_ZICBOM, CONFIG_RISCV_ISA_ZICBOM, \
> > + __nops(8), 0, CPUFEATURE_ZICBOM, CONFIG_RISCV_ISA_ZICBOM, \
> > "mv a0, %1\n\t" \
> > "j 2f\n\t" \
> > "3:\n\t" \
> > @@ -142,8 +142,28 @@ asm volatile(ALTERNATIVE_2( \
> > "add a0, a0, %0\n\t" \
> > "2:\n\t" \
> > "bltu a0, %2, 3b\n\t" \
> > - THEAD_SYNC_S, THEAD_VENDOR_ID, \
> > - ERRATA_THEAD_CMO, CONFIG_ERRATA_THEAD_CMO) \
> > + THEAD_SYNC_S "\n\t" \
> > + __nops(8), THEAD_VENDOR_ID, \
> > + ERRATA_THEAD_CMO, CONFIG_ERRATA_THEAD_CMO, \
> > + ".option push\n\t\n\t" \
> > + ".option norvc\n\t" \
> > + ".option norelax\n\t" \
> > + "addi sp,sp,-16\n\t" \
> > + "sd s0,0(sp)\n\t" \
> > + "sd ra,8(sp)\n\t" \
> > + "addi s0,sp,16\n\t" \
> > + "mv a4,%6\n\t" \
> > + "mv a3,%5\n\t" \
> > + "mv a2,%4\n\t" \
> > + "mv a1,%3\n\t" \
> > + "mv a0,%0\n\t" \
> > + "call ax45mp_no_iocp_cmo\n\t" \
> > + "ld ra,8(sp)\n\t" \
> > + "ld s0,0(sp)\n\t" \
> > + "addi sp,sp,16\n\t" \
> > + ".option pop\n\t", \
> > + ANDESTECH_VENDOR_ID, ERRATA_ANDESTECH_NO_IOCP, \
> > + CONFIG_ERRATA_ANDES_CMO) \
> > : : "r"(_cachesize), \
> > "r"((unsigned long)(_start) & ~((_cachesize) - 1UL)), \
> > "r"((unsigned long)(_start) + (_size)), \
> > @@ -151,7 +171,7 @@ asm volatile(ALTERNATIVE_2( \
> > "r"((unsigned long)(_size)), \
> > "r"((unsigned long)(_dir)), \
> > "r"((unsigned long)(_ops)) \
> > - : "a0")
> > + : "a0", "a1", "a2", "a3", "a4", "memory")
> >
> > #define THEAD_C9XX_RV_IRQ_PMU 17
> > #define THEAD_C9XX_CSR_SCOUNTEROF 0x5c5
> > diff --git a/drivers/soc/renesas/Kconfig b/drivers/soc/renesas/Kconfig
> > index 660498252ec5..e7810256c60d 100644
> > --- a/drivers/soc/renesas/Kconfig
> > +++ b/drivers/soc/renesas/Kconfig
> > @@ -340,9 +340,16 @@ if RISCV
> > config ARCH_R9A07G043
> > bool "RISC-V Platform support for RZ/Five"
> > select ARCH_RZG2L
> > + select AX45MP_L2_CACHE
> > + select DMA_GLOBAL_POOL
> > + select ERRATA_ANDES
> > + select ERRATA_ANDES_CMO
> > + select RISCV_DMA_NONCOHERENT
> > help
> > This enables support for the Renesas RZ/Five SoC.
> >
> > +source "drivers/soc/renesas/rzfive/Kconfig"
> > +
> > endif # RISCV
> >
> > config RST_RCAR
> > diff --git a/drivers/soc/renesas/Makefile b/drivers/soc/renesas/Makefile
> > index 535868c9c7e4..9df9f759a039 100644
> > --- a/drivers/soc/renesas/Makefile
> > +++ b/drivers/soc/renesas/Makefile
> > @@ -31,6 +31,8 @@ ifdef CONFIG_SMP
> > obj-$(CONFIG_ARCH_R9A06G032) += r9a06g032-smp.o
> > endif
> >
> > +obj-$(CONFIG_RISCV) += rzfive/
> > +
> > # Family
> > obj-$(CONFIG_RST_RCAR) += rcar-rst.o
> > obj-$(CONFIG_SYSC_RCAR) += rcar-sysc.o
> > diff --git a/drivers/soc/renesas/rzfive/Kconfig b/drivers/soc/renesas/rzfive/Kconfig
> > new file mode 100644
> > index 000000000000..b6bc00337d99
> > --- /dev/null
> > +++ b/drivers/soc/renesas/rzfive/Kconfig
> > @@ -0,0 +1,6 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > +
> > +config AX45MP_L2_CACHE
> > + bool "Andes Technology AX45MP L2 Cache controller"
> > + help
> > + Support for the L2 cache controller on Andes Technology AX45MP platforms.
> > diff --git a/drivers/soc/renesas/rzfive/Makefile b/drivers/soc/renesas/rzfive/Makefile
> > new file mode 100644
> > index 000000000000..2012e7fb978d
> > --- /dev/null
> > +++ b/drivers/soc/renesas/rzfive/Makefile
> > @@ -0,0 +1,3 @@
> > +# SPDX-License-Identifier: GPL-2.0
> > +
> > +obj-$(CONFIG_AX45MP_L2_CACHE) += ax45mp_cache.o
> > diff --git a/drivers/soc/renesas/rzfive/ax45mp_cache.c b/drivers/soc/renesas/rzfive/ax45mp_cache.c
> > new file mode 100644
> > index 000000000000..4e0d0545d3af
> > --- /dev/null
> > +++ b/drivers/soc/renesas/rzfive/ax45mp_cache.c
> > @@ -0,0 +1,415 @@
> > +// SPDX-License-Identifier: GPL-2.0
> > +/*
> > + * PMA setup and non-coherent cache functions for Andes AX45MP
> > + *
> > + * Copyright (C) 2022 Renesas Electronics Corp.
> > + */
> > +
> > +#include <linux/cacheflush.h>
> > +#include <linux/cacheinfo.h>
> > +#include <linux/dma-direction.h>
> > +#include <linux/of_address.h>
> > +#include <linux/of_platform.h>
> > +
> > +#include <asm/cacheflush.h>
> > +#include <asm/sbi.h>
> > +
> > +#include "ax45mp_sbi.h"
> > +
> > +/* L2 cache registers */
> > +#define AX45MP_L2C_REG_CTL_OFFSET 0x8
> > +
> > +#define AX45MP_L2C_REG_C0_CMD_OFFSET 0x40
> > +#define AX45MP_L2C_REG_C0_ACC_OFFSET 0x48
> > +#define AX45MP_L2C_REG_STATUS_OFFSET 0x80
> > +
> > +/* D-cache operation */
> > +#define AX45MP_CCTL_L1D_VA_INVAL 0
> > +#define AX45MP_CCTL_L1D_VA_WB 1
> > +
> > +/* L2 cache */
> > +#define AX45MP_L2_CACHE_CTL_CEN_MASK 1
> > +
> > +/* L2 CCTL status */
> > +#define AX45MP_CCTL_L2_STATUS_IDLE 0
> > +
> > +/* L2 CCTL status cores mask */
> > +#define AX45MP_CCTL_L2_STATUS_C0_MASK 0xf
> > +
> > +/* L2 cache operation */
> > +#define AX45MP_CCTL_L2_PA_INVAL 0x8
> > +#define AX45MP_CCTL_L2_PA_WB 0x9
> > +
> > +#define AX45MP_L2C_HPM_PER_CORE_OFFSET 0x8
> > +#define AX45MP_L2C_REG_PER_CORE_OFFSET 0x10
> > +#define AX45MP_CCTL_L2_STATUS_PER_CORE_OFFSET 4
> > +
> > +#define AX45MP_L2C_REG_CN_CMD_OFFSET(n) \
> > + (AX45MP_L2C_REG_C0_CMD_OFFSET + ((n) * AX45MP_L2C_REG_PER_CORE_OFFSET))
> > +#define AX45MP_L2C_REG_CN_ACC_OFFSET(n) \
> > + (AX45MP_L2C_REG_C0_ACC_OFFSET + ((n) * AX45MP_L2C_REG_PER_CORE_OFFSET))
> > +#define AX45MP_CCTL_L2_STATUS_CN_MASK(n) \
> > + (AX45MP_CCTL_L2_STATUS_C0_MASK << ((n) * AX45MP_CCTL_L2_STATUS_PER_CORE_OFFSET))
> > +
> > +#define AX45MP_MICM_CFG_ISZ_OFFSET 6
> > +#define AX45MP_MICM_CFG_ISZ_MASK (0x7 << AX45MP_MICM_CFG_ISZ_OFFSET)
> > +
> > +#define AX45MP_MDCM_CFG_DSZ_OFFSET 6
> > +#define AX45MP_MDCM_CFG_DSZ_MASK (0x7 << AX45MP_MDCM_CFG_DSZ_OFFSET)
> > +
> > +#define AX45MP_CCTL_REG_UCCTLBEGINADDR_NUM 0x80b
> > +#define AX45MP_CCTL_REG_UCCTLCOMMAND_NUM 0x80c
> > +
> > +#define AX45MP_MCACHE_CTL_CCTL_SUEN_OFFSET 8
> > +#define AX45MP_MMSC_CFG_CCTLCSR_OFFSET 16
> > +#define AX45MP_MISA_20_OFFSET 20
> > +
> > +#define AX45MP_MCACHE_CTL_CCTL_SUEN_MASK (0x1 << AX45MP_MCACHE_CTL_CCTL_SUEN_OFFSET)
> > +#define AX45MP_MMSC_CFG_CCTLCSR_MASK (0x1 << AX45MP_MMSC_CFG_CCTLCSR_OFFSET)
> > +#define AX45MP_MISA_20_MASK (0x1 << AX45MP_MISA_20_OFFSET)
> > +
> > +#define AX45MP_MAX_CACHE_LINE_SIZE 256
> > +
> > +#define AX45MP_MAX_PMA_REGIONS 16
> > +
> > +struct ax45mp_priv {
> > + void __iomem *l2c_base;
> > + u32 ax45mp_cache_line_size;
> > + bool l2cache_enabled;
> > + bool ucctl_ok;
> > +};
> > +
> > +static struct ax45mp_priv *ax45mp_priv;
> > +static DEFINE_STATIC_KEY_FALSE(ax45mp_l2c_configured);
> > +
> > +/* PMA setup */
> > +static long ax45mp_sbi_set_pma(unsigned long start,
> > + unsigned long size,
> > + unsigned long flags,
> > + unsigned int entry_id)
> > +{
> > + struct sbiret ret;
> > +
> > + ret = sbi_ecall(SBI_EXT_ANDES, AX45MP_SBI_EXT_SET_PMA,
> > + start, size, entry_id, flags, 0, 0);
> > +
> > + return ret.value;
> > +}
> > +
> > +static int ax45mp_configure_pma_regions(struct device_node *np)
> > +{
> > + const char *propname = "andestech,pma-regions";
> > + u32 start, size, flags;
> > + unsigned int entry_id;
> > + unsigned int i;
> > + int count;
> > + int ret;
> > +
> > + count = of_property_count_elems_of_size(np, propname, sizeof(u32) * 3);
> > + if (count < 0)
> > + return count;
> > +
> > + if (count > AX45MP_MAX_PMA_REGIONS)
> > + return -EINVAL;
> > +
> > + for (i = 0, entry_id = 0 ; entry_id < count ; i += 3, entry_id++) {
> > + of_property_read_u32_index(np, propname, i, &start);
> > + of_property_read_u32_index(np, propname, i + 1, &size);
> > + of_property_read_u32_index(np, propname, i + 2, &flags);
> > + ret = ax45mp_sbi_set_pma(start, size, flags, entry_id);
> > + if (!ret)
> > + pr_err("Failed to setup PMA region 0x%x - 0x%x flags: 0x%x",
> > + start, start + size, flags);
> > + }
> > +
> > + return 0;
> > +}
>
> If firmware support is required to set up these PMA regions, why is
> Linux doing this at all? The firmware has access to the devicetree as
> well. It can set this up before entering S-mode, and then you don't need
> to expose this capability via an SBI extension. In fact, firmware could
> generate the reserved-memory node based on these regions at runtime (or
> vice versa).
>
That's a good point. I'll do some research on this and get back.

Btw are there any existing examples where the firmware adds DT nodes?

> > +
> > +/* L2 Cache operations */
> > +static uint32_t ax45mp_cpu_get_mcache_ctl_status(void)
> > +{
> > + struct sbiret ret;
> > +
> > + ret = sbi_ecall(SBI_EXT_ANDES, AX45MP_SBI_EXT_GET_MCACHE_CTL_STATUS,
> > + 0, 0, 0, 0, 0, 0);
> > + return ret.value;
> > +}
> > +
> > +static uint32_t ax45mp_cpu_get_micm_cfg_status(void)
> > +{
> > + struct sbiret ret;
> > +
> > + ret = sbi_ecall(SBI_EXT_ANDES, AX45MP_SBI_EXT_GET_MICM_CTL_STATUS,
> > + 0, 0, 0, 0, 0, 0);
> > + return ret.value;
> > +}
> > +
> > +static uint32_t ax45mp_cpu_get_mdcm_cfg_status(void)
> > +{
> > + struct sbiret ret;
> > +
> > + ret = sbi_ecall(SBI_EXT_ANDES, AX45MP_SBI_EXT_GET_MDCM_CTL_STATUS,
> > + 0, 0, 0, 0, 0, 0);
> > + return ret.value;
> > +}
> > +
> > +static uint32_t ax45mp_cpu_get_mmsc_cfg_status(void)
> > +{
> > + struct sbiret ret;
> > +
> > + ret = sbi_ecall(SBI_EXT_ANDES, AX45MP_SBI_EXT_GET_MMSC_CTL_STATUS,
> > + 0, 0, 0, 0, 0, 0);
> > + return ret.value;
> > +}
> > +
> > +static uint32_t ax45mp_cpu_get_misa_cfg_status(void)
> > +{
> > + struct sbiret ret;
> > +
> > + ret = sbi_ecall(SBI_EXT_ANDES, AX45MP_SBI_EXT_GET_MISA_CTL_STATUS,
> > + 0, 0, 0, 0, 0, 0);
> > + return ret.value;
> > +}
> > +
> > +static inline uint32_t ax45mp_cpu_l2c_get_cctl_status(void)
> > +{
> > + return readl(ax45mp_priv->l2c_base + AX45MP_L2C_REG_STATUS_OFFSET);
> > +}
> > +
> > +static inline uint32_t ax45mp_cpu_l2c_ctl_status(void)
> > +{
> > + return readl(ax45mp_priv->l2c_base + AX45MP_L2C_REG_CTL_OFFSET);
> > +}
> > +
> > +static bool ax45mp_cpu_cache_controlable(void)
> > +{
> > + return (((ax45mp_cpu_get_micm_cfg_status() & AX45MP_MICM_CFG_ISZ_MASK) ||
> > + (ax45mp_cpu_get_mdcm_cfg_status() & AX45MP_MDCM_CFG_DSZ_MASK)) &&
> > + (ax45mp_cpu_get_misa_cfg_status() & AX45MP_MISA_20_MASK) &&
> > + (ax45mp_cpu_get_mmsc_cfg_status() & AX45MP_MMSC_CFG_CCTLCSR_MASK) &&
> > + (ax45mp_cpu_get_mcache_ctl_status() & AX45MP_MCACHE_CTL_CCTL_SUEN_MASK));
> > +}
> > +
> > +static void ax45mp_cpu_dcache_wb_range(void *start, void *end, int line_size)
> > +{
> > + void __iomem *base = ax45mp_priv->l2c_base;
> > + unsigned long pa;
> > + int mhartid = 0;
> > +#ifdef CONFIG_SMP
> > + mhartid = smp_processor_id();
> > +#endif
>
> This doesn't need an #ifdef. smp_processor_id() already returns zero
> when SMP is disabled.
>
Ok, I'll get rid of this check.
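
For context on why the hart id matters in these loops: the L2 controller
has one command/access register pair per core, so the value returned by
smp_processor_id() feeds directly into the register offsets. A small
userspace sketch of that arithmetic, reusing the offsets from the patch
(the helper names below are made up for illustration and are not part of
the driver):

```c
#include <stdint.h>

/* Offsets and masks copied from the patch under review. */
#define AX45MP_L2C_REG_C0_CMD_OFFSET		0x40
#define AX45MP_L2C_REG_C0_ACC_OFFSET		0x48
#define AX45MP_L2C_REG_PER_CORE_OFFSET		0x10
#define AX45MP_CCTL_L2_STATUS_C0_MASK		0xf
#define AX45MP_CCTL_L2_STATUS_PER_CORE_OFFSET	4

/* Hypothetical helper: command register offset for core n. */
static inline uint32_t ax45mp_cn_cmd_offset(unsigned int n)
{
	return AX45MP_L2C_REG_C0_CMD_OFFSET + n * AX45MP_L2C_REG_PER_CORE_OFFSET;
}

/* Hypothetical helper: access (address) register offset for core n. */
static inline uint32_t ax45mp_cn_acc_offset(unsigned int n)
{
	return AX45MP_L2C_REG_C0_ACC_OFFSET + n * AX45MP_L2C_REG_PER_CORE_OFFSET;
}

/* Hypothetical helper: bits of the status register that belong to core n. */
static inline uint32_t ax45mp_cn_status_mask(unsigned int n)
{
	return (uint32_t)AX45MP_CCTL_L2_STATUS_C0_MASK
		<< (n * AX45MP_CCTL_L2_STATUS_PER_CORE_OFFSET);
}
```

With n == 0, which is what smp_processor_id() gives on a !SMP build, the
helpers fall back to the C0 registers, so the #ifdef adds nothing.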

> > +
> > + while (end > start) {
> > + if (ax45mp_priv->ucctl_ok) {
> > + csr_write(AX45MP_CCTL_REG_UCCTLBEGINADDR_NUM, start);
> > + csr_write(AX45MP_CCTL_REG_UCCTLCOMMAND_NUM, AX45MP_CCTL_L1D_VA_WB);
> > + }
> > +
> > + if (ax45mp_priv->l2cache_enabled) {
> > + pa = virt_to_phys(start);
> > + writel(pa, base + AX45MP_L2C_REG_CN_ACC_OFFSET(mhartid));
> > + writel(AX45MP_CCTL_L2_PA_WB,
> > + base + AX45MP_L2C_REG_CN_CMD_OFFSET(mhartid));
> > + while ((ax45mp_cpu_l2c_get_cctl_status() &
> > + AX45MP_CCTL_L2_STATUS_CN_MASK(mhartid)) !=
> > + AX45MP_CCTL_L2_STATUS_IDLE)
> > + ;
> > + }
> > +
> > + start += line_size;
> > + }
> > +}
> > +
> > +static void ax45mp_cpu_dcache_inval_range(void *start, void *end, int line_size)
> > +{
> > + void __iomem *base = ax45mp_priv->l2c_base;
> > + unsigned long pa;
> > + int mhartid = 0;
> > +#ifdef CONFIG_SMP
> > + mhartid = smp_processor_id();
> > +#endif
> > +
> > + while (end > start) {
> > + if (ax45mp_priv->ucctl_ok) {
> > + csr_write(AX45MP_CCTL_REG_UCCTLBEGINADDR_NUM, start);
> > + csr_write(AX45MP_CCTL_REG_UCCTLCOMMAND_NUM, AX45MP_CCTL_L1D_VA_INVAL);
> > + }
> > +
> > + if (ax45mp_priv->l2cache_enabled) {
> > + pa = virt_to_phys(start);
> > + writel(pa, base + AX45MP_L2C_REG_CN_ACC_OFFSET(mhartid));
> > + writel(AX45MP_CCTL_L2_PA_INVAL,
> > + base + AX45MP_L2C_REG_CN_CMD_OFFSET(mhartid));
> > + while ((ax45mp_cpu_l2c_get_cctl_status() &
> > + AX45MP_CCTL_L2_STATUS_CN_MASK(mhartid)) !=
> > + AX45MP_CCTL_L2_STATUS_IDLE)
> > + ;
> > + }
> > +
> > + start += line_size;
> > + }
> > +}
> > +
> > +static void ax45mp_cpu_dma_inval_range(void *vaddr, size_t size)
> > +{
> > + char cache_buf[2][AX45MP_MAX_CACHE_LINE_SIZE];
> > + unsigned long start = (unsigned long)vaddr;
> > + unsigned long end = start + size;
> > + unsigned long old_start = start;
> > + unsigned long old_end = end;
> > + unsigned long line_size;
> > + unsigned long flags;
> > +
> > + if (static_branch_unlikely(&ax45mp_l2c_configured) && !ax45mp_priv)
> > + return;
> > +
> > + if (unlikely(start == end))
> > + return;
> > +
> > + line_size = ax45mp_priv->ax45mp_cache_line_size;
> > +
> > + memset(&cache_buf, 0x0, sizeof(cache_buf));
> > + start = start & (~(line_size - 1));
> > + end = ((end + line_size - 1) & (~(line_size - 1)));
> > +
> > + local_irq_save(flags);
> > + if (unlikely(start != old_start))
> > + memcpy(&cache_buf[0][0], (void *)start, line_size);
> > +
> > + if (unlikely(end != old_end))
> > + memcpy(&cache_buf[1][0], (void *)(old_end & (~(line_size - 1))), line_size);
>
> The memcpy dance is only required if ax45mp_cache_line_size is larger
> than ARCH_DMA_MINALIGN. Is that actually the case in practice? If not,
> you could verify this in the probe function, and remove this logic.
>
OK...
> > +
> > + ax45mp_cpu_dcache_inval_range(vaddr, (void *)end, line_size);
> > +
> > + if (unlikely(start != old_start))
> > + memcpy((void *)start, &cache_buf[0][0], (old_start & (line_size - 1)));
> > +
> > + if (unlikely(end != old_end))
> > + memcpy((void *)(old_end + 1),
> > + &cache_buf[1][(old_end & (line_size - 1)) + 1],
> > + end - old_end - 1);
> > +
> > + local_irq_restore(flags);
> > +}
> > +
> > +static void ax45mp_cpu_dma_wb_range(void *vaddr, size_t size)
> > +{
> > + unsigned long start = (unsigned long)vaddr;
> > + unsigned long end = start + size;
> > + unsigned long line_size;
> > + unsigned long flags;
> > +
> > + if (static_branch_unlikely(&ax45mp_l2c_configured) && !ax45mp_priv)
> > + return;
> > +
> > + line_size = ax45mp_priv->ax45mp_cache_line_size;
> > + local_irq_save(flags);
> > + start = start & (~(line_size - 1));
> > + ax45mp_cpu_dcache_wb_range(vaddr, (void *)end, line_size);
> > + local_irq_restore(flags);
> > +}
> > +
> > +void ax45mp_no_iocp_cmo(unsigned int cache_size, void *vaddr, size_t size, int dir, int ops)
> > +{
> > + if (ops == NON_COHERENT_DMA_PREP)
> > + return;
> > +
> > + if (ops == NON_COHERENT_SYNC_DMA_FOR_DEVICE) {
> > + switch (dir) {
> > + case DMA_FROM_DEVICE:
> > + ax45mp_cpu_dma_inval_range(vaddr, size);
> > + break;
> > + case DMA_TO_DEVICE:
> > + case DMA_BIDIRECTIONAL:
> > + ax45mp_cpu_dma_wb_range(vaddr, size);
> > + break;
> > + default:
> > + break;
> > + }
> > + return;
> > + }
> > +
> > + /* op == NON_COHERENT_SYNC_DMA_FOR_CPU */
> > + if (dir == DMA_BIDIRECTIONAL || dir == DMA_FROM_DEVICE)
> > + ax45mp_cpu_dma_inval_range(vaddr, size);
> > +}
> > +EXPORT_SYMBOL(ax45mp_no_iocp_cmo);
> > +
> > +static int ax45mp_configure_l2_cache(struct device_node *np)
> > +{
> > + int ret;
> > +
> > + ret = of_property_read_u32(np, "cache-line-size", &ax45mp_priv->ax45mp_cache_line_size);
> > + if (ret) {
> > + pr_err("Failed to get cache-line-size defaulting to 64 bytes\n");
> > + ax45mp_priv->ax45mp_cache_line_size = SZ_64;
> > + }
> > +
> > + if (ax45mp_priv->ax45mp_cache_line_size != SZ_64) {
> > + pr_err("Expected cache-line-size to 64 bytes (found:%u). Defaulting to 64 bytes\n",
> > + ax45mp_priv->ax45mp_cache_line_size);
> > + ax45mp_priv->ax45mp_cache_line_size = SZ_64;
> > + }
>
> Ah, so you already do this. And SZ_64 == ARCH_DMA_MINALIGN. So you do
> not need the memcpy logic.
>
... I did some initial testing and all seems to be OK. I'll get rid of
that check.
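
A minimal userspace sketch of the alignment reasoning discussed above:
invalidation works on whole cache lines, so bytes outside a buffer are
only at risk when the buffer does not start and end on a line boundary.
Per Samuel's comment, that can only happen if the line size exceeds
ARCH_DMA_MINALIGN. The helper names are placeholders for illustration,
and the line size and minimum alignment are passed in as plain
parameters instead of coming from the kernel:

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Round addr down to the start of its cache line (line is a power of two). */
static inline uintptr_t line_align_down(uintptr_t addr, uintptr_t line)
{
	return addr & ~(line - 1);
}

/* Round addr up to the next cache line boundary (line is a power of two). */
static inline uintptr_t line_align_up(uintptr_t addr, uintptr_t line)
{
	return (addr + line - 1) & ~(line - 1);
}

/*
 * True if invalidating [start, start + size) line by line would also
 * cover bytes that lie outside the buffer.
 */
static inline bool range_has_partial_line(uintptr_t start, size_t size,
					  uintptr_t line)
{
	uintptr_t end = start + size;

	return line_align_down(start, line) != start ||
	       line_align_up(end, line) != end;
}

/*
 * Probe-time style check: saving and restoring partial lines is only
 * needed when the cache line is larger than the minimum DMA alignment.
 */
static inline bool partial_line_save_needed(uintptr_t line, uintptr_t minalign)
{
	return line > minalign;
}
```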

> > +
> > + ax45mp_priv->ucctl_ok = ax45mp_cpu_cache_controlable();
> > + ax45mp_priv->l2cache_enabled = ax45mp_cpu_l2c_ctl_status() & AX45MP_L2_CACHE_CTL_CEN_MASK;
> > +
> > + return 0;
> > +}
> > +
> > +static int ax45mp_l2c_probe(struct platform_device *pdev)
> > +{
> > + struct device_node *np = pdev->dev.of_node;
> > + int ret;
> > +
> > + ax45mp_priv = devm_kzalloc(&pdev->dev, sizeof(*ax45mp_priv), GFP_KERNEL);
> > + if (!ax45mp_priv)
> > + return -ENOMEM;
> > +
> > + ax45mp_priv->l2c_base = devm_of_iomap(&pdev->dev, pdev->dev.of_node, 0, NULL);
>
> devm_platform_ioremap_resource()
>
OK.

> > + if (!ax45mp_priv->l2c_base) {
> > + ret = -ENOMEM;
> > + goto l2c_err;
> > + }
> > +
> > + ret = ax45mp_configure_l2_cache(np);
> > + if (ret)
> > + goto l2c_err;
> > +
> > + ret = ax45mp_configure_pma_regions(np);
> > + if (ret)
> > + goto l2c_err;
> > +
> > + static_branch_disable(&ax45mp_l2c_configured);
>
> Instead of enabling this before the probe function, and disabling it
> afterward, just enable it once here, in the success case. Then you can
> drop the !ax45mp_priv check in the functions above.
>
I think I had tried it but static_branch_unlikely() was always returning true.

> And none of the functions would get called anyway if the alternative is
> not applied. I suppose it's not possible to do some of this probe logic
> in the alternative check function?
>
You mean to check in the vendor errata patch function to see if this
driver has probed?

> > +
> > + return 0;
> > +
> > +l2c_err:
> > + devm_kfree(&pdev->dev, ax45mp_priv);
> > + ax45mp_priv = NULL;
>
> None of this cleanup is necessary.
>
Agreed, I'll drop it.

Cheers,
Prabhakar

2022-11-27 11:08:09

by Geert Uytterhoeven

Subject: Re: [PATCH v4 7/7] soc: renesas: Add L2 cache management for RZ/Five SoC

Hi Prabhakar,

On Sat, Nov 26, 2022 at 10:10 PM Lad, Prabhakar
<[email protected]> wrote:
> On Fri, Nov 25, 2022 at 7:43 PM Samuel Holland <[email protected]> wrote:
> > On 11/24/22 11:22, Prabhakar wrote:
> > > From: Lad Prabhakar <[email protected]>
> > >
> > > On the AX45MP core, cache coherency is a specification option so it may
> > > not be supported. In this case DMA will fail. As a workaround, firstly we
> > > allocate a global dma coherent pool from which DMA allocations are taken
> > > and marked as non-cacheable + bufferable using the PMA region as specified
> > > in the device tree. Synchronization callbacks are implemented to
> > > synchronize when doing DMA transactions.
> > >
> > > The Andes AX45MP core has a Programmable Physical Memory Attributes (PMA)
> > > block that allows dynamic adjustment of memory attributes in the runtime.
> > > It contains a configurable amount of PMA entries implemented as CSR
> > > registers to control the attributes of memory locations in interest.
> > >
> > > Below are the memory attributes supported:
> > > * Device, Non-bufferable
> > > * Device, bufferable
> > > * Memory, Non-cacheable, Non-bufferable
> > > * Memory, Non-cacheable, Bufferable
> > > * Memory, Write-back, No-allocate
> > > * Memory, Write-back, Read-allocate
> > > * Memory, Write-back, Write-allocate
> > > * Memory, Write-back, Read and Write-allocate
> > >
> > > This patch adds support to configure the memory attributes of the memory
> > > regions as passed from the l2 cache node and exposes the cache management
> > > ops.
> >
> > Forgive my ignorance, but why do you need both a DMA pool and explicit
> > cache maintenance? Wouldn't the purpose of marking a memory region as
> > permanently non-cacheable be to avoid cache maintenance? And likewise,
> > if you are doing cache maintenance anyway, why does it matter if/how the
> > memory is cacheable?
> >
> "Memory, Non-cacheable, Bufferable" raises an AXI signal for
> transactions hence needing SW implementation for cache maintenance.
>
> > > More info about PMA (section 10.3):
> > > Link: http://www.andestech.com/wp-content/uploads/AX45MP-1C-Rev.-5.0.0-Datasheet.pdf
> > >
> > > Signed-off-by: Lad Prabhakar <[email protected]>

> > > +static int ax45mp_configure_pma_regions(struct device_node *np)
> > > +{
> > > + const char *propname = "andestech,pma-regions";
> > > + u32 start, size, flags;
> > > + unsigned int entry_id;
> > > + unsigned int i;
> > > + int count;
> > > + int ret;
> > > +
> > > + count = of_property_count_elems_of_size(np, propname, sizeof(u32) * 3);
> > > + if (count < 0)
> > > + return count;
> > > +
> > > + if (count > AX45MP_MAX_PMA_REGIONS)
> > > + return -EINVAL;
> > > +
> > > + for (i = 0, entry_id = 0 ; entry_id < count ; i += 3, entry_id++) {
> > > + of_property_read_u32_index(np, propname, i, &start);
> > > + of_property_read_u32_index(np, propname, i + 1, &size);
> > > + of_property_read_u32_index(np, propname, i + 2, &flags);
> > > + ret = ax45mp_sbi_set_pma(start, size, flags, entry_id);
> > > + if (!ret)
> > > + pr_err("Failed to setup PMA region 0x%x - 0x%x flags: 0x%x",
> > > + start, start + size, flags);
> > > + }
> > > +
> > > + return 0;
> > > +}
> >
> > If firmware support is required to set up these PMA regions, why is
> > Linux doing this at all? The firmware has access to the devicetree as
> > well. It can set this up before entering S-mode, and then you don't need
> > to expose this capability via an SBI extension. In fact, firmware could
> > generate the reserved-memory node based on these regions at runtime (or
> > vice versa).
> >
> That's a good point. I'll do some research on this and get back.
>
> Btw are there any existing examples where the firmware adds DT nodes?

/memory, reserved-memory, optee on ARM, RPC status on R-Car Gen3/4, ...

Gr{oetje,eeting}s,

Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- [email protected]

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
-- Linus Torvalds

2022-11-28 13:01:36

by Lad, Prabhakar

Subject: Re: [PATCH v4 7/7] soc: renesas: Add L2 cache management for RZ/Five SoC

Hi Geert,

On Sun, Nov 27, 2022 at 9:55 AM Geert Uytterhoeven <[email protected]> wrote:
>
> Hi Prabhakar,
>
> On Sat, Nov 26, 2022 at 10:10 PM Lad, Prabhakar
> <[email protected]> wrote:
> > On Fri, Nov 25, 2022 at 7:43 PM Samuel Holland <[email protected]> wrote:
> > > On 11/24/22 11:22, Prabhakar wrote:
> > > > From: Lad Prabhakar <[email protected]>
> > > >
> > > > On the AX45MP core, cache coherency is a specification option so it may
> > > > not be supported. In this case DMA will fail. As a workaround, firstly we
> > > > allocate a global dma coherent pool from which DMA allocations are taken
> > > > and marked as non-cacheable + bufferable using the PMA region as specified
> > > > in the device tree. Synchronization callbacks are implemented to
> > > > synchronize when doing DMA transactions.
> > > >
> > > > The Andes AX45MP core has a Programmable Physical Memory Attributes (PMA)
> > > > block that allows dynamic adjustment of memory attributes in the runtime.
> > > > It contains a configurable amount of PMA entries implemented as CSR
> > > > registers to control the attributes of memory locations in interest.
> > > >
> > > > Below are the memory attributes supported:
> > > > * Device, Non-bufferable
> > > > * Device, bufferable
> > > > * Memory, Non-cacheable, Non-bufferable
> > > > * Memory, Non-cacheable, Bufferable
> > > > * Memory, Write-back, No-allocate
> > > > * Memory, Write-back, Read-allocate
> > > > * Memory, Write-back, Write-allocate
> > > > * Memory, Write-back, Read and Write-allocate
> > > >
> > > > This patch adds support to configure the memory attributes of the memory
> > > > regions as passed from the l2 cache node and exposes the cache management
> > > > ops.
> > >
> > > Forgive my ignorance, but why do you need both a DMA pool and explicit
> > > cache maintenance? Wouldn't the purpose of marking a memory region as
> > > permanently non-cacheable be to avoid cache maintenance? And likewise,
> > > if you are doing cache maintenance anyway, why does it matter if/how the
> > > memory is cacheable?
> > >
> > "Memory, Non-cacheable, Bufferable" raises an AXI signal for
> > transactions hence needing SW implementation for cache maintenance.
> >
> > > > More info about PMA (section 10.3):
> > > > Link: http://www.andestech.com/wp-content/uploads/AX45MP-1C-Rev.-5.0.0-Datasheet.pdf
> > > >
> > > > Signed-off-by: Lad Prabhakar <[email protected]>
>
> > > > +static int ax45mp_configure_pma_regions(struct device_node *np)
> > > > +{
> > > > + const char *propname = "andestech,pma-regions";
> > > > + u32 start, size, flags;
> > > > + unsigned int entry_id;
> > > > + unsigned int i;
> > > > + int count;
> > > > + int ret;
> > > > +
> > > > + count = of_property_count_elems_of_size(np, propname, sizeof(u32) * 3);
> > > > + if (count < 0)
> > > > + return count;
> > > > +
> > > > + if (count > AX45MP_MAX_PMA_REGIONS)
> > > > + return -EINVAL;
> > > > +
> > > > + for (i = 0, entry_id = 0 ; entry_id < count ; i += 3, entry_id++) {
> > > > + of_property_read_u32_index(np, propname, i, &start);
> > > > + of_property_read_u32_index(np, propname, i + 1, &size);
> > > > + of_property_read_u32_index(np, propname, i + 2, &flags);
> > > > + ret = ax45mp_sbi_set_pma(start, size, flags, entry_id);
> > > > + if (!ret)
> > > > + pr_err("Failed to setup PMA region 0x%x - 0x%x flags: 0x%x",
> > > > + start, start + size, flags);
> > > > + }
> > > > +
> > > > + return 0;
> > > > +}
> > >
> > > If firmware support is required to set up these PMA regions, why is
> > > Linux doing this at all? The firmware has access to the devicetree as
> > > well. It can set this up before entering S-mode, and then you don't need
> > > to expose this capability via an SBI extension. In fact, firmware could
> > > generate the reserved-memory node based on these regions at runtime (or
> > > vice versa).
> > >
> > That's a good point. I'll do some research on this and get back.
> >
> > Btw are there any existing examples where the firmware adds DT nodes?
>
> /memory, reserved-memory, optee on ARM, RPC status on R-Car Gen3/4, ...
>
With TF-A, we pass the FDT blob to U-Boot and that does the magic.

On RISC-V, what would be the correct approach?
- We set up the PMA regions in OpenSBI
- We provide a vendor-specific EXT to check if the PMA is set up
- In the U-Boot ft_board_setup() callback, we add the reserved-memory node

Does the above approach sound good or is there a better approach I'm missing?

Cheers,
Prabhakar

2022-11-29 06:17:59

by Samuel Holland

Subject: Re: [PATCH v4 7/7] soc: renesas: Add L2 cache management for RZ/Five SoC

On 11/28/22 06:08, Lad, Prabhakar wrote:
> Hi Geert,
>
> On Sun, Nov 27, 2022 at 9:55 AM Geert Uytterhoeven <[email protected]> wrote:
>>
>> Hi Prabhakar,
>>
>> On Sat, Nov 26, 2022 at 10:10 PM Lad, Prabhakar
>> <[email protected]> wrote:
>>> On Fri, Nov 25, 2022 at 7:43 PM Samuel Holland <[email protected]> wrote:
>>>> On 11/24/22 11:22, Prabhakar wrote:
>>>>> From: Lad Prabhakar <[email protected]>
>>>>>
>>>>> On the AX45MP core, cache coherency is a specification option so it may
>>>>> not be supported. In this case DMA will fail. As a workaround, firstly we
>>>>> allocate a global dma coherent pool from which DMA allocations are taken
>>>>> and marked as non-cacheable + bufferable using the PMA region as specified
>>>>> in the device tree. Synchronization callbacks are implemented to
>>>>> synchronize when doing DMA transactions.
>>>>>
>>>>> The Andes AX45MP core has a Programmable Physical Memory Attributes (PMA)
>>>>> block that allows dynamic adjustment of memory attributes in the runtime.
>>>>> It contains a configurable amount of PMA entries implemented as CSR
>>>>> registers to control the attributes of memory locations in interest.
>>>>>
>>>>> Below are the memory attributes supported:
>>>>> * Device, Non-bufferable
>>>>> * Device, bufferable
>>>>> * Memory, Non-cacheable, Non-bufferable
>>>>> * Memory, Non-cacheable, Bufferable
>>>>> * Memory, Write-back, No-allocate
>>>>> * Memory, Write-back, Read-allocate
>>>>> * Memory, Write-back, Write-allocate
>>>>> * Memory, Write-back, Read and Write-allocate
>>>>>
>>>>> This patch adds support to configure the memory attributes of the memory
>>>>> regions as passed from the l2 cache node and exposes the cache management
>>>>> ops.
>>>>
>>>> Forgive my ignorance, but why do you need both a DMA pool and explicit
>>>> cache maintenance? Wouldn't the purpose of marking a memory region as
>>>> permanently non-cacheable be to avoid cache maintenance? And likewise,
>>>> if you are doing cache maintenance anyway, why does it matter if/how the
>>>> memory is cacheable?
>>>>
>>> "Memory, Non-cacheable, Bufferable" raises an AXI signal for
>>> transactions hence needing SW implementation for cache maintenance.
>>>
>>>>> More info about PMA (section 10.3):
>>>>> Link: http://www.andestech.com/wp-content/uploads/AX45MP-1C-Rev.-5.0.0-Datasheet.pdf
>>>>>
>>>>> Signed-off-by: Lad Prabhakar <[email protected]>
>>
>>>>> +static int ax45mp_configure_pma_regions(struct device_node *np)
>>>>> +{
>>>>> + const char *propname = "andestech,pma-regions";
>>>>> + u32 start, size, flags;
>>>>> + unsigned int entry_id;
>>>>> + unsigned int i;
>>>>> + int count;
>>>>> + int ret;
>>>>> +
>>>>> + count = of_property_count_elems_of_size(np, propname, sizeof(u32) * 3);
>>>>> + if (count < 0)
>>>>> + return count;
>>>>> +
>>>>> + if (count > AX45MP_MAX_PMA_REGIONS)
>>>>> + return -EINVAL;
>>>>> +
>>>>> + for (i = 0, entry_id = 0 ; entry_id < count ; i += 3, entry_id++) {
>>>>> + of_property_read_u32_index(np, propname, i, &start);
>>>>> + of_property_read_u32_index(np, propname, i + 1, &size);
>>>>> + of_property_read_u32_index(np, propname, i + 2, &flags);
>>>>> + ret = ax45mp_sbi_set_pma(start, size, flags, entry_id);
>>>>> + if (!ret)
>>>>> + pr_err("Failed to setup PMA region 0x%x - 0x%x flags: 0x%x",
>>>>> + start, start + size, flags);
>>>>> + }
>>>>> +
>>>>> + return 0;
>>>>> +}
>>>>
>>>> If firmware support is required to set up these PMA regions, why is
>>>> Linux doing this at all? The firmware has access to the devicetree as
>>>> well. It can set this up before entering S-mode, and then you don't need
>>>> to expose this capability via an SBI extension. In fact, firmware could
>>>> generate the reserved-memory node based on these regions at runtime (or
>>>> vice versa).
>>>>
>>> That's a good point. I'll do some research on this and get back.
>>>
>>> Btw are there any existing examples where the firmware adds DT nodes?
>>
>> /memory, reserved-memory, optee on ARM, RPC status on R-Car Gen3/4, ...
>>
> On the TF-A we pass the FDT blob to u-boot and this does the magic.
>
> On the RISC-V what would be the correct approach?
> - We setup the PMA regions in OpenSBI
> - We provide a vendor specific EXT to check if the PMA is setup
> - In u-boot ft_board_setup() callback add the reserved-memory node
>
> Does the above approach sound good or is there a better approach I'm missing?

My suggestion was to fix up the DT in OpenSBI itself. See
lib/utils/fdt/fdt_fixup.c in the OpenSBI source tree. There is also a
platform hook for this. Then OpenSBI passes the FDT to U-Boot, and
U-Boot passes it on to Linux. No SBI extension is needed in that case.
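
As a rough illustration of the fixup step (a sketch under stated
assumptions, not OpenSBI's actual code): for each PMA region the firmware
programs, it would emit a matching reserved-memory "reg" property. The
helper below only models the cell encoding, assuming #address-cells = <2>
and #size-cells = <2> as in the cover letter's example. struct pma_region
and pma_region_to_reg_cells() are made-up names; a real implementation
would go through libfdt from the platform fixup hook.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical description of one PMA region set up by firmware. */
struct pma_region {
	uint64_t base;
	uint64_t size;
};

/*
 * Fill the four cells of a reserved-memory "reg" property, assuming
 * #address-cells = <2> and #size-cells = <2>. Values are in host byte
 * order; a real FDT writer must store each cell big-endian.
 */
static void pma_region_to_reg_cells(const struct pma_region *r,
				    uint32_t cells[4])
{
	assert(r != NULL && cells != NULL);

	cells[0] = (uint32_t)(r->base >> 32);		/* address, upper cell */
	cells[1] = (uint32_t)(r->base & 0xffffffffu);	/* address, lower cell */
	cells[2] = (uint32_t)(r->size >> 32);		/* size, upper cell */
	cells[3] = (uint32_t)(r->size & 0xffffffffu);	/* size, lower cell */
}
```

For the region in the cover letter (base 0x58000000, size 0x08000000)
this gives the cells 0x0 0x58000000 0x0 0x08000000, i.e. the same "reg"
as the linux,dma-default pool shown there.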

If you optionally want your U-Boot to support loading a replacement FDT
from disk, then ft_board_setup() would need to copy the reserved-memory
nodes from U-Boot's control FDT to the loaded FDT. But this logic is the
same for all reserved-memory nodes, including the one OpenSBI adds
already. U-Boot has some code for this copying which you could reuse.

Regards,
Samuel

2022-11-29 06:27:24

by Samuel Holland

Subject: Re: [PATCH v4 7/7] soc: renesas: Add L2 cache management for RZ/Five SoC

On 11/26/22 15:09, Lad, Prabhakar wrote:
>>> + if (!ax45mp_priv->l2c_base) {
>>> + ret = -ENOMEM;
>>> + goto l2c_err;
>>> + }
>>> +
>>> + ret = ax45mp_configure_l2_cache(np);
>>> + if (ret)
>>> + goto l2c_err;
>>> +
>>> + ret = ax45mp_configure_pma_regions(np);
>>> + if (ret)
>>> + goto l2c_err;
>>> +
>>> + static_branch_disable(&ax45mp_l2c_configured);
>>
>> Instead of enabling this before the probe function, and disabling it
>> afterward, just enable it once here, in the success case. Then you can
>> drop the !ax45mp_priv check in the functions above.
>>
> I think I had tried it but static_branch_unlikely() was always returning true.

You use DEFINE_STATIC_KEY_FALSE above, so static_branch_unlikely()
should return false until you call static_branch_enable().
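
To make the intended flow concrete, here is a small userspace model (not
kernel code; a plain bool stands in for the static key, and every name
in it is a placeholder I made up). The key starts out false, the error
path never touches it, and it is enabled exactly once on the success
path of probe:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for a key created with DEFINE_STATIC_KEY_FALSE(). */
static bool l2c_configured_key = false;

/* Stand-in for static_branch_unlikely(): reports the key's current state. */
static bool l2c_configured(void)
{
	return l2c_configured_key;
}

/*
 * Model of a probe routine. setup_ok stands in for the result of the
 * real configuration steps; the key is only enabled on the success path.
 */
static int model_probe(bool setup_ok)
{
	if (!setup_ok)
		return -1;	/* error path: key is left untouched (false) */

	l2c_configured_key = true;	/* stand-in for static_branch_enable() */
	return 0;
}

/* Model of a cache-maintenance entry point guarded by the key. */
static int model_sync(int *ops_done)
{
	assert(ops_done != NULL);

	if (!l2c_configured())
		return 0;	/* driver not probed: nothing to do */

	(*ops_done)++;
	return 1;
}
```

With this shape there is no separate disable call and no need for a
NULL check on the private data in the sync functions, because they are
only reached once the key has been enabled.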

>> And none of the functions would get called anyway if the alternative is
>> not applied. I suppose it's not possible to do some of this probe logic
>> in the alternative check function?
>>
> you mean to check in the vendor errata patch function to see if this
> driver has probed?

I meant to do the equivalent of:

+ ax45mp_priv->ucctl_ok = ax45mp_cpu_cache_controlable();
+ ax45mp_priv->l2cache_enabled = ax45mp_cpu_l2c_ctl_status() &
AX45MP_L2_CACHE_CTL_CEN_MASK;

in the errata function, since that decides if the cache maintenance
functions actually do anything. But ax45mp_cpu_l2c_ctl_status() gets the
MMIO address from the DT, and trying to do that from the errata function
could get ugly, so maybe it is not a good suggestion.

Regards,
Samuel

2022-12-01 13:01:00

by Lad, Prabhakar

Subject: Re: [PATCH v4 7/7] soc: renesas: Add L2 cache management for RZ/Five SoC

Hi Samuel,

On Tue, Nov 29, 2022 at 5:58 AM Samuel Holland <[email protected]> wrote:
>
> On 11/26/22 15:09, Lad, Prabhakar wrote:
> >>> + if (!ax45mp_priv->l2c_base) {
> >>> + ret = -ENOMEM;
> >>> + goto l2c_err;
> >>> + }
> >>> +
> >>> + ret = ax45mp_configure_l2_cache(np);
> >>> + if (ret)
> >>> + goto l2c_err;
> >>> +
> >>> + ret = ax45mp_configure_pma_regions(np);
> >>> + if (ret)
> >>> + goto l2c_err;
> >>> +
> >>> + static_branch_disable(&ax45mp_l2c_configured);
> >>
> >> Instead of enabling this before the probe function, and disabling it
> >> afterward, just enable it once here, in the success case. Then you can
> >> drop the !ax45mp_priv check in the functions above.
> >>
> > I think I had tried it but static_branch_unlikely() was always returning true.
>
> You use DEFINE_STATIC_KEY_FALSE above, so static_branch_unlikely()
> should return false until you call static_branch_enable().
>
OK, got that.

> >> And none of the functions would get called anyway if the alternative is
> >> not applied. I suppose it's not possible to do some of this probe logic
> >> in the alternative check function?
> >>
> > you mean to check in the vendor errata patch function to see if this
> > driver has probed?
>
> I meant to do the equivalent of:
>
> + ax45mp_priv->ucctl_ok = ax45mp_cpu_cache_controlable();
> + ax45mp_priv->l2cache_enabled = ax45mp_cpu_l2c_ctl_status() &
> AX45MP_L2_CACHE_CTL_CEN_MASK;
>
> in the errata function, since that decides if the cache maintenance
> functions actually do anything. But ax45mp_cpu_l2c_ctl_status() gets the
> MMIO address from the DT, and trying to do that from the errata function
> could get ugly, so maybe it is not a good suggestion.
>
Actually I did think about this, and the best approach is to do it in
the errata only, as you suggested. To drop the above checks, my plan is
to introduce a vendor-specific SBI EXT
(RZFIVE_SBI_EXT_IOCP_SW_WORKAROUND) which checks both of the above
conditions, and to apply the errata only on success. That avoids the
"if" checks on every sync operation.

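Roughly, the idea looks like the userspace model below. This is only a
sketch: the function pointer stands in for the alternative being
patched in, the mask value is an assumption for illustration, and all
model_* and sync_* names are placeholders rather than anything from the
series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumed value for illustration: cache-enable bit in the L2 control status. */
#define MODEL_L2_CACHE_CTL_CEN_MASK 0x1u

/* Hypothetical firmware-side check combining both conditions. */
static bool model_iocp_sw_workaround(bool ucctl_ok, uint32_t l2c_ctl_status)
{
	return ucctl_ok &&
	       (l2c_ctl_status & MODEL_L2_CACHE_CTL_CEN_MASK) != 0;
}

typedef void (*sync_fn)(unsigned int *counter);

/* Errata not applied: the sync path does nothing. */
static void sync_nop(unsigned int *counter)
{
	(void)counter;
}

/* Errata applied: count one cache-maintenance operation, no checks. */
static void sync_cmo(unsigned int *counter)
{
	assert(counter != NULL);
	(*counter)++;
}

/* Pick the sync routine once; stands in for applying the alternative. */
static sync_fn model_errata_select(bool ucctl_ok, uint32_t l2c_ctl_status)
{
	return model_iocp_sw_workaround(ucctl_ok, l2c_ctl_status) ?
	       sync_cmo : sync_nop;
}
```

The decision is taken once in model_errata_select(), so the routine
that ends up being called carries no per-call condition checks.
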
Cheers,
Prabhakar

2022-12-02 00:01:20

by Conor Dooley

Subject: Re: [PATCH v4 0/7] AX45MP: Add support to non-coherent DMA

Hi Prabhakar,

I'm going to mark this series as "Changes Requested" in patchwork since
there's been quite a lot of commentary and it looks like Samuel & Heiko
have both suggested some changes.
Scream if you think that's not appropriate :)

Thanks,
Conor.

2022-12-02 10:11:41

by Lad, Prabhakar

Subject: Re: [PATCH v4 0/7] AX45MP: Add support to non-coherent DMA

Hi Conor,

On Thu, Dec 1, 2022 at 11:36 PM Conor Dooley <[email protected]> wrote:
>
> Hi Prabhakar,
>
> I'm going to mark this series as "Changes Requested" in patchwork since
> there's been quite a lot of commentary and it looks like Samuel & Heiko
> have both suggested some changes.
Sounds good, I am working on the changes requested.

Cheers,
Prabhakar