Currently, riscv defines ARCH_DMA_MINALIGN as L1_CACHE_BYTES, i.e.
64 bytes, if CONFIG_RISCV_DMA_NONCOHERENT=y. To support a unified kernel
Image, we usually have to enable CONFIG_RISCV_DMA_NONCOHERENT, which
has two bad effects on coherent platforms:

First, it wastes memory: the kmalloc-96, kmalloc-32, kmalloc-16 and
kmalloc-8 slab caches no longer exist; they are replaced by either
kmalloc-128 or kmalloc-64.

Second, kmalloc allocations aligned more strictly than necessary
result in unnecessary cache/TLB pressure.
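
To make the waste concrete, here is a minimal userspace sketch (not
kernel code) of how a request maps to a kmalloc cache under a 64-byte
versus an 8-byte minimum alignment. The helper kmalloc_bucket() and the
shortened size-class table are assumptions made for illustration only;
the real logic lives in the slab allocator.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Simplified model of the kmalloc size classes up to 256 bytes.
 * Illustrative only; the real allocator has more classes.
 */
static const size_t size_classes[] = { 8, 16, 32, 64, 96, 128, 192, 256 };

/*
 * Hypothetical helper: return the object size of the cache that would
 * serve a request of 'size' bytes when every cache size must be a
 * multiple of 'minalign' (a power of two). Returns 0 if the request
 * does not fit in this toy table.
 */
static size_t kmalloc_bucket(size_t size, size_t minalign)
{
	size_t i;

	assert(minalign != 0 && (minalign & (minalign - 1)) == 0);

	for (i = 0; i < sizeof(size_classes) / sizeof(size_classes[0]); i++) {
		size_t cls = size_classes[i];

		/* Classes that are not a multiple of minalign do not exist. */
		if (cls % minalign != 0)
			continue;
		if (size <= cls)
			return cls;
	}
	return 0;
}
```

In this model a 24-byte request is served from a 64-byte object when
minalign is 64, but from a 32-byte object when minalign is 8.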

This issue also exists on arm64 platforms. Since last year, Catalin
has been working to solve it by decoupling ARCH_KMALLOC_MINALIGN from
ARCH_DMA_MINALIGN, limiting the kmalloc() minimum alignment to
dma_get_cache_alignment(), and replacing ARCH_KMALLOC_MINALIGN usage
in various drivers with ARCH_DMA_MINALIGN, etc. [1]

One fact we can make use of for riscv: if the CPU supports neither
ZICBOM nor T-HEAD CMO, we know the platform is coherent. Based on
Catalin's work and this fact, we can easily solve the kmalloc
alignment issue for riscv: override dma_get_cache_alignment() so that
it returns ARCH_DMA_MINALIGN at the beginning and returns 1 once we
know the underlying HW supports neither ZICBOM nor T-HEAD CMO.
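
The idea can be modelled in plain C as below. This is only a sketch of
the described behaviour, not the code from the series: every model_*
name is hypothetical, and the value 64 is taken from the L1_CACHE_BYTES
figure quoted above.

```c
#include <assert.h>
#include <stdbool.h>

#define MODEL_ARCH_DMA_MINALIGN 64	/* assumed: L1_CACHE_BYTES == 64 */

/* Start conservative: assume noncoherent DMA until boot proves otherwise. */
static int model_dma_cache_alignment = MODEL_ARCH_DMA_MINALIGN;

/* Stand-in for an arch override of dma_get_cache_alignment(). */
static int model_dma_get_cache_alignment(void)
{
	assert(model_dma_cache_alignment >= 1);
	return model_dma_cache_alignment;
}

/*
 * Stand-in for the point during boot where the CPU features are known.
 * Without ZICBOM and without T-HEAD CMO the platform must be coherent,
 * so DMA imposes no alignment requirement on kmalloc().
 */
static void model_cpu_features_known(bool has_zicbom, bool has_thead_cmo)
{
	if (!has_zicbom && !has_thead_cmo)
		model_dma_cache_alignment = 1;
}
```

Before model_cpu_features_known() runs, the getter reports 64; it drops
to 1 only when both feature flags are false.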

So what if the CPU supports ZICBOM or T-HEAD CMO, but all the
devices are DMA coherent? In that case we still use ARCH_DMA_MINALIGN
as the kmalloc minimum alignment, so nothing changes. This case
can be improved in the future.

After this patch, a simple test of booting to a small buildroot rootfs
on qemu shows:

# name      <active_objs> <num_objs> <objsize> ...
kmalloc-96           5041       5041        96 ...
kmalloc-64           9606       9606        64 ...
kmalloc-32           5128       5128        32 ...
kmalloc-16           7682       7682        16 ...
kmalloc-8           10246      10246         8 ...

So we save about 1268KB of memory. The savings will be much larger in a
normal OS environment on real HW platforms.
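
One plausible way to arrive at a figure like this is sketched below. It
assumes that, without the series, each object in a restored cache would
instead occupy the next 64-byte-aligned cache (kmalloc-128 for
kmalloc-96, kmalloc-64 for the rest), and that 1KB means 1000 bytes.
The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

struct slab_row {
	size_t objsize;		/* object size of the restored cache */
	size_t num_objs;	/* object count from the table above */
	size_t fallback;	/* assumed cache size with 64-byte minalign */
};

/* kmalloc-64 is omitted: it exists either way and saves nothing. */
static const struct slab_row rows[] = {
	{ 96,  5041, 128 },
	{ 32,  5128,  64 },
	{ 16,  7682,  64 },
	{  8, 10246,  64 },
};

/* Sum the per-object padding that the smaller caches avoid. */
static size_t bytes_saved(void)
{
	size_t total = 0;
	size_t i;

	for (i = 0; i < sizeof(rows) / sizeof(rows[0]); i++) {
		assert(rows[i].fallback >= rows[i].objsize);
		total += rows[i].num_objs *
			 (rows[i].fallback - rows[i].objsize);
	}
	return total;
}
```

Under these assumptions bytes_saved() evaluates to 1267920 bytes, which
rounds to the 1268KB quoted above.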

Patches 1-4 are either cleanup or preparation patches.
Patch 5 allows kmalloc() caches to be aligned to the smallest value.
Patch 6 enables DMA_BOUNCE_UNALIGNED_KMALLOC.

After this series:

For coherent platforms, i.e. !ZICBOM and !THEAD_CMO, the
kmalloc-{8,16,32,96} caches come back on both RV32 and RV64.

For noncoherent RV32 platforms, nothing changes.

For noncoherent RV64 platforms, i.e. either ZICBOM or THEAD_CMO, the
above kmalloc caches also come back if there is > 4GB of memory, or if
users pass "swiotlb=mmnn,force" to force swiotlb creation when there is
<= 4GB of memory. The right value of mmnn depends on the specific
platform; it needs to be tried and tested against all possible use
cases on the specific hardware. For example, I can use the minimum
number of I/O TLB slabs on Sipeed M1S Dock.
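
The three cases above can be summarised as a small decision function.
This is only a restatement of the cover letter in C; the struct, its
fields and the function name are hypothetical and do not appear in the
series.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct platform {
	bool rv64;		/* false means RV32 */
	bool noncoherent;	/* ZICBOM or T-HEAD CMO is present */
	bool mem_above_4g;	/* more than 4GB of memory */
	bool swiotlb_forced;	/* booted with "swiotlb=mmnn,force" */
};

/* Do the kmalloc-{8,16,32,96} caches come back on this platform? */
static bool small_kmalloc_caches(const struct platform *p)
{
	assert(p != NULL);

	if (!p->noncoherent)
		return true;	/* coherent RV32 and RV64 */
	if (!p->rv64)
		return false;	/* noncoherent RV32: nothing changes */

	/* Noncoherent RV64: a swiotlb must exist to bounce small buffers. */
	return p->mem_above_4g || p->swiotlb_forced;
}
```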
[1] Link: https://lore.kernel.org/linux-arm-kernel/[email protected]/
Jisheng Zhang (6):
riscv: errata: thead: only set cbom size & noncoherent during boot
riscv: mm: mark CBO relate initialization funcs as __init
riscv: mm: mark noncoherent_supported as __ro_after_init
riscv: mm: pass noncoherent or not to riscv_noncoherent_supported()
riscv: allow kmalloc() caches aligned to the smallest value
riscv: enable DMA_BOUNCE_UNALIGNED_KMALLOC for !dma_coherent
arch/riscv/Kconfig | 1 +
arch/riscv/errata/thead/errata.c | 22 ++++++++++++++--------
arch/riscv/include/asm/cache.h | 14 ++++++++++++++
arch/riscv/include/asm/cacheflush.h | 4 ++--
arch/riscv/kernel/setup.c | 6 +++++-
arch/riscv/mm/cacheflush.c | 8 ++++----
arch/riscv/mm/dma-noncoherent.c | 16 +++++++++++-----
7 files changed, 51 insertions(+), 20 deletions(-)
--
2.40.1
The CBOM size and whether the HW is noncoherent are known and
determined during boot and won't change after that, so only set them
at the RISCV_ALTERNATIVES_BOOT stage.
Signed-off-by: Jisheng Zhang <[email protected]>
---
arch/riscv/errata/thead/errata.c | 7 +++++--
1 file changed, 5 insertions(+), 2 deletions(-)
diff --git a/arch/riscv/errata/thead/errata.c b/arch/riscv/errata/thead/errata.c
index c259dc925ec1..be84b14f0118 100644
--- a/arch/riscv/errata/thead/errata.c
+++ b/arch/riscv/errata/thead/errata.c
@@ -45,8 +45,11 @@ static bool errata_probe_cmo(unsigned int stage,
 	if (stage == RISCV_ALTERNATIVES_EARLY_BOOT)
 		return false;
 
-	riscv_cbom_block_size = L1_CACHE_BYTES;
-	riscv_noncoherent_supported();
+	if (stage == RISCV_ALTERNATIVES_BOOT) {
+		riscv_cbom_block_size = L1_CACHE_BYTES;
+		riscv_noncoherent_supported();
+	}
+
 	return true;
 }
 
--
2.40.1
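
The effect of the stage check can be seen in a small standalone model.
It assumes the probe is invoked once per alternatives stage, including
a later module stage; all model_* names are hypothetical stand-ins for
the kernel symbols in the diff above.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the RISCV_ALTERNATIVES_* stages. */
enum model_stage {
	MODEL_EARLY_BOOT,
	MODEL_BOOT,
	MODEL_MODULE,
};

static unsigned int model_cbom_block_size;
static int model_init_count;	/* how often the global state was written */

/* Model of the patched probe: side effects only at the boot stage. */
static bool model_errata_probe_cmo(enum model_stage stage)
{
	if (stage == MODEL_EARLY_BOOT)
		return false;

	if (stage == MODEL_BOOT) {
		model_cbom_block_size = 64;	/* assumed L1_CACHE_BYTES */
		model_init_count++;
	}

	/* The erratum still applies at every later stage. */
	return true;
}
```

Probing at the module stage still reports the erratum but leaves the
global state untouched, which is what allows that state to be treated
as fixed after boot.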
The two functions cbo_get_block_size() and riscv_init_cbo_blocksizes()
are only called during boot, so mark them as __init.
Signed-off-by: Jisheng Zhang <[email protected]>
---
arch/riscv/mm/cacheflush.c | 8 ++++----
1 file changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/riscv/mm/cacheflush.c b/arch/riscv/mm/cacheflush.c
index fca532ddf3ec..fbc59b3f69f2 100644
--- a/arch/riscv/mm/cacheflush.c
+++ b/arch/riscv/mm/cacheflush.c
@@ -104,9 +104,9 @@ EXPORT_SYMBOL_GPL(riscv_cbom_block_size);
 unsigned int riscv_cboz_block_size;
 EXPORT_SYMBOL_GPL(riscv_cboz_block_size);
 
-static void cbo_get_block_size(struct device_node *node,
-			       const char *name, u32 *block_size,
-			       unsigned long *first_hartid)
+static void __init cbo_get_block_size(struct device_node *node,
+				      const char *name, u32 *block_size,
+				      unsigned long *first_hartid)
 {
 	unsigned long hartid;
 	u32 val;
@@ -126,7 +126,7 @@ static void cbo_get_block_size(struct device_node *node,
 	}
 }
 
-void riscv_init_cbo_blocksizes(void)
+void __init riscv_init_cbo_blocksizes(void)
 {
 	unsigned long cbom_hartid, cboz_hartid;
 	u32 cbom_block_size = 0, cboz_block_size = 0;
--
2.40.1
On Sat, May 27, 2023 at 12:59:53AM +0800, Jisheng Zhang wrote:
> The CBOM size and whether the HW is noncoherent is known and
> determined during booting and won't change after that.
>
> Signed-off-by: Jisheng Zhang <[email protected]>
Makes sense to me,
Reviewed-by: Conor Dooley <[email protected]>
Thanks,
Conor.
On Sat, May 27, 2023 at 12:59:54AM +0800, Jisheng Zhang wrote:
> The two functions cbo_get_block_size() and riscv_init_cbo_blocksizes()
> are only called during booting, mark them as __init.
>
> Signed-off-by: Jisheng Zhang <[email protected]>
Reviewed-by: Conor Dooley <[email protected]>
Thanks,
Conor.