2024-02-05 17:20:32

by Anand Moon

Subject: [PATCHv1 1/5] arm64: dts: amlogic: Add cache information to the Amlogic GXBB and GXL SoC

As per the S905 and S905X datasheets, add the missing cache information
to the Amlogic GXBB and GXL SoCs:

- Each Cortex-A53 core has 32KB of L1 instruction cache and 32KB of
  L1 data cache.
- All cores share a 512KB unified L2 cache.

This is intended to improve system performance.

Signed-off-by: Anand Moon <[email protected]>
---
Datasheet
[0] https://dn.odroid.com/S905/DataSheet/S905_Public_Datasheet_V1.1.4.pdf
---
arch/arm64/boot/dts/amlogic/meson-gx.dtsi | 27 +++++++++++++++++++++++
1 file changed, 27 insertions(+)

diff --git a/arch/arm64/boot/dts/amlogic/meson-gx.dtsi b/arch/arm64/boot/dts/amlogic/meson-gx.dtsi
index 2673f0dbafe7..e141ade5e49b 100644
--- a/arch/arm64/boot/dts/amlogic/meson-gx.dtsi
+++ b/arch/arm64/boot/dts/amlogic/meson-gx.dtsi
@@ -95,6 +95,12 @@ cpu0: cpu@0 {
 			compatible = "arm,cortex-a53";
 			reg = <0x0 0x0>;
 			enable-method = "psci";
+			d-cache-line-size = <32>;
+			d-cache-size = <0x8000>;
+			d-cache-sets = <32>;
+			i-cache-line-size = <32>;
+			i-cache-size = <0x8000>;
+			i-cache-sets = <32>;
 			next-level-cache = <&l2>;
 			clocks = <&scpi_dvfs 0>;
 			#cooling-cells = <2>;
@@ -105,6 +111,12 @@ cpu1: cpu@1 {
 			compatible = "arm,cortex-a53";
 			reg = <0x0 0x1>;
 			enable-method = "psci";
+			d-cache-line-size = <32>;
+			d-cache-size = <0x8000>;
+			d-cache-sets = <32>;
+			i-cache-line-size = <32>;
+			i-cache-size = <0x8000>;
+			i-cache-sets = <32>;
 			next-level-cache = <&l2>;
 			clocks = <&scpi_dvfs 0>;
 			#cooling-cells = <2>;
@@ -115,6 +127,12 @@ cpu2: cpu@2 {
 			compatible = "arm,cortex-a53";
 			reg = <0x0 0x2>;
 			enable-method = "psci";
+			d-cache-line-size = <32>;
+			d-cache-size = <0x8000>;
+			d-cache-sets = <32>;
+			i-cache-line-size = <32>;
+			i-cache-size = <0x8000>;
+			i-cache-sets = <32>;
 			next-level-cache = <&l2>;
 			clocks = <&scpi_dvfs 0>;
 			#cooling-cells = <2>;
@@ -125,6 +143,12 @@ cpu3: cpu@3 {
 			compatible = "arm,cortex-a53";
 			reg = <0x0 0x3>;
 			enable-method = "psci";
+			d-cache-line-size = <32>;
+			d-cache-size = <0x8000>;
+			d-cache-sets = <32>;
+			i-cache-line-size = <32>;
+			i-cache-size = <0x8000>;
+			i-cache-sets = <32>;
 			next-level-cache = <&l2>;
 			clocks = <&scpi_dvfs 0>;
 			#cooling-cells = <2>;
@@ -134,6 +158,9 @@ l2: l2-cache0 {
 			compatible = "cache";
 			cache-level = <2>;
 			cache-unified;
+			cache-size = <0x80000>; /* L2. 512 KB */
+			cache-line-size = <64>;
+			cache-sets = <512>;
 		};
 	};

--
2.43.0
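
The size, line-size and sets properties of a cache are tied together by size = sets x ways x line size, so the associativity implied by a set of values can be checked with simple arithmetic. The following is only a minimal sketch in Python; the helper name `implied_ways` is made up for illustration, and the L2 size is taken as the 512KB stated in the commit message.

```python
def implied_ways(size_bytes: int, line_size: int, sets: int) -> float:
    """Associativity implied by size = sets * ways * line_size."""
    return size_bytes / (line_size * sets)


# (size in bytes, line size in bytes, number of sets), as given in the patch.
# Assumption: the L2 size is the 512KB quoted in the commit message.
caches = {
    "L1 d-cache": (0x8000, 32, 32),
    "L1 i-cache": (0x8000, 32, 32),
    "L2 cache": (512 * 1024, 64, 512),
}

for name, (size, line, sets) in caches.items():
    ways = implied_ways(size, line, sets)
    print(f"{name}: {size} bytes, implied ways = {ways:g}")
```

With these inputs the L1 caches come out as 32-way and the L2 cache as 16-way, which may be worth comparing against the implementation options listed in the Cortex-A53 TRM.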



2024-02-27 13:22:42

by Anand Moon

Subject: Re: [PATCHv1 1/5] arm64: dts: amlogic: Add cache information to the Amlogic GXBB and GXL SoC

Hi Neil,

On Mon, 5 Feb 2024 at 22:50, Anand Moon <[email protected]> wrote:
>
> As per the S905 and S905X datasheets, add the missing cache information
> to the Amlogic GXBB and GXL SoCs:
>
> - Each Cortex-A53 core has 32KB of L1 instruction cache and 32KB of
>   L1 data cache.
> - All cores share a 512KB unified L2 cache.
>
> This is intended to improve system performance.
>
> Signed-off-by: Anand Moon <[email protected]>
> ---
> Datasheet
> [0] https://dn.odroid.com/S905/DataSheet/S905_Public_Datasheet_V1.1.4.pdf
> ---

The cache implementation options are documented in the Arm Cortex-A53 TRM:
[0] https://developer.arm.com/documentation/ddi0500/j/Introduction/Implementation-options?lang=en

Since this SoC supports the Arm PMU, cache events can also be measured with perf:
[1] https://www.baeldung.com/linux/analyze-cache-misses

[alarm@archl-librecm ~]$ sudo perf list

List of pre-defined events (to be used in -e or -M):

branch-instructions OR branches [Hardware event]
branch-misses [Hardware event]
bus-cycles [Hardware event]
cache-misses [Hardware event]
cache-references [Hardware event]
cpu-cycles OR cycles [Hardware event]
instructions [Hardware event]
alignment-faults [Software event]
bpf-output [Software event]
cgroup-switches [Software event]
context-switches OR cs [Software event]
cpu-clock [Software event]
cpu-migrations OR migrations [Software event]
dummy [Software event]
emulation-faults [Software event]
major-faults [Software event]
minor-faults [Software event]
page-faults OR faults [Software event]
task-clock [Software event]
duration_time [Tool event]
user_time [Tool event]
system_time [Tool event]

armv8_cortex_a53:
L1-dcache-loads OR armv8_cortex_a53/L1-dcache-loads/
L1-dcache-load-misses OR armv8_cortex_a53/L1-dcache-load-misses/
L1-dcache-prefetch-misses OR armv8_cortex_a53/L1-dcache-prefetch-misses/
L1-icache-loads OR armv8_cortex_a53/L1-icache-loads/
L1-icache-load-misses OR armv8_cortex_a53/L1-icache-load-misses/
dTLB-load-misses OR armv8_cortex_a53/dTLB-load-misses/
iTLB-load-misses OR armv8_cortex_a53/iTLB-load-misses/
branch-loads OR armv8_cortex_a53/branch-loads/
branch-load-misses OR armv8_cortex_a53/branch-load-misses/
node-loads OR armv8_cortex_a53/node-loads/
node-stores OR armv8_cortex_a53/node-stores/
br_immed_retired OR armv8_cortex_a53/br_immed_retired/[Kernel PMU event]
br_mis_pred OR armv8_cortex_a53/br_mis_pred/ [Kernel PMU event]
br_pred OR armv8_cortex_a53/br_pred/ [Kernel PMU event]
bus_access OR armv8_cortex_a53/bus_access/ [Kernel PMU event]
bus_cycles OR armv8_cortex_a53/bus_cycles/ [Kernel PMU event]
cid_write_retired OR armv8_cortex_a53/cid_write_retired/[Kernel PMU event]
cpu_cycles OR armv8_cortex_a53/cpu_cycles/ [Kernel PMU event]
exc_return OR armv8_cortex_a53/exc_return/ [Kernel PMU event]
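
The perf events above count cache accesses but do not report cache geometry. On Linux the geometry the kernel ended up with is normally visible through the cacheinfo files under /sys/devices/system/cpu/cpuN/cache/indexM/. The following is a rough sketch for dumping them; the function name `read_cacheinfo` and the `root` parameter are illustrative only, and some attributes may be missing depending on what the kernel could determine.

```python
from pathlib import Path

# Attribute files of the Linux cacheinfo sysfs interface.
FIELDS = (
    "level",
    "type",
    "size",
    "coherency_line_size",
    "number_of_sets",
    "ways_of_associativity",
)


def read_cacheinfo(cpu: int = 0, root: str = "/sys/devices/system/cpu") -> list:
    """Return one dict per cache leaf (index*) exposed for the given CPU."""
    leaves = []
    cache_dir = Path(root) / f"cpu{cpu}" / "cache"
    if not cache_dir.is_dir():
        # No cacheinfo for this CPU (or not running on Linux).
        return leaves
    for index in sorted(cache_dir.glob("index[0-9]*")):
        leaf = {}
        for field in FIELDS:
            path = index / field
            # An attribute is left as None when the kernel does not provide it.
            leaf[field] = path.read_text().strip() if path.is_file() else None
        leaves.append(leaf)
    return leaves


if __name__ == "__main__":
    for leaf in read_cacheinfo(0):
        print(leaf)
```

Running this before and after applying the patch would be one way to see whether the new properties are picked up.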

[alarm@archl-librecm ~]$ perf stat -B -e cache-references,cache-misses,cycles,instructions,branches,faults,migrations sleep 5

Performance counter stats for 'sleep 5':

52794 cache-references:u
2311 cache-misses:u # 4.38% of all cache refs
480343 cycles:u
140018 instructions:u # 0.29 insn per cycle
15012 branches:u
46 faults:u
0 migrations:u

5.008073381 seconds time elapsed

0.000000000 seconds user
0.006952000 seconds sys
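
The derived figures in the perf stat output can be reproduced from the raw counters. A small sketch in Python, with the counter values copied from the run above (the variable names are illustrative only):

```python
# Counter values copied from the 'perf stat' output for 'sleep 5'.
counters = {
    "cache-references": 52794,
    "cache-misses": 2311,
    "cycles": 480343,
    "instructions": 140018,
}

# Share of cache references that missed.
miss_ratio = counters["cache-misses"] / counters["cache-references"]
# Instructions retired per CPU cycle.
ipc = counters["instructions"] / counters["cycles"]

print(f"cache miss ratio: {miss_ratio:.2%}")
print(f"instructions per cycle: {ipc:.2f}")
```

This gives the same 4.38% miss ratio and 0.29 instructions per cycle that perf prints. Note that 'sleep 5' is mostly idle, so a heavier workload would likely be needed to judge any effect of the cache description.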

Thanks
-Anand