2022-12-20 16:26:19

by Sumit Gupta

[permalink] [raw]
Subject: [Patch v1 06/10] arm64: tegra: Add cpu OPP tables and interconnects property

Add OPP table and interconnects property required to scale DDR
frequency for better performance. The OPP table has CPU frequency
to per MC channel bandwidth mapping in each operating point entry.
One table is added for each cluster even though the table data is
same because the bandwidth request is per cluster. OPP framework
is creating a single icc path if the table is marked 'opp-shared'
and shared among all clusters. For us the OPP table is same but
the MC client ID argument to interconnects property is different
for each cluster which makes different icc path for all.

Signed-off-by: Sumit Gupta <[email protected]>
---
arch/arm64/boot/dts/nvidia/tegra234.dtsi | 276 +++++++++++++++++++++++
1 file changed, 276 insertions(+)

diff --git a/arch/arm64/boot/dts/nvidia/tegra234.dtsi b/arch/arm64/boot/dts/nvidia/tegra234.dtsi
index eaf05ee9acd1..ed7d0f7da431 100644
--- a/arch/arm64/boot/dts/nvidia/tegra234.dtsi
+++ b/arch/arm64/boot/dts/nvidia/tegra234.dtsi
@@ -2840,6 +2840,9 @@

enable-method = "psci";

+ operating-points-v2 = <&cl0_opp_tbl>;
+ interconnects = <&mc TEGRA_ICC_MC_CPU_CLUSTER0 &emc>;
+
i-cache-size = <65536>;
i-cache-line-size = <64>;
i-cache-sets = <256>;
@@ -2856,6 +2859,9 @@

enable-method = "psci";

+ operating-points-v2 = <&cl0_opp_tbl>;
+ interconnects = <&mc TEGRA_ICC_MC_CPU_CLUSTER0 &emc>;
+
i-cache-size = <65536>;
i-cache-line-size = <64>;
i-cache-sets = <256>;
@@ -2872,6 +2878,9 @@

enable-method = "psci";

+ operating-points-v2 = <&cl0_opp_tbl>;
+ interconnects = <&mc TEGRA_ICC_MC_CPU_CLUSTER0 &emc>;
+
i-cache-size = <65536>;
i-cache-line-size = <64>;
i-cache-sets = <256>;
@@ -2888,6 +2897,9 @@

enable-method = "psci";

+ operating-points-v2 = <&cl0_opp_tbl>;
+ interconnects = <&mc TEGRA_ICC_MC_CPU_CLUSTER0 &emc>;
+
i-cache-size = <65536>;
i-cache-line-size = <64>;
i-cache-sets = <256>;
@@ -2904,6 +2916,9 @@

enable-method = "psci";

+ operating-points-v2 = <&cl1_opp_tbl>;
+ interconnects = <&mc TEGRA_ICC_MC_CPU_CLUSTER1 &emc>;
+
i-cache-size = <65536>;
i-cache-line-size = <64>;
i-cache-sets = <256>;
@@ -2920,6 +2935,9 @@

enable-method = "psci";

+ operating-points-v2 = <&cl1_opp_tbl>;
+ interconnects = <&mc TEGRA_ICC_MC_CPU_CLUSTER1 &emc>;
+
i-cache-size = <65536>;
i-cache-line-size = <64>;
i-cache-sets = <256>;
@@ -2936,6 +2954,9 @@

enable-method = "psci";

+ operating-points-v2 = <&cl1_opp_tbl>;
+ interconnects = <&mc TEGRA_ICC_MC_CPU_CLUSTER1 &emc>;
+
i-cache-size = <65536>;
i-cache-line-size = <64>;
i-cache-sets = <256>;
@@ -2952,6 +2973,9 @@

enable-method = "psci";

+ operating-points-v2 = <&cl1_opp_tbl>;
+ interconnects = <&mc TEGRA_ICC_MC_CPU_CLUSTER1 &emc>;
+
i-cache-size = <65536>;
i-cache-line-size = <64>;
i-cache-sets = <256>;
@@ -2968,6 +2992,9 @@

enable-method = "psci";

+ operating-points-v2 = <&cl2_opp_tbl>;
+ interconnects = <&mc TEGRA_ICC_MC_CPU_CLUSTER2 &emc>;
+
i-cache-size = <65536>;
i-cache-line-size = <64>;
i-cache-sets = <256>;
@@ -2984,6 +3011,9 @@

enable-method = "psci";

+ operating-points-v2 = <&cl2_opp_tbl>;
+ interconnects = <&mc TEGRA_ICC_MC_CPU_CLUSTER2 &emc>;
+
i-cache-size = <65536>;
i-cache-line-size = <64>;
i-cache-sets = <256>;
@@ -3000,6 +3030,9 @@

enable-method = "psci";

+ operating-points-v2 = <&cl2_opp_tbl>;
+ interconnects = <&mc TEGRA_ICC_MC_CPU_CLUSTER2 &emc>;
+
i-cache-size = <65536>;
i-cache-line-size = <64>;
i-cache-sets = <256>;
@@ -3016,6 +3049,9 @@

enable-method = "psci";

+ operating-points-v2 = <&cl2_opp_tbl>;
+ interconnects = <&mc TEGRA_ICC_MC_CPU_CLUSTER2 &emc>;
+
i-cache-size = <65536>;
i-cache-line-size = <64>;
i-cache-sets = <256>;
@@ -3272,4 +3308,244 @@
interrupt-parent = <&gic>;
always-on;
};
+
+ cl0_opp_tbl: opp-table-cluster0 {
+ compatible = "operating-points-v2";
+ opp-shared;
+
+ cl0_ch1_opp1: opp-115200000 {
+ opp-hz = /bits/ 64 <115200000>;
+ opp-peak-kBps = <816000>;
+ };
+
+ cl0_ch1_opp2: opp-268800000 {
+ opp-hz = /bits/ 64 <268800000>;
+ opp-peak-kBps = <816000>;
+ };
+
+ cl0_ch1_opp3: opp-422400000 {
+ opp-hz = /bits/ 64 <422400000>;
+ opp-peak-kBps = <816000>;
+ };
+
+ cl0_ch1_opp4: opp-576000000 {
+ opp-hz = /bits/ 64 <576000000>;
+ opp-peak-kBps = <816000>;
+ };
+
+ cl0_ch1_opp5: opp-729600000 {
+ opp-hz = /bits/ 64 <729600000>;
+ opp-peak-kBps = <816000>;
+ };
+
+ cl0_ch1_opp6: opp-883200000 {
+ opp-hz = /bits/ 64 <883200000>;
+ opp-peak-kBps = <816000>;
+ };
+
+ cl0_ch1_opp7: opp-1036800000 {
+ opp-hz = /bits/ 64 <1036800000>;
+ opp-peak-kBps = <816000>;
+ };
+
+ cl0_ch1_opp8: opp-1190400000 {
+ opp-hz = /bits/ 64 <1190400000>;
+ opp-peak-kBps = <816000>;
+ };
+
+ cl0_ch1_opp9: opp-1344000000 {
+ opp-hz = /bits/ 64 <1344000000>;
+ opp-peak-kBps = <1632000>;
+ };
+
+ cl0_ch1_opp10: opp-1497600000 {
+ opp-hz = /bits/ 64 <1497600000>;
+ opp-peak-kBps = <1632000>;
+ };
+
+ cl0_ch1_opp11: opp-1651200000 {
+ opp-hz = /bits/ 64 <1651200000>;
+ opp-peak-kBps = <2660000>;
+ };
+
+ cl0_ch1_opp12: opp-1804800000 {
+ opp-hz = /bits/ 64 <1804800000>;
+ opp-peak-kBps = <2660000>;
+ };
+
+ cl0_ch1_opp13: opp-1958400000 {
+ opp-hz = /bits/ 64 <1958400000>;
+ opp-peak-kBps = <3200000>;
+ };
+
+ cl0_ch1_opp14: opp-2112000000 {
+ opp-hz = /bits/ 64 <2112000000>;
+ opp-peak-kBps = <6400000>;
+ };
+
+ cl0_ch1_opp15: opp-2201600000 {
+ opp-hz = /bits/ 64 <2201600000>;
+ opp-peak-kBps = <6400000>;
+ };
+ };
+
+ cl1_opp_tbl: opp-table-cluster1 {
+ compatible = "operating-points-v2";
+ opp-shared;
+
+ cl1_ch1_opp1: opp-115200000 {
+ opp-hz = /bits/ 64 <115200000>;
+ opp-peak-kBps = <816000>;
+ };
+
+ cl1_ch1_opp2: opp-268800000 {
+ opp-hz = /bits/ 64 <268800000>;
+ opp-peak-kBps = <816000>;
+ };
+
+ cl1_ch1_opp3: opp-422400000 {
+ opp-hz = /bits/ 64 <422400000>;
+ opp-peak-kBps = <816000>;
+ };
+
+ cl1_ch1_opp4: opp-576000000 {
+ opp-hz = /bits/ 64 <576000000>;
+ opp-peak-kBps = <816000>;
+ };
+
+ cl1_ch1_opp5: opp-729600000 {
+ opp-hz = /bits/ 64 <729600000>;
+ opp-peak-kBps = <816000>;
+ };
+
+ cl1_ch1_opp6: opp-883200000 {
+ opp-hz = /bits/ 64 <883200000>;
+ opp-peak-kBps = <816000>;
+ };
+
+ cl1_ch1_opp7: opp-1036800000 {
+ opp-hz = /bits/ 64 <1036800000>;
+ opp-peak-kBps = <816000>;
+ };
+
+ cl1_ch1_opp8: opp-1190400000 {
+ opp-hz = /bits/ 64 <1190400000>;
+ opp-peak-kBps = <816000>;
+ };
+
+ cl1_ch1_opp9: opp-1344000000 {
+ opp-hz = /bits/ 64 <1344000000>;
+ opp-peak-kBps = <1632000>;
+ };
+
+ cl1_ch1_opp10: opp-1497600000 {
+ opp-hz = /bits/ 64 <1497600000>;
+ opp-peak-kBps = <1632000>;
+ };
+
+ cl1_ch1_opp11: opp-1651200000 {
+ opp-hz = /bits/ 64 <1651200000>;
+ opp-peak-kBps = <2660000>;
+ };
+
+ cl1_ch1_opp12: opp-1804800000 {
+ opp-hz = /bits/ 64 <1804800000>;
+ opp-peak-kBps = <2660000>;
+ };
+
+ cl1_ch1_opp13: opp-1958400000 {
+ opp-hz = /bits/ 64 <1958400000>;
+ opp-peak-kBps = <3200000>;
+ };
+
+ cl1_ch1_opp14: opp-2112000000 {
+ opp-hz = /bits/ 64 <2112000000>;
+ opp-peak-kBps = <6400000>;
+ };
+
+ cl1_ch1_opp15: opp-2201600000 {
+ opp-hz = /bits/ 64 <2201600000>;
+ opp-peak-kBps = <6400000>;
+ };
+ };
+
+ cl2_opp_tbl: opp-table-cluster2 {
+ compatible = "operating-points-v2";
+ opp-shared;
+
+ cl2_ch1_opp1: opp-115200000 {
+ opp-hz = /bits/ 64 <115200000>;
+ opp-peak-kBps = <816000>;
+ };
+
+ cl2_ch1_opp2: opp-268800000 {
+ opp-hz = /bits/ 64 <268800000>;
+ opp-peak-kBps = <816000>;
+ };
+
+ cl2_ch1_opp3: opp-422400000 {
+ opp-hz = /bits/ 64 <422400000>;
+ opp-peak-kBps = <816000>;
+ };
+
+ cl2_ch1_opp4: opp-576000000 {
+ opp-hz = /bits/ 64 <576000000>;
+ opp-peak-kBps = <816000>;
+ };
+
+ cl2_ch1_opp5: opp-729600000 {
+ opp-hz = /bits/ 64 <729600000>;
+ opp-peak-kBps = <816000>;
+ };
+
+ cl2_ch1_opp6: opp-883200000 {
+ opp-hz = /bits/ 64 <883200000>;
+ opp-peak-kBps = <816000>;
+ };
+
+ cl2_ch1_opp7: opp-1036800000 {
+ opp-hz = /bits/ 64 <1036800000>;
+ opp-peak-kBps = <816000>;
+ };
+
+ cl2_ch1_opp8: opp-1190400000 {
+ opp-hz = /bits/ 64 <1190400000>;
+ opp-peak-kBps = <816000>;
+ };
+
+ cl2_ch1_opp9: opp-1344000000 {
+ opp-hz = /bits/ 64 <1344000000>;
+ opp-peak-kBps = <1632000>;
+ };
+
+ cl2_ch1_opp10: opp-1497600000 {
+ opp-hz = /bits/ 64 <1497600000>;
+ opp-peak-kBps = <1632000>;
+ };
+
+ cl2_ch1_opp11: opp-1651200000 {
+ opp-hz = /bits/ 64 <1651200000>;
+ opp-peak-kBps = <2660000>;
+ };
+
+ cl2_ch1_opp12: opp-1804800000 {
+ opp-hz = /bits/ 64 <1804800000>;
+ opp-peak-kBps = <2660000>;
+ };
+
+ cl2_ch1_opp13: opp-1958400000 {
+ opp-hz = /bits/ 64 <1958400000>;
+ opp-peak-kBps = <3200000>;
+ };
+
+ cl2_ch1_opp14: opp-2112000000 {
+ opp-hz = /bits/ 64 <2112000000>;
+ opp-peak-kBps = <6400000>;
+ };
+
+ cl2_ch1_opp15: opp-2201600000 {
+ opp-hz = /bits/ 64 <2201600000>;
+ opp-peak-kBps = <6400000>;
+ };
+ };
};
--
2.17.1


2023-01-16 18:02:16

by Thierry Reding

[permalink] [raw]
Subject: Re: [Patch v1 06/10] arm64: tegra: Add cpu OPP tables and interconnects property

On Tue, Dec 20, 2022 at 09:32:36PM +0530, Sumit Gupta wrote:
> Add OPP table and interconnects property required to scale DDR
> frequency for better performance. The OPP table has CPU frequency
> to per MC channel bandwidth mapping in each operating point entry.
> One table is added for each cluster even though the table data is
> same because the bandwidth request is per cluster. OPP framework
> is creating a single icc path if the table is marked 'opp-shared'
> and shared among all clusters. For us the OPP table is same but
> the MC client ID argument to interconnects property is different
> for each cluster which makes different icc path for all.
>
> Signed-off-by: Sumit Gupta <[email protected]>
> ---
> arch/arm64/boot/dts/nvidia/tegra234.dtsi | 276 +++++++++++++++++++++++
> 1 file changed, 276 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/nvidia/tegra234.dtsi b/arch/arm64/boot/dts/nvidia/tegra234.dtsi
> index eaf05ee9acd1..ed7d0f7da431 100644
> --- a/arch/arm64/boot/dts/nvidia/tegra234.dtsi
> +++ b/arch/arm64/boot/dts/nvidia/tegra234.dtsi
> @@ -2840,6 +2840,9 @@
>
> enable-method = "psci";
>
> + operating-points-v2 = <&cl0_opp_tbl>;
> + interconnects = <&mc TEGRA_ICC_MC_CPU_CLUSTER0 &emc>;

I dislike how this muddies the water between hardware and software
description. We don't have a hardware client ID for the CPU clusters, so
there's no good way to describe this in a hardware-centric way. We used
to have MPCORE read and write clients for this, but as far as I know
they used to be for the entire CCPLEX rather than per-cluster. It'd be
interesting to know what the BPMP does underneath, perhaps that could
give some indication as to what would be a better hardware value to use
for this.

Failing that, I wonder if a combination of icc_node_create() and
icc_get() can be used for this type of "virtual node" special case.

Thierry


Attachments:
(No filename) (1.88 kB)
signature.asc (849.00 B)
Download all attachments