This enables thermal monitoring and CPU DVFS on RK3588(s), as well as
active cooling on Radxa Rock 5B via the provided PWM fan.
Some RK3588 boards use separate regulators to supply CPUs and their
respective memory interfaces, so this is handled by coupling those
regulators in affected boards' device trees to ensure that their
voltage is adjusted in step.
This also enables the built-in thermal sensor (TSADC) for all boards
that don't currently have it enabled, using the default CRU based
emergency thermal reset. This default configuration only uses on-SoC
devices and doesn't rely on any external wiring, thus it should work
for all devices (tested only on Rock 5B though).
The boards that have TSADC_SHUT signal wired to the PMIC reset line
can choose to override the default reset logic in favour of GPIO
driven (PMIC assisted) reset, but in my testing it didn't work on
Radxa Rock 5B - maybe I'm reading the schematic wrong and it doesn't
support PMIC assisted reset after all.
Fan control on Rock 5B has been split into two intervals: let it spin
at the minimum cooling state between 55C and 65C, and then accelerate
if the system crosses the 65C mark - thanks to Dragan for suggesting.
This lets some cooling setups with beefier heatsinks and/or larger
fan fins to stay in the quietest non-zero fan state while still
gaining potential benefits from the airflow it generates, and
possibly avoiding noisy speeds altogether for some workloads.
OPPs help actually scale CPU frequencies up and down for both cooling
and performance - tested on Rock 5B under varied loads. I've dropped
those OPPs that cause frequency reductions without accompanying decrease
in CPU voltage, as they don't seem to be adding much benefit in day to
day use, while the kernel log gets a number of "OPP is inefficient" lines.
Note that this submission doesn't touch the SRAM read margin updates or
the OPP calibration based on silicon quality which the downstream driver
does and which were mentioned in [1]. It works as it is (also confirmed by
Sebastian in his follow-up message [2]), and it is stable in my testing on
Rock 5B, so it sounds better to merge a simple version first and then
extend when/if required.
[1] https://lore.kernel.org/linux-rockchip/CABjd4YzTL=5S7cS8ACNAYVa730WA3iGd5L_wP1Vn9=f83RCORA@mail.gmail.com/
[2] https://lore.kernel.org/linux-rockchip/pkyne4g2cln27dcdu3jm7bqdqpmd2kwkbguiolmozntjuiajrb@gvq4nupzna4o/
Signed-off-by: Alexey Charkov <[email protected]>
---
Changes in v4:
- Rebased against linux-rockchip/for-next
- Reordered DT nodes alphabetically as pointed out by Diederik
- Moved the TSADC enablement to per-board .dts/.dtsi files
- Dropped extra "inefficient" OPPs (same voltage - lower frequencies)
- Dropped second passive cooling trips altogether to keep things simple
- Added a cooling map for passive GPU cooling (in a separate patch)
- Link to v3: https://lore.kernel.org/r/[email protected]
Changes in v3:
- Added regulator coupling for EVB1 and QuartzPro64
- Enabled the TSADC for all boards in .dtsi, not just Rock 5B (thanks ChenYu)
- Added comments regarding two passive cooling trips in each zone (thanks Dragan)
- Fixed active cooling map numbering for Radxa Rock 5B (thanks Dragan)
- Dropped Daniel's Acked-by tag from the Rock 5B fan patch, as there's been quite some
churn there since the version he acknowledged
- Link to v2: https://lore.kernel.org/r/[email protected]
Changes in v2:
- Dropped the rfkill patch which Heiko has already applied
- Set higher 'polling-delay-passive' (100 instead of 20)
- Name all cooling maps starting from map0 in each respective zone
- Drop 'contribution' properties from passive cooling maps
- Link to v1: https://lore.kernel.org/r/[email protected]
---
Alexey Charkov (6):
arm64: dts: rockchip: add thermal zones information on RK3588
arm64: dts: rockchip: enable thermal management on all RK3588 boards
arm64: dts: rockchip: add passive GPU cooling on RK3588
arm64: dts: rockchip: enable automatic fan control on Rock 5B
arm64: dts: rockchip: Add CPU/memory regulator coupling for RK3588
arm64: dts: rockchip: Add OPP data for CPU cores on RK3588
.../boot/dts/rockchip/rk3588-armsom-sige7.dts | 4 +
.../dts/rockchip/rk3588-edgeble-neu6a-common.dtsi | 4 +
arch/arm64/boot/dts/rockchip/rk3588-evb1-v10.dts | 16 ++
arch/arm64/boot/dts/rockchip/rk3588-ok3588-c.dts | 4 +
.../arm64/boot/dts/rockchip/rk3588-quartzpro64.dts | 12 +
arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts | 34 ++-
.../arm64/boot/dts/rockchip/rk3588-toybrick-x0.dts | 4 +
.../arm64/boot/dts/rockchip/rk3588-turing-rk1.dtsi | 4 +
arch/arm64/boot/dts/rockchip/rk3588s-rock-5a.dts | 4 +
arch/arm64/boot/dts/rockchip/rk3588s.dtsi | 281 +++++++++++++++++++++
10 files changed, 366 insertions(+), 1 deletion(-)
---
base-commit: 160b088184ec81028ff67a5032be33f1baea4b67
change-id: 20240124-rk-dts-additions-a6d7b52787b9
Best regards,
--
Alexey Charkov <[email protected]>
This includes the necessary device tree data to allow thermal
monitoring on RK3588(s) using the on-chip TSADC device, along with
trip points for automatic thermal management.
Each of the CPU clusters (one for the little cores and two for
the big cores) get a passive cooling trip point at 85C, which
will trigger DVFS throttling of the respective cluster upon
reaching a high temperature condition.
All zones also have a critical trip point at 115C, which will
trigger a reset.
Signed-off-by: Alexey Charkov <[email protected]>
---
arch/arm64/boot/dts/rockchip/rk3588s.dtsi | 147 ++++++++++++++++++++++++++++++
1 file changed, 147 insertions(+)
diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
index 6ac5ac8b48ab..ef06c1f742e8 100644
--- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
@@ -10,6 +10,7 @@
#include <dt-bindings/reset/rockchip,rk3588-cru.h>
#include <dt-bindings/phy/phy.h>
#include <dt-bindings/ata/ahci.h>
+#include <dt-bindings/thermal/thermal.h>
/ {
compatible = "rockchip,rk3588";
@@ -2368,6 +2369,152 @@ pwm15: pwm@febf0030 {
status = "disabled";
};
+ thermal_zones: thermal-zones {
+ /* sensor near the center of the SoC */
+ package_thermal: package-thermal {
+ polling-delay-passive = <0>;
+ polling-delay = <0>;
+ thermal-sensors = <&tsadc 0>;
+
+ trips {
+ package_crit: package-crit {
+ temperature = <115000>;
+ hysteresis = <0>;
+ type = "critical";
+ };
+ };
+ };
+
+ /* sensor between A76 cores 0 and 1 */
+ bigcore0_thermal: bigcore0-thermal {
+ polling-delay-passive = <100>;
+ polling-delay = <0>;
+ thermal-sensors = <&tsadc 1>;
+
+ trips {
+ bigcore0_alert: bigcore0-alert {
+ temperature = <85000>;
+ hysteresis = <2000>;
+ type = "passive";
+ };
+ bigcore0_crit: bigcore0-crit {
+ temperature = <115000>;
+ hysteresis = <0>;
+ type = "critical";
+ };
+ };
+ cooling-maps {
+ map0 {
+ trip = <&bigcore0_alert>;
+ cooling-device =
+ <&cpu_b0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+ <&cpu_b1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+ };
+ };
+ };
+
+ /* sensor between A76 cores 2 and 3 */
+ bigcore2_thermal: bigcore2-thermal {
+ polling-delay-passive = <100>;
+ polling-delay = <0>;
+ thermal-sensors = <&tsadc 2>;
+
+ trips {
+ bigcore2_alert: bigcore2-alert {
+ temperature = <85000>;
+ hysteresis = <2000>;
+ type = "passive";
+ };
+ bigcore2_crit: bigcore2-crit {
+ temperature = <115000>;
+ hysteresis = <0>;
+ type = "critical";
+ };
+ };
+ cooling-maps {
+ map0 {
+ trip = <&bigcore2_alert>;
+ cooling-device =
+ <&cpu_b2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+ <&cpu_b3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+ };
+ };
+ };
+
+ /* sensor between the four A55 cores */
+ little_core_thermal: littlecore-thermal {
+ polling-delay-passive = <100>;
+ polling-delay = <0>;
+ thermal-sensors = <&tsadc 3>;
+
+ trips {
+ littlecore_alert: littlecore-alert {
+ temperature = <85000>;
+ hysteresis = <2000>;
+ type = "passive";
+ };
+ littlecore_crit: littlecore-crit {
+ temperature = <115000>;
+ hysteresis = <0>;
+ type = "critical";
+ };
+ };
+ cooling-maps {
+ map0 {
+ trip = <&littlecore_alert>;
+ cooling-device =
+ <&cpu_l0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+ <&cpu_l1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+ <&cpu_l2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
+ <&cpu_l3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+ };
+ };
+ };
+
+ /* sensor near the PD_CENTER power domain */
+ center_thermal: center-thermal {
+ polling-delay-passive = <0>;
+ polling-delay = <0>;
+ thermal-sensors = <&tsadc 4>;
+
+ trips {
+ center_crit: center-crit {
+ temperature = <115000>;
+ hysteresis = <0>;
+ type = "critical";
+ };
+ };
+ };
+
+ gpu_thermal: gpu-thermal {
+ polling-delay-passive = <0>;
+ polling-delay = <0>;
+ thermal-sensors = <&tsadc 5>;
+
+ trips {
+ gpu_crit: gpu-crit {
+ temperature = <115000>;
+ hysteresis = <0>;
+ type = "critical";
+ };
+ };
+ };
+
+ npu_thermal: npu-thermal {
+ polling-delay-passive = <0>;
+ polling-delay = <0>;
+ thermal-sensors = <&tsadc 6>;
+
+ trips {
+ npu_crit: npu-crit {
+ temperature = <115000>;
+ hysteresis = <0>;
+ type = "critical";
+ };
+ };
+ };
+ };
+
tsadc: tsadc@fec00000 {
compatible = "rockchip,rk3588-tsadc";
reg = <0x0 0xfec00000 0x0 0x400>;
--
2.45.0
This enables the on-chip thermal monitoring sensor (TSADC) on all
RK3588(s) boards that don't have it enabled yet. It provides temperature
monitoring for the SoC and emergency thermal shutdowns, and is thus
important to have in place before CPU DVFS is enabled, as high CPU
operating performance points can overheat the chip quickly in the
absence of thermal management.
Signed-off-by: Alexey Charkov <[email protected]>
---
arch/arm64/boot/dts/rockchip/rk3588-armsom-sige7.dts | 4 ++++
arch/arm64/boot/dts/rockchip/rk3588-edgeble-neu6a-common.dtsi | 4 ++++
arch/arm64/boot/dts/rockchip/rk3588-evb1-v10.dts | 4 ++++
arch/arm64/boot/dts/rockchip/rk3588-ok3588-c.dts | 4 ++++
arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts | 4 ++++
arch/arm64/boot/dts/rockchip/rk3588-toybrick-x0.dts | 4 ++++
arch/arm64/boot/dts/rockchip/rk3588-turing-rk1.dtsi | 4 ++++
arch/arm64/boot/dts/rockchip/rk3588s-rock-5a.dts | 4 ++++
8 files changed, 32 insertions(+)
diff --git a/arch/arm64/boot/dts/rockchip/rk3588-armsom-sige7.dts b/arch/arm64/boot/dts/rockchip/rk3588-armsom-sige7.dts
index 98c622b27647..c667704ba985 100644
--- a/arch/arm64/boot/dts/rockchip/rk3588-armsom-sige7.dts
+++ b/arch/arm64/boot/dts/rockchip/rk3588-armsom-sige7.dts
@@ -673,6 +673,10 @@ regulator-state-mem {
};
};
+&tsadc {
+ status = "okay";
+};
+
&u2phy0 {
status = "okay";
};
diff --git a/arch/arm64/boot/dts/rockchip/rk3588-edgeble-neu6a-common.dtsi b/arch/arm64/boot/dts/rockchip/rk3588-edgeble-neu6a-common.dtsi
index 709d348cf06b..03fd193be253 100644
--- a/arch/arm64/boot/dts/rockchip/rk3588-edgeble-neu6a-common.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3588-edgeble-neu6a-common.dtsi
@@ -466,3 +466,7 @@ regulator-state-mem {
};
};
};
+
+&tsadc {
+ status = "okay";
+};
diff --git a/arch/arm64/boot/dts/rockchip/rk3588-evb1-v10.dts b/arch/arm64/boot/dts/rockchip/rk3588-evb1-v10.dts
index 7be2190244ba..7c3696a3ad3a 100644
--- a/arch/arm64/boot/dts/rockchip/rk3588-evb1-v10.dts
+++ b/arch/arm64/boot/dts/rockchip/rk3588-evb1-v10.dts
@@ -1131,6 +1131,10 @@ &sata0 {
status = "okay";
};
+&tsadc {
+ status = "okay";
+};
+
&u2phy0 {
status = "okay";
};
diff --git a/arch/arm64/boot/dts/rockchip/rk3588-ok3588-c.dts b/arch/arm64/boot/dts/rockchip/rk3588-ok3588-c.dts
index 009566d881f3..230e630820b4 100644
--- a/arch/arm64/boot/dts/rockchip/rk3588-ok3588-c.dts
+++ b/arch/arm64/boot/dts/rockchip/rk3588-ok3588-c.dts
@@ -376,6 +376,10 @@ &sdmmc {
status = "okay";
};
+&tsadc {
+ status = "okay";
+};
+
&u2phy2 {
status = "okay";
};
diff --git a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
index b8e15b76a8a6..21e96c212dd8 100644
--- a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
+++ b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
@@ -742,6 +742,10 @@ regulator-state-mem {
};
};
+&tsadc {
+ status = "okay";
+};
+
&uart2 {
pinctrl-0 = <&uart2m0_xfer>;
status = "okay";
diff --git a/arch/arm64/boot/dts/rockchip/rk3588-toybrick-x0.dts b/arch/arm64/boot/dts/rockchip/rk3588-toybrick-x0.dts
index 9090c5c99f2a..d0021524e7f9 100644
--- a/arch/arm64/boot/dts/rockchip/rk3588-toybrick-x0.dts
+++ b/arch/arm64/boot/dts/rockchip/rk3588-toybrick-x0.dts
@@ -648,6 +648,10 @@ regulator-state-mem {
};
};
+&tsadc {
+ status = "okay";
+};
+
&u2phy2 {
status = "okay";
};
diff --git a/arch/arm64/boot/dts/rockchip/rk3588-turing-rk1.dtsi b/arch/arm64/boot/dts/rockchip/rk3588-turing-rk1.dtsi
index 6b9206ce4a03..77bcf0f6b028 100644
--- a/arch/arm64/boot/dts/rockchip/rk3588-turing-rk1.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3588-turing-rk1.dtsi
@@ -601,6 +601,10 @@ regulator-state-mem {
};
};
+&tsadc {
+ status = "okay";
+};
+
&uart2 {
pinctrl-0 = <&uart2m0_xfer>;
status = "okay";
diff --git a/arch/arm64/boot/dts/rockchip/rk3588s-rock-5a.dts b/arch/arm64/boot/dts/rockchip/rk3588s-rock-5a.dts
index 8e2a07612d17..c671a61d3aef 100644
--- a/arch/arm64/boot/dts/rockchip/rk3588s-rock-5a.dts
+++ b/arch/arm64/boot/dts/rockchip/rk3588s-rock-5a.dts
@@ -697,6 +697,10 @@ regulator-state-mem {
};
};
+&tsadc {
+ status = "okay";
+};
+
&u2phy0 {
status = "okay";
};
--
2.45.0
This links the PWM fan on Radxa Rock 5B as an active cooling device
managed automatically by the thermal subsystem, with a target SoC
temperature of 65C and a minimum-spin interval from 55C to 65C to
ensure airflow when the system gets warm
Helped-by: Dragan Simic <[email protected]>
Reviewed-by: Dragan Simic <[email protected]>
Signed-off-by: Alexey Charkov <[email protected]>
---
arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts | 30 ++++++++++++++++++++++++-
1 file changed, 29 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
index 21e96c212dd8..b70313643af8 100644
--- a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
+++ b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
@@ -52,7 +52,7 @@ led_rgb_b {
fan: pwm-fan {
compatible = "pwm-fan";
- cooling-levels = <0 95 145 195 255>;
+ cooling-levels = <0 120 150 180 210 240 255>;
fan-supply = <&vcc5v0_sys>;
pwms = <&pwm1 0 50000 0>;
#cooling-cells = <2>;
@@ -279,6 +279,34 @@ i2s0_8ch_p0_0: endpoint {
};
};
+&package_thermal {
+ polling-delay = <1000>;
+
+ trips {
+ package_fan0: package-fan0 {
+ temperature = <55000>;
+ hysteresis = <2000>;
+ type = "active";
+ };
+ package_fan1: package-fan1 {
+ temperature = <65000>;
+ hysteresis = <2000>;
+ type = "active";
+ };
+ };
+
+ cooling-maps {
+ map1 {
+ trip = <&package_fan0>;
+ cooling-device = <&fan THERMAL_NO_LIMIT 1>;
+ };
+ map2 {
+ trip = <&package_fan1>;
+ cooling-device = <&fan 2 THERMAL_NO_LIMIT>;
+ };
+ };
+};
+
&pcie2x1l0 {
pinctrl-names = "default";
pinctrl-0 = <&pcie2_0_rst>;
--
2.45.0
RK3588 chips allow for their CPU cores to be powered by a different
supply vs. their corresponding memory interfaces, and two of the
boards currently upstream do that (EVB1 and QuartzPro64).
The voltage of the memory interface though has to match that of the
CPU cores that use it, which downstream kernels achieve by the means
of a custom cpufreq driver which adjusts both at the same time.
It seems that regulator coupling is a more appropriate generic
interface for it, so this patch introduces coupling to affected
device trees to ensure that memory interface voltage is also updated
whenever cpufreq switches between CPU OPPs.
Note that other boards, such as Radxa Rock 5B, define both the CPU
and memory interface regulators as aliases to the same DT node, so
this doesn't apply there.
Signed-off-by: Alexey Charkov <[email protected]>
---
arch/arm64/boot/dts/rockchip/rk3588-evb1-v10.dts | 12 ++++++++++++
arch/arm64/boot/dts/rockchip/rk3588-quartzpro64.dts | 12 ++++++++++++
2 files changed, 24 insertions(+)
diff --git a/arch/arm64/boot/dts/rockchip/rk3588-evb1-v10.dts b/arch/arm64/boot/dts/rockchip/rk3588-evb1-v10.dts
index 7c3696a3ad3a..00f660d50127 100644
--- a/arch/arm64/boot/dts/rockchip/rk3588-evb1-v10.dts
+++ b/arch/arm64/boot/dts/rockchip/rk3588-evb1-v10.dts
@@ -878,6 +878,8 @@ regulators {
vdd_cpu_big1_s0: dcdc-reg1 {
regulator-always-on;
regulator-boot-on;
+ regulator-coupled-with = <&vdd_cpu_big1_mem_s0>;
+ regulator-coupled-max-spread = <10000>;
regulator-min-microvolt = <550000>;
regulator-max-microvolt = <1050000>;
regulator-ramp-delay = <12500>;
@@ -890,6 +892,8 @@ regulator-state-mem {
vdd_cpu_big0_s0: dcdc-reg2 {
regulator-always-on;
regulator-boot-on;
+ regulator-coupled-with = <&vdd_cpu_big0_mem_s0>;
+ regulator-coupled-max-spread = <10000>;
regulator-min-microvolt = <550000>;
regulator-max-microvolt = <1050000>;
regulator-ramp-delay = <12500>;
@@ -902,6 +906,8 @@ regulator-state-mem {
vdd_cpu_lit_s0: dcdc-reg3 {
regulator-always-on;
regulator-boot-on;
+ regulator-coupled-with = <&vdd_cpu_lit_mem_s0>;
+ regulator-coupled-max-spread = <10000>;
regulator-min-microvolt = <550000>;
regulator-max-microvolt = <950000>;
regulator-ramp-delay = <12500>;
@@ -926,6 +932,8 @@ regulator-state-mem {
vdd_cpu_big1_mem_s0: dcdc-reg5 {
regulator-always-on;
regulator-boot-on;
+ regulator-coupled-with = <&vdd_cpu_big1_s0>;
+ regulator-coupled-max-spread = <10000>;
regulator-min-microvolt = <675000>;
regulator-max-microvolt = <1050000>;
regulator-ramp-delay = <12500>;
@@ -939,6 +947,8 @@ regulator-state-mem {
vdd_cpu_big0_mem_s0: dcdc-reg6 {
regulator-always-on;
regulator-boot-on;
+ regulator-coupled-with = <&vdd_cpu_big0_s0>;
+ regulator-coupled-max-spread = <10000>;
regulator-min-microvolt = <675000>;
regulator-max-microvolt = <1050000>;
regulator-ramp-delay = <12500>;
@@ -963,6 +973,8 @@ regulator-state-mem {
vdd_cpu_lit_mem_s0: dcdc-reg8 {
regulator-always-on;
regulator-boot-on;
+ regulator-coupled-with = <&vdd_cpu_lit_s0>;
+ regulator-coupled-max-spread = <10000>;
regulator-min-microvolt = <675000>;
regulator-max-microvolt = <950000>;
regulator-ramp-delay = <12500>;
diff --git a/arch/arm64/boot/dts/rockchip/rk3588-quartzpro64.dts b/arch/arm64/boot/dts/rockchip/rk3588-quartzpro64.dts
index b4f22d95ac0e..baeb08d665c7 100644
--- a/arch/arm64/boot/dts/rockchip/rk3588-quartzpro64.dts
+++ b/arch/arm64/boot/dts/rockchip/rk3588-quartzpro64.dts
@@ -832,6 +832,8 @@ vdd_cpu_big1_s0: dcdc-reg1 {
regulator-name = "vdd_cpu_big1_s0";
regulator-always-on;
regulator-boot-on;
+ regulator-coupled-with = <&vdd_cpu_big1_mem_s0>;
+ regulator-coupled-max-spread = <10000>;
regulator-min-microvolt = <550000>;
regulator-max-microvolt = <1050000>;
regulator-ramp-delay = <12500>;
@@ -845,6 +847,8 @@ vdd_cpu_big0_s0: dcdc-reg2 {
regulator-name = "vdd_cpu_big0_s0";
regulator-always-on;
regulator-boot-on;
+ regulator-coupled-with = <&vdd_cpu_big0_mem_s0>;
+ regulator-coupled-max-spread = <10000>;
regulator-min-microvolt = <550000>;
regulator-max-microvolt = <1050000>;
regulator-ramp-delay = <12500>;
@@ -858,6 +862,8 @@ vdd_cpu_lit_s0: dcdc-reg3 {
regulator-name = "vdd_cpu_lit_s0";
regulator-always-on;
regulator-boot-on;
+ regulator-coupled-with = <&vdd_cpu_lit_mem_s0>;
+ regulator-coupled-max-spread = <10000>;
regulator-min-microvolt = <550000>;
regulator-max-microvolt = <950000>;
regulator-ramp-delay = <12500>;
@@ -884,6 +890,8 @@ vdd_cpu_big1_mem_s0: dcdc-reg5 {
regulator-name = "vdd_cpu_big1_mem_s0";
regulator-always-on;
regulator-boot-on;
+ regulator-coupled-with = <&vdd_cpu_big1_s0>;
+ regulator-coupled-max-spread = <10000>;
regulator-min-microvolt = <675000>;
regulator-max-microvolt = <1050000>;
regulator-ramp-delay = <12500>;
@@ -898,6 +906,8 @@ vdd_cpu_big0_mem_s0: dcdc-reg6 {
regulator-name = "vdd_cpu_big0_mem_s0";
regulator-always-on;
regulator-boot-on;
+ regulator-coupled-with = <&vdd_cpu_big0_s0>;
+ regulator-coupled-max-spread = <10000>;
regulator-min-microvolt = <675000>;
regulator-max-microvolt = <1050000>;
regulator-ramp-delay = <12500>;
@@ -924,6 +934,8 @@ vdd_cpu_lit_mem_s0: dcdc-reg8 {
regulator-name = "vdd_cpu_lit_mem_s0";
regulator-always-on;
regulator-boot-on;
+ regulator-coupled-with = <&vdd_cpu_lit_s0>;
+ regulator-coupled-max-spread = <10000>;
regulator-min-microvolt = <675000>;
regulator-max-microvolt = <950000>;
regulator-ramp-delay = <12500>;
--
2.45.0
By default the CPUs on RK3588 start up in a conservative performance
mode. Add frequency and voltage mappings to the device tree to enable
dynamic scaling via cpufreq.
OPP values are adapted from Radxa's downstream kernel for Rock 5B [1],
stripping them down to the minimum frequency and voltage combinations
as expected by the generic upstream cpufreq-dt driver, and also dropping
those OPPs that don't differ in voltage but only in frequency (keeping
the top frequency OPP in each case).
Note that this patch ignores voltage scaling for the CPU memory
interface which the downstream kernel does through a custom cpufreq
driver, and which is why the downstream version has two sets of voltage
values for each OPP (the second one being meant for the memory
interface supply regulator). This is done instead via regulator
coupling between CPU and memory interface supplies on affected boards.
This has been tested on Rock 5B with u-boot 2023.11 compiled from
Collabora's integration tree [2] with binary bl31 and appears to be
stable both under active cooling and passive cooling (with throttling)
[1] https://github.com/radxa/kernel/blob/stable-5.10-rock5/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
[2] https://gitlab.collabora.com/hardware-enablement/rockchip-3588/u-boot
Signed-off-by: Alexey Charkov <[email protected]>
---
arch/arm64/boot/dts/rockchip/rk3588s.dtsi | 122 ++++++++++++++++++++++++++++++
1 file changed, 122 insertions(+)
diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
index 57c2d998ae75..85c25d5efdad 100644
--- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
@@ -97,6 +97,7 @@ cpu_l0: cpu@0 {
clocks = <&scmi_clk SCMI_CLK_CPUL>;
assigned-clocks = <&scmi_clk SCMI_CLK_CPUL>;
assigned-clock-rates = <816000000>;
+ operating-points-v2 = <&cluster0_opp_table>;
cpu-idle-states = <&CPU_SLEEP>;
i-cache-size = <32768>;
i-cache-line-size = <64>;
@@ -116,6 +117,7 @@ cpu_l1: cpu@100 {
enable-method = "psci";
capacity-dmips-mhz = <530>;
clocks = <&scmi_clk SCMI_CLK_CPUL>;
+ operating-points-v2 = <&cluster0_opp_table>;
cpu-idle-states = <&CPU_SLEEP>;
i-cache-size = <32768>;
i-cache-line-size = <64>;
@@ -135,6 +137,7 @@ cpu_l2: cpu@200 {
enable-method = "psci";
capacity-dmips-mhz = <530>;
clocks = <&scmi_clk SCMI_CLK_CPUL>;
+ operating-points-v2 = <&cluster0_opp_table>;
cpu-idle-states = <&CPU_SLEEP>;
i-cache-size = <32768>;
i-cache-line-size = <64>;
@@ -154,6 +157,7 @@ cpu_l3: cpu@300 {
enable-method = "psci";
capacity-dmips-mhz = <530>;
clocks = <&scmi_clk SCMI_CLK_CPUL>;
+ operating-points-v2 = <&cluster0_opp_table>;
cpu-idle-states = <&CPU_SLEEP>;
i-cache-size = <32768>;
i-cache-line-size = <64>;
@@ -175,6 +179,7 @@ cpu_b0: cpu@400 {
clocks = <&scmi_clk SCMI_CLK_CPUB01>;
assigned-clocks = <&scmi_clk SCMI_CLK_CPUB01>;
assigned-clock-rates = <816000000>;
+ operating-points-v2 = <&cluster1_opp_table>;
cpu-idle-states = <&CPU_SLEEP>;
i-cache-size = <65536>;
i-cache-line-size = <64>;
@@ -194,6 +199,7 @@ cpu_b1: cpu@500 {
enable-method = "psci";
capacity-dmips-mhz = <1024>;
clocks = <&scmi_clk SCMI_CLK_CPUB01>;
+ operating-points-v2 = <&cluster1_opp_table>;
cpu-idle-states = <&CPU_SLEEP>;
i-cache-size = <65536>;
i-cache-line-size = <64>;
@@ -215,6 +221,7 @@ cpu_b2: cpu@600 {
clocks = <&scmi_clk SCMI_CLK_CPUB23>;
assigned-clocks = <&scmi_clk SCMI_CLK_CPUB23>;
assigned-clock-rates = <816000000>;
+ operating-points-v2 = <&cluster2_opp_table>;
cpu-idle-states = <&CPU_SLEEP>;
i-cache-size = <65536>;
i-cache-line-size = <64>;
@@ -234,6 +241,7 @@ cpu_b3: cpu@700 {
enable-method = "psci";
capacity-dmips-mhz = <1024>;
clocks = <&scmi_clk SCMI_CLK_CPUB23>;
+ operating-points-v2 = <&cluster2_opp_table>;
cpu-idle-states = <&CPU_SLEEP>;
i-cache-size = <65536>;
i-cache-line-size = <64>;
@@ -348,6 +356,120 @@ l3_cache: l3-cache {
};
};
+ cluster0_opp_table: opp-table-cluster0 {
+ compatible = "operating-points-v2";
+ opp-shared;
+
+ opp-1008000000 {
+ opp-hz = /bits/ 64 <1008000000>;
+ opp-microvolt = <675000 675000 950000>;
+ clock-latency-ns = <40000>;
+ };
+ opp-1200000000 {
+ opp-hz = /bits/ 64 <1200000000>;
+ opp-microvolt = <712500 712500 950000>;
+ clock-latency-ns = <40000>;
+ };
+ opp-1416000000 {
+ opp-hz = /bits/ 64 <1416000000>;
+ opp-microvolt = <762500 762500 950000>;
+ clock-latency-ns = <40000>;
+ opp-suspend;
+ };
+ opp-1608000000 {
+ opp-hz = /bits/ 64 <1608000000>;
+ opp-microvolt = <850000 850000 950000>;
+ clock-latency-ns = <40000>;
+ };
+ opp-1800000000 {
+ opp-hz = /bits/ 64 <1800000000>;
+ opp-microvolt = <950000 950000 950000>;
+ clock-latency-ns = <40000>;
+ };
+ };
+
+ cluster1_opp_table: opp-table-cluster1 {
+ compatible = "operating-points-v2";
+ opp-shared;
+
+ opp-1200000000 {
+ opp-hz = /bits/ 64 <1200000000>;
+ opp-microvolt = <675000 675000 1000000>;
+ clock-latency-ns = <40000>;
+ };
+ opp-1416000000 {
+ opp-hz = /bits/ 64 <1416000000>;
+ opp-microvolt = <725000 725000 1000000>;
+ clock-latency-ns = <40000>;
+ };
+ opp-1608000000 {
+ opp-hz = /bits/ 64 <1608000000>;
+ opp-microvolt = <762500 762500 1000000>;
+ clock-latency-ns = <40000>;
+ };
+ opp-1800000000 {
+ opp-hz = /bits/ 64 <1800000000>;
+ opp-microvolt = <850000 850000 1000000>;
+ clock-latency-ns = <40000>;
+ };
+ opp-2016000000 {
+ opp-hz = /bits/ 64 <2016000000>;
+ opp-microvolt = <925000 925000 1000000>;
+ clock-latency-ns = <40000>;
+ };
+ opp-2208000000 {
+ opp-hz = /bits/ 64 <2208000000>;
+ opp-microvolt = <987500 987500 1000000>;
+ clock-latency-ns = <40000>;
+ };
+ opp-2400000000 {
+ opp-hz = /bits/ 64 <2400000000>;
+ opp-microvolt = <1000000 1000000 1000000>;
+ clock-latency-ns = <40000>;
+ };
+ };
+
+ cluster2_opp_table: opp-table-cluster2 {
+ compatible = "operating-points-v2";
+ opp-shared;
+
+ opp-1200000000 {
+ opp-hz = /bits/ 64 <1200000000>;
+ opp-microvolt = <675000 675000 1000000>;
+ clock-latency-ns = <40000>;
+ };
+ opp-1416000000 {
+ opp-hz = /bits/ 64 <1416000000>;
+ opp-microvolt = <725000 725000 1000000>;
+ clock-latency-ns = <40000>;
+ };
+ opp-1608000000 {
+ opp-hz = /bits/ 64 <1608000000>;
+ opp-microvolt = <762500 762500 1000000>;
+ clock-latency-ns = <40000>;
+ };
+ opp-1800000000 {
+ opp-hz = /bits/ 64 <1800000000>;
+ opp-microvolt = <850000 850000 1000000>;
+ clock-latency-ns = <40000>;
+ };
+ opp-2016000000 {
+ opp-hz = /bits/ 64 <2016000000>;
+ opp-microvolt = <925000 925000 1000000>;
+ clock-latency-ns = <40000>;
+ };
+ opp-2208000000 {
+ opp-hz = /bits/ 64 <2208000000>;
+ opp-microvolt = <987500 987500 1000000>;
+ clock-latency-ns = <40000>;
+ };
+ opp-2400000000 {
+ opp-hz = /bits/ 64 <2400000000>;
+ opp-microvolt = <1000000 1000000 1000000>;
+ clock-latency-ns = <40000>;
+ };
+ };
+
display_subsystem: display-subsystem {
compatible = "rockchip,display-subsystem";
ports = <&vop_out>;
--
2.45.0
As the GPU support on RK3588 has been merged upstream, along with OPP
values, add a corresponding cooling map for passive cooling using the GPU.
Signed-off-by: Alexey Charkov <[email protected]>
---
arch/arm64/boot/dts/rockchip/rk3588s.dtsi | 14 +++++++++++++-
1 file changed, 13 insertions(+), 1 deletion(-)
diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
index ef06c1f742e8..57c2d998ae75 100644
--- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
@@ -2487,17 +2487,29 @@ center_crit: center-crit {
};
gpu_thermal: gpu-thermal {
- polling-delay-passive = <0>;
+ polling-delay-passive = <100>;
polling-delay = <0>;
thermal-sensors = <&tsadc 5>;
trips {
+ gpu_alert: gpu-alert {
+ temperature = <85000>;
+ hysteresis = <2000>;
+ type = "passive";
+ };
gpu_crit: gpu-crit {
temperature = <115000>;
hysteresis = <0>;
type = "critical";
};
};
+ cooling-maps {
+ map0 {
+ trip = <&gpu_alert>;
+ cooling-device =
+ <&gpu THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+ };
+ };
};
npu_thermal: npu-thermal {
--
2.45.0
Hello Alexey,
Thanks for submitting the v4 of this series! Please, see a couple
of my comments below.
On 2024-05-06 11:36, Alexey Charkov wrote:
> This includes the necessary device tree data to allow thermal
> monitoring on RK3588(s) using the on-chip TSADC device, along with
> trip points for automatic thermal management.
>
> Each of the CPU clusters (one for the little cores and two for
> the big cores) get a passive cooling trip point at 85C, which
> will trigger DVFS throttling of the respective cluster upon
> reaching a high temperature condition.
>
> All zones also have a critical trip point at 115C, which will
> trigger a reset.
>
> Signed-off-by: Alexey Charkov <[email protected]>
> ---
> arch/arm64/boot/dts/rockchip/rk3588s.dtsi | 147
> ++++++++++++++++++++++++++++++
> 1 file changed, 147 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> index 6ac5ac8b48ab..ef06c1f742e8 100644
> --- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> +++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> @@ -10,6 +10,7 @@
> #include <dt-bindings/reset/rockchip,rk3588-cru.h>
> #include <dt-bindings/phy/phy.h>
> #include <dt-bindings/ata/ahci.h>
> +#include <dt-bindings/thermal/thermal.h>
>
> / {
> compatible = "rockchip,rk3588";
> @@ -2368,6 +2369,152 @@ pwm15: pwm@febf0030 {
> status = "disabled";
> };
>
> + thermal_zones: thermal-zones {
> + /* sensor near the center of the SoC */
> + package_thermal: package-thermal {
> + polling-delay-passive = <0>;
> + polling-delay = <0>;
> + thermal-sensors = <&tsadc 0>;
> +
> + trips {
> + package_crit: package-crit {
> + temperature = <115000>;
> + hysteresis = <0>;
> + type = "critical";
> + };
> + };
> + };
> +
> + /* sensor between A76 cores 0 and 1 */
> + bigcore0_thermal: bigcore0-thermal {
> + polling-delay-passive = <100>;
> + polling-delay = <0>;
> + thermal-sensors = <&tsadc 1>;
> +
> + trips {
> + bigcore0_alert: bigcore0-alert {
> + temperature = <85000>;
> + hysteresis = <2000>;
> + type = "passive";
> + };
Doesn't removing the second passive trip, which was present in the v3,
result in confusing the IPA governor?
> + bigcore0_crit: bigcore0-crit {
> + temperature = <115000>;
> + hysteresis = <0>;
> + type = "critical";
> + };
> + };
> + cooling-maps {
> + map0 {
> + trip = <&bigcore0_alert>;
> + cooling-device =
> + <&cpu_b0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
> + <&cpu_b1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
> + };
> + };
> + };
> +
> + /* sensor between A76 cores 2 and 3 */
> + bigcore2_thermal: bigcore2-thermal {
> + polling-delay-passive = <100>;
> + polling-delay = <0>;
> + thermal-sensors = <&tsadc 2>;
> +
> + trips {
> + bigcore2_alert: bigcore2-alert {
> + temperature = <85000>;
> + hysteresis = <2000>;
> + type = "passive";
> + };
The same question about the second passive trip applies here, and to
all similar cases below.
> + bigcore2_crit: bigcore2-crit {
> + temperature = <115000>;
> + hysteresis = <0>;
> + type = "critical";
> + };
> + };
> + cooling-maps {
> + map0 {
> + trip = <&bigcore2_alert>;
> + cooling-device =
> + <&cpu_b2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
> + <&cpu_b3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
> + };
> + };
> + };
> +
> + /* sensor between the four A55 cores */
> + little_core_thermal: littlecore-thermal {
> + polling-delay-passive = <100>;
> + polling-delay = <0>;
> + thermal-sensors = <&tsadc 3>;
> +
> + trips {
> + littlecore_alert: littlecore-alert {
> + temperature = <85000>;
> + hysteresis = <2000>;
> + type = "passive";
> + };
> + littlecore_crit: littlecore-crit {
> + temperature = <115000>;
> + hysteresis = <0>;
> + type = "critical";
> + };
> + };
> + cooling-maps {
> + map0 {
> + trip = <&littlecore_alert>;
> + cooling-device =
> + <&cpu_l0 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
> + <&cpu_l1 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
> + <&cpu_l2 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>,
> + <&cpu_l3 THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
> + };
> + };
> + };
> +
> + /* sensor near the PD_CENTER power domain */
> + center_thermal: center-thermal {
> + polling-delay-passive = <0>;
> + polling-delay = <0>;
> + thermal-sensors = <&tsadc 4>;
> +
> + trips {
> + center_crit: center-crit {
> + temperature = <115000>;
> + hysteresis = <0>;
> + type = "critical";
> + };
> + };
> + };
> +
> + gpu_thermal: gpu-thermal {
> + polling-delay-passive = <0>;
> + polling-delay = <0>;
> + thermal-sensors = <&tsadc 5>;
> +
> + trips {
> + gpu_crit: gpu-crit {
> + temperature = <115000>;
> + hysteresis = <0>;
> + type = "critical";
> + };
> + };
> + };
> +
> + npu_thermal: npu-thermal {
> + polling-delay-passive = <0>;
> + polling-delay = <0>;
> + thermal-sensors = <&tsadc 6>;
> +
> + trips {
> + npu_crit: npu-crit {
> + temperature = <115000>;
> + hysteresis = <0>;
> + type = "critical";
> + };
> + };
> + };
> + };
> +
> tsadc: tsadc@fec00000 {
> compatible = "rockchip,rk3588-tsadc";
> reg = <0x0 0xfec00000 0x0 0x400>;
Hello Dragan,
On Mon, May 6, 2024 at 1:52 PM Dragan Simic <[email protected]> wrote:
>
> Hello Alexey,
>
> Thanks for submitting the v4 of this series! Please, see a couple
> of my comments below.
>
> On 2024-05-06 11:36, Alexey Charkov wrote:
> > This includes the necessary device tree data to allow thermal
> > monitoring on RK3588(s) using the on-chip TSADC device, along with
> > trip points for automatic thermal management.
> >
> > Each of the CPU clusters (one for the little cores and two for
> > the big cores) get a passive cooling trip point at 85C, which
> > will trigger DVFS throttling of the respective cluster upon
> > reaching a high temperature condition.
> >
> > All zones also have a critical trip point at 115C, which will
> > trigger a reset.
> >
> > Signed-off-by: Alexey Charkov <[email protected]>
> > ---
> > arch/arm64/boot/dts/rockchip/rk3588s.dtsi | 147
> > ++++++++++++++++++++++++++++++
> > 1 file changed, 147 insertions(+)
> >
> > diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> > b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> > index 6ac5ac8b48ab..ef06c1f742e8 100644
> > --- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> > +++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> > @@ -10,6 +10,7 @@
> > #include <dt-bindings/reset/rockchip,rk3588-cru.h>
> > #include <dt-bindings/phy/phy.h>
> > #include <dt-bindings/ata/ahci.h>
> > +#include <dt-bindings/thermal/thermal.h>
> >
> > / {
> > compatible = "rockchip,rk3588";
> > @@ -2368,6 +2369,152 @@ pwm15: pwm@febf0030 {
> > status = "disabled";
> > };
> >
> > + thermal_zones: thermal-zones {
> > + /* sensor near the center of the SoC */
> > + package_thermal: package-thermal {
> > + polling-delay-passive = <0>;
> > + polling-delay = <0>;
> > + thermal-sensors = <&tsadc 0>;
> > +
> > + trips {
> > + package_crit: package-crit {
> > + temperature = <115000>;
> > + hysteresis = <0>;
> > + type = "critical";
> > + };
> > + };
> > + };
> > +
> > + /* sensor between A76 cores 0 and 1 */
> > + bigcore0_thermal: bigcore0-thermal {
> > + polling-delay-passive = <100>;
> > + polling-delay = <0>;
> > + thermal-sensors = <&tsadc 1>;
> > +
> > + trips {
> > + bigcore0_alert: bigcore0-alert {
> > + temperature = <85000>;
> > + hysteresis = <2000>;
> > + type = "passive";
> > + };
>
> Doesn't removing the second passive trip, which was present in the v3,
> result in confusing the IPA governor?
Not really - it will just treat the missing trip as 0C for its initial
PID calculations [1], and will continually run the governor as opposed
to putting it to rest when the temperature is below the "switch on"
value [2].
Getting the power allocation governor to work optimally (i.e. to
provide tangible benefits over, say, stepwise) is much more involved
than defining an arbitrary switch-on trip point, as it requires an
accurate estimate of sustainable power per thermal zone (which we
don't have for RK3588 in general, and furthermore it must depend a lot
on a particular cooling setup), and ideally some userspace
power/thermal model capable of tuning the PID coefficients and
updating them via sysfs based on how a particular system accumulates
and dissipates heat under different load.
So after thinking over it for a while I decided that those extra
passive trips were rather self-deceiving, as they are only useful in
the context of a power allocation governor but we do not have any of
the other pieces in place for the power allocation governor to work.
Better not to clutter the device tree IMO.
Best regards,
Alexey
[1] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/thermal/gov_power_allocator.c#n156
[2] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/thermal/gov_power_allocator.c#n487
On 2024-05-06 12:29, Alexey Charkov wrote:
> On Mon, May 6, 2024 at 1:52 PM Dragan Simic <[email protected]> wrote:
>> Thanks for submitting the v4 of this series! Please, see a couple
>> of my comments below.
>>
>> On 2024-05-06 11:36, Alexey Charkov wrote:
>> > This includes the necessary device tree data to allow thermal
>> > monitoring on RK3588(s) using the on-chip TSADC device, along with
>> > trip points for automatic thermal management.
>> >
>> > Each of the CPU clusters (one for the little cores and two for
>> > the big cores) get a passive cooling trip point at 85C, which
>> > will trigger DVFS throttling of the respective cluster upon
>> > reaching a high temperature condition.
>> >
>> > All zones also have a critical trip point at 115C, which will
>> > trigger a reset.
>> >
>> > Signed-off-by: Alexey Charkov <[email protected]>
>> > ---
>> > arch/arm64/boot/dts/rockchip/rk3588s.dtsi | 147
>> > ++++++++++++++++++++++++++++++
>> > 1 file changed, 147 insertions(+)
>> >
>> > diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
>> > b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
>> > index 6ac5ac8b48ab..ef06c1f742e8 100644
>> > --- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
>> > +++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
>> > @@ -10,6 +10,7 @@
>> > #include <dt-bindings/reset/rockchip,rk3588-cru.h>
>> > #include <dt-bindings/phy/phy.h>
>> > #include <dt-bindings/ata/ahci.h>
>> > +#include <dt-bindings/thermal/thermal.h>
>> >
>> > / {
>> > compatible = "rockchip,rk3588";
>> > @@ -2368,6 +2369,152 @@ pwm15: pwm@febf0030 {
>> > status = "disabled";
>> > };
>> >
>> > + thermal_zones: thermal-zones {
>> > + /* sensor near the center of the SoC */
>> > + package_thermal: package-thermal {
>> > + polling-delay-passive = <0>;
>> > + polling-delay = <0>;
>> > + thermal-sensors = <&tsadc 0>;
>> > +
>> > + trips {
>> > + package_crit: package-crit {
>> > + temperature = <115000>;
>> > + hysteresis = <0>;
>> > + type = "critical";
>> > + };
>> > + };
>> > + };
>> > +
>> > + /* sensor between A76 cores 0 and 1 */
>> > + bigcore0_thermal: bigcore0-thermal {
>> > + polling-delay-passive = <100>;
>> > + polling-delay = <0>;
>> > + thermal-sensors = <&tsadc 1>;
>> > +
>> > + trips {
>> > + bigcore0_alert: bigcore0-alert {
>> > + temperature = <85000>;
>> > + hysteresis = <2000>;
>> > + type = "passive";
>> > + };
>>
>> Doesn't removing the second passive trip, which was present in the v3,
>> result in confusing the IPA governor?
>
> Not really - it will just treat the missing trip as 0C for its initial
> PID calculations [1], and will continually run the governor as opposed
> to putting it to rest when the temperature is below the "switch on"
> value [2].
>
> Getting the power allocation governor to work optimally (i.e. to
> provide tangible benefits over, say, stepwise) is much more involved
> than defining an arbitrary switch-on trip point, as it requires an
> accurate estimate of sustainable power per thermal zone (which we
> don't have for RK3588 in general, and furthermore it must depend a lot
> on a particular cooling setup), and ideally some userspace
> power/thermal model capable of tuning the PID coefficients and
> updating them via sysfs based on how a particular system accumulates
> and dissipates heat under different load.
>
> So after thinking over it for a while I decided that those extra
> passive trips were rather self-deceiving, as they are only useful in
> the context of a power allocation governor but we do not have any of
> the other pieces in place for the power allocation governor to work.
> Better not to clutter the device tree IMO.
I see, thanks for the clarification. Please, give me some time
to thoroughly test your patches, which I'll hopefully be able to
do in the next few days.
> [1]
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/thermal/gov_power_allocator.c#n156
> [2]
> https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/drivers/thermal/gov_power_allocator.c#n487
Hi,
On Monday, 6 May 2024 11:36:33 CEST Alexey Charkov wrote:
> This enables the on-chip thermal monitoring sensor (TSADC) on all
> RK3588(s) boards that don't have it enabled yet. It provides temperature
> monitoring for the SoC and emergency thermal shutdowns, and is thus
> important to have in place before CPU DVFS is enabled, as high CPU
> operating performance points can overheat the chip quickly in the
> absence of thermal management.
>
> Signed-off-by: Alexey Charkov <[email protected]>
> ---
> arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts | 4 ++++
> 8 files changed, 32 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
> b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts index
> b8e15b76a8a6..21e96c212dd8 100644
> --- a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
> +++ b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
> @@ -742,6 +742,10 @@ regulator-state-mem {
> };
> };
>
> +&tsadc {
> + status = "okay";
> +};
> +
> &uart2 {
> pinctrl-0 = <&uart2m0_xfer>;
> status = "okay";
I built a kernel with v3 of your patch set and someone tested it on a ROCK 5B
'for me' and it had the following line in dmesg:
rockchip-thermal fec00000.tsadc: Missing rockchip,grf property
I'm guessing that turned up due to enabling tsadc, but (also) in v4 I didn't
see a change wrt "rockchip,grf".
Should that be done? (asking; I don't know)
Cheers,
Diederik
Hey Diederik and Alexey,
On 2024-05-06 14:28, Diederik de Haas wrote:
> On Monday, 6 May 2024 11:36:33 CEST Alexey Charkov wrote:
>> This enables the on-chip thermal monitoring sensor (TSADC) on all
>> RK3588(s) boards that don't have it enabled yet. It provides
>> temperature
>> monitoring for the SoC and emergency thermal shutdowns, and is thus
>> important to have in place before CPU DVFS is enabled, as high CPU
>> operating performance points can overheat the chip quickly in the
>> absence of thermal management.
>>
>> Signed-off-by: Alexey Charkov <[email protected]>
>> ---
>> arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts | 4
>> ++++
>> 8 files changed, 32 insertions(+)
>>
>> diff --git a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
>> b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts index
>> b8e15b76a8a6..21e96c212dd8 100644
>> --- a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
>> +++ b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
>> @@ -742,6 +742,10 @@ regulator-state-mem {
>> };
>> };
>>
>> +&tsadc {
>> + status = "okay";
>> +};
>> +
>> &uart2 {
>> pinctrl-0 = <&uart2m0_xfer>;
>> status = "okay";
>
> I built a kernel with v3 of your patch set and someone tested it on a
> ROCK 5B
> 'for me' and it had the following line in dmesg:
>
> rockchip-thermal fec00000.tsadc: Missing rockchip,grf property
>
> I'm guessing that turned up due to enabling tsadc, but (also) in v4 I
> didn't
> see a change wrt "rockchip,grf".
> Should that be done? (asking; I don't know)
Nice catch! As it turns out, having "rockchip,grf" defined isn't
needed for the RK3588, so this warning is of somewhat false nature.
In more detail, having "rockchip,grf" defined is actually required
only for some Rockchip SoCs, e.g. RK356x.
I can get this covered in my soon-to-be-submitted device-tree cleanup
patch series, if Alexey is fine with that.
Hello Diederik,
On Mon, May 6, 2024 at 4:29 PM Diederik de Haas <[email protected]> wrote:
>
> Hi,
>
> On Monday, 6 May 2024 11:36:33 CEST Alexey Charkov wrote:
> > This enables the on-chip thermal monitoring sensor (TSADC) on all
> > RK3588(s) boards that don't have it enabled yet. It provides temperature
> > monitoring for the SoC and emergency thermal shutdowns, and is thus
> > important to have in place before CPU DVFS is enabled, as high CPU
> > operating performance points can overheat the chip quickly in the
> > absence of thermal management.
> >
> > Signed-off-by: Alexey Charkov <[email protected]>
> > ---
> > arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts | 4 ++++
> > 8 files changed, 32 insertions(+)
> >
> > diff --git a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
> > b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts index
> > b8e15b76a8a6..21e96c212dd8 100644
> > --- a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
> > +++ b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
> > @@ -742,6 +742,10 @@ regulator-state-mem {
> > };
> > };
> >
> > +&tsadc {
> > + status = "okay";
> > +};
> > +
> > &uart2 {
> > pinctrl-0 = <&uart2m0_xfer>;
> > status = "okay";
>
> I built a kernel with v3 of your patch set and someone tested it on a ROCK 5B
> 'for me' and it had the following line in dmesg:
>
> rockchip-thermal fec00000.tsadc: Missing rockchip,grf property
>
> I'm guessing that turned up due to enabling tsadc, but (also) in v4 I didn't
> see a change wrt "rockchip,grf".
> Should that be done? (asking; I don't know)
I'm getting the same. Neither the mainline TSADC driver [1], nor the
downstream one [2] seems to use the grf pointer on RK3588 at all. It
still works in spite of that warning, although I can't see how (or if)
it configures the reset mechanism without those GRF registers.
Best regards,
Alexey
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/thermal/rockchip_thermal.c#n818
[2] https://github.com/radxa/kernel/blob/stable-5.10-rock5/drivers/thermal/rockchip_thermal.c#L961
Hi Alexey,
On 5/6/24 11:36 AM, Alexey Charkov wrote:
> By default the CPUs on RK3588 start up in a conservative performance
> mode. Add frequency and voltage mappings to the device tree to enable
> dynamic scaling via cpufreq.
>
> OPP values are adapted from Radxa's downstream kernel for Rock 5B [1],
> stripping them down to the minimum frequency and voltage combinations
> as expected by the generic upstream cpufreq-dt driver, and also dropping
> those OPPs that don't differ in voltage but only in frequency (keeping
> the top frequency OPP in each case).
>
> Note that this patch ignores voltage scaling for the CPU memory
> interface which the downstream kernel does through a custom cpufreq
> driver, and which is why the downstream version has two sets of voltage
> values for each OPP (the second one being meant for the memory
> interface supply regulator). This is done instead via regulator
> coupling between CPU and memory interface supplies on affected boards.
>
I'm not sure this is everything we need though.
For the LITTLE cores cluster, all OPPs up to 1.416GHz are using the same
opp-supported-hw, however the ones above, aren't.
1.608GHz, 1.704GHz and 1.8GHz are all using different opp-supported-hw.
Similarly, for the big cores clusters, all OPPs up to 1.608GHz are using
the same opp-supported-hw, but not the ones above.
1.8GHz and 2.016GHz, 2.208GHz, 2.256GHz, 2.304GHz, 2.352GHz and 2.4GHz
all have a different opp-supported-hw.
The values in that array are coming from cpu leakage (different for
LITTLE, big0 and big1 clusters) and "specification serial number"
(whatever that means), those are coming from the SoC OTP. In the
downstream kernel from Rockchip, the former value is called "SoC
Version" and the latter "Speed Grade".
I think this may have something to do with "binning" and I would see the
ones above the "common" OPPs as "overclocking". Not all CPUs would
support them and some may not run stable at some lower frequency than
their stable max. Adding Kever from Rockchip in Cc to have some input on
the need to support those.
Cheers,
Quentin
Hello Quentin,
On 2024-05-08 11:12, Quentin Schulz wrote:
> On 5/6/24 11:36 AM, Alexey Charkov wrote:
>> By default the CPUs on RK3588 start up in a conservative performance
>> mode. Add frequency and voltage mappings to the device tree to enable
>> dynamic scaling via cpufreq.
>>
>> OPP values are adapted from Radxa's downstream kernel for Rock 5B [1],
>> stripping them down to the minimum frequency and voltage combinations
>> as expected by the generic upstream cpufreq-dt driver, and also
>> dropping
>> those OPPs that don't differ in voltage but only in frequency (keeping
>> the top frequency OPP in each case).
>>
>> Note that this patch ignores voltage scaling for the CPU memory
>> interface which the downstream kernel does through a custom cpufreq
>> driver, and which is why the downstream version has two sets of
>> voltage
>> values for each OPP (the second one being meant for the memory
>> interface supply regulator). This is done instead via regulator
>> coupling between CPU and memory interface supplies on affected boards.
>
> I'm not sure this is everything we need though.
>
> For the LITTLE cores cluster, all OPPs up to 1.416GHz are using the
> same opp-supported-hw, however the ones above, aren't.
>
> 1.608GHz, 1.704GHz and 1.8GHz are all using different opp-supported-hw.
>
> Similarly, for the big cores clusters, all OPPs up to 1.608GHz are
> using the same opp-supported-hw, but not the ones above.
>
> 1.8GHz and 2.016GHz, 2.208GHz, 2.256GHz, 2.304GHz, 2.352GHz and 2.4GHz
> all have a different opp-supported-hw.
>
> The values in that array are coming from cpu leakage (different for
> LITTLE, big0 and big1 clusters) and "specification serial number"
> (whatever that means), those are coming from the SoC OTP. In the
> downstream kernel from Rockchip, the former value is called "SoC
> Version" and the latter "Speed Grade".
>
> I think this may have something to do with "binning" and I would see
> the ones above the "common" OPPs as "overclocking". Not all CPUs would
> support them and some may not run stable at some lower frequency than
> their stable max. Adding Kever from Rockchip in Cc to have some input
> on the need to support those.
Good point. We should remove the OPPs for both clusters that aren't
supported by all RK3588(s) binnings, to be on the safe side.
I'll hopefully dive into supporting different Rockchip binnings rather
soon. There's even more about that, and not just with the RK3588(s),
which I think I'll get all covered.
Hi Quentin,
On Wed, May 8, 2024 at 1:12 PM Quentin Schulz <[email protected]> wrote:
>
> Hi Alexey,
>
> On 5/6/24 11:36 AM, Alexey Charkov wrote:
> > By default the CPUs on RK3588 start up in a conservative performance
> > mode. Add frequency and voltage mappings to the device tree to enable
> > dynamic scaling via cpufreq.
> >
> > OPP values are adapted from Radxa's downstream kernel for Rock 5B [1],
> > stripping them down to the minimum frequency and voltage combinations
> > as expected by the generic upstream cpufreq-dt driver, and also dropping
> > those OPPs that don't differ in voltage but only in frequency (keeping
> > the top frequency OPP in each case).
> >
> > Note that this patch ignores voltage scaling for the CPU memory
> > interface which the downstream kernel does through a custom cpufreq
> > driver, and which is why the downstream version has two sets of voltage
> > values for each OPP (the second one being meant for the memory
> > interface supply regulator). This is done instead via regulator
> > coupling between CPU and memory interface supplies on affected boards.
> >
>
> I'm not sure this is everything we need though.
>
> For the LITTLE cores cluster, all OPPs up to 1.416GHz are using the same
> opp-supported-hw, however the ones above, aren't.
Thanks a lot for pointing this out - could you please elaborate which
downstream kernel you referred to?
> 1.608GHz, 1.704GHz and 1.8GHz are all using different opp-supported-hw.
In Radxa's downstream kernel source that I looked at [1] the LITTLE
core cluster has all OPPs listed with opp-supported-hw = <0xff
0xffff>;
> Similarly, for the big cores clusters, all OPPs up to 1.608GHz are using
> the same opp-supported-hw, but not the ones above.
>
> 1.8GHz and 2.016GHz, 2.208GHz, 2.256GHz, 2.304GHz, 2.352GHz and 2.4GHz
> all have a different opp-supported-hw.
Hmm, only 2.256GHz, 2.304GHz and 2.352GHz in the sources I'm looking
at have a different opp-supported-hw = <0xff 0x0>; (but note that I
dropped them all from my patch here)
> The values in that array are coming from cpu leakage (different for
> LITTLE, big0 and big1 clusters) and "specification serial number"
> (whatever that means), those are coming from the SoC OTP. In the
> downstream kernel from Rockchip, the former value is called "SoC
> Version" and the latter "Speed Grade".
From what I understood by studying Radxa's downstream kernel sources
and TF-A sources [2], the "leakage" in NVMEM cells drives the
selection of power-optimized voltage levels (opp-microvolt-L1 through
opp-microvolt-L7) for each OPP depending on a OTP-programmed silicon
quality metric, whereas in my patch I only kept the most conservative
voltage values for each OPP (i.e. highest-voltage default ones) and
not the power-optimized ones.
So the proposed patch should (supposedly?) work on any silicon, only
the heat death of the universe becomes marginally closer :)
> I think this may have something to do with "binning" and I would see the
> ones above the "common" OPPs as "overclocking". Not all CPUs would
> support them and some may not run stable at some lower frequency than
> their stable max. Adding Kever from Rockchip in Cc to have some input on
> the need to support those.
Would be great to understand those in more detail, indeed!
Thanks a lot,
Alexey
[1] https://github.com/radxa/kernel/blob/c428536281d69aeb2b3480f65b2b227210b61535/arch/arm64/boot/dts/rockchip/rk3588s.dtsi#L588
[2] https://lore.kernel.org/linux-rockchip/CABjd4YzTL=5S7cS8ACNAYVa730WA3iGd5L_wP1Vn9=f83RCORA@mail.gmail.com/
On 2024-05-08 12:50, Quentin Schulz wrote:
> I'm surprised that we removed the lowest frequencies at the same
> voltage, are they not even allowing us to save a teeny tiny bit of
> power consumption? (I'm asking because I'm pretty sure we'll
> eventually get customers complaining the CPU freq doesn't go in super
> low frequency "so this must be a way to consume less power in idle!").
Same-voltage, different-frequency OPPs are seen as inefficient, although
I don't share the same opinion. While the jury is still out, perhaps we
can omit them, and possibly add them later.
Hi Alexey,
On 5/8/24 11:43 AM, Alexey Charkov wrote:
> Hi Quentin,
>
> On Wed, May 8, 2024 at 1:12 PM Quentin Schulz <[email protected]> wrote:
>>
>> Hi Alexey,
>>
>> On 5/6/24 11:36 AM, Alexey Charkov wrote:
>>> By default the CPUs on RK3588 start up in a conservative performance
>>> mode. Add frequency and voltage mappings to the device tree to enable
>>> dynamic scaling via cpufreq.
>>>
>>> OPP values are adapted from Radxa's downstream kernel for Rock 5B [1],
>>> stripping them down to the minimum frequency and voltage combinations
>>> as expected by the generic upstream cpufreq-dt driver, and also dropping
>>> those OPPs that don't differ in voltage but only in frequency (keeping
>>> the top frequency OPP in each case).
>>>
>>> Note that this patch ignores voltage scaling for the CPU memory
>>> interface which the downstream kernel does through a custom cpufreq
>>> driver, and which is why the downstream version has two sets of voltage
>>> values for each OPP (the second one being meant for the memory
>>> interface supply regulator). This is done instead via regulator
>>> coupling between CPU and memory interface supplies on affected boards.
>>>
>>
>> I'm not sure this is everything we need though.
>>
>> For the LITTLE cores cluster, all OPPs up to 1.416GHz are using the same
>> opp-supported-hw, however the ones above, aren't.
>
> Thanks a lot for pointing this out - could you please elaborate which
> downstream kernel you referred to?
>
The one provided by Rockchip directly :) No intermediates.
I can give you the one we use on our products at the moment:
https://git.embedded.cherry.de/tiger-linux.git/ (or jaguar-linux,
doesn't matter).
The one that is (publicly) "maintained" by Rockchip is:
https://github.com/rockchip-linux/kernel/tree/develop-5.10
From Cherry's git repo:
"""
$ rg -B1 --color never -N opp-supported-hw
arch/arm64/boot/dts/rockchip/rk3588s.dtsi
opp-408000000 {
opp-supported-hw = <0xff 0xffff>;
--
opp-600000000 {
opp-supported-hw = <0xff 0xffff>;
--
opp-816000000 {
opp-supported-hw = <0xff 0xffff>;
--
opp-1008000000 {
opp-supported-hw = <0xff 0xffff>;
--
opp-1200000000 {
opp-supported-hw = <0xff 0xffff>;
--
opp-1416000000 {
opp-supported-hw = <0xff 0xffff>;
--
opp-1608000000 {
opp-supported-hw = <0xfb 0xffff>;
--
opp-1704000000 {
opp-supported-hw = <0x02 0xffff>;
--
opp-1800000000 {
opp-supported-hw = <0xf9 0xffff>;
--
opp-408000000 {
opp-supported-hw = <0xff 0xffff>;
--
opp-600000000 {
opp-supported-hw = <0xff 0xffff>;
--
opp-816000000 {
opp-supported-hw = <0xff 0xffff>;
--
opp-1008000000 {
opp-supported-hw = <0xff 0xffff>;
--
opp-1200000000 {
opp-supported-hw = <0xff 0xffff>;
--
opp-1416000000 {
opp-supported-hw = <0xff 0xffff>;
--
opp-1608000000 {
opp-supported-hw = <0xff 0xffff>;
--
opp-1800000000 {
opp-supported-hw = <0xfb 0xffff>;
--
opp-2016000000 {
opp-supported-hw = <0xfb 0xffff>;
--
opp-2208000000 {
opp-supported-hw = <0xf9 0xffff>;
--
opp-2256000000 {
opp-supported-hw = <0xf9 0x13>;
--
opp-2304000000 {
opp-supported-hw = <0xf9 0x24>;
--
opp-2352000000 {
opp-supported-hw = <0xf9 0x48>;
--
opp-2400000000 {
opp-supported-hw = <0xf9 0x80>;
--
opp-408000000 {
opp-supported-hw = <0xff 0x0ffff>;
--
opp-600000000 {
opp-supported-hw = <0xff 0xffff>;
--
opp-816000000 {
opp-supported-hw = <0xff 0xffff>;
--
opp-1008000000 {
opp-supported-hw = <0xff 0xffff>;
--
opp-1200000000 {
opp-supported-hw = <0xff 0xffff>;
--
opp-1416000000 {
opp-supported-hw = <0xff 0xffff>;
--
opp-1608000000 {
opp-supported-hw = <0xff 0xffff>;
--
opp-1800000000 {
opp-supported-hw = <0xfb 0xffff>;
--
opp-2016000000 {
opp-supported-hw = <0xfb 0xffff>;
--
opp-2208000000 {
opp-supported-hw = <0xf9 0xffff>;
--
opp-2256000000 {
opp-supported-hw = <0xf9 0x13>;
--
opp-2304000000 {
opp-supported-hw = <0xf9 0x24>;
--
opp-2352000000 {
opp-supported-hw = <0xf9 0x48>;
--
opp-2400000000 {
opp-supported-hw = <0xf9 0x80>;
--
opp-300000000 {
opp-supported-hw = <0xff 0xffff>;
--
opp-400000000 {
opp-supported-hw = <0xff 0xffff>;
--
opp-500000000 {
opp-supported-hw = <0xff 0xffff>;
--
opp-600000000 {
opp-supported-hw = <0xff 0xffff>;
--
opp-700000000 {
opp-supported-hw = <0xff 0xffff>;
--
opp-800000000 {
opp-supported-hw = <0xff 0xffff>;
--
opp-900000000 {
opp-supported-hw = <0xfb 0xffff>;
--
opp-1000000000 {
opp-supported-hw = <0xfb 0xffff>;
--
opp-300000000 {
opp-supported-hw = <0xff 0xffff>;
--
opp-400000000 {
opp-supported-hw = <0xff 0xffff>;
--
opp-500000000 {
opp-supported-hw = <0xff 0xffff>;
--
opp-600000000 {
opp-supported-hw = <0xff 0xffff>;
--
opp-700000000 {
opp-supported-hw = <0xff 0xffff>;
--
opp-800000000 {
opp-supported-hw = <0xff 0xffff>;
--
opp-900000000 {
opp-supported-hw = <0xfb 0xffff>;
--
opp-1000000000 {
opp-supported-hw = <0xfb 0xffff>;
"""
In order: LITTLE, big0, big1, DMC (memory), GPU and then NPU OPP table.
Looking at the 6.1 development branch from Rockchip
(https://github.com/JeffyCN/mirrors/blob/kernel-6.1). The LITTLE cluster
OPPs seem to all be using the same opp-supported-hw entry now (but
different from the one in 5.10). But, the big cluster OPPs in 6.1 are
matching the one in 5.10 (that is, not the ones from Radxa).
>> 1.608GHz, 1.704GHz and 1.8GHz are all using different opp-supported-hw.
>
> In Radxa's downstream kernel source that I looked at [1] the LITTLE
> core cluster has all OPPs listed with opp-supported-hw = <0xff
> 0xffff>;
>
>> Similarly, for the big cores clusters, all OPPs up to 1.608GHz are using
>> the same opp-supported-hw, but not the ones above.
>>
>> 1.8GHz and 2.016GHz, 2.208GHz, 2.256GHz, 2.304GHz, 2.352GHz and 2.4GHz
>> all have a different opp-supported-hw.
>
> Hmm, only 2.256GHz, 2.304GHz and 2.352GHz in the sources I'm looking
> at have a different opp-supported-hw = <0xff 0x0>; (but note that I
> dropped them all from my patch here)
>
Seems to be a change made by Radxa folks:
https://github.com/radxa/kernel/commit/cf277d5eb46ef55517afffa10d48dd71bdd00c61
(yay to no commit log \o/)
>> The values in that array are coming from cpu leakage (different for
>> LITTLE, big0 and big1 clusters) and "specification serial number"
>> (whatever that means), those are coming from the SoC OTP. In the
>> downstream kernel from Rockchip, the former value is called "SoC
>> Version" and the latter "Speed Grade".
>
> From what I understood by studying Radxa's downstream kernel sources
> and TF-A sources [2], the "leakage" in NVMEM cells drives the
> selection of power-optimized voltage levels (opp-microvolt-L1 through
> opp-microvolt-L7) for each OPP depending on a OTP-programmed silicon
> quality metric, whereas in my patch I only kept the most conservative
> voltage values for each OPP (i.e. highest-voltage default ones) and
> not the power-optimized ones.
>
> So the proposed patch should (supposedly?) work on any silicon, only
> the heat death of the universe becomes marginally closer :)
>
An OPP from the DT is selected if _opp_is_supported returns true. This
is based on supported_hw member of the opp_table, which we set through
dev_pm_opp_set_supported_hw. This is called by
drivers/cpufreq/rockchip-cpufreq.c with two values: SoC Version and
Speed Grade. The SoC version is a bitmap set by rk3588_get_soc_info by
reading specification_serial_number region in the OTP and reading the
first byte. If it is anything but 0xd (RK3588M) or 0xa (RK3588J), it is
BIT(0).
To know if the opp is supported, you extract the first value of the
array and mask it with the value gotten from rk3588_get_soc_info (the
bitfield). This means that for RK3588 (and not the M or J variant), the
first value of the OPP opp-supported-hw is a match if it is an odd
number, so only opp-1704000000 in LITTLE cluster is excluded (on that
sole match).
The second value in opp-supported-hw seems to be derived somehow from
the cpu_leakage OTP. This is likely the same rabbit hole you dug two
months ago, so I'll trust your findings there to avoid getting my hands
dirty :)
In summary, false alarm (but still surprising changes made by Radxa
here, not that they matter if they only run their kernel on "pure"
RK3588). Sorry for the noise, and thanks for the explanations :)
I'm surprised that we removed the lowest frequencies at the same
voltage, are they not even allowing us to save a teeny tiny bit of power
consumption? (I'm asking because I'm pretty sure we'll eventually get
customers complaining the CPU freq doesn't go in super low frequency "so
this must be a way to consume less power in idle!").
Cheers,
Quentin
Hi Alexey,
On Mon, 6 May 2024 at 18:24, Alexey Charkov <[email protected]> wrote:
>
> Hello Diederik,
>
> On Mon, May 6, 2024 at 4:29 PM Diederik de Haas <[email protected]> wrote:
> >
> > Hi,
> >
> > On Monday, 6 May 2024 11:36:33 CEST Alexey Charkov wrote:
> > > This enables the on-chip thermal monitoring sensor (TSADC) on all
> > > RK3588(s) boards that don't have it enabled yet. It provides temperature
> > > monitoring for the SoC and emergency thermal shutdowns, and is thus
> > > important to have in place before CPU DVFS is enabled, as high CPU
> > > operating performance points can overheat the chip quickly in the
> > > absence of thermal management.
> > >
> > > Signed-off-by: Alexey Charkov <[email protected]>
> > > ---
> > > arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts | 4 ++++
> > > 8 files changed, 32 insertions(+)
> > >
> > > diff --git a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
> > > b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts index
> > > b8e15b76a8a6..21e96c212dd8 100644
> > > --- a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
> > > +++ b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
> > > @@ -742,6 +742,10 @@ regulator-state-mem {
> > > };
> > > };
> > >
> > > +&tsadc {
> > > + status = "okay";
> > > +};
> > > +
> > > &uart2 {
> > > pinctrl-0 = <&uart2m0_xfer>;
> > > status = "okay";
> >
> > I built a kernel with v3 of your patch set and someone tested it on a ROCK 5B
> > 'for me' and it had the following line in dmesg:
> >
> > rockchip-thermal fec00000.tsadc: Missing rockchip,grf property
> >
> > I'm guessing that turned up due to enabling tsadc, but (also) in v4 I didn't
> > see a change wrt "rockchip,grf".
> > Should that be done? (asking; I don't know)
>
> I'm getting the same. Neither the mainline TSADC driver [1], nor the
> downstream one [2] seems to use the grf pointer on RK3588 at all. It
> still works in spite of that warning, although I can't see how (or if)
> it configures the reset mechanism without those GRF registers.
>
> Best regards,
> Alexey
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/thermal/rockchip_thermal.c#n818
> [2] https://github.com/radxa/kernel/blob/stable-5.10-rock5/drivers/thermal/rockchip_thermal.c#L961
>
If the following changes fix the warning.
Checking the Rockchip RK3588 TRM V1.0-Part1-20220309.pdf
PMU1GRF_SOC_CON3 which has tsadc_shut_reset_trigger_en bit
to control the Enable TSADC shut reset trigger for DDR fail safe.
diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
index 85c25d5efdad..5490a44e093e 100644
--- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
+++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
@@ -2662,6 +2662,7 @@ tsadc: tsadc@fec00000 {
rockchip,hw-tshut-temp = <120000>;
rockchip,hw-tshut-mode = <0>; /* tshut mode 0:CRU 1:GPIO */
rockchip,hw-tshut-polarity = <0>; /* tshut polarity
0:LOW 1:HIGH */
+ rockchip,pmu = <&pmu1grf>;
pinctrl-0 = <&tsadc_gpio_func>;
pinctrl-1 = <&tsadc_shut>;
pinctrl-names = "gpio", "otpout";
Thanks
-Anand
> _______________________________________________
> Linux-rockchip mailing list
> [email protected]
> http://lists.infradead.org/mailman/listinfo/linux-rockchip
Hello Anand,
On 2024-05-08 13:40, Anand Moon wrote:
> On Mon, 6 May 2024 at 18:24, Alexey Charkov <[email protected]> wrote:
>> On Mon, May 6, 2024 at 4:29 PM Diederik de Haas
>> <[email protected]> wrote:
>> > On Monday, 6 May 2024 11:36:33 CEST Alexey Charkov wrote:
>> > > This enables the on-chip thermal monitoring sensor (TSADC) on all
>> > > RK3588(s) boards that don't have it enabled yet. It provides temperature
>> > > monitoring for the SoC and emergency thermal shutdowns, and is thus
>> > > important to have in place before CPU DVFS is enabled, as high CPU
>> > > operating performance points can overheat the chip quickly in the
>> > > absence of thermal management.
>> > >
>> > > Signed-off-by: Alexey Charkov <[email protected]>
>> > > ---
>> > > arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts | 4 ++++
>> > > 8 files changed, 32 insertions(+)
>> > >
>> > > diff --git a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
>> > > b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts index
>> > > b8e15b76a8a6..21e96c212dd8 100644
>> > > --- a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
>> > > +++ b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
>> > > @@ -742,6 +742,10 @@ regulator-state-mem {
>> > > };
>> > > };
>> > >
>> > > +&tsadc {
>> > > + status = "okay";
>> > > +};
>> > > +
>> > > &uart2 {
>> > > pinctrl-0 = <&uart2m0_xfer>;
>> > > status = "okay";
>> >
>> > I built a kernel with v3 of your patch set and someone tested it on a ROCK 5B
>> > 'for me' and it had the following line in dmesg:
>> >
>> > rockchip-thermal fec00000.tsadc: Missing rockchip,grf property
>> >
>> > I'm guessing that turned up due to enabling tsadc, but (also) in v4 I didn't
>> > see a change wrt "rockchip,grf".
>> > Should that be done? (asking; I don't know)
>>
>> I'm getting the same. Neither the mainline TSADC driver [1], nor the
>> downstream one [2] seems to use the grf pointer on RK3588 at all. It
>> still works in spite of that warning, although I can't see how (or if)
>> it configures the reset mechanism without those GRF registers.
>>
>> Best regards,
>> Alexey
>>
>> [1]
>> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/thermal/rockchip_thermal.c#n818
>> [2]
>> https://github.com/radxa/kernel/blob/stable-5.10-rock5/drivers/thermal/rockchip_thermal.c#L961
>>
>
> If the following changes fix the warning.
>
> Checking the Rockchip RK3588 TRM V1.0-Part1-20220309.pdf
> PMU1GRF_SOC_CON3 which has tsadc_shut_reset_trigger_en bit
> to control the Enable TSADC shut reset trigger for DDR fail safe.
>
> diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> index 85c25d5efdad..5490a44e093e 100644
> --- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> +++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> @@ -2662,6 +2662,7 @@ tsadc: tsadc@fec00000 {
> rockchip,hw-tshut-temp = <120000>;
> rockchip,hw-tshut-mode = <0>; /* tshut mode 0:CRU
> 1:GPIO */
> rockchip,hw-tshut-polarity = <0>; /* tshut polarity
> 0:LOW 1:HIGH */
> + rockchip,pmu = <&pmu1grf>;
> pinctrl-0 = <&tsadc_gpio_func>;
> pinctrl-1 = <&tsadc_shut>;
> pinctrl-names = "gpio", "otpout";
Basically, the rockchip_thermal driver doesn't use GRF at all on
the RK3588(s), so virtually any value specified as "rockchip,pmu"
can eliminate the warning.
I'm already working on a rather large device-tree cleanup series,
and this is already fixed in it. Are you fine with dropping your
patch as a separate one, and I'll tag you with Co-developed-by in
the relevant patch from my series?
Hello Dragan and Anand,
On Wed, May 8, 2024 at 3:46 PM Dragan Simic <[email protected]> wrote:
>
> Hello Anand,
>
> On 2024-05-08 13:40, Anand Moon wrote:
> > On Mon, 6 May 2024 at 18:24, Alexey Charkov <[email protected]> wrote:
> >> On Mon, May 6, 2024 at 4:29 PM Diederik de Haas
> >> <[email protected]> wrote:
> >> > On Monday, 6 May 2024 11:36:33 CEST Alexey Charkov wrote:
> >> > > This enables the on-chip thermal monitoring sensor (TSADC) on all
> >> > > RK3588(s) boards that don't have it enabled yet. It provides temperature
> >> > > monitoring for the SoC and emergency thermal shutdowns, and is thus
> >> > > important to have in place before CPU DVFS is enabled, as high CPU
> >> > > operating performance points can overheat the chip quickly in the
> >> > > absence of thermal management.
> >> > >
> >> > > Signed-off-by: Alexey Charkov <[email protected]>
> >> > > ---
> >> > > arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts | 4 ++++
> >> > > 8 files changed, 32 insertions(+)
> >> > >
> >> > > diff --git a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
> >> > > b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts index
> >> > > b8e15b76a8a6..21e96c212dd8 100644
> >> > > --- a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
> >> > > +++ b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
> >> > > @@ -742,6 +742,10 @@ regulator-state-mem {
> >> > > };
> >> > > };
> >> > >
> >> > > +&tsadc {
> >> > > + status = "okay";
> >> > > +};
> >> > > +
> >> > > &uart2 {
> >> > > pinctrl-0 = <&uart2m0_xfer>;
> >> > > status = "okay";
> >> >
> >> > I built a kernel with v3 of your patch set and someone tested it on a ROCK 5B
> >> > 'for me' and it had the following line in dmesg:
> >> >
> >> > rockchip-thermal fec00000.tsadc: Missing rockchip,grf property
> >> >
> >> > I'm guessing that turned up due to enabling tsadc, but (also) in v4 I didn't
> >> > see a change wrt "rockchip,grf".
> >> > Should that be done? (asking; I don't know)
> >>
> >> I'm getting the same. Neither the mainline TSADC driver [1], nor the
> >> downstream one [2] seems to use the grf pointer on RK3588 at all. It
> >> still works in spite of that warning, although I can't see how (or if)
> >> it configures the reset mechanism without those GRF registers.
> >>
> >> Best regards,
> >> Alexey
> >>
> >> [1]
> >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/thermal/rockchip_thermal.c#n818
> >> [2]
> >> https://github.com/radxa/kernel/blob/stable-5.10-rock5/drivers/thermal/rockchip_thermal.c#L961
> >>
> >
> > If the following changes fix the warning.
> >
> > Checking the Rockchip RK3588 TRM V1.0-Part1-20220309.pdf
> > PMU1GRF_SOC_CON3 which has tsadc_shut_reset_trigger_en bit
> > to control the Enable TSADC shut reset trigger for DDR fail safe.
> >
> > diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> > b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> > index 85c25d5efdad..5490a44e093e 100644
> > --- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> > +++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> > @@ -2662,6 +2662,7 @@ tsadc: tsadc@fec00000 {
> > rockchip,hw-tshut-temp = <120000>;
> > rockchip,hw-tshut-mode = <0>; /* tshut mode 0:CRU
> > 1:GPIO */
> > rockchip,hw-tshut-polarity = <0>; /* tshut polarity
> > 0:LOW 1:HIGH */
> > + rockchip,pmu = <&pmu1grf>;
> > pinctrl-0 = <&tsadc_gpio_func>;
> > pinctrl-1 = <&tsadc_shut>;
> > pinctrl-names = "gpio", "otpout";
>
> Basically, the rockchip_thermal driver doesn't use GRF at all on
> the RK3588(s), so virtually any value specified as "rockchip,pmu"
> can eliminate the warning.
To me, specifying an arbitrary GRF in the device tree just to silence
a warning that the current driver emits sounds bad. If the GRF is not
needed for TSADC initialization on RK3588, then it should not be in
the device tree (and it looks like the case here) - otherwise the
device tree ceases to be a truthful description of the hardware.
I'm not sure if we need that "DDR fail safe" logic mentioned in the
TRM that Anand quoted, given that neither Rockchip downstream nor
current mainline driver implement it, and furthermore none of the
other SoC revisions supported by the driver mention it. If we do in
fact need it, we should probably first test it along with respective
driver code before committing to an upstream DT and thus making it
part of the ABI.
IMO this is more of a driver issue than a device tree issue: maybe a
small patch to demote this warning to an info message would be better?
It's harmless anyway.
Best regards,
Alexey
Hello Alexey,
On 2024-05-08 14:30, Alexey Charkov wrote:
> On Wed, May 8, 2024 at 3:46 PM Dragan Simic <[email protected]> wrote:
>> On 2024-05-08 13:40, Anand Moon wrote:
>> > On Mon, 6 May 2024 at 18:24, Alexey Charkov <[email protected]> wrote:
>> >> On Mon, May 6, 2024 at 4:29 PM Diederik de Haas
>> >> <[email protected]> wrote:
>> >> > On Monday, 6 May 2024 11:36:33 CEST Alexey Charkov wrote:
>> >> > > This enables the on-chip thermal monitoring sensor (TSADC) on all
>> >> > > RK3588(s) boards that don't have it enabled yet. It provides temperature
>> >> > > monitoring for the SoC and emergency thermal shutdowns, and is thus
>> >> > > important to have in place before CPU DVFS is enabled, as high CPU
>> >> > > operating performance points can overheat the chip quickly in the
>> >> > > absence of thermal management.
>> >> > >
>> >> > > Signed-off-by: Alexey Charkov <[email protected]>
>> >> > > ---
>> >> > > arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts | 4 ++++
>> >> > > 8 files changed, 32 insertions(+)
>> >> > >
>> >> > > diff --git a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
>> >> > > b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts index
>> >> > > b8e15b76a8a6..21e96c212dd8 100644
>> >> > > --- a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
>> >> > > +++ b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
>> >> > > @@ -742,6 +742,10 @@ regulator-state-mem {
>> >> > > };
>> >> > > };
>> >> > >
>> >> > > +&tsadc {
>> >> > > + status = "okay";
>> >> > > +};
>> >> > > +
>> >> > > &uart2 {
>> >> > > pinctrl-0 = <&uart2m0_xfer>;
>> >> > > status = "okay";
>> >> >
>> >> > I built a kernel with v3 of your patch set and someone tested it on a ROCK 5B
>> >> > 'for me' and it had the following line in dmesg:
>> >> >
>> >> > rockchip-thermal fec00000.tsadc: Missing rockchip,grf property
>> >> >
>> >> > I'm guessing that turned up due to enabling tsadc, but (also) in v4 I didn't
>> >> > see a change wrt "rockchip,grf".
>> >> > Should that be done? (asking; I don't know)
>> >>
>> >> I'm getting the same. Neither the mainline TSADC driver [1], nor the
>> >> downstream one [2] seems to use the grf pointer on RK3588 at all. It
>> >> still works in spite of that warning, although I can't see how (or if)
>> >> it configures the reset mechanism without those GRF registers.
>> >>
>> >> Best regards,
>> >> Alexey
>> >>
>> >> [1]
>> >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/thermal/rockchip_thermal.c#n818
>> >> [2]
>> >> https://github.com/radxa/kernel/blob/stable-5.10-rock5/drivers/thermal/rockchip_thermal.c#L961
>> >>
>> >
>> > If the following changes fix the warning.
>> >
>> > Checking the Rockchip RK3588 TRM V1.0-Part1-20220309.pdf
>> > PMU1GRF_SOC_CON3 which has tsadc_shut_reset_trigger_en bit
>> > to control the Enable TSADC shut reset trigger for DDR fail safe.
>> >
>> > diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
>> > b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
>> > index 85c25d5efdad..5490a44e093e 100644
>> > --- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
>> > +++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
>> > @@ -2662,6 +2662,7 @@ tsadc: tsadc@fec00000 {
>> > rockchip,hw-tshut-temp = <120000>;
>> > rockchip,hw-tshut-mode = <0>; /* tshut mode 0:CRU
>> > 1:GPIO */
>> > rockchip,hw-tshut-polarity = <0>; /* tshut polarity
>> > 0:LOW 1:HIGH */
>> > + rockchip,pmu = <&pmu1grf>;
>> > pinctrl-0 = <&tsadc_gpio_func>;
>> > pinctrl-1 = <&tsadc_shut>;
>> > pinctrl-names = "gpio", "otpout";
>>
>> Basically, the rockchip_thermal driver doesn't use GRF at all on
>> the RK3588(s), so virtually any value specified as "rockchip,pmu"
>> can eliminate the warning.
>
> To me, specifying an arbitrary GRF in the device tree just to silence
> a warning that the current driver emits sounds bad. If the GRF is not
> needed for TSADC initialization on RK3588, then it should not be in
> the device tree (and it looks like the case here) - otherwise the
> device tree ceases to be a truthful description of the hardware.
After thinking a bit more about it, I think you're right. If the
rockchip_thermal driver ever gets changed to require use of GRF on
the RK3588(s), only then adding that property to the DT would be
the right thing to do.
> I'm not sure if we need that "DDR fail safe" logic mentioned in the
> TRM that Anand quoted, given that neither Rockchip downstream nor
> current mainline driver implement it, and furthermore none of the
> other SoC revisions supported by the driver mention it. If we do in
> fact need it, we should probably first test it along with respective
> driver code before committing to an upstream DT and thus making it
> part of the ABI.
>
> IMO this is more of a driver issue than a device tree issue: maybe a
> small patch to demote this warning to an info message would be better?
> It's harmless anyway.
After having second thoughts, I'll see to improve the rockchip_thermal
driver to emit that warning only when having "rockchip,grf" specified
is actually needed for the particular Rockchip SoC. That's how it
should
behave, yelling about the wrong hardware description only when that's
actually the case.
Hi Dragan, Alexey,
On Wed, 8 May 2024 at 18:08, Dragan Simic <[email protected]> wrote:
>
> Hello Alexey,
>
> On 2024-05-08 14:30, Alexey Charkov wrote:
> > On Wed, May 8, 2024 at 3:46 PM Dragan Simic <[email protected]> wrote:
> >> On 2024-05-08 13:40, Anand Moon wrote:
> >> > On Mon, 6 May 2024 at 18:24, Alexey Charkov <[email protected]> wrote:
> >> >> On Mon, May 6, 2024 at 4:29 PM Diederik de Haas
> >> >> <[email protected]> wrote:
> >> >> > On Monday, 6 May 2024 11:36:33 CEST Alexey Charkov wrote:
> >> >> > > This enables the on-chip thermal monitoring sensor (TSADC) on all
> >> >> > > RK3588(s) boards that don't have it enabled yet. It provides temperature
> >> >> > > monitoring for the SoC and emergency thermal shutdowns, and is thus
> >> >> > > important to have in place before CPU DVFS is enabled, as high CPU
> >> >> > > operating performance points can overheat the chip quickly in the
> >> >> > > absence of thermal management.
> >> >> > >
> >> >> > > Signed-off-by: Alexey Charkov <[email protected]>
> >> >> > > ---
> >> >> > > arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts | 4 ++++
> >> >> > > 8 files changed, 32 insertions(+)
> >> >> > >
> >> >> > > diff --git a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
> >> >> > > b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts index
> >> >> > > b8e15b76a8a6..21e96c212dd8 100644
> >> >> > > --- a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
> >> >> > > +++ b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
> >> >> > > @@ -742,6 +742,10 @@ regulator-state-mem {
> >> >> > > };
> >> >> > > };
> >> >> > >
> >> >> > > +&tsadc {
> >> >> > > + status = "okay";
> >> >> > > +};
> >> >> > > +
> >> >> > > &uart2 {
> >> >> > > pinctrl-0 = <&uart2m0_xfer>;
> >> >> > > status = "okay";
> >> >> >
> >> >> > I built a kernel with v3 of your patch set and someone tested it on a ROCK 5B
> >> >> > 'for me' and it had the following line in dmesg:
> >> >> >
> >> >> > rockchip-thermal fec00000.tsadc: Missing rockchip,grf property
> >> >> >
> >> >> > I'm guessing that turned up due to enabling tsadc, but (also) in v4 I didn't
> >> >> > see a change wrt "rockchip,grf".
> >> >> > Should that be done? (asking; I don't know)
> >> >>
> >> >> I'm getting the same. Neither the mainline TSADC driver [1], nor the
> >> >> downstream one [2] seems to use the grf pointer on RK3588 at all. It
> >> >> still works in spite of that warning, although I can't see how (or if)
> >> >> it configures the reset mechanism without those GRF registers.
> >> >>
> >> >> Best regards,
> >> >> Alexey
> >> >>
> >> >> [1]
> >> >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/thermal/rockchip_thermal.c#n818
> >> >> [2]
> >> >> https://github.com/radxa/kernel/blob/stable-5.10-rock5/drivers/thermal/rockchip_thermal.c#L961
> >> >>
> >> >
> >> > If the following changes fix the warning.
> >> >
> >> > Checking the Rockchip RK3588 TRM V1.0-Part1-20220309.pdf
> >> > PMU1GRF_SOC_CON3 which has tsadc_shut_reset_trigger_en bit
> >> > to control the Enable TSADC shut reset trigger for DDR fail safe.
> >> >
> >> > diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> >> > b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> >> > index 85c25d5efdad..5490a44e093e 100644
> >> > --- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> >> > +++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> >> > @@ -2662,6 +2662,7 @@ tsadc: tsadc@fec00000 {
> >> > rockchip,hw-tshut-temp = <120000>;
> >> > rockchip,hw-tshut-mode = <0>; /* tshut mode 0:CRU
> >> > 1:GPIO */
> >> > rockchip,hw-tshut-polarity = <0>; /* tshut polarity
> >> > 0:LOW 1:HIGH */
> >> > + rockchip,pmu = <&pmu1grf>;
> >> > pinctrl-0 = <&tsadc_gpio_func>;
> >> > pinctrl-1 = <&tsadc_shut>;
> >> > pinctrl-names = "gpio", "otpout";
> >>
> >> Basically, the rockchip_thermal driver doesn't use GRF at all on
> >> the RK3588(s), so virtually any value specified as "rockchip,pmu"
> >> can eliminate the warning.
> >
> > To me, specifying an arbitrary GRF in the device tree just to silence
> > a warning that the current driver emits sounds bad. If the GRF is not
> > needed for TSADC initialization on RK3588, then it should not be in
> > the device tree (and it looks like the case here) - otherwise the
> > device tree ceases to be a truthful description of the hardware.
>
> After thinking a bit more about it, I think you're right. If the
> rockchip_thermal driver ever gets changed to require use of GRF on
> the RK3588(s), only then adding that property to the DT would be
> the right thing to do.
>
> > I'm not sure if we need that "DDR fail safe" logic mentioned in the
> > TRM that Anand quoted, given that neither Rockchip downstream nor
> > current mainline driver implement it, and furthermore none of the
> > other SoC revisions supported by the driver mention it. If we do in
> > fact need it, we should probably first test it along with respective
> > driver code before committing to an upstream DT and thus making it
> > part of the ABI.
> >
> > IMO this is more of a driver issue than a device tree issue: maybe a
> > small patch to demote this warning to an info message would be better?
> > It's harmless anyway.
>
> After having second thoughts, I'll see to improve the rockchip_thermal
> driver to emit that warning only when having "rockchip,grf" specified
> is actually needed for the particular Rockchip SoC. That's how it
> should
> behave, yelling about the wrong hardware description only when that's
> actually the case.
We need to handle the GRF TSADC for rk3588 tmu driver.
For rk3568 we control the GRF TSADC which seems to be missing in rk3588
[[0] https://elixir.bootlin.com/linux/v6.8.9/source/drivers/thermal/rockchip_thermal.c#L798
Thanks
-Anand
On Wed, May 8, 2024 at 4:51 PM Anand Moon <[email protected]> wrote:
>
> Hi Dragan, Alexey,
>
> On Wed, 8 May 2024 at 18:08, Dragan Simic <[email protected]> wrote:
> >
> > Hello Alexey,
> >
> > On 2024-05-08 14:30, Alexey Charkov wrote:
> > > On Wed, May 8, 2024 at 3:46 PM Dragan Simic <[email protected]> wrote:
> > >> On 2024-05-08 13:40, Anand Moon wrote:
> > >> > On Mon, 6 May 2024 at 18:24, Alexey Charkov <[email protected]> wrote:
> > >> >> On Mon, May 6, 2024 at 4:29 PM Diederik de Haas
> > >> >> <[email protected]> wrote:
> > >> >> > On Monday, 6 May 2024 11:36:33 CEST Alexey Charkov wrote:
> > >> >> > > This enables the on-chip thermal monitoring sensor (TSADC) on all
> > >> >> > > RK3588(s) boards that don't have it enabled yet. It provides temperature
> > >> >> > > monitoring for the SoC and emergency thermal shutdowns, and is thus
> > >> >> > > important to have in place before CPU DVFS is enabled, as high CPU
> > >> >> > > operating performance points can overheat the chip quickly in the
> > >> >> > > absence of thermal management.
> > >> >> > >
> > >> >> > > Signed-off-by: Alexey Charkov <[email protected]>
> > >> >> > > ---
> > >> >> > > arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts | 4 ++++
> > >> >> > > 8 files changed, 32 insertions(+)
> > >> >> > >
> > >> >> > > diff --git a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
> > >> >> > > b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts index
> > >> >> > > b8e15b76a8a6..21e96c212dd8 100644
> > >> >> > > --- a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
> > >> >> > > +++ b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
> > >> >> > > @@ -742,6 +742,10 @@ regulator-state-mem {
> > >> >> > > };
> > >> >> > > };
> > >> >> > >
> > >> >> > > +&tsadc {
> > >> >> > > + status = "okay";
> > >> >> > > +};
> > >> >> > > +
> > >> >> > > &uart2 {
> > >> >> > > pinctrl-0 = <&uart2m0_xfer>;
> > >> >> > > status = "okay";
> > >> >> >
> > >> >> > I built a kernel with v3 of your patch set and someone tested it on a ROCK 5B
> > >> >> > 'for me' and it had the following line in dmesg:
> > >> >> >
> > >> >> > rockchip-thermal fec00000.tsadc: Missing rockchip,grf property
> > >> >> >
> > >> >> > I'm guessing that turned up due to enabling tsadc, but (also) in v4 I didn't
> > >> >> > see a change wrt "rockchip,grf".
> > >> >> > Should that be done? (asking; I don't know)
> > >> >>
> > >> >> I'm getting the same. Neither the mainline TSADC driver [1], nor the
> > >> >> downstream one [2] seems to use the grf pointer on RK3588 at all. It
> > >> >> still works in spite of that warning, although I can't see how (or if)
> > >> >> it configures the reset mechanism without those GRF registers.
> > >> >>
> > >> >> Best regards,
> > >> >> Alexey
> > >> >>
> > >> >> [1]
> > >> >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/thermal/rockchip_thermal.c#n818
> > >> >> [2]
> > >> >> https://github.com/radxa/kernel/blob/stable-5.10-rock5/drivers/thermal/rockchip_thermal.c#L961
> > >> >>
> > >> >
> > >> > If the following changes fix the warning.
> > >> >
> > >> > Checking the Rockchip RK3588 TRM V1.0-Part1-20220309.pdf
> > >> > PMU1GRF_SOC_CON3 which has tsadc_shut_reset_trigger_en bit
> > >> > to control the Enable TSADC shut reset trigger for DDR fail safe.
> > >> >
> > >> > diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> > >> > b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> > >> > index 85c25d5efdad..5490a44e093e 100644
> > >> > --- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> > >> > +++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> > >> > @@ -2662,6 +2662,7 @@ tsadc: tsadc@fec00000 {
> > >> > rockchip,hw-tshut-temp = <120000>;
> > >> > rockchip,hw-tshut-mode = <0>; /* tshut mode 0:CRU
> > >> > 1:GPIO */
> > >> > rockchip,hw-tshut-polarity = <0>; /* tshut polarity
> > >> > 0:LOW 1:HIGH */
> > >> > + rockchip,pmu = <&pmu1grf>;
> > >> > pinctrl-0 = <&tsadc_gpio_func>;
> > >> > pinctrl-1 = <&tsadc_shut>;
> > >> > pinctrl-names = "gpio", "otpout";
> > >>
> > >> Basically, the rockchip_thermal driver doesn't use GRF at all on
> > >> the RK3588(s), so virtually any value specified as "rockchip,pmu"
> > >> can eliminate the warning.
> > >
> > > To me, specifying an arbitrary GRF in the device tree just to silence
> > > a warning that the current driver emits sounds bad. If the GRF is not
> > > needed for TSADC initialization on RK3588, then it should not be in
> > > the device tree (and it looks like the case here) - otherwise the
> > > device tree ceases to be a truthful description of the hardware.
> >
> > After thinking a bit more about it, I think you're right. If the
> > rockchip_thermal driver ever gets changed to require use of GRF on
> > the RK3588(s), only then adding that property to the DT would be
> > the right thing to do.
> >
> > > I'm not sure if we need that "DDR fail safe" logic mentioned in the
> > > TRM that Anand quoted, given that neither Rockchip downstream nor
> > > current mainline driver implement it, and furthermore none of the
> > > other SoC revisions supported by the driver mention it. If we do in
> > > fact need it, we should probably first test it along with respective
> > > driver code before committing to an upstream DT and thus making it
> > > part of the ABI.
> > >
> > > IMO this is more of a driver issue than a device tree issue: maybe a
> > > small patch to demote this warning to an info message would be better?
> > > It's harmless anyway.
> >
> > After having second thoughts, I'll see to improve the rockchip_thermal
> > driver to emit that warning only when having "rockchip,grf" specified
> > is actually needed for the particular Rockchip SoC. That's how it
> > should
> > behave, yelling about the wrong hardware description only when that's
> > actually the case.
>
> We need to handle the GRF TSADC for rk3588 tmu driver.
>
> For rk3568 we control the GRF TSADC which seems to be missing in rk3588
Please note that for RK3568 the GRF registers are used to actually
enable the thermal sensing (as can be seen by the link you posted),
which doesn't seem to be necessary on RK3588 as it works just fine
without it. Furthermore, RK3568 has a nearly identical GRF register in
PMU_GRF_SOC_CON3 (see RK3568 TRM part1 page 206) to the one you
proposed referencing in the DT for RK3588, but that register is not
used in the TSADC driver either. It only uses the TSADC_CON register
within SYS_GRF (offset 0x600) on RK3568.
The TRM for RK3588 doesn't document any TSADC_CON register, even
though there is a #define in the current upstream driver that mentions
an offset of 0x100 to that register within the GRF. I'm not sure it
means 0x100 within PMU1_GRF as you mentioned - do you have any
pointers to support that?
Best regards,
Alexey
Hi Alexey,
On Wed, 8 May 2024 at 18:52, Alexey Charkov <[email protected]> wrote:
>
> On Wed, May 8, 2024 at 4:51 PM Anand Moon <[email protected]> wrote:
> >
> > Hi Dragan, Alexey,
> >
> > On Wed, 8 May 2024 at 18:08, Dragan Simic <[email protected]> wrote:
> > >
> > > Hello Alexey,
> > >
> > > On 2024-05-08 14:30, Alexey Charkov wrote:
> > > > On Wed, May 8, 2024 at 3:46 PM Dragan Simic <dsimic@manjaroorg> wrote:
> > > >> On 2024-05-08 13:40, Anand Moon wrote:
> > > >> > On Mon, 6 May 2024 at 18:24, Alexey Charkov <[email protected]> wrote:
> > > >> >> On Mon, May 6, 2024 at 4:29 PM Diederik de Haas
> > > >> >> <[email protected]> wrote:
> > > >> >> > On Monday, 6 May 2024 11:36:33 CEST Alexey Charkov wrote:
> > > >> >> > > This enables the on-chip thermal monitoring sensor (TSADC) on all
> > > >> >> > > RK3588(s) boards that don't have it enabled yet. It provides temperature
> > > >> >> > > monitoring for the SoC and emergency thermal shutdowns, and is thus
> > > >> >> > > important to have in place before CPU DVFS is enabled, as high CPU
> > > >> >> > > operating performance points can overheat the chip quickly in the
> > > >> >> > > absence of thermal management.
> > > >> >> > >
> > > >> >> > > Signed-off-by: Alexey Charkov <[email protected]>
> > > >> >> > > ---
> > > >> >> > > arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts | 4 ++++
> > > >> >> > > 8 files changed, 32 insertions(+)
> > > >> >> > >
> > > >> >> > > diff --git a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
> > > >> >> > > b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts index
> > > >> >> > > b8e15b76a8a6..21e96c212dd8 100644
> > > >> >> > > --- a/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
> > > >> >> > > +++ b/arch/arm64/boot/dts/rockchip/rk3588-rock-5b.dts
> > > >> >> > > @@ -742,6 +742,10 @@ regulator-state-mem {
> > > >> >> > > };
> > > >> >> > > };
> > > >> >> > >
> > > >> >> > > +&tsadc {
> > > >> >> > > + status = "okay";
> > > >> >> > > +};
> > > >> >> > > +
> > > >> >> > > &uart2 {
> > > >> >> > > pinctrl-0 = <&uart2m0_xfer>;
> > > >> >> > > status = "okay";
> > > >> >> >
> > > >> >> > I built a kernel with v3 of your patch set and someone tested it on a ROCK 5B
> > > >> >> > 'for me' and it had the following line in dmesg:
> > > >> >> >
> > > >> >> > rockchip-thermal fec00000.tsadc: Missing rockchip,grf property
> > > >> >> >
> > > >> >> > I'm guessing that turned up due to enabling tsadc, but (also) in v4 I didn't
> > > >> >> > see a change wrt "rockchip,grf".
> > > >> >> > Should that be done? (asking; I don't know)
> > > >> >>
> > > >> >> I'm getting the same. Neither the mainline TSADC driver [1], nor the
> > > >> >> downstream one [2] seems to use the grf pointer on RK3588 at all. It
> > > >> >> still works in spite of that warning, although I can't see how (or if)
> > > >> >> it configures the reset mechanism without those GRF registers.
> > > >> >>
> > > >> >> Best regards,
> > > >> >> Alexey
> > > >> >>
> > > >> >> [1]
> > > >> >> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/thermal/rockchip_thermal.c#n818
> > > >> >> [2]
> > > >> >> https://github.com/radxa/kernel/blob/stable-5.10-rock5/drivers/thermal/rockchip_thermal.c#L961
> > > >> >>
> > > >> >
> > > >> > If the following changes fix the warning.
> > > >> >
> > > >> > Checking the Rockchip RK3588 TRM V1.0-Part1-20220309.pdf
> > > >> > PMU1GRF_SOC_CON3 which has tsadc_shut_reset_trigger_en bit
> > > >> > to control the Enable TSADC shut reset trigger for DDR fail safe.
> > > >> >
> > > >> > diff --git a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> > > >> > b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> > > >> > index 85c25d5efdad..5490a44e093e 100644
> > > >> > --- a/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> > > >> > +++ b/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
> > > >> > @@ -2662,6 +2662,7 @@ tsadc: tsadc@fec00000 {
> > > >> > rockchip,hw-tshut-temp = <120000>;
> > > >> > rockchip,hw-tshut-mode = <0>; /* tshut mode 0:CRU
> > > >> > 1:GPIO */
> > > >> > rockchip,hw-tshut-polarity = <0>; /* tshut polarity
> > > >> > 0:LOW 1:HIGH */
> > > >> > + rockchip,pmu = <&pmu1grf>;
> > > >> > pinctrl-0 = <&tsadc_gpio_func>;
> > > >> > pinctrl-1 = <&tsadc_shut>;
> > > >> > pinctrl-names = "gpio", "otpout";
> > > >>
> > > >> Basically, the rockchip_thermal driver doesn't use GRF at all on
> > > >> the RK3588(s), so virtually any value specified as "rockchip,pmu"
> > > >> can eliminate the warning.
> > > >
> > > > To me, specifying an arbitrary GRF in the device tree just to silence
> > > > a warning that the current driver emits sounds bad. If the GRF is not
> > > > needed for TSADC initialization on RK3588, then it should not be in
> > > > the device tree (and it looks like the case here) - otherwise the
> > > > device tree ceases to be a truthful description of the hardware.
> > >
> > > After thinking a bit more about it, I think you're right. If the
> > > rockchip_thermal driver ever gets changed to require use of GRF on
> > > the RK3588(s), only then adding that property to the DT would be
> > > the right thing to do.
> > >
> > > > I'm not sure if we need that "DDR fail safe" logic mentioned in the
> > > > TRM that Anand quoted, given that neither Rockchip downstream nor
> > > > current mainline driver implement it, and furthermore none of the
> > > > other SoC revisions supported by the driver mention it. If we do in
> > > > fact need it, we should probably first test it along with respective
> > > > driver code before committing to an upstream DT and thus making it
> > > > part of the ABI.
> > > >
> > > > IMO this is more of a driver issue than a device tree issue: maybe a
> > > > small patch to demote this warning to an info message would be better?
> > > > It's harmless anyway.
> > >
> > > After having second thoughts, I'll see to improve the rockchip_thermal
> > > driver to emit that warning only when having "rockchip,grf" specified
> > > is actually needed for the particular Rockchip SoC. That's how it
> > > should
> > > behave, yelling about the wrong hardware description only when that's
> > > actually the case.
> >
> > We need to handle the GRF TSADC for rk3588 tmu driver.
> >
> > For rk3568 we control the GRF TSADC which seems to be missing in rk3588
>
> Please note that for RK3568 the GRF registers are used to actually
> enable the thermal sensing (as can be seen by the link you posted),
> which doesn't seem to be necessary on RK3588 as it works just fine
> without it. Furthermore, RK3568 has a nearly identical GRF register in
> PMU_GRF_SOC_CON3 (see RK3568 TRM part1 page 206) to the one you
> proposed referencing in the DT for RK3588, but that register is not
> used in the TSADC driver either. It only uses the TSADC_CON register
> within SYS_GRF (offset 0x600) on RK3568.
>
I feel there is no PMUGRF configuration for TS-ADC on RK3588 SoC as per the TRM.
So we can ignore the warning.
> The TRM for RK3588 doesn't document any TSADC_CON register, even
> though there is a #define in the current upstream driver that mentions
> an offset of 0x100 to that register within the GRF. I'm not sure it
> means 0x100 within PMU1_GRF as you mentioned - do you have any
> pointers to support that?
>
Yes, you can drop them as there is no RK3588_GRF0_TSADC_CON register
to support such settings.
Thanks
-Anand
On Mon, May 6, 2024 at 1:37 PM Alexey Charkov <[email protected]> wrote:
>
> This enables thermal monitoring and CPU DVFS on RK3588(s), as well as
> active cooling on Radxa Rock 5B via the provided PWM fan.
>
> Some RK3588 boards use separate regulators to supply CPUs and their
> respective memory interfaces, so this is handled by coupling those
> regulators in affected boards' device trees to ensure that their
> voltage is adjusted in step.
>
> This also enables the built-in thermal sensor (TSADC) for all boards
> that don't currently have it enabled, using the default CRU based
> emergency thermal reset. This default configuration only uses on-SoC
> devices and doesn't rely on any external wiring, thus it should work
> for all devices (tested only on Rock 5B though).
>
> The boards that have TSADC_SHUT signal wired to the PMIC reset line
> can choose to override the default reset logic in favour of GPIO
> driven (PMIC assisted) reset, but in my testing it didn't work on
> Radxa Rock 5B - maybe I'm reading the schematic wrong and it doesn't
> support PMIC assisted reset after all.
>
> Fan control on Rock 5B has been split into two intervals: let it spin
> at the minimum cooling state between 55C and 65C, and then accelerate
> if the system crosses the 65C mark - thanks to Dragan for suggesting.
> This lets some cooling setups with beefier heatsinks and/or larger
> fan fins to stay in the quietest non-zero fan state while still
> gaining potential benefits from the airflow it generates, and
> possibly avoiding noisy speeds altogether for some workloads.
>
> OPPs help actually scale CPU frequencies up and down for both cooling
> and performance - tested on Rock 5B under varied loads. I've dropped
> those OPPs that cause frequency reductions without accompanying decrease
> in CPU voltage, as they don't seem to be adding much benefit in day to
> day use, while the kernel log gets a number of "OPP is inefficient" lines.
>
> Note that this submission doesn't touch the SRAM read margin updates or
> the OPP calibration based on silicon quality which the downstream driver
> does and which were mentioned in [1]. It works as it is (also confirmed by
> Sebastian in his follow-up message [2]), and it is stable in my testing on
> Rock 5B, so it sounds better to merge a simple version first and then
> extend when/if required.
>
> [1] https://lore.kernel.org/linux-rockchip/CABjd4YzTL=5S7cS8ACNAYVa730WA3iGd5L_wP1Vn9=f83RCORA@mail.gmail.com/
> [2] https://lore.kernel.org/linux-rockchip/pkyne4g2cln27dcdu3jm7bqdqpmd2kwkbguiolmozntjuiajrb@gvq4nupzna4o/
>
> Signed-off-by: Alexey Charkov <[email protected]>
> ---
Hi Heiko,
Do you think this can be merged for 6.11? Looks like there hasn't been
any new feedback in a while, and it would be good to have frequency
scaling in place for RK3588.
Please let me know if you have any reservations or if we need any
broader discussion.
Thanks a lot,
Alexey
Hello Alexey and Heiko,
On 2024-05-28 11:49, Alexey Charkov wrote:
> On Mon, May 6, 2024 at 1:37 PM Alexey Charkov <[email protected]>
> wrote:
>>
>> This enables thermal monitoring and CPU DVFS on RK3588(s), as well as
>> active cooling on Radxa Rock 5B via the provided PWM fan.
>>
>> Some RK3588 boards use separate regulators to supply CPUs and their
>> respective memory interfaces, so this is handled by coupling those
>> regulators in affected boards' device trees to ensure that their
>> voltage is adjusted in step.
>>
>> This also enables the built-in thermal sensor (TSADC) for all boards
>> that don't currently have it enabled, using the default CRU based
>> emergency thermal reset. This default configuration only uses on-SoC
>> devices and doesn't rely on any external wiring, thus it should work
>> for all devices (tested only on Rock 5B though).
>>
>> The boards that have TSADC_SHUT signal wired to the PMIC reset line
>> can choose to override the default reset logic in favour of GPIO
>> driven (PMIC assisted) reset, but in my testing it didn't work on
>> Radxa Rock 5B - maybe I'm reading the schematic wrong and it doesn't
>> support PMIC assisted reset after all.
>>
>> Fan control on Rock 5B has been split into two intervals: let it spin
>> at the minimum cooling state between 55C and 65C, and then accelerate
>> if the system crosses the 65C mark - thanks to Dragan for suggesting.
>> This lets some cooling setups with beefier heatsinks and/or larger
>> fan fins to stay in the quietest non-zero fan state while still
>> gaining potential benefits from the airflow it generates, and
>> possibly avoiding noisy speeds altogether for some workloads.
>>
>> OPPs help actually scale CPU frequencies up and down for both cooling
>> and performance - tested on Rock 5B under varied loads. I've dropped
>> those OPPs that cause frequency reductions without accompanying
>> decrease
>> in CPU voltage, as they don't seem to be adding much benefit in day to
>> day use, while the kernel log gets a number of "OPP is inefficient"
>> lines.
>>
>> Note that this submission doesn't touch the SRAM read margin updates
>> or
>> the OPP calibration based on silicon quality which the downstream
>> driver
>> does and which were mentioned in [1]. It works as it is (also
>> confirmed by
>> Sebastian in his follow-up message [2]), and it is stable in my
>> testing on
>> Rock 5B, so it sounds better to merge a simple version first and then
>> extend when/if required.
>>
>> [1]
>> https://lore.kernel.org/linux-rockchip/CABjd4YzTL=5S7cS8ACNAYVa730WA3iGd5L_wP1Vn9=f83RCORA@mail.gmail.com/
>> [2]
>> https://lore.kernel.org/linux-rockchip/pkyne4g2cln27dcdu3jm7bqdqpmd2kwkbguiolmozntjuiajrb@gvq4nupzna4o/
>>
>> Signed-off-by: Alexey Charkov <[email protected]>
>> ---
>
> Hi Heiko,
>
> Do you think this can be merged for 6.11? Looks like there hasn't been
> any new feedback in a while, and it would be good to have frequency
> scaling in place for RK3588.
>
> Please let me know if you have any reservations or if we need any
> broader discussion.
As I promised earlier, I was going to test this patch series in detail.
Alas, I haven't managed to do that yet, :/ due to many reasons, but
I still remain firmly committed to doing that.
Is -rc4 the cutoff for 6.11? If so, there's still time and I'll do my
best to test and review these patches as soon as possible.
Hi Dragan,
Am Dienstag, 28. Mai 2024, 16:05:04 CEST schrieb Dragan Simic:
> On 2024-05-28 11:49, Alexey Charkov wrote:
> > On Mon, May 6, 2024 at 1:37 PM Alexey Charkov <[email protected]>
> > wrote:
> >>
> >> This enables thermal monitoring and CPU DVFS on RK3588(s), as well as
> >> active cooling on Radxa Rock 5B via the provided PWM fan.
> >>
> >> Some RK3588 boards use separate regulators to supply CPUs and their
> >> respective memory interfaces, so this is handled by coupling those
> >> regulators in affected boards' device trees to ensure that their
> >> voltage is adjusted in step.
> >>
> >> This also enables the built-in thermal sensor (TSADC) for all boards
> >> that don't currently have it enabled, using the default CRU based
> >> emergency thermal reset. This default configuration only uses on-SoC
> >> devices and doesn't rely on any external wiring, thus it should work
> >> for all devices (tested only on Rock 5B though).
> >>
> >> The boards that have TSADC_SHUT signal wired to the PMIC reset line
> >> can choose to override the default reset logic in favour of GPIO
> >> driven (PMIC assisted) reset, but in my testing it didn't work on
> >> Radxa Rock 5B - maybe I'm reading the schematic wrong and it doesn't
> >> support PMIC assisted reset after all.
> >>
> >> Fan control on Rock 5B has been split into two intervals: let it spin
> >> at the minimum cooling state between 55C and 65C, and then accelerate
> >> if the system crosses the 65C mark - thanks to Dragan for suggesting.
> >> This lets some cooling setups with beefier heatsinks and/or larger
> >> fan fins to stay in the quietest non-zero fan state while still
> >> gaining potential benefits from the airflow it generates, and
> >> possibly avoiding noisy speeds altogether for some workloads.
> >>
> >> OPPs help actually scale CPU frequencies up and down for both cooling
> >> and performance - tested on Rock 5B under varied loads. I've dropped
> >> those OPPs that cause frequency reductions without accompanying
> >> decrease
> >> in CPU voltage, as they don't seem to be adding much benefit in day to
> >> day use, while the kernel log gets a number of "OPP is inefficient"
> >> lines.
> >>
> >> Note that this submission doesn't touch the SRAM read margin updates
> >> or
> >> the OPP calibration based on silicon quality which the downstream
> >> driver
> >> does and which were mentioned in [1]. It works as it is (also
> >> confirmed by
> >> Sebastian in his follow-up message [2]), and it is stable in my
> >> testing on
> >> Rock 5B, so it sounds better to merge a simple version first and then
> >> extend when/if required.
> >>
> >> [1]
> >> https://lore.kernel.org/linux-rockchip/CABjd4YzTL=5S7cS8ACNAYVa730WA3iGd5L_wP1Vn9=f83RCORA@mail.gmail.com/
> >> [2]
> >> https://lore.kernel.org/linux-rockchip/pkyne4g2cln27dcdu3jm7bqdqpmd2kwkbguiolmozntjuiajrb@gvq4nupzna4o/
> >>
> >> Signed-off-by: Alexey Charkov <[email protected]>
> >> ---
> >
> > Hi Heiko,
> >
> > Do you think this can be merged for 6.11? Looks like there hasn't been
> > any new feedback in a while, and it would be good to have frequency
> > scaling in place for RK3588.
> >
> > Please let me know if you have any reservations or if we need any
> > broader discussion.
not really reservations, more like there was still discussion going on
around the OPPs. Meanwhile we had more discussions regarding the whole
speed binning Rockchip seems to do for rk3588 variants.
And waiting for the testing Dragan wanted to do ;-) .
So this should definitly make it into 6.11 though, as there is still
a lot of time.
> As I promised earlier, I was going to test this patch series in detail.
> Alas, I haven't managed to do that yet, :/ due to many reasons, but
> I still remain firmly committed to doing that.
>
> Is -rc4 the cutoff for 6.11? If so, there's still time and I'll do my
> best to test and review these patches as soon as possible.
As early as possible, the hard cutoff would be -rc6 though.
I guess I'll just start picking the easy patches from the series.
Heiko
On 2024-05-28 16:34, Heiko Stuebner wrote:
> Am Dienstag, 28. Mai 2024, 16:05:04 CEST schrieb Dragan Simic:
>> On 2024-05-28 11:49, Alexey Charkov wrote:
>> > On Mon, May 6, 2024 at 1:37 PM Alexey Charkov <[email protected]>
>> > wrote:
>> >>
>> >> This enables thermal monitoring and CPU DVFS on RK3588(s), as well as
>> >> active cooling on Radxa Rock 5B via the provided PWM fan.
>> >>
>> >> Some RK3588 boards use separate regulators to supply CPUs and their
>> >> respective memory interfaces, so this is handled by coupling those
>> >> regulators in affected boards' device trees to ensure that their
>> >> voltage is adjusted in step.
>> >>
>> >> This also enables the built-in thermal sensor (TSADC) for all boards
>> >> that don't currently have it enabled, using the default CRU based
>> >> emergency thermal reset. This default configuration only uses on-SoC
>> >> devices and doesn't rely on any external wiring, thus it should work
>> >> for all devices (tested only on Rock 5B though).
>> >>
>> >> The boards that have TSADC_SHUT signal wired to the PMIC reset line
>> >> can choose to override the default reset logic in favour of GPIO
>> >> driven (PMIC assisted) reset, but in my testing it didn't work on
>> >> Radxa Rock 5B - maybe I'm reading the schematic wrong and it doesn't
>> >> support PMIC assisted reset after all.
>> >>
>> >> Fan control on Rock 5B has been split into two intervals: let it spin
>> >> at the minimum cooling state between 55C and 65C, and then accelerate
>> >> if the system crosses the 65C mark - thanks to Dragan for suggesting.
>> >> This lets some cooling setups with beefier heatsinks and/or larger
>> >> fan fins to stay in the quietest non-zero fan state while still
>> >> gaining potential benefits from the airflow it generates, and
>> >> possibly avoiding noisy speeds altogether for some workloads.
>> >>
>> >> OPPs help actually scale CPU frequencies up and down for both cooling
>> >> and performance - tested on Rock 5B under varied loads. I've dropped
>> >> those OPPs that cause frequency reductions without accompanying
>> >> decrease
>> >> in CPU voltage, as they don't seem to be adding much benefit in day to
>> >> day use, while the kernel log gets a number of "OPP is inefficient"
>> >> lines.
>> >>
>> >> Note that this submission doesn't touch the SRAM read margin updates
>> >> or
>> >> the OPP calibration based on silicon quality which the downstream
>> >> driver
>> >> does and which were mentioned in [1]. It works as it is (also
>> >> confirmed by
>> >> Sebastian in his follow-up message [2]), and it is stable in my
>> >> testing on
>> >> Rock 5B, so it sounds better to merge a simple version first and then
>> >> extend when/if required.
>> >>
>> >> [1]
>> >> https://lore.kernel.org/linux-rockchip/CABjd4YzTL=5S7cS8ACNAYVa730WA3iGd5L_wP1Vn9=f83RCORA@mail.gmail.com/
>> >> [2]
>> >> https://lore.kernel.org/linux-rockchip/pkyne4g2cln27dcdu3jm7bqdqpmd2kwkbguiolmozntjuiajrb@gvq4nupzna4o/
>> >>
>> >> Signed-off-by: Alexey Charkov <[email protected]>
>> >> ---
>> >
>> > Hi Heiko,
>> >
>> > Do you think this can be merged for 6.11? Looks like there hasn't been
>> > any new feedback in a while, and it would be good to have frequency
>> > scaling in place for RK3588.
>> >
>> > Please let me know if you have any reservations or if we need any
>> > broader discussion.
>
> not really reservations, more like there was still discussion going on
> around the OPPs. Meanwhile we had more discussions regarding the whole
> speed binning Rockchip seems to do for rk3588 variants.
>
> And waiting for the testing Dragan wanted to do ;-) .
I'm sorry for the delays.
> So this should definitly make it into 6.11 though, as there is still
> a lot of time.
>
>> As I promised earlier, I was going to test this patch series in
>> detail.
>> Alas, I haven't managed to do that yet, :/ due to many reasons, but
>> I still remain firmly committed to doing that.
>>
>> Is -rc4 the cutoff for 6.11? If so, there's still time and I'll do my
>> best to test and review these patches as soon as possible.
>
> As early as possible, the hard cutoff would be -rc6 though.
> I guess I'll just start picking the easy patches from the series.
I'll do my best to have the patches tested and reviewed in detail ASAP.
As a suggestion, perhaps it would be better to take the series as a
whole,
so we don't bring partial merging into the mix.
Am Dienstag, 28. Mai 2024, 17:01:48 CEST schrieb Dragan Simic:
> On 2024-05-28 16:34, Heiko Stuebner wrote:
> > Am Dienstag, 28. Mai 2024, 16:05:04 CEST schrieb Dragan Simic:
> >> On 2024-05-28 11:49, Alexey Charkov wrote:
> >> > On Mon, May 6, 2024 at 1:37 PM Alexey Charkov <[email protected]>
> >> > wrote:
> >> >>
> >> >> This enables thermal monitoring and CPU DVFS on RK3588(s), as well as
> >> >> active cooling on Radxa Rock 5B via the provided PWM fan.
> >> >>
> >> >> Some RK3588 boards use separate regulators to supply CPUs and their
> >> >> respective memory interfaces, so this is handled by coupling those
> >> >> regulators in affected boards' device trees to ensure that their
> >> >> voltage is adjusted in step.
> >> >>
> >> >> This also enables the built-in thermal sensor (TSADC) for all boards
> >> >> that don't currently have it enabled, using the default CRU based
> >> >> emergency thermal reset. This default configuration only uses on-SoC
> >> >> devices and doesn't rely on any external wiring, thus it should work
> >> >> for all devices (tested only on Rock 5B though).
> >> >>
> >> >> The boards that have TSADC_SHUT signal wired to the PMIC reset line
> >> >> can choose to override the default reset logic in favour of GPIO
> >> >> driven (PMIC assisted) reset, but in my testing it didn't work on
> >> >> Radxa Rock 5B - maybe I'm reading the schematic wrong and it doesn't
> >> >> support PMIC assisted reset after all.
> >> >>
> >> >> Fan control on Rock 5B has been split into two intervals: let it spin
> >> >> at the minimum cooling state between 55C and 65C, and then accelerate
> >> >> if the system crosses the 65C mark - thanks to Dragan for suggesting.
> >> >> This lets some cooling setups with beefier heatsinks and/or larger
> >> >> fan fins to stay in the quietest non-zero fan state while still
> >> >> gaining potential benefits from the airflow it generates, and
> >> >> possibly avoiding noisy speeds altogether for some workloads.
> >> >>
> >> >> OPPs help actually scale CPU frequencies up and down for both cooling
> >> >> and performance - tested on Rock 5B under varied loads. I've dropped
> >> >> those OPPs that cause frequency reductions without accompanying
> >> >> decrease
> >> >> in CPU voltage, as they don't seem to be adding much benefit in day to
> >> >> day use, while the kernel log gets a number of "OPP is inefficient"
> >> >> lines.
> >> >>
> >> >> Note that this submission doesn't touch the SRAM read margin updates
> >> >> or
> >> >> the OPP calibration based on silicon quality which the downstream
> >> >> driver
> >> >> does and which were mentioned in [1]. It works as it is (also
> >> >> confirmed by
> >> >> Sebastian in his follow-up message [2]), and it is stable in my
> >> >> testing on
> >> >> Rock 5B, so it sounds better to merge a simple version first and then
> >> >> extend when/if required.
> >> >>
> >> >> [1]
> >> >> https://lore.kernel.org/linux-rockchip/CABjd4YzTL=5S7cS8ACNAYVa730WA3iGd5L_wP1Vn9=f83RCORA@mail.gmail.com/
> >> >> [2]
> >> >> https://lore.kernel.org/linux-rockchip/pkyne4g2cln27dcdu3jm7bqdqpmd2kwkbguiolmozntjuiajrb@gvq4nupzna4o/
> >> >>
> >> >> Signed-off-by: Alexey Charkov <[email protected]>
> >> >> ---
> >> >
> >> > Hi Heiko,
> >> >
> >> > Do you think this can be merged for 6.11? Looks like there hasn't been
> >> > any new feedback in a while, and it would be good to have frequency
> >> > scaling in place for RK3588.
> >> >
> >> > Please let me know if you have any reservations or if we need any
> >> > broader discussion.
> >
> > not really reservations, more like there was still discussion going on
> > around the OPPs. Meanwhile we had more discussions regarding the whole
> > speed binning Rockchip seems to do for rk3588 variants.
> >
> > And waiting for the testing Dragan wanted to do ;-) .
>
> I'm sorry for the delays.
Was definitly _not_ meant as blame ;-) .
The series has just too many discussions threads to unravel on half
an afternoon.
> > So this should definitly make it into 6.11 though, as there is still
> > a lot of time.
> >
> >> As I promised earlier, I was going to test this patch series in
> >> detail.
> >> Alas, I haven't managed to do that yet, :/ due to many reasons, but
> >> I still remain firmly committed to doing that.
> >>
> >> Is -rc4 the cutoff for 6.11? If so, there's still time and I'll do my
> >> best to test and review these patches as soon as possible.
> >
> > As early as possible, the hard cutoff would be -rc6 though.
> > I guess I'll just start picking the easy patches from the series.
>
> I'll do my best to have the patches tested and reviewed in detail ASAP.
> As a suggestion, perhaps it would be better to take the series as a
> whole,
> so we don't bring partial merging into the mix.
Patches need to work individually anyway (in correct order of course),
so like starting at the top with the general thermal stuff should not
create issues ;-)
Heiko
On 2024-05-28 17:16, Heiko Stuebner wrote:
> Am Dienstag, 28. Mai 2024, 17:01:48 CEST schrieb Dragan Simic:
>> On 2024-05-28 16:34, Heiko Stuebner wrote:
>> > Am Dienstag, 28. Mai 2024, 16:05:04 CEST schrieb Dragan Simic:
>> >> On 2024-05-28 11:49, Alexey Charkov wrote:
>> >> > On Mon, May 6, 2024 at 1:37 PM Alexey Charkov <[email protected]>
>> >> > wrote:
>> >> >>
>> >> >> This enables thermal monitoring and CPU DVFS on RK3588(s), as well as
>> >> >> active cooling on Radxa Rock 5B via the provided PWM fan.
>> >> >>
>> >> >> Some RK3588 boards use separate regulators to supply CPUs and their
>> >> >> respective memory interfaces, so this is handled by coupling those
>> >> >> regulators in affected boards' device trees to ensure that their
>> >> >> voltage is adjusted in step.
>> >> >>
>> >> >> This also enables the built-in thermal sensor (TSADC) for all boards
>> >> >> that don't currently have it enabled, using the default CRU based
>> >> >> emergency thermal reset. This default configuration only uses on-SoC
>> >> >> devices and doesn't rely on any external wiring, thus it should work
>> >> >> for all devices (tested only on Rock 5B though).
>> >> >>
>> >> >> The boards that have TSADC_SHUT signal wired to the PMIC reset line
>> >> >> can choose to override the default reset logic in favour of GPIO
>> >> >> driven (PMIC assisted) reset, but in my testing it didn't work on
>> >> >> Radxa Rock 5B - maybe I'm reading the schematic wrong and it doesn't
>> >> >> support PMIC assisted reset after all.
>> >> >>
>> >> >> Fan control on Rock 5B has been split into two intervals: let it spin
>> >> >> at the minimum cooling state between 55C and 65C, and then accelerate
>> >> >> if the system crosses the 65C mark - thanks to Dragan for suggesting.
>> >> >> This lets some cooling setups with beefier heatsinks and/or larger
>> >> >> fan fins to stay in the quietest non-zero fan state while still
>> >> >> gaining potential benefits from the airflow it generates, and
>> >> >> possibly avoiding noisy speeds altogether for some workloads.
>> >> >>
>> >> >> OPPs help actually scale CPU frequencies up and down for both cooling
>> >> >> and performance - tested on Rock 5B under varied loads. I've dropped
>> >> >> those OPPs that cause frequency reductions without accompanying
>> >> >> decrease
>> >> >> in CPU voltage, as they don't seem to be adding much benefit in day to
>> >> >> day use, while the kernel log gets a number of "OPP is inefficient"
>> >> >> lines.
>> >> >>
>> >> >> Note that this submission doesn't touch the SRAM read margin updates
>> >> >> or
>> >> >> the OPP calibration based on silicon quality which the downstream
>> >> >> driver
>> >> >> does and which were mentioned in [1]. It works as it is (also
>> >> >> confirmed by
>> >> >> Sebastian in his follow-up message [2]), and it is stable in my
>> >> >> testing on
>> >> >> Rock 5B, so it sounds better to merge a simple version first and then
>> >> >> extend when/if required.
>> >> >>
>> >> >> [1]
>> >> >> https://lore.kernel.org/linux-rockchip/CABjd4YzTL=5S7cS8ACNAYVa730WA3iGd5L_wP1Vn9=f83RCORA@mail.gmail.com/
>> >> >> [2]
>> >> >> https://lore.kernel.org/linux-rockchip/pkyne4g2cln27dcdu3jm7bqdqpmd2kwkbguiolmozntjuiajrb@gvq4nupzna4o/
>> >> >>
>> >> >> Signed-off-by: Alexey Charkov <[email protected]>
>> >> >> ---
>> >> >
>> >> > Hi Heiko,
>> >> >
>> >> > Do you think this can be merged for 6.11? Looks like there hasn't been
>> >> > any new feedback in a while, and it would be good to have frequency
>> >> > scaling in place for RK3588.
>> >> >
>> >> > Please let me know if you have any reservations or if we need any
>> >> > broader discussion.
>> >
>> > not really reservations, more like there was still discussion going on
>> > around the OPPs. Meanwhile we had more discussions regarding the whole
>> > speed binning Rockchip seems to do for rk3588 variants.
>> >
>> > And waiting for the testing Dragan wanted to do ;-) .
>>
>> I'm sorry for the delays.
>
> Was definitly _not_ meant as blame ;-) .
Thanks, but I do blame myself a bit for not doing it earlier. :)
> The series has just too many discussions threads to unravel on half
> an afternoon.
Yes, we've touched quite a few areas, for some of which I've already
started working on the associated patches. That was one of the reasons
for the delays.
>> > So this should definitly make it into 6.11 though, as there is still
>> > a lot of time.
>> >
>> >> As I promised earlier, I was going to test this patch series in
>> >> detail.
>> >> Alas, I haven't managed to do that yet, :/ due to many reasons, but
>> >> I still remain firmly committed to doing that.
>> >>
>> >> Is -rc4 the cutoff for 6.11? If so, there's still time and I'll do my
>> >> best to test and review these patches as soon as possible.
>> >
>> > As early as possible, the hard cutoff would be -rc6 though.
>> > I guess I'll just start picking the easy patches from the series.
>>
>> I'll do my best to have the patches tested and reviewed in detail
>> ASAP.
>> As a suggestion, perhaps it would be better to take the series as a
>> whole,
>> so we don't bring partial merging into the mix.
>
> Patches need to work individually anyway (in correct order of course),
> so like starting at the top with the general thermal stuff should not
> create issues ;-)
Of course, but I might actually have some comments and suggestions
for some of the patches, and addressing those suggestions, if we end up
agreeing on them, would be a bit messy if some of the patches in the
series become merged first.
[sorry if you get this twice: I messed up and sent the previous one in
HTML instead of plain text - resending now]
On Tue, May 28, 2024 at 7:16 PM Heiko Stuebner <[email protected]> wrote:
>
> Am Dienstag, 28. Mai 2024, 17:01:48 CEST schrieb Dragan Simic:
> > On 2024-05-28 16:34, Heiko Stuebner wrote:
> > > Am Dienstag, 28. Mai 2024, 16:05:04 CEST schrieb Dragan Simic:
> > >> On 2024-05-28 11:49, Alexey Charkov wrote:
> > >> > On Mon, May 6, 2024 at 1:37 PM Alexey Charkov <[email protected]>
> > >> > wrote:
> > >> >>
> > >> >> This enables thermal monitoring and CPU DVFS on RK3588(s), as well as
> > >> >> active cooling on Radxa Rock 5B via the provided PWM fan.
> > >> >>
> > >> >> Some RK3588 boards use separate regulators to supply CPUs and their
> > >> >> respective memory interfaces, so this is handled by coupling those
> > >> >> regulators in affected boards' device trees to ensure that their
> > >> >> voltage is adjusted in step.
> > >> >>
> > >> >> This also enables the built-in thermal sensor (TSADC) for all boards
> > >> >> that don't currently have it enabled, using the default CRU based
> > >> >> emergency thermal reset. This default configuration only uses on-SoC
> > >> >> devices and doesn't rely on any external wiring, thus it should work
> > >> >> for all devices (tested only on Rock 5B though).
> > >> >>
> > >> >> The boards that have TSADC_SHUT signal wired to the PMIC reset line
> > >> >> can choose to override the default reset logic in favour of GPIO
> > >> >> driven (PMIC assisted) reset, but in my testing it didn't work on
> > >> >> Radxa Rock 5B - maybe I'm reading the schematic wrong and it doesn't
> > >> >> support PMIC assisted reset after all.
> > >> >>
> > >> >> Fan control on Rock 5B has been split into two intervals: let it spin
> > >> >> at the minimum cooling state between 55C and 65C, and then accelerate
> > >> >> if the system crosses the 65C mark - thanks to Dragan for suggesting.
> > >> >> This lets some cooling setups with beefier heatsinks and/or larger
> > >> >> fan fins to stay in the quietest non-zero fan state while still
> > >> >> gaining potential benefits from the airflow it generates, and
> > >> >> possibly avoiding noisy speeds altogether for some workloads.
> > >> >>
> > >> >> OPPs help actually scale CPU frequencies up and down for both cooling
> > >> >> and performance - tested on Rock 5B under varied loads. I've dropped
> > >> >> those OPPs that cause frequency reductions without accompanying
> > >> >> decrease
> > >> >> in CPU voltage, as they don't seem to be adding much benefit in day to
> > >> >> day use, while the kernel log gets a number of "OPP is inefficient"
> > >> >> lines.
> > >> >>
> > >> >> Note that this submission doesn't touch the SRAM read margin updates
> > >> >> or
> > >> >> the OPP calibration based on silicon quality which the downstream
> > >> >> driver
> > >> >> does and which were mentioned in [1]. It works as it is (also
> > >> >> confirmed by
> > >> >> Sebastian in his follow-up message [2]), and it is stable in my
> > >> >> testing on
> > >> >> Rock 5B, so it sounds better to merge a simple version first and then
> > >> >> extend when/if required.
> > >> >>
> > >> >> [1]
> > >> >> https://lore.kernel.org/linux-rockchip/CABjd4YzTL=5S7cS8ACNAYVa730WA3iGd5L_wP1Vn9=f83RCORA@mail.gmail.com/
> > >> >> [2]
> > >> >> https://lore.kernel.org/linux-rockchip/pkyne4g2cln27dcdu3jm7bqdqpmd2kwkbguiolmozntjuiajrb@gvq4nupzna4o/
> > >> >>
> > >> >> Signed-off-by: Alexey Charkov <[email protected]>
> > >> >> ---
> > >> >
> > >> > Hi Heiko,
> > >> >
> > >> > Do you think this can be merged for 6.11? Looks like there hasn't been
> > >> > any new feedback in a while, and it would be good to have frequency
> > >> > scaling in place for RK3588.
> > >> >
> > >> > Please let me know if you have any reservations or if we need any
> > >> > broader discussion.
> > >
> > > not really reservations, more like there was still discussion going on
> > > around the OPPs. Meanwhile we had more discussions regarding the whole
> > > speed binning Rockchip seems to do for rk3588 variants.
> > >
> > > And waiting for the testing Dragan wanted to do ;-) .
> >
> > I'm sorry for the delays.
>
> Was definitly _not_ meant as blame ;-) .
>
> The series has just too many discussions threads to unravel on half
> an afternoon.
FWIW, I think the latest exchange we had with Quentin regarding the
OPPs concluded in “false alarm”, given that this version of the series
only introduces a subset of them which should apply to all RK3588(s)
Performance binning here is more geared towards how low the voltages
can go for a given frequency, and right now we’re only introducing the
highest-voltage setting for each OPP. Thus the binning and/or
intermediate frequencies should be possible to introduce at a later
stage in a backwards compatible way (if deemed relevant).
> > > So this should definitly make it into 6.11 though, as there is still
> > > a lot of time.
> > >
> > >> As I promised earlier, I was going to test this patch series in
> > >> detail.
> > >> Alas, I haven't managed to do that yet, :/ due to many reasons, but
> > >> I still remain firmly committed to doing that.
> > >>
> > >> Is -rc4 the cutoff for 6.11? If so, there's still time and I'll do my
> > >> best to test and review these patches as soon as possible.
> > >
> > > As early as possible, the hard cutoff would be -rc6 though.
> > > I guess I'll just start picking the easy patches from the series.
> >
> > I'll do my best to have the patches tested and reviewed in detail ASAP.
> > As a suggestion, perhaps it would be better to take the series as a
> > whole,
> > so we don't bring partial merging into the mix.
>
> Patches need to work individually anyway (in correct order of course),
> so like starting at the top with the general thermal stuff should not
> create issues ;-)
Indeed, those are self-contained and can be merged independently of
the OPPs. Having OPPs without thermal is more risky though, but that
doesn’t sound like what we’re after here :)
Best regards,
Alexey
Hi all,
On 5/28/24 5:42 PM, Alexey Charkov wrote:
> On Tue, 28 May 2024 at 19:16, Heiko Stuebner <[email protected]> wrote:
>
>> Am Dienstag, 28. Mai 2024, 17:01:48 CEST schrieb Dragan Simic:
>>> On 2024-05-28 16:34, Heiko Stuebner wrote:
>>>> Am Dienstag, 28. Mai 2024, 16:05:04 CEST schrieb Dragan Simic:
>>>>> On 2024-05-28 11:49, Alexey Charkov wrote:
>>>>>> On Mon, May 6, 2024 at 1:37 PM Alexey Charkov <[email protected]>
>>>>>> wrote:
>>>>>>>
>>>>>>> This enables thermal monitoring and CPU DVFS on RK3588(s), as well
>> as
>>>>>>> active cooling on Radxa Rock 5B via the provided PWM fan.
>>>>>>>
>>>>>>> Some RK3588 boards use separate regulators to supply CPUs and their
>>>>>>> respective memory interfaces, so this is handled by coupling those
>>>>>>> regulators in affected boards' device trees to ensure that their
>>>>>>> voltage is adjusted in step.
>>>>>>>
>>>>>>> This also enables the built-in thermal sensor (TSADC) for all
>> boards
>>>>>>> that don't currently have it enabled, using the default CRU based
>>>>>>> emergency thermal reset. This default configuration only uses
>> on-SoC
>>>>>>> devices and doesn't rely on any external wiring, thus it should
>> work
>>>>>>> for all devices (tested only on Rock 5B though).
>>>>>>>
>>>>>>> The boards that have TSADC_SHUT signal wired to the PMIC reset line
>>>>>>> can choose to override the default reset logic in favour of GPIO
>>>>>>> driven (PMIC assisted) reset, but in my testing it didn't work on
>>>>>>> Radxa Rock 5B - maybe I'm reading the schematic wrong and it
>> doesn't
>>>>>>> support PMIC assisted reset after all.
>>>>>>>
>>>>>>> Fan control on Rock 5B has been split into two intervals: let it
>> spin
>>>>>>> at the minimum cooling state between 55C and 65C, and then
>> accelerate
>>>>>>> if the system crosses the 65C mark - thanks to Dragan for
>> suggesting.
>>>>>>> This lets some cooling setups with beefier heatsinks and/or larger
>>>>>>> fan fins to stay in the quietest non-zero fan state while still
>>>>>>> gaining potential benefits from the airflow it generates, and
>>>>>>> possibly avoiding noisy speeds altogether for some workloads.
>>>>>>>
>>>>>>> OPPs help actually scale CPU frequencies up and down for both
>> cooling
>>>>>>> and performance - tested on Rock 5B under varied loads. I've
>> dropped
>>>>>>> those OPPs that cause frequency reductions without accompanying
>>>>>>> decrease
>>>>>>> in CPU voltage, as they don't seem to be adding much benefit in
>> day to
>>>>>>> day use, while the kernel log gets a number of "OPP is inefficient"
>>>>>>> lines.
>>>>>>>
>>>>>>> Note that this submission doesn't touch the SRAM read margin
>> updates
>>>>>>> or
>>>>>>> the OPP calibration based on silicon quality which the downstream
>>>>>>> driver
>>>>>>> does and which were mentioned in [1]. It works as it is (also
>>>>>>> confirmed by
>>>>>>> Sebastian in his follow-up message [2]), and it is stable in my
>>>>>>> testing on
>>>>>>> Rock 5B, so it sounds better to merge a simple version first and
>> then
>>>>>>> extend when/if required.
>>>>>>>
>>>>>>> [1]
>>>>>>>
>> https://lore.kernel.org/linux-rockchip/CABjd4YzTL=5S7cS8ACNAYVa730WA3iGd5L_wP1Vn9=f83RCORA@mail.gmail.com/
>>>>>>> [2]
>>>>>>>
>> https://lore.kernel.org/linux-rockchip/pkyne4g2cln27dcdu3jm7bqdqpmd2kwkbguiolmozntjuiajrb@gvq4nupzna4o/
>>>>>>>
>>>>>>> Signed-off-by: Alexey Charkov <[email protected]>
>>>>>>> ---
>>>>>>
>>>>>> Hi Heiko,
>>>>>>
>>>>>> Do you think this can be merged for 6.11? Looks like there hasn't
>> been
>>>>>> any new feedback in a while, and it would be good to have frequency
>>>>>> scaling in place for RK3588.
>>>>>>
>>>>>> Please let me know if you have any reservations or if we need any
>>>>>> broader discussion.
>>>>
>>>> not really reservations, more like there was still discussion going on
>>>> around the OPPs. Meanwhile we had more discussions regarding the whole
>>>> speed binning Rockchip seems to do for rk3588 variants.
>>>>
>>>> And waiting for the testing Dragan wanted to do ;-) .
>>>
>>> I'm sorry for the delays.
>>
>> Was definitly _not_ meant as blame ;-) .
>>
>> The series has just too many discussions threads to unravel on half
>> an afternoon.
>
>
> FWIW, I think the latest exchange we had with Quentin regarding the OPPs
> concluded in “false alarm”, given that this version of the series only
> introduces a subset of them which should apply to all RK3588(s)
>
Correct.
However... I'm wondering if we shouldn't somehow follow the same pattern
we have used for the rk3399 OPPs? We have a file for the "true" RK3399
OPPs, then the OP1 variant and the RK3399T.
We already know there are a few variants of RK3588 with different OPPs:
RK3588(S/S2?), RK3588J and RK3588M. I wouldn't be surprised if the
RK3582 (though this one has already one big cluster (or two big cores)
fewer than RK3588) has different OPPs as well?
So. We have already discussed that the OPPs in that patch are valid for
RK3588(S) but they aren't for the other variants.
In the downstream kernel, any OPP whose opp-supported-hw has a first
value masked by BIT(1) return non-0 is supported by RK3588M. In the
downstream kernel, any OPP whose opp-supported-hw has a first value
masked by BIT(2) return non-0 is supported by RK3588J.
This means that, for LITTLE clusters:
- opp-1608000000 not supported on RK3588J
- opp-1704000000 only supported on RK3588M (but already absent in this
patch series)
- opp-1800000000 only supported on RK3588(S), not RK3588J nor RK3588M
For big clusters:
- opp-1800000000 not supported on RK3588J
- opp-2016000000 not supported on RK3588J
- opp-2208000000 only supported on RK3588(S), not RK3588J nor RK3588M
- opp-2256000000 only supported on RK3588(S), not RK3588J nor RK3588M
- opp-2304000000 only supported on RK3588(S), not RK3588J nor RK3588M
- opp-2352000000 only supported on RK3588(S), not RK3588J nor RK3588M
- opp-2400000000 only supported on RK3588(S), not RK3588J nor RK3588M
This is somehow also enforced in downstream kernel by removing the OPP
nodes directly (hence, not even requiring the check of opp-supported-hw
value), c.f.:
https://git.theobroma-systems.com/tiger-linux.git/tree/arch/arm64/boot/dts/rockchip/rk3588j.dtsi
https://git.theobroma-systems.com/tiger-linux.git/tree/arch/arm64/boot/dts/rockchip/rk3588m.dtsi
You'll not that the RK3588J also has less OPPs for the GPU and NPU (but
those should also be masked by the opp-supported-hw value).
Cheers,
Quentin
On Tue, May 28, 2024 at 8:08 PM Quentin Schulz <[email protected]> wrote:
>
> Hi all,
>
> On 5/28/24 5:42 PM, Alexey Charkov wrote:
> > On Tue, 28 May 2024 at 19:16, Heiko Stuebner <[email protected]> wrote:
> >
> >> Am Dienstag, 28. Mai 2024, 17:01:48 CEST schrieb Dragan Simic:
> >>> On 2024-05-28 16:34, Heiko Stuebner wrote:
> >>>> Am Dienstag, 28. Mai 2024, 16:05:04 CEST schrieb Dragan Simic:
> >>>>> On 2024-05-28 11:49, Alexey Charkov wrote:
> >>>>>> On Mon, May 6, 2024 at 1:37 PM Alexey Charkov <[email protected]>
> >>>>>> wrote:
> >>>>>>>
> >>>>>>> This enables thermal monitoring and CPU DVFS on RK3588(s), as well
> >> as
> >>>>>>> active cooling on Radxa Rock 5B via the provided PWM fan.
> >>>>>>>
> >>>>>>> Some RK3588 boards use separate regulators to supply CPUs and their
> >>>>>>> respective memory interfaces, so this is handled by coupling those
> >>>>>>> regulators in affected boards' device trees to ensure that their
> >>>>>>> voltage is adjusted in step.
> >>>>>>>
> >>>>>>> This also enables the built-in thermal sensor (TSADC) for all
> >> boards
> >>>>>>> that don't currently have it enabled, using the default CRU based
> >>>>>>> emergency thermal reset. This default configuration only uses
> >> on-SoC
> >>>>>>> devices and doesn't rely on any external wiring, thus it should
> >> work
> >>>>>>> for all devices (tested only on Rock 5B though).
> >>>>>>>
> >>>>>>> The boards that have TSADC_SHUT signal wired to the PMIC reset line
> >>>>>>> can choose to override the default reset logic in favour of GPIO
> >>>>>>> driven (PMIC assisted) reset, but in my testing it didn't work on
> >>>>>>> Radxa Rock 5B - maybe I'm reading the schematic wrong and it
> >> doesn't
> >>>>>>> support PMIC assisted reset after all.
> >>>>>>>
> >>>>>>> Fan control on Rock 5B has been split into two intervals: let it
> >> spin
> >>>>>>> at the minimum cooling state between 55C and 65C, and then
> >> accelerate
> >>>>>>> if the system crosses the 65C mark - thanks to Dragan for
> >> suggesting.
> >>>>>>> This lets some cooling setups with beefier heatsinks and/or larger
> >>>>>>> fan fins to stay in the quietest non-zero fan state while still
> >>>>>>> gaining potential benefits from the airflow it generates, and
> >>>>>>> possibly avoiding noisy speeds altogether for some workloads.
> >>>>>>>
> >>>>>>> OPPs help actually scale CPU frequencies up and down for both
> >> cooling
> >>>>>>> and performance - tested on Rock 5B under varied loads. I've
> >> dropped
> >>>>>>> those OPPs that cause frequency reductions without accompanying
> >>>>>>> decrease
> >>>>>>> in CPU voltage, as they don't seem to be adding much benefit in
> >> day to
> >>>>>>> day use, while the kernel log gets a number of "OPP is inefficient"
> >>>>>>> lines.
> >>>>>>>
> >>>>>>> Note that this submission doesn't touch the SRAM read margin
> >> updates
> >>>>>>> or
> >>>>>>> the OPP calibration based on silicon quality which the downstream
> >>>>>>> driver
> >>>>>>> does and which were mentioned in [1]. It works as it is (also
> >>>>>>> confirmed by
> >>>>>>> Sebastian in his follow-up message [2]), and it is stable in my
> >>>>>>> testing on
> >>>>>>> Rock 5B, so it sounds better to merge a simple version first and
> >> then
> >>>>>>> extend when/if required.
> >>>>>>>
> >>>>>>> [1]
> >>>>>>>
> >> https://lore.kernel.org/linux-rockchip/CABjd4YzTL=5S7cS8ACNAYVa730WA3iGd5L_wP1Vn9=f83RCORA@mail.gmail.com/
> >>>>>>> [2]
> >>>>>>>
> >> https://lore.kernel.org/linux-rockchip/pkyne4g2cln27dcdu3jm7bqdqpmd2kwkbguiolmozntjuiajrb@gvq4nupzna4o/
> >>>>>>>
> >>>>>>> Signed-off-by: Alexey Charkov <[email protected]>
> >>>>>>> ---
> >>>>>>
> >>>>>> Hi Heiko,
> >>>>>>
> >>>>>> Do you think this can be merged for 6.11? Looks like there hasn't
> >> been
> >>>>>> any new feedback in a while, and it would be good to have frequency
> >>>>>> scaling in place for RK3588.
> >>>>>>
> >>>>>> Please let me know if you have any reservations or if we need any
> >>>>>> broader discussion.
> >>>>
> >>>> not really reservations, more like there was still discussion going on
> >>>> around the OPPs. Meanwhile we had more discussions regarding the whole
> >>>> speed binning Rockchip seems to do for rk3588 variants.
> >>>>
> >>>> And waiting for the testing Dragan wanted to do ;-) .
> >>>
> >>> I'm sorry for the delays.
> >>
> >> Was definitly _not_ meant as blame ;-) .
> >>
> >> The series has just too many discussions threads to unravel on half
> >> an afternoon.
> >
> >
> > FWIW, I think the latest exchange we had with Quentin regarding the OPPs
> > concluded in “false alarm”, given that this version of the series only
> > introduces a subset of them which should apply to all RK3588(s)
> >
>
> Correct.
>
> However... I'm wondering if we shouldn't somehow follow the same pattern
> we have used for the rk3399 OPPs? We have a file for the "true" RK3399
> OPPs, then the OP1 variant and the RK3399T.
>
> We already know there are a few variants of RK3588 with different OPPs:
> RK3588(S/S2?), RK3588J and RK3588M. I wouldn't be surprised if the
> RK3582 (though this one has already one big cluster (or two big cores)
> fewer than RK3588) has different OPPs as well?
>
> So. We have already discussed that the OPPs in that patch are valid for
> RK3588(S) but they aren't for the other variants.
Hmm. Looking at Rockchip sources [1] more closely as opposed to the
Radxa tree I've been using as the basis, it does indeed show that
RK3588J and RK3588M use a different OPP table altogether (frequencies
are similar, but voltages differ).
We currently have an RK3588J based board in the mainline tree
(rk3588-edgeble-neu6b-io.dts), so it can't be ignored. However, given
that Rockchip sources only differentiate those OPPs by SoC revision,
and that is (currently?) known for each board at dtb compile time, it
seems easier to just include two different OPP tables for RK3588(S)
vs. RK3588J - thus avoiding all the headache with opp-supported-hw.
Similar to RK3399, yes.
Will split those out and send a separate version.
> In the downstream kernel, any OPP whose opp-supported-hw has a first
> value masked by BIT(1) return non-0 is supported by RK3588M. In the
> downstream kernel, any OPP whose opp-supported-hw has a first value
> masked by BIT(2) return non-0 is supported by RK3588J.
>
> This means that, for LITTLE clusters:
> - opp-1608000000 not supported on RK3588J
> - opp-1704000000 only supported on RK3588M (but already absent in this
> patch series)
> - opp-1800000000 only supported on RK3588(S), not RK3588J nor RK3588M
>
> For big clusters:
> - opp-1800000000 not supported on RK3588J
> - opp-2016000000 not supported on RK3588J
> - opp-2208000000 only supported on RK3588(S), not RK3588J nor RK3588M
> - opp-2256000000 only supported on RK3588(S), not RK3588J nor RK3588M
> - opp-2304000000 only supported on RK3588(S), not RK3588J nor RK3588M
> - opp-2352000000 only supported on RK3588(S), not RK3588J nor RK3588M
> - opp-2400000000 only supported on RK3588(S), not RK3588J nor RK3588M
>
> This is somehow also enforced in downstream kernel by removing the OPP
> nodes directly (hence, not even requiring the check of opp-supported-hw
> value), c.f.:
> https://git.theobroma-systems.com/tiger-linux.git/tree/arch/arm64/boot/dts/rockchip/rk3588j.dtsi
> https://git.theobroma-systems.com/tiger-linux.git/tree/arch/arm64/boot/dts/rockchip/rk3588m.dtsi
>
> You'll not that the RK3588J also has less OPPs for the GPU and NPU (but
> those should also be masked by the opp-supported-hw value).
Same with DMC, but we don't currently have either DMC or NPU in the
mainline tree, so it sounds like something to be dealt with later :)
Best regards,
Alexey
[1] https://github.com/rockchip-linux/kernel/blob/develop-5.10/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
Hello Alexey and Quentin,
On 2024-05-28 21:26, Alexey Charkov wrote:
> On Tue, May 28, 2024 at 8:08 PM Quentin Schulz
> <[email protected]> wrote:
>> On 5/28/24 5:42 PM, Alexey Charkov wrote:
>> > On Tue, 28 May 2024 at 19:16, Heiko Stuebner <[email protected]> wrote:
>> >> Am Dienstag, 28. Mai 2024, 17:01:48 CEST schrieb Dragan Simic:
>> >>> On 2024-05-28 16:34, Heiko Stuebner wrote:
>> >>>> Am Dienstag, 28. Mai 2024, 16:05:04 CEST schrieb Dragan Simic:
>> >>>>> On 2024-05-28 11:49, Alexey Charkov wrote:
>> >>>>>> Hi Heiko,
>> >>>>>>
>> >>>>>> Do you think this can be merged for 6.11? Looks like there hasn't
>> >> been
>> >>>>>> any new feedback in a while, and it would be good to have frequency
>> >>>>>> scaling in place for RK3588.
>> >>>>>>
>> >>>>>> Please let me know if you have any reservations or if we need any
>> >>>>>> broader discussion.
>> >>>>
>> >>>> not really reservations, more like there was still discussion going on
>> >>>> around the OPPs. Meanwhile we had more discussions regarding the whole
>> >>>> speed binning Rockchip seems to do for rk3588 variants.
>> >>>>
>> >>>> And waiting for the testing Dragan wanted to do ;-) .
>> >>>
>> >>> I'm sorry for the delays.
>> >>
>> >> Was definitly _not_ meant as blame ;-) .
>> >>
>> >> The series has just too many discussions threads to unravel on half
>> >> an afternoon.
>> >
>> >
>> > FWIW, I think the latest exchange we had with Quentin regarding the OPPs
>> > concluded in “false alarm”, given that this version of the series only
>> > introduces a subset of them which should apply to all RK3588(s)
>>
>> Correct.
>>
>> However... I'm wondering if we shouldn't somehow follow the same
>> pattern
>> we have used for the rk3399 OPPs? We have a file for the "true" RK3399
>> OPPs, then the OP1 variant and the RK3399T.
>>
>> We already know there are a few variants of RK3588 with different
>> OPPs:
>> RK3588(S/S2?), RK3588J and RK3588M. I wouldn't be surprised if the
>> RK3582 (though this one has already one big cluster (or two big cores)
>> fewer than RK3588) has different OPPs as well?
>>
>> So. We have already discussed that the OPPs in that patch are valid
>> for
>> RK3588(S) but they aren't for the other variants.
>
> Hmm. Looking at Rockchip sources [1] more closely as opposed to the
> Radxa tree I've been using as the basis, it does indeed show that
> RK3588J and RK3588M use a different OPP table altogether (frequencies
> are similar, but voltages differ).
>
> We currently have an RK3588J based board in the mainline tree
> (rk3588-edgeble-neu6b-io.dts), so it can't be ignored. However, given
> that Rockchip sources only differentiate those OPPs by SoC revision,
> and that is (currently?) known for each board at dtb compile time, it
> seems easier to just include two different OPP tables for RK3588(S)
> vs. RK3588J - thus avoiding all the headache with opp-supported-hw.
> Similar to RK3399, yes.
>
> Will split those out and send a separate version.
Ah, you already answered my question about the existence of supported
boards with the RK3588J and RK3588M variants. :) [2] I totally agree
about splitting the OPPs into the separate .dtsi files, i.e. following
the approach already established by the RK3399.
Perhaps the new files should be named rk3588-opp.dtsi and
rk3588j-opp.dtsi,
to (almost fully) follow the already established naming scheme. Maybe
the OPP data could instead be added to the already existing rk3588.dtsi,
rk3588s.dtsi and rk3588j.dtsi files, which would actually make more
sense
because those .dtsi files already address the specific SoC variants, but
the approach with the separate new rk3588*-opp.dtsi files seems cleaner
from the implementation point of view, and is a bit safer.
If you see a clean way for stuffing the OPP data into the already
existing
rk3588.dtsi, rk3588s.dtsi and rk3588j.dtsi files, instead of introducing
new rk3588*-opp.dtsi files, I'd be happy to support it.
[2]
https://lore.kernel.org/linux-rockchip/[email protected]/
>> In the downstream kernel, any OPP whose opp-supported-hw has a first
>> value masked by BIT(1) return non-0 is supported by RK3588M. In the
>> downstream kernel, any OPP whose opp-supported-hw has a first value
>> masked by BIT(2) return non-0 is supported by RK3588J.
>>
>> This means that, for LITTLE clusters:
>> - opp-1608000000 not supported on RK3588J
>> - opp-1704000000 only supported on RK3588M (but already absent in this
>> patch series)
>> - opp-1800000000 only supported on RK3588(S), not RK3588J nor RK3588M
>>
>> For big clusters:
>> - opp-1800000000 not supported on RK3588J
>> - opp-2016000000 not supported on RK3588J
>> - opp-2208000000 only supported on RK3588(S), not RK3588J nor RK3588M
>> - opp-2256000000 only supported on RK3588(S), not RK3588J nor RK3588M
>> - opp-2304000000 only supported on RK3588(S), not RK3588J nor RK3588M
>> - opp-2352000000 only supported on RK3588(S), not RK3588J nor RK3588M
>> - opp-2400000000 only supported on RK3588(S), not RK3588J nor RK3588M
>>
>> This is somehow also enforced in downstream kernel by removing the OPP
>> nodes directly (hence, not even requiring the check of
>> opp-supported-hw
>> value), c.f.:
>> https://git.theobroma-systems.com/tiger-linux.git/tree/arch/arm64/boot/dts/rockchip/rk3588j.dtsi
>> https://git.theobroma-systems.com/tiger-linux.git/tree/arch/arm64/boot/dts/rockchip/rk3588m.dtsi
>>
>> You'll not that the RK3588J also has less OPPs for the GPU and NPU
>> (but
>> those should also be masked by the opp-supported-hw value).
>
> Same with DMC, but we don't currently have either DMC or NPU in the
> mainline tree, so it sounds like something to be dealt with later :)
>
> [1]
> https://github.com/rockchip-linux/kernel/blob/develop-5.10/arch/arm64/boot/dts/rockchip/rk3588s.dtsi
On 2024-05-29 02:35, Dragan Simic wrote:
> On 2024-05-28 21:26, Alexey Charkov wrote:
>> On Tue, May 28, 2024 at 8:08 PM Quentin Schulz
>> <[email protected]> wrote:
>>> On 5/28/24 5:42 PM, Alexey Charkov wrote:
>>> > On Tue, 28 May 2024 at 19:16, Heiko Stuebner <[email protected]> wrote:
>>> >> Am Dienstag, 28. Mai 2024, 17:01:48 CEST schrieb Dragan Simic:
>>> >>> On 2024-05-28 16:34, Heiko Stuebner wrote:
>>> >>>> Am Dienstag, 28. Mai 2024, 16:05:04 CEST schrieb Dragan Simic:
>>> >>>>> On 2024-05-28 11:49, Alexey Charkov wrote:
>>> >>>>>> Hi Heiko,
>>> >>>>>>
>>> >>>>>> Do you think this can be merged for 6.11? Looks like there hasn't
>>> >> been
>>> >>>>>> any new feedback in a while, and it would be good to have frequency
>>> >>>>>> scaling in place for RK3588.
>>> >>>>>>
>>> >>>>>> Please let me know if you have any reservations or if we need any
>>> >>>>>> broader discussion.
>>> >>>>
>>> >>>> not really reservations, more like there was still discussion going on
>>> >>>> around the OPPs. Meanwhile we had more discussions regarding the whole
>>> >>>> speed binning Rockchip seems to do for rk3588 variants.
>>> >>>>
>>> >>>> And waiting for the testing Dragan wanted to do ;-) .
>>> >>>
>>> >>> I'm sorry for the delays.
>>> >>
>>> >> Was definitly _not_ meant as blame ;-) .
>>> >>
>>> >> The series has just too many discussions threads to unravel on half
>>> >> an afternoon.
>>> >
>>> >
>>> > FWIW, I think the latest exchange we had with Quentin regarding the OPPs
>>> > concluded in “false alarm”, given that this version of the series only
>>> > introduces a subset of them which should apply to all RK3588(s)
>>>
>>> Correct.
>>>
>>> However... I'm wondering if we shouldn't somehow follow the same
>>> pattern
>>> we have used for the rk3399 OPPs? We have a file for the "true"
>>> RK3399
>>> OPPs, then the OP1 variant and the RK3399T.
>>>
>>> We already know there are a few variants of RK3588 with different
>>> OPPs:
>>> RK3588(S/S2?), RK3588J and RK3588M. I wouldn't be surprised if the
>>> RK3582 (though this one has already one big cluster (or two big
>>> cores)
>>> fewer than RK3588) has different OPPs as well?
>>>
>>> So. We have already discussed that the OPPs in that patch are valid
>>> for
>>> RK3588(S) but they aren't for the other variants.
>>
>> Hmm. Looking at Rockchip sources [1] more closely as opposed to the
>> Radxa tree I've been using as the basis, it does indeed show that
>> RK3588J and RK3588M use a different OPP table altogether (frequencies
>> are similar, but voltages differ).
>>
>> We currently have an RK3588J based board in the mainline tree
>> (rk3588-edgeble-neu6b-io.dts), so it can't be ignored. However, given
>> that Rockchip sources only differentiate those OPPs by SoC revision,
>> and that is (currently?) known for each board at dtb compile time, it
>> seems easier to just include two different OPP tables for RK3588(S)
>> vs. RK3588J - thus avoiding all the headache with opp-supported-hw.
>> Similar to RK3399, yes.
>>
>> Will split those out and send a separate version.
>
> Ah, you already answered my question about the existence of supported
> boards with the RK3588J and RK3588M variants. :) [2] I totally agree
> about splitting the OPPs into the separate .dtsi files, i.e. following
> the approach already established by the RK3399.
>
> Perhaps the new files should be named rk3588-opp.dtsi and
> rk3588j-opp.dtsi,
> to (almost fully) follow the already established naming scheme. Maybe
> the OPP data could instead be added to the already existing
> rk3588.dtsi,
> rk3588s.dtsi and rk3588j.dtsi files, which would actually make more
> sense
> because those .dtsi files already address the specific SoC variants,
> but
> the approach with the separate new rk3588*-opp.dtsi files seems cleaner
> from the implementation point of view, and is a bit safer.
>
> If you see a clean way for stuffing the OPP data into the already
> existing
> rk3588.dtsi, rk3588s.dtsi and rk3588j.dtsi files, instead of
> introducing
> new rk3588*-opp.dtsi files, I'd be happy to support it.
>
> [2]
> https://lore.kernel.org/linux-rockchip/[email protected]/
On second thought, it makes more sense to rename and shuffle the RK3588
dtsi files a bit, to make it possible to have the OPPs already specified
when a board dts file includes the dtsi file for a RK3588 variant.
Thus,
I went ahead and prepared an RFC patch that does exactly that. [3]
Please, have a look at that RFC patch. To quote its description:
---------------------------------------------------------------------------
Rename and modify the RK3588 dtsi files a bit, to make preparations for
the ability to specify different CPU and GPU OPPs for each of the
supported
RK3588 SoC variants, including the RK3588J.
As already discussed, some of the different RK3588 SoC variants
require different OPPs, and it makes more sense to have the OPPs already
defined when a board dts includes one of the SoC dtsi files
(rk3588.dtsi,
rk3588j.dtsi or rk3588s.dtsi), rather than requiring the board dts file
to
also include a separate rk3588*-opp.dtsi file. The choice of the SoC
variant
is already made by the inclusion of the SoC dtsi file, and it doesn't
make
much sense to, effectively, allow the board dts file to include and use
an
incompatible set of OPPs for the already selected SoC variant.
No intended functional changes are introduced. (Still to be
additionally
verified if the patch moves forward from the RFC state.)
---------------------------------------------------------------------------
[3]
https://lore.kernel.org/linux-rockchip/673dcf47596e7bc8ba065034e339bb1bbf9cdcb0.1716948159.git.dsimic@manjaro.org/
>>> In the downstream kernel, any OPP whose opp-supported-hw has a first
>>> value masked by BIT(1) return non-0 is supported by RK3588M. In the
>>> downstream kernel, any OPP whose opp-supported-hw has a first value
>>> masked by BIT(2) return non-0 is supported by RK3588J.
>>>
>>> This means that, for LITTLE clusters:
>>> - opp-1608000000 not supported on RK3588J
>>> - opp-1704000000 only supported on RK3588M (but already absent in
>>> this
>>> patch series)
>>> - opp-1800000000 only supported on RK3588(S), not RK3588J nor RK3588M
>>>
>>> For big clusters:
>>> - opp-1800000000 not supported on RK3588J
>>> - opp-2016000000 not supported on RK3588J
>>> - opp-2208000000 only supported on RK3588(S), not RK3588J nor RK3588M
>>> - opp-2256000000 only supported on RK3588(S), not RK3588J nor RK3588M
>>> - opp-2304000000 only supported on RK3588(S), not RK3588J nor RK3588M
>>> - opp-2352000000 only supported on RK3588(S), not RK3588J nor RK3588M
>>> - opp-2400000000 only supported on RK3588(S), not RK3588J nor RK3588M
>>>
>>> This is somehow also enforced in downstream kernel by removing the
>>> OPP
>>> nodes directly (hence, not even requiring the check of
>>> opp-supported-hw
>>> value), c.f.:
>>> https://git.theobroma-systems.com/tiger-linux.git/tree/arch/arm64/boot/dts/rockchip/rk3588j.dtsi
>>> https://git.theobroma-systems.com/tiger-linux.git/tree/arch/arm64/boot/dts/rockchip/rk3588m.dtsi
>>>
>>> You'll not that the RK3588J also has less OPPs for the GPU and NPU
>>> (but
>>> those should also be masked by the opp-supported-hw value).
>>
>> Same with DMC, but we don't currently have either DMC or NPU in the
>> mainline tree, so it sounds like something to be dealt with later :)
>>
>> [1]
>> https://github.com/rockchip-linux/kernel/blob/develop-5.10/arch/arm64/boot/dts/rockchip/rk3588s.dtsi