2024-05-10 12:58:47

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH 00/12] Adreno cooling, take 2

For the thermal framework to cool devfreq-managed devices properly,
it seems like the following conditions must be met:

1. the devfreq device has a cooling device associated with it
2. there exists some thermal zone provider
3. the cooling device is referenced in a cooling map
4. the cooling map is associated with a thermal trip point
5. the thermal trip point is of the "passive" kind
6. the "passive" trip point is being updated (via polling or otherwise)
7. the trip point is being hit (i.e. the thing gets hot enough)

Various QC DTs have various issues, mostly around 4, 5, 6 and 7.
This series tries to amend the platforms that currently can't have
Adreno throttled, without making much unnecessary/debatable mess,
although sneaking in some configuration unification/standardization.

Further updates can be made in the future.

This was originally brought into attention by Daniel in [1], this
series resolves the issues on a treewide scale.

Developed atop (and thereby depends on) [2].

[1] https://lore.kernel.org/linux-arm-msm/[email protected]/
[2] https://lore.kernel.org/linux-arm-msm/[email protected]/T/#t

Signed-off-by: Konrad Dybcio <[email protected]>
---
Konrad Dybcio (12):
arm64: dts: qcom: sc8180x: Throttle the GPU when overheating
arm64: dts: qcom: sc8280xp: Throttle the GPU when overheating
arm64: dts: qcom: sdm630: Throttle the GPU when overheating
arm64: dts: qcom: sdm845: Throttle the GPU when overheating
arm64: dts: qcom: sm6115: Update GPU thermal zone settings
arm64: dts: qcom: sm6350: Update GPU thermal zone settings
arm64: dts: qcom: sm8150: Throttle the GPU when overheating
arm64: dts: qcom: sm8250: Throttle the GPU when overheating
arm64: dts: qcom: sm8350: Throttle the GPU when overheating
arm64: dts: qcom: sm8450: Throttle the GPU when overheating
arm64: dts: qcom: sm8550: Throttle the GPU when overheating
arm64: dts: qcom: sm8650: Throttle the GPU when overheating

arch/arm64/boot/dts/qcom/sc8180x.dtsi | 28 ++++-
arch/arm64/boot/dts/qcom/sc8280xp.dtsi | 17 ++-
arch/arm64/boot/dts/qcom/sdm630.dtsi | 12 ++
arch/arm64/boot/dts/qcom/sdm845.dtsi | 28 ++++-
arch/arm64/boot/dts/qcom/sm6115.dtsi | 8 +-
arch/arm64/boot/dts/qcom/sm6350.dtsi | 16 ++-
arch/arm64/boot/dts/qcom/sm8150.dtsi | 28 ++++-
arch/arm64/boot/dts/qcom/sm8250.dtsi | 28 ++++-
arch/arm64/boot/dts/qcom/sm8350.dtsi | 24 ++++
arch/arm64/boot/dts/qcom/sm8450.dtsi | 48 +++-----
arch/arm64/boot/dts/qcom/sm8550.dtsi | 208 +++++++++++++--------------------
arch/arm64/boot/dts/qcom/sm8650.dtsi | 169 ++++++++++++++++++++++-----
12 files changed, 406 insertions(+), 208 deletions(-)
---
base-commit: 2adffd063e54f8790132eedfaf3019bfb6f62268
change-id: 20240510-topic-gpus_are_cool_now-ed8d8c4f5f7f

Best regards,
--
Konrad Dybcio <[email protected]>



2024-05-10 12:59:09

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH 02/12] arm64: dts: qcom: sc8280xp: Throttle the GPU when overheating

Add an 85C passive trip point with 1C of hysteresis to ensure the
thermal framework takes sufficient action to prevent reaching junction
temperature. Also, add passive polling to ensure more than one
temperature change event is recorded.

Fixes: 014bbc990e27 ("arm64: dts: qcom: sc8280xp: Introduce additional tsens instances")
Signed-off-by: Konrad Dybcio <[email protected]>
---
arch/arm64/boot/dts/qcom/sc8280xp.dtsi | 17 ++++++++++++++++-
1 file changed, 16 insertions(+), 1 deletion(-)

diff --git a/arch/arm64/boot/dts/qcom/sc8280xp.dtsi b/arch/arm64/boot/dts/qcom/sc8280xp.dtsi
index f63951186a5b..65c444cce00c 100644
--- a/arch/arm64/boot/dts/qcom/sc8280xp.dtsi
+++ b/arch/arm64/boot/dts/qcom/sc8280xp.dtsi
@@ -5956,10 +5956,25 @@ cpu-crit {
};

gpu-thermal {
+ polling-delay-passive = <250>;
+
thermal-sensors = <&tsens2 2>;

+ cooling-maps {
+ map0 {
+ trip = <&gpu_alert0>;
+ cooling-device = <&gpu THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+ };
+ };
+
trips {
- gpu-crit {
+ gpu_alert0: trip-point0 {
+ temperature = <85000>;
+ hysteresis = <1000>;
+ type = "passive";
+ };
+
+ trip-point1 {
temperature = <110000>;
hysteresis = <1000>;
type = "critical";

--
2.40.1


2024-05-10 12:59:26

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH 03/12] arm64: dts: qcom: sdm630: Throttle the GPU when overheating

Add an 85C passive trip point to ensure the thermal framework takes
sufficient action to prevent reaching junction temperature and a
110C critical point to help avoid hw damage.

Signed-off-by: Konrad Dybcio <[email protected]>
---
arch/arm64/boot/dts/qcom/sdm630.dtsi | 12 ++++++++++++
1 file changed, 12 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sdm630.dtsi b/arch/arm64/boot/dts/qcom/sdm630.dtsi
index 7702d42e82c1..a46dbe725e54 100644
--- a/arch/arm64/boot/dts/qcom/sdm630.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm630.dtsi
@@ -2582,10 +2582,22 @@ map0 {

trips {
gpu_alert0: trip-point0 {
+ temperature = <85000>;
+ hysteresis = <1000>;
+ type = "passive";
+ };
+
+ trip-point1 {
temperature = <90000>;
hysteresis = <1000>;
type = "hot";
};
+
+ trip-point2 {
+ temperature = <110000>;
+ hysteresis = <1000>;
+ type = "critical";
+ };
};
};
};

--
2.40.1


2024-05-10 12:59:34

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH 01/12] arm64: dts: qcom: sc8180x: Throttle the GPU when overheating

Add an 85C passive trip point to ensure the thermal framework takes
sufficient action to prevent reaching junction temperature and a
110C critical point to help avoid hw damage.

Signed-off-by: Konrad Dybcio <[email protected]>
---
arch/arm64/boot/dts/qcom/sc8180x.dtsi | 28 ++++++++++++++++++++++++++--
1 file changed, 26 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/sc8180x.dtsi b/arch/arm64/boot/dts/qcom/sc8180x.dtsi
index aedf2e7db038..699f377e94d3 100644
--- a/arch/arm64/boot/dts/qcom/sc8180x.dtsi
+++ b/arch/arm64/boot/dts/qcom/sc8180x.dtsi
@@ -3993,10 +3993,22 @@ map0 {

trips {
gpu_top_alert0: trip-point0 {
+ temperature = <85000>;
+ hysteresis = <1000>;
+ type = "passive";
+ };
+
+ trip-point1 {
temperature = <90000>;
- hysteresis = <2000>;
+ hysteresis = <1000>;
type = "hot";
};
+
+ trip-point2 {
+ temperature = <110000>;
+ hysteresis = <1000>;
+ type = "critical";
+ };
};
};

@@ -4140,10 +4152,22 @@ map0 {

trips {
gpu_bottom_alert0: trip-point0 {
+ temperature = <85000>;
+ hysteresis = <1000>;
+ type = "passive";
+ };
+
+ trip-point1 {
temperature = <90000>;
- hysteresis = <2000>;
+ hysteresis = <1000>;
type = "hot";
};
+
+ trip-point2 {
+ temperature = <110000>;
+ hysteresis = <1000>;
+ type = "critical";
+ };
};
};
};

--
2.40.1


2024-05-10 12:59:53

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH 04/12] arm64: dts: qcom: sdm845: Throttle the GPU when overheating

Add an 85C passive trip point to ensure the thermal framework takes
sufficient action to prevent reaching junction temperature and a
110C critical point to help avoid hw damage.

Signed-off-by: Konrad Dybcio <[email protected]>
---
arch/arm64/boot/dts/qcom/sdm845.dtsi | 28 ++++++++++++++++++++++++++--
1 file changed, 26 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/sdm845.dtsi b/arch/arm64/boot/dts/qcom/sdm845.dtsi
index 57507d6ec918..52eda2bb6b09 100644
--- a/arch/arm64/boot/dts/qcom/sdm845.dtsi
+++ b/arch/arm64/boot/dts/qcom/sdm845.dtsi
@@ -5630,10 +5630,22 @@ map0 {

trips {
gpu_top_alert0: trip-point0 {
+ temperature = <85000>;
+ hysteresis = <1000>;
+ type = "passive";
+ };
+
+ trip-point1 {
temperature = <90000>;
- hysteresis = <2000>;
+ hysteresis = <1000>;
type = "hot";
};
+
+ trip-point2 {
+ temperature = <110000>;
+ hysteresis = <1000>;
+ type = "critical";
+ };
};
};

@@ -5651,10 +5663,22 @@ map0 {

trips {
gpu_bottom_alert0: trip-point0 {
+ temperature = <85000>;
+ hysteresis = <1000>;
+ type = "passive";
+ };
+
+ trip-point1 {
temperature = <90000>;
- hysteresis = <2000>;
+ hysteresis = <1000>;
type = "hot";
};
+
+ trip-point2 {
+ temperature = <110000>;
+ hysteresis = <1000>;
+ type = "critical";
+ };
};
};


--
2.40.1


2024-05-10 12:59:57

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH 05/12] arm64: dts: qcom: sm6115: Update GPU thermal zone settings

Lower the thresholds to something more reasonable and introduce a
passive polling delay to make sure more than one "passive" thermal point
is taken into account when throttling.

Signed-off-by: Konrad Dybcio <[email protected]>
---
arch/arm64/boot/dts/qcom/sm6115.dtsi | 8 +++++---
1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/sm6115.dtsi b/arch/arm64/boot/dts/qcom/sm6115.dtsi
index 0a0bb5310849..afa08dd0dd69 100644
--- a/arch/arm64/boot/dts/qcom/sm6115.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm6115.dtsi
@@ -3323,6 +3323,8 @@ trip-point1 {
};

gpu-thermal {
+ polling-delay-passive = <250>;
+
thermal-sensors = <&tsens0 15>;

cooling-maps {
@@ -3334,13 +3336,13 @@ map0 {

trips {
gpu_alert0: trip-point0 {
- temperature = <115000>;
- hysteresis = <5000>;
+ temperature = <85000>;
+ hysteresis = <1000>;
type = "passive";
};

trip-point1 {
- temperature = <125000>;
+ temperature = <110000>;
hysteresis = <1000>;
type = "critical";
};

--
2.40.1


2024-05-10 13:00:13

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH 06/12] arm64: dts: qcom: sm6350: Update GPU thermal zone settings

Lower the thresholds to something more reasonable and introduce a
passive polling delay to make sure more than one "passive" thermal point
is taken into account when throttling.

Signed-off-by: Konrad Dybcio <[email protected]>
---
arch/arm64/boot/dts/qcom/sm6350.dtsi | 16 ++++++++++------
1 file changed, 10 insertions(+), 6 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/sm6350.dtsi b/arch/arm64/boot/dts/qcom/sm6350.dtsi
index abfaa1178a39..99813f380df0 100644
--- a/arch/arm64/boot/dts/qcom/sm6350.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm6350.dtsi
@@ -3177,18 +3177,20 @@ ddr-crit {
};

gpuss0-thermal {
+ polling-delay-passive = <250>;
+
thermal-sensors = <&tsens0 13>;

trips {
gpuss0_alert0: trip-point0 {
- temperature = <95000>;
+ temperature = <85000>;
hysteresis = <2000>;
type = "passive";
};

gpuss0-crit {
- temperature = <115000>;
- hysteresis = <0>;
+ temperature = <110000>;
+ hysteresis = <1000>;
type = "critical";
};
};
@@ -3202,18 +3204,20 @@ map0 {
};

gpuss1-thermal {
+ polling-delay-passive = <250>;
+
thermal-sensors = <&tsens0 14>;

trips {
gpuss1_alert0: trip-point0 {
- temperature = <95000>;
+ temperature = <85000>;
hysteresis = <2000>;
type = "passive";
};

gpuss1-crit {
- temperature = <115000>;
- hysteresis = <0>;
+ temperature = <110000>;
+ hysteresis = <1000>;
type = "critical";
};
};

--
2.40.1


2024-05-10 13:00:23

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH 07/12] arm64: dts: qcom: sm8150: Throttle the GPU when overheating

Add an 85C passive trip point to ensure the thermal framework takes
sufficient action to prevent reaching junction temperature and a
110C critical point to help avoid hw damage.

Signed-off-by: Konrad Dybcio <[email protected]>
---
arch/arm64/boot/dts/qcom/sm8150.dtsi | 28 ++++++++++++++++++++++++++--
1 file changed, 26 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/sm8150.dtsi b/arch/arm64/boot/dts/qcom/sm8150.dtsi
index 1f597f03107b..8e9194051283 100644
--- a/arch/arm64/boot/dts/qcom/sm8150.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8150.dtsi
@@ -5157,10 +5157,22 @@ map0 {

trips {
gpu_top_alert0: trip-point0 {
+ temperature = <85000>;
+ hysteresis = <1000>;
+ type = "passive";
+ };
+
+ trip-point1 {
temperature = <90000>;
- hysteresis = <2000>;
+ hysteresis = <1000>;
type = "hot";
};
+
+ trip-point2 {
+ temperature = <110000>;
+ hysteresis = <1000>;
+ type = "critical";
+ };
};
};

@@ -5332,10 +5344,22 @@ map0 {

trips {
gpu_bottom_alert0: trip-point0 {
+ temperature = <85000>;
+ hysteresis = <1000>;
+ type = "passive";
+ };
+
+ trip-point1 {
temperature = <90000>;
- hysteresis = <2000>;
+ hysteresis = <1000>;
type = "hot";
};
+
+ trip-point2 {
+ temperature = <110000>;
+ hysteresis = <1000>;
+ type = "critical";
+ };
};
};
};

--
2.40.1


2024-05-10 13:00:38

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH 08/12] arm64: dts: qcom: sm8250: Throttle the GPU when overheating

Add an 85C passive trip point to ensure the thermal framework takes
sufficient action to prevent reaching junction temperature and a
110C critical point to help avoid hw damage.

Signed-off-by: Konrad Dybcio <[email protected]>
---
arch/arm64/boot/dts/qcom/sm8250.dtsi | 28 ++++++++++++++++++++++++++--
1 file changed, 26 insertions(+), 2 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/sm8250.dtsi b/arch/arm64/boot/dts/qcom/sm8250.dtsi
index 1a1202bdd915..b734aa13fd2e 100644
--- a/arch/arm64/boot/dts/qcom/sm8250.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8250.dtsi
@@ -6855,10 +6855,22 @@ map0 {

trips {
gpu_top_alert0: trip-point0 {
+ temperature = <85000>;
+ hysteresis = <1000>;
+ type = "passive";
+ };
+
+ trip-point1 {
temperature = <90000>;
- hysteresis = <2000>;
+ hysteresis = <1000>;
type = "hot";
};
+
+ trip-point2 {
+ temperature = <110000>;
+ hysteresis = <1000>;
+ type = "critical";
+ };
};
};

@@ -6988,10 +7000,22 @@ map0 {

trips {
gpu_bottom_alert0: trip-point0 {
+ temperature = <85000>;
+ hysteresis = <1000>;
+ type = "passive";
+ };
+
+ trip-point1 {
temperature = <90000>;
- hysteresis = <2000>;
+ hysteresis = <1000>;
type = "hot";
};
+
+ trip-point2 {
+ temperature = <110000>;
+ hysteresis = <1000>;
+ type = "critical";
+ };
};
};
};

--
2.40.1


2024-05-10 13:02:16

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH 09/12] arm64: dts: qcom: sm8350: Throttle the GPU when overheating

Add an 85C passive trip point to ensure the thermal framework takes
sufficient action to prevent reaching junction temperature and a
110C critical point to help avoid hw damage.

Signed-off-by: Konrad Dybcio <[email protected]>
---
arch/arm64/boot/dts/qcom/sm8350.dtsi | 24 ++++++++++++++++++++++++
1 file changed, 24 insertions(+)

diff --git a/arch/arm64/boot/dts/qcom/sm8350.dtsi b/arch/arm64/boot/dts/qcom/sm8350.dtsi
index 526d3c92eae8..94782180bce7 100644
--- a/arch/arm64/boot/dts/qcom/sm8350.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8350.dtsi
@@ -4259,10 +4259,22 @@ map0 {

trips {
gpu_top_alert0: trip-point0 {
+ temperature = <85000>;
+ hysteresis = <1000>;
+ type = "passive";
+ };
+
+ trip-point1 {
temperature = <90000>;
hysteresis = <1000>;
type = "hot";
};
+
+ trip-point2 {
+ temperature = <110000>;
+ hysteresis = <1000>;
+ type = "critical";
+ };
};
};

@@ -4280,10 +4292,22 @@ map0 {

trips {
gpu_bottom_alert0: trip-point0 {
+ temperature = <85000>;
+ hysteresis = <1000>;
+ type = "passive";
+ };
+
+ trip-point1 {
temperature = <90000>;
hysteresis = <1000>;
type = "hot";
};
+
+ trip-point2 {
+ temperature = <110000>;
+ hysteresis = <1000>;
+ type = "critical";
+ };
};
};


--
2.40.1


2024-05-10 13:02:48

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH 10/12] arm64: dts: qcom: sm8450: Throttle the GPU when overheating

Add an 85C passive trip point to ensure the thermal framework takes
sufficient action to prevent reaching junction temperature and a
110C critical point to help avoid hw damage.

Remove the copypasta-from-downstream userspace governor entries while
at it.

Signed-off-by: Konrad Dybcio <[email protected]>
---
arch/arm64/boot/dts/qcom/sm8450.dtsi | 48 ++++++++++++++----------------------
1 file changed, 18 insertions(+), 30 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/sm8450.dtsi b/arch/arm64/boot/dts/qcom/sm8450.dtsi
index ee60fd257efe..38e8d3e9dd43 100644
--- a/arch/arm64/boot/dts/qcom/sm8450.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8450.dtsi
@@ -4928,28 +4928,22 @@ map0 {
};

trips {
- thermal-engine-config {
- temperature = <125000>;
+ gpu_top_alert0: trip-point0 {
+ temperature = <85000>;
hysteresis = <1000>;
type = "passive";
};

- thermal-hal-config {
- temperature = <125000>;
+ trip-point1 {
+ temperature = <90000>;
hysteresis = <1000>;
- type = "passive";
- };
-
- reset-mon-cfg {
- temperature = <115000>;
- hysteresis = <5000>;
- type = "passive";
+ type = "hot";
};

- gpu_top_alert0: trip-point0 {
- temperature = <95000>;
- hysteresis = <5000>;
- type = "passive";
+ trip-point2 {
+ temperature = <110000>;
+ hysteresis = <1000>;
+ type = "critical";
};
};
};
@@ -4967,28 +4961,22 @@ map0 {
};

trips {
- thermal-engine-config {
- temperature = <125000>;
+ gpu_bottom_alert0: trip-point0 {
+ temperature = <85000>;
hysteresis = <1000>;
type = "passive";
};

- thermal-hal-config {
- temperature = <125000>;
+ trip-point1 {
+ temperature = <90000>;
hysteresis = <1000>;
- type = "passive";
- };
-
- reset-mon-cfg {
- temperature = <115000>;
- hysteresis = <5000>;
- type = "passive";
+ type = "hot";
};

- gpu_bottom_alert0: trip-point0 {
- temperature = <95000>;
- hysteresis = <5000>;
- type = "passive";
+ trip-point2 {
+ temperature = <110000>;
+ hysteresis = <1000>;
+ type = "critical";
};
};
};

--
2.40.1


2024-05-10 13:02:49

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH 11/12] arm64: dts: qcom: sm8550: Throttle the GPU when overheating

Add an 85C passive trip point to ensure the thermal framework takes
sufficient action to prevent reaching junction temperature and a
110C critical point to help avoid hw damage.

Remove the copypasta-from-downstream userspace governor entries while
at it.

Signed-off-by: Konrad Dybcio <[email protected]>
---
arch/arm64/boot/dts/qcom/sm8550.dtsi | 208 ++++++++++++++---------------------
1 file changed, 80 insertions(+), 128 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/sm8550.dtsi b/arch/arm64/boot/dts/qcom/sm8550.dtsi
index 51c547872438..23f769a5b1d4 100644
--- a/arch/arm64/boot/dts/qcom/sm8550.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8550.dtsi
@@ -5367,34 +5367,28 @@ gpuss-0-thermal {

cooling-maps {
map0 {
- trip = <&gpu0_junction_config>;
+ trip = <&gpu0_alert0>;
cooling-device = <&gpu THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
};
};

trips {
- thermal-engine-config {
- temperature = <125000>;
+ gpu0_alert0: trip-point0 {
+ temperature = <85000>;
hysteresis = <1000>;
type = "passive";
};

- thermal-hal-config {
- temperature = <125000>;
+ trip-point1 {
+ temperature = <90000>;
hysteresis = <1000>;
- type = "passive";
- };
-
- reset-mon-config {
- temperature = <115000>;
- hysteresis = <5000>;
- type = "passive";
+ type = "hot";
};

- gpu0_junction_config: junction-config {
- temperature = <95000>;
- hysteresis = <5000>;
- type = "passive";
+ trip-point2 {
+ temperature = <110000>;
+ hysteresis = <1000>;
+ type = "critical";
};
};
};
@@ -5406,34 +5400,28 @@ gpuss-1-thermal {

cooling-maps {
map0 {
- trip = <&gpu1_junction_config>;
+ trip = <&gpu1_alert0>;
cooling-device = <&gpu THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
};
};

trips {
- thermal-engine-config {
- temperature = <125000>;
+ gpu1_alert0: trip-point0 {
+ temperature = <85000>;
hysteresis = <1000>;
type = "passive";
};

- thermal-hal-config {
- temperature = <125000>;
+ trip-point1 {
+ temperature = <90000>;
hysteresis = <1000>;
- type = "passive";
- };
-
- reset-mon-config {
- temperature = <115000>;
- hysteresis = <5000>;
- type = "passive";
+ type = "hot";
};

- gpu1_junction_config: junction-config {
- temperature = <95000>;
- hysteresis = <5000>;
- type = "passive";
+ trip-point2 {
+ temperature = <110000>;
+ hysteresis = <1000>;
+ type = "critical";
};
};
};
@@ -5445,34 +5433,28 @@ gpuss-2-thermal {

cooling-maps {
map0 {
- trip = <&gpu2_junction_config>;
+ trip = <&gpu2_alert0>;
cooling-device = <&gpu THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
};
};

trips {
- thermal-engine-config {
- temperature = <125000>;
+ gpu2_alert0: trip-point0 {
+ temperature = <85000>;
hysteresis = <1000>;
type = "passive";
};

- thermal-hal-config {
- temperature = <125000>;
+ trip-point1 {
+ temperature = <90000>;
hysteresis = <1000>;
- type = "passive";
- };
-
- reset-mon-config {
- temperature = <115000>;
- hysteresis = <5000>;
- type = "passive";
+ type = "hot";
};

- gpu2_junction_config: junction-config {
- temperature = <95000>;
- hysteresis = <5000>;
- type = "passive";
+ trip-point2 {
+ temperature = <110000>;
+ hysteresis = <1000>;
+ type = "critical";
};
};
};
@@ -5484,34 +5466,28 @@ gpuss-3-thermal {

cooling-maps {
map0 {
- trip = <&gpu3_junction_config>;
+ trip = <&gpu3_alert0>;
cooling-device = <&gpu THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
};
};

trips {
- thermal-engine-config {
- temperature = <125000>;
+ gpu3_alert0: trip-point0 {
+ temperature = <85000>;
hysteresis = <1000>;
type = "passive";
};

- thermal-hal-config {
- temperature = <125000>;
+ trip-point1 {
+ temperature = <90000>;
hysteresis = <1000>;
- type = "passive";
- };
-
- reset-mon-config {
- temperature = <115000>;
- hysteresis = <5000>;
- type = "passive";
+ type = "hot";
};

- gpu3_junction_config: junction-config {
- temperature = <95000>;
- hysteresis = <5000>;
- type = "passive";
+ trip-point2 {
+ temperature = <110000>;
+ hysteresis = <1000>;
+ type = "critical";
};
};
};
@@ -5523,34 +5499,28 @@ gpuss-4-thermal {

cooling-maps {
map0 {
- trip = <&gpu4_junction_config>;
+ trip = <&gpu4_alert0>;
cooling-device = <&gpu THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
};
};

trips {
- thermal-engine-config {
- temperature = <125000>;
+ gpu4_alert0: trip-point0 {
+ temperature = <85000>;
hysteresis = <1000>;
type = "passive";
};

- thermal-hal-config {
- temperature = <125000>;
+ trip-point1 {
+ temperature = <90000>;
hysteresis = <1000>;
- type = "passive";
- };
-
- reset-mon-config {
- temperature = <115000>;
- hysteresis = <5000>;
- type = "passive";
+ type = "hot";
};

- gpu4_junction_config: junction-config {
- temperature = <95000>;
- hysteresis = <5000>;
- type = "passive";
+ trip-point2 {
+ temperature = <110000>;
+ hysteresis = <1000>;
+ type = "critical";
};
};
};
@@ -5562,34 +5532,28 @@ gpuss-5-thermal {

cooling-maps {
map0 {
- trip = <&gpu5_junction_config>;
+ trip = <&gpu5_alert0>;
cooling-device = <&gpu THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
};
};

trips {
- thermal-engine-config {
- temperature = <125000>;
+ gpu5_alert0: trip-point0 {
+ temperature = <85000>;
hysteresis = <1000>;
type = "passive";
};

- thermal-hal-config {
- temperature = <125000>;
+ trip-point1 {
+ temperature = <90000>;
hysteresis = <1000>;
- type = "passive";
- };
-
- reset-mon-config {
- temperature = <115000>;
- hysteresis = <5000>;
- type = "passive";
+ type = "hot";
};

- gpu5_junction_config: junction-config {
- temperature = <95000>;
- hysteresis = <5000>;
- type = "passive";
+ trip-point2 {
+ temperature = <110000>;
+ hysteresis = <1000>;
+ type = "critical";
};
};
};
@@ -5601,34 +5565,28 @@ gpuss-6-thermal {

cooling-maps {
map0 {
- trip = <&gpu6_junction_config>;
+ trip = <&gpu6_alert0>;
cooling-device = <&gpu THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
};
};

trips {
- thermal-engine-config {
- temperature = <125000>;
+ gpu6_alert0: trip-point0 {
+ temperature = <85000>;
hysteresis = <1000>;
type = "passive";
};

- thermal-hal-config {
- temperature = <125000>;
+ trip-point1 {
+ temperature = <90000>;
hysteresis = <1000>;
- type = "passive";
- };
-
- reset-mon-config {
- temperature = <115000>;
- hysteresis = <5000>;
- type = "passive";
+ type = "hot";
};

- gpu6_junction_config: junction-config {
- temperature = <95000>;
- hysteresis = <5000>;
- type = "passive";
+ trip-point2 {
+ temperature = <110000>;
+ hysteresis = <1000>;
+ type = "critical";
};
};
};
@@ -5640,34 +5598,28 @@ gpuss-7-thermal {

cooling-maps {
map0 {
- trip = <&gpu7_junction_config>;
+ trip = <&gpu7_alert0>;
cooling-device = <&gpu THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
};
};

trips {
- thermal-engine-config {
- temperature = <125000>;
+ gpu7_alert0: trip-point0 {
+ temperature = <85000>;
hysteresis = <1000>;
type = "passive";
};

- thermal-hal-config {
- temperature = <125000>;
+ trip-point1 {
+ temperature = <90000>;
hysteresis = <1000>;
- type = "passive";
- };
-
- reset-mon-config {
- temperature = <115000>;
- hysteresis = <5000>;
- type = "passive";
+ type = "hot";
};

- gpu7_junction_config: junction-config {
- temperature = <95000>;
- hysteresis = <5000>;
- type = "passive";
+ trip-point2 {
+ temperature = <110000>;
+ hysteresis = <1000>;
+ type = "critical";
};
};
};

--
2.40.1


2024-05-10 13:03:01

by Konrad Dybcio

[permalink] [raw]
Subject: [PATCH 12/12] arm64: dts: qcom: sm8650: Throttle the GPU when overheating

Add an 85C passive trip point to ensure the thermal framework takes
sufficient action to prevent reaching junction temperature and a
110C critical point to help avoid hw damage.

Also, register the GPU as a cooling device and hook it up to the
right thermal zones.

Signed-off-by: Konrad Dybcio <[email protected]>
---
arch/arm64/boot/dts/qcom/sm8650.dtsi | 169 ++++++++++++++++++++++++++++-------
1 file changed, 137 insertions(+), 32 deletions(-)

diff --git a/arch/arm64/boot/dts/qcom/sm8650.dtsi b/arch/arm64/boot/dts/qcom/sm8650.dtsi
index 39e789b21acc..1b20d0fcd3ef 100644
--- a/arch/arm64/boot/dts/qcom/sm8650.dtsi
+++ b/arch/arm64/boot/dts/qcom/sm8650.dtsi
@@ -2626,6 +2626,7 @@ gpu: gpu@3d00000 {
operating-points-v2 = <&gpu_opp_table>;

qcom,gmu = <&gmu>;
+ #cooling-cells = <2>;

status = "disabled";

@@ -6014,16 +6015,29 @@ gpuss0-thermal {

thermal-sensors = <&tsens2 1>;

+ cooling-maps {
+ map0 {
+ trip = <&gpu0_alert0>;
+ cooling-device = <&gpu THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+ };
+ };
+
trips {
- trip-point0 {
+ gpu0_alert0: trip-point0 {
+ temperature = <85000>;
+ hysteresis = <1000>;
+ type = "passive";
+ };
+
+ trip-point1 {
temperature = <90000>;
- hysteresis = <2000>;
+ hysteresis = <1000>;
type = "hot";
};

- gpuss0-critical {
+ trip-point2 {
temperature = <110000>;
- hysteresis = <0>;
+ hysteresis = <1000>;
type = "critical";
};
};
@@ -6034,16 +6048,29 @@ gpuss1-thermal {

thermal-sensors = <&tsens2 2>;

+ cooling-maps {
+ map0 {
+ trip = <&gpu1_alert0>;
+ cooling-device = <&gpu THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+ };
+ };
+
trips {
- trip-point0 {
+ gpu1_alert0: trip-point0 {
+ temperature = <85000>;
+ hysteresis = <1000>;
+ type = "passive";
+ };
+
+ trip-point1 {
temperature = <90000>;
- hysteresis = <2000>;
+ hysteresis = <1000>;
type = "hot";
};

- gpuss1-critical {
+ trip-point2 {
temperature = <110000>;
- hysteresis = <0>;
+ hysteresis = <1000>;
type = "critical";
};
};
@@ -6054,16 +6081,29 @@ gpuss2-thermal {

thermal-sensors = <&tsens2 3>;

+ cooling-maps {
+ map0 {
+ trip = <&gpu2_alert0>;
+ cooling-device = <&gpu THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+ };
+ };
+
trips {
- trip-point0 {
+ gpu2_alert0: trip-point0 {
+ temperature = <85000>;
+ hysteresis = <1000>;
+ type = "passive";
+ };
+
+ trip-point1 {
temperature = <90000>;
- hysteresis = <2000>;
+ hysteresis = <1000>;
type = "hot";
};

- gpuss2-critical {
+ trip-point2 {
temperature = <110000>;
- hysteresis = <0>;
+ hysteresis = <1000>;
type = "critical";
};
};
@@ -6074,16 +6114,29 @@ gpuss3-thermal {

thermal-sensors = <&tsens2 4>;

+ cooling-maps {
+ map0 {
+ trip = <&gpu3_alert0>;
+ cooling-device = <&gpu THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+ };
+ };
+
trips {
- trip-point0 {
+ gpu3_alert0: trip-point0 {
+ temperature = <85000>;
+ hysteresis = <1000>;
+ type = "passive";
+ };
+
+ trip-point1 {
temperature = <90000>;
- hysteresis = <2000>;
+ hysteresis = <1000>;
type = "hot";
};

- gpuss3-critical {
+ trip-point2 {
temperature = <110000>;
- hysteresis = <0>;
+ hysteresis = <1000>;
type = "critical";
};
};
@@ -6094,16 +6147,29 @@ gpuss4-thermal {

thermal-sensors = <&tsens2 5>;

+ cooling-maps {
+ map0 {
+ trip = <&gpu4_alert0>;
+ cooling-device = <&gpu THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+ };
+ };
+
trips {
- trip-point0 {
+ gpu4_alert0: trip-point0 {
+ temperature = <85000>;
+ hysteresis = <1000>;
+ type = "passive";
+ };
+
+ trip-point1 {
temperature = <90000>;
- hysteresis = <2000>;
+ hysteresis = <1000>;
type = "hot";
};

- gpuss4-critical {
+ trip-point2 {
temperature = <110000>;
- hysteresis = <0>;
+ hysteresis = <1000>;
type = "critical";
};
};
@@ -6114,16 +6180,29 @@ gpuss5-thermal {

thermal-sensors = <&tsens2 6>;

+ cooling-maps {
+ map0 {
+ trip = <&gpu5_alert0>;
+ cooling-device = <&gpu THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+ };
+ };
+
trips {
- trip-point0 {
+ gpu5_alert0: trip-point0 {
+ temperature = <85000>;
+ hysteresis = <1000>;
+ type = "passive";
+ };
+
+ trip-point1 {
temperature = <90000>;
- hysteresis = <2000>;
+ hysteresis = <1000>;
type = "hot";
};

- gpuss5-critical {
+ trip-point2 {
temperature = <110000>;
- hysteresis = <0>;
+ hysteresis = <1000>;
type = "critical";
};
};
@@ -6134,16 +6213,29 @@ gpuss6-thermal {

thermal-sensors = <&tsens2 7>;

+ cooling-maps {
+ map0 {
+ trip = <&gpu6_alert0>;
+ cooling-device = <&gpu THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+ };
+ };
+
trips {
- trip-point0 {
+ gpu6_alert0: trip-point0 {
+ temperature = <85000>;
+ hysteresis = <1000>;
+ type = "passive";
+ };
+
+ trip-point1 {
temperature = <90000>;
- hysteresis = <2000>;
+ hysteresis = <1000>;
type = "hot";
};

- gpuss6-critical {
+ trip-point2 {
temperature = <110000>;
- hysteresis = <0>;
+ hysteresis = <1000>;
type = "critical";
};
};
@@ -6154,16 +6246,29 @@ gpuss7-thermal {

thermal-sensors = <&tsens2 8>;

+ cooling-maps {
+ map0 {
+ trip = <&gpu7_alert0>;
+ cooling-device = <&gpu THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
+ };
+ };
+
trips {
- trip-point0 {
+ gpu7_alert0: trip-point0 {
+ temperature = <85000>;
+ hysteresis = <1000>;
+ type = "passive";
+ };
+
+ trip-point1 {
temperature = <90000>;
- hysteresis = <2000>;
+ hysteresis = <1000>;
type = "hot";
};

- gpuss7-critical {
+ trip-point2 {
temperature = <110000>;
- hysteresis = <0>;
+ hysteresis = <1000>;
type = "critical";
};
};

--
2.40.1


2024-06-08 16:11:29

by Bjorn Andersson

[permalink] [raw]
Subject: Re: [PATCH 00/12] Adreno cooling, take 2


On Fri, 10 May 2024 14:58:29 +0200, Konrad Dybcio wrote:
> For the thermal framework to cool devfreq-managed devices properly,
> it seems like the following conditions must be met:
>
> 1. the devfreq device has a cooling device associated with it
> 2. there exists some thermal zone provider
> 3. the cooling device is referenced in a cooling map
> 4. the cooling map is associated with a thermal trip point
> 5. the thermal trip point is of the "passive" kind
> 6. the "passive" trip point is being updated (via polling or otherwise)
> 7. the trip point is being hit (i.e. the thing gets hot enough)
>
> [...]

Applied, thanks!

[01/12] arm64: dts: qcom: sc8180x: Throttle the GPU when overheating
commit: 7c05517e5e68205c9d5085c029df2ca4e6ad9237
[02/12] arm64: dts: qcom: sc8280xp: Throttle the GPU when overheating
commit: f7fd6d04c1046107a87a0fc883ed044cf8b877a1
[03/12] arm64: dts: qcom: sdm630: Throttle the GPU when overheating
commit: 545fef1e5e43fb73083d16507a13820179726ebe
[04/12] arm64: dts: qcom: sdm845: Throttle the GPU when overheating
commit: b79dd56ed5fcc863f167eb53771b09e8b3d8e317
[05/12] arm64: dts: qcom: sm6115: Update GPU thermal zone settings
commit: c518b5f6def159222d73f3241fb1802bc846a477
[06/12] arm64: dts: qcom: sm6350: Update GPU thermal zone settings
commit: 1a558bbffc2ee9b99226b146fd7928e41db79d41
[07/12] arm64: dts: qcom: sm8150: Throttle the GPU when overheating
commit: c61300433b7b89d5782fddf95bd96a6e819c0377
[08/12] arm64: dts: qcom: sm8250: Throttle the GPU when overheating
commit: c862b78b7203b72dd6806a77c0feff60fe96dee5
[09/12] arm64: dts: qcom: sm8350: Throttle the GPU when overheating
commit: 10a5555220ad20b2f8043060d76b0e7f83ae91fa
[10/12] arm64: dts: qcom: sm8450: Throttle the GPU when overheating
commit: 4be0dd44c39b083148ae9d4c4a7ef6d64e6c0062
[11/12] arm64: dts: qcom: sm8550: Throttle the GPU when overheating
commit: ed979c039ad1c9b02dd7e9fa6a0dd69209bac6ed
[12/12] arm64: dts: qcom: sm8650: Throttle the GPU when overheating
commit: 497624ed550604b3f713f53bc506e49ce5046e5f

Best regards,
--
Bjorn Andersson <[email protected]>