2024-02-11 10:53:54

by Andrey Skvortsov

Subject: [PATCH] arm64: dts: allwinner: a64: Add thermal trip points for GPU

From: Alexey Klimov <[email protected]>

Without trip points for GPU, the following errors are printed in the
dmesg log and the sun8i-thermal driver fails to load:

thermal_sys: Failed to find 'trips' node
thermal_sys: Failed to find trip points for thermal-sensor id=1
sun8i-thermal: probe of 1c25000.thermal-sensor failed with error -22

When thermal zones are defined, trip points definitions are mandatory.
Trip values for the GPU are assumed to be the same values as the CPU
ones. The available specs do not provide any hints about thermal regimes
for the GPU and it seems GPU is implemented on the same die as the CPU.

'make dtbs_check' complains about problem in dts for 18 A64-based boards
supported by the kernel:

sun50i-a64-pine64.dtb: thermal-zones: gpu0-thermal: 'trips' is a required property
from schema $id: http://devicetree.org/schemas/thermal/thermal-zones.yaml#
sun50i-a64-pine64.dtb: thermal-zones: gpu1-thermal: 'trips' is a required property
from schema $id: http://devicetree.org/schemas/thermal/thermal-zones.yaml#

Tested on Pine a64+ and PinePhone 1.2.

Cc: Samuel Holland <[email protected]>
Cc: Jernej Skrabec <[email protected]>
Cc: Chen-Yu Tsai <[email protected]>
Cc: Daniel Lezcano <[email protected]>
Cc: [email protected]
Signed-off-by: Alexey Klimov <[email protected]>
Tested-by: Andrey Skvortsov <[email protected]>

---
arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi | 46 +++++++++++++++++++
1 file changed, 46 insertions(+)

diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi b/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
index 57ac18738c99..c25da7229e42 100644
--- a/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
+++ b/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
@@ -244,6 +244,29 @@ gpu0_thermal: gpu0-thermal {
 			polling-delay-passive = <0>;
 			polling-delay = <0>;
 			thermal-sensors = <&ths 1>;
+
+			trips {
+				gpu0_alert0: gpu0_alert0 {
+					/* milliCelsius */
+					temperature = <75000>;
+					hysteresis = <2000>;
+					type = "passive";
+				};
+
+				gpu0_alert1: gpu0_alert1 {
+					/* milliCelsius */
+					temperature = <90000>;
+					hysteresis = <2000>;
+					type = "hot";
+				};
+
+				gpu0_crit: gpu0_crit {
+					/* milliCelsius */
+					temperature = <110000>;
+					hysteresis = <2000>;
+					type = "critical";
+				};
+			};
 		};

 		gpu1_thermal: gpu1-thermal {
@@ -251,6 +274,29 @@ gpu1_thermal: gpu1-thermal {
 			polling-delay-passive = <0>;
 			polling-delay = <0>;
 			thermal-sensors = <&ths 2>;
+
+			trips {
+				gpu1_alert0: gpu1_alert0 {
+					/* milliCelsius */
+					temperature = <75000>;
+					hysteresis = <2000>;
+					type = "passive";
+				};
+
+				gpu1_alert1: gpu1_alert1 {
+					/* milliCelsius */
+					temperature = <90000>;
+					hysteresis = <2000>;
+					type = "hot";
+				};
+
+				gpu1_crit: gpu1_crit {
+					/* milliCelsius */
+					temperature = <110000>;
+					hysteresis = <2000>;
+					type = "critical";
+				};
+			};
 		};
 	};

--
2.43.0



2024-02-11 12:54:08

by Andre Przywara

Subject: Re: [PATCH] arm64: dts: allwinner: a64: Add thermal trip points for GPU

On Sun, 11 Feb 2024 13:53:26 +0300
Andrey Skvortsov <[email protected]> wrote:

Hi Andrey,

> From: Alexey Klimov <[email protected]>
>
> Without trip points for GPU, the following errors are printed in the
> dmesg log and the sun8i-thermal driver fails to load:

So how does this post differ from Alexey's one from a few weeks back:
https://lore.kernel.org/linux-arm-kernel/[email protected]/
It seems like the same patch?

And Jernej and I had some comments (no mention of "Linux" in the commit
message, add cooling maps, source of the trip temperature values), can you
please try to address them?
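
For illustration only, a cooling map for one of the GPU zones might look
roughly like the sketch below. It is not part of the posted patch. It
assumes that the Mali GPU node carries the label `mali` and that it can be
given `#cooling-cells = <2>` so it may act as a cooling device; neither
assumption is confirmed anywhere in this thread. The `gpu0_thermal` and
`gpu0_alert0` labels are the ones used in the patch.

```dts
/* Hypothetical sketch, not taken from the posted patch. */
#include <dt-bindings/thermal/thermal.h>	/* provides THERMAL_NO_LIMIT */

&mali {
	/* Assumption: the GPU node is labelled "mali" and supports throttling. */
	#cooling-cells = <2>;
};

&gpu0_thermal {
	cooling-maps {
		map0 {
			/* Bind the passive trip point to the GPU as cooling device. */
			trip = <&gpu0_alert0>;
			cooling-device = <&mali THERMAL_NO_LIMIT THERMAL_NO_LIMIT>;
		};
	};
};
```

The same structure would be repeated for `gpu1_thermal` with `gpu1_alert0`.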


> thermal_sys: Failed to find 'trips' node
> thermal_sys: Failed to find trip points for thermal-sensor id=1
> sun8i-thermal: probe of 1c25000.thermal-sensor failed with error -22

I think it's pretty obvious that the trip points are missing when they
shouldn't be, so this does not need much explanation or rationale in
the commit message; you can cut this short.

> When thermal zones are defined, trip points definitions are mandatory.
> Trip values for the GPU are assumed to be the same values as the CPU
> ones. The available specs do not provide any hints about thermal regimes
> for the GPU and it seems GPU is implemented on the same die as the CPU.
>
> 'make dtbs_check' complains about problem in dts for 18 A64-based boards
> supported by the kernel:
>
> sun50i-a64-pine64.dtb: thermal-zones: gpu0-thermal: 'trips' is a required property
> from schema $id: http://devicetree.org/schemas/thermal/thermal-zones.yaml#
> sun50i-a64-pine64.dtb: thermal-zones: gpu1-thermal: 'trips' is a required property
> from schema $id: http://devicetree.org/schemas/thermal/thermal-zones.yaml#
>
> Tested on Pine a64+ and PinePhone 1.2.
>
> Cc: Samuel Holland <[email protected]>
> Cc: Jernej Skrabec <[email protected]>
> Cc: Chen-Yu Tsai <[email protected]>
> Cc: Daniel Lezcano <[email protected]>
> Cc: [email protected]
> Signed-off-by: Alexey Klimov <[email protected]>
> Tested-by: Andrey Skvortsov <[email protected]>

You would need to add your own Signed-off-by: here, since you are the one
sending this, even if on Alexey's behalf.

Cheers,
Andre

>
> ---
> arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi | 46 +++++++++++++++++++
> 1 file changed, 46 insertions(+)
>
> diff --git a/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi b/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
> index 57ac18738c99..c25da7229e42 100644
> --- a/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
> +++ b/arch/arm64/boot/dts/allwinner/sun50i-a64.dtsi
> @@ -244,6 +244,29 @@ gpu0_thermal: gpu0-thermal {
>  			polling-delay-passive = <0>;
>  			polling-delay = <0>;
>  			thermal-sensors = <&ths 1>;
> +
> +			trips {
> +				gpu0_alert0: gpu0_alert0 {
> +					/* milliCelsius */
> +					temperature = <75000>;
> +					hysteresis = <2000>;
> +					type = "passive";
> +				};
> +
> +				gpu0_alert1: gpu0_alert1 {
> +					/* milliCelsius */
> +					temperature = <90000>;
> +					hysteresis = <2000>;
> +					type = "hot";
> +				};
> +
> +				gpu0_crit: gpu0_crit {
> +					/* milliCelsius */
> +					temperature = <110000>;
> +					hysteresis = <2000>;
> +					type = "critical";
> +				};
> +			};
>  		};
>
>  		gpu1_thermal: gpu1-thermal {
> @@ -251,6 +274,29 @@ gpu1_thermal: gpu1-thermal {
>  			polling-delay-passive = <0>;
>  			polling-delay = <0>;
>  			thermal-sensors = <&ths 2>;
> +
> +			trips {
> +				gpu1_alert0: gpu1_alert0 {
> +					/* milliCelsius */
> +					temperature = <75000>;
> +					hysteresis = <2000>;
> +					type = "passive";
> +				};
> +
> +				gpu1_alert1: gpu1_alert1 {
> +					/* milliCelsius */
> +					temperature = <90000>;
> +					hysteresis = <2000>;
> +					type = "hot";
> +				};
> +
> +				gpu1_crit: gpu1_crit {
> +					/* milliCelsius */
> +					temperature = <110000>;
> +					hysteresis = <2000>;
> +					type = "critical";
> +				};
> +			};
>  		};
>  	};
>


2024-02-11 14:24:34

by Andrey Skvortsov

Subject: Re: [PATCH] arm64: dts: allwinner: a64: Add thermal trip points for GPU

Hi Andre,

On 24-02-11 12:52, Andre Przywara wrote:
> On Sun, 11 Feb 2024 13:53:26 +0300
> Andrey Skvortsov <[email protected]> wrote:
>
> Hi Andrey,
>
> > From: Alexey Klimov <[email protected]>
> >
> > Without trip points for GPU, the following errors are printed in the
> > dmesg log and the sun8i-thermal driver fails to load:
>
> So how does this post differ from Alexey's one from a few weeks back:
> https://lore.kernel.org/linux-arm-kernel/[email protected]/
> It seems like the same patch?

Yes, it's the same patch. I've only added information about the
dtbs_check errors on already supported boards to the commit message.
I found this patch from June 2023 without any feedback [1]. Since I've
been working on dts changes for PinePhone A64, I decided to resend
it. Sorry, I wasn't aware that Alexey had resent it in the meantime. It's
better to continue the discussion in Alexey's original patch thread.

> And Jernej and I had some comments (no mention of "Linux" in the commit
> message, add cooling maps, source of the trip temperature values), can you
> please try to address them?
>
>
> > thermal_sys: Failed to find 'trips' node
> > thermal_sys: Failed to find trip points for thermal-sensor id=1
> > sun8i-thermal: probe of 1c25000.thermal-sensor failed with error -22
>
> I think it's pretty obvious that the trip points are missing when they
> shouldn't be, so this does not need much explanation or rationale in
> the commit message; you can cut this short.
>
> > When thermal zones are defined, trip points definitions are mandatory.
> > Trip values for the GPU are assumed to be the same values as the CPU
> > ones. The available specs do not provide any hints about thermal regimes
> > for the GPU and it seems GPU is implemented on the same die as the CPU.
> >
> > 'make dtbs_check' complains about problem in dts for 18 A64-based boards
> > supported by the kernel:
> >
> > sun50i-a64-pine64.dtb: thermal-zones: gpu0-thermal: 'trips' is a required property
> > from schema $id: http://devicetree.org/schemas/thermal/thermal-zones.yaml#
> > sun50i-a64-pine64.dtb: thermal-zones: gpu1-thermal: 'trips' is a required property
> > from schema $id: http://devicetree.org/schemas/thermal/thermal-zones.yaml#
> >
> > Tested on Pine a64+ and PinePhone 1.2.
> >
> > Cc: Samuel Holland <[email protected]>
> > Cc: Jernej Skrabec <[email protected]>
> > Cc: Chen-Yu Tsai <[email protected]>
> > Cc: Daniel Lezcano <[email protected]>
> > Cc: [email protected]
> > Signed-off-by: Alexey Klimov <[email protected]>
> > Tested-by: Andrey Skvortsov <[email protected]>
>
> You would need to add your own Signed-off-by: here, since you are the one
> sending this, even if on Alexey's behalf.
>
> Cheers,
> Andre
>

1. https://lkml.org/lkml/2023/6/4/416

--
Best regards,
Andrey Skvortsov

2024-02-12 00:09:03

by Andre Przywara

Subject: Re: [PATCH] arm64: dts: allwinner: a64: Add thermal trip points for GPU

On Sun, 11 Feb 2024 17:24:19 +0300
Andrey Skvortsov <[email protected]> wrote:

Hi Andrey,

> Hi Andre,
>
> On 24-02-11 12:52, Andre Przywara wrote:
> > On Sun, 11 Feb 2024 13:53:26 +0300
> > Andrey Skvortsov <[email protected]> wrote:
> >
> > Hi Andrey,
> >
> > > From: Alexey Klimov <[email protected]>
> > >
> > > Without trip points for GPU, the following errors are printed in the
> > > dmesg log and the sun8i-thermal driver fails to load:
> >
> > So how does this post differ from Alexey's one from a few weeks back:
> > https://lore.kernel.org/linux-arm-kernel/[email protected]/
> > It seems like the same patch?
>
> Yes, it's the same patch. I've only added information about the
> dtbs_check errors on already supported boards to the commit message.
> I found this patch from June 2023 without any feedback [1]. Since I've
> been working on dts changes for PinePhone A64, I decided to resend
> it. Sorry, I wasn't aware that Alexey had resent it in the meantime.

No worries, that's fine, thanks for the explanation.

> It's better to continue the discussion in Alexey's original patch thread.

Will Alexey have time to reply and resend? If not, or if you are not
sure (it's been a while), it's fine to take over this series and send a
v2 yourself.

If you can just explicitly state that the GPU trip point values are
copied from the CPU ones (because they share a die), I am happy as far
as my comment is concerned. This is arguably somewhat mentioned in the
commit message already, but I missed it on the first read, so would like
to see this more prominently stated.

As mentioned before, and also stated by Jernej, consider this patch
purely device-specific, not related to any Linux behaviour, and give a
rationale based only on the binding, which requires trip points.
Something as simple as "The DT binding requires trip points, and
dt-validate complains about them missing for any A64 boards." should
suffice.

Thanks,
Andre

> > And Jernej and I had some comments (no mention of "Linux" in the commit
> > message, add cooling maps, source of the trip temperature values), can you
> > please try to address them?
> >
> >
> > > thermal_sys: Failed to find 'trips' node
> > > thermal_sys: Failed to find trip points for thermal-sensor id=1
> > > sun8i-thermal: probe of 1c25000.thermal-sensor failed with error -22
> >
> > I think it's pretty obvious that the trip points are missing when they
> > shouldn't be, so this does not need much explanation or rationale in
> > the commit message; you can cut this short.
> >
> > > When thermal zones are defined, trip points definitions are mandatory.
> > > Trip values for the GPU are assumed to be the same values as the CPU
> > > ones. The available specs do not provide any hints about thermal regimes
> > > for the GPU and it seems GPU is implemented on the same die as the CPU.
> > >
> > > 'make dtbs_check' complains about problem in dts for 18 A64-based boards
> > > supported by the kernel:
> > >
> > > sun50i-a64-pine64.dtb: thermal-zones: gpu0-thermal: 'trips' is a required property
> > > from schema $id: http://devicetree.org/schemas/thermal/thermal-zones.yaml#
> > > sun50i-a64-pine64.dtb: thermal-zones: gpu1-thermal: 'trips' is a required property
> > > from schema $id: http://devicetree.org/schemas/thermal/thermal-zones.yaml#
> > >
> > > Tested on Pine a64+ and PinePhone 1.2.
> > >
> > > Cc: Samuel Holland <[email protected]>
> > > Cc: Jernej Skrabec <[email protected]>
> > > Cc: Chen-Yu Tsai <[email protected]>
> > > Cc: Daniel Lezcano <[email protected]>
> > > Cc: [email protected]
> > > Signed-off-by: Alexey Klimov <[email protected]>
> > > Tested-by: Andrey Skvortsov <[email protected]>
> >
> > You would need to add your own Signed-off-by: here, since you are the one
> > sending this, even if on Alexey's behalf.
> >
> > Cheers,
> > Andre
> >
>
> 1. https://lkml.org/lkml/2023/6/4/416
>