2023-07-12 21:13:10

by Frank Li

[permalink] [raw]
Subject: [PATCH 1/1] thermal/drivers/imx_sc_thermal: return -EAGAIN when SCFW turn off resource

Avoid endless print following message when SCFW turns off resource.
[ 1818.342337] thermal thermal_zone0: failed to read out thermal zone (-1)

Signed-off-by: Frank Li <[email protected]>
---
drivers/thermal/imx_sc_thermal.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/drivers/thermal/imx_sc_thermal.c b/drivers/thermal/imx_sc_thermal.c
index 8d6b4ef23746..0533d58f199f 100644
--- a/drivers/thermal/imx_sc_thermal.c
+++ b/drivers/thermal/imx_sc_thermal.c
@@ -58,7 +58,9 @@ static int imx_sc_thermal_get_temp(struct thermal_zone_device *tz, int *temp)
hdr->size = 2;

ret = imx_scu_call_rpc(thermal_ipc_handle, &msg, true);
- if (ret)
+ if (ret == -EPERM) /* NO POWER */
+ return -EAGAIN;
+ else if (ret)
return ret;

*temp = msg.data.resp.celsius * 1000 + msg.data.resp.tenths * 100;
--
2.34.1



2023-07-13 13:00:56

by Daniel Lezcano

[permalink] [raw]
Subject: Re: [PATCH 1/1] thermal/drivers/imx_sc_thermal: return -EAGAIN when SCFW turn off resource

On 12/07/2023 23:05, Frank Li wrote:
> Avoid endless print following message when SCFW turns off resource.
> [ 1818.342337] thermal thermal_zone0: failed to read out thermal zone (-1)
>
> Signed-off-by: Frank Li <[email protected]>
> ---
> drivers/thermal/imx_sc_thermal.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/thermal/imx_sc_thermal.c b/drivers/thermal/imx_sc_thermal.c
> index 8d6b4ef23746..0533d58f199f 100644
> --- a/drivers/thermal/imx_sc_thermal.c
> +++ b/drivers/thermal/imx_sc_thermal.c
> @@ -58,7 +58,9 @@ static int imx_sc_thermal_get_temp(struct thermal_zone_device *tz, int *temp)
> hdr->size = 2;
>
> ret = imx_scu_call_rpc(thermal_ipc_handle, &msg, true);
> - if (ret)
> + if (ret == -EPERM) /* NO POWER */
> + return -EAGAIN;

Isn't there a chain call somewhere when the resource is turned off, so
the thermal zone can be disabled?

> + else if (ret)
> return ret;
>
> *temp = msg.data.resp.celsius * 1000 + msg.data.resp.tenths * 100;

--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog


2023-07-14 17:41:50

by Frank Li

[permalink] [raw]
Subject: Re: [PATCH 1/1] thermal/drivers/imx_sc_thermal: return -EAGAIN when SCFW turn off resource

On Thu, Jul 13, 2023 at 02:49:54PM +0200, Daniel Lezcano wrote:
> On 12/07/2023 23:05, Frank Li wrote:
> > Avoid endless print following message when SCFW turns off resource.
> > [ 1818.342337] thermal thermal_zone0: failed to read out thermal zone (-1)
> >
> > Signed-off-by: Frank Li <[email protected]>
> > ---
> > drivers/thermal/imx_sc_thermal.c | 4 +++-
> > 1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/thermal/imx_sc_thermal.c b/drivers/thermal/imx_sc_thermal.c
> > index 8d6b4ef23746..0533d58f199f 100644
> > --- a/drivers/thermal/imx_sc_thermal.c
> > +++ b/drivers/thermal/imx_sc_thermal.c
> > @@ -58,7 +58,9 @@ static int imx_sc_thermal_get_temp(struct thermal_zone_device *tz, int *temp)
> > hdr->size = 2;
> > ret = imx_scu_call_rpc(thermal_ipc_handle, &msg, true);
> > - if (ret)
> > + if (ret == -EPERM) /* NO POWER */
> > + return -EAGAIN;
>
> Isn't there a chain call somewhere when the resource is turned off, so the
> thermal zone can be disabled?

A possible place in drivers/firmware/imx/scu-pd.c. but I am not sure how to
get thermal devices. I just found a API thermal_zone_get_zone_by_name(). I
am not sure if it is good to depend on "name", which add coupling between
two drivers and if there are external thermal devices(such as) has the
same name, it will wrong turn off.

If add power domain notification in thermal driver, I am not how to get
other devices's pd in thermal driver.

Any example I can refer?

Or this is simple enough solution.

Frank

>
> > + else if (ret)
> > return ret;
> > *temp = msg.data.resp.celsius * 1000 + msg.data.resp.tenths * 100;
>
> --
> <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
>
> Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
> <http://twitter.com/#!/linaroorg> Twitter |
> <http://www.linaro.org/linaro-blog/> Blog
>

2023-08-19 17:13:28

by Daniel Lezcano

[permalink] [raw]
Subject: Re: [PATCH 1/1] thermal/drivers/imx_sc_thermal: return -EAGAIN when SCFW turn off resource


Hi Frank,

sorry for the delay

On 14/07/2023 19:19, Frank Li wrote:
> On Thu, Jul 13, 2023 at 02:49:54PM +0200, Daniel Lezcano wrote:
>> On 12/07/2023 23:05, Frank Li wrote:
>>> Avoid endless print following message when SCFW turns off resource.
>>> [ 1818.342337] thermal thermal_zone0: failed to read out thermal zone (-1)
>>>
>>> Signed-off-by: Frank Li <[email protected]>
>>> ---
>>> drivers/thermal/imx_sc_thermal.c | 4 +++-
>>> 1 file changed, 3 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/thermal/imx_sc_thermal.c b/drivers/thermal/imx_sc_thermal.c
>>> index 8d6b4ef23746..0533d58f199f 100644
>>> --- a/drivers/thermal/imx_sc_thermal.c
>>> +++ b/drivers/thermal/imx_sc_thermal.c
>>> @@ -58,7 +58,9 @@ static int imx_sc_thermal_get_temp(struct thermal_zone_device *tz, int *temp)
>>> hdr->size = 2;
>>> ret = imx_scu_call_rpc(thermal_ipc_handle, &msg, true);
>>> - if (ret)
>>> + if (ret == -EPERM) /* NO POWER */
>>> + return -EAGAIN;
>>
>> Isn't there a chain call somewhere when the resource is turned off, so the
>> thermal zone can be disabled?
>
> A possible place in drivers/firmware/imx/scu-pd.c. but I am not sure how to
> get thermal devices. I just found a API thermal_zone_get_zone_by_name(). I
> am not sure if it is good to depend on "name", which add coupling between
> two drivers and if there are external thermal devices(such as) has the
> same name, it will wrong turn off.

Correct

> If add power domain notification in thermal driver, I am not how to get
> other devices's pd in thermal driver.
>
> Any example I can refer?
>
> Or this is simple enough solution.

The solution works for removing the error message but it does not solve
the root cause of the issue. The thermal zone keeps monitoring while the
sensor is down.

So the question is why the sensor is shut down if it is in use?



>>
>>> + else if (ret)
>>> return ret;
>>> *temp = msg.data.resp.celsius * 1000 + msg.data.resp.tenths * 100;
>>
>> --
>> <http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs
>>
>> Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
>> <http://twitter.com/#!/linaroorg> Twitter |
>> <http://www.linaro.org/linaro-blog/> Blog
>>

--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog