2019-04-24 05:14:03

by Jiada Wang

[permalink] [raw]
Subject: [PATCH v4 0/2] thermal: rcar_gen3_thermal: fix IRQ issues

There are issues with interrupt handling in rcar_gen3_thermal driver.

Currently IRQ is remain enabled after .remove, later if device is probed,
IRQ is requested before .thermal_init, this may cause IRQ function be
triggered but not able to clear IRQ status, thus cause system to hang.

Since the irq line isn't shared between different devices,
so the proper interrupt type flag should be IRQF_ONESHOT.

This patch-set fix these interrupt handling retated issues.

---
v4: remove 'spinlock_t lock'
add Fixes tag in ("thermal: rcar_gen3_thermal: fix interrupt type")
fix typos in ("thermal: rcar_gen3_thermal: disable interrupt in .remove")

v3: fix to use correct code base
remove unused "flag" variable in rcar_gen3_thermal_irq

v2: use irq type IRQF_ONESHOT instead of IRQF_SHARED
disable interrupt in .remove

v1: initial version

Jiada Wang (2):
thermal: rcar_gen3_thermal: fix interrupt type
thermal: rcar_gen3_thermal: disable interrupt in .remove

drivers/thermal/rcar_gen3_thermal.c | 41 +++++++----------------------
1 file changed, 9 insertions(+), 32 deletions(-)

--
2.19.2


2019-04-24 05:14:06

by Jiada Wang

[permalink] [raw]
Subject: [PATCH v4 1/2] thermal: rcar_gen3_thermal: fix interrupt type

Currently IRQF_SHARED type interrupt line is allocated, but it
is not appropriate, as the interrupt line isn't shared between
different devices, instead IRQF_ONESHOT is the proper type.

By changing interrupt type to IRQF_ONESHOT, now irq handler is
no longer needed, as clear of interrupt status can be done in
threaded interrupt context.

Because IRQF_ONESHOT type interrupt line is kept disabled until
the threaded handler has been run, so there is no need to protect
read/write of REG_GEN3_IRQSTR with lock.

Fixes: 7d4b269776ec6 ("enable hardware interrupts for trip points")
Signed-off-by: Jiada Wang <[email protected]>
---
drivers/thermal/rcar_gen3_thermal.c | 38 +++++------------------------
1 file changed, 6 insertions(+), 32 deletions(-)

diff --git a/drivers/thermal/rcar_gen3_thermal.c b/drivers/thermal/rcar_gen3_thermal.c
index 88fa41cf16e8..065e16f53285 100644
--- a/drivers/thermal/rcar_gen3_thermal.c
+++ b/drivers/thermal/rcar_gen3_thermal.c
@@ -14,7 +14,6 @@
#include <linux/of_device.h>
#include <linux/platform_device.h>
#include <linux/pm_runtime.h>
-#include <linux/spinlock.h>
#include <linux/sys_soc.h>
#include <linux/thermal.h>

@@ -82,7 +81,6 @@ struct rcar_gen3_thermal_tsc {
struct rcar_gen3_thermal_priv {
struct rcar_gen3_thermal_tsc *tscs[TSC_MAX_NUM];
unsigned int num_tscs;
- spinlock_t lock; /* Protect interrupts on and off */
void (*thermal_init)(struct rcar_gen3_thermal_tsc *tsc);
};

@@ -232,38 +230,16 @@ static irqreturn_t rcar_gen3_thermal_irq(int irq, void *data)
{
struct rcar_gen3_thermal_priv *priv = data;
u32 status;
- int i, ret = IRQ_HANDLED;
+ int i;

- spin_lock(&priv->lock);
for (i = 0; i < priv->num_tscs; i++) {
status = rcar_gen3_thermal_read(priv->tscs[i], REG_GEN3_IRQSTR);
rcar_gen3_thermal_write(priv->tscs[i], REG_GEN3_IRQSTR, 0);
if (status)
- ret = IRQ_WAKE_THREAD;
+ thermal_zone_device_update(priv->tscs[i]->zone,
+ THERMAL_EVENT_UNSPECIFIED);
}

- if (ret == IRQ_WAKE_THREAD)
- rcar_thermal_irq_set(priv, false);
-
- spin_unlock(&priv->lock);
-
- return ret;
-}
-
-static irqreturn_t rcar_gen3_thermal_irq_thread(int irq, void *data)
-{
- struct rcar_gen3_thermal_priv *priv = data;
- unsigned long flags;
- int i;
-
- for (i = 0; i < priv->num_tscs; i++)
- thermal_zone_device_update(priv->tscs[i]->zone,
- THERMAL_EVENT_UNSPECIFIED);
-
- spin_lock_irqsave(&priv->lock, flags);
- rcar_thermal_irq_set(priv, true);
- spin_unlock_irqrestore(&priv->lock, flags);
-
return IRQ_HANDLED;
}

@@ -371,8 +347,6 @@ static int rcar_gen3_thermal_probe(struct platform_device *pdev)
if (soc_device_match(r8a7795es1))
priv->thermal_init = rcar_gen3_thermal_init_r8a7795es1;

- spin_lock_init(&priv->lock);
-
platform_set_drvdata(pdev, priv);

/*
@@ -390,9 +364,9 @@ static int rcar_gen3_thermal_probe(struct platform_device *pdev)
if (!irqname)
return -ENOMEM;

- ret = devm_request_threaded_irq(dev, irq, rcar_gen3_thermal_irq,
- rcar_gen3_thermal_irq_thread,
- IRQF_SHARED, irqname, priv);
+ ret = devm_request_threaded_irq(dev, irq, NULL,
+ rcar_gen3_thermal_irq,
+ IRQF_ONESHOT, irqname, priv);
if (ret)
return ret;
}
--
2.19.2

2019-04-24 05:14:10

by Jiada Wang

[permalink] [raw]
Subject: [PATCH v4 2/2] thermal: rcar_gen3_thermal: disable interrupt in .remove

Currently IRQ remains enabled after .remove, later if device is probed,
IRQ is requested before .thermal_init, this may cause IRQ function be
called before device is initialized.

this patch disables interrupt in .remove, to ensure irq function
only be called after device is fully initialized.

Signed-off-by: Jiada Wang <[email protected]>
---
drivers/thermal/rcar_gen3_thermal.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/drivers/thermal/rcar_gen3_thermal.c b/drivers/thermal/rcar_gen3_thermal.c
index 065e16f53285..280230951dfe 100644
--- a/drivers/thermal/rcar_gen3_thermal.c
+++ b/drivers/thermal/rcar_gen3_thermal.c
@@ -307,6 +307,9 @@ MODULE_DEVICE_TABLE(of, rcar_gen3_thermal_dt_ids);
static int rcar_gen3_thermal_remove(struct platform_device *pdev)
{
struct device *dev = &pdev->dev;
+ struct rcar_gen3_thermal_priv *priv = dev_get_drvdata(dev);
+
+ rcar_thermal_irq_set(priv, false);

pm_runtime_put(dev);
pm_runtime_disable(dev);
--
2.19.2

2019-04-24 06:52:55

by Niklas Söderlund

[permalink] [raw]
Subject: Re: [PATCH v4 0/2] thermal: rcar_gen3_thermal: fix IRQ issues

Hi Jiada,

I think this series looks good. Unfortunately I'm out of office for the
next week and are thus unable to test it. Please don't let this block
this series but if it's still on the ML when I'm back I will do a proper
review and test of it.

On 2019-04-24 14:11:43 +0900, Jiada Wang wrote:
> There are issues with interrupt handling in rcar_gen3_thermal driver.
>
> Currently IRQ is remain enabled after .remove, later if device is probed,
> IRQ is requested before .thermal_init, this may cause IRQ function be
> triggered but not able to clear IRQ status, thus cause system to hang.
>
> Since the irq line isn't shared between different devices,
> so the proper interrupt type flag should be IRQF_ONESHOT.
>
> This patch-set fix these interrupt handling retated issues.
>
> ---
> v4: remove 'spinlock_t lock'
> add Fixes tag in ("thermal: rcar_gen3_thermal: fix interrupt type")
> fix typos in ("thermal: rcar_gen3_thermal: disable interrupt in .remove")
>
> v3: fix to use correct code base
> remove unused "flag" variable in rcar_gen3_thermal_irq
>
> v2: use irq type IRQF_ONESHOT instead of IRQF_SHARED
> disable interrupt in .remove
>
> v1: initial version
>
> Jiada Wang (2):
> thermal: rcar_gen3_thermal: fix interrupt type
> thermal: rcar_gen3_thermal: disable interrupt in .remove
>
> drivers/thermal/rcar_gen3_thermal.c | 41 +++++++----------------------
> 1 file changed, 9 insertions(+), 32 deletions(-)
>
> --
> 2.19.2
>

--
Regards,
Niklas S?derlund

2019-04-24 09:24:43

by Simon Horman

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] thermal: rcar_gen3_thermal: fix interrupt type

On Wed, Apr 24, 2019 at 02:11:44PM +0900, Jiada Wang wrote:
> Currently IRQF_SHARED type interrupt line is allocated, but it
> is not appropriate, as the interrupt line isn't shared between
> different devices, instead IRQF_ONESHOT is the proper type.
>
> By changing interrupt type to IRQF_ONESHOT, now irq handler is
> no longer needed, as clear of interrupt status can be done in
> threaded interrupt context.
>
> Because IRQF_ONESHOT type interrupt line is kept disabled until
> the threaded handler has been run, so there is no need to protect
> read/write of REG_GEN3_IRQSTR with lock.
>
> Fixes: 7d4b269776ec6 ("enable hardware interrupts for trip points")
> Signed-off-by: Jiada Wang <[email protected]>


Reviewed-by: Simon Horman <[email protected]>
Tested-by: Simon Horman <[email protected]>

2019-04-24 09:26:00

by Simon Horman

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] thermal: rcar_gen3_thermal: disable interrupt in .remove

On Wed, Apr 24, 2019 at 02:11:45PM +0900, Jiada Wang wrote:
> Currently IRQ remains enabled after .remove, later if device is probed,
> IRQ is requested before .thermal_init, this may cause IRQ function be
> called before device is initialized.
>
> this patch disables interrupt in .remove, to ensure irq function
> only be called after device is fully initialized.
>
> Signed-off-by: Jiada Wang <[email protected]>

Reviewed-by: Simon Horman <[email protected]>

2019-04-24 09:39:53

by Daniel Lezcano

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] thermal: rcar_gen3_thermal: disable interrupt in .remove

On 24/04/2019 07:11, Jiada Wang wrote:
> Currently IRQ remains enabled after .remove, later if device is probed,
> IRQ is requested before .thermal_init, this may cause IRQ function be
> called before device is initialized.
>
> this patch disables interrupt in .remove, to ensure irq function
> only be called after device is fully initialized.
>
> Signed-off-by: Jiada Wang <[email protected]>

Reviewed-by: Daniel Lezcano <[email protected]>

> ---
> drivers/thermal/rcar_gen3_thermal.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/drivers/thermal/rcar_gen3_thermal.c b/drivers/thermal/rcar_gen3_thermal.c
> index 065e16f53285..280230951dfe 100644
> --- a/drivers/thermal/rcar_gen3_thermal.c
> +++ b/drivers/thermal/rcar_gen3_thermal.c
> @@ -307,6 +307,9 @@ MODULE_DEVICE_TABLE(of, rcar_gen3_thermal_dt_ids);
> static int rcar_gen3_thermal_remove(struct platform_device *pdev)
> {
> struct device *dev = &pdev->dev;
> + struct rcar_gen3_thermal_priv *priv = dev_get_drvdata(dev);
> +
> + rcar_thermal_irq_set(priv, false);
>
> pm_runtime_put(dev);
> pm_runtime_disable(dev);
>


--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

2019-04-24 12:33:27

by Eugeniu Rosca

[permalink] [raw]
Subject: Re: [PATCH v4 2/2] thermal: rcar_gen3_thermal: disable interrupt in .remove

On Wed, Apr 24, 2019 at 02:11:45PM +0900, Jiada Wang wrote:
> Currently IRQ remains enabled after .remove, later if device is probed,
> IRQ is requested before .thermal_init, this may cause IRQ function be
> called before device is initialized.
>
> this patch disables interrupt in .remove, to ensure irq function
> only be called after device is fully initialized.
>
> Signed-off-by: Jiada Wang <[email protected]>

Based on https://patchwork.kernel.org/cover/10914079/#22603533 :

Reviewed-and-Tested-by: Eugeniu Rosca <[email protected]>

Thanks!

--
Best regards,
Eugeniu.

2019-04-24 13:25:09

by Daniel Lezcano

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] thermal: rcar_gen3_thermal: fix interrupt type

On 24/04/2019 07:11, Jiada Wang wrote:
> Currently IRQF_SHARED type interrupt line is allocated, but it
> is not appropriate, as the interrupt line isn't shared between
> different devices, instead IRQF_ONESHOT is the proper type.
>
> By changing interrupt type to IRQF_ONESHOT, now irq handler is
> no longer needed, as clear of interrupt status can be done in
> threaded interrupt context.
>
> Because IRQF_ONESHOT type interrupt line is kept disabled until
> the threaded handler has been run, so there is no need to protect
> read/write of REG_GEN3_IRQSTR with lock.
>
> Fixes: 7d4b269776ec6 ("enable hardware interrupts for trip points")
> Signed-off-by: Jiada Wang <[email protected]>

Suggested-by: Daniel Lezcano <[email protected]>
Reviewed-by: Daniel Lezcano <[email protected]>

> ---
> drivers/thermal/rcar_gen3_thermal.c | 38 +++++------------------------
> 1 file changed, 6 insertions(+), 32 deletions(-)
>
> diff --git a/drivers/thermal/rcar_gen3_thermal.c b/drivers/thermal/rcar_gen3_thermal.c
> index 88fa41cf16e8..065e16f53285 100644
> --- a/drivers/thermal/rcar_gen3_thermal.c
> +++ b/drivers/thermal/rcar_gen3_thermal.c
> @@ -14,7 +14,6 @@
> #include <linux/of_device.h>
> #include <linux/platform_device.h>
> #include <linux/pm_runtime.h>
> -#include <linux/spinlock.h>
> #include <linux/sys_soc.h>
> #include <linux/thermal.h>
>
> @@ -82,7 +81,6 @@ struct rcar_gen3_thermal_tsc {
> struct rcar_gen3_thermal_priv {
> struct rcar_gen3_thermal_tsc *tscs[TSC_MAX_NUM];
> unsigned int num_tscs;
> - spinlock_t lock; /* Protect interrupts on and off */
> void (*thermal_init)(struct rcar_gen3_thermal_tsc *tsc);
> };
>
> @@ -232,38 +230,16 @@ static irqreturn_t rcar_gen3_thermal_irq(int irq, void *data)
> {
> struct rcar_gen3_thermal_priv *priv = data;
> u32 status;
> - int i, ret = IRQ_HANDLED;
> + int i;
>
> - spin_lock(&priv->lock);
> for (i = 0; i < priv->num_tscs; i++) {
> status = rcar_gen3_thermal_read(priv->tscs[i], REG_GEN3_IRQSTR);
> rcar_gen3_thermal_write(priv->tscs[i], REG_GEN3_IRQSTR, 0);
> if (status)
> - ret = IRQ_WAKE_THREAD;
> + thermal_zone_device_update(priv->tscs[i]->zone,
> + THERMAL_EVENT_UNSPECIFIED);
> }
>
> - if (ret == IRQ_WAKE_THREAD)
> - rcar_thermal_irq_set(priv, false);
> -
> - spin_unlock(&priv->lock);
> -
> - return ret;
> -}
> -
> -static irqreturn_t rcar_gen3_thermal_irq_thread(int irq, void *data)
> -{
> - struct rcar_gen3_thermal_priv *priv = data;
> - unsigned long flags;
> - int i;
> -
> - for (i = 0; i < priv->num_tscs; i++)
> - thermal_zone_device_update(priv->tscs[i]->zone,
> - THERMAL_EVENT_UNSPECIFIED);
> -
> - spin_lock_irqsave(&priv->lock, flags);
> - rcar_thermal_irq_set(priv, true);
> - spin_unlock_irqrestore(&priv->lock, flags);
> -
> return IRQ_HANDLED;
> }
>
> @@ -371,8 +347,6 @@ static int rcar_gen3_thermal_probe(struct platform_device *pdev)
> if (soc_device_match(r8a7795es1))
> priv->thermal_init = rcar_gen3_thermal_init_r8a7795es1;
>
> - spin_lock_init(&priv->lock);
> -
> platform_set_drvdata(pdev, priv);
>
> /*
> @@ -390,9 +364,9 @@ static int rcar_gen3_thermal_probe(struct platform_device *pdev)
> if (!irqname)
> return -ENOMEM;
>
> - ret = devm_request_threaded_irq(dev, irq, rcar_gen3_thermal_irq,
> - rcar_gen3_thermal_irq_thread,
> - IRQF_SHARED, irqname, priv);
> + ret = devm_request_threaded_irq(dev, irq, NULL,
> + rcar_gen3_thermal_irq,
> + IRQF_ONESHOT, irqname, priv);
> if (ret)
> return ret;
> }
>


--
<http://www.linaro.org/> Linaro.org │ Open source software for ARM SoCs

Follow Linaro: <http://www.facebook.com/pages/Linaro> Facebook |
<http://twitter.com/#!/linaroorg> Twitter |
<http://www.linaro.org/linaro-blog/> Blog

2019-04-24 13:49:38

by Eugeniu Rosca

[permalink] [raw]
Subject: Re: [PATCH v4 0/2] thermal: rcar_gen3_thermal: fix IRQ issues

On Wed, Apr 24, 2019 at 02:11:43PM +0900, Jiada Wang wrote:
> There are issues with interrupt handling in rcar_gen3_thermal driver.
>
> Currently IRQ is remain enabled after .remove, later if device is probed,
> IRQ is requested before .thermal_init, this may cause IRQ function be
> triggered but not able to clear IRQ status, thus cause system to hang.
>
> Since the irq line isn't shared between different devices,
> so the proper interrupt type flag should be IRQF_ONESHOT.
>
> This patch-set fix these interrupt handling retated issues.
>
> ---
> v4: remove 'spinlock_t lock'
> add Fixes tag in ("thermal: rcar_gen3_thermal: fix interrupt type")
> fix typos in ("thermal: rcar_gen3_thermal: disable interrupt in .remove")
>
> v3: fix to use correct code base
> remove unused "flag" variable in rcar_gen3_thermal_irq
>
> v2: use irq type IRQF_ONESHOT instead of IRQF_SHARED
> disable interrupt in .remove
>
> v1: initial version
>
> Jiada Wang (2):
> thermal: rcar_gen3_thermal: fix interrupt type
> thermal: rcar_gen3_thermal: disable interrupt in .remove
>
> drivers/thermal/rcar_gen3_thermal.c | 41 +++++++----------------------
> 1 file changed, 9 insertions(+), 32 deletions(-)
>
> --
> 2.19.2

For the record, below is the diff between v3 and v4 [1].
It accurately reflects my review comments in
https://patchwork.kernel.org/patch/10913165/#22601305 .

In addition, I made sure there are either false positives or
no new issues reported by:
- sparse v0.5.2-1-ga3c4716a703a
- smatch v0.5.0-4785-g4968bcad1c08
- cppcheck 1.88 dev
- make W=123
- make coccicheck

I repeated the same test steps as described in
https://patchwork.kernel.org/cover/10913163/#22602335
and the results were the same:

Reviewed-and-Tested-by: Eugeniu Rosca <[email protected]>

[1] diff --git a/drivers/thermal/rcar_gen3_thermal.c b/drivers/thermal/rcar_gen3_thermal.c
index c63a86d3dac6..280230951dfe 100644
--- a/drivers/thermal/rcar_gen3_thermal.c
+++ b/drivers/thermal/rcar_gen3_thermal.c
@@ -14,7 +14,6 @@
#include <linux/of_device.h>
#include <linux/platform_device.h>
#include <linux/pm_runtime.h>
-#include <linux/spinlock.h>
#include <linux/sys_soc.h>
#include <linux/thermal.h>

@@ -82,7 +81,6 @@ struct rcar_gen3_thermal_tsc {
struct rcar_gen3_thermal_priv {
struct rcar_gen3_thermal_tsc *tscs[TSC_MAX_NUM];
unsigned int num_tscs;
- spinlock_t lock; /* Protect interrupts on and off */
void (*thermal_init)(struct rcar_gen3_thermal_tsc *tsc);
};

@@ -231,8 +229,8 @@ static void rcar_thermal_irq_set(struct rcar_gen3_thermal_priv *priv, bool on)
static irqreturn_t rcar_gen3_thermal_irq(int irq, void *data)
{
struct rcar_gen3_thermal_priv *priv = data;
- int i;
u32 status;
+ int i;

for (i = 0; i < priv->num_tscs; i++) {
status = rcar_gen3_thermal_read(priv->tscs[i], REG_GEN3_IRQSTR);
@@ -352,8 +350,6 @@ static int rcar_gen3_thermal_probe(struct platform_device *pdev)
if (soc_device_match(r8a7795es1))
priv->thermal_init = rcar_gen3_thermal_init_r8a7795es1;

- spin_lock_init(&priv->lock);
-
platform_set_drvdata(pdev, priv);

/*

--
Best regards,
Eugeniu.

2019-04-24 19:41:29

by Eugeniu Rosca

[permalink] [raw]
Subject: Re: [PATCH v4 1/2] thermal: rcar_gen3_thermal: fix interrupt type

On Wed, Apr 24, 2019 at 02:11:44PM +0900, Jiada Wang wrote:
> Currently IRQF_SHARED type interrupt line is allocated, but it
> is not appropriate, as the interrupt line isn't shared between
> different devices, instead IRQF_ONESHOT is the proper type.
>
> By changing interrupt type to IRQF_ONESHOT, now irq handler is
> no longer needed, as clear of interrupt status can be done in
> threaded interrupt context.
>
> Because IRQF_ONESHOT type interrupt line is kept disabled until
> the threaded handler has been run, so there is no need to protect
> read/write of REG_GEN3_IRQSTR with lock.
>
> Fixes: 7d4b269776ec6 ("enable hardware interrupts for trip points")
> Signed-off-by: Jiada Wang <[email protected]>

Based on https://patchwork.kernel.org/cover/10914079/#22603533 :

Reviewed-and-Tested-by: Eugeniu Rosca <[email protected]>

Thanks!

--
Best regards,
Eugeniu.

2019-05-07 23:56:44

by Niklas Söderlund

[permalink] [raw]
Subject: Re: [PATCH v4 0/2] thermal: rcar_gen3_thermal: fix IRQ issues

Hi Jiada,

Thanks for your patches.

On 2019-04-24 14:11:43 +0900, Jiada Wang wrote:
> There are issues with interrupt handling in rcar_gen3_thermal driver.
>
> Currently IRQ is remain enabled after .remove, later if device is probed,
> IRQ is requested before .thermal_init, this may cause IRQ function be
> triggered but not able to clear IRQ status, thus cause system to hang.
>
> Since the irq line isn't shared between different devices,
> so the proper interrupt type flag should be IRQF_ONESHOT.
>
> This patch-set fix these interrupt handling retated issues.

I really like this series, nice work.

Tested-by: Niklas S?derlund <[email protected]>
Reviewed-by: Niklas S?derlund <[email protected]>

>
> ---
> v4: remove 'spinlock_t lock'
> add Fixes tag in ("thermal: rcar_gen3_thermal: fix interrupt type")
> fix typos in ("thermal: rcar_gen3_thermal: disable interrupt in .remove")
>
> v3: fix to use correct code base
> remove unused "flag" variable in rcar_gen3_thermal_irq
>
> v2: use irq type IRQF_ONESHOT instead of IRQF_SHARED
> disable interrupt in .remove
>
> v1: initial version
>
> Jiada Wang (2):
> thermal: rcar_gen3_thermal: fix interrupt type
> thermal: rcar_gen3_thermal: disable interrupt in .remove
>
> drivers/thermal/rcar_gen3_thermal.c | 41 +++++++----------------------
> 1 file changed, 9 insertions(+), 32 deletions(-)
>
> --
> 2.19.2
>

--
Regards,
Niklas S?derlund

2019-05-10 10:43:51

by Eugeniu Rosca

[permalink] [raw]
Subject: Re: [PATCH v4 0/2] thermal: rcar_gen3_thermal: fix IRQ issues

Hi Niklas,

On Wed, May 08, 2019 at 01:54:03AM +0200, Niklas Söderlund wrote:
> Hi Jiada,
[..]
> I really like this series, nice work.
>
> Tested-by: Niklas Söderlund <[email protected]>
> Reviewed-by: Niklas Söderlund <[email protected]>

Is there anything off-the-shelf available for testing the rcar3
thermal driver, to avoid reinventing the wheel via
https://patchwork.kernel.org/cover/10913163/#22602335

Thank you.

--
Best Regards,
Eugeniu.

2019-05-10 11:37:23

by Niklas Söderlund

[permalink] [raw]
Subject: Re: [PATCH v4 0/2] thermal: rcar_gen3_thermal: fix IRQ issues

Hi Eugeniu,

On 2019-05-10 12:42:31 +0200, Eugeniu Rosca wrote:
> Hi Niklas,
>
> On Wed, May 08, 2019 at 01:54:03AM +0200, Niklas S?derlund wrote:
> > Hi Jiada,
> [..]
> > I really like this series, nice work.
> >
> > Tested-by: Niklas S?derlund <[email protected]>
> > Reviewed-by: Niklas S?derlund <[email protected]>
>
> Is there anything off-the-shelf available for testing the rcar3
> thermal driver, to avoid reinventing the wheel via
> https://patchwork.kernel.org/cover/10913163/#22602335

Not that I know of, unfortunately :-(

I have a private home hacked testing framework (don't we all?) based on
tcl+expect where I have two basic tests for rcar_gen3_thermal. I'm
willing to share the tests if you by chance want them, but be warned
that they are highly specialised for my needs and I'm reluctant to
publish my whole hack tool as it just a ugly hack ;-)

On a high level the tests I have are

1. thermal-load
Generates load on target and observes the temperature is increased
using the /sys/class/thermal/thermal_zone*/temp" interface. This
seems similar to the test case your reference using stress-ng.

2. thermal-cooling
Emulate the passive trip point temperatures using the
/sys/class/thermal/*/emul_temp interface and observe that the
specified cooling state is achieved.

I should add a third test to make sure IRQ fires but this is just a pet
project for me so maybe I will get around to it sometime...

If you know of anything around to test thermal drivers or if you create
something please let me know so I can add it to my tests. And let me
know if you want my hacks for inspiration for your own testing.

--
Regards,
Niklas S?derlund

2019-05-10 17:12:16

by Eugeniu Rosca

[permalink] [raw]
Subject: Re: [PATCH v4 0/2] thermal: rcar_gen3_thermal: fix IRQ issues

Hi Niklas,

On Fri, May 10, 2019 at 01:36:08PM +0200, Niklas Söderlund wrote:
> Hi Eugeniu,
>
> On 2019-05-10 12:42:31 +0200, Eugeniu Rosca wrote:
> > Hi Niklas,
> >
> > On Wed, May 08, 2019 at 01:54:03AM +0200, Niklas Söderlund wrote:
> > > Hi Jiada,
> > [..]
> > > I really like this series, nice work.
> > >
> > > Tested-by: Niklas Söderlund <[email protected]>
> > > Reviewed-by: Niklas Söderlund <[email protected]>
> >
> > Is there anything off-the-shelf available for testing the rcar3
> > thermal driver, to avoid reinventing the wheel via
> > https://patchwork.kernel.org/cover/10913163/#22602335
>
> Not that I know of, unfortunately :-(
>
> I have a private home hacked testing framework (don't we all?) based on
> tcl+expect where I have two basic tests for rcar_gen3_thermal. I'm
> willing to share the tests if you by chance want them, but be warned
> that they are highly specialised for my needs and I'm reluctant to
> publish my whole hack tool as it just a ugly hack ;-)
>
> On a high level the tests I have are
>
> 1. thermal-load
> Generates load on target and observes the temperature is increased
> using the /sys/class/thermal/thermal_zone*/temp" interface. This
> seems similar to the test case your reference using stress-ng.
>
> 2. thermal-cooling
> Emulate the passive trip point temperatures using the
> /sys/class/thermal/*/emul_temp interface and observe that the
> specified cooling state is achieved.
>
> I should add a third test to make sure IRQ fires but this is just a pet
> project for me so maybe I will get around to it sometime...
>
> If you know of anything around to test thermal drivers or if you create
> something please let me know so I can add it to my tests. And let me
> know if you want my hacks for inspiration for your own testing.

Thanks for this summary. It would be definitely convenient to have
a set of tests covering the most important features of the driver.

I was particularly thinking of the test procedure in light of below:
- I still can reproduce a few UBSAN (signed integer overflow) and
KASAN (use-after-free) reports with the most recent vanilla driver.
- There are a couple of thermal commits in rcar-3.9.x pending for
mainline submission:

https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git/commit/?id=fe7d0d1c77f9 ("thermal: rcar_gen3_thermal: Use FUSE values if they are available")
https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git/commit/?id=2776ccd63649 ("thermal: rcar_gen3_thermal: Fix interrupt count issue")
https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git/commit/?id=9146af785f41 ("thermal: rcar_gen3_thermal: Enable selection between polling/interrupt mode")
https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git/commit/?id=55b262766ec2 ("thermal: rcar_gen3_thermal: PIO-INT can be selected for each TSC separately")
https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git/commit/?id=d323d9de0683 ("thermal: rcar_gen3_thermal: Add support for r8a77990")
https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git/commit/?id=fb8efb8bac29 ("thermal: rcar_gen3_thermal: Fix interrupts are not raised issue on E3")
https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git/commit/?id=5627c42a1bd5 ("thermal: rcar_gen3_thermal: Use DIV_ROUND_CLOSEST correctly as its description")
https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git/commit/?id=d4e41702e53b ("thermal: rcar_gen3_thermal: [H3/M3N] Update calculation formula due to HW evaluation")
https://git.kernel.org/pub/scm/linux/kernel/git/horms/renesas-bsp.git/commit/?id=958bd36e03b7 ("thermal: rcar_gen3_thermal: [E3] Update calculation formula due to HW evaluation")

Long story short, I think we will review more thermal commits in
hopefully not so distant future and it would be helpful to reach some
common understanding what kind of testing the new patches should pass.

Your summary already gives some insight in that direction. Thanks.

--
Best Regards,
Eugeniu.