2021-06-19 02:21:26

by Grzegorz Jaszczyk

[permalink] [raw]
Subject: [PATCH v2 0/2] introduce watchdog_dev_suspend/resume

Hi All,

The following is a v2 version of the series [1] that fixes system hang which
occurs when the ping worker fires after wdog suspend and before wdog resume.
This happens because the ping worker can issue low-level ping while the wdog clk
was disabled by the suspend routine (accessing hw wdog registers while they are
not fed by the clk).

To overcome this issue two patches were introduced. Patch #1 introduces pm
notifier in the watchdog core which will call watchdog_dev_suspend/resume and
actually cancel ping worker during suspend and restore it back, if needed,
during resume.

Patch #2 introduces relevant changes to imx2_wdt driver and notifies wdog core
to stop ping worker on suspend.

[1] https://lkml.org/lkml/2021/6/15/542

Best regards,
Grzegorz

Grzegorz Jaszczyk (2):
watchdog: introduce watchdog_dev_suspend/resume
watchdog: imx2_wdg: notify wdog core to stop ping worker on suspend

drivers/watchdog/imx2_wdt.c | 1 +
drivers/watchdog/watchdog_core.c | 37 +++++++++++++++++++++++++
drivers/watchdog/watchdog_dev.c | 47 ++++++++++++++++++++++++++++++++
include/linux/watchdog.h | 10 +++++++
4 files changed, 95 insertions(+)

--
2.29.0


2021-06-19 02:21:52

by Grzegorz Jaszczyk

[permalink] [raw]
Subject: [PATCH v2 1/2] watchdog: introduce watchdog_dev_suspend/resume

The watchdog drivers often disable wdog clock during suspend and then
enable it again during resume. Nevertheless the ping worker is still
running and can issue low-level ping while the wdog clock is disabled
causing the system hang. To prevent such condition register pm notifier
in the watchdog core which will call watchdog_dev_suspend/resume and
actually cancel ping worker during suspend and restore it back, if
needed, during resume.

Signed-off-by: Grzegorz Jaszczyk <[email protected]>
---
v1->v2:
- Instead of using watchdog_dev_suspend/resume directly in wdog drivers
suspend/resume callbacks, register pm notifier in the watchdog core when
new WDOG_NO_PING_ON_SUSPEND status flag is set by the driver. Suggested
by Guenter Roeck <[email protected]>.
- Initialize ret variable in watchdog_dev_suspend/resume.
- Drop EXPORT_SYMBOL_GPL for watchdog_dev_suspend/resume since from now
one they are used only by the watchdog core and not by the drivers.
- Commit log was updated accordingly.
---
drivers/watchdog/watchdog_core.c | 37 +++++++++++++++++++++++++
drivers/watchdog/watchdog_dev.c | 47 ++++++++++++++++++++++++++++++++
include/linux/watchdog.h | 10 +++++++
3 files changed, 94 insertions(+)

diff --git a/drivers/watchdog/watchdog_core.c b/drivers/watchdog/watchdog_core.c
index 5df0a22e2cb4..3fe8a7edc252 100644
--- a/drivers/watchdog/watchdog_core.c
+++ b/drivers/watchdog/watchdog_core.c
@@ -34,6 +34,7 @@
#include <linux/idr.h> /* For ida_* macros */
#include <linux/err.h> /* For IS_ERR macros */
#include <linux/of.h> /* For of_get_timeout_sec */
+#include <linux/suspend.h>

#include "watchdog_core.h" /* For watchdog_dev_register/... */

@@ -185,6 +186,33 @@ static int watchdog_restart_notifier(struct notifier_block *nb,
return NOTIFY_DONE;
}

+static int watchdog_pm_notifier(struct notifier_block *nb, unsigned long mode,
+ void *data)
+{
+ struct watchdog_device *wdd;
+ int ret = 0;
+
+ wdd = container_of(nb, struct watchdog_device, pm_nb);
+
+ switch (mode) {
+ case PM_HIBERNATION_PREPARE:
+ case PM_RESTORE_PREPARE:
+ case PM_SUSPEND_PREPARE:
+ ret = watchdog_dev_suspend(wdd);
+ break;
+ case PM_POST_HIBERNATION:
+ case PM_POST_RESTORE:
+ case PM_POST_SUSPEND:
+ ret = watchdog_dev_resume(wdd);
+ break;
+ }
+
+ if (ret)
+ return NOTIFY_BAD;
+
+ return NOTIFY_DONE;
+}
+
/**
* watchdog_set_restart_priority - Change priority of restart handler
* @wdd: watchdog device
@@ -292,6 +320,15 @@ static int __watchdog_register_device(struct watchdog_device *wdd)
wdd->id, ret);
}

+ if (test_bit(WDOG_NO_PING_ON_SUSPEND, &wdd->status)) {
+ wdd->pm_nb.notifier_call = watchdog_pm_notifier;
+
+ ret = register_pm_notifier(&wdd->pm_nb);
+ if (ret)
+ pr_warn("watchdog%d: Cannot register pm handler (%d)\n",
+ wdd->id, ret);
+ }
+
return 0;
}

diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
index 2946f3a63110..9d1c340a3024 100644
--- a/drivers/watchdog/watchdog_dev.c
+++ b/drivers/watchdog/watchdog_dev.c
@@ -1219,6 +1219,53 @@ void __exit watchdog_dev_exit(void)
kthread_destroy_worker(watchdog_kworker);
}

+int watchdog_dev_suspend(struct watchdog_device *wdd)
+{
+ struct watchdog_core_data *wd_data = wdd->wd_data;
+ int ret = 0;
+
+ if (!wdd->wd_data)
+ return -ENODEV;
+
+ /* ping for the last time before suspend */
+ mutex_lock(&wd_data->lock);
+ if (watchdog_worker_should_ping(wd_data))
+ ret = __watchdog_ping(wd_data->wdd);
+ mutex_unlock(&wd_data->lock);
+
+ if (ret)
+ return ret;
+
+ /*
+ * make sure that watchdog worker will not kick in when the wdog is
+ * suspended
+ */
+ hrtimer_cancel(&wd_data->timer);
+ kthread_cancel_work_sync(&wd_data->work);
+
+ return 0;
+}
+
+int watchdog_dev_resume(struct watchdog_device *wdd)
+{
+ struct watchdog_core_data *wd_data = wdd->wd_data;
+ int ret = 0;
+
+ if (!wdd->wd_data)
+ return -ENODEV;
+
+ /*
+ * __watchdog_ping will also retrigger hrtimer and therefore restore the
+ * ping worker if needed.
+ */
+ mutex_lock(&wd_data->lock);
+ if (watchdog_worker_should_ping(wd_data))
+ ret = __watchdog_ping(wd_data->wdd);
+ mutex_unlock(&wd_data->lock);
+
+ return ret;
+}
+
module_param(handle_boot_enabled, bool, 0444);
MODULE_PARM_DESC(handle_boot_enabled,
"Watchdog core auto-updates boot enabled watchdogs before userspace takes over (default="
diff --git a/include/linux/watchdog.h b/include/linux/watchdog.h
index 9b19e6bb68b5..99660197a36c 100644
--- a/include/linux/watchdog.h
+++ b/include/linux/watchdog.h
@@ -107,6 +107,7 @@ struct watchdog_device {
unsigned int max_hw_heartbeat_ms;
struct notifier_block reboot_nb;
struct notifier_block restart_nb;
+ struct notifier_block pm_nb;
void *driver_data;
struct watchdog_core_data *wd_data;
unsigned long status;
@@ -116,6 +117,7 @@ struct watchdog_device {
#define WDOG_STOP_ON_REBOOT 2 /* Should be stopped on reboot */
#define WDOG_HW_RUNNING 3 /* True if HW watchdog running */
#define WDOG_STOP_ON_UNREGISTER 4 /* Should be stopped on unregister */
+#define WDOG_NO_PING_ON_SUSPEND 5 /* Ping worker should be stopped on suspend */
struct list_head deferred;
};

@@ -156,6 +158,12 @@ static inline void watchdog_stop_on_unregister(struct watchdog_device *wdd)
set_bit(WDOG_STOP_ON_UNREGISTER, &wdd->status);
}

+/* Use the following function to stop the wdog ping worker when suspending */
+static inline void watchdog_stop_ping_on_suspend(struct watchdog_device *wdd)
+{
+ set_bit(WDOG_NO_PING_ON_SUSPEND, &wdd->status);
+}
+
/* Use the following function to check if a timeout value is invalid */
static inline bool watchdog_timeout_invalid(struct watchdog_device *wdd, unsigned int t)
{
@@ -209,6 +217,8 @@ extern int watchdog_init_timeout(struct watchdog_device *wdd,
unsigned int timeout_parm, struct device *dev);
extern int watchdog_register_device(struct watchdog_device *);
extern void watchdog_unregister_device(struct watchdog_device *);
+int watchdog_dev_suspend(struct watchdog_device *wdd);
+int watchdog_dev_resume(struct watchdog_device *wdd);

int watchdog_set_last_hw_keepalive(struct watchdog_device *, unsigned int);

--
2.29.0

2021-06-28 17:10:52

by Guenter Roeck

[permalink] [raw]
Subject: Re: [PATCH v2 1/2] watchdog: introduce watchdog_dev_suspend/resume

On Fri, Jun 18, 2021 at 09:50:32PM +0200, Grzegorz Jaszczyk wrote:
> The watchdog drivers often disable wdog clock during suspend and then
> enable it again during resume. Nevertheless the ping worker is still
> running and can issue low-level ping while the wdog clock is disabled
> causing the system hang. To prevent such condition register pm notifier
> in the watchdog core which will call watchdog_dev_suspend/resume and
> actually cancel ping worker during suspend and restore it back, if
> needed, during resume.
>
> Signed-off-by: Grzegorz Jaszczyk <[email protected]>

Reviewed-by: Guenter Roeck <[email protected]>

> ---
> v1->v2:
> - Instead of using watchdog_dev_suspend/resume directly in wdog drivers
> suspend/resume callbacks, register pm notifier in the watchdog core when
> new WDOG_NO_PING_ON_SUSPEND status flag is set by the driver. Suggested
> by Guenter Roeck <[email protected]>.
> - Initialize ret variable in watchdog_dev_suspend/resume.
> - Drop EXPORT_SYMBOL_GPL for watchdog_dev_suspend/resume since from now
> one they are used only by the watchdog core and not by the drivers.
> - Commit log was updated accordingly.
> ---
> drivers/watchdog/watchdog_core.c | 37 +++++++++++++++++++++++++
> drivers/watchdog/watchdog_dev.c | 47 ++++++++++++++++++++++++++++++++
> include/linux/watchdog.h | 10 +++++++
> 3 files changed, 94 insertions(+)
>
> diff --git a/drivers/watchdog/watchdog_core.c b/drivers/watchdog/watchdog_core.c
> index 5df0a22e2cb4..3fe8a7edc252 100644
> --- a/drivers/watchdog/watchdog_core.c
> +++ b/drivers/watchdog/watchdog_core.c
> @@ -34,6 +34,7 @@
> #include <linux/idr.h> /* For ida_* macros */
> #include <linux/err.h> /* For IS_ERR macros */
> #include <linux/of.h> /* For of_get_timeout_sec */
> +#include <linux/suspend.h>
>
> #include "watchdog_core.h" /* For watchdog_dev_register/... */
>
> @@ -185,6 +186,33 @@ static int watchdog_restart_notifier(struct notifier_block *nb,
> return NOTIFY_DONE;
> }
>
> +static int watchdog_pm_notifier(struct notifier_block *nb, unsigned long mode,
> + void *data)
> +{
> + struct watchdog_device *wdd;
> + int ret = 0;
> +
> + wdd = container_of(nb, struct watchdog_device, pm_nb);
> +
> + switch (mode) {
> + case PM_HIBERNATION_PREPARE:
> + case PM_RESTORE_PREPARE:
> + case PM_SUSPEND_PREPARE:
> + ret = watchdog_dev_suspend(wdd);
> + break;
> + case PM_POST_HIBERNATION:
> + case PM_POST_RESTORE:
> + case PM_POST_SUSPEND:
> + ret = watchdog_dev_resume(wdd);
> + break;
> + }
> +
> + if (ret)
> + return NOTIFY_BAD;
> +
> + return NOTIFY_DONE;
> +}
> +
> /**
> * watchdog_set_restart_priority - Change priority of restart handler
> * @wdd: watchdog device
> @@ -292,6 +320,15 @@ static int __watchdog_register_device(struct watchdog_device *wdd)
> wdd->id, ret);
> }
>
> + if (test_bit(WDOG_NO_PING_ON_SUSPEND, &wdd->status)) {
> + wdd->pm_nb.notifier_call = watchdog_pm_notifier;
> +
> + ret = register_pm_notifier(&wdd->pm_nb);
> + if (ret)
> + pr_warn("watchdog%d: Cannot register pm handler (%d)\n",
> + wdd->id, ret);
> + }
> +
> return 0;
> }
>
> diff --git a/drivers/watchdog/watchdog_dev.c b/drivers/watchdog/watchdog_dev.c
> index 2946f3a63110..9d1c340a3024 100644
> --- a/drivers/watchdog/watchdog_dev.c
> +++ b/drivers/watchdog/watchdog_dev.c
> @@ -1219,6 +1219,53 @@ void __exit watchdog_dev_exit(void)
> kthread_destroy_worker(watchdog_kworker);
> }
>
> +int watchdog_dev_suspend(struct watchdog_device *wdd)
> +{
> + struct watchdog_core_data *wd_data = wdd->wd_data;
> + int ret = 0;
> +
> + if (!wdd->wd_data)
> + return -ENODEV;
> +
> + /* ping for the last time before suspend */
> + mutex_lock(&wd_data->lock);
> + if (watchdog_worker_should_ping(wd_data))
> + ret = __watchdog_ping(wd_data->wdd);
> + mutex_unlock(&wd_data->lock);
> +
> + if (ret)
> + return ret;
> +
> + /*
> + * make sure that watchdog worker will not kick in when the wdog is
> + * suspended
> + */
> + hrtimer_cancel(&wd_data->timer);
> + kthread_cancel_work_sync(&wd_data->work);
> +
> + return 0;
> +}
> +
> +int watchdog_dev_resume(struct watchdog_device *wdd)
> +{
> + struct watchdog_core_data *wd_data = wdd->wd_data;
> + int ret = 0;
> +
> + if (!wdd->wd_data)
> + return -ENODEV;
> +
> + /*
> + * __watchdog_ping will also retrigger hrtimer and therefore restore the
> + * ping worker if needed.
> + */
> + mutex_lock(&wd_data->lock);
> + if (watchdog_worker_should_ping(wd_data))
> + ret = __watchdog_ping(wd_data->wdd);
> + mutex_unlock(&wd_data->lock);
> +
> + return ret;
> +}
> +
> module_param(handle_boot_enabled, bool, 0444);
> MODULE_PARM_DESC(handle_boot_enabled,
> "Watchdog core auto-updates boot enabled watchdogs before userspace takes over (default="
> diff --git a/include/linux/watchdog.h b/include/linux/watchdog.h
> index 9b19e6bb68b5..99660197a36c 100644
> --- a/include/linux/watchdog.h
> +++ b/include/linux/watchdog.h
> @@ -107,6 +107,7 @@ struct watchdog_device {
> unsigned int max_hw_heartbeat_ms;
> struct notifier_block reboot_nb;
> struct notifier_block restart_nb;
> + struct notifier_block pm_nb;
> void *driver_data;
> struct watchdog_core_data *wd_data;
> unsigned long status;
> @@ -116,6 +117,7 @@ struct watchdog_device {
> #define WDOG_STOP_ON_REBOOT 2 /* Should be stopped on reboot */
> #define WDOG_HW_RUNNING 3 /* True if HW watchdog running */
> #define WDOG_STOP_ON_UNREGISTER 4 /* Should be stopped on unregister */
> +#define WDOG_NO_PING_ON_SUSPEND 5 /* Ping worker should be stopped on suspend */
> struct list_head deferred;
> };
>
> @@ -156,6 +158,12 @@ static inline void watchdog_stop_on_unregister(struct watchdog_device *wdd)
> set_bit(WDOG_STOP_ON_UNREGISTER, &wdd->status);
> }
>
> +/* Use the following function to stop the wdog ping worker when suspending */
> +static inline void watchdog_stop_ping_on_suspend(struct watchdog_device *wdd)
> +{
> + set_bit(WDOG_NO_PING_ON_SUSPEND, &wdd->status);
> +}
> +
> /* Use the following function to check if a timeout value is invalid */
> static inline bool watchdog_timeout_invalid(struct watchdog_device *wdd, unsigned int t)
> {
> @@ -209,6 +217,8 @@ extern int watchdog_init_timeout(struct watchdog_device *wdd,
> unsigned int timeout_parm, struct device *dev);
> extern int watchdog_register_device(struct watchdog_device *);
> extern void watchdog_unregister_device(struct watchdog_device *);
> +int watchdog_dev_suspend(struct watchdog_device *wdd);
> +int watchdog_dev_resume(struct watchdog_device *wdd);
>
> int watchdog_set_last_hw_keepalive(struct watchdog_device *, unsigned int);
>

2021-07-21 07:54:19

by Grzegorz Jaszczyk

[permalink] [raw]
Subject: Re: [PATCH v2 0/2] introduce watchdog_dev_suspend/resume

On Fri, 18 Jun 2021 at 21:50, Grzegorz Jaszczyk
<[email protected]> wrote:
>
> Hi All,
>
> The following is a v2 version of the series [1] that fixes system hang which
> occurs when the ping worker fires after wdog suspend and before wdog resume.
> This happens because the ping worker can issue low-level ping while the wdog clk
> was disabled by the suspend routine (accessing hw wdog registers while they are
> not fed by the clk).
>
> To overcome this issue two patches were introduced. Patch #1 introduces pm
> notifier in the watchdog core which will call watchdog_dev_suspend/resume and
> actually cancel ping worker during suspend and restore it back, if needed,
> during resume.
>
> Patch #2 introduces relevant changes to imx2_wdt driver and notifies wdog core
> to stop ping worker on suspend.
>
> [1] https://lkml.org/lkml/2021/6/15/542
>
> Best regards,
> Grzegorz
>
> Grzegorz Jaszczyk (2):
> watchdog: introduce watchdog_dev_suspend/resume
> watchdog: imx2_wdg: notify wdog core to stop ping worker on suspend
>
> drivers/watchdog/imx2_wdt.c | 1 +
> drivers/watchdog/watchdog_core.c | 37 +++++++++++++++++++++++++
> drivers/watchdog/watchdog_dev.c | 47 ++++++++++++++++++++++++++++++++
> include/linux/watchdog.h | 10 +++++++
> 4 files changed, 95 insertions(+)
>
> --
> 2.29.0
>

Hi,

Gentle reminder about this patch-set. Both patches were reviewed by
Guenter Roeck and are on the list for some time, I would be happy if
we could get them applied.

Thank you in advance,
Grzegorz