2019-06-10 16:36:56

by Ken Sloat

[permalink] [raw]
Subject: [RFE]: watchdog: atmel: atmel-sama5d4-wdt

Hello Nicolas,

I wanted to open a discussion proposing new functionality to allow disabling of the watchdog timer upon entering
suspend in the SAMA5D2/4.

Typical use case of a hardware watchdog timer in the kernel is a userspace application opens the watchdog timer and
periodically "kicks" it. If the application hits a deadlock somewhere and is no longer able to kick it, then the watchdog
intervenes and often resets the processor. Such is the case for the Atmel driver (which also allows a watchdog interrupt
to be asserted in lieu of a system reset). In most use cases, upon entering a low power/suspend state, the application
will no longer be able to "kick" the watchdog. If the watchdog is not disabled or kicked via another method, then it will
reset the system. This is the current behavior of the Atmel driver as of today.

The watchdog peripheral itself does have a "WDIDLEHLT" bit however, and this is enabled via the "atmel,idle-halt" dt
property. However, this is not very useful, as it literally only makes the watchdog count when the CPU is active. This
results in non-deterministic triggering of the WDT and means that if a critical application were to crash, it may be
quite a long time before the WDT would ever trigger. Below is a similar statement made in the device-tree doc for this
peripheral:

- atmel,idle-halt: present if you want to stop the watchdog when the CPU is
in idle state.
CAUTION: This property should be used with care, it actually makes the
watchdog not counting when the CPU is in idle state, therefore the
watchdog reset time depends on mean CPU usage and will not reset at all
if the CPU stop working while it is in idle state, which is probably
not what you want.

It seems to me, that it would be logical and useful to introduce a new property that would cause the Atmel WDT
to disable on suspend and re-enable on resume. It also appears that the WDT is re-initialized anyways upon
resume, so the only piece missing here would really be a dt flag and a call to disable.

I would be happy to submit a patch implementing this change, but wanted to open up a discussion here as to the
merits of this idea as perhaps it has been considered in the past.

Thanks,
Ken Sloat


2019-06-10 16:40:51

by Guenter Roeck

[permalink] [raw]
Subject: Re: [RFE]: watchdog: atmel: atmel-sama5d4-wdt

On Mon, Jun 10, 2019 at 03:51:52PM +0000, Ken Sloat wrote:
> Hello Nicolas,
>
> I wanted to open a discussion proposing new functionality to allow disabling of the watchdog timer upon entering
> suspend in the SAMA5D2/4.
>
> Typical use case of a hardware watchdog timer in the kernel is a userspace application opens the watchdog timer and
> periodically "kicks" it. If the application hits a deadlock somewhere and is no longer able to kick it, then the watchdog
> intervenes and often resets the processor. Such is the case for the Atmel driver (which also allows a watchdog interrupt
> to be asserted in lieu of a system reset). In most use cases, upon entering a low power/suspend state, the application
> will no longer be able to "kick" the watchdog. If the watchdog is not disabled or kicked via another method, then it will
> reset the system. This is the current behavior of the Atmel driver as of today.
>
> The watchdog peripheral itself does have a "WDIDLEHLT" bit however, and this is enabled via the "atmel,idle-halt" dt
> property. However, this is not very useful, as it literally only makes the watchdog count when the CPU is active. This
> results in non-deterministic triggering of the WDT and means that if a critical application were to crash, it may be
> quite a long time before the WDT would ever trigger. Below is a similar statement made in the device-tree doc for this
> peripheral:
>
> - atmel,idle-halt: present if you want to stop the watchdog when the CPU is
> in idle state.
> CAUTION: This property should be used with care, it actually makes the
> watchdog not counting when the CPU is in idle state, therefore the
> watchdog reset time depends on mean CPU usage and will not reset at all
> if the CPU stop working while it is in idle state, which is probably
> not what you want.
>
> It seems to me, that it would be logical and useful to introduce a new property that would cause the Atmel WDT
> to disable on suspend and re-enable on resume. It also appears that the WDT is re-initialized anyways upon
> resume, so the only piece missing here would really be a dt flag and a call to disable.
>
Wondering - why would this need a dt property ? That would be quite unusual. Is
there a condition where one would _not_ want the watchdog to stop on suspend ?

If anything I would suggest to drop atmel,idle-halt completely; it really looks
like it would make the watchdog unreliable.

Thanks,
Guenter

2019-06-10 17:42:24

by Ken Sloat

[permalink] [raw]
Subject: RE: [RFE]: watchdog: atmel: atmel-sama5d4-wdt

> -----Original Message-----
> From: Guenter Roeck <[email protected]> On Behalf Of Guenter Roeck
> Sent: Monday, June 10, 2019 12:28 PM
> To: Ken Sloat <[email protected]>
> Cc: [email protected]; [email protected];
> [email protected]; [email protected]; linux-
> [email protected]; [email protected]
> Subject: Re: [RFE]: watchdog: atmel: atmel-sama5d4-wdt
>
> [This is an EXTERNAL EMAIL]
> ________________________________
>
> On Mon, Jun 10, 2019 at 03:51:52PM +0000, Ken Sloat wrote:
> > Hello Nicolas,
> >
> > I wanted to open a discussion proposing new functionality to allow
> > disabling of the watchdog timer upon entering suspend in the SAMA5D2/4.
> >
> > Typical use case of a hardware watchdog timer in the kernel is a
> > userspace application opens the watchdog timer and periodically
> > "kicks" it. If the application hits a deadlock somewhere and is no
> > longer able to kick it, then the watchdog intervenes and often resets
> > the processor. Such is the case for the Atmel driver (which also
> > allows a watchdog interrupt to be asserted in lieu of a system reset). In
> most use cases, upon entering a low power/suspend state, the application
> will no longer be able to "kick" the watchdog. If the watchdog is not disabled
> or kicked via another method, then it will reset the system. This is the current
> behavior of the Atmel driver as of today.
> >
> > The watchdog peripheral itself does have a "WDIDLEHLT" bit however,
> > and this is enabled via the "atmel,idle-halt" dt property. However,
> > this is not very useful, as it literally only makes the watchdog count
> > when the CPU is active. This results in non-deterministic triggering
> > of the WDT and means that if a critical application were to crash, it
> > may be quite a long time before the WDT would ever trigger. Below is a
> > similar statement made in the device-tree doc for this
> > peripheral:
> >
> > - atmel,idle-halt: present if you want to stop the watchdog when the CPU is
> > in idle state.
> > CAUTION: This property should be used with care, it actually makes the
> > watchdog not counting when the CPU is in idle state, therefore the
> > watchdog reset time depends on mean CPU usage and will not reset at
> all
> > if the CPU stop working while it is in idle state, which is probably
> > not what you want.
> >
> > It seems to me, that it would be logical and useful to introduce a new
> > property that would cause the Atmel WDT to disable on suspend and
> > re-enable on resume. It also appears that the WDT is re-initialized anyways
> upon resume, so the only piece missing here would really be a dt flag and a
> call to disable.
> >
Hello Guenter,

> Wondering - why would this need a dt property ? That would be quite
> unusual. Is there a condition where one would _not_ want the watchdog to
> stop on suspend ?

Good point, not sure there would be.

> If anything I would suggest to drop atmel,idle-halt completely; it really looks
> like it would make the watchdog unreliable.
>
I agree, while it is a function of the SAMA5, it seems to me that it is not very
useful in Linux, unless I am missing something. I will wait for Nicolas to chime
in before I submit anything. I can certainly submit separate patches for each,
I already have something working for this.

> Thanks,
> Guenter

Thanks,
Ken Sloat

2019-06-10 20:13:35

by Alexandre Belloni

[permalink] [raw]
Subject: Re: [RFE]: watchdog: atmel: atmel-sama5d4-wdt

Hello,

On 10/06/2019 09:28:11-0700, Guenter Roeck wrote:
> On Mon, Jun 10, 2019 at 03:51:52PM +0000, Ken Sloat wrote:
> > Hello Nicolas,
> >
> > I wanted to open a discussion proposing new functionality to allow disabling of the watchdog timer upon entering
> > suspend in the SAMA5D2/4.
> >
> > Typical use case of a hardware watchdog timer in the kernel is a userspace application opens the watchdog timer and
> > periodically "kicks" it. If the application hits a deadlock somewhere and is no longer able to kick it, then the watchdog
> > intervenes and often resets the processor. Such is the case for the Atmel driver (which also allows a watchdog interrupt
> > to be asserted in lieu of a system reset). In most use cases, upon entering a low power/suspend state, the application
> > will no longer be able to "kick" the watchdog. If the watchdog is not disabled or kicked via another method, then it will
> > reset the system. This is the current behavior of the Atmel driver as of today.
> >
> > The watchdog peripheral itself does have a "WDIDLEHLT" bit however, and this is enabled via the "atmel,idle-halt" dt
> > property. However, this is not very useful, as it literally only makes the watchdog count when the CPU is active. This
> > results in non-deterministic triggering of the WDT and means that if a critical application were to crash, it may be
> > quite a long time before the WDT would ever trigger. Below is a similar statement made in the device-tree doc for this
> > peripheral:
> >
> > - atmel,idle-halt: present if you want to stop the watchdog when the CPU is
> > in idle state.
> > CAUTION: This property should be used with care, it actually makes the
> > watchdog not counting when the CPU is in idle state, therefore the
> > watchdog reset time depends on mean CPU usage and will not reset at all
> > if the CPU stop working while it is in idle state, which is probably
> > not what you want.
> >
> > It seems to me, that it would be logical and useful to introduce a new property that would cause the Atmel WDT
> > to disable on suspend and re-enable on resume. It also appears that the WDT is re-initialized anyways upon
> > resume, so the only piece missing here would really be a dt flag and a call to disable.
> >
> Wondering - why would this need a dt property ? That would be quite unusual. Is
> there a condition where one would _not_ want the watchdog to stop on suspend ?
>

There are customers that protects suspend/resume using the watchdog.
They wake up their platform every 15s to ping the watchdog.

Also, I don't see why the application deciding to go to suspend wouldn't
be able to disable the watchdog before do so if this is the wanted policy.

> If anything I would suggest to drop atmel,idle-halt completely; it really looks
> like it would make the watchdog unreliable.
>
> Thanks,
> Guenter

--
Alexandre Belloni, Bootlin
Embedded Linux and Kernel engineering
https://bootlin.com

2019-06-10 20:29:55

by Guenter Roeck

[permalink] [raw]
Subject: Re: [RFE]: watchdog: atmel: atmel-sama5d4-wdt

On Mon, Jun 10, 2019 at 10:13:01PM +0200, Alexandre Belloni wrote:
> Hello,
>
> On 10/06/2019 09:28:11-0700, Guenter Roeck wrote:
> > On Mon, Jun 10, 2019 at 03:51:52PM +0000, Ken Sloat wrote:
> > > Hello Nicolas,
> > >
> > > I wanted to open a discussion proposing new functionality to allow disabling of the watchdog timer upon entering
> > > suspend in the SAMA5D2/4.
> > >
> > > Typical use case of a hardware watchdog timer in the kernel is a userspace application opens the watchdog timer and
> > > periodically "kicks" it. If the application hits a deadlock somewhere and is no longer able to kick it, then the watchdog
> > > intervenes and often resets the processor. Such is the case for the Atmel driver (which also allows a watchdog interrupt
> > > to be asserted in lieu of a system reset). In most use cases, upon entering a low power/suspend state, the application
> > > will no longer be able to "kick" the watchdog. If the watchdog is not disabled or kicked via another method, then it will
> > > reset the system. This is the current behavior of the Atmel driver as of today.
> > >
> > > The watchdog peripheral itself does have a "WDIDLEHLT" bit however, and this is enabled via the "atmel,idle-halt" dt
> > > property. However, this is not very useful, as it literally only makes the watchdog count when the CPU is active. This
> > > results in non-deterministic triggering of the WDT and means that if a critical application were to crash, it may be
> > > quite a long time before the WDT would ever trigger. Below is a similar statement made in the device-tree doc for this
> > > peripheral:
> > >
> > > - atmel,idle-halt: present if you want to stop the watchdog when the CPU is
> > > in idle state.
> > > CAUTION: This property should be used with care, it actually makes the
> > > watchdog not counting when the CPU is in idle state, therefore the
> > > watchdog reset time depends on mean CPU usage and will not reset at all
> > > if the CPU stop working while it is in idle state, which is probably
> > > not what you want.
> > >
> > > It seems to me, that it would be logical and useful to introduce a new property that would cause the Atmel WDT
> > > to disable on suspend and re-enable on resume. It also appears that the WDT is re-initialized anyways upon
> > > resume, so the only piece missing here would really be a dt flag and a call to disable.
> > >
> > Wondering - why would this need a dt property ? That would be quite unusual. Is
> > there a condition where one would _not_ want the watchdog to stop on suspend ?
> >
>
> There are customers that protects suspend/resume using the watchdog.
> They wake up their platform every 15s to ping the watchdog.
>

Interesting use case.

> Also, I don't see why the application deciding to go to suspend wouldn't
> be able to disable the watchdog before do so if this is the wanted policy.
>

Many watchdog drivers already implement suspend/resume support. Such a
platform specific functionality seems to be quite undesirable to me.

Besides (and pretty much all watchdog drivers implementing suspend/resume
do that wrong), you'd likely want to disable the watchog late during
suspend and early during resume to reduce the risk of a hang. I don't
think you can do that from userspace.

Thanks,
Guenter

2019-06-10 20:54:45

by Ken Sloat

[permalink] [raw]
Subject: RE: [RFE]: watchdog: atmel: atmel-sama5d4-wdt

> -----Original Message-----
> From: Guenter Roeck <[email protected]> On Behalf Of Guenter Roeck
> Sent: Monday, June 10, 2019 4:29 PM
> To: Alexandre Belloni <[email protected]>
> Cc: Ken Sloat <[email protected]>; [email protected];
> [email protected]; [email protected]; linux-
> [email protected]; [email protected]
> Subject: Re: [RFE]: watchdog: atmel: atmel-sama5d4-wdt
>
> [This is an EXTERNAL EMAIL]
> ________________________________
>
> On Mon, Jun 10, 2019 at 10:13:01PM +0200, Alexandre Belloni wrote:
> > Hello,
> >
> > On 10/06/2019 09:28:11-0700, Guenter Roeck wrote:
> > > On Mon, Jun 10, 2019 at 03:51:52PM +0000, Ken Sloat wrote:
> > > > Hello Nicolas,
> > > >
> > > > I wanted to open a discussion proposing new functionality to allow
> > > > disabling of the watchdog timer upon entering suspend in the
> SAMA5D2/4.
> > > >
> > > > Typical use case of a hardware watchdog timer in the kernel is a
> > > > userspace application opens the watchdog timer and periodically
> > > > "kicks" it. If the application hits a deadlock somewhere and is no
> > > > longer able to kick it, then the watchdog intervenes and often
> > > > resets the processor. Such is the case for the Atmel driver (which
> > > > also allows a watchdog interrupt to be asserted in lieu of a system
> reset). In most use cases, upon entering a low power/suspend state, the
> application will no longer be able to "kick" the watchdog. If the watchdog is
> not disabled or kicked via another method, then it will reset the system. This
> is the current behavior of the Atmel driver as of today.
> > > >
> > > > The watchdog peripheral itself does have a "WDIDLEHLT" bit
> > > > however, and this is enabled via the "atmel,idle-halt" dt
> > > > property. However, this is not very useful, as it literally only
> > > > makes the watchdog count when the CPU is active. This results in
> > > > non-deterministic triggering of the WDT and means that if a
> > > > critical application were to crash, it may be quite a long time
> > > > before the WDT would ever trigger. Below is a similar statement
> > > > made in the device-tree doc for this
> > > > peripheral:
> > > >
> > > > - atmel,idle-halt: present if you want to stop the watchdog when the
> CPU is
> > > > in idle state.
> > > > CAUTION: This property should be used with care, it actually makes
> the
> > > > watchdog not counting when the CPU is in idle state, therefore the
> > > > watchdog reset time depends on mean CPU usage and will not reset at
> all
> > > > if the CPU stop working while it is in idle state, which is probably
> > > > not what you want.
> > > >
> > > > It seems to me, that it would be logical and useful to introduce a
> > > > new property that would cause the Atmel WDT to disable on suspend
> > > > and re-enable on resume. It also appears that the WDT is re-initialized
> anyways upon resume, so the only piece missing here would really be a dt
> flag and a call to disable.
> > > >
> > > Wondering - why would this need a dt property ? That would be quite
> > > unusual. Is there a condition where one would _not_ want the watchdog
> to stop on suspend ?
> > >
> >
> > There are customers that protects suspend/resume using the watchdog.
> > They wake up their platform every 15s to ping the watchdog.
> >
>
> Interesting use case.
>
> > Also, I don't see why the application deciding to go to suspend
> > wouldn't be able to disable the watchdog before do so if this is the wanted
> policy.
> >
>
> Many watchdog drivers already implement suspend/resume support. Such a
> platform specific functionality seems to be quite undesirable to me.
>
> Besides (and pretty much all watchdog drivers implementing
> suspend/resume do that wrong), you'd likely want to disable the watchog
> late during suspend and early during resume to reduce the risk of a hang. I
> don't think you can do that from userspace.
>

Agreed, there's also the risk if using something like wake_count where the
suspend process can be interrupted by the kernel. This just makes for more
cases that the application has to try and handle, and what happens if the
application hangs/crashes some time between disabling the wdt and suspend?

Right now, what probably is a very common case of devices that enter low power
mode for longer than 16 seconds are precluded from using this WDT.

> Thanks,
> Guenter

Thanks,
Ken