2020-06-24 11:49:10

by Tero Kristo

Subject: [PATCH 2/2] watchdog: rti: tweak min_hw_heartbeat_ms to match initial allowed window

If the RTI watchdog has already been started by someone (like the
bootloader) when the driver probes, we must adjust the initial ping timeout
to match the currently running watchdog window to avoid generating a
watchdog reset.

Signed-off-by: Tero Kristo <[email protected]>
---
drivers/watchdog/rti_wdt.c | 25 +++++++++++++++++++++++++
1 file changed, 25 insertions(+)

diff --git a/drivers/watchdog/rti_wdt.c b/drivers/watchdog/rti_wdt.c
index d456dd72d99a..02ea2b2435f5 100644
--- a/drivers/watchdog/rti_wdt.c
+++ b/drivers/watchdog/rti_wdt.c
@@ -55,11 +55,13 @@ static int heartbeat;
* @base - base io address of WD device
* @freq - source clock frequency of WDT
* @wdd - hold watchdog device as is in WDT core
+ * @min_hw_heartbeat_save - save of the min hw heartbeat value
*/
struct rti_wdt_device {
void __iomem *base;
unsigned long freq;
struct watchdog_device wdd;
+ unsigned int min_hw_heartbeat_save;
};

static int rti_wdt_start(struct watchdog_device *wdd)
@@ -107,6 +109,11 @@ static int rti_wdt_ping(struct watchdog_device *wdd)
/* put watchdog in active state */
writel_relaxed(WDKEY_SEQ1, wdt->base + RTIWDKEY);

+ if (wdt->min_hw_heartbeat_save) {
+ wdd->min_hw_heartbeat_ms = wdt->min_hw_heartbeat_save;
+ wdt->min_hw_heartbeat_save = 0;
+ }
+
return 0;
}

@@ -201,6 +208,24 @@ static int rti_wdt_probe(struct platform_device *pdev)
goto err_iomap;
}

+ if (readl(wdt->base + RTIDWDCTRL) == WDENABLE_KEY) {
+ u32 time_left;
+ u32 heartbeat;
+
+ set_bit(WDOG_HW_RUNNING, &wdd->status);
+ time_left = rti_wdt_get_timeleft(wdd);
+ heartbeat = readl(wdt->base + RTIDWDPRLD);
+ heartbeat <<= WDT_PRELOAD_SHIFT;
+ heartbeat /= wdt->freq;
+ if (time_left < heartbeat / 2)
+ wdd->min_hw_heartbeat_ms = 0;
+ else
+ wdd->min_hw_heartbeat_ms =
+ (time_left - heartbeat / 2 + 1) * 1000;
+
+ wdt->min_hw_heartbeat_save = 11 * heartbeat * 1000 / 20;
+ }
+
ret = watchdog_register_device(wdd);
if (ret) {
dev_err(dev, "cannot register watchdog device\n");
--
2.17.1

--
Texas Instruments Finland Oy, Porkkalankatu 22, 00180 Helsinki. Y-tunnus/Business ID: 0615521-4. Kotipaikka/Domicile: Helsinki


2020-06-24 15:26:43

by Jan Kiszka

Subject: Re: [PATCH 2/2] watchdog: rti: tweak min_hw_heartbeat_ms to match initial allowed window

On 24.06.20 13:45, Tero Kristo wrote:
> If the RTI watchdog has been started by someone (like bootloader) when
> the driver probes, we must adjust the initial ping timeout to match the
> currently running watchdog window to avoid generating watchdog reset.
>
> Signed-off-by: Tero Kristo <[email protected]>
> ---
> drivers/watchdog/rti_wdt.c | 25 +++++++++++++++++++++++++
> 1 file changed, 25 insertions(+)
>
> diff --git a/drivers/watchdog/rti_wdt.c b/drivers/watchdog/rti_wdt.c
> index d456dd72d99a..02ea2b2435f5 100644
> --- a/drivers/watchdog/rti_wdt.c
> +++ b/drivers/watchdog/rti_wdt.c
> @@ -55,11 +55,13 @@ static int heartbeat;
> * @base - base io address of WD device
> * @freq - source clock frequency of WDT
> * @wdd - hold watchdog device as is in WDT core
> + * @min_hw_heartbeat_save - save of the min hw heartbeat value
> */
> struct rti_wdt_device {
> void __iomem *base;
> unsigned long freq;
> struct watchdog_device wdd;
> + unsigned int min_hw_heartbeat_save;
> };
>
> static int rti_wdt_start(struct watchdog_device *wdd)
> @@ -107,6 +109,11 @@ static int rti_wdt_ping(struct watchdog_device *wdd)
> /* put watchdog in active state */
> writel_relaxed(WDKEY_SEQ1, wdt->base + RTIWDKEY);
>
> + if (wdt->min_hw_heartbeat_save) {
> + wdd->min_hw_heartbeat_ms = wdt->min_hw_heartbeat_save;
> + wdt->min_hw_heartbeat_save = 0;
> + }
> +
> return 0;
> }
>
> @@ -201,6 +208,24 @@ static int rti_wdt_probe(struct platform_device *pdev)
> goto err_iomap;
> }
>
> + if (readl(wdt->base + RTIDWDCTRL) == WDENABLE_KEY) {
> + u32 time_left;
> + u32 heartbeat;
> +
> + set_bit(WDOG_HW_RUNNING, &wdd->status);
> + time_left = rti_wdt_get_timeleft(wdd);
> + heartbeat = readl(wdt->base + RTIDWDPRLD);
> + heartbeat <<= WDT_PRELOAD_SHIFT;
> + heartbeat /= wdt->freq;
> + if (time_left < heartbeat / 2)
> + wdd->min_hw_heartbeat_ms = 0;
> + else
> + wdd->min_hw_heartbeat_ms =
> + (time_left - heartbeat / 2 + 1) * 1000;
> +
> + wdt->min_hw_heartbeat_save = 11 * heartbeat * 1000 / 20;
> + }
> +
> ret = watchdog_register_device(wdd);
> if (ret) {
> dev_err(dev, "cannot register watchdog device\n");
>

This assumes that the bootloader also programmed a 50% window, right?
The pending U-Boot patch will do that, but what if that changes or
someone uses a different setup?
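
As a rough illustration of the arithmetic in the probe hunk above, the
following standalone sketch reproduces the two calculations under the 50%
window assumption being questioned here. The function names are
hypothetical and exist only for this example; values are in whole seconds,
as in the patch.

```c
#include <stdint.h>

/*
 * Initial minimum delay (ms) before the first ping, assuming a 50% window:
 * pings are only accepted once less than half of the period remains.
 * Mirrors the probe-time calculation in the patch; the name is hypothetical.
 */
static uint32_t initial_min_hw_heartbeat_ms(uint32_t time_left_s,
					    uint32_t heartbeat_s)
{
	/* Already inside the open window: a ping is allowed right away. */
	if (time_left_s < heartbeat_s / 2)
		return 0;

	/* Wait out the rest of the closed half, plus one second of margin. */
	return (time_left_s - heartbeat_s / 2 + 1) * 1000;
}

/* Steady-state value restored after the first ping: 55% of the period. */
static uint32_t steady_min_hw_heartbeat_ms(uint32_t heartbeat_s)
{
	return 11 * heartbeat_s * 1000 / 20;
}
```

With a 60 s period and 50 s left, the first ping would be held back for
21 s; once that ping has happened, the spacing falls back to 33 s.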

Jan

--
Siemens AG, Corporate Technology, CT RDA IOT SES-DE
Corporate Competence Center Embedded Linux

2020-06-25 09:43:55

by Tero Kristo

Subject: Re: [PATCH 2/2] watchdog: rti: tweak min_hw_heartbeat_ms to match initial allowed window

On 24/06/2020 18:24, Jan Kiszka wrote:
> On 24.06.20 13:45, Tero Kristo wrote:
>> If the RTI watchdog has been started by someone (like bootloader) when
>> the driver probes, we must adjust the initial ping timeout to match the
>> currently running watchdog window to avoid generating watchdog reset.
>>
>> Signed-off-by: Tero Kristo <[email protected]>
>> ---
>>   drivers/watchdog/rti_wdt.c | 25 +++++++++++++++++++++++++
>>   1 file changed, 25 insertions(+)
>>
>> diff --git a/drivers/watchdog/rti_wdt.c b/drivers/watchdog/rti_wdt.c
>> index d456dd72d99a..02ea2b2435f5 100644
>> --- a/drivers/watchdog/rti_wdt.c
>> +++ b/drivers/watchdog/rti_wdt.c
>> @@ -55,11 +55,13 @@ static int heartbeat;
>>    * @base - base io address of WD device
>>    * @freq - source clock frequency of WDT
>>    * @wdd  - hold watchdog device as is in WDT core
>> + * @min_hw_heartbeat_save - save of the min hw heartbeat value
>>    */
>>   struct rti_wdt_device {
>>       void __iomem        *base;
>>       unsigned long        freq;
>>       struct watchdog_device    wdd;
>> +    unsigned int        min_hw_heartbeat_save;
>>   };
>>   static int rti_wdt_start(struct watchdog_device *wdd)
>> @@ -107,6 +109,11 @@ static int rti_wdt_ping(struct watchdog_device *wdd)
>>       /* put watchdog in active state */
>>       writel_relaxed(WDKEY_SEQ1, wdt->base + RTIWDKEY);
>> +    if (wdt->min_hw_heartbeat_save) {
>> +        wdd->min_hw_heartbeat_ms = wdt->min_hw_heartbeat_save;
>> +        wdt->min_hw_heartbeat_save = 0;
>> +    }
>> +
>>       return 0;
>>   }
>> @@ -201,6 +208,24 @@ static int rti_wdt_probe(struct platform_device
>> *pdev)
>>           goto err_iomap;
>>       }
>> +    if (readl(wdt->base + RTIDWDCTRL) == WDENABLE_KEY) {
>> +        u32 time_left;
>> +        u32 heartbeat;
>> +
>> +        set_bit(WDOG_HW_RUNNING, &wdd->status);
>> +        time_left = rti_wdt_get_timeleft(wdd);
>> +        heartbeat = readl(wdt->base + RTIDWDPRLD);
>> +        heartbeat <<= WDT_PRELOAD_SHIFT;
>> +        heartbeat /= wdt->freq;
>> +        if (time_left < heartbeat / 2)
>> +            wdd->min_hw_heartbeat_ms = 0;
>> +        else
>> +            wdd->min_hw_heartbeat_ms =
>> +                (time_left - heartbeat / 2 + 1) * 1000;
>> +
>> +        wdt->min_hw_heartbeat_save = 11 * heartbeat * 1000 / 20;
>> +    }
>> +
>>       ret = watchdog_register_device(wdd);
>>       if (ret) {
>>           dev_err(dev, "cannot register watchdog device\n");
>>
>
> This assumes that the bootloader also programmed a 50% window, right?
> The pending U-Boot patch will do that, but what if that changes or
> someone uses a different setup?

Yes, we assume 50%. I think based on the hw design, 50% is the only sane
value to be used, otherwise you just shrink the open window too much and
for no apparent reason.

-Tero

2020-06-25 10:10:36

by Jan Kiszka

Subject: Re: [PATCH 2/2] watchdog: rti: tweak min_hw_heartbeat_ms to match initial allowed window

On 25.06.20 10:32, Tero Kristo wrote:
> On 24/06/2020 18:24, Jan Kiszka wrote:
>> On 24.06.20 13:45, Tero Kristo wrote:
>>> If the RTI watchdog has been started by someone (like bootloader) when
>>> the driver probes, we must adjust the initial ping timeout to match the
>>> currently running watchdog window to avoid generating watchdog reset.
>>>
>>> Signed-off-by: Tero Kristo <[email protected]>
>>> ---
>>>   drivers/watchdog/rti_wdt.c | 25 +++++++++++++++++++++++++
>>>   1 file changed, 25 insertions(+)
>>>
>>> diff --git a/drivers/watchdog/rti_wdt.c b/drivers/watchdog/rti_wdt.c
>>> index d456dd72d99a..02ea2b2435f5 100644
>>> --- a/drivers/watchdog/rti_wdt.c
>>> +++ b/drivers/watchdog/rti_wdt.c
>>> @@ -55,11 +55,13 @@ static int heartbeat;
>>>    * @base - base io address of WD device
>>>    * @freq - source clock frequency of WDT
>>>    * @wdd  - hold watchdog device as is in WDT core
>>> + * @min_hw_heartbeat_save - save of the min hw heartbeat value
>>>    */
>>>   struct rti_wdt_device {
>>>       void __iomem        *base;
>>>       unsigned long        freq;
>>>       struct watchdog_device    wdd;
>>> +    unsigned int        min_hw_heartbeat_save;
>>>   };
>>>   static int rti_wdt_start(struct watchdog_device *wdd)
>>> @@ -107,6 +109,11 @@ static int rti_wdt_ping(struct watchdog_device
>>> *wdd)
>>>       /* put watchdog in active state */
>>>       writel_relaxed(WDKEY_SEQ1, wdt->base + RTIWDKEY);
>>> +    if (wdt->min_hw_heartbeat_save) {
>>> +        wdd->min_hw_heartbeat_ms = wdt->min_hw_heartbeat_save;
>>> +        wdt->min_hw_heartbeat_save = 0;
>>> +    }
>>> +
>>>       return 0;
>>>   }
>>> @@ -201,6 +208,24 @@ static int rti_wdt_probe(struct platform_device
>>> *pdev)
>>>           goto err_iomap;
>>>       }
>>> +    if (readl(wdt->base + RTIDWDCTRL) == WDENABLE_KEY) {
>>> +        u32 time_left;
>>> +        u32 heartbeat;
>>> +
>>> +        set_bit(WDOG_HW_RUNNING, &wdd->status);
>>> +        time_left = rti_wdt_get_timeleft(wdd);
>>> +        heartbeat = readl(wdt->base + RTIDWDPRLD);
>>> +        heartbeat <<= WDT_PRELOAD_SHIFT;
>>> +        heartbeat /= wdt->freq;
>>> +        if (time_left < heartbeat / 2)
>>> +            wdd->min_hw_heartbeat_ms = 0;
>>> +        else
>>> +            wdd->min_hw_heartbeat_ms =
>>> +                (time_left - heartbeat / 2 + 1) * 1000;
>>> +
>>> +        wdt->min_hw_heartbeat_save = 11 * heartbeat * 1000 / 20;
>>> +    }
>>> +
>>>       ret = watchdog_register_device(wdd);
>>>       if (ret) {
>>>           dev_err(dev, "cannot register watchdog device\n");
>>>
>>
>> This assumes that the bootloader also programmed a 50% window, right?
>> The pending U-Boot patch will do that, but what if that changes or
>> someone uses a different setup?
>
> Yes, we assume 50%. I think based on the hw design, 50% is the only sane
> value to be used, otherwise you just shrink the open window too much and
> for no apparent reason.

Fine with me, but should we check that assumption when adopting the
watchdog?
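
One way such a check might look is sketched below. The window-size
encodings are assumptions made for illustration and would need to be
verified against the device's reference manual; the constant and function
names are hypothetical and are not taken from the driver.

```c
#include <stdint.h>

/* Hypothetical window-size encodings; verify against the hardware manual. */
#define EX_WWDSIZE_100P	0x05u
#define EX_WWDSIZE_50P	0x50u
#define EX_WWDSIZE_25P	0x500u

/*
 * Return the open window as a percentage of the full period,
 * or -1 if the encoding is not recognised.
 */
static int open_window_percent(uint32_t wwdsize)
{
	switch (wwdsize) {
	case EX_WWDSIZE_100P:
		return 100;
	case EX_WWDSIZE_50P:
		return 50;
	case EX_WWDSIZE_25P:
		return 25;
	default:
		return -1;
	}
}

/* Adopt a running watchdog only if it uses the 50% window the driver assumes. */
static int can_adopt_running_watchdog(uint32_t wwdsize)
{
	return open_window_percent(wwdsize) == 50;
}
```

A driver could either refuse to adopt the watchdog when this returns 0, or
use the decoded percentage in place of the hard-coded factor of two.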

Jan


2020-06-25 13:37:45

by Guenter Roeck

Subject: Re: [PATCH 2/2] watchdog: rti: tweak min_hw_heartbeat_ms to match initial allowed window

On 6/25/20 1:32 AM, Tero Kristo wrote:
> On 24/06/2020 18:24, Jan Kiszka wrote:
>> On 24.06.20 13:45, Tero Kristo wrote:
>>> If the RTI watchdog has been started by someone (like bootloader) when
>>> the driver probes, we must adjust the initial ping timeout to match the
>>> currently running watchdog window to avoid generating watchdog reset.
>>>
>>> Signed-off-by: Tero Kristo <[email protected]>
>>> ---
>>>   drivers/watchdog/rti_wdt.c | 25 +++++++++++++++++++++++++
>>>   1 file changed, 25 insertions(+)
>>>
>>> diff --git a/drivers/watchdog/rti_wdt.c b/drivers/watchdog/rti_wdt.c
>>> index d456dd72d99a..02ea2b2435f5 100644
>>> --- a/drivers/watchdog/rti_wdt.c
>>> +++ b/drivers/watchdog/rti_wdt.c
>>> @@ -55,11 +55,13 @@ static int heartbeat;
>>>    * @base - base io address of WD device
>>>    * @freq - source clock frequency of WDT
>>>    * @wdd  - hold watchdog device as is in WDT core
>>> + * @min_hw_heartbeat_save - save of the min hw heartbeat value
>>>    */
>>>   struct rti_wdt_device {
>>>       void __iomem        *base;
>>>       unsigned long        freq;
>>>       struct watchdog_device    wdd;
>>> +    unsigned int        min_hw_heartbeat_save;
>>>   };
>>>   static int rti_wdt_start(struct watchdog_device *wdd)
>>> @@ -107,6 +109,11 @@ static int rti_wdt_ping(struct watchdog_device *wdd)
>>>       /* put watchdog in active state */
>>>       writel_relaxed(WDKEY_SEQ1, wdt->base + RTIWDKEY);
>>> +    if (wdt->min_hw_heartbeat_save) {
>>> +        wdd->min_hw_heartbeat_ms = wdt->min_hw_heartbeat_save;
>>> +        wdt->min_hw_heartbeat_save = 0;
>>> +    }
>>> +
>>>       return 0;
>>>   }
>>> @@ -201,6 +208,24 @@ static int rti_wdt_probe(struct platform_device *pdev)
>>>           goto err_iomap;
>>>       }
>>> +    if (readl(wdt->base + RTIDWDCTRL) == WDENABLE_KEY) {
>>> +        u32 time_left;
>>> +        u32 heartbeat;
>>> +
>>> +        set_bit(WDOG_HW_RUNNING, &wdd->status);
>>> +        time_left = rti_wdt_get_timeleft(wdd);
>>> +        heartbeat = readl(wdt->base + RTIDWDPRLD);
>>> +        heartbeat <<= WDT_PRELOAD_SHIFT;
>>> +        heartbeat /= wdt->freq;
>>> +        if (time_left < heartbeat / 2)
>>> +            wdd->min_hw_heartbeat_ms = 0;
>>> +        else
>>> +            wdd->min_hw_heartbeat_ms =
>>> +                (time_left - heartbeat / 2 + 1) * 1000;
>>> +
>>> +        wdt->min_hw_heartbeat_save = 11 * heartbeat * 1000 / 20;
>>> +    }
>>> +
>>>       ret = watchdog_register_device(wdd);
>>>       if (ret) {
>>>           dev_err(dev, "cannot register watchdog device\n");
>>>
>>
>> This assumes that the bootloader also programmed a 50% window, right? The pending U-Boot patch will do that, but what if that changes or someone uses a different setup?
>
> Yes, we assume 50%. I think based on the hw design, 50% is the only sane value to be used, otherwise you just shrink the open window too much and for no apparent reason.
>

Not sure if that is a valid assumption. Someone who designs a watchdog
with such a narrow ping window might as well also use it. The question
is if you want to rely on that assumption, or check and change it if needed.

Also, I wonder if we should add an API function such as
"set_last_hw_keepalive()" to avoid all that complexity.
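
To make the idea concrete, here is a small userspace model of what such a
helper could do. Only the name set_last_hw_keepalive() comes from this
thread; the struct, the signature and the millisecond clock are assumptions
made for the sketch, not the watchdog core's actual interface.

```c
#include <stdint.h>

/* Toy stand-in for the core's per-device bookkeeping (hypothetical). */
struct wd_model {
	uint64_t last_hw_keepalive_ms;	/* time of the most recent hardware ping */
	uint32_t min_hw_heartbeat_ms;	/* minimum spacing between pings */
};

/*
 * Tell the model that the hardware was last pinged 'elapsed_ms' ago,
 * instead of letting it assume "just now".
 * Assumes now_ms >= elapsed_ms.
 */
static void set_last_hw_keepalive(struct wd_model *wd, uint64_t now_ms,
				  uint64_t elapsed_ms)
{
	wd->last_hw_keepalive_ms = now_ms - elapsed_ms;
}

/* How long the next ping must still be delayed; 0 means "ping now". */
static uint64_t ping_delay_ms(const struct wd_model *wd, uint64_t now_ms)
{
	uint64_t earliest = wd->last_hw_keepalive_ms + wd->min_hw_heartbeat_ms;

	return earliest > now_ms ? earliest - now_ms : 0;
}
```

The driver would then only report when the hardware was last pinged and
leave min_hw_heartbeat_ms untouched, rather than swapping that field around
the first ping.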

Thanks,
Guenter

2020-06-25 18:26:03

by Tero Kristo

Subject: Re: [PATCH 2/2] watchdog: rti: tweak min_hw_heartbeat_ms to match initial allowed window

On 25/06/2020 16:35, Guenter Roeck wrote:
> On 6/25/20 1:32 AM, Tero Kristo wrote:
>> On 24/06/2020 18:24, Jan Kiszka wrote:
>>> On 24.06.20 13:45, Tero Kristo wrote:
>>>> If the RTI watchdog has been started by someone (like bootloader) when
>>>> the driver probes, we must adjust the initial ping timeout to match the
>>>> currently running watchdog window to avoid generating watchdog reset.
>>>>
>>>> Signed-off-by: Tero Kristo <[email protected]>
>>>> ---
>>>>   drivers/watchdog/rti_wdt.c | 25 +++++++++++++++++++++++++
>>>>   1 file changed, 25 insertions(+)
>>>>
>>>> diff --git a/drivers/watchdog/rti_wdt.c b/drivers/watchdog/rti_wdt.c
>>>> index d456dd72d99a..02ea2b2435f5 100644
>>>> --- a/drivers/watchdog/rti_wdt.c
>>>> +++ b/drivers/watchdog/rti_wdt.c
>>>> @@ -55,11 +55,13 @@ static int heartbeat;
>>>>    * @base - base io address of WD device
>>>>    * @freq - source clock frequency of WDT
>>>>    * @wdd  - hold watchdog device as is in WDT core
>>>> + * @min_hw_heartbeat_save - save of the min hw heartbeat value
>>>>    */
>>>>   struct rti_wdt_device {
>>>>       void __iomem        *base;
>>>>       unsigned long        freq;
>>>>       struct watchdog_device    wdd;
>>>> +    unsigned int        min_hw_heartbeat_save;
>>>>   };
>>>>   static int rti_wdt_start(struct watchdog_device *wdd)
>>>> @@ -107,6 +109,11 @@ static int rti_wdt_ping(struct watchdog_device *wdd)
>>>>       /* put watchdog in active state */
>>>>       writel_relaxed(WDKEY_SEQ1, wdt->base + RTIWDKEY);
>>>> +    if (wdt->min_hw_heartbeat_save) {
>>>> +        wdd->min_hw_heartbeat_ms = wdt->min_hw_heartbeat_save;
>>>> +        wdt->min_hw_heartbeat_save = 0;
>>>> +    }
>>>> +
>>>>       return 0;
>>>>   }
>>>> @@ -201,6 +208,24 @@ static int rti_wdt_probe(struct platform_device *pdev)
>>>>           goto err_iomap;
>>>>       }
>>>> +    if (readl(wdt->base + RTIDWDCTRL) == WDENABLE_KEY) {
>>>> +        u32 time_left;
>>>> +        u32 heartbeat;
>>>> +
>>>> +        set_bit(WDOG_HW_RUNNING, &wdd->status);
>>>> +        time_left = rti_wdt_get_timeleft(wdd);
>>>> +        heartbeat = readl(wdt->base + RTIDWDPRLD);
>>>> +        heartbeat <<= WDT_PRELOAD_SHIFT;
>>>> +        heartbeat /= wdt->freq;
>>>> +        if (time_left < heartbeat / 2)
>>>> +            wdd->min_hw_heartbeat_ms = 0;
>>>> +        else
>>>> +            wdd->min_hw_heartbeat_ms =
>>>> +                (time_left - heartbeat / 2 + 1) * 1000;
>>>> +
>>>> +        wdt->min_hw_heartbeat_save = 11 * heartbeat * 1000 / 20;
>>>> +    }
>>>> +
>>>>       ret = watchdog_register_device(wdd);
>>>>       if (ret) {
>>>>           dev_err(dev, "cannot register watchdog device\n");
>>>>
>>>
>>> This assumes that the bootloader also programmed a 50% window, right? The pending U-Boot patch will do that, but what if that changes or someone uses a different setup?
>>
>> Yes, we assume 50%. I think based on the hw design, 50% is the only sane value to be used, otherwise you just shrink the open window too much and for no apparent reason.
>>
>
> Not sure if that is a valid assumption. Someone who designs a watchdog
> with such a narrow ping window might as well also use it. The question
> is if you want to rely on that assumption, or check and change it if needed.

Right, if that is a blocker, I can modify the code. It should be maybe a
couple of lines of addition.

> Also, I wonder if we should add an API function such as
> "set_last_hw_keepalive()" to avoid all that complexity.

I can try adding that also if it is desirable.

-Tero

2020-06-30 21:18:40

by Guenter Roeck

Subject: Re: [PATCH 2/2] watchdog: rti: tweak min_hw_heartbeat_ms to match initial allowed window

On Thu, Jun 25, 2020 at 08:04:50PM +0300, Tero Kristo wrote:
> On 25/06/2020 16:35, Guenter Roeck wrote:
> > On 6/25/20 1:32 AM, Tero Kristo wrote:
> > > On 24/06/2020 18:24, Jan Kiszka wrote:
> > > > On 24.06.20 13:45, Tero Kristo wrote:
> > > > > If the RTI watchdog has been started by someone (like bootloader) when
> > > > > the driver probes, we must adjust the initial ping timeout to match the
> > > > > currently running watchdog window to avoid generating watchdog reset.
> > > > >
> > > > > Signed-off-by: Tero Kristo <[email protected]>
> > > > > ---
> > > > >   drivers/watchdog/rti_wdt.c | 25 +++++++++++++++++++++++++
> > > > >   1 file changed, 25 insertions(+)
> > > > >
> > > > > diff --git a/drivers/watchdog/rti_wdt.c b/drivers/watchdog/rti_wdt.c
> > > > > index d456dd72d99a..02ea2b2435f5 100644
> > > > > --- a/drivers/watchdog/rti_wdt.c
> > > > > +++ b/drivers/watchdog/rti_wdt.c
> > > > > @@ -55,11 +55,13 @@ static int heartbeat;
> > > > >    * @base - base io address of WD device
> > > > >    * @freq - source clock frequency of WDT
> > > > >    * @wdd  - hold watchdog device as is in WDT core
> > > > > + * @min_hw_heartbeat_save - save of the min hw heartbeat value
> > > > >    */
> > > > >   struct rti_wdt_device {
> > > > >       void __iomem        *base;
> > > > >       unsigned long        freq;
> > > > >       struct watchdog_device    wdd;
> > > > > +    unsigned int        min_hw_heartbeat_save;
> > > > >   };
> > > > >   static int rti_wdt_start(struct watchdog_device *wdd)
> > > > > @@ -107,6 +109,11 @@ static int rti_wdt_ping(struct watchdog_device *wdd)
> > > > >       /* put watchdog in active state */
> > > > >       writel_relaxed(WDKEY_SEQ1, wdt->base + RTIWDKEY);
> > > > > +    if (wdt->min_hw_heartbeat_save) {
> > > > > +        wdd->min_hw_heartbeat_ms = wdt->min_hw_heartbeat_save;
> > > > > +        wdt->min_hw_heartbeat_save = 0;
> > > > > +    }
> > > > > +
> > > > >       return 0;
> > > > >   }
> > > > > @@ -201,6 +208,24 @@ static int rti_wdt_probe(struct platform_device *pdev)
> > > > >           goto err_iomap;
> > > > >       }
> > > > > +    if (readl(wdt->base + RTIDWDCTRL) == WDENABLE_KEY) {
> > > > > +        u32 time_left;
> > > > > +        u32 heartbeat;
> > > > > +
> > > > > +        set_bit(WDOG_HW_RUNNING, &wdd->status);
> > > > > +        time_left = rti_wdt_get_timeleft(wdd);
> > > > > +        heartbeat = readl(wdt->base + RTIDWDPRLD);
> > > > > +        heartbeat <<= WDT_PRELOAD_SHIFT;
> > > > > +        heartbeat /= wdt->freq;
> > > > > +        if (time_left < heartbeat / 2)
> > > > > +            wdd->min_hw_heartbeat_ms = 0;
> > > > > +        else
> > > > > +            wdd->min_hw_heartbeat_ms =
> > > > > +                (time_left - heartbeat / 2 + 1) * 1000;
> > > > > +
> > > > > +        wdt->min_hw_heartbeat_save = 11 * heartbeat * 1000 / 20;
> > > > > +    }
> > > > > +
> > > > >       ret = watchdog_register_device(wdd);
> > > > >       if (ret) {
> > > > >           dev_err(dev, "cannot register watchdog device\n");
> > > > >
> > > >
> > > > This assumes that the bootloader also programmed a 50% window, right? The pending U-Boot patch will do that, but what if that changes or someone uses a different setup?
> > >
> > > Yes, we assume 50%. I think based on the hw design, 50% is the only sane value to be used, otherwise you just shrink the open window too much and for no apparent reason.
> > >
> >
> > Not sure if that is a valid assumption. Someone who designs a watchdog
> > with such a narrow ping window might as well also use it. The question
> > is if you want to rely on that assumption, or check and change it if needed.
>
> Right, if that is a blocker, I can modify the code. It should be maybe a
> couple of lines of addition.
>
> > Also, I wonder if we should add an API function such as
> > "set_last_hw_keepalive()" to avoid all that complexity.
>
> I can try adding that also if it is desirable.
>

But wait, the code doesn't really match what the description of this
patch claims, or at least the description is misleading. Per the
description, this is to prevent an early timeout. However, the problem
here is that the watchdog core does not generate a ping, even if
requested, because it believes that it just generated one right before
the watchdog timer was registered, and that it can not generate another
one because min_hw_heartbeat_ms has not elapsed.
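
The failure mode described above can be modelled in a few lines. This is a
simplified sketch, not the core's code: all times are in milliseconds
relative to the moment of registration, and the function name is
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * The core treats registration as the most recent keepalive, so the first
 * ping is held back for min_hw_heartbeat_ms. The hardware, started earlier
 * by the bootloader, expires after hw_time_left_ms. If the hold-off is not
 * shorter than the remaining time, the reset fires before a ping is sent.
 */
static bool first_ping_comes_too_late(uint32_t min_hw_heartbeat_ms,
				      uint32_t hw_time_left_ms)
{
	return min_hw_heartbeat_ms >= hw_time_left_ms;
}
```

For example, a 33 s hold-off with only 20 s left on the hardware timer
leads to a reset, while a hold-off of 0 does not.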

With that in mind, the problem is a bit more complex.

First, the driver doesn't really update the current timeout to the
value that is currently configured and enabled. Instead, it just
uses/assumes the default (DEFAULT_HEARTBEAT or whatever the heartbeat
module parameter is set to). This means that it is still possible for
an early timeout to occur if there is a mismatch between the bootloader
timeout and the timeout assumed by the driver. Worse, the timeout
is only updated in the start function - and the start function isn't
called if the watchdog is already running. Actually, the driver does
not support updating the timeout at all. This means that a mismatch
between the bootloader timeout and the timeout assumed by the driver
is not handled well.

To solve this, the driver would have to update the actual timeout to
whatever is programmed into the chip and ignore any module parameter
and default settings if the watchdog is already running. Alternatively,
it would have to support updating the timeout (if the hardware supports
that) after the watchdog was started.
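
A minimal sketch of the first option follows, using the same conversion as
the probe hunk in the patch. The shift value of 13 is an assumption (the
thread does not state it), and both function names are hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed value for this sketch; the thread does not state it. */
#define EX_WDT_PRELOAD_SHIFT	13

/* Convert a preload register value to a timeout in whole seconds. */
static uint32_t preload_to_seconds(uint32_t preload, uint32_t freq_hz)
{
	/* Widen first so the shift cannot overflow 32 bits. */
	uint64_t ticks = (uint64_t)preload << EX_WDT_PRELOAD_SHIFT;

	return (uint32_t)(ticks / freq_hz);
}

/*
 * Pick the timeout to report: if the hardware is already running, its
 * programmed value wins over the module parameter or the default.
 */
static uint32_t effective_timeout_s(bool hw_running, uint32_t hw_timeout_s,
				    uint32_t requested_s)
{
	return hw_running ? hw_timeout_s : requested_s;
}
```

Under these assumptions, a preload of 240 with a 32768 Hz clock corresponds
to a 60 s timeout.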

Second, handling min_hw_heartbeat_ms properly should really be implemented
in the watchdog core. Instead of assuming that the most recent keepalive
happened "just before now", as it currently does, it should call the
timeleft function (if available and if the watchdog is running) and
calculate the most recent keepalive (and thus the earliest acceptable
next keepalive) from its return value.
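
The calculation could look roughly like this. It is a userspace sketch with
hypothetical function names, working in milliseconds, and is not the
watchdog core's implementation.

```c
#include <stdint.h>

/*
 * Time since the hardware was last pinged, derived from what is left of
 * the programmed timeout. Returns 0 if timeleft is not below the timeout.
 */
static uint32_t elapsed_since_keepalive_ms(uint32_t timeout_ms,
					   uint32_t timeleft_ms)
{
	return timeleft_ms < timeout_ms ? timeout_ms - timeleft_ms : 0;
}

/* Remaining delay before the next ping is acceptable to the hardware. */
static uint32_t first_ping_delay_ms(uint32_t timeout_ms, uint32_t timeleft_ms,
				    uint32_t min_hw_heartbeat_ms)
{
	uint32_t elapsed = elapsed_since_keepalive_ms(timeout_ms, timeleft_ms);

	return elapsed < min_hw_heartbeat_ms ? min_hw_heartbeat_ms - elapsed : 0;
}
```

With a 60 s timeout, 50 s left and a 33 s minimum spacing, the first ping
would be allowed after 23 s; with only 20 s left, it would be allowed
immediately.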

Thanks,
Guenter

2020-07-01 05:53:01

by Tero Kristo

Subject: Re: [PATCH 2/2] watchdog: rti: tweak min_hw_heartbeat_ms to match initial allowed window

On 30/06/2020 23:23, Guenter Roeck wrote:
> On Thu, Jun 25, 2020 at 08:04:50PM +0300, Tero Kristo wrote:
>> On 25/06/2020 16:35, Guenter Roeck wrote:
>>> On 6/25/20 1:32 AM, Tero Kristo wrote:
>>>> On 24/06/2020 18:24, Jan Kiszka wrote:
>>>>> On 24.06.20 13:45, Tero Kristo wrote:
>>>>>> If the RTI watchdog has been started by someone (like bootloader) when
>>>>>> the driver probes, we must adjust the initial ping timeout to match the
>>>>>> currently running watchdog window to avoid generating watchdog reset.
>>>>>>
>>>>>> Signed-off-by: Tero Kristo <[email protected]>
>>>>>> ---
>>>>>>   drivers/watchdog/rti_wdt.c | 25 +++++++++++++++++++++++++
>>>>>>   1 file changed, 25 insertions(+)
>>>>>>
>>>>>> diff --git a/drivers/watchdog/rti_wdt.c b/drivers/watchdog/rti_wdt.c
>>>>>> index d456dd72d99a..02ea2b2435f5 100644
>>>>>> --- a/drivers/watchdog/rti_wdt.c
>>>>>> +++ b/drivers/watchdog/rti_wdt.c
>>>>>> @@ -55,11 +55,13 @@ static int heartbeat;
>>>>>>    * @base - base io address of WD device
>>>>>>    * @freq - source clock frequency of WDT
>>>>>>    * @wdd  - hold watchdog device as is in WDT core
>>>>>> + * @min_hw_heartbeat_save - save of the min hw heartbeat value
>>>>>>    */
>>>>>>   struct rti_wdt_device {
>>>>>>       void __iomem        *base;
>>>>>>       unsigned long        freq;
>>>>>>       struct watchdog_device    wdd;
>>>>>> +    unsigned int        min_hw_heartbeat_save;
>>>>>>   };
>>>>>>   static int rti_wdt_start(struct watchdog_device *wdd)
>>>>>> @@ -107,6 +109,11 @@ static int rti_wdt_ping(struct watchdog_device *wdd)
>>>>>>       /* put watchdog in active state */
>>>>>>       writel_relaxed(WDKEY_SEQ1, wdt->base + RTIWDKEY);
>>>>>> +    if (wdt->min_hw_heartbeat_save) {
>>>>>> +        wdd->min_hw_heartbeat_ms = wdt->min_hw_heartbeat_save;
>>>>>> +        wdt->min_hw_heartbeat_save = 0;
>>>>>> +    }
>>>>>> +
>>>>>>       return 0;
>>>>>>   }
>>>>>> @@ -201,6 +208,24 @@ static int rti_wdt_probe(struct platform_device *pdev)
>>>>>>           goto err_iomap;
>>>>>>       }
>>>>>> +    if (readl(wdt->base + RTIDWDCTRL) == WDENABLE_KEY) {
>>>>>> +        u32 time_left;
>>>>>> +        u32 heartbeat;
>>>>>> +
>>>>>> +        set_bit(WDOG_HW_RUNNING, &wdd->status);
>>>>>> +        time_left = rti_wdt_get_timeleft(wdd);
>>>>>> +        heartbeat = readl(wdt->base + RTIDWDPRLD);
>>>>>> +        heartbeat <<= WDT_PRELOAD_SHIFT;
>>>>>> +        heartbeat /= wdt->freq;
>>>>>> +        if (time_left < heartbeat / 2)
>>>>>> +            wdd->min_hw_heartbeat_ms = 0;
>>>>>> +        else
>>>>>> +            wdd->min_hw_heartbeat_ms =
>>>>>> +                (time_left - heartbeat / 2 + 1) * 1000;
>>>>>> +
>>>>>> +        wdt->min_hw_heartbeat_save = 11 * heartbeat * 1000 / 20;
>>>>>> +    }
>>>>>> +
>>>>>>       ret = watchdog_register_device(wdd);
>>>>>>       if (ret) {
>>>>>>           dev_err(dev, "cannot register watchdog device\n");
>>>>>>
>>>>>
>>>>> This assumes that the bootloader also programmed a 50% window, right? The pending U-Boot patch will do that, but what if that changes or someone uses a different setup?
>>>>
>>>> Yes, we assume 50%. I think based on the hw design, 50% is the only sane value to be used, otherwise you just shrink the open window too much and for no apparent reason.
>>>>
>>>
>>> Not sure if that is a valid assumption. Someone who designs a watchdog
>>> with such a narrow ping window might as well also use it. The question
>>> is if you want to rely on that assumption, or check and change it if needed.
>>
>> Right, if that is a blocker, I can modify the code. It should be maybe a
>> couple of lines of addition.
>>
>>> Also, I wonder if we should add an API function such as
>>> "set_last_hw_keepalive()" to avoid all that complexity.
>>
>> I can try adding that also if it is desirable.
>>
>
> But wait, the code doesn't really match what the description of this
> patch claims, or at least the description is misleading. Per the
> description, this is to prevent an early timeout. However, the problem
> here is that the watchdog core does not generate a ping, even if
> requested, because it believes that it just generated one right before
> the watchdog timer was registered, and that it can not generate another
> one because min_hw_heartbeat_ms has not elapsed.

You are right. The patch description could use some more beef.

>
> With that in mind, the problem is a bit more complex.
>
> First, the driver doesn't really update the current timeout to the
> value that is currently configured and enabled. Instead, it just
> uses/assumes the default (DEFAULT_HEARTBEAT or whatever the heartbeat
> module parameter is set to). This means that it is still possible for
> an early timeout to occur if there is a mismatch between the bootloader
> timeout and the timeout assumed by the driver. Worse, the timeout
> is only updated in the start function - and the start function isn't
> called if the watchdog is already running. Actually, the driver does
> not support updating the timeout at all. This means that a mismatch
> between the bootloader timeout and the timeout assumed by the driver
> is not handled well.
>
> To solve this, the driver would have to update the actual timeout to
> whatever is programmed into the chip and ignore any module parameter
> and default settings if the watchdog is already running. Alternatively,
> it would have to support updating the timeout (if the hardware supports
> that) after the watchdog was started.

The hardware supports changing the timeout value; however, the new value
only takes effect in the next window (the preload values are picked up
once the user pings the watchdog).

> Second, handling min_hw_heartbeat_ms properly should really be implemented
> in the watchdog core. Instead of assuming that the most recent keepalive
> happened "just before now", as it currently does, it should call the
> timeleft function (if available and if the watchdog is running) and
> calculate the most recent keepalive (and thus the earliest acceptable
> next keepalive) from its return value.

Yes, it all becomes a bit complex if we let the bootloader configure the
values freely. The current bootloader implementation does not do this, as
it is mostly a copy of the kernel driver.

However, I can modify the kernel driver to take all this into account,
even if the code becomes a bit more complex due to this.

-Tero

2020-07-01 13:35:26

by Guenter Roeck

Subject: Re: [PATCH 2/2] watchdog: rti: tweak min_hw_heartbeat_ms to match initial allowed window

On 6/30/20 10:50 PM, Tero Kristo wrote:
[ ... ]

> The hardware supports changing the timeout value; however, the new value only takes effect in the next window (the preload values are picked up once the user pings the watchdog).
>
The current driver doesn't support or use that, though, since the start
function is only called once to start the watchdog, and not at all if
the watchdog is already running. So, if the bootloader sets the timeout
to X, and the user sets a timeout of, say, X * 4, userspace will never
ping the watchdog often enough. The driver will have to address that
to support already-running watchdogs.
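
A quick numeric check of this scenario is sketched below. It assumes, purely
for illustration, that userspace pings at half of the timeout it believes
is in effect; neither that policy nor the function name comes from the
driver.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Does the hardware expire between two userspace pings? The ping interval
 * of half the believed timeout is an assumption made for this example.
 */
static bool hw_expires_between_pings(uint32_t hw_timeout_s,
				     uint32_t believed_timeout_s)
{
	uint32_t ping_interval_s = believed_timeout_s / 2;

	return ping_interval_s >= hw_timeout_s;
}
```

With the hardware at X = 15 s and userspace believing X * 4 = 60 s, pings
arrive every 30 s, well after the hardware has already expired.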

Thanks,
Guenter

2020-07-01 14:49:15

by Tero Kristo

Subject: Re: [PATCH 2/2] watchdog: rti: tweak min_hw_heartbeat_ms to match initial allowed window

On 01/07/2020 16:34, Guenter Roeck wrote:
> On 6/30/20 10:50 PM, Tero Kristo wrote:
> [ ... ]
>
>> The hardware supports changing the timeout value; however, the new value only takes effect in the next window (the preload values are picked up once the user pings the watchdog).
>>
> The current driver doesn't support or use that, though, since the start
> function is only called once to start the watchdog, and not at all if
> the watchdog is already running. So, if the bootloader sets the timeout
> to X, and the user sets a timeout of, say, X * 4, userspace will never
> ping the watchdog often enough. The driver will have to address that
> to support already-running watchdogs.

Yeah, I will modify that. I think I will just prevent changing the
timeout if the watchdog has been enabled from boot; that is probably the
cleanest approach, unless I happen to come up with some sane way to change
it on the fly.
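
As a sketch of that approach, a guard along the following lines would
reject timeout changes for a watchdog inherited from the bootloader. The
struct, the function name and the choice of error code are all assumptions
made for this example, not the driver's actual code.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Minimal stand-in for the driver state (hypothetical). */
struct wdt_state {
	bool     running_at_probe;	/* watchdog was already enabled at boot */
	uint32_t timeout_s;		/* timeout currently in effect */
};

/* Refuse timeout changes when the watchdog was inherited from the bootloader. */
static int try_set_timeout(struct wdt_state *st, uint32_t new_timeout_s)
{
	if (st->running_at_probe && new_timeout_s != st->timeout_s)
		return -EOPNOTSUPP;

	st->timeout_s = new_timeout_s;
	return 0;
}
```

Keeping the reported timeout equal to the programmed one means userspace
cannot be told a longer timeout than the hardware actually enforces.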

-Tero