2018-09-14 07:02:18

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH] PM / suspend: Count suspend-to-idle loop as sleep time

From: Rafael J. Wysocki <[email protected]>

There is a difference in behavior between suspend-to-idle and
suspend-to-RAM in the timekeeping handling that leads to functional
issues. Namely, every iteration of the loop in s2idle_loop()
increases the monotinic clock somewhat, even if timekeeping_suspend()
and timekeeping_resume() are invoked from s2idle_enter(), and if
many of them are carried out in a row, the monotonic clock can grow
significantly while the system is regarded as suspended, which
doesn't happen during suspend-to-RAM and so it is unexpected and
leads to confusion and misbehavior in user space (similar to what
ensued when we tried to combine the boottime and monotonic clocks).

To avoid that, count all iterations of the loop in s2idle_loop()
as "sleep time" and adjust the clock for that on exit from
suspend-to-idle.

[That also covers systems on which timekeeping is not suspended
by by s2idle_enter().]

Signed-off-by: Rafael J. Wysocki <[email protected]>
---

This is a replacement for https://patchwork.kernel.org/patch/10599209/

I decided to count the entire loop in s2idle_loop() as "sleep time" as the
patch is then simpler and it also covers systems where timekeeping is not
suspended in the final step of suspend-to-idle.

I dropped the "Fixes:" tag, because the monotonic clock delta problem
has been present on the latter since the very introduction of "freeze"
(as suspend-to-idle was referred to previously) and so this doesn't fix
any particular later commits.

---
kernel/power/suspend.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)

Index: linux-pm/kernel/power/suspend.c
===================================================================
--- linux-pm.orig/kernel/power/suspend.c
+++ linux-pm/kernel/power/suspend.c
@@ -109,8 +109,12 @@ static void s2idle_enter(void)

static void s2idle_loop(void)
{
+ ktime_t start, delta;
+
pm_pr_dbg("suspend-to-idle\n");

+ start = ktime_get();
+
for (;;) {
int error;

@@ -150,6 +154,20 @@ static void s2idle_loop(void)
pm_wakeup_clear(false);
}

+ /*
+ * If the monotonic clock difference between the start of the loop and
+ * this point is too large, user space may get confused about whether or
+ * not the system has been suspended and tasks may get killed by
+ * watchdogs etc., so count the loop as "sleep time" to compensate for
+ * that.
+ */
+ delta = ktime_sub(ktime_get(), start);
+ if (ktime_to_ns(delta) > 0) {
+ struct timespec64 timespec64_delta = ktime_to_timespec64(delta);
+
+ timekeeping_inject_sleeptime64(&timespec64_delta);
+ }
+
pm_pr_dbg("resume from suspend-to-idle\n");
}




2018-09-14 07:41:28

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] PM / suspend: Count suspend-to-idle loop as sleep time

On Fri, Sep 14, 2018 at 08:59:03AM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <[email protected]>
>
> There is a difference in behavior between suspend-to-idle and
> suspend-to-RAM in the timekeeping handling that leads to functional
> issues. Namely, every iteration of the loop in s2idle_loop()
> increases the monotinic clock somewhat, even if timekeeping_suspend()
> and timekeeping_resume() are invoked from s2idle_enter(), and if
> many of them are carried out in a row, the monotonic clock can grow
> significantly while the system is regarded as suspended, which
> doesn't happen during suspend-to-RAM and so it is unexpected and
> leads to confusion and misbehavior in user space (similar to what
> ensued when we tried to combine the boottime and monotonic clocks).
>
> To avoid that, count all iterations of the loop in s2idle_loop()
> as "sleep time" and adjust the clock for that on exit from
> suspend-to-idle.
>
> [That also covers systems on which timekeeping is not suspended
> by by s2idle_enter().]
>
> Signed-off-by: Rafael J. Wysocki <[email protected]>

Do we want a 'warning' of sorts when the delta becomes significant (for
whatever that is) ? That might be an indication that there are frequent
wakeups which we might not be expecting. Of keep the number of spurious
wakeups in a stat counter somewhere -- something to look at if the
battery drains faster than expected.

Otherwise:

Acked-by: Peter Zijlstra (Intel) <[email protected]>

One minor nit below:

> ---
> kernel/power/suspend.c | 18 ++++++++++++++++++
> 1 file changed, 18 insertions(+)
>
> Index: linux-pm/kernel/power/suspend.c
> ===================================================================
> --- linux-pm.orig/kernel/power/suspend.c
> +++ linux-pm/kernel/power/suspend.c
> @@ -109,8 +109,12 @@ static void s2idle_enter(void)
>
> static void s2idle_loop(void)
> {
> + ktime_t start, delta;
> +
> pm_pr_dbg("suspend-to-idle\n");
>
> + start = ktime_get();
> +
> for (;;) {
> int error;
>
> @@ -150,6 +154,20 @@ static void s2idle_loop(void)
> pm_wakeup_clear(false);
> }
>
> + /*
> + * If the monotonic clock difference between the start of the loop and
> + * this point is too large, user space may get confused about whether or
> + * not the system has been suspended and tasks may get killed by
> + * watchdogs etc., so count the loop as "sleep time" to compensate for
> + * that.
> + */
> + delta = ktime_sub(ktime_get(), start);
> + if (ktime_to_ns(delta) > 0) {
> + struct timespec64 timespec64_delta = ktime_to_timespec64(delta);
> +
> + timekeeping_inject_sleeptime64(&timespec64_delta);
> + }
> +
> pm_pr_dbg("resume from suspend-to-idle\n");
> }

Like I mentioned yesterday; I myself prefer the form:


u64 stamp = ktimer_get_ns();

for (;;) {
/* ... */
}

stamp = ktime_get_ns() - stamp;
if (stamp)
timekeeping_inject_sleeptime64(ns_to_timespec64(ns));


Esp. since ktime_t _is_ s64 these days, there is no point in keep using
all the weird ktime helpers that make the code harder to read.

2018-09-14 07:48:03

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] PM / suspend: Count suspend-to-idle loop as sleep time

On Fri, Sep 14, 2018 at 9:41 AM Peter Zijlstra <[email protected]> wrote:
>
> On Fri, Sep 14, 2018 at 08:59:03AM +0200, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <[email protected]>
> >
> > There is a difference in behavior between suspend-to-idle and
> > suspend-to-RAM in the timekeeping handling that leads to functional
> > issues. Namely, every iteration of the loop in s2idle_loop()
> > increases the monotinic clock somewhat, even if timekeeping_suspend()
> > and timekeeping_resume() are invoked from s2idle_enter(), and if
> > many of them are carried out in a row, the monotonic clock can grow
> > significantly while the system is regarded as suspended, which
> > doesn't happen during suspend-to-RAM and so it is unexpected and
> > leads to confusion and misbehavior in user space (similar to what
> > ensued when we tried to combine the boottime and monotonic clocks).
> >
> > To avoid that, count all iterations of the loop in s2idle_loop()
> > as "sleep time" and adjust the clock for that on exit from
> > suspend-to-idle.
> >
> > [That also covers systems on which timekeeping is not suspended
> > by by s2idle_enter().]
> >
> > Signed-off-by: Rafael J. Wysocki <[email protected]>
>
> Do we want a 'warning' of sorts when the delta becomes significant (for
> whatever that is) ? That might be an indication that there are frequent
> wakeups which we might not be expecting. Of keep the number of spurious
> wakeups in a stat counter somewhere -- something to look at if the
> battery drains faster than expected.

If you echo 1 to /sys/power/pm_debug_messages, dmesg will tell you
that (with gory details). :-)

> Otherwise:
>
> Acked-by: Peter Zijlstra (Intel) <[email protected]>
>
> One minor nit below:
>
> > ---
> > kernel/power/suspend.c | 18 ++++++++++++++++++
> > 1 file changed, 18 insertions(+)
> >
> > Index: linux-pm/kernel/power/suspend.c
> > ===================================================================
> > --- linux-pm.orig/kernel/power/suspend.c
> > +++ linux-pm/kernel/power/suspend.c
> > @@ -109,8 +109,12 @@ static void s2idle_enter(void)
> >
> > static void s2idle_loop(void)
> > {
> > + ktime_t start, delta;
> > +
> > pm_pr_dbg("suspend-to-idle\n");
> >
> > + start = ktime_get();
> > +
> > for (;;) {
> > int error;
> >
> > @@ -150,6 +154,20 @@ static void s2idle_loop(void)
> > pm_wakeup_clear(false);
> > }
> >
> > + /*
> > + * If the monotonic clock difference between the start of the loop and
> > + * this point is too large, user space may get confused about whether or
> > + * not the system has been suspended and tasks may get killed by
> > + * watchdogs etc., so count the loop as "sleep time" to compensate for
> > + * that.
> > + */
> > + delta = ktime_sub(ktime_get(), start);
> > + if (ktime_to_ns(delta) > 0) {
> > + struct timespec64 timespec64_delta = ktime_to_timespec64(delta);
> > +
> > + timekeeping_inject_sleeptime64(&timespec64_delta);
> > + }
> > +
> > pm_pr_dbg("resume from suspend-to-idle\n");
> > }
>
> Like I mentioned yesterday; I myself prefer the form:
>
>
> u64 stamp = ktimer_get_ns();
>
> for (;;) {
> /* ... */
> }
>
> stamp = ktime_get_ns() - stamp;
> if (stamp)
> timekeeping_inject_sleeptime64(ns_to_timespec64(ns));
>
>
> Esp. since ktime_t _is_ s64 these days, there is no point in keep using
> all the weird ktime helpers that make the code harder to read.

Looks like a good idea, let me try that.

2018-09-14 08:04:58

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] PM / suspend: Count suspend-to-idle loop as sleep time

On Fri, Sep 14, 2018 at 09:47:28AM +0200, Rafael J. Wysocki wrote:
> On Fri, Sep 14, 2018 at 9:41 AM Peter Zijlstra <[email protected]> wrote:
> > Do we want a 'warning' of sorts when the delta becomes significant (for
> > whatever that is) ? That might be an indication that there are frequent
> > wakeups which we might not be expecting. Of keep the number of spurious
> > wakeups in a stat counter somewhere -- something to look at if the
> > battery drains faster than expected.
>
> If you echo 1 to /sys/power/pm_debug_messages, dmesg will tell you
> that (with gory details). :-)

Shiny, didn't know that ;-)

2018-09-14 08:17:01

by Rafael J. Wysocki

[permalink] [raw]
Subject: [PATCH v2] PM / suspend: Count suspend-to-idle loop as sleep time

From: Rafael J. Wysocki <[email protected]>

There is a difference in behavior between suspend-to-idle and
suspend-to-RAM in the timekeeping handling that leads to functional
issues. Namely, every iteration of the loop in s2idle_loop()
increases the monotinic clock somewhat, even if timekeeping_suspend()
and timekeeping_resume() are invoked from s2idle_enter(), and if
many of them are carried out in a row, the monotonic clock can grow
significantly while the system is regarded as suspended, which
doesn't happen during suspend-to-RAM and so it is unexpected and
leads to confusion and misbehavior in user space (similar to what
ensued when we tried to combine the boottime and monotonic clocks).

To avoid that, count all iterations of the loop in s2idle_loop()
as "sleep time" and adjust the clock for that on exit from
suspend-to-idle.

[That also covers systems on which timekeeping is not suspended
by s2idle_enter().]

Signed-off-by: Rafael J. Wysocki <[email protected]>
Acked-by: Peter Zijlstra (Intel) <[email protected]>
---

-> v2:
- Switched over to ktime_get_ns() as suggested by Peter.
- Tentatively added "Acked-by" from Peter.

---
kernel/power/suspend.c | 18 ++++++++++++++++++
1 file changed, 18 insertions(+)

Index: linux-pm/kernel/power/suspend.c
===================================================================
--- linux-pm.orig/kernel/power/suspend.c
+++ linux-pm/kernel/power/suspend.c
@@ -109,8 +109,12 @@ static void s2idle_enter(void)

static void s2idle_loop(void)
{
+ u64 delta_ns;
+
pm_pr_dbg("suspend-to-idle\n");

+ delta_ns = ktime_get_ns();
+
for (;;) {
int error;

@@ -150,6 +154,20 @@ static void s2idle_loop(void)
pm_wakeup_clear(false);
}

+ /*
+ * If the monotonic clock difference between the start of the loop and
+ * this point is too large, user space may get confused about whether or
+ * not the system has been suspended and tasks may get killed by
+ * watchdogs etc., so count the loop as "sleep time" to compensate for
+ * that.
+ */
+ delta_ns = ktime_get_ns() - delta_ns;
+ if (delta_ns) {
+ struct timespec64 timespec64_delta = ns_to_timespec64(delta_ns);
+
+ timekeeping_inject_sleeptime64(&timespec64_delta);
+ }
+
pm_pr_dbg("resume from suspend-to-idle\n");
}



2018-09-14 08:30:29

by Mika Penttilä

[permalink] [raw]
Subject: Re: [PATCH] PM / suspend: Count suspend-to-idle loop as sleep time

Hi!


On 09/14/2018 09:59 AM, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <[email protected]>
>
> There is a difference in behavior between suspend-to-idle and
> suspend-to-RAM in the timekeeping handling that leads to functional
> issues. Namely, every iteration of the loop in s2idle_loop()
> increases the monotinic clock somewhat, even if timekeeping_suspend()
> and timekeeping_resume() are invoked from s2idle_enter(), and if
> many of them are carried out in a row, the monotonic clock can grow
> significantly while the system is regarded as suspended, which
> doesn't happen during suspend-to-RAM and so it is unexpected and
> leads to confusion and misbehavior in user space (similar to what
> ensued when we tried to combine the boottime and monotonic clocks).
>
> To avoid that, count all iterations of the loop in s2idle_loop()
> as "sleep time" and adjust the clock for that on exit from
> suspend-to-idle.
>
> [That also covers systems on which timekeeping is not suspended
> by by s2idle_enter().]
>
> Signed-off-by: Rafael J. Wysocki <[email protected]>
> ---
>
> This is a replacement for https://patchwork.kernel.org/patch/10599209/
>
> I decided to count the entire loop in s2idle_loop() as "sleep time" as the
> patch is then simpler and it also covers systems where timekeeping is not
> suspended in the final step of suspend-to-idle.
>
> I dropped the "Fixes:" tag, because the monotonic clock delta problem
> has been present on the latter since the very introduction of "freeze"
> (as suspend-to-idle was referred to previously) and so this doesn't fix
> any particular later commits.
>
> ---
> kernel/power/suspend.c | 18 ++++++++++++++++++
> 1 file changed, 18 insertions(+)
>
> Index: linux-pm/kernel/power/suspend.c
> ===================================================================
> --- linux-pm.orig/kernel/power/suspend.c
> +++ linux-pm/kernel/power/suspend.c
> @@ -109,8 +109,12 @@ static void s2idle_enter(void)
>
> static void s2idle_loop(void)
> {
> + ktime_t start, delta;
> +
> pm_pr_dbg("suspend-to-idle\n");
>
> + start = ktime_get();
> +
> for (;;) {
> int error;
>
> @@ -150,6 +154,20 @@ static void s2idle_loop(void)
> pm_wakeup_clear(false);
> }
>
> + /*
> + * If the monotonic clock difference between the start of the loop and
> + * this point is too large, user space may get confused about whether or
> + * not the system has been suspended and tasks may get killed by
> + * watchdogs etc., so count the loop as "sleep time" to compensate for
> + * that.
> + */
> + delta = ktime_sub(ktime_get(), start);
> + if (ktime_to_ns(delta) > 0) {
> + struct timespec64 timespec64_delta = ktime_to_timespec64(delta);
> +
> + timekeeping_inject_sleeptime64(&timespec64_delta);
> + }

But doesn't injecting sleep time here make monotonic clock too large by the amount of sleeptime?
tick_freeze() / tick_unfreeze() already injects the sleeptime (otherwise delta would be 0).

> +
> pm_pr_dbg("resume from suspend-to-idle\n");
> }
>
>


2018-09-14 08:49:40

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] PM / suspend: Count suspend-to-idle loop as sleep time

On Friday, September 14, 2018 10:28:44 AM CEST Mika Penttil? wrote:
> Hi!
>
>
> On 09/14/2018 09:59 AM, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <[email protected]>
> >
> > There is a difference in behavior between suspend-to-idle and
> > suspend-to-RAM in the timekeeping handling that leads to functional
> > issues. Namely, every iteration of the loop in s2idle_loop()
> > increases the monotinic clock somewhat, even if timekeeping_suspend()
> > and timekeeping_resume() are invoked from s2idle_enter(), and if
> > many of them are carried out in a row, the monotonic clock can grow
> > significantly while the system is regarded as suspended, which
> > doesn't happen during suspend-to-RAM and so it is unexpected and
> > leads to confusion and misbehavior in user space (similar to what
> > ensued when we tried to combine the boottime and monotonic clocks).
> >
> > To avoid that, count all iterations of the loop in s2idle_loop()
> > as "sleep time" and adjust the clock for that on exit from
> > suspend-to-idle.
> >
> > [That also covers systems on which timekeeping is not suspended
> > by by s2idle_enter().]
> >
> > Signed-off-by: Rafael J. Wysocki <[email protected]>
> > ---
> >
> > This is a replacement for https://patchwork.kernel.org/patch/10599209/
> >
> > I decided to count the entire loop in s2idle_loop() as "sleep time" as the
> > patch is then simpler and it also covers systems where timekeeping is not
> > suspended in the final step of suspend-to-idle.
> >
> > I dropped the "Fixes:" tag, because the monotonic clock delta problem
> > has been present on the latter since the very introduction of "freeze"
> > (as suspend-to-idle was referred to previously) and so this doesn't fix
> > any particular later commits.
> >
> > ---
> > kernel/power/suspend.c | 18 ++++++++++++++++++
> > 1 file changed, 18 insertions(+)
> >
> > Index: linux-pm/kernel/power/suspend.c
> > ===================================================================
> > --- linux-pm.orig/kernel/power/suspend.c
> > +++ linux-pm/kernel/power/suspend.c
> > @@ -109,8 +109,12 @@ static void s2idle_enter(void)
> >
> > static void s2idle_loop(void)
> > {
> > + ktime_t start, delta;
> > +
> > pm_pr_dbg("suspend-to-idle\n");
> >
> > + start = ktime_get();
> > +
> > for (;;) {
> > int error;
> >
> > @@ -150,6 +154,20 @@ static void s2idle_loop(void)
> > pm_wakeup_clear(false);
> > }
> >
> > + /*
> > + * If the monotonic clock difference between the start of the loop and
> > + * this point is too large, user space may get confused about whether or
> > + * not the system has been suspended and tasks may get killed by
> > + * watchdogs etc., so count the loop as "sleep time" to compensate for
> > + * that.
> > + */
> > + delta = ktime_sub(ktime_get(), start);
> > + if (ktime_to_ns(delta) > 0) {
> > + struct timespec64 timespec64_delta = ktime_to_timespec64(delta);
> > +
> > + timekeeping_inject_sleeptime64(&timespec64_delta);
> > + }
>
> But doesn't injecting sleep time here make monotonic clock too large by the amount of sleeptime?
> tick_freeze() / tick_unfreeze() already injects the sleeptime (otherwise delta would be 0).

No, it doesn't.

The delta here is the extra time taken by the loop which hasn't been counted
as sleep time yet.

Thanks,
Rafael


2018-09-14 09:53:51

by Mika Penttilä

[permalink] [raw]
Subject: Re: [PATCH] PM / suspend: Count suspend-to-idle loop as sleep time

On 09/14/2018 11:46 AM, Rafael J. Wysocki wrote:
> On Friday, September 14, 2018 10:28:44 AM CEST Mika Penttilä wrote:
>> Hi!
>>
>>
>> On 09/14/2018 09:59 AM, Rafael J. Wysocki wrote:
>>> From: Rafael J. Wysocki <[email protected]>
>>>
>>> There is a difference in behavior between suspend-to-idle and
>>> suspend-to-RAM in the timekeeping handling that leads to functional
>>> issues. Namely, every iteration of the loop in s2idle_loop()
>>> increases the monotinic clock somewhat, even if timekeeping_suspend()
>>> and timekeeping_resume() are invoked from s2idle_enter(), and if
>>> many of them are carried out in a row, the monotonic clock can grow
>>> significantly while the system is regarded as suspended, which
>>> doesn't happen during suspend-to-RAM and so it is unexpected and
>>> leads to confusion and misbehavior in user space (similar to what
>>> ensued when we tried to combine the boottime and monotonic clocks).
>>>
>>> To avoid that, count all iterations of the loop in s2idle_loop()
>>> as "sleep time" and adjust the clock for that on exit from
>>> suspend-to-idle.
>>>
>>> [That also covers systems on which timekeeping is not suspended
>>> by by s2idle_enter().]
>>>
>>> Signed-off-by: Rafael J. Wysocki <[email protected]>
>>> ---
>>>
>>> This is a replacement for https://patchwork.kernel.org/patch/10599209/
>>>
>>> I decided to count the entire loop in s2idle_loop() as "sleep time" as the
>>> patch is then simpler and it also covers systems where timekeeping is not
>>> suspended in the final step of suspend-to-idle.
>>>
>>> I dropped the "Fixes:" tag, because the monotonic clock delta problem
>>> has been present on the latter since the very introduction of "freeze"
>>> (as suspend-to-idle was referred to previously) and so this doesn't fix
>>> any particular later commits.
>>>
>>> ---
>>> kernel/power/suspend.c | 18 ++++++++++++++++++
>>> 1 file changed, 18 insertions(+)
>>>
>>> Index: linux-pm/kernel/power/suspend.c
>>> ===================================================================
>>> --- linux-pm.orig/kernel/power/suspend.c
>>> +++ linux-pm/kernel/power/suspend.c
>>> @@ -109,8 +109,12 @@ static void s2idle_enter(void)
>>>
>>> static void s2idle_loop(void)
>>> {
>>> + ktime_t start, delta;
>>> +
>>> pm_pr_dbg("suspend-to-idle\n");
>>>
>>> + start = ktime_get();
>>> +
>>> for (;;) {
>>> int error;
>>>
>>> @@ -150,6 +154,20 @@ static void s2idle_loop(void)
>>> pm_wakeup_clear(false);
>>> }
>>>
>>> + /*
>>> + * If the monotonic clock difference between the start of the loop and
>>> + * this point is too large, user space may get confused about whether or
>>> + * not the system has been suspended and tasks may get killed by
>>> + * watchdogs etc., so count the loop as "sleep time" to compensate for
>>> + * that.
>>> + */
>>> + delta = ktime_sub(ktime_get(), start);
>>> + if (ktime_to_ns(delta) > 0) {
>>> + struct timespec64 timespec64_delta = ktime_to_timespec64(delta);
>>> +
>>> + timekeeping_inject_sleeptime64(&timespec64_delta);
>>> + }
>>
>> But doesn't injecting sleep time here make monotonic clock too large by the amount of sleeptime?
>> tick_freeze() / tick_unfreeze() already injects the sleeptime (otherwise delta would be 0).
>
> No, it doesn't.
>
> The delta here is the extra time taken by the loop which hasn't been counted
> as sleep time yet.

I said incorrectly monotonic clock, but timekeeping_inject_sleeptime64() forwards the wall time, by the amount of delta.
Why wouldn't some other cpu update xtime when one cpu is in the loop? And if all cpus enter s2idle, tick_unfreeze()
injects sleeptime. My point is that this extra injection makes wall time wrong, no?

>
> Thanks,
> Rafael
>


2018-09-14 10:07:06

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] PM / suspend: Count suspend-to-idle loop as sleep time

On Fri, Sep 14, 2018 at 11:53 AM Mika Penttilä
<[email protected]> wrote:
>
> On 09/14/2018 11:46 AM, Rafael J. Wysocki wrote:
> > On Friday, September 14, 2018 10:28:44 AM CEST Mika Penttilä wrote:
> >> Hi!
> >>
> >>
> >> On 09/14/2018 09:59 AM, Rafael J. Wysocki wrote:
> >>> From: Rafael J. Wysocki <[email protected]>
> >>>
> >>> There is a difference in behavior between suspend-to-idle and
> >>> suspend-to-RAM in the timekeeping handling that leads to functional
> >>> issues. Namely, every iteration of the loop in s2idle_loop()
> >>> increases the monotinic clock somewhat, even if timekeeping_suspend()
> >>> and timekeeping_resume() are invoked from s2idle_enter(), and if
> >>> many of them are carried out in a row, the monotonic clock can grow
> >>> significantly while the system is regarded as suspended, which
> >>> doesn't happen during suspend-to-RAM and so it is unexpected and
> >>> leads to confusion and misbehavior in user space (similar to what
> >>> ensued when we tried to combine the boottime and monotonic clocks).
> >>>
> >>> To avoid that, count all iterations of the loop in s2idle_loop()
> >>> as "sleep time" and adjust the clock for that on exit from
> >>> suspend-to-idle.
> >>>
> >>> [That also covers systems on which timekeeping is not suspended
> >>> by by s2idle_enter().]
> >>>
> >>> Signed-off-by: Rafael J. Wysocki <[email protected]>
> >>> ---
> >>>
> >>> This is a replacement for https://patchwork.kernel.org/patch/10599209/
> >>>
> >>> I decided to count the entire loop in s2idle_loop() as "sleep time" as the
> >>> patch is then simpler and it also covers systems where timekeeping is not
> >>> suspended in the final step of suspend-to-idle.
> >>>
> >>> I dropped the "Fixes:" tag, because the monotonic clock delta problem
> >>> has been present on the latter since the very introduction of "freeze"
> >>> (as suspend-to-idle was referred to previously) and so this doesn't fix
> >>> any particular later commits.
> >>>
> >>> ---
> >>> kernel/power/suspend.c | 18 ++++++++++++++++++
> >>> 1 file changed, 18 insertions(+)
> >>>
> >>> Index: linux-pm/kernel/power/suspend.c
> >>> ===================================================================
> >>> --- linux-pm.orig/kernel/power/suspend.c
> >>> +++ linux-pm/kernel/power/suspend.c
> >>> @@ -109,8 +109,12 @@ static void s2idle_enter(void)
> >>>
> >>> static void s2idle_loop(void)
> >>> {
> >>> + ktime_t start, delta;
> >>> +
> >>> pm_pr_dbg("suspend-to-idle\n");
> >>>
> >>> + start = ktime_get();
> >>> +
> >>> for (;;) {
> >>> int error;
> >>>
> >>> @@ -150,6 +154,20 @@ static void s2idle_loop(void)
> >>> pm_wakeup_clear(false);
> >>> }
> >>>
> >>> + /*
> >>> + * If the monotonic clock difference between the start of the loop and
> >>> + * this point is too large, user space may get confused about whether or
> >>> + * not the system has been suspended and tasks may get killed by
> >>> + * watchdogs etc., so count the loop as "sleep time" to compensate for
> >>> + * that.
> >>> + */
> >>> + delta = ktime_sub(ktime_get(), start);
> >>> + if (ktime_to_ns(delta) > 0) {
> >>> + struct timespec64 timespec64_delta = ktime_to_timespec64(delta);
> >>> +
> >>> + timekeeping_inject_sleeptime64(&timespec64_delta);
> >>> + }
> >>
> >> But doesn't injecting sleep time here make monotonic clock too large by the amount of sleeptime?
> >> tick_freeze() / tick_unfreeze() already injects the sleeptime (otherwise delta would be 0).
> >
> > No, it doesn't.
> >
> > The delta here is the extra time taken by the loop which hasn't been counted
> > as sleep time yet.
>
> I said incorrectly monotonic clock, but timekeeping_inject_sleeptime64() forwards the wall time, by the amount of delta.
> Why wouldn't some other cpu update xtime when one cpu is in the loop? And if all cpus enter s2idle, tick_unfreeze()
> injects sleeptime. My point is that this extra injection makes wall time wrong, no?

OK, you're right. I got that the other way around.

So, the patch is withdrawn.

2018-09-14 12:42:13

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [PATCH] PM / suspend: Count suspend-to-idle loop as sleep time

On Fri, 14 Sep 2018, Rafael J. Wysocki wrote:
> On Fri, Sep 14, 2018 at 11:53 AM Mika Penttilä

> > >> But doesn't injecting sleep time here make monotonic clock too large
> > >> by the amount of sleeptime? tick_freeze() / tick_unfreeze() already
> > >> injects the sleeptime (otherwise delta would be 0). > > >
> > >
> > > No, it doesn't.
> > >
> > > The delta here is the extra time taken by the loop which hasn't been counted
> > > as sleep time yet.
> >

> > I said incorrectly monotonic clock, but
> > timekeeping_inject_sleeptime64() forwards the wall time, by the amount
> > of delta. Why wouldn't some other cpu update xtime when one cpu is in
> > the loop? And if all cpus enter s2idle, tick_unfreeze() injects
> > sleeptime. My point is that this extra injection makes wall time wrong,
> > no?
>
> OK, you're right. I got that the other way around.
>
> So, the patch is withdrawn.

I just tried to wrap my brain around that whole thing and utterly
failed, so I can't give any recommendations right now.

Rafael, could you please enable some lightweight instrumentation which lets
me see the longer sequence of events which are leading to this or tell me
what I need to do to reproduce that myself?

Thanks,

tglx

2018-09-17 08:08:47

by Rafael J. Wysocki

[permalink] [raw]
Subject: Re: [PATCH] PM / suspend: Count suspend-to-idle loop as sleep time

On Fri, Sep 14, 2018 at 2:41 PM Thomas Gleixner <[email protected]> wrote:
>
> On Fri, 14 Sep 2018, Rafael J. Wysocki wrote:
> > On Fri, Sep 14, 2018 at 11:53 AM Mika Penttilä
>
> > > >> But doesn't injecting sleep time here make monotonic clock too large
> > > >> by the amount of sleeptime? tick_freeze() / tick_unfreeze() already
> > > >> injects the sleeptime (otherwise delta would be 0). > > >
> > > >
> > > > No, it doesn't.
> > > >
> > > > The delta here is the extra time taken by the loop which hasn't been counted
> > > > as sleep time yet.
> > >
>
> > > I said incorrectly monotonic clock, but
> > > timekeeping_inject_sleeptime64() forwards the wall time, by the amount
> > > of delta. Why wouldn't some other cpu update xtime when one cpu is in
> > > the loop? And if all cpus enter s2idle, tick_unfreeze() injects
> > > sleeptime. My point is that this extra injection makes wall time wrong,
> > > no?
> >
> > OK, you're right. I got that the other way around.
> >
> > So, the patch is withdrawn.

[Sorry for the delay, personal life intervened.]

> I just tried to wrap my brain around that whole thing and utterly
> failed, so I can't give any recommendations right now.

Thanks for looking into this!

> Rafael, could you please enable some lightweight instrumentation which lets
> me see the longer sequence of events which are leading to this or tell me
> what I need to do to reproduce that myself?

I will, when I find out more. The reported issues may just go away
after fixing some other bugs.

Cheers,
Rafael

2018-09-22 17:14:16

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH] PM / suspend: Count suspend-to-idle loop as sleep time

Hi Rafael,

I love your patch! Yet something to improve:

[auto build test ERROR on linus/master]
[also build test ERROR on v4.19-rc4 next-20180921]
[if your patch is applied to the wrong git tree, please drop us a note to help improve the system]

url: https://github.com/0day-ci/linux/commits/Rafael-J-Wysocki/PM-suspend-Count-suspend-to-idle-loop-as-sleep-time/20180914-170606
config: i386-randconfig-a1-09200857 (attached as .config)
compiler: gcc-4.9 (Debian 4.9.4-2) 4.9.4
reproduce:
# save the attached .config to linux build tree
make ARCH=i386

All errors (new ones prefixed by >>):

kernel/power/suspend.o: In function `s2idle_loop':
>> kernel/power/suspend.c:168: undefined reference to `timekeeping_inject_sleeptime64'

vim +168 kernel/power/suspend.c

109
110 static void s2idle_loop(void)
111 {
112 ktime_t start, delta;
113
114 pm_pr_dbg("suspend-to-idle\n");
115
116 start = ktime_get();
117
118 for (;;) {
119 int error;
120
121 dpm_noirq_begin();
122
123 /*
124 * Suspend-to-idle equals
125 * frozen processes + suspended devices + idle processors.
126 * Thus s2idle_enter() should be called right after
127 * all devices have been suspended.
128 *
129 * Wakeups during the noirq suspend of devices may be spurious,
130 * so prevent them from terminating the loop right away.
131 */
132 error = dpm_noirq_suspend_devices(PMSG_SUSPEND);
133 if (!error)
134 s2idle_enter();
135 else if (error == -EBUSY && pm_wakeup_pending())
136 error = 0;
137
138 if (!error && s2idle_ops && s2idle_ops->wake)
139 s2idle_ops->wake();
140
141 dpm_noirq_resume_devices(PMSG_RESUME);
142
143 dpm_noirq_end();
144
145 if (error)
146 break;
147
148 if (s2idle_ops && s2idle_ops->sync)
149 s2idle_ops->sync();
150
151 if (pm_wakeup_pending())
152 break;
153
154 pm_wakeup_clear(false);
155 }
156
157 /*
158 * If the monotonic clock difference between the start of the loop and
159 * this point is too large, user space may get confused about whether or
160 * not the system has been suspended and tasks may get killed by
161 * watchdogs etc., so count the loop as "sleep time" to compensate for
162 * that.
163 */
164 delta = ktime_sub(ktime_get(), start);
165 if (ktime_to_ns(delta) > 0) {
166 struct timespec64 timespec64_delta = ktime_to_timespec64(delta);
167
> 168 timekeeping_inject_sleeptime64(&timespec64_delta);
169 }
170
171 pm_pr_dbg("resume from suspend-to-idle\n");
172 }
173

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation


Attachments:
(No filename) (2.89 kB)
.config.gz (34.42 kB)
Download all attachments