2023-09-29 16:30:17

by John Ogness

[permalink] [raw]
Subject: [PATCH printk] printk: flush consoles before checking progress

Commit 9e70a5e109a4 ("printk: Add per-console suspended state")
removed console lock usage during resume and replaced it with
the clearly defined console_list_lock and srcu mechanisms.

However, the console lock usage had an important side-effect
of flushing the consoles. After its removal, consoles were no
longer flushed before checking their progress.

Add the console_lock/console_unlock dance to the beginning
of __pr_flush() to actually flush the consoles before checking
their progress. Also add comments to clarify this additional
usage of the console lock.

Finally, in case pr_flush() needs to poll-wait (for example,
because there has been a handover to another CPU), reduce the
polling interval from 100ms to 1ms. In the bug report it was
mentioned that 100ms is unnecessarily long.

Reported-by: Todd Brandt <[email protected]>
Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217955
Fixes: 9e70a5e109a4 ("printk: Add per-console suspended state")
Signed-off-by: John Ogness <[email protected]>
---
kernel/printk/printk.c | 23 ++++++++++++-----------
1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index 7e0b4dd02398..cf7841fc6eef 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -3740,12 +3740,18 @@ static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progre

seq = prb_next_seq(prb);

+ /* Flush the consoles so that records up to @seq are printed. */
+ console_lock();
+ console_unlock();
+
for (;;) {
diff = 0;

/*
* Hold the console_lock to guarantee safe access to
- * console->seq.
+ * console->seq. Releasing console_lock flushes more
+ * records in case @seq is still not printed on all
+ * usable consoles.
*/
console_lock();

@@ -3775,16 +3781,11 @@ static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progre
if (diff == 0 || remaining == 0)
break;

- if (remaining < 0) {
- /* no timeout limit */
- msleep(100);
- } else if (remaining < 100) {
- msleep(remaining);
- remaining = 0;
- } else {
- msleep(100);
- remaining -= 100;
- }
+ msleep(1);
+
+ /* If @remaining < 0, there is no timeout limit. */
+ if (remaining > 0)
+ remaining--;

last_diff = diff;
}

base-commit: 4952801fc6adb5b50b8ec2bcc5aeef92fcce8730
--
2.39.2


2023-09-29 21:47:16

by Petr Mladek

[permalink] [raw]
Subject: Re: [PATCH printk] printk: flush consoles before checking progress

On Fri 2023-09-29 13:38:33, John Ogness wrote:
> Commit 9e70a5e109a4 ("printk: Add per-console suspended state")
> removed console lock usage during resume and replaced it with
> the clearly defined console_list_lock and srcu mechanisms.
>
> However, the console lock usage had an important side-effect
> of flushing the consoles. After its removal, consoles were no
> longer flushed before checking their progress.
>
> Add the console_lock/console_unlock dance to the beginning
> of __pr_flush() to actually flush the consoles before checking
> their progress. Also add comments to clarify this additional
> usage of the console lock.
>
> Finally, in case pr_flush() needs to poll-wait (for example,
> because there has been a handover to another CPU), reduce the
> polling interval from 100ms to 1ms. In the bug report it was
> mentioned that 100ms is unnecessarily long.
>
> Reported-by: Todd Brandt <[email protected]>
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217955
> Fixes: 9e70a5e109a4 ("printk: Add per-console suspended state")
> Signed-off-by: John Ogness <[email protected]>

Looks good to me:

Reviewed-by: Petr Mladek <[email protected]>

I am going to wait a bit for eventual feedback and push it to
linux-next the following week.

Best Regards,
Petr

2023-10-02 17:29:14

by Petr Mladek

[permalink] [raw]
Subject: Re: [PATCH printk] printk: flush consoles before checking progress

On Fri 2023-09-29 13:38:33, John Ogness wrote:
> Finally, in case pr_flush() needs to poll-wait (for example,
> because there has been a handover to another CPU), reduce the
> polling interval from 100ms to 1ms. In the bug report it was
> mentioned that 100ms is unnecessarily long.
>
> Reported-by: Todd Brandt <[email protected]>
> Closes: https://bugzilla.kernel.org/show_bug.cgi?id=217955
> Fixes: 9e70a5e109a4 ("printk: Add per-console suspended state")
> Signed-off-by: John Ogness <[email protected]>
> ---
> kernel/printk/printk.c | 23 ++++++++++++-----------
> 1 file changed, 12 insertions(+), 11 deletions(-)
>
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index 7e0b4dd02398..cf7841fc6eef 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -3775,16 +3781,11 @@ static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progre
> if (diff == 0 || remaining == 0)
> break;
>
> - if (remaining < 0) {
> - /* no timeout limit */
> - msleep(100);
> - } else if (remaining < 100) {
> - msleep(remaining);
> - remaining = 0;
> - } else {
> - msleep(100);
> - remaining -= 100;
> - }
> + msleep(1);

I was about to push this patch and ran checkpatch.pl. It warned about

WARNING: msleep < 20ms can sleep for up to 20ms; see Documentation/timers/timers-howto.rst
#73: FILE: kernel/printk/printk.c:3782:
+ msleep(1);

And indeed, Documentation/timers/timers-howto.rst says that msleep()
might sleep longer that expected for <20ms delays. I guess that
it is somehow related to jiffies, HZ, and load on the system.

I think that we need to count jiffies here. Something like:

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index db81b68d7f14..6ea500d95fd9 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -3723,7 +3723,8 @@ late_initcall(printk_late_init);
/* If @con is specified, only wait for that console. Otherwise wait for all. */
static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progress)
{
- int remaining = timeout_ms;
+ unsigned long timeout_jiffies = msecs_to_jiffies(timeout_ms);
+ unsigned_long timeout_end = jiffies + timeout_jiffies;
struct console *c;
u64 last_diff = 0;
u64 printk_seq;
@@ -3772,24 +3773,19 @@ static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progre
console_srcu_read_unlock(cookie);

if (diff != last_diff && reset_on_progress)
- remaining = timeout_ms;
+ timeout_end = jiffies + timeout_jiffies;

console_unlock();

/* Note: @diff is 0 if there are no usable consoles. */
- if (diff == 0 || remaining == 0)
+ if (diff == 0)
break;

- if (remaining < 0) {
- /* no timeout limit */
- msleep(100);
- } else if (remaining < 100) {
- msleep(remaining);
- remaining = 0;
- } else {
- msleep(100);
- remaining -= 100;
- }
+ /* Negative timeout means an infinite wait. */
+ if (timeout_ms >= 0 && time_after_eq(jiffies, timeout_end))
+ break;
+
+ msleep(2000 / HZ);

last_diff = diff;
}

Note that I used 2000 / HZ to make sure that it will be >= 1.

And we should do this in a separate patch. It seems that sleeping
is a bigger magic than I expected.

Best Regards,
Petr

2023-10-04 10:25:17

by John Ogness

[permalink] [raw]
Subject: Re: [PATCH printk] printk: flush consoles before checking progress

On 2023-10-02, Petr Mladek <[email protected]> wrote:
> I was about to push this patch and ran checkpatch.pl. It warned about
>
> WARNING: msleep < 20ms can sleep for up to 20ms; see Documentation/timers/timers-howto.rst
> #73: FILE: kernel/printk/printk.c:3782:
> + msleep(1);
>
> And indeed, Documentation/timers/timers-howto.rst says that msleep()
> might sleep longer that expected for <20ms delays. I guess that
> it is somehow related to jiffies, HZ, and load on the system.
>
> I think that we need to count jiffies here.

Agreed. The @timeout_ms parameter should be respected.

> Something like:
>
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index db81b68d7f14..6ea500d95fd9 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -3723,7 +3723,8 @@ late_initcall(printk_late_init);
> /* If @con is specified, only wait for that console. Otherwise wait for all. */
> static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progress)
> {
> - int remaining = timeout_ms;
> + unsigned long timeout_jiffies = msecs_to_jiffies(timeout_ms);
> + unsigned_long timeout_end = jiffies + timeout_jiffies;
> struct console *c;
> u64 last_diff = 0;
> u64 printk_seq;
> @@ -3772,24 +3773,19 @@ static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progre
> console_srcu_read_unlock(cookie);
>
> if (diff != last_diff && reset_on_progress)
> - remaining = timeout_ms;
> + timeout_end = jiffies + timeout_jiffies;
>
> console_unlock();
>
> /* Note: @diff is 0 if there are no usable consoles. */
> - if (diff == 0 || remaining == 0)
> + if (diff == 0)
> break;
>
> - if (remaining < 0) {
> - /* no timeout limit */
> - msleep(100);
> - } else if (remaining < 100) {
> - msleep(remaining);
> - remaining = 0;
> - } else {
> - msleep(100);
> - remaining -= 100;
> - }
> + /* Negative timeout means an infinite wait. */
> + if (timeout_ms >= 0 && time_after_eq(jiffies, timeout_end))
> + break;
> +
> + msleep(2000 / HZ);

Is there really any advantage to this? I would just do msleep(1) and let
msleep round up. Everything else (tracking via jiffies) looks fine to me.

> last_diff = diff;
> }
>
> And we should do this in a separate patch. It seems that sleeping
> is a bigger magic than I expected.

Agreed.

John

2023-10-04 12:39:41

by Petr Mladek

[permalink] [raw]
Subject: Re: [PATCH printk] printk: flush consoles before checking progress

On Wed 2023-10-04 12:31:07, John Ogness wrote:
> On 2023-10-02, Petr Mladek <[email protected]> wrote:
> > I was about to push this patch and ran checkpatch.pl. It warned about
> >
> > WARNING: msleep < 20ms can sleep for up to 20ms; see Documentation/timers/timers-howto.rst
> > #73: FILE: kernel/printk/printk.c:3782:
> > + msleep(1);
> >
> > And indeed, Documentation/timers/timers-howto.rst says that msleep()
> > might sleep longer that expected for <20ms delays. I guess that
> > it is somehow related to jiffies, HZ, and load on the system.
> >
> > I think that we need to count jiffies here.
>
> Agreed. The @timeout_ms parameter should be respected.
>
> > Something like:
> >
> > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> > index db81b68d7f14..6ea500d95fd9 100644
> > --- a/kernel/printk/printk.c
> > +++ b/kernel/printk/printk.c
> > @@ -3772,24 +3773,19 @@ static bool __pr_flush(struct console *con, int timeout_ms, bool reset_on_progre
> > console_srcu_read_unlock(cookie);
> >
> > if (diff != last_diff && reset_on_progress)
> > - remaining = timeout_ms;
> > + timeout_end = jiffies + timeout_jiffies;
> >
> > console_unlock();
> >
> > /* Note: @diff is 0 if there are no usable consoles. */
> > - if (diff == 0 || remaining == 0)
> > + if (diff == 0)
> > break;
> >
> > - if (remaining < 0) {
> > - /* no timeout limit */
> > - msleep(100);
> > - } else if (remaining < 100) {
> > - msleep(remaining);
> > - remaining = 0;
> > - } else {
> > - msleep(100);
> > - remaining -= 100;
> > - }
> > + /* Negative timeout means an infinite wait. */
> > + if (timeout_ms >= 0 && time_after_eq(jiffies, timeout_end))
> > + break;
> > +
> > + msleep(2000 / HZ);
>
> Is there really any advantage to this? I would just do msleep(1) and let
> msleep round up. Everything else (tracking via jiffies) looks fine to me.

It was attempt to synchronize it with a scheduler tick. I saw it somewhere ;-)
But you are right. Let's keep it simple and use msleep(1).


> > last_diff = diff;
> > }
> >
> > And we should do this in a separate patch. It seems that sleeping
> > is a bigger magic than I expected.
>
> Agreed.

Are you going to prepare them or should I?

Best Regards,
Petr

2023-10-04 12:47:34

by John Ogness

[permalink] [raw]
Subject: Re: [PATCH printk] printk: flush consoles before checking progress

On 2023-10-04, Petr Mladek <[email protected]> wrote:
> Are you going to prepare them or should I?

Really it was your version that we are taking. I suppose you should post
them and I can review them.

Also, the Closes tag probably should just be on the lock/unlock patch.

John