Hi John & folks,
In 5.19, I'm seeing some changes in message ordering / interleaving
which lead to confusion. The most obvious (and benign) example appears
on system boot, in which the "Run /init as init process" message gets
intermixed with the messages that init actually writes() to stdout. For
example, here's a snippet from build.wireguard.com:
[ 0.469732] Freeing unused kernel image (initmem) memory: 4576K
[ 0.469738] Write protecting the kernel read-only data: 10240k
[ 0.473823] Freeing unused kernel image (text/rodata gap) memory: 2044K
[ 0.475228] Freeing unused kernel image (rodata/data gap) memory: 1136K
[ 0.475236] Run /init as init process
WireGuard Test Suite on Linux 5.19.0-rc2+ x86_64
[+] Mounting filesystems...
[+] Module self-tests:
* allowedips self-tests: pass
* nonce counter self-tests: pass
* ratelimiter self-tests: pass
[+] Enabling logging...
[+] Launching tests...
[ 0.475237] with arguments:
[ 0.475238] /init
[ 0.475238] with environment:
[ 0.475239] HOME=/
[ 0.475240] TERM=linux
[+] ip netns add wg-test-46-0
[+] ip netns add wg-test-46-1
Before the "with arguments:" and such would print prior to the
"wireguard test suite on linux 5.19" banner. Now it shows after.
I see the same thing with "Freeing unused kernel image (text/rodata gap)
memory" printing interwoven into the console of my initramfs on my
laptop. And so forth.
But the bigger issue for me is that it makes it very confusing to
interpret CI results later on. Prior, I would nice a nice correlation
of:
[+] some userspace command
[ 1.2345 ] some kernel log output
[+] some userspace command
[ 1.2346 ] some kernel log output
[+] some userspace command
[ 1.2347 ] some kernel log output
Now, the kernel log outputs are all over the place and out of order with
the sequence of commands. This makes debugging issues somewhat tricky,
because post hoc ergo propter hoc winds up being a good intuition to
follow when tracking down bugs, and now the post hoc part is muddled.
I assume this is mostly caused by your threaded printk patchset, which
moves some of that printing into a worker, which means it's more
dependent on the scheduler than before. This probably has important
benefits. But it certainly makes CI and related debugging a bit
trickier as a result.
So I was wondering if there was some way to boot the kernel with a
command line option or compile-time flag that always flushes printk
messages when they're made, or does something to make the ordering a bit
more faithful.
Thanks,
Jason
On Fri, Jun 17, 2022 at 03:23:02PM +0200, Jason A. Donenfeld wrote:
> So I was wondering if there was some way to boot the kernel with a
> command line option or compile-time flag that always flushes printk
> messages when they're made, or does something to make the ordering a bit
> more faithful.
I'll scratch my own itch. Patch incoming.
Jason
In 5.19, there are some changes in printk message ordering /
interleaving which leads to confusion. The most obvious (and benign)
example appears on system boot, in which the "Run /init as init process"
message gets intermixed with the messages that init actually writes() to
stdout. For example, here's a snippet from build.wireguard.com:
[ 0.469732] Freeing unused kernel image (initmem) memory: 4576K
[ 0.469738] Write protecting the kernel read-only data: 10240k
[ 0.473823] Freeing unused kernel image (text/rodata gap) memory: 2044K
[ 0.475228] Freeing unused kernel image (rodata/data gap) memory: 1136K
[ 0.475236] Run /init as init process
WireGuard Test Suite on Linux 5.19.0-rc2+ x86_64
[+] Mounting filesystems...
[+] Module self-tests:
* allowedips self-tests: pass
* nonce counter self-tests: pass
* ratelimiter self-tests: pass
[+] Enabling logging...
[+] Launching tests...
[ 0.475237] with arguments:
[ 0.475238] /init
[ 0.475238] with environment:
[ 0.475239] HOME=/
[ 0.475240] TERM=linux
[+] ip netns add wg-test-46-0
[+] ip netns add wg-test-46-1
Before the "with arguments:" and such would print prior to the
"wireguard test suite on linux 5.19" banner. Now it shows after.
I see the same thing with "Freeing unused kernel image (text/rodata gap)
memory" printing interwoven into the console of my initramfs on my
laptop. And so forth.
But the bigger issue for me is that it makes it very confusing to
interpret CI results later on. Prior, I would nice a nice correlation
of:
[+] some userspace command
[ 1.2345 ] some kernel log output
[+] some userspace command
[ 1.2346 ] some kernel log output
[+] some userspace command
[ 1.2347 ] some kernel log output
Now, the kernel log outputs are all over the place and out of order with
the sequence of commands. This makes debugging issues somewhat tricky,
because post hoc ergo propter hoc winds up being a good intuition to
follow when tracking down bugs, and now the post hoc part is muddled.
This is caused by threaded printk. In order to restore this in debugging
sessions and in CI, this commit adds the ability to always use direct
printk, either set by default at compile time, or overridden with a
runtime command line switch.
Cc: John Ogness <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Marco Elver <[email protected]>
Fixes: 09c5ba0aa2fc ("printk: add kthread console printers")
Signed-off-by: Jason A. Donenfeld <[email protected]>
---
Documentation/admin-guide/kernel-parameters.txt | 8 ++++++++
init/Kconfig | 12 ++++++++++++
kernel/printk/printk.c | 12 ++++++++++++
3 files changed, 32 insertions(+)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 8090130b544b..a960c47a2002 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4389,6 +4389,14 @@
printk.time= Show timing data prefixed to each printk message line
Format: <bool> (1/Y/y=enable, 0/N/n=disable)
+ printk.always_direct=
+ Rather than using kthreads for printk output, always
+ write to the console immediately. This has performance
+ implications, but will result in a more faithful
+ ordering and interleaving with other processes writing
+ to the console.
+ Format: <bool> (1/Y/y=enable, 0/N/n=disable)
+
processor.max_cstate= [HW,ACPI]
Limit processor to maximum C-state
max_cstate=9 overrides any DMI blacklist limit.
diff --git a/init/Kconfig b/init/Kconfig
index c7900e8975f1..7676897f2321 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -798,6 +798,18 @@ config PRINTK_INDEX
There is no additional runtime cost to printk with this enabled.
+config PRINTK_ALWAYS_DIRECT
+ bool "Flush printk output immediately"
+ depends on PRINTK
+ help
+ Rather than using kthreads for printk output, always write to the
+ console immediately. This has performance implications, but will
+ result in a more faithful ordering and interleaving with other
+ processes writing to the console.
+
+ Say N here unless you really need this. This may also be controlled
+ at boot time with printk.always_direct=0/1.
+
#
# Architectures with an unreliable sched_clock() should select this:
#
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index ea3dd55709e7..d9f419a88429 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -178,6 +178,14 @@ static int __init control_devkmsg(char *str)
}
__setup("printk.devkmsg=", control_devkmsg);
+static bool always_direct_printk = IS_ENABLED(CONFIG_PRINTK_ALWAYS_DIRECT);
+
+static int __init control_always_direct_printk(char *str)
+{
+ return kstrtobool(str, &always_direct_printk);
+}
+__setup("printk.always_direct=", control_always_direct_printk);
+
char devkmsg_log_str[DEVKMSG_STR_MAX_SIZE] = "ratelimit";
#if defined(CONFIG_PRINTK) && defined(CONFIG_SYSCTL)
int devkmsg_sysctl_set_loglvl(struct ctl_table *table, int write,
@@ -471,6 +479,10 @@ void printk_prefer_direct_exit(void)
*/
static inline bool allow_direct_printing(void)
{
+ /* If the user has explicitly enabled this to be on always. */
+ if (always_direct_printk)
+ return true;
+
/*
* Checking kthread availability is a possible race because the
* kthread printers can become permanently disabled during runtime.
--
2.35.1
On Fri 2022-06-17 15:23:02, Jason A. Donenfeld wrote:
> Hi John & folks,
>
> In 5.19, I'm seeing some changes in message ordering / interleaving
> which lead to confusion. The most obvious (and benign) example appears
> on system boot, in which the "Run /init as init process" message gets
> intermixed with the messages that init actually writes() to stdout. For
> example, here's a snippet from build.wireguard.com:
>
> [ 0.469732] Freeing unused kernel image (initmem) memory: 4576K
> [ 0.469738] Write protecting the kernel read-only data: 10240k
> [ 0.473823] Freeing unused kernel image (text/rodata gap) memory: 2044K
> [ 0.475228] Freeing unused kernel image (rodata/data gap) memory: 1136K
> [ 0.475236] Run /init as init process
>
>
> WireGuard Test Suite on Linux 5.19.0-rc2+ x86_64
>
>
> [+] Mounting filesystems...
> [+] Module self-tests:
> * allowedips self-tests: pass
> * nonce counter self-tests: pass
> * ratelimiter self-tests: pass
> [+] Enabling logging...
> [+] Launching tests...
> [ 0.475237] with arguments:
> [ 0.475238] /init
> [ 0.475238] with environment:
> [ 0.475239] HOME=/
> [ 0.475240] TERM=linux
> [+] ip netns add wg-test-46-0
> [+] ip netns add wg-test-46-1
I see.
> But the bigger issue for me is that it makes it very confusing to
> interpret CI results later on. Prior, I would nice a nice correlation
> of:
>
> [+] some userspace command
> [ 1.2345 ] some kernel log output
> [+] some userspace command
> [ 1.2346 ] some kernel log output
> [+] some userspace command
> [ 1.2347 ] some kernel log output
>
> I assume this is mostly caused by your threaded printk patchset
Console has never been fully synchronous. printk() did console_trylock()
and flushed the message to the console only the lock was available.
The console kthreads made it asynchronous always when the kthreads
are available and system is in normal state.
> This probably has important benefits. But it certainly makes CI
> and related debugging a bit trickier as a result.
I could imagine.
> So I was wondering if there was some way to boot the kernel with a
> command line option or compile-time flag that always flushes printk
> messages when they're made, or does something to make the ordering a bit
> more faithful.
I am pretty sure that we will have to add such an option sooner or
later. We did not want to do it from the beginning because otherwise
people would use it and would not tell use about their problematic
use-cases ;-)
In fact, in your case you might get even better synchronization
if you do it the other way and write userspace messages into
the kernel log via /dev/kmsg:
echo "Hello world" > /dev/kmsg
That said. I am going to look at your patch the following week.
I also want to wait for an opinion from John.
Best Regards,
Petr
On Fri, Jun 17, 2022 at 04:21:15PM +0200, Petr Mladek wrote:
> In fact, in your case you might get even better synchronization
> if you do it the other way and write userspace messages into
> the kernel log via /dev/kmsg:
>
> echo "Hello world" > /dev/kmsg
The idea is to interleave stderr/stdout properly, as CI has done for
ages, which threaded printk breaks, and which my patch (now posted)
fixes behind a default-off option. You can't dup2 /dev/kmsg to 1 and 2,
as the write() semantics are different, with respect to buffering and
with respect to new lines. The interfaces aren't comparable, so this
isn't quite a solution. (Besides, not breaking things as they currently
exist has a certain benefit.)
Jason
From: Petr Mladek
> Sent: 17 June 2022 15:21
...
> > I assume this is mostly caused by your threaded printk patchset
>
> Console has never been fully synchronous. printk() did console_trylock()
> and flushed the message to the console only the lock was available.
> The console kthreads made it asynchronous always when the kthreads
> are available and system is in normal state.
What priority do these kthreads run at?
I'd have thought they ought to run at a high priority?
That should tend to give kernel messages priority over user ones.
Quite how high is another matter.
Probably a bit below the RT/FIFO:50 of threaded ISR.
Although if an x86 one ends up doing software scrolling of
the vga text buffer using the ISA speed accesses that usually
requires maybe not!
Maybe that is faster on modern systems - but I doubt it.
David
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
On 6/17/22 06:38, Jason A. Donenfeld wrote:
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 8090130b544b..a960c47a2002 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -4389,6 +4389,14 @@
> printk.time= Show timing data prefixed to each printk message line
> Format: <bool> (1/Y/y=enable, 0/N/n=disable)
>
> + printk.always_direct=
> + Rather than using kthreads for printk output, always
> + write to the console immediately. This has performance
> + implications, but will result in a more faithful
> + ordering and interleaving with other processes writing
> + to the console.
> + Format: <bool> (1/Y/y=enable, 0/N/n=disable)
or:
+ Format: <bool> (1/Y/y/T/t/on=enable, 0/N/n/F/f/off=disable)
--
~Randy
On 2022-06-17, David Laight <[email protected]> wrote:
> What priority do these kthreads run at?
120 (SCHED_OTHER, nice=0)
> I'd have thought they ought to run at a high priority?
> That should tend to give kernel messages priority over user ones.
>
> Quite how high is another matter.
> Probably a bit below the RT/FIFO:50 of threaded ISR.
As a default value, I recommend keeping to the SCHED_OTHER policy as a
default. Perhaps a nice value of -20? There are quite a few kernel
threads using that as their default:
# ps -Leo ni,command | grep ^-20 | sort
-20 [acpi_thermal_pm]
-20 [ata_sff]
-20 [blkcg_punt_bio]
-20 [cfg80211]
-20 [inet_frag_wq]
-20 [ipv6_addrconf]
-20 [kblockd]
-20 [kworker/0:0H-events_highpri]
-20 [kworker/0:1H-events_highpri]
-20 [md]
-20 [mld]
-20 [mm_percpu_wq]
-20 [netns]
-20 [nfsiod]
-20 [rcu_gp]
-20 [rcu_par_gp]
-20 [rpciod]
-20 [scsi_tmf_0]
-20 [scsi_tmf_1]
-20 [writeback]
-20 [xprtiod]
John Ogness
On 6/19/22, Randy Dunlap <[email protected]> wrote:
>
>
> On 6/17/22 06:38, Jason A. Donenfeld wrote:
>> diff --git a/Documentation/admin-guide/kernel-parameters.txt
>> b/Documentation/admin-guide/kernel-parameters.txt
>> index 8090130b544b..a960c47a2002 100644
>> --- a/Documentation/admin-guide/kernel-parameters.txt
>> +++ b/Documentation/admin-guide/kernel-parameters.txt
>> @@ -4389,6 +4389,14 @@
>> printk.time= Show timing data prefixed to each printk message line
>> Format: <bool> (1/Y/y=enable, 0/N/n=disable)
>>
>> + printk.always_direct=
>> + Rather than using kthreads for printk output, always
>> + write to the console immediately. This has performance
>> + implications, but will result in a more faithful
>> + ordering and interleaving with other processes writing
>> + to the console.
>> + Format: <bool> (1/Y/y=enable, 0/N/n=disable)
>
> or:
>
> + Format: <bool> (1/Y/y/T/t/on=enable, 0/N/n/F/f/off=disable)
>
This is the same as the lines above and below. So I think you'd be
better off sending an independent patch that fixes all of these in one
fell swoop, rather than introducing an inconsistency in this one.
Jason
On 2022-06-17, "Jason A. Donenfeld" <[email protected]> wrote:
> In 5.19, there are some changes in printk message ordering /
> interleaving which leads to confusion. The most obvious (and benign)
> example appears on system boot, in which the "Run /init as init process"
> message gets intermixed with the messages that init actually writes() to
> stdout. For example, here's a snippet from build.wireguard.com:
>
> [ 0.469732] Freeing unused kernel image (initmem) memory: 4576K
> [ 0.469738] Write protecting the kernel read-only data: 10240k
> [ 0.473823] Freeing unused kernel image (text/rodata gap) memory: 2044K
> [ 0.475228] Freeing unused kernel image (rodata/data gap) memory: 1136K
> [ 0.475236] Run /init as init process
>
> WireGuard Test Suite on Linux 5.19.0-rc2+ x86_64
>
> [+] Mounting filesystems...
> [+] Module self-tests:
> * allowedips self-tests: pass
> * nonce counter self-tests: pass
> * ratelimiter self-tests: pass
> [+] Enabling logging...
> [+] Launching tests...
> [ 0.475237] with arguments:
> [ 0.475238] /init
> [ 0.475238] with environment:
> [ 0.475239] HOME=/
> [ 0.475240] TERM=linux
> [+] ip netns add wg-test-46-0
> [+] ip netns add wg-test-46-1
As Petr mentioned, this issue is not new. But with kthreads it has
become the common case.
> + printk.always_direct=
Do we need the word "always" in there? I would prefer the simplied:
printk.direct=
> + Rather than using kthreads for printk output, always
> + write to the console immediately. This has performance
The "best effort" part needs to be in there. Perhaps:
Rather than using kthreads for printk output, always attempt to write to
the console immediately.
> + implications, but will result in a more faithful
> + ordering and interleaving with other processes writing
> + to the console.
My main concern is that "direct printing" is not synchronous. Since
2.4.10 it has not been printk's goal to be synchronous. Your patch is
making the issue less likely again, but not solving it. And since this
is the first time we are documenting the printk printing behavior, we
should be careful to not mislead users to think it is truly direct.
The planned solution is atomic consoles, which are able to provide a
truly synchronous printk. But they are not yet available mainline.
> diff --git a/init/Kconfig b/init/Kconfig
> index c7900e8975f1..7676897f2321 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
> @@ -798,6 +798,18 @@ config PRINTK_INDEX
>
> There is no additional runtime cost to printk with this enabled.
>
> +config PRINTK_ALWAYS_DIRECT
(perhaps)
config PRINTK_DIRECT
> + bool "Flush printk output immediately"
Attempt to flush printk output immediately
> + depends on PRINTK
> + help
> + Rather than using kthreads for printk output, always write to the
always attempt to write
> + console immediately. This has performance implications, but will
> + result in a more faithful ordering and interleaving with other
> + processes writing to the console.
> +
> + Say N here unless you really need this. This may also be controlled
> + at boot time with printk.always_direct=0/1.
(perhaps)
printk.direct=0/1
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index ea3dd55709e7..d9f419a88429 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -471,6 +479,10 @@ void printk_prefer_direct_exit(void)
> */
> static inline bool allow_direct_printing(void)
> {
> + /* If the user has explicitly enabled this to be on always. */
> + if (always_direct_printk)
> + return true;
There is no reason to start the threads if they won't be used. Perhaps
something like this instead:
-------- BEGIN HUNK -------
@@ -3602,11 +3610,13 @@ static int __init printk_activate_kthreads(void)
{
struct console *con;
- console_lock();
- printk_kthreads_available = true;
- for_each_console(con)
- printk_start_kthread(con);
- console_unlock();
+ if (!always_direct_printk) {
+ console_lock();
+ printk_kthreads_available = true;
+ for_each_console(con)
+ printk_start_kthread(con);
+ console_unlock();
+ }
return 0;
}
-------- END HUNK -------
Some general comments from me...
Direct printing has a lot of technical issues. It is sometimes directly
responsible for kernels dying. It is sometimes directly responsible for
preventing important information from being made available at crash
time. For the fbcon, direct printing is a ticking timebomb. And in many
cases it isn't even truly doing direct printing anyway.
Direct printing (in its current form) must be phased out at some
point. We are working to bring true synchronous console printing
mainline.
Aside from my above comments, I have no problems with this patch.
John Ogness
From: John Ogness
> Sent: 19 June 2022 09:16
>
> On 2022-06-17, David Laight <[email protected]> wrote:
> > What priority do these kthreads run at?
>
> 120 (SCHED_OTHER, nice=0)
>
> > I'd have thought they ought to run at a high priority?
> > That should tend to give kernel messages priority over user ones.
> >
> > Quite how high is another matter.
> > Probably a bit below the RT/FIFO:50 of threaded ISR.
>
> As a default value, I recommend keeping to the SCHED_OTHER policy as a
> default. Perhaps a nice value of -20? There are quite a few kernel
> threads using that as their default:
That doesn't mean it is a sensible priority :-)
Running at (SCHED_OTHER, nice=0) is almost certainly worse.
There is little guarantee they'll run if the system is busy
and has non-default priority user threads.
I know there is the NIBY style 'my process is more important than yours'
But processes that don't run for very long, or have to run
in order to keep the system working properly, almost certainly
need to be higher priority than the lowest RT one.
I've been fighting system thread priorities on a system that
is doing a lot of real time audio over UDP (ie RTP).
The application threads processing the RTP need to run in preference
to all other application threads (nothing else is that time critical).
Processor affinities don't help - they can only really be used to
move things away from some cpu, and I need to use the idle time.
Running the RTP threads at a RT priority works reasonably well
except that some system threads (like the softint ones napi uses)
get blocked - causing lost packets.
Threaded napi helps - but only if I run the threads under the RT
scheduler.
David
>
> # ps -Leo ni,command | grep ^-20 | sort
> -20 [acpi_thermal_pm]
> -20 [ata_sff]
> -20 [blkcg_punt_bio]
> -20 [cfg80211]
> -20 [inet_frag_wq]
> -20 [ipv6_addrconf]
> -20 [kblockd]
> -20 [kworker/0:0H-events_highpri]
> -20 [kworker/0:1H-events_highpri]
> -20 [md]
> -20 [mld]
> -20 [mm_percpu_wq]
> -20 [netns]
> -20 [nfsiod]
> -20 [rcu_gp]
> -20 [rcu_par_gp]
> -20 [rpciod]
> -20 [scsi_tmf_0]
> -20 [scsi_tmf_1]
> -20 [writeback]
> -20 [xprtiod]
>
> John Ogness
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Hi John,
Thanks for your review. I'll have a v2 for you shortly.
On Sun, Jun 19, 2022 at 01:11:43PM +0206, John Ogness wrote:
> > + printk.always_direct=
>
> Do we need the word "always" in there? I would prefer the simplied:
> printk.direct=
Sure, no problem. I'll do that for v2.
>
> > + Rather than using kthreads for printk output, always
> > + write to the console immediately. This has performance
>
> The "best effort" part needs to be in there. Perhaps:
>
> Rather than using kthreads for printk output, always attempt to write to
> the console immediately.
Ack.
>
> > + implications, but will result in a more faithful
> > + ordering and interleaving with other processes writing
> > + to the console.
>
> My main concern is that "direct printing" is not synchronous. Since
> 2.4.10 it has not been printk's goal to be synchronous. Your patch is
> making the issue less likely again, but not solving it. And since this
> is the first time we are documenting the printk printing behavior, we
> should be careful to not mislead users to think it is truly direct.
Sure, that makes sense. The "attempt to" language clarifies that well.
> > +config PRINTK_ALWAYS_DIRECT
>
> (perhaps)
> config PRINTK_DIRECT
Ack.
>
> > + bool "Flush printk output immediately"
>
> Attempt to flush printk output immediately
Ack.
>
> > + depends on PRINTK
> > + help
> > + Rather than using kthreads for printk output, always write to the
>
> always attempt to write
Ack.
>
> > + console immediately. This has performance implications, but will
> > + result in a more faithful ordering and interleaving with other
> > + processes writing to the console.
> > +
> > + Say N here unless you really need this. This may also be controlled
> > + at boot time with printk.always_direct=0/1.
>
> (perhaps)
> printk.direct=0/1
Ack.
>
> > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> > index ea3dd55709e7..d9f419a88429 100644
> > --- a/kernel/printk/printk.c
> > +++ b/kernel/printk/printk.c
> > @@ -471,6 +479,10 @@ void printk_prefer_direct_exit(void)
> > */
> > static inline bool allow_direct_printing(void)
> > {
> > + /* If the user has explicitly enabled this to be on always. */
> > + if (always_direct_printk)
> > + return true;
>
> There is no reason to start the threads if they won't be used. Perhaps
> something like this instead:
>
> -------- BEGIN HUNK -------
> @@ -3602,11 +3610,13 @@ static int __init printk_activate_kthreads(void)
> {
> struct console *con;
>
> - console_lock();
> - printk_kthreads_available = true;
> - for_each_console(con)
> - printk_start_kthread(con);
> - console_unlock();
> + if (!always_direct_printk) {
> + console_lock();
> + printk_kthreads_available = true;
> + for_each_console(con)
> + printk_start_kthread(con);
> + console_unlock();
> + }
>
> return 0;
> }
> -------- END HUNK -------
Good thinking. I'll do it like this.
> Direct printing has a lot of technical issues. It is sometimes directly
> responsible for kernels dying. It is sometimes directly responsible for
> preventing important information from being made available at crash
> time. For the fbcon, direct printing is a ticking timebomb. And in many
> cases it isn't even truly doing direct printing anyway.
>
> Direct printing (in its current form) must be phased out at some
> point. We are working to bring true synchronous console printing
> mainline.
>
> Aside from my above comments, I have no problems with this patch.
That's a fair point, though it remains a valuable resource for debugging
and CI. The way this patch is written, it defaults to off, and the help
text is kind of discouraging of its use, which is what we want I think.
v2 on its way.
Jason
In 5.19, there are some changes in printk message ordering /
interleaving which leads to confusion. The most obvious (and benign)
example appears on system boot, in which the "Run /init as init process"
message gets intermixed with the messages that init actually writes() to
stdout. For example, here's a snippet from build.wireguard.com:
[ 0.469732] Freeing unused kernel image (initmem) memory: 4576K
[ 0.469738] Write protecting the kernel read-only data: 10240k
[ 0.473823] Freeing unused kernel image (text/rodata gap) memory: 2044K
[ 0.475228] Freeing unused kernel image (rodata/data gap) memory: 1136K
[ 0.475236] Run /init as init process
WireGuard Test Suite on Linux 5.19.0-rc2+ x86_64
[+] Mounting filesystems...
[+] Module self-tests:
* allowedips self-tests: pass
* nonce counter self-tests: pass
* ratelimiter self-tests: pass
[+] Enabling logging...
[+] Launching tests...
[ 0.475237] with arguments:
[ 0.475238] /init
[ 0.475238] with environment:
[ 0.475239] HOME=/
[ 0.475240] TERM=linux
[+] ip netns add wg-test-46-0
[+] ip netns add wg-test-46-1
Before the "with arguments:" and such would print prior to the
"wireguard test suite on linux 5.19" banner. Now it shows after.
I see the same thing with "Freeing unused kernel image (text/rodata gap)
memory" printing interwoven into the console of my initramfs on my
laptop. And so forth.
But the bigger issue for me is that it makes it very confusing to
interpret CI results later on. Prior, I would nice a nice correlation
of:
[+] some userspace command
[ 1.2345 ] some kernel log output
[+] some userspace command
[ 1.2346 ] some kernel log output
[+] some userspace command
[ 1.2347 ] some kernel log output
Now, the kernel log outputs are all over the place and out of order with
the sequence of commands. This makes debugging issues somewhat tricky,
because post hoc ergo propter hoc winds up being a good intuition to
follow when tracking down bugs, and now the post hoc part is muddled.
This is caused by threaded printk. In order to restore this in debugging
sessions and in CI, this commit adds the ability to always use direct
printk, either set by default at compile time, or overridden with a
runtime command line switch.
Cc: John Ogness <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Marco Elver <[email protected]>
Fixes: 09c5ba0aa2fc ("printk: add kthread console printers")
Signed-off-by: Jason A. Donenfeld <[email protected]>
---
Documentation/admin-guide/kernel-parameters.txt | 7 +++++++
init/Kconfig | 12 ++++++++++++
kernel/printk/printk.c | 11 +++++++++++
3 files changed, 30 insertions(+)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 8090130b544b..96e910a4e12b 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4389,6 +4389,13 @@
printk.time= Show timing data prefixed to each printk message line
Format: <bool> (1/Y/y=enable, 0/N/n=disable)
+ printk.direct= Rather than using kthreads for printk output, always
+ attempt to write to the console immediately. This has
+ performance implications, but will result in a more
+ faithful ordering and interleaving with other
+ processes writing to the console.
+ Format: <bool> (1/Y/y=enable, 0/N/n=disable)
+
processor.max_cstate= [HW,ACPI]
Limit processor to maximum C-state
max_cstate=9 overrides any DMI blacklist limit.
diff --git a/init/Kconfig b/init/Kconfig
index c7900e8975f1..47466aa2b0e8 100644
--- a/init/Kconfig
+++ b/init/Kconfig
@@ -798,6 +798,18 @@ config PRINTK_INDEX
There is no additional runtime cost to printk with this enabled.
+config PRINTK_DIRECT
+ bool "Attempt to flush printk output immediately"
+ depends on PRINTK
+ help
+ Rather than using kthreads for printk output, always attempt to write
+ to the console immediately. This has performance implications, but
+ will result in a more faithful ordering and interleaving with other
+ processes writing to the console.
+
+ Say N here unless you really need this. This may also be controlled
+ at boot time with printk.direct=0/1.
+
#
# Architectures with an unreliable sched_clock() should select this:
#
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index ea3dd55709e7..43f8a0074ed6 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -178,6 +178,14 @@ static int __init control_devkmsg(char *str)
}
__setup("printk.devkmsg=", control_devkmsg);
+static bool printk_direct = IS_ENABLED(CONFIG_PRINTK_DIRECT);
+
+static int __init control_printk_direct(char *str)
+{
+ return kstrtobool(str, &printk_direct);
+}
+__setup("printk.direct=", control_printk_direct);
+
char devkmsg_log_str[DEVKMSG_STR_MAX_SIZE] = "ratelimit";
#if defined(CONFIG_PRINTK) && defined(CONFIG_SYSCTL)
int devkmsg_sysctl_set_loglvl(struct ctl_table *table, int write,
@@ -3602,6 +3610,9 @@ static int __init printk_activate_kthreads(void)
{
struct console *con;
+ if (printk_direct)
+ return 0;
+
console_lock();
printk_kthreads_available = true;
for_each_console(con)
--
2.35.1
On 2022-06-19, "Jason A. Donenfeld" <[email protected]> wrote:
> diff --git a/init/Kconfig b/init/Kconfig
> index c7900e8975f1..47466aa2b0e8 100644
> --- a/init/Kconfig
> +++ b/init/Kconfig
Sorry, I missed this in your v1. But I think this config belongs in
lib/Kconfig.debug under the "printk and dmesg options" menu.
> @@ -798,6 +798,18 @@ config PRINTK_INDEX
>
> There is no additional runtime cost to printk with this enabled.
>
> +config PRINTK_DIRECT
> + bool "Attempt to flush printk output immediately"
> + depends on PRINTK
> + help
> + Rather than using kthreads for printk output, always attempt to write
> + to the console immediately. This has performance implications, but
> + will result in a more faithful ordering and interleaving with other
> + processes writing to the console.
> +
> + Say N here unless you really need this. This may also be controlled
> + at boot time with printk.direct=0/1.
> +
> #
> # Architectures with an unreliable sched_clock() should select this:
> #
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index ea3dd55709e7..43f8a0074ed6 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -178,6 +178,14 @@ static int __init control_devkmsg(char *str)
> }
> __setup("printk.devkmsg=", control_devkmsg);
>
> +static bool printk_direct = IS_ENABLED(CONFIG_PRINTK_DIRECT);
I understand why you would name the variable to match the boot arg. But
I would prefer the _internal_ variable had a name that makes it clear
(to us developers) that it is a permanent setting. Perhaps
printk_direct_only or printk_direct_always?
The reason is because when kthreads are active, direct printing is
turned on and off dynamically (using @printk_prefer_direct). And if this
new variable is named @printk_direct, one could easily mistake its
meaning to be related to the dynamic turning on and off.
> +
> +static int __init control_printk_direct(char *str)
> +{
> + return kstrtobool(str, &printk_direct);
> +}
> +__setup("printk.direct=", control_printk_direct);
> +
> char devkmsg_log_str[DEVKMSG_STR_MAX_SIZE] = "ratelimit";
> #if defined(CONFIG_PRINTK) && defined(CONFIG_SYSCTL)
> int devkmsg_sysctl_set_loglvl(struct ctl_table *table, int write,
> @@ -3602,6 +3610,9 @@ static int __init printk_activate_kthreads(void)
> {
> struct console *con;
>
> + if (printk_direct)
I'm wondering if we should output a message here. My suggestion is:
pr_info("printing threads disabled, using direct printing\n");
> + return 0;
> +
> console_lock();
> printk_kthreads_available = true;
> for_each_console(con)
Otherwise it looks OK to me. But you may want to wait on a response from
Petr, Sergey, or Steven before sending a v3. You are adding a kernel
config and a boot argument. Both of these are sensitive topics that
require more feedback from others.
John Ogness
Hi John,
On Mon, Jun 20, 2022 at 01:23:04AM +0206, John Ogness wrote:
> On 2022-06-19, "Jason A. Donenfeld" <[email protected]> wrote:
> > diff --git a/init/Kconfig b/init/Kconfig
> > index c7900e8975f1..47466aa2b0e8 100644
> > --- a/init/Kconfig
> > +++ b/init/Kconfig
>
> Sorry, I missed this in your v1. But I think this config belongs in
> lib/Kconfig.debug under the "printk and dmesg options" menu.
That's better. It helps drive home that it's a debug thing.
> > +static bool printk_direct = IS_ENABLED(CONFIG_PRINTK_DIRECT);
>
> I understand why you would name the variable to match the boot arg. But
> I would prefer the _internal_ variable had a name that makes it clear
> (to us developers) that it is a permanent setting. Perhaps
> printk_direct_only or printk_direct_always?
Sure, I'll do that. The variable can also be __initdata, since it's only
used inside of an __init function.
> > + if (printk_direct)
>
> I'm wondering if we should output a message here. My suggestion is:
>
> pr_info("printing threads disabled, using direct printing\n");
That seems a bit heavy to me, and just adds more log spam. Moving it
into the debug kconfig zone seems like the right way to handle this
instead.
> > + return 0;
> > +
> > console_lock();
> > printk_kthreads_available = true;
> > for_each_console(con)
>
> Otherwise it looks OK to me. But you may want to wait on a response from
> Petr, Sergey, or Steven before sending a v3. You are adding a kernel
> config and a boot argument. Both of these are sensitive topics that
> require more feedback from others.
Fair enough. It's easy enough to send a v3, and I agree with most of
your suggestions, so I'll send that now, and we'll wait to hear if the
others think it's fine too.
Jason
In 5.19, there are some changes in printk message ordering /
interleaving which leads to confusion. The most obvious (and benign)
example appears on system boot, in which the "Run /init as init process"
message gets intermixed with the messages that init actually writes() to
stdout. For example, here's a snippet from build.wireguard.com:
[ 0.469732] Freeing unused kernel image (initmem) memory: 4576K
[ 0.469738] Write protecting the kernel read-only data: 10240k
[ 0.473823] Freeing unused kernel image (text/rodata gap) memory: 2044K
[ 0.475228] Freeing unused kernel image (rodata/data gap) memory: 1136K
[ 0.475236] Run /init as init process
WireGuard Test Suite on Linux 5.19.0-rc2+ x86_64
[+] Mounting filesystems...
[+] Module self-tests:
* allowedips self-tests: pass
* nonce counter self-tests: pass
* ratelimiter self-tests: pass
[+] Enabling logging...
[+] Launching tests...
[ 0.475237] with arguments:
[ 0.475238] /init
[ 0.475238] with environment:
[ 0.475239] HOME=/
[ 0.475240] TERM=linux
[+] ip netns add wg-test-46-0
[+] ip netns add wg-test-46-1
Before the "with arguments:" and such would print prior to the
"wireguard test suite on linux 5.19" banner. Now it shows after.
I see the same thing with "Freeing unused kernel image (text/rodata gap)
memory" printing interwoven into the console of my initramfs on my
laptop. And so forth.
But the bigger issue for me is that it makes it very confusing to
interpret CI results later on. Prior, I would nice a nice correlation
of:
[+] some userspace command
[ 1.2345 ] some kernel log output
[+] some userspace command
[ 1.2346 ] some kernel log output
[+] some userspace command
[ 1.2347 ] some kernel log output
Now, the kernel log outputs are all over the place and out of order with
the sequence of commands. This makes debugging issues somewhat tricky,
because post hoc ergo propter hoc winds up being a good intuition to
follow when tracking down bugs, and now the post hoc part is muddled.
This is caused by threaded printk. In order to restore this in debugging
sessions and in CI, this commit adds the ability to always use direct
printk, either set by default at compile time, or overridden with a
runtime command line switch.
Cc: John Ogness <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Marco Elver <[email protected]>
Fixes: 09c5ba0aa2fc ("printk: add kthread console printers")
Signed-off-by: Jason A. Donenfeld <[email protected]>
---
Documentation/admin-guide/kernel-parameters.txt | 7 +++++++
kernel/printk/printk.c | 11 +++++++++++
lib/Kconfig.debug | 12 ++++++++++++
3 files changed, 30 insertions(+)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 2522b11e593f..04cec66802d1 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -4424,6 +4424,13 @@
printk.time= Show timing data prefixed to each printk message line
Format: <bool> (1/Y/y=enable, 0/N/n=disable)
+ printk.direct= Rather than using kthreads for printk output, always
+ attempt to write to the console immediately. This has
+ performance implications, but will result in a more
+ faithful ordering and interleaving with other
+ processes writing to the console.
+ Format: <bool> (1/Y/y=enable, 0/N/n=disable)
+
processor.max_cstate= [HW,ACPI]
Limit processor to maximum C-state
max_cstate=9 overrides any DMI blacklist limit.
diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index b095fb5f5f61..b7f8f2904f2c 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -178,6 +178,14 @@ static int __init control_devkmsg(char *str)
}
__setup("printk.devkmsg=", control_devkmsg);
+static bool printk_direct_only __initdata = IS_ENABLED(CONFIG_PRINTK_DIRECT);
+
+static int __init control_printk_direct_only(char *str)
+{
+ return kstrtobool(str, &printk_direct_only);
+}
+__setup("printk.direct=", control_printk_direct_only);
+
char devkmsg_log_str[DEVKMSG_STR_MAX_SIZE] = "ratelimit";
#if defined(CONFIG_PRINTK) && defined(CONFIG_SYSCTL)
int devkmsg_sysctl_set_loglvl(struct ctl_table *table, int write,
@@ -3605,6 +3613,9 @@ static int __init printk_activate_kthreads(void)
{
struct console *con;
+ if (printk_direct_only)
+ return 0;
+
console_lock();
printk_kthreads_available = true;
for_each_console(con)
diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
index 2e24db4bff19..1acfb4971ec7 100644
--- a/lib/Kconfig.debug
+++ b/lib/Kconfig.debug
@@ -35,6 +35,18 @@ config PRINTK_CALLER
no option to enable/disable at the kernel command line parameter or
sysfs interface.
+config PRINTK_DIRECT
+ bool "Attempt to flush printk output immediately"
+ depends on PRINTK
+ help
+ Rather than using kthreads for printk output, always attempt to write
+ to the console immediately. This has performance implications, but
+ will result in a more faithful ordering and interleaving with other
+ processes writing to the console.
+
+ Say N here unless you really need this. This may also be controlled
+ at boot time with printk.direct=0/1.
+
config STACKTRACE_BUILD_ID
bool "Show build ID information in stacktraces"
depends on PRINTK
--
2.35.1
From: John Ogness
> Sent: 20 June 2022 00:17
>
> On 2022-06-19, "Jason A. Donenfeld" <[email protected]> wrote:
> > diff --git a/init/Kconfig b/init/Kconfig
> > index c7900e8975f1..47466aa2b0e8 100644
> > --- a/init/Kconfig
> > +++ b/init/Kconfig
>
> Sorry, I missed this in your v1. But I think this config belongs in
> lib/Kconfig.debug under the "printk and dmesg options" menu.
>
> > @@ -798,6 +798,18 @@ config PRINTK_INDEX
> >
> > There is no additional runtime cost to printk with this enabled.
> >
> > +config PRINTK_DIRECT
> > + bool "Attempt to flush printk output immediately"
> > + depends on PRINTK
> > + help
> > + Rather than using kthreads for printk output, always attempt to write
> > + to the console immediately. This has performance implications, but
> > + will result in a more faithful ordering and interleaving with other
> > + processes writing to the console.
> > +
> > + Say N here unless you really need this. This may also be controlled
> > + at boot time with printk.direct=0/1.
> > +
> > #
> > # Architectures with an unreliable sched_clock() should select this:
> > #
> > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> > index ea3dd55709e7..43f8a0074ed6 100644
> > --- a/kernel/printk/printk.c
> > +++ b/kernel/printk/printk.c
> > @@ -178,6 +178,14 @@ static int __init control_devkmsg(char *str)
> > }
> > __setup("printk.devkmsg=", control_devkmsg);
> >
> > +static bool printk_direct = IS_ENABLED(CONFIG_PRINTK_DIRECT);
>
> I understand why you would name the variable to match the boot arg. But
> I would prefer the _internal_ variable had a name that makes it clear
> (to us developers) that it is a permanent setting. Perhaps
> printk_direct_only or printk_direct_always?
>
> The reason is because when kthreads are active, direct printing is
> turned on and off dynamically (using @printk_prefer_direct). And if this
> new variable is named @printk_direct, one could easily mistake its
> meaning to be related to the dynamic turning on and off.
Do you need both variables?
I presume the whole lot can be compiled out (for small kernels).
Apart from that having a sysctl to control which message levels
get 'direct printing' (much like the one that controls whether
they get printed at all) would surely be enough.
That just leaves the question of the initial level at boot.
David
>
> > +
> > +static int __init control_printk_direct(char *str)
> > +{
> > + return kstrtobool(str, &printk_direct);
> > +}
> > +__setup("printk.direct=", control_printk_direct);
> > +
> > char devkmsg_log_str[DEVKMSG_STR_MAX_SIZE] = "ratelimit";
> > #if defined(CONFIG_PRINTK) && defined(CONFIG_SYSCTL)
> > int devkmsg_sysctl_set_loglvl(struct ctl_table *table, int write,
> > @@ -3602,6 +3610,9 @@ static int __init printk_activate_kthreads(void)
> > {
> > struct console *con;
> >
> > + if (printk_direct)
>
> I'm wondering if we should output a message here. My suggestion is:
>
> pr_info("printing threads disabled, using direct printing\n");
>
> > + return 0;
> > +
> > console_lock();
> > printk_kthreads_available = true;
> > for_each_console(con)
>
> Otherwise it looks OK to me. But you may want to wait on a response from
> Petr, Sergey, or Steven before sending a v3. You are adding a kernel
> config and a boot argument. Both of these are sensitive topics that
> require more feedback from others.
>
> John Ogness
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
On (22/06/20 01:23), John Ogness wrote:
> I'm wondering if we should output a message here. My suggestion is:
>
> pr_info("printing threads disabled, using direct printing\n");
>
> > + return 0;
> > +
> > console_lock();
> > printk_kthreads_available = true;
> > for_each_console(con)
>
> Otherwise it looks OK to me. But you may want to wait on a response from
> Petr, Sergey, or Steven before sending a v3. You are adding a kernel
> config and a boot argument. Both of these are sensitive topics that
> require more feedback from others.
A tricky situation. I think John already mentioned in another thread
that any such config/boot param potentially can become a default value
for some setups; at the same time, well, yeah, I see what commit message
talks about.
May I just dump some thoughts?
I guess user-space "[+] some userspace command" are write()-s to /dev/ttyX
which end up in uart circular buffer and which are printed from console
IRQ handler (where it handles both TX and RX)
console_irq()
{
int count = XXX;
spin lock irqsafe port->lock
RX
do {
TX char from xmit->buf
} while (--count);
spin unlock irqrestore port->lock
}
The "[ 1.2345 ] some kernel log output" (which I assume are printed
by user-space from some syscall?) now take a much more lengthy path to
console and are always async/deferred.
I know that John and Petr (rightfully so) will hate what I'm about to
say, but would it make sense/be possible/etc. to look into possibility
to address that "deferred kthread vs IRQ" race by moving prb entries
emit to the same IRQ handler that emits chars from uart xmit buffer?
In other words:
console_irq()
{
int count = XXX;
spin lock irqsafe port->lock
RX
do {
TX char from xmit->buf
} while (--count);
+ count = XXX;
+ do {
+ console_emit_next_record();
+ } while (--count);
spin unlock irqrestore port->lock
}
Hi Sergey,
On Mon, Jun 20, 2022 at 02:17:36PM +0900, Sergey Senozhatsky wrote:
> On (22/06/20 01:23), John Ogness wrote:
> > I'm wondering if we should output a message here. My suggestion is:
> >
> > pr_info("printing threads disabled, using direct printing\n");
> >
> > > + return 0;
> > > +
> > > console_lock();
> > > printk_kthreads_available = true;
> > > for_each_console(con)
> >
> > Otherwise it looks OK to me. But you may want to wait on a response from
> > Petr, Sergey, or Steven before sending a v3. You are adding a kernel
> > config and a boot argument. Both of these are sensitive topics that
> > require more feedback from others.
>
> A tricky situation. I think John already mentioned in another thread
> that any such config/boot param potentially can become a default value
> for some setups; at the same time, well, yeah, I see what commit message
> talks about.
Moving it into the debug section of kconfig hopefully sends the right
message there. Plus the help text makes it clear there are performance
implications and such.
Jason
On Mon 2022-06-20 01:33:02, Jason A. Donenfeld wrote:
> In 5.19, there are some changes in printk message ordering /
> interleaving which leads to confusion. The most obvious (and benign)
> example appears on system boot, in which the "Run /init as init process"
> message gets intermixed with the messages that init actually writes() to
> stdout. For example, here's a snippet from build.wireguard.com:
>
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -35,6 +35,18 @@ config PRINTK_CALLER
> no option to enable/disable at the kernel command line parameter or
> sysfs interface.
>
> +config PRINTK_DIRECT
> + bool "Attempt to flush printk output immediately"
> + depends on PRINTK
> + help
> + Rather than using kthreads for printk output, always attempt to write
> + to the console immediately. This has performance implications, but
> + will result in a more faithful ordering and interleaving with other
> + processes writing to the console.
> +
> + Say N here unless you really need this. This may also be controlled
> + at boot time with printk.direct=0/1.
> +
I am fine with the patch except for the config option. Is there
any particular reason why we need it, please?
We should be careful to add new build options in general because they
make building the kernel more complicated. People need to understand
what each option is about and what they want.
Kernel parameters are less intrusive. People need to care about
it only when they need some special behavior. And everyone has
its own default command line anyway. For example, I always use
ignore_loglevel.
Best Regards,
Petr
Hi Petr,
On Mon, Jun 20, 2022 at 06:58:13PM +0200, Petr Mladek wrote:
> On Mon 2022-06-20 01:33:02, Jason A. Donenfeld wrote:
> > In 5.19, there are some changes in printk message ordering /
> > interleaving which leads to confusion. The most obvious (and benign)
> > example appears on system boot, in which the "Run /init as init process"
> > message gets intermixed with the messages that init actually writes() to
> > stdout. For example, here's a snippet from build.wireguard.com:
> >
> > --- a/lib/Kconfig.debug
> > +++ b/lib/Kconfig.debug
> > @@ -35,6 +35,18 @@ config PRINTK_CALLER
> > no option to enable/disable at the kernel command line parameter or
> > sysfs interface.
> >
> > +config PRINTK_DIRECT
> > + bool "Attempt to flush printk output immediately"
> > + depends on PRINTK
> > + help
> > + Rather than using kthreads for printk output, always attempt to write
> > + to the console immediately. This has performance implications, but
> > + will result in a more faithful ordering and interleaving with other
> > + processes writing to the console.
> > +
> > + Say N here unless you really need this. This may also be controlled
> > + at boot time with printk.direct=0/1.
> > +
>
> I am fine with the patch except for the config option. Is there
> any particular reason why we need it, please?
It's a debug option that I'd like to be able to easily build my CI
kernels with, just like various other debug options.
Jason
> We should be careful to add new build options in general because they
> make building the kernel more complicated. People need to understand
> what each option is about and what they want.
This one doesn't, really. It'd hidden away in the debug menu.
Jason
On (22/06/20 09:56), Jason A. Donenfeld wrote:
> > > Otherwise it looks OK to me. But you may want to wait on a response from
> > > Petr, Sergey, or Steven before sending a v3. You are adding a kernel
> > > config and a boot argument. Both of these are sensitive topics that
> > > require more feedback from others.
> >
> > A tricky situation. I think John already mentioned in another thread
> > that any such config/boot param potentially can become a default value
> > for some setups; at the same time, well, yeah, I see what commit message
> > talks about.
>
> Moving it into the debug section of kconfig hopefully sends the right
> message there. Plus the help text makes it clear there are performance
> implications and such.
No objections from me.
From: Petr Mladek
> Sent: 20 June 2022 17:58
>
> On Mon 2022-06-20 01:33:02, Jason A. Donenfeld wrote:
> > In 5.19, there are some changes in printk message ordering /
> > interleaving which leads to confusion. The most obvious (and benign)
> > example appears on system boot, in which the "Run /init as init process"
> > message gets intermixed with the messages that init actually writes() to
> > stdout. For example, here's a snippet from build.wireguard.com:
> >
> > --- a/lib/Kconfig.debug
> > +++ b/lib/Kconfig.debug
> > @@ -35,6 +35,18 @@ config PRINTK_CALLER
> > no option to enable/disable at the kernel command line parameter or
> > sysfs interface.
> >
> > +config PRINTK_DIRECT
> > + bool "Attempt to flush printk output immediately"
> > + depends on PRINTK
> > + help
> > + Rather than using kthreads for printk output, always attempt to write
> > + to the console immediately. This has performance implications, but
> > + will result in a more faithful ordering and interleaving with other
> > + processes writing to the console.
> > +
> > + Say N here unless you really need this. This may also be controlled
> > + at boot time with printk.direct=0/1.
> > +
>
> I am fine with the patch except for the config option. Is there
> any particular reason why we need it, please?
>
> We should be careful to add new build options in general because they
> make building the kernel more complicated. People need to understand
> what each option is about and what they want.
Especially since most people use pre-built kernels.
I suspect that kernels only get build for very big and
very small/embedded systems.
For the latter you may want to completely remove using kthreads
for printk in order to shrink the kernel size.
But that doesn't look like what this option is for.
David
> Kernel parameters are less intrusive. People need to care about
> it only when they need some special behavior. And everyone has
> its own default command line anyway. For example, I always use
> ignore_loglevel.
>
> Best Regards,
> Petr
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
Hey John, Petr, Sergey,
On Mon, Jun 20, 2022 at 01:33:02AM +0200, Jason A. Donenfeld wrote:
> In 5.19, there are some changes in printk message ordering /
> interleaving which leads to confusion. The most obvious (and benign)
> example appears on system boot, in which the "Run /init as init process"
> message gets intermixed with the messages that init actually writes() to
> stdout. For example, here's a snippet from build.wireguard.com:
>
> [ 0.469732] Freeing unused kernel image (initmem) memory: 4576K
> [ 0.469738] Write protecting the kernel read-only data: 10240k
> [ 0.473823] Freeing unused kernel image (text/rodata gap) memory: 2044K
> [ 0.475228] Freeing unused kernel image (rodata/data gap) memory: 1136K
> [ 0.475236] Run /init as init process
>
> WireGuard Test Suite on Linux 5.19.0-rc2+ x86_64
>
> [+] Mounting filesystems...
> [+] Module self-tests:
> * allowedips self-tests: pass
> * nonce counter self-tests: pass
> * ratelimiter self-tests: pass
> [+] Enabling logging...
> [+] Launching tests...
> [ 0.475237] with arguments:
> [ 0.475238] /init
> [ 0.475238] with environment:
> [ 0.475239] HOME=/
> [ 0.475240] TERM=linux
> [+] ip netns add wg-test-46-0
> [+] ip netns add wg-test-46-1
>
> Before the "with arguments:" and such would print prior to the
> "wireguard test suite on linux 5.19" banner. Now it shows after.
>
> I see the same thing with "Freeing unused kernel image (text/rodata gap)
> memory" printing interwoven into the console of my initramfs on my
> laptop. And so forth.
>
> But the bigger issue for me is that it makes it very confusing to
> interpret CI results later on. Prior, I would nice a nice correlation
> of:
>
> [+] some userspace command
> [ 1.2345 ] some kernel log output
> [+] some userspace command
> [ 1.2346 ] some kernel log output
> [+] some userspace command
> [ 1.2347 ] some kernel log output
>
> Now, the kernel log outputs are all over the place and out of order with
> the sequence of commands. This makes debugging issues somewhat tricky,
> because post hoc ergo propter hoc winds up being a good intuition to
> follow when tracking down bugs, and now the post hoc part is muddled.
>
> This is caused by threaded printk. In order to restore this in debugging
> sessions and in CI, this commit adds the ability to always use direct
> printk, either set by default at compile time, or overridden with a
> runtime command line switch.
>
> Cc: John Ogness <[email protected]>
> Cc: Petr Mladek <[email protected]>
> Cc: Marco Elver <[email protected]>
> Fixes: 09c5ba0aa2fc ("printk: add kthread console printers")
> Signed-off-by: Jason A. Donenfeld <[email protected]>
> ---
> Documentation/admin-guide/kernel-parameters.txt | 7 +++++++
> kernel/printk/printk.c | 11 +++++++++++
> lib/Kconfig.debug | 12 ++++++++++++
> 3 files changed, 30 insertions(+)
>
> diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> index 2522b11e593f..04cec66802d1 100644
> --- a/Documentation/admin-guide/kernel-parameters.txt
> +++ b/Documentation/admin-guide/kernel-parameters.txt
> @@ -4424,6 +4424,13 @@
> printk.time= Show timing data prefixed to each printk message line
> Format: <bool> (1/Y/y=enable, 0/N/n=disable)
>
> + printk.direct= Rather than using kthreads for printk output, always
> + attempt to write to the console immediately. This has
> + performance implications, but will result in a more
> + faithful ordering and interleaving with other
> + processes writing to the console.
> + Format: <bool> (1/Y/y=enable, 0/N/n=disable)
> +
> processor.max_cstate= [HW,ACPI]
> Limit processor to maximum C-state
> max_cstate=9 overrides any DMI blacklist limit.
> diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> index b095fb5f5f61..b7f8f2904f2c 100644
> --- a/kernel/printk/printk.c
> +++ b/kernel/printk/printk.c
> @@ -178,6 +178,14 @@ static int __init control_devkmsg(char *str)
> }
> __setup("printk.devkmsg=", control_devkmsg);
>
> +static bool printk_direct_only __initdata = IS_ENABLED(CONFIG_PRINTK_DIRECT);
> +
> +static int __init control_printk_direct_only(char *str)
> +{
> + return kstrtobool(str, &printk_direct_only);
> +}
> +__setup("printk.direct=", control_printk_direct_only);
> +
> char devkmsg_log_str[DEVKMSG_STR_MAX_SIZE] = "ratelimit";
> #if defined(CONFIG_PRINTK) && defined(CONFIG_SYSCTL)
> int devkmsg_sysctl_set_loglvl(struct ctl_table *table, int write,
> @@ -3605,6 +3613,9 @@ static int __init printk_activate_kthreads(void)
> {
> struct console *con;
>
> + if (printk_direct_only)
> + return 0;
> +
> console_lock();
> printk_kthreads_available = true;
> for_each_console(con)
> diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> index 2e24db4bff19..1acfb4971ec7 100644
> --- a/lib/Kconfig.debug
> +++ b/lib/Kconfig.debug
> @@ -35,6 +35,18 @@ config PRINTK_CALLER
> no option to enable/disable at the kernel command line parameter or
> sysfs interface.
>
> +config PRINTK_DIRECT
> + bool "Attempt to flush printk output immediately"
> + depends on PRINTK
> + help
> + Rather than using kthreads for printk output, always attempt to write
> + to the console immediately. This has performance implications, but
> + will result in a more faithful ordering and interleaving with other
> + processes writing to the console.
> +
> + Say N here unless you really need this. This may also be controlled
> + at boot time with printk.direct=0/1.
> +
> config STACKTRACE_BUILD_ID
> bool "Show build ID information in stacktraces"
> depends on PRINTK
> --
> 2.35.1
>
From reading the other thread with Linus about console driver locks, it
sounds like there's a good chance threaded printk will be reverted
wholesale, which would of course fix this regression here. But before
that happens, could you make sure this v3 gets applied, so that whenever
threaded printk is reintroduced, I don't run into the same problem and
repeat all the same motions to draw this up again?
Thanks,
Jason
On 2022-06-20, Sergey Senozhatsky <[email protected]> wrote:
> would it make sense/be possible/etc. to look into possibility
> to address that "deferred kthread vs IRQ" race by moving prb entries
> emit to the same IRQ handler that emits chars from uart xmit buffer?
I think this is an interesting idea. We could have a function that would
allow serial drivers (if they had a console on them) to flush any
pending printk messages for their console.
Each driver would need to call this function for themselves. And also
the driver would need to ensure that it is not holding any locks that
the console-part of the driver will need. And it would need to be best
effort, because the related kthread may already be in the process of
printing. But in general I think it could be a nice feature to improve
the sort-of-synchronous behavior of printk for drivers that use it.
John Ogness
On Tue, Jun 21, 2022 at 11:59:31AM +0200, Jason A. Donenfeld wrote:
> Hey John, Petr, Sergey,
>
> On Mon, Jun 20, 2022 at 01:33:02AM +0200, Jason A. Donenfeld wrote:
> > In 5.19, there are some changes in printk message ordering /
> > interleaving which leads to confusion. The most obvious (and benign)
> > example appears on system boot, in which the "Run /init as init process"
> > message gets intermixed with the messages that init actually writes() to
> > stdout. For example, here's a snippet from build.wireguard.com:
> >
> > [ 0.469732] Freeing unused kernel image (initmem) memory: 4576K
> > [ 0.469738] Write protecting the kernel read-only data: 10240k
> > [ 0.473823] Freeing unused kernel image (text/rodata gap) memory: 2044K
> > [ 0.475228] Freeing unused kernel image (rodata/data gap) memory: 1136K
> > [ 0.475236] Run /init as init process
> >
> > WireGuard Test Suite on Linux 5.19.0-rc2+ x86_64
> >
> > [+] Mounting filesystems...
> > [+] Module self-tests:
> > * allowedips self-tests: pass
> > * nonce counter self-tests: pass
> > * ratelimiter self-tests: pass
> > [+] Enabling logging...
> > [+] Launching tests...
> > [ 0.475237] with arguments:
> > [ 0.475238] /init
> > [ 0.475238] with environment:
> > [ 0.475239] HOME=/
> > [ 0.475240] TERM=linux
> > [+] ip netns add wg-test-46-0
> > [+] ip netns add wg-test-46-1
> >
> > Before the "with arguments:" and such would print prior to the
> > "wireguard test suite on linux 5.19" banner. Now it shows after.
> >
> > I see the same thing with "Freeing unused kernel image (text/rodata gap)
> > memory" printing interwoven into the console of my initramfs on my
> > laptop. And so forth.
> >
> > But the bigger issue for me is that it makes it very confusing to
> > interpret CI results later on. Prior, I would nice a nice correlation
> > of:
> >
> > [+] some userspace command
> > [ 1.2345 ] some kernel log output
> > [+] some userspace command
> > [ 1.2346 ] some kernel log output
> > [+] some userspace command
> > [ 1.2347 ] some kernel log output
> >
> > Now, the kernel log outputs are all over the place and out of order with
> > the sequence of commands. This makes debugging issues somewhat tricky,
> > because post hoc ergo propter hoc winds up being a good intuition to
> > follow when tracking down bugs, and now the post hoc part is muddled.
> >
> > This is caused by threaded printk. In order to restore this in debugging
> > sessions and in CI, this commit adds the ability to always use direct
> > printk, either set by default at compile time, or overridden with a
> > runtime command line switch.
> >
> > Cc: John Ogness <[email protected]>
> > Cc: Petr Mladek <[email protected]>
> > Cc: Marco Elver <[email protected]>
> > Fixes: 09c5ba0aa2fc ("printk: add kthread console printers")
> > Signed-off-by: Jason A. Donenfeld <[email protected]>
> > ---
> > Documentation/admin-guide/kernel-parameters.txt | 7 +++++++
> > kernel/printk/printk.c | 11 +++++++++++
> > lib/Kconfig.debug | 12 ++++++++++++
> > 3 files changed, 30 insertions(+)
> >
> > diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
> > index 2522b11e593f..04cec66802d1 100644
> > --- a/Documentation/admin-guide/kernel-parameters.txt
> > +++ b/Documentation/admin-guide/kernel-parameters.txt
> > @@ -4424,6 +4424,13 @@
> > printk.time= Show timing data prefixed to each printk message line
> > Format: <bool> (1/Y/y=enable, 0/N/n=disable)
> >
> > + printk.direct= Rather than using kthreads for printk output, always
> > + attempt to write to the console immediately. This has
> > + performance implications, but will result in a more
> > + faithful ordering and interleaving with other
> > + processes writing to the console.
> > + Format: <bool> (1/Y/y=enable, 0/N/n=disable)
> > +
> > processor.max_cstate= [HW,ACPI]
> > Limit processor to maximum C-state
> > max_cstate=9 overrides any DMI blacklist limit.
> > diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
> > index b095fb5f5f61..b7f8f2904f2c 100644
> > --- a/kernel/printk/printk.c
> > +++ b/kernel/printk/printk.c
> > @@ -178,6 +178,14 @@ static int __init control_devkmsg(char *str)
> > }
> > __setup("printk.devkmsg=", control_devkmsg);
> >
> > +static bool printk_direct_only __initdata = IS_ENABLED(CONFIG_PRINTK_DIRECT);
> > +
> > +static int __init control_printk_direct_only(char *str)
> > +{
> > + return kstrtobool(str, &printk_direct_only);
> > +}
> > +__setup("printk.direct=", control_printk_direct_only);
> > +
> > char devkmsg_log_str[DEVKMSG_STR_MAX_SIZE] = "ratelimit";
> > #if defined(CONFIG_PRINTK) && defined(CONFIG_SYSCTL)
> > int devkmsg_sysctl_set_loglvl(struct ctl_table *table, int write,
> > @@ -3605,6 +3613,9 @@ static int __init printk_activate_kthreads(void)
> > {
> > struct console *con;
> >
> > + if (printk_direct_only)
> > + return 0;
> > +
> > console_lock();
> > printk_kthreads_available = true;
> > for_each_console(con)
> > diff --git a/lib/Kconfig.debug b/lib/Kconfig.debug
> > index 2e24db4bff19..1acfb4971ec7 100644
> > --- a/lib/Kconfig.debug
> > +++ b/lib/Kconfig.debug
> > @@ -35,6 +35,18 @@ config PRINTK_CALLER
> > no option to enable/disable at the kernel command line parameter or
> > sysfs interface.
> >
> > +config PRINTK_DIRECT
> > + bool "Attempt to flush printk output immediately"
> > + depends on PRINTK
> > + help
> > + Rather than using kthreads for printk output, always attempt to write
> > + to the console immediately. This has performance implications, but
> > + will result in a more faithful ordering and interleaving with other
> > + processes writing to the console.
> > +
> > + Say N here unless you really need this. This may also be controlled
> > + at boot time with printk.direct=0/1.
> > +
> > config STACKTRACE_BUILD_ID
> > bool "Show build ID information in stacktraces"
> > depends on PRINTK
> > --
> > 2.35.1
> >
>
> From reading the other thread with Linus about console driver locks, it
> sounds like there's a good chance threaded printk will be reverted
> wholesale, which would of course fix this regression here. But before
> that happens, could you make sure this v3 gets applied, so that whenever
> threaded printk is reintroduced, I don't run into the same problem and
> repeat all the same motions to draw this up again?
>
> Thanks,
> Jason
Another thing I'm finding is that without this patch, printing various
messages from userspace, and then calling reboot() results in those
messages never actually making it to the console, making my test suite
panic() handler kind of useless.
This patch presumably fixes it, since it becomes direct. But maybe if
people don't want direct mode, but still don't want messages to get lost
on reboot, you might consider some kind of flushing?
Jason