2020-03-15 17:10:18

by Eugeniu Rosca

[permalink] [raw]
Subject: [RFC PATCH 0/3] Fix quiet console in pre-panic scenarios

Dear community,

The motivation behind this seris is to save days/weeks, if not months,
of debugging efforts for users who:

* experience an issue like softlockup/hardlockup/hung task/oom, whose
reproduction is not clear and whose occurrence rate is very low
* are constrained to use a low loglevel value (1,2,3) in production
* mostly rely on console logs to debug the issue post-mortem
(e.g. saved to persistent storage via e.g. pstore)

As pointed out in the last patch from this series, under the above
circumstances, users might simply lack any relevant logs during
post-mortem analysis.

Why this series is marked as RFC is because:
* There are several possible approaches to turn console verbosity on
and off. Current series employs the 'ignore_loglevel' functionality,
but an alternative way is to use the 'console_loglevel' variable. The
latter is more intrusive, hence the former has been chosen as v1.
* Manipulating 'ignore_loglevel' might be seen as an abuse, especially
because it breaks the expectation of users who assume the system to
be dead silent after passing loglevel=0 on kernel command line.

Thank you for your comments!

Eugeniu Rosca (3):
printk: convert ignore_loglevel to atomic_t
printk: add console_verbose_{start,end}
watchdog: Turn console verbosity on when reporting softlockup

include/linux/printk.h | 10 ++++++++++
kernel/printk/printk.c | 30 ++++++++++++++++++++++++++----
kernel/watchdog.c | 4 ++++
3 files changed, 40 insertions(+), 4 deletions(-)

--
2.25.0


2020-03-15 17:10:35

by Eugeniu Rosca

[permalink] [raw]
Subject: [RFC PATCH 1/3] printk: convert ignore_loglevel to atomic_t

Prepare for using the 'ignore_loglevel' variable as a lockless and SMP
safe vehicle to turn console verbosity on and off in error paths which
usually (but not necessarily) end up with panic (configurable via
user-selectable knobs).

No functional change intended. Tested on R-Car H3ULCB.

Cc: Petr Mladek <[email protected]>
Cc: Sergey Senozhatsky <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ingo Molnar <[email protected]>
Signed-off-by: Eugeniu Rosca <[email protected]>
---
kernel/printk/printk.c | 30 ++++++++++++++++++++++++++----
1 file changed, 26 insertions(+), 4 deletions(-)

diff --git a/kernel/printk/printk.c b/kernel/printk/printk.c
index fada22dc4ab6..d2c75955a0d7 100644
--- a/kernel/printk/printk.c
+++ b/kernel/printk/printk.c
@@ -1181,24 +1181,46 @@ void __init setup_log_buf(int early)
free, (free * 100) / __LOG_BUF_LEN);
}

-static bool __read_mostly ignore_loglevel;
+atomic_t __read_mostly ignore_loglevel = ATOMIC_INIT(0);

static int __init ignore_loglevel_setup(char *str)
{
- ignore_loglevel = true;
+ atomic_set(&ignore_loglevel, 1);
pr_info("debug: ignoring loglevel setting.\n");

return 0;
}

+static int kparam_set_atomic(const char *val, const struct kernel_param *kp)
+{
+ if (param_set_bool(val, kp))
+ return -EINVAL;
+
+ atomic_set(&ignore_loglevel, *(bool *)kp->arg ? 1 : 0);
+
+ return 0;
+}
+
+static int kparam_get_atomic(char *val, const struct kernel_param *kp)
+{
+ return sprintf(val, "%c\n", atomic_read(&ignore_loglevel) ? 'Y' : 'N');
+}
+
+static const struct kernel_param_ops kparam_ops = {
+ .set = kparam_set_atomic,
+ .get = kparam_get_atomic,
+};
+
+static int kparam_buf;
+
early_param("ignore_loglevel", ignore_loglevel_setup);
-module_param(ignore_loglevel, bool, S_IRUGO | S_IWUSR);
+module_param_cb(ignore_loglevel, &kparam_ops, &kparam_buf, 0644);
MODULE_PARM_DESC(ignore_loglevel,
"ignore loglevel setting (prints all kernel messages to the console)");

static bool suppress_message_printing(int level)
{
- return (level >= console_loglevel && !ignore_loglevel);
+ return (level >= console_loglevel && !atomic_read(&ignore_loglevel));
}

#ifdef CONFIG_BOOT_PRINTK_DELAY
--
2.25.0

2020-03-15 17:11:33

by Eugeniu Rosca

[permalink] [raw]
Subject: [RFC PATCH 3/3] watchdog: Turn console verbosity on when reporting softlockup

As pointed out in v5.0 commit 168e06f7937d96 ("kernel/hung_task.c:
force console verbose before panic"), users may miss crucial information
in pre-panic scenarios, depending on the loglevel value set by the
distribion/platform maintainers.

For example, assuming loglevel=3 (i.e. print EMERG/ALERT/CRIT only),
below is the output of a softlockup got with pure vanilla kernel [1].
In contrast, with current patch applied, output [2] is far more helpful.

Save potentially days/weeks/months of debugging efforts for those who:
* deal with a softlockup whose reproduction scenario is not clear
* are constrainted to use a low loglevel value in production
* mostly/exclusively rely on logs like [2] post-mortem

[1] Softlockup output [loglevel=3, pure v5.6-rc5, H3ULCB]
watchdog: BUG: soft lockup - CPU#3 stuck for 22s! [modprobe:2354]

[2] Softlockup output [loglevel=3, v5.6-rc5 + this patch, H3ULCB]
[ 292.259324] watchdog: BUG: soft lockup - CPU#3 stuck for 23s! [modprobe:2437]
[ 292.266834] Modules linked in: <-- cut -->
[ 292.302880] irq event stamp: 519660
[ 292.306555] hardirqs last enabled at (519659): [<ffffa00010082a18>] el1_irq+0xd8/0x180
[ 292.314909] hardirqs last disabled at (519660): [<ffffa000100829bc>] el1_irq+0x7c/0x180
[ 292.323274] softirqs last enabled at (519658): [<ffffa00010080d0c>] __do_softirq+0x2e4/0x7c4
[ 292.332179] softirqs last disabled at (519651): [<ffffa000100c3e48>] irq_exit+0x150/0x1e0
[ 292.340720] CPU: 3 PID: 2437 Comm: modprobe Not tainted 5.6.0-rc5+ #96
[ 292.347538] Hardware name: Renesas H3ULCB Kingfisher board based on r8a77951 (DT)
[ 292.355355] pstate: 20000005 (nzCv daif -PAN -UAO)
[ 292.360435] pc : lkdtm_SOFTLOCKUP+0x18/0x20 [lkdtm]
[ 292.365592] lr : lkdtm_SOFTLOCKUP+0x18/0x20 [lkdtm]
[ 292.370699] sp : ffff0006cd9af780
[ 292.374187] x29: ffff0006cd9af780 x28: ffffa0000907d000
[ 292.379757] x27: ffff0006cbaebc10 x26: ffffa000090751c0
[ 292.385327] x25: ffffa0000907dfc0 x24: 0000000000000009
[ 292.390896] x23: ffffa0000907d478 x22: 0000000000000000
[ 292.396464] x21: ffffa0000907d340 x20: ffffa0000907d340
[ 292.402034] x19: ffffa000090751c0 x18: 0000000000000000
[ 292.407603] x17: 0000000000000000 x16: 0000000000000000
[ 292.413171] x15: 0000000000000000 x14: ffffa00000000003
[ 292.418740] x13: ffffa00010dc0180 x12: 0000000000000000
[ 292.424309] x11: ffff8000bfc2401d x10: 1fffe000bfc2401d
[ 292.429877] x9 : dfffa00000000000 x8 : ffffa00010082c00
[ 292.435445] x7 : ffff0005fe1200ef x6 : 0000000000000000
[ 292.441012] x5 : dfffa00000000000 x4 : ffff0006c82e9c50
[ 292.446580] x3 : ffffa0001024b848 x2 : 0000000000000001
[ 292.452147] x1 : 0000000000000000 x0 : 0000000000000000
[ 292.457713] Call trace:
[ 292.460351] lkdtm_SOFTLOCKUP+0x18/0x20 [lkdtm]
[ 292.465147] lkdtm_do_action+0x40/0x44 [lkdtm]
[ 292.469853] lkdtm_register_cpoint+0x44/0xbc [lkdtm]
[ 292.475099] lkdtm_module_init+0x220/0x27c [lkdtm]
[ 292.480120] do_one_initcall+0x1c8/0x40c
[ 292.484244] do_init_module+0xb4/0x330
[ 292.488185] load_module+0x3404/0x34c8
[ 292.492126] __do_sys_finit_module+0x11c/0x13c
[ 292.496787] __arm64_sys_finit_module+0x4c/0x5c
[ 292.501539] el0_svc_common.constprop.0+0x160/0x200
[ 292.506647] do_el0_svc+0x74/0xa4
[ 292.510139] el0_sync_handler+0xc4/0x104
[ 292.514257] el0_sync+0x140/0x180

Cc: Thomas Gleixner <[email protected]>
Cc: Peter Zijlstra <[email protected]>
Cc: Ingo Molnar <[email protected]>
Cc: Petr Mladek <[email protected]>
Cc: Jisheng Zhang <[email protected]>
Cc: Valdis Kletnieks <[email protected]>
Cc: Sebastian Andrzej Siewior <[email protected]>
Signed-off-by: Eugeniu Rosca <[email protected]>
---
kernel/watchdog.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/kernel/watchdog.c b/kernel/watchdog.c
index b6b1f54a7837..cbe7e2c4c5fb 100644
--- a/kernel/watchdog.c
+++ b/kernel/watchdog.c
@@ -428,6 +428,8 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
}
}

+ console_verbose_start();
+
pr_emerg("BUG: soft lockup - CPU#%d stuck for %us! [%s:%d]\n",
smp_processor_id(), duration,
current->comm, task_pid_nr(current));
@@ -453,6 +455,8 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
if (softlockup_panic)
panic("softlockup: hung tasks");
__this_cpu_write(soft_watchdog_warn, true);
+
+ console_verbose_end();
} else
__this_cpu_write(soft_watchdog_warn, false);

--
2.25.0

2020-03-15 17:11:51

by Eugeniu Rosca

[permalink] [raw]
Subject: [RFC PATCH 2/3] printk: add console_verbose_{start,end}

Consider below example scenarios:
- soft lockup
- hard lockup
- hung task
- scheduling while atomic
- rcu stall
- oops
- oom
- WARN

In most of the above situations, it is up to the user to terminate the
execution with a panic (see panic_on_* in kernel/sysctl.c) or to
decide not to abort and try to recover.

A general concern applicable to the above use-cases is that, depending
on the console loglevel set by the user, precious information conveyed
by show_regs(), print_modules(), dump_stack() and friends may simply not
appear on the console. Below example commits tackle this exact concern
in the panic paths:

* commit 168e06f7937d96 ("kernel/hung_task.c: force console verbose before panic")
* commit 5b530fc1832460 ("panic: call console_verbose() in panic")

The approach behind the above commits is straightforward. Whenever
panic is imminent, they simply call console_verbose(). Unfortunately,
the same technique does not apply to non-panic paths. It requires a
counterpart of console_verbose(), which currently does not exist.

With that in mind, create a pair of functions named
console_verbose_start() and console_verbose_end() which turn the
console verbosity on and off in a lockless and SMP safe way.

Cc: Petr Mladek <[email protected]>
Cc: Sergey Senozhatsky <[email protected]>
Cc: Steven Rostedt <[email protected]>
Cc: Ingo Molnar <[email protected]>
Signed-off-by: Eugeniu Rosca <[email protected]>
---
include/linux/printk.h | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/include/linux/printk.h b/include/linux/printk.h
index 1e6108b8d15f..14755ef7b017 100644
--- a/include/linux/printk.h
+++ b/include/linux/printk.h
@@ -3,6 +3,7 @@
#define __KERNEL_PRINTK__

#include <stdarg.h>
+#include <linux/atomic.h>
#include <linux/init.h>
#include <linux/kern_levels.h>
#include <linux/linkage.h>
@@ -77,6 +78,15 @@ static inline void console_verbose(void)
console_loglevel = CONSOLE_LOGLEVEL_MOTORMOUTH;
}

+#ifdef CONFIG_PRINTK
+extern atomic_t ignore_loglevel;
+static inline void console_verbose_start(void) { atomic_inc(&ignore_loglevel); }
+static inline void console_verbose_end(void) { atomic_dec(&ignore_loglevel); }
+#else
+static inline void console_verbose_start(void) { }
+static inline void console_verbose_end(void) { }
+#endif
+
/* strlen("ratelimit") + 1 */
#define DEVKMSG_STR_MAX_SIZE 10
extern char devkmsg_log_str[];
--
2.25.0

2020-03-16 18:37:27

by Steven Rostedt

[permalink] [raw]
Subject: Re: [RFC PATCH 0/3] Fix quiet console in pre-panic scenarios

On Sun, 15 Mar 2020 18:09:00 +0100
Eugeniu Rosca <[email protected]> wrote:

> Dear community,
>
> The motivation behind this seris is to save days/weeks, if not months,
> of debugging efforts for users who:
>
> * experience an issue like softlockup/hardlockup/hung task/oom, whose
> reproduction is not clear and whose occurrence rate is very low
> * are constrained to use a low loglevel value (1,2,3) in production
> * mostly rely on console logs to debug the issue post-mortem
> (e.g. saved to persistent storage via e.g. pstore)
>
> As pointed out in the last patch from this series, under the above
> circumstances, users might simply lack any relevant logs during
> post-mortem analysis.
>
> Why this series is marked as RFC is because:
> * There are several possible approaches to turn console verbosity on
> and off. Current series employs the 'ignore_loglevel' functionality,
> but an alternative way is to use the 'console_loglevel' variable. The
> latter is more intrusive, hence the former has been chosen as v1.
> * Manipulating 'ignore_loglevel' might be seen as an abuse, especially
> because it breaks the expectation of users who assume the system to
> be dead silent after passing loglevel=0 on kernel command line.
>
> Thank you for your comments!

I don't see any issues with this patch set. What do others think?

Acked-by: Steven Rostedt (VMware) <[email protected]>

[ Note, I only acked, and did not give a deep review of it ]

-- Steve

>
> Eugeniu Rosca (3):
> printk: convert ignore_loglevel to atomic_t
> printk: add console_verbose_{start,end}
> watchdog: Turn console verbosity on when reporting softlockup
>
> include/linux/printk.h | 10 ++++++++++
> kernel/printk/printk.c | 30 ++++++++++++++++++++++++++----
> kernel/watchdog.c | 4 ++++
> 3 files changed, 40 insertions(+), 4 deletions(-)
>

Subject: Re: [RFC PATCH 0/3] Fix quiet console in pre-panic scenarios

On 2020-03-16 14:35:17 [-0400], Steven Rostedt wrote:
> I don't see any issues with this patch set. What do others think?
>
> Acked-by: Steven Rostedt (VMware) <[email protected]>
>
> [ Note, I only acked, and did not give a deep review of it ]

What is the state of the other larger printk rework? If this does not
solve any -stable related issues then it will be replaced?

Sebastian

2020-03-17 03:27:49

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [RFC PATCH 3/3] watchdog: Turn console verbosity on when reporting softlockup

On (20/03/15 18:09), Eugeniu Rosca wrote:

[..]

> @@ -428,6 +428,8 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
> }
> }
>
> + console_verbose_start();
> +
> pr_emerg("BUG: soft lockup - CPU#%d stuck for %us! [%s:%d]\n",
> smp_processor_id(), duration,
> current->comm, task_pid_nr(current));
> @@ -453,6 +455,8 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
> if (softlockup_panic)
> panic("softlockup: hung tasks");
> __this_cpu_write(soft_watchdog_warn, true);
> +
> + console_verbose_end();
> } else
> __this_cpu_write(soft_watchdog_warn, false);


I'm afraid, as of now, this approach is not going to work the way it's
supposed to work in 100% of cases. Because the only thing that printk()
call sort of guarantees is that the message will be stored somewhere.
Either in the main kernel log buffer, on in one of auxiliary per-CPU
log buffers. It does not guarantee, generally speaking, that the message
will be printed on the console immediately.

Consider the following example:

CPU0 CPU1
console_lock();
schedule();

watchdog()
console_verbose_start();
printk()
log_store()
if (!console_trylock())
return;
console_verbose_end();

...
console_unlock()
print logbuf messages to the consoles
we missed the console_verbose_start/end
on CPU1



IIRC, we had a similar approach in the past. See commit 375899cddcbb26
("printk: make sure to print log on console"). And we reverted it, see
a6ae928c25835 ("Revert "printk: make sure to print log on console.").

-ss

2020-03-18 14:57:54

by Eugeniu Rosca

[permalink] [raw]
Subject: Re: [RFC PATCH 0/3] Fix quiet console in pre-panic scenarios

Hi Sebastian,
Cc: John Ogness

On Mon, Mar 16, 2020 at 08:09:48PM +0100, Sebastian Andrzej Siewior wrote:
> On 2020-03-16 14:35:17 [-0400], Steven Rostedt wrote:
> > I don't see any issues with this patch set. What do others think?
> >
> > Acked-by: Steven Rostedt (VMware) <[email protected]>
> >
> > [ Note, I only acked, and did not give a deep review of it ]
>
> What is the state of the other larger printk rework? If this does not
> solve any -stable related issues then it will be replaced?

Is this a question to John and his most recent series in
https://lore.kernel.org/lkml/[email protected]/
?

Is there any upstream agreement to already keep the current printk
mechanism away from any updates?

--
Best Regards
Eugeniu Rosca

2020-03-18 15:42:23

by John Ogness

[permalink] [raw]
Subject: Re: [RFC PATCH 0/3] Fix quiet console in pre-panic scenarios

On 2020-03-18, Eugeniu Rosca <[email protected]> wrote:
> On Mon, Mar 16, 2020 at 08:09:48PM +0100, Sebastian Andrzej Siewior wrote:
>> On 2020-03-16 14:35:17 [-0400], Steven Rostedt wrote:
>> > I don't see any issues with this patch set. What do others think?
>> >
>> > Acked-by: Steven Rostedt (VMware) <[email protected]>
>> >
>> > [ Note, I only acked, and did not give a deep review of it ]
>>
>> What is the state of the other larger printk rework? If this does not
>> solve any -stable related issues then it will be replaced?
>
> Is this a question to John and his most recent series in
> https://lore.kernel.org/lkml/[email protected]/
> ?
>
> Is there any upstream agreement to already keep the current printk
> mechanism away from any updates?

No. Fixes are welcome!

The only thing we are trying to avoid at this stage is massive
refactoring/cleanup work. Your series does not fit into that category.

As to my opinion on this series, assuming it is acceptable for the
maintainers, I would like to see console_verbose() become an alias to
console_verbose_start(). Then @ignore_loglevel would be used for both
and CONSOLE_LOGLEVEL_MOTORMOUTH could be removed.

John Ogness

2020-03-18 16:02:30

by John Ogness

[permalink] [raw]
Subject: Re: [RFC PATCH 2/3] printk: add console_verbose_{start,end}

On 2020-03-15, Eugeniu Rosca <[email protected]> wrote:
> diff --git a/include/linux/printk.h b/include/linux/printk.h
> index 1e6108b8d15f..14755ef7b017 100644
> --- a/include/linux/printk.h
> +++ b/include/linux/printk.h
> @@ -3,6 +3,7 @@
> #define __KERNEL_PRINTK__
>
> #include <stdarg.h>
> +#include <linux/atomic.h>
> #include <linux/init.h>
> #include <linux/kern_levels.h>
> #include <linux/linkage.h>
> @@ -77,6 +78,15 @@ static inline void console_verbose(void)
> console_loglevel = CONSOLE_LOGLEVEL_MOTORMOUTH;
> }
>
> +#ifdef CONFIG_PRINTK
> +extern atomic_t ignore_loglevel;
> +static inline void console_verbose_start(void) { atomic_inc(&ignore_loglevel); }
> +static inline void console_verbose_end(void) { atomic_dec(&ignore_loglevel); }

To be compatible with console_verbose() semantics, this should also
respect console_loglevel=0. Perhaps by updating
suppress_message_printing() to be something like:

static bool suppress_message_printing(int level)
{
return (!console_loglevel ||
(level >= console_loglevel && !ignore_loglevel));
}

John Ogness

2020-03-18 18:06:34

by Eugeniu Rosca

[permalink] [raw]
Subject: Re: [RFC PATCH 3/3] watchdog: Turn console verbosity on when reporting softlockup

Hi Sergey,

Many thanks for your feedback!

On Tue, Mar 17, 2020 at 11:18:18AM +0900, Sergey Senozhatsky wrote:
> On (20/03/15 18:09), Eugeniu Rosca wrote:
>
> [..]
>
> > @@ -428,6 +428,8 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
> > }
> > }
> >
> > + console_verbose_start();
> > +
> > pr_emerg("BUG: soft lockup - CPU#%d stuck for %us! [%s:%d]\n",
> > smp_processor_id(), duration,
> > current->comm, task_pid_nr(current));
> > @@ -453,6 +455,8 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
> > if (softlockup_panic)
> > panic("softlockup: hung tasks");
> > __this_cpu_write(soft_watchdog_warn, true);
> > +
> > + console_verbose_end();
> > } else
> > __this_cpu_write(soft_watchdog_warn, false);
>
>
> I'm afraid, as of now, this approach is not going to work the way it's
> supposed to work in 100% of cases. Because the only thing that printk()
> call sort of guarantees is that the message will be stored somewhere.
> Either in the main kernel log buffer, on in one of auxiliary per-CPU
> log buffers. It does not guarantee, generally speaking, that the message
> will be printed on the console immediately.

I take this passage as an acknowledgement of the problem being _real_,
in spite of the fix being not perfect.

One aspect I would like to emphasize is that (please, NAK this
statement if it's not accurate) the problem reported in this patch is
not specific to the existing printk mechanism, but also applies to the
upcoming kthread-based printk. If that's true, then IMHO this is a
compelling argument to join forces and try to find a working, safe and
future-proof solution.

>
> Consider the following example:
>
> CPU0 CPU1
> console_lock();
> schedule();
>
> watchdog()
> console_verbose_start();
> printk()
> log_store()
> if (!console_trylock())
> return;
> console_verbose_end();
>
> ...
> console_unlock()
> print logbuf messages to the consoles
> we missed the console_verbose_start/end
> on CPU1

This looks plausible. However, I wonder to which degree the same
scenario is a concern in the kthread-based approach?

My current standpoint is that as long as points [A-D] are met, it
should do no harm to accept a (partial) fix like seen in my series:

- [A] the patch tackles at least a subset of problematic use-cases
- [B] the fix is non-intrusive and easy to review
- [C] there is hope to reuse it in the new lockless buffer based printk
- [D] there are no regressions employing the major console knobs
(ignore_loglevel, quiet, loglevel, etc) as it happened in
a6ae928c25835c ("Revert "printk: make sure to print log on console."")

From the above points, my only major concern is that current series
breaks the expectations of users who pass loglevel=0 on kernel
command line and expect the system to be totally silent. This has
already been expressed in the cover letter. I would especially
appreciate if the same view is shared (or invalidated) by others.

>
> IIRC, we had a similar approach in the past. See commit 375899cddcbb26
> ("printk: make sure to print log on console"). And we reverted it, see
> a6ae928c25835 ("Revert "printk: make sure to print log on console.").

Thanks for this reference. It looks to me that in spite of being
relatively compact, commit 375899cddcbb26 ("printk: make sure to print
log on console.") broke criteria [D] listed above. I intend to avoid
it by testing multiple console knob values on my arm64 system.

Looking forward to your feedback on the questions posted above. TIA!

--
Best Regards
Eugeniu Rosca

2020-03-19 06:49:49

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [RFC PATCH 3/3] watchdog: Turn console verbosity on when reporting softlockup

Hi,

On (20/03/18 19:05), Eugeniu Rosca wrote:
[..]
> > I'm afraid, as of now, this approach is not going to work the way it's
> > supposed to work in 100% of cases. Because the only thing that printk()
> > call sort of guarantees is that the message will be stored somewhere.
> > Either in the main kernel log buffer, on in one of auxiliary per-CPU
> > log buffers. It does not guarantee, generally speaking, that the message
> > will be printed on the console immediately.
>
> I take this passage as an acknowledgement of the problem being _real_,
> in spite of the fix being not perfect.
>
> One aspect I would like to emphasize is that (please, NAK this
> statement if it's not accurate) the problem reported in this patch is
> not specific to the existing printk mechanism, but also applies to the
> upcoming kthread-based printk. If that's true, then IMHO this is a
> compelling argument to join forces and try to find a working, safe and
> future-proof solution.

Yes, this is a "known issue".

> > Consider the following example:
> >
> > CPU0 CPU1
> > console_lock();
> > schedule();
> >
> > watchdog()
> > console_verbose_start();
> > printk()
> > log_store()
> > if (!console_trylock())
> > return;
> > console_verbose_end();
> >
> > ...
> > console_unlock()
> > print logbuf messages to the consoles
> > we missed the console_verbose_start/end
> > on CPU1
>
> This looks plausible. However, I wonder to which degree the same
> scenario is a concern in the kthread-based approach?

This is a problem for any async/deferred printk() printout. When we
print messages on the consoles we don't have any idea anymore if the
CPU which added those messages to the logbuf had any issues going on.

> My current standpoint is that as long as points [A-D] are met, it
> should do no harm to accept a (partial) fix like seen in my series:
>
> - [A] the patch tackles at least a subset of problematic use-cases
> - [B] the fix is non-intrusive and easy to review
> - [C] there is hope to reuse it in the new lockless buffer based printk
> - [D] there are no regressions employing the major console knobs
> (ignore_loglevel, quiet, loglevel, etc) as it happened in
> a6ae928c25835c ("Revert "printk: make sure to print log on console."")

So the issue is that when we bump `console_loglevel' or `ignore_loglevel'
we lift restrictions globally. For all CPUs, for all contexts which call
printk(). This may have negative impact. Fuzzing is one example. It
usually generates a lot of printk() noise, so lifting global printk()
log_level restrictions does cause problems. I recall that fuzzing people
really disliked the
old_level = console_loglevel
console_loglevel = new_level
....
console_loglevel = old_level

approach. Because if lets all CPUs and tasks to pollute serial logs.
While what we want is to print messages from a particular CPU/task.

So _maybe_ we can explore the per-context printk() flags instead. The idea
is to store a "desired"/emergency ->log_level. When a context sets that
->log_level we need to make any printk() messages issued by that context
at least of that ->log_level. IOW we overwrite messages log_levels, because
printk() is not synchronous by design.

We had similar discussion before [0].
I was misunderstood back then and in order to avoid the same
misunderstanding this time around, let me first state that - *I DO NOT*
propose to disable preemption for printk(). The code in question
was *just a POC snippet*.

To move that POC further - we can have the same per-context printk()
fields in every task_struct struct. Ideally without consuming additional
memory, or consuming as few extra bytes per task_struct as possible.
But even if we will have ->log_level embedded into every task_struct, we
still need to distinguish various cases (contexts):

- when current task wants to declare printk() emergency and sets
current->log_level, only this process is in printk() emergency.
- IRQs and NMIs which interrupt current should not be in printk()
emergency, unless IRQ and/or NMI explicitly declare printk()
emergency.

- when IRQ declares printk() emergency and sets current->log_level, only
this IRQ is in printk() emergency

- when NMI declares printk() emergency and sets current->log_level, only
this NMI is in printk() emergency

Another thing is. Suppose current sets ->log_level, and then encounters
a page_fault, which printk-s from exception handler. We shall distinguish
such cases too, I think.

So something like

printk_emergency_enter(LOG_LEVEL)
current->context[ctx] = LOG_LEVEL;
...
printk()
// overwrite message log_level to current->context[ctx] level
dump_stack()
// same log_level overwrite
...
printk_emergency_exit()
current->context[ctx] = -1

So when we iterate pending logbuf messages (CPU0->console_unlock() in
the example from my first email) messages from CPU1 will have appropriate
"desired" log_level and we will print those messages on the consoles.

Well. Something like this.


[0] https://lore.kernel.org/linuxppc-dev/[email protected]/

-ss

2020-03-19 07:39:14

by Valdis Klētnieks

[permalink] [raw]
Subject: Re: [RFC PATCH 3/3] watchdog: Turn console verbosity on when reporting softlockup

On Thu, 19 Mar 2020 15:48:36 +0900, Sergey Senozhatsky said:

> So the issue is that when we bump `console_loglevel' or `ignore_loglevel'
> we lift restrictions globally. For all CPUs, for all contexts which call
> printk(). This may have negative impact. Fuzzing is one example. It
> usually generates a lot of printk() noise, so lifting global printk()
> log_level restrictions does cause problems. I recall that fuzzing people
> really disliked the
> old_level = console_loglevel
> console_loglevel = new_level
> ....
> console_loglevel = old_level
>
> approach. Because if lets all CPUs and tasks to pollute serial logs.
> While what we want is to print messages from a particular CPU/task.

Well... how does this sound for a RFC idea? We already have CONFIG_PRINTK_TIME
and CONFIG_PRINTK_CALLER. How about adding an option to allow printing the
calling CPU as well, or just extend CALLER to print [pid/cpu] rather than just
[pid]?

(And yes, I know that the concept of "this CPU" can change quickly, which is
why per_cpu data may not be accurate for anything more than statistical
aggregation purposes. It's only worth pondering how to deal with that if it's
worth developing the patch at all... :)


Attachments:
(No filename) (849.00 B)

2020-03-19 08:03:56

by Sergey Senozhatsky

[permalink] [raw]
Subject: Re: [RFC PATCH 3/3] watchdog: Turn console verbosity on when reporting softlockup

On (20/03/19 03:38), Valdis Klētnieks wrote:
> On Thu, 19 Mar 2020 15:48:36 +0900, Sergey Senozhatsky said:
>
> > So the issue is that when we bump `console_loglevel' or `ignore_loglevel'
> > we lift restrictions globally. For all CPUs, for all contexts which call
> > printk(). This may have negative impact. Fuzzing is one example. It
> > usually generates a lot of printk() noise, so lifting global printk()
> > log_level restrictions does cause problems. I recall that fuzzing people
> > really disliked the
> > old_level = console_loglevel
> > console_loglevel = new_level
> > ....
> > console_loglevel = old_level
> >
> > approach. Because if lets all CPUs and tasks to pollute serial logs.
> > While what we want is to print messages from a particular CPU/task.
>
> Well... how does this sound for a RFC idea? We already have CONFIG_PRINTK_TIME
> and CONFIG_PRINTK_CALLER. How about adding an option to allow printing the
> calling CPU as well, or just extend CALLER to print [pid/cpu] rather than just
> [pid]?

IIRC, CONFIG_PRINTK_CALLER prints pid when printk() caller is a
running task, and CPU-id otherwise.

It was added by fuzzing people, because it's almost (or "simply")
impossible to get through serial logs sometimes.

-ss

2020-03-19 08:19:28

by Valdis Klētnieks

[permalink] [raw]
Subject: Re: [RFC PATCH 3/3] watchdog: Turn console verbosity on when reporting softlockup

On Thu, 19 Mar 2020 17:01:56 +0900, Sergey Senozhatsky said:
> IIRC, CONFIG_PRINTK_CALLER prints pid when printk() caller is a
> running task, and CPU-id otherwise.

Ah... so it does.

#ifdef CONFIG_PRINTK_CALLER
u32 caller_id; /* thread id or processor id */
#endif

I haven't seen it output a processor ID yet, so I didn't know that part...


Attachments:
(No filename) (849.00 B)

2020-03-19 08:21:33

by John Ogness

[permalink] [raw]
Subject: Re: [RFC PATCH 3/3] watchdog: Turn console verbosity on when reporting softlockup

On 2020-03-18, Eugeniu Rosca <[email protected]> wrote:
> On Tue, Mar 17, 2020 at 11:18:18AM +0900, Sergey Senozhatsky wrote:
>> On (20/03/15 18:09), Eugeniu Rosca wrote:
>>> @@ -428,6 +428,8 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
>>> }
>>> }
>>>
>>> + console_verbose_start();
>>> +
>>> pr_emerg("BUG: soft lockup - CPU#%d stuck for %us! [%s:%d]\n",
>>> smp_processor_id(), duration,
>>> current->comm, task_pid_nr(current));
>>> @@ -453,6 +455,8 @@ static enum hrtimer_restart watchdog_timer_fn(struct hrtimer *hrtimer)
>>> if (softlockup_panic)
>>> panic("softlockup: hung tasks");
>>> __this_cpu_write(soft_watchdog_warn, true);
>>> +
>>> + console_verbose_end();
>>> } else
>>> __this_cpu_write(soft_watchdog_warn, false);
>>
>> I'm afraid, as of now, this approach is not going to work the way it's
>> supposed to work in 100% of cases. Because the only thing that printk()
>> call sort of guarantees is that the message will be stored somewhere.
>> Either in the main kernel log buffer, on in one of auxiliary per-CPU
>> log buffers. It does not guarantee, generally speaking, that the message
>> will be printed on the console immediately.
>
> I take this passage as an acknowledgement of the problem being _real_,
> in spite of the fix being not perfect.
>
> One aspect I would like to emphasize is that (please, NAK this
> statement if it's not accurate) the problem reported in this patch is
> not specific to the existing printk mechanism, but also applies to the
> upcoming kthread-based printk. If that's true, then IMHO this is a
> compelling argument to join forces and try to find a working, safe and
> future-proof solution.

Let me clarify that the upcoming rework is _not_ simply to make a
kthread-based printk. There upcoming rework addresses several issues
(kthreads being only a piece):

1. allow printk-callers to get their messages immediately and locklessly
into the log buffer from any context

2. during normal operation, printk-callers are completely decoupled from
console printing

3. in the case of an emergency, every printk-caller will directly and
synchronously perform console printing of its messages on consoles
that support atomic writing

For the rework we decided that triggering an oops is what irreversibly
transitions the system from "normal operation" to "emergency".

One could interpret the current "console_verbose()" to be some sort of
equivalent to irreversibly transitioning the system to "emergency". But
that is not how it was decided to be interpreted. For the rework,
printk-callers do not have any influence on forcing immediate console
printing (unless they trigger an oops). console_verbose() is still
relevant in the rework. console_verbose() is affecting _what_ is printed
to consoles and _not when_.

Once the printk rework is complete, the mechanisms for atomic and
synchronous console printing will be in place. It would be possible that
these mechanisms could also be used in non-oops scenarios. But that
would need to be discussed in depth and clearly defined with caution for
abuse.

John Ogness

2020-05-19 14:47:31

by Petr Mladek

[permalink] [raw]
Subject: Re: [RFC PATCH 3/3] watchdog: Turn console verbosity on when reporting softlockup

On Thu 2020-03-19 15:48:36, Sergey Senozhatsky wrote:
> On (20/03/18 19:05), Eugeniu Rosca wrote:
> > My current standpoint is that as long as points [A-D] are met, it
> > should do no harm to accept a (partial) fix like seen in my series:
> >
> > - [A] the patch tackles at least a subset of problematic use-cases
> > - [B] the fix is non-intrusive and easy to review
> > - [C] there is hope to reuse it in the new lockless buffer based printk
> > - [D] there are no regressions employing the major console knobs
> > (ignore_loglevel, quiet, loglevel, etc) as it happened in
> > a6ae928c25835c ("Revert "printk: make sure to print log on console."")
>
> So the issue is that when we bump `console_loglevel' or `ignore_loglevel'
> we lift restrictions globally. For all CPUs, for all contexts which call
> printk(). This may have negative impact.

Another impact is that many more messages might suddenly appear on the
console even though admins wanted them quiet because they were slow.

The problem is to define what information is critical. In the ideal
world, all messages are visible on the console so that developers
could use them for debugging. The console loglevel is there to
keep it working in the real world.

IMHO, an acceptable solution would be:

+ Print a single critical message about that a lockup happened
+ Make it configurable which log level will get used for the
details.

Note that there is a pending patchset that allows to show stacks
with a given loglevel, see
https://lore.kernel.org/linux-riscv/[email protected]/

Well, I am slightly afraid that it might open a can with hundred of
printk-related options shuffling log level for various events.

The upcoming printk kthread might help to handle even more
messages on slow consoles. It might allow to increase the default
loglevel in these situations. But the problem will still be there.
The throughput will always be limited and different people have
different opinion on what messages are important.

Best Regards,
Petr