LinuxLists.cc - [PATCH] kernel: count warnings and make count accessible to userspace

2022-01-19 19:10:34

Subject: [PATCH] kernel: count warnings and make count accessible to userspace

When testing, it's common to consider a warning to be a test failure,
but it's currently awkward to determine which of multiple sequential
tests is responsible for triggering a warning. Scraping dmesg or
/var/log/messages is somewhat expensive and error-prone. Setting
panic_on_warn is reliable, but spoils test runs for minor issues.
Looking at the taint bit is also reliable, but only works for a single
warning.

We can track the warning count and expose it as a sysfs file. Test
infrastructures can snapshot the value before and after a test. If
the value changes, they can do more expensive things like extracting
logs.

Signed-off-by: John Sperbeck <[email protected]>
---
kernel/panic.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/kernel/panic.c b/kernel/panic.c
index cefd7d82366f..5262c2a0ebf4 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -571,6 +571,8 @@ struct warn_args {
va_list args;
};

+static atomic_t __maybe_unused warn_counter;
+
void __warn(const char *file, int line, void *caller, unsigned taint,
struct pt_regs *regs, struct warn_args *args)
{
@@ -612,6 +614,8 @@ void __warn(const char *file, int line, void *caller, unsigned taint,

/* Just a warning, don't kill lockdep. */
add_taint(taint, LOCKDEP_STILL_OK);
+
+ atomic_inc(&warn_counter);
}

#ifndef __WARN_FLAGS
@@ -667,6 +671,7 @@ static __init int register_warn_debugfs(void)
/* Don't care about failure */
debugfs_create_file_unsafe("clear_warn_once", 0200, NULL, NULL,
&clear_warn_once_fops);
+ debugfs_create_atomic_t("warn_count", 0444, NULL, &warn_counter);
return 0;
}

--
2.34.1.703.g22d0c6ccf7-goog

2022-01-21 19:12:34

by Petr Mladek

[permalink] [raw]

Subject: Re: [PATCH] kernel: count warnings and make count accessible to userspace

Adding Andrew into Cc. Most changes in panic.c go via his tree.

On Tue 2022-01-18 06:04:31, John Sperbeck wrote:
> When testing, it's common to consider a warning to be a test failure,
> but it's currently awkward to determine which of multiple sequential
> tests is responsible for triggering a warning. Scraping dmesg or
> /var/log/messages is somewhat expensive and error-prone. Setting
> panic_on_warn is reliable, but spoils test runs for minor issues.
> Looking at the taint bit is also reliable, but only works for a single
> warning.
>
> We can track the warning count and expose it as a sysfs file. Test
> infrastructures can snapshot the value before and after a test. If
> the value changes, they can do more expensive things like extracting
> logs.

The counter makes sense. It might be useful even for normal debugging.
It would be nice to show the value in the log.

> Signed-off-by: John Sperbeck <[email protected]>
> ---
> kernel/panic.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/kernel/panic.c b/kernel/panic.c
> index cefd7d82366f..5262c2a0ebf4 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -571,6 +571,8 @@ struct warn_args {
> va_list args;
> };
>
> +static atomic_t __maybe_unused warn_counter;
> +
> void __warn(const char *file, int line, void *caller, unsigned taint,
> struct pt_regs *regs, struct warn_args *args)
> {
> @@ -612,6 +614,8 @@ void __warn(const char *file, int line, void *caller, unsigned taint,
>
> /* Just a warning, don't kill lockdep. */
> add_taint(taint, LOCKDEP_STILL_OK);
> +
> + atomic_inc(&warn_counter);
> }
>
> #ifndef __WARN_FLAGS
> @@ -667,6 +671,7 @@ static __init int register_warn_debugfs(void)
> /* Don't care about failure */
> debugfs_create_file_unsafe("clear_warn_once", 0200, NULL, NULL,
> &clear_warn_once_fops);
> + debugfs_create_atomic_t("warn_count", 0444, NULL, &warn_counter);

Is the sysfs interface really important for you use case, please?
Would the value in the log be enough?

Anyway, we already count the number WARN() reports. It is quite hidden
and hashed in init_oops_id()/print_oops_end_marker().

A solution would be to make this hidden counter more explicit.
Something like:

diff --git a/kernel/panic.c b/kernel/panic.c
index cefd7d82366f..8ac19124ceb4 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -537,13 +537,12 @@ void oops_enter(void)
* 64-bit random ID for oopses:
*/
static u64 oops_id;
+static int oops_cnt;

static int init_oops_id(void)
{
if (!oops_id)
get_random_bytes(&oops_id, sizeof(oops_id));
- else
- oops_id++;

return 0;
}
@@ -552,7 +551,9 @@ late_initcall(init_oops_id);
static void print_oops_end_marker(void)
{
init_oops_id();
- pr_warn("---[ end trace %016llx ]---\n", (unsigned long long)oops_id);
+ oops_cnt++;
+ pr_warn("---[ end trace %016llx:%d ]---\n",
+ (unsigned long long)oops_id, oops_cnt);
}

/*

The report would like like:

[ 1871.476204] WARNING: CPU: 2 PID: 2003 at samples/livepatch/livepatch-sample.c:60 livepatch_init+0x11/0x20 [livepatch_sample]
[ 1871.476905] Modules linked in: livepatch_sample(EK+) [last unloaded: livepatch_sample]
[ 1871.477509] CPU: 2 PID: 2003 Comm: modprobe Tainted: G W E K 5.16.0-default+ #312
[ 1871.478175] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
[ 1871.478858] RIP: 0010:livepatch_init+0x11/0x20 [livepatch_sample]
[...]
[ 1871.489801] hardirqs last disabled at (9188): [<ffffffffb217aa4e>] vprintk_emit+0x21e/0x2b0
[ 1871.489803] softirqs last enabled at (9096): [<ffffffffb3000364>] __do_softirq+0x364/0x4ab
[ 1871.489805] softirqs last disabled at (9083): [<ffffffffb20efb5d>] irq_exit_rcu+0x10d/0x120
[ 1871.489807] ---[ end trace a19f0f55262cfcc8:2 ]---

Best Regards,
Petr

2022-01-23 15:38:57

by John Sperbeck

[permalink] [raw]

Subject: Re: [PATCH] kernel: count warnings and make count accessible to userspace

On Wed, Jan 19, 2022 at 4:01 AM Petr Mladek <[email protected]> wrote:
>
> Adding Andrew into Cc. Most changes in panic.c go via his tree.
>
> On Tue 2022-01-18 06:04:31, John Sperbeck wrote:
> > When testing, it's common to consider a warning to be a test failure,
> > but it's currently awkward to determine which of multiple sequential
> > tests is responsible for triggering a warning. Scraping dmesg or
> > /var/log/messages is somewhat expensive and error-prone. Setting
> > panic_on_warn is reliable, but spoils test runs for minor issues.
> > Looking at the taint bit is also reliable, but only works for a single
> > warning.
> >
> > We can track the warning count and expose it as a sysfs file. Test
> > infrastructures can snapshot the value before and after a test. If
> > the value changes, they can do more expensive things like extracting
> > logs.
>
> The counter makes sense. It might be useful even for normal debugging.
> It would be nice to show the value in the log.
>
> > Signed-off-by: John Sperbeck <[email protected]>
> > ---
> > kernel/panic.c | 5 +++++
> > 1 file changed, 5 insertions(+)
> >
> > diff --git a/kernel/panic.c b/kernel/panic.c
> > index cefd7d82366f..5262c2a0ebf4 100644
> > --- a/kernel/panic.c
> > +++ b/kernel/panic.c
> > @@ -571,6 +571,8 @@ struct warn_args {
> > va_list args;
> > };
> >
> > +static atomic_t __maybe_unused warn_counter;
> > +
> > void __warn(const char *file, int line, void *caller, unsigned taint,
> > struct pt_regs *regs, struct warn_args *args)
> > {
> > @@ -612,6 +614,8 @@ void __warn(const char *file, int line, void *caller, unsigned taint,
> >
> > /* Just a warning, don't kill lockdep. */
> > add_taint(taint, LOCKDEP_STILL_OK);
> > +
> > + atomic_inc(&warn_counter);
> > }
> >
> > #ifndef __WARN_FLAGS
> > @@ -667,6 +671,7 @@ static __init int register_warn_debugfs(void)
> > /* Don't care about failure */
> > debugfs_create_file_unsafe("clear_warn_once", 0200, NULL, NULL,
> > &clear_warn_once_fops);
> > + debugfs_create_atomic_t("warn_count", 0444, NULL, &warn_counter);
>
> Is the sysfs interface really important for you use case, please?
> Would the value in the log be enough?
>
> Anyway, we already count the number WARN() reports. It is quite hidden
> and hashed in init_oops_id()/print_oops_end_marker().
>
> A solution would be to make this hidden counter more explicit.
> Something like:
>
> diff --git a/kernel/panic.c b/kernel/panic.c
> index cefd7d82366f..8ac19124ceb4 100644
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -537,13 +537,12 @@ void oops_enter(void)
> * 64-bit random ID for oopses:
> */
> static u64 oops_id;
> +static int oops_cnt;
>
> static int init_oops_id(void)
> {
> if (!oops_id)
> get_random_bytes(&oops_id, sizeof(oops_id));
> - else
> - oops_id++;
>
> return 0;
> }
> @@ -552,7 +551,9 @@ late_initcall(init_oops_id);
> static void print_oops_end_marker(void)
> {
> init_oops_id();
> - pr_warn("---[ end trace %016llx ]---\n", (unsigned long long)oops_id);
> + oops_cnt++;
> + pr_warn("---[ end trace %016llx:%d ]---\n",
> + (unsigned long long)oops_id, oops_cnt);
> }
>
> /*
>
>
> The report would like like:
>
> [ 1871.476204] WARNING: CPU: 2 PID: 2003 at samples/livepatch/livepatch-sample.c:60 livepatch_init+0x11/0x20 [livepatch_sample]
> [ 1871.476905] Modules linked in: livepatch_sample(EK+) [last unloaded: livepatch_sample]
> [ 1871.477509] CPU: 2 PID: 2003 Comm: modprobe Tainted: G W E K 5.16.0-default+ #312
> [ 1871.478175] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
> [ 1871.478858] RIP: 0010:livepatch_init+0x11/0x20 [livepatch_sample]
> [...]
> [ 1871.489801] hardirqs last disabled at (9188): [<ffffffffb217aa4e>] vprintk_emit+0x21e/0x2b0
> [ 1871.489803] softirqs last enabled at (9096): [<ffffffffb3000364>] __do_softirq+0x364/0x4ab
> [ 1871.489805] softirqs last disabled at (9083): [<ffffffffb20efb5d>] irq_exit_rcu+0x10d/0x120
> [ 1871.489807] ---[ end trace a19f0f55262cfcc8:2 ]---
>
> Best Regards,
> Petr

Thanks for adding the CC.

The sysfs interface was kind of the main thing I'm after. The intent is
to avoid parsing logs unless something interesting is likely to be there.

If we were to rely entirely on log parsing, adding a count to the logs
doesn't add much, I think. That is, if a warning is parsed, then we
already know it occurred, and don't need the counter value in the message
that we parsed to tell us what we just discovered.

I see some precedent for exposing counts for things like this: lockdep,
kfence, DMA-API debug. Warnings, KASAN, and UBSAN are some areas that
don't have this convenience.

If the sysfs interface isn't palatable, that's okay. Supporting a
three-line patch locally isn't a tremendous burden.

2022-01-25 09:39:11

by Petr Mladek

[permalink] [raw]

Subject: Re: [PATCH] kernel: count warnings and make count accessible to userspace

Adding Andrew back to Cc :-)

On Sat 2022-01-22 12:58:39, John Sperbeck wrote:
> On Wed, Jan 19, 2022 at 4:01 AM Petr Mladek <[email protected]> wrote:
> >
> > Adding Andrew into Cc. Most changes in panic.c go via his tree.
> >
> > On Tue 2022-01-18 06:04:31, John Sperbeck wrote:
> > > When testing, it's common to consider a warning to be a test failure,
> > > but it's currently awkward to determine which of multiple sequential
> > > tests is responsible for triggering a warning. Scraping dmesg or
> > > /var/log/messages is somewhat expensive and error-prone. Setting
> > > panic_on_warn is reliable, but spoils test runs for minor issues.
> > > Looking at the taint bit is also reliable, but only works for a single
> > > warning.
> > >
> > > We can track the warning count and expose it as a sysfs file. Test
> > > infrastructures can snapshot the value before and after a test. If
> > > the value changes, they can do more expensive things like extracting
> > > logs.
> >
> > The counter makes sense. It might be useful even for normal debugging.
> > It would be nice to show the value in the log.
> >
> > > Signed-off-by: John Sperbeck <[email protected]>
> > > ---
> > > kernel/panic.c | 5 +++++
> > > 1 file changed, 5 insertions(+)
> > >
> > > diff --git a/kernel/panic.c b/kernel/panic.c
> > > index cefd7d82366f..5262c2a0ebf4 100644
> > > --- a/kernel/panic.c
> > > +++ b/kernel/panic.c
> > > @@ -571,6 +571,8 @@ struct warn_args {
> > > va_list args;
> > > };
> > >
> > > +static atomic_t __maybe_unused warn_counter;
> > > +
> > > void __warn(const char *file, int line, void *caller, unsigned taint,
> > > struct pt_regs *regs, struct warn_args *args)
> > > {
> > > @@ -612,6 +614,8 @@ void __warn(const char *file, int line, void *caller, unsigned taint,
> > >
> > > /* Just a warning, don't kill lockdep. */
> > > add_taint(taint, LOCKDEP_STILL_OK);
> > > +
> > > + atomic_inc(&warn_counter);
> > > }
> > >
> > > #ifndef __WARN_FLAGS
> > > @@ -667,6 +671,7 @@ static __init int register_warn_debugfs(void)
> > > /* Don't care about failure */
> > > debugfs_create_file_unsafe("clear_warn_once", 0200, NULL, NULL,
> > > &clear_warn_once_fops);
> > > + debugfs_create_atomic_t("warn_count", 0444, NULL, &warn_counter);
> >
> > Is the sysfs interface really important for you use case, please?
> > Would the value in the log be enough?
> >
> > Anyway, we already count the number WARN() reports. It is quite hidden
> > and hashed in init_oops_id()/print_oops_end_marker().
> >
> > A solution would be to make this hidden counter more explicit.
> > Something like:
> >
> > diff --git a/kernel/panic.c b/kernel/panic.c
> > index cefd7d82366f..8ac19124ceb4 100644
> > --- a/kernel/panic.c
> > +++ b/kernel/panic.c
> > @@ -537,13 +537,12 @@ void oops_enter(void)
> > * 64-bit random ID for oopses:
> > */
> > static u64 oops_id;
> > +static int oops_cnt;
> >
> > static int init_oops_id(void)
> > {
> > if (!oops_id)
> > get_random_bytes(&oops_id, sizeof(oops_id));
> > - else
> > - oops_id++;
> >
> > return 0;
> > }
> > @@ -552,7 +551,9 @@ late_initcall(init_oops_id);
> > static void print_oops_end_marker(void)
> > {
> > init_oops_id();
> > - pr_warn("---[ end trace %016llx ]---\n", (unsigned long long)oops_id);
> > + oops_cnt++;
> > + pr_warn("---[ end trace %016llx:%d ]---\n",
> > + (unsigned long long)oops_id, oops_cnt);
> > }
> >
> > /*
> >
> >
> > The report would like like:
> >
> > [ 1871.476204] WARNING: CPU: 2 PID: 2003 at samples/livepatch/livepatch-sample.c:60 livepatch_init+0x11/0x20 [livepatch_sample]
> > [ 1871.476905] Modules linked in: livepatch_sample(EK+) [last unloaded: livepatch_sample]
> > [ 1871.477509] CPU: 2 PID: 2003 Comm: modprobe Tainted: G W E K 5.16.0-default+ #312
> > [ 1871.478175] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
> > [ 1871.478858] RIP: 0010:livepatch_init+0x11/0x20 [livepatch_sample]
> > [...]
> > [ 1871.489801] hardirqs last disabled at (9188): [<ffffffffb217aa4e>] vprintk_emit+0x21e/0x2b0
> > [ 1871.489803] softirqs last enabled at (9096): [<ffffffffb3000364>] __do_softirq+0x364/0x4ab
> > [ 1871.489805] softirqs last disabled at (9083): [<ffffffffb20efb5d>] irq_exit_rcu+0x10d/0x120
> > [ 1871.489807] ---[ end trace a19f0f55262cfcc8:2 ]---
> >
> > Best Regards,
> > Petr
>
> Thanks for adding the CC.
>
> The sysfs interface was kind of the main thing I'm after. The intent is
> to avoid parsing logs unless something interesting is likely to be there.
>
> If we were to rely entirely on log parsing, adding a count to the logs
> doesn't add much, I think. That is, if a warning is parsed, then we
> already know it occurred, and don't need the counter value in the message
> that we parsed to tell us what we just discovered.

Fair enough.

> I see some precedent for exposing counts for things like this: lockdep,
> kfence, DMA-API debug. Warnings, KASAN, and UBSAN are some areas that
> don't have this convenience.

I see.

> If the sysfs interface isn't palatable, that's okay. Supporting a
> three-line patch locally isn't a tremendous burden.

I am not against the debugfs interface. The counter might be useful.

Well, I would still combine it with my proposal. The number of
warnings is already "counted" by the number printed in
'end trace XXXXXXXXXXXXXX' line. It does not make sense to
add yet another counter.

Best Regards,
Petr

2022-01-27 07:28:06

by John Sperbeck

[permalink] [raw]

Subject: Re: [PATCH] kernel: count warnings and make count accessible to userspace

On Tue, Jan 25, 2022 at 12:52 AM Petr Mladek <[email protected]> wrote:
>
> Adding Andrew back to Cc :-)
>
> On Sat 2022-01-22 12:58:39, John Sperbeck wrote:
> > On Wed, Jan 19, 2022 at 4:01 AM Petr Mladek <[email protected]> wrote:
> > >
> > > Adding Andrew into Cc. Most changes in panic.c go via his tree.
> > >
> > > On Tue 2022-01-18 06:04:31, John Sperbeck wrote:
> > > > When testing, it's common to consider a warning to be a test failure,
> > > > but it's currently awkward to determine which of multiple sequential
> > > > tests is responsible for triggering a warning. Scraping dmesg or
> > > > /var/log/messages is somewhat expensive and error-prone. Setting
> > > > panic_on_warn is reliable, but spoils test runs for minor issues.
> > > > Looking at the taint bit is also reliable, but only works for a single
> > > > warning.
> > > >
> > > > We can track the warning count and expose it as a sysfs file. Test
> > > > infrastructures can snapshot the value before and after a test. If
> > > > the value changes, they can do more expensive things like extracting
> > > > logs.
> > >
> > > The counter makes sense. It might be useful even for normal debugging.
> > > It would be nice to show the value in the log.
> > >
> > > > Signed-off-by: John Sperbeck <[email protected]>
> > > > ---
> > > > kernel/panic.c | 5 +++++
> > > > 1 file changed, 5 insertions(+)
> > > >
> > > > diff --git a/kernel/panic.c b/kernel/panic.c
> > > > index cefd7d82366f..5262c2a0ebf4 100644
> > > > --- a/kernel/panic.c
> > > > +++ b/kernel/panic.c
> > > > @@ -571,6 +571,8 @@ struct warn_args {
> > > > va_list args;
> > > > };
> > > >
> > > > +static atomic_t __maybe_unused warn_counter;
> > > > +
> > > > void __warn(const char *file, int line, void *caller, unsigned taint,
> > > > struct pt_regs *regs, struct warn_args *args)
> > > > {
> > > > @@ -612,6 +614,8 @@ void __warn(const char *file, int line, void *caller, unsigned taint,
> > > >
> > > > /* Just a warning, don't kill lockdep. */
> > > > add_taint(taint, LOCKDEP_STILL_OK);
> > > > +
> > > > + atomic_inc(&warn_counter);
> > > > }
> > > >
> > > > #ifndef __WARN_FLAGS
> > > > @@ -667,6 +671,7 @@ static __init int register_warn_debugfs(void)
> > > > /* Don't care about failure */
> > > > debugfs_create_file_unsafe("clear_warn_once", 0200, NULL, NULL,
> > > > &clear_warn_once_fops);
> > > > + debugfs_create_atomic_t("warn_count", 0444, NULL, &warn_counter);
> > >
> > > Is the sysfs interface really important for you use case, please?
> > > Would the value in the log be enough?
> > >
> > > Anyway, we already count the number WARN() reports. It is quite hidden
> > > and hashed in init_oops_id()/print_oops_end_marker().
> > >
> > > A solution would be to make this hidden counter more explicit.
> > > Something like:
> > >
> > > diff --git a/kernel/panic.c b/kernel/panic.c
> > > index cefd7d82366f..8ac19124ceb4 100644
> > > --- a/kernel/panic.c
> > > +++ b/kernel/panic.c
> > > @@ -537,13 +537,12 @@ void oops_enter(void)
> > > * 64-bit random ID for oopses:
> > > */
> > > static u64 oops_id;
> > > +static int oops_cnt;
> > >
> > > static int init_oops_id(void)
> > > {
> > > if (!oops_id)
> > > get_random_bytes(&oops_id, sizeof(oops_id));
> > > - else
> > > - oops_id++;
> > >
> > > return 0;
> > > }
> > > @@ -552,7 +551,9 @@ late_initcall(init_oops_id);
> > > static void print_oops_end_marker(void)
> > > {
> > > init_oops_id();
> > > - pr_warn("---[ end trace %016llx ]---\n", (unsigned long long)oops_id);
> > > + oops_cnt++;
> > > + pr_warn("---[ end trace %016llx:%d ]---\n",
> > > + (unsigned long long)oops_id, oops_cnt);
> > > }
> > >
> > > /*
> > >
> > >
> > > The report would like like:
> > >
> > > [ 1871.476204] WARNING: CPU: 2 PID: 2003 at samples/livepatch/livepatch-sample.c:60 livepatch_init+0x11/0x20 [livepatch_sample]
> > > [ 1871.476905] Modules linked in: livepatch_sample(EK+) [last unloaded: livepatch_sample]
> > > [ 1871.477509] CPU: 2 PID: 2003 Comm: modprobe Tainted: G W E K 5.16.0-default+ #312
> > > [ 1871.478175] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
> > > [ 1871.478858] RIP: 0010:livepatch_init+0x11/0x20 [livepatch_sample]
> > > [...]
> > > [ 1871.489801] hardirqs last disabled at (9188): [<ffffffffb217aa4e>] vprintk_emit+0x21e/0x2b0
> > > [ 1871.489803] softirqs last enabled at (9096): [<ffffffffb3000364>] __do_softirq+0x364/0x4ab
> > > [ 1871.489805] softirqs last disabled at (9083): [<ffffffffb20efb5d>] irq_exit_rcu+0x10d/0x120
> > > [ 1871.489807] ---[ end trace a19f0f55262cfcc8:2 ]---
> > >
> > > Best Regards,
> > > Petr
> >
> > Thanks for adding the CC.
> >
> > The sysfs interface was kind of the main thing I'm after. The intent is
> > to avoid parsing logs unless something interesting is likely to be there.
> >
> > If we were to rely entirely on log parsing, adding a count to the logs
> > doesn't add much, I think. That is, if a warning is parsed, then we
> > already know it occurred, and don't need the counter value in the message
> > that we parsed to tell us what we just discovered.
>
> Fair enough.
>
> > I see some precedent for exposing counts for things like this: lockdep,
> > kfence, DMA-API debug. Warnings, KASAN, and UBSAN are some areas that
> > don't have this convenience.
>
> I see.
>
> > If the sysfs interface isn't palatable, that's okay. Supporting a
> > three-line patch locally isn't a tremendous burden.
>
> I am not against the debugfs interface. The counter might be useful.
>
> Well, I would still combine it with my proposal. The number of
> warnings is already "counted" by the number printed in
> 'end trace XXXXXXXXXXXXXX' line. It does not make sense to
> add yet another counter.
>
> Best Regards,
> Petr

I'm not sure I understand the comment that it doesn't make sense to
add yet another counter? Isn't the alternate proposal to add
'oops_cnt' adding a counter?

Also, is a simple 'int' adequate? Are there any concerns about
multiple CPUs calling warn at the same time?

I'm also wondering about the ramifications of changing the format of a
string that the log-scraping crowd might use for identifying the end
of a warning?

Finally, wouldn't dropping the 'oops_id++' from init_oops_id() lose
the intent of d6624f996ae5 ("oops: increment the oops UUID every time
we oops")?

2022-02-11 13:52:09

by Petr Mladek

[permalink] [raw]

Subject: Re: [PATCH] kernel: count warnings and make count accessible to userspace

Hi John,

I am sorry for the late reply. I had vacation last week. Many things
accumulated...

On Wed 2022-01-26 16:23:31, John Sperbeck wrote:
> On Tue, Jan 25, 2022 at 12:52 AM Petr Mladek <[email protected]> wrote:
> >
> > Adding Andrew back to Cc :-)
> >
> > On Sat 2022-01-22 12:58:39, John Sperbeck wrote:
> > > On Wed, Jan 19, 2022 at 4:01 AM Petr Mladek <[email protected]> wrote:
> > > >
> > > > Adding Andrew into Cc. Most changes in panic.c go via his tree.
> > > >
> > > > On Tue 2022-01-18 06:04:31, John Sperbeck wrote:
> > > > > When testing, it's common to consider a warning to be a test failure,
> > > > > but it's currently awkward to determine which of multiple sequential
> > > > > tests is responsible for triggering a warning. Scraping dmesg or
> > > > > /var/log/messages is somewhat expensive and error-prone. Setting
> > > > > panic_on_warn is reliable, but spoils test runs for minor issues.
> > > > > Looking at the taint bit is also reliable, but only works for a single
> > > > > warning.
> > > > >
> > > > > We can track the warning count and expose it as a sysfs file. Test
> > > > > infrastructures can snapshot the value before and after a test. If
> > > > > the value changes, they can do more expensive things like extracting
> > > > > logs.
> > > >
> > > > The counter makes sense. It might be useful even for normal debugging.
> > > > It would be nice to show the value in the log.
> > > >
> > > > > Signed-off-by: John Sperbeck <[email protected]>
> > > > > ---
> > > > > kernel/panic.c | 5 +++++
> > > > > 1 file changed, 5 insertions(+)
> > > > >
> > > > > diff --git a/kernel/panic.c b/kernel/panic.c
> > > > > index cefd7d82366f..5262c2a0ebf4 100644
> > > > > --- a/kernel/panic.c
> > > > > +++ b/kernel/panic.c
> > > > > @@ -571,6 +571,8 @@ struct warn_args {
> > > > > va_list args;
> > > > > };
> > > > >
> > > > > +static atomic_t __maybe_unused warn_counter;
> > > > > +
> > > > > void __warn(const char *file, int line, void *caller, unsigned taint,
> > > > > struct pt_regs *regs, struct warn_args *args)
> > > > > {
> > > > > @@ -612,6 +614,8 @@ void __warn(const char *file, int line, void *caller, unsigned taint,
> > > > >
> > > > > /* Just a warning, don't kill lockdep. */
> > > > > add_taint(taint, LOCKDEP_STILL_OK);
> > > > > +
> > > > > + atomic_inc(&warn_counter);
> > > > > }
> > > > >
> > > > > #ifndef __WARN_FLAGS
> > > > > @@ -667,6 +671,7 @@ static __init int register_warn_debugfs(void)
> > > > > /* Don't care about failure */
> > > > > debugfs_create_file_unsafe("clear_warn_once", 0200, NULL, NULL,
> > > > > &clear_warn_once_fops);
> > > > > + debugfs_create_atomic_t("warn_count", 0444, NULL, &warn_counter);
> > > >
> > > > Is the sysfs interface really important for you use case, please?
> > > > Would the value in the log be enough?
> > > >
> > > > Anyway, we already count the number WARN() reports. It is quite hidden
> > > > and hashed in init_oops_id()/print_oops_end_marker().
> > > >
> > > > A solution would be to make this hidden counter more explicit.
> > > > Something like:
> > > >
> > > > diff --git a/kernel/panic.c b/kernel/panic.c
> > > > index cefd7d82366f..8ac19124ceb4 100644
> > > > --- a/kernel/panic.c
> > > > +++ b/kernel/panic.c
> > > > @@ -537,13 +537,12 @@ void oops_enter(void)
> > > > * 64-bit random ID for oopses:
> > > > */
> > > > static u64 oops_id;
> > > > +static int oops_cnt;
> > > >
> > > > static int init_oops_id(void)
> > > > {
> > > > if (!oops_id)
> > > > get_random_bytes(&oops_id, sizeof(oops_id));
> > > > - else
> > > > - oops_id++;
> > > >
> > > > return 0;
> > > > }
> > > > @@ -552,7 +551,9 @@ late_initcall(init_oops_id);
> > > > static void print_oops_end_marker(void)
> > > > {
> > > > init_oops_id();
> > > > - pr_warn("---[ end trace %016llx ]---\n", (unsigned long long)oops_id);
> > > > + oops_cnt++;
> > > > + pr_warn("---[ end trace %016llx:%d ]---\n",
> > > > + (unsigned long long)oops_id, oops_cnt);
> > > > }
> > > >
> > > > /*
> > > >
> > > >
> > > > The report would like like:
> > > >
> > > > [ 1871.476204] WARNING: CPU: 2 PID: 2003 at samples/livepatch/livepatch-sample.c:60 livepatch_init+0x11/0x20 [livepatch_sample]
> > > > [ 1871.476905] Modules linked in: livepatch_sample(EK+) [last unloaded: livepatch_sample]
> > > > [ 1871.477509] CPU: 2 PID: 2003 Comm: modprobe Tainted: G W E K 5.16.0-default+ #312
> > > > [ 1871.478175] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
> > > > [ 1871.478858] RIP: 0010:livepatch_init+0x11/0x20 [livepatch_sample]
> > > > [...]
> > > > [ 1871.489801] hardirqs last disabled at (9188): [<ffffffffb217aa4e>] vprintk_emit+0x21e/0x2b0
> > > > [ 1871.489803] softirqs last enabled at (9096): [<ffffffffb3000364>] __do_softirq+0x364/0x4ab
> > > > [ 1871.489805] softirqs last disabled at (9083): [<ffffffffb20efb5d>] irq_exit_rcu+0x10d/0x120
> > > > [ 1871.489807] ---[ end trace a19f0f55262cfcc8:2 ]---
> > > >
> > > > Best Regards,
> > > > Petr
> > >
> > > Thanks for adding the CC.
> > >
> > > The sysfs interface was kind of the main thing I'm after. The intent is
> > > to avoid parsing logs unless something interesting is likely to be there.
> > >
> > > If we were to rely entirely on log parsing, adding a count to the logs
> > > doesn't add much, I think. That is, if a warning is parsed, then we
> > > already know it occurred, and don't need the counter value in the message
> > > that we parsed to tell us what we just discovered.
> >
> > Fair enough.
> >
> > > I see some precedent for exposing counts for things like this: lockdep,
> > > kfence, DMA-API debug. Warnings, KASAN, and UBSAN are some areas that
> > > don't have this convenience.
> >
> > I see.
> >
> > > If the sysfs interface isn't palatable, that's okay. Supporting a
> > > three-line patch locally isn't a tremendous burden.
> >
> > I am not against the debugfs interface. The counter might be useful.
> >
> > Well, I would still combine it with my proposal. The number of
> > warnings is already "counted" by the number printed in
> > 'end trace XXXXXXXXXXXXXX' line. It does not make sense to
> > add yet another counter.
> >
> > Best Regards,
> > Petr
>
> I'm not sure I understand the comment that it doesn't make sense to
> add yet another counter? Isn't the alternate proposal to add
> 'oops_cnt' adding a counter?

The 2nd counter is already there. 'oops_id' is initialized with a random
number but it is always incremented when print_oops_end_marker() is
called.

I though that 'oops_id' should be incremented together with your
'warn_count' to keep them in sync. But I see that
print_oops_end_marker() is called in various situations when
a backtrace is printed.

I agree that we need another counter for __warn() calls.

> Also, is a simple 'int' adequate? Are there any concerns about
> multiple CPUs calling warn at the same time?

Yes, it should be atomic to avoid any race. But oops_id is 'int'.
Also oops_in_progress is simple 'int'. And it seems to be good
enough.

Well, it makes sense to do it properly when introducing new counter.

The patch makes sense after all. Feel free to use:

Acked-by: Petr Mladek <[email protected]>

Andrew, do you want to take it via -mm tree?

Best Regards,
Petr