LinuxLists.cc - [PATCH] panic: Move panic

2021-12-30 16:19:15

Subject: [PATCH] panic: Move panic_print before kmsg dumpers

The panic_print setting allows users to collect more information in a
panic event, like memory stats, tasks, CPUs backtraces, etc.
This is a pretty interesting debug mechanism, but currently the print
event happens *after* kmsg_dump(), meaning that Pstore, for example,
cannot collect a dmesg with the panic_print information.

This patch changes that by moving the panic_print_sys_info() function
call up some lines, in order kmsg_dump() accounts this new information
and exposes it through Pstore or other kmsg dumpers.

Cc: Feng Tang <[email protected]>
Signed-off-by: Guilherme G. Piccoli <[email protected]>
---

Hey folks, thanks in advance for reviews/comments!
One alternative I've considered was to move kmsg_dump() some
lines down, I'm not sure which approach is better, can't see
much difference. Opinions are very welcome =)
Cheers,

Guilherme

kernel/panic.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/kernel/panic.c b/kernel/panic.c
index cefd7d82366f..d918da0a1c29 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -271,6 +271,8 @@ void panic(const char *fmt, ...)
*/
atomic_notifier_call_chain(&panic_notifier_list, 0, buf);

+ panic_print_sys_info();
+
kmsg_dump(KMSG_DUMP_PANIC);

/*
@@ -301,8 +303,6 @@ void panic(const char *fmt, ...)
debug_locks_off();
console_flush_on_panic(CONSOLE_FLUSH_PENDING);

- panic_print_sys_info();
-
if (!panic_blink)
panic_blink = no_blink;

--
2.34.1

2022-01-03 14:58:54

by Petr Mladek

[permalink] [raw]

Subject: Re: [PATCH] panic: Move panic_print before kmsg dumpers

On Thu 2021-12-30 13:18:28, Guilherme G. Piccoli wrote:
> The panic_print setting allows users to collect more information in a
> panic event, like memory stats, tasks, CPUs backtraces, etc.
> This is a pretty interesting debug mechanism, but currently the print
> event happens *after* kmsg_dump(), meaning that Pstore, for example,
> cannot collect a dmesg with the panic_print information.

I see the point. Unfortunately, there is some risk in this change
so I have to ask:

Did you miss this behavior in practice?
Or does it just look like something that might be useful?

> This patch changes that by moving the panic_print_sys_info() function
> call up some lines, in order kmsg_dump() accounts this new information
> and exposes it through Pstore or other kmsg dumpers.

Note that panic_print_sys_info() might produce a lot of lines. There is
non-trivial chance that a lot of information might get lost because
of the limited log buffer size.

It might cause regression even when no dumpers are registered and
crash kernel is not loaded. panic_print_sys_info() might overwrite
the existing messages before we try hard to flush them via
console_flush_on_panic().

> Cc: Feng Tang <[email protected]>
> Signed-off-by: Guilherme G. Piccoli <[email protected]>
> ---
>
> Hey folks, thanks in advance for reviews/comments!
> One alternative I've considered was to move kmsg_dump() some
> lines down, I'm not sure which approach is better, can't see
> much difference. Opinions are very welcome =)

This does not look like a good idea either. The comment below
kmsg_dump(KMSG_DUMP_PANIC) says that it should be called
before crash_kexec() that might fail.

IMHO, the best solution would be to allow dumping messages produced
by panic_print_sys_info() using the registered dumpers directly.
But it might require redesigning the kmsg_dump API.

After all, the proposed patch might be good enough.
panic_print_sys_info() does nothing by default.
It might be enough to document that a large enough log buffer
should be used when some output is enabled, especially when
a dumper is used.

Also we should mention these pitfalls and risks in the commit
message.

Best Regards,
Petr

2022-01-03 16:31:37

by Guilherme G. Piccoli

[permalink] [raw]

Subject: Re: [PATCH] panic: Move panic_print before kmsg dumpers

Hi Petr, first of all, I'd like to thank you for the _very_ complete and
informative response, pretty useful!
Some comments inline below:

On 03/01/2022 11:58, Petr Mladek wrote:
> [...]
> I see the point. Unfortunately, there is some risk in this change
> so I have to ask:
>
> Did you miss this behavior in practice?
> Or does it just look like something that might be useful?

Heh it's a valid question! And the answer is that indeed, we miss this
behavior in practice - the plan is to use pstore as panic "fast" log
collection (instead of kdump), and to print some extra info through
panic_print. Hence the surprise when that information wasn't properly
stored in the pstore saved logs...

> [...]
> Note that panic_print_sys_info() might produce a lot of lines. There is
> non-trivial chance that a lot of information might get lost because
> of the limited log buffer size.
>
> It might cause regression even when no dumpers are registered and
> crash kernel is not loaded. panic_print_sys_info() might overwrite
> the existing messages before we try hard to flush them via
> console_flush_on_panic().

Sure, we already account for this risk, and make use of "log_buf_len" to
increase dmesg size, something like 4M/8M.

Now, the second portion of your statement is a bit frightening! Maybe a
silly question, but the risk is to lose some messages in the *dmesg* or
only in the output console, like a serial/graphic console?

If the risk is to lose messages even in the dmesg (that is going to be
collected through pstore/any other kmsg dumper), do you think would make
sense to have a second (in fact, first!) "console_flush_on_panic(...)"
_before_ "panic_print_sys_info()"?
What would be the cons in this approach? I understand that we may need
to (at least) disable debug_locks earlier than it's currently done, for
example. If kexec is more risky having this earlier flush, we could
flush conditionally on not having a crash kernel loaded.

> [...]
> IMHO, the best solution would be to allow dumping messages produced
> by panic_print_sys_info() using the registered dumpers directly.
> But it might require redesigning the kmsg_dump API.
>
>
> After all, the proposed patch might be good enough.
> panic_print_sys_info() does nothing by default.
> It might be enough to document that a large enough log buffer
> should be used when some output is enabled, especially when
> a dumper is used.
>
> Also we should mention these pitfalls and risks in the commit
> message.

OK, makes total sense. I'll wait on your feedback to my points above,
and maybe others could also provide ideas/concerns, and then, I'll
rework the commit message/code accordingly.

Thanks again, and a happy new year to you and family =)
Cheers,

Guilherme

2022-01-04 11:45:27

by Petr Mladek

[permalink] [raw]

Subject: Re: [PATCH] panic: Move panic_print before kmsg dumpers

On Mon 2022-01-03 13:30:35, Guilherme G. Piccoli wrote:
> Hi Petr, first of all, I'd like to thank you for the _very_ complete and
> informative response, pretty useful!
> Some comments inline below:
>
>
> On 03/01/2022 11:58, Petr Mladek wrote:
> > [...]
> > I see the point. Unfortunately, there is some risk in this change
> > so I have to ask:
> >
> > Did you miss this behavior in practice?
> > Or does it just look like something that might be useful?
>
> Heh it's a valid question! And the answer is that indeed, we miss this
> behavior in practice - the plan is to use pstore as panic "fast" log
> collection (instead of kdump), and to print some extra info through
> panic_print. Hence the surprise when that information wasn't properly
> stored in the pstore saved logs...

Good to know.

> > [...]
> > Note that panic_print_sys_info() might produce a lot of lines. There is
> > non-trivial chance that a lot of information might get lost because
> > of the limited log buffer size.
> >
> > It might cause regression even when no dumpers are registered and
> > crash kernel is not loaded. panic_print_sys_info() might overwrite
> > the existing messages before we try hard to flush them via
> > console_flush_on_panic().
>
> Sure, we already account for this risk, and make use of "log_buf_len" to
> increase dmesg size, something like 4M/8M.
>
> Now, the second portion of your statement is a bit frightening! Maybe a
> silly question, but the risk is to lose some messages in the *dmesg* or
> only in the output console, like a serial/graphic console?
>
> If the risk is to lose messages even in the dmesg (that is going to be
> collected through pstore/any other kmsg dumper),

To make it clear. panic_printk_sys_info() might overwrite the oldest
messages stored on the log buffer. This patch moves
panic_print_sys_info() before kmsg_dump(). It means that even
kmsg_dump() would not see the overwritten oldest messages.

> do you think would make sense to have a second (in fact, first!)
> "console_flush_on_panic(...)" _before_ "panic_print_sys_info()"?
> What would be the cons in this approach? I understand that we may need
> to (at least) disable debug_locks earlier than it's currently done, for
> example. If kexec is more risky having this earlier flush, we could
> flush conditionally on not having a crash kernel loaded.

Good question. I am familiar only with the console problems and there
are many problems. Serial consoles might be slow. All consoles use
internal locks that might cause deadlocks. More complicated consoles
are even more risky.

My experience is that kexec mostly works when there is enough reserved
space for booting the crash kernel. And it can be tested before
the crash.

So, I personally think that console_flush_on_panic() is more risky
and should not get called before kexec().

Also kexec might be simply disabled when it does not work. But
console_flush_on_panic() could not be disabled if we call it
too early. We might make it configurable. But it would be
to complicated for users to get it right.

> > After all, the proposed patch might be good enough.
> > panic_print_sys_info() does nothing by default.
> > It might be enough to document that a large enough log buffer
> > should be used when some output is enabled, especially when
> > a dumper is used.
> >
> > Also we should mention these pitfalls and risks in the commit
> > message.
>
> OK, makes total sense. I'll wait on your feedback to my points above,
> and maybe others could also provide ideas/concerns, and then, I'll
> rework the commit message/code accordingly.

I think that this is the best solution after all.

Well, I see one more problem. CONSOLE_REPLAY_ALL replays all messages
on the console. This flag should be handled when it is now. I mean
after kmsg_dump(), kexec(), console_flush_on_panic().

It is pity that PANIC_PRINT_ALL_PRINTK_MSG is in "panic_print" flags.
It makes sense only for consoles. All the other flags make sense
also for kmsg_dump().

A solution would be to add a new kernel parameter that would
obsolete PANIC_PRINT_ALL_PRINTK_MSG. The parameter can be
called panic_console_replay or so.

> Thanks again, and a happy new year to you and family =)

Same to you.

Best Regards,
Petr

2022-01-04 12:36:29

by Guilherme G. Piccoli

[permalink] [raw]

Subject: Re: [PATCH] panic: Move panic_print before kmsg dumpers

Hi Petr, thanks again for the very comprehensive response.
Comments inline below:

On 04/01/2022 08:45, Petr Mladek wrote:
>[...]
> To make it clear. panic_printk_sys_info() might overwrite the oldest
> messages stored on the log buffer. This patch moves
> panic_print_sys_info() before kmsg_dump(). It means that even
> kmsg_dump() would not see the overwritten oldest messages.

Oh sure, I understand that, and that's fine - we can increase a lot the
size of the log buffer through "log_buf_len" (and it's OK to lose some
early entries in our case).
My first understanding of your message was that "panic_print_sys_info"
could "eat" recent messages in the log buffer due to the lack of flush -
and that would be potentially bad.

>
> Good question. I am familiar only with the console problems and there
> are many problems. Serial consoles might be slow. All consoles use
> internal locks that might cause deadlocks. More complicated consoles
> are even more risky.
>
> My experience is that kexec mostly works when there is enough reserved
> space for booting the crash kernel. And it can be tested before
> the crash.
>
> So, I personally think that console_flush_on_panic() is more risky
> and should not get called before kexec().
>
> Also kexec might be simply disabled when it does not work. But
> console_flush_on_panic() could not be disabled if we call it
> too early. We might make it configurable. But it would be
> to complicated for users to get it right.

OK, got it! So let's not mess with the flush machinery, seems a bit
dangerous...

>
> I think that this is the best solution after all.
>
> Well, I see one more problem. CONSOLE_REPLAY_ALL replays all messages
> on the console. This flag should be handled when it is now. I mean
> after kmsg_dump(), kexec(), console_flush_on_panic().
>
> It is pity that PANIC_PRINT_ALL_PRINTK_MSG is in "panic_print" flags.
> It makes sense only for consoles. All the other flags make sense
> also for kmsg_dump().
>
> A solution would be to add a new kernel parameter that would
> obsolete PANIC_PRINT_ALL_PRINTK_MSG. The parameter can be
> called panic_console_replay or so.
>

Hmm..makes sense. I've thought a bit about this option, it's kinda odd
compared to "print memory status, print tasks"...it would fit into
another parameter, as you proposed (and I'd be happy to do it as part of
this series). But..with that said, I understand we have quite a big
number of parameters nowadays, and panic_print having this
"CONSOLE_REPLAY_ALL" is kind of a kernel API to userspace, so to avoid
any polemics (either due to "oh no, another parameter" or due to "can't
remove that, break userspace"), what if we split panic_print_sys_info
into an upper and lower parts? I can do that using a function parameter,
like this:

<...>
panic_print_sys_info(0);
kmsg_dump();
<... stuff ...>
panic_print_sys_info(1);

So, "panic_print_sys_info(1)" is the current call, like before this
patch - it would just do the "CONSOLE_REPLAY_ALL", whereas
"panic_print_sys_info(0)" do the rest, before the kmsg dumpers. What do
you think?

If you prefer the new kernel parameter than my idea, I'm all for that as
well - it's your choice.

Thanks again for the good discussion.
Cheers,

Guilherme