The panic_print setting allows users to collect more information in a
panic event, like memory stats, tasks, CPUs backtraces, etc.
This is a pretty interesting debug mechanism, but currently the print
event happens *after* kmsg_dump(), meaning that pstore, for example,
cannot collect a dmesg with the panic_print information.
This patch changes that in 2 ways:
(a) First, after a good discussion with Petr in the mailing-list[0],
he noticed that one specific setting of panic_print (the console replay,
bit 5) should not be executed before console proper flushing; hence we
hereby split the panic_print_sys_info() function in upper and lower
portions: if the parameter "after_kmsg_dumpers" is passed, only bit 5
(the console replay thing) is evaluated and the function returns - this
is the lower portion. Otherwise all other bits are checked and the
function prints the user required information; this is the upper/earlier
mode.
(b) With the above change, we can safely insert a panic_print_sys_info()
call up some lines, in order kmsg_dump() accounts this new information
and exposes it through pstore or other kmsg dumpers. Notice that this
new earlier call doesn't set "after_kmsg_dumpers" so we print the
information set by user in panic_print, except the console replay that,
if set, will be executed after the console flushing.
In the first version of this patch we discussed the risks in the
mailing-list[0], and seems this approach is the least terrible,
despite still having risks of polluting the log with the bulk of
information that panic_print may bring, losing older messages.
In order to attenuate that, we've added a warning in the
kernel-parameters.txt so users enabling this mechanism consider
to increase the log buffer size via "log_buf_len" as well.
[0] https://lore.kernel.org/lkml/[email protected]
Cc: Feng Tang <[email protected]>
Cc: Petr Mladek <[email protected]>
Signed-off-by: Guilherme G. Piccoli <[email protected]>
---
V2:
- Rebased to linux-next (20220106), since there are patches
on linux-next touching this portion of code.
- Better document the risks of panic_print and suggest users
to increase log buffer size with log_buf_len when using it.
- "Split" panic_print_sys_info() in order to execute the info
collection in the upper part (allowing kmsg dumpers to save
that information) and the console replay in the lower part,
after the console flush.
Petr, thanks a lot for your review and suggestions!
This is my idea as I've mentioned in the V1 thread, but if you
think the parameter is better, i.e, decouple the console replay
from panic_print, let me know and I can rework - your choice.
I just wanted to avoid adding yet another kernel parameter,
since it's not strictly needed heh
Cheers!
Documentation/admin-guide/kernel-parameters.txt | 4 ++++
kernel/panic.c | 15 ++++++++++-----
2 files changed, 14 insertions(+), 5 deletions(-)
diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index a069d8fe2fee..0f5cbe141bfd 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3727,6 +3727,10 @@
bit 4: print ftrace buffer
bit 5: print all printk messages in buffer
bit 6: print all CPUs backtrace (if available in the arch)
+ *Be aware* that this option may print a _lot_ of lines,
+ so there are risks of losing older messages in the log.
+ Use this option carefully, maybe worth to setup a
+ bigger log buffer with "log_buf_len" along with this.
panic_on_taint= Bitmask for conditionally calling panic() in add_taint()
Format: <hex>[,nousertaint]
diff --git a/kernel/panic.c b/kernel/panic.c
index 41ecf9ab824a..fc055a91c103 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -148,10 +148,13 @@ void nmi_panic(struct pt_regs *regs, const char *msg)
}
EXPORT_SYMBOL(nmi_panic);
-static void panic_print_sys_info(void)
+static void panic_print_sys_info(bool after_kmsg_dumpers)
{
- if (panic_print & PANIC_PRINT_ALL_PRINTK_MSG)
- console_flush_on_panic(CONSOLE_REPLAY_ALL);
+ if (after_kmsg_dumpers) {
+ if (panic_print & PANIC_PRINT_ALL_PRINTK_MSG)
+ console_flush_on_panic(CONSOLE_REPLAY_ALL);
+ return;
+ }
if (panic_print & PANIC_PRINT_ALL_CPU_BT)
trigger_all_cpu_backtrace();
@@ -249,7 +252,7 @@ void panic(const char *fmt, ...)
* show some extra information on kernel log if it was set...
*/
if (kexec_crash_loaded())
- panic_print_sys_info();
+ panic_print_sys_info(false);
/*
* If we have crashed and we have a crash kernel loaded let it handle
@@ -283,6 +286,8 @@ void panic(const char *fmt, ...)
*/
atomic_notifier_call_chain(&panic_notifier_list, 0, buf);
+ panic_print_sys_info(false);
+
kmsg_dump(KMSG_DUMP_PANIC);
/*
@@ -313,7 +318,7 @@ void panic(const char *fmt, ...)
debug_locks_off();
console_flush_on_panic(CONSOLE_FLUSH_PENDING);
- panic_print_sys_info();
+ panic_print_sys_info(true);
if (!panic_blink)
panic_blink = no_blink;
--
2.34.1
On Thu 2022-01-06 18:28:35, Guilherme G. Piccoli wrote:
> The panic_print setting allows users to collect more information in a
> panic event, like memory stats, tasks, CPUs backtraces, etc.
> This is a pretty interesting debug mechanism, but currently the print
> event happens *after* kmsg_dump(), meaning that pstore, for example,
> cannot collect a dmesg with the panic_print information.
> --- a/kernel/panic.c
> +++ b/kernel/panic.c
> @@ -249,7 +252,7 @@ void panic(const char *fmt, ...)
> * show some extra information on kernel log if it was set...
> */
> if (kexec_crash_loaded())
> - panic_print_sys_info();
> + panic_print_sys_info(false);
panic_print_sys_info(false) will be called twice when both
kexec_crash_loaded() and _crash_kexec_post_notifiers are true.
Do we really need to call panic_print_sys_info() here? All information
provided by panic_print_sys_info(false) can be found also in
the crash dump.
> /*
> * If we have crashed and we have a crash kernel loaded let it handle
> @@ -283,6 +286,8 @@ void panic(const char *fmt, ...)
> */
> atomic_notifier_call_chain(&panic_notifier_list, 0, buf);
>
> + panic_print_sys_info(false);
This is where the info might be printed 2nd time.
> +
> kmsg_dump(KMSG_DUMP_PANIC);
>
> /*
Otherwise, the change makes sense to me.
Best Regards,
Petr
On 13/01/2022 08:50, Petr Mladek wrote:
>> @@ -249,7 +252,7 @@ void panic(const char *fmt, ...)
>> * show some extra information on kernel log if it was set...
>> */
>> if (kexec_crash_loaded())
>> - panic_print_sys_info();
>> + panic_print_sys_info(false);
>
> panic_print_sys_info(false) will be called twice when both
> kexec_crash_loaded() and _crash_kexec_post_notifiers are true.
>
> Do we really need to call panic_print_sys_info() here? All information
> provided by panic_print_sys_info(false) can be found also in
> the crash dump.
>
>> /*
>> * If we have crashed and we have a crash kernel loaded let it handle
>> @@ -283,6 +286,8 @@ void panic(const char *fmt, ...)
>> */
>> atomic_notifier_call_chain(&panic_notifier_list, 0, buf);
>>
>> + panic_print_sys_info(false);
>
> This is where the info might be printed 2nd time.
>
>> +
>> kmsg_dump(KMSG_DUMP_PANIC);
>>
>> /*
>
> Otherwise, the change makes sense to me.
>
> Best Regards,
> Petr
Hi Petr, thanks for your great review!
I see you also commented in the thread of the patch introducing the
panic_print_sys_info() before kdump.
Thanks for catching this issue - indeed, if
"_crash_kexec_post_notifiers" is true, with this patch we print stuff
twice. I will submit a V3 that guards against that, using a bool, makes
sense to you?
The interesting question here is:
> Do we really need to call panic_print_sys_info() here? All information
> provided by panic_print_sys_info(false) can be found also in
> the crash dump.
So, we indeed need that in our use case. Crash is meant to be used
post-mortem, i.e., you made a full vmcore collection and then, of
course, you have basically all the data you need accessible though the
crash tool.
Problem is: in our use case, we want more data than a regular dmesg in a
panic event (hence we use panic_print), but we don't collect a full
crash dump, due to its big size. Also, as you can imagine, we do favor
pstore over kdump, but it might fail due to a variety of reasons (like
not having a free RAM buffer for ramoops), so kdump is our fallback.
Hence, we'd like to be able to use panic_print with both kdump and
pstore, and for that, both patches are needed.
Cheers,
Guilherme
On Thu 2022-01-13 09:34:01, Guilherme G. Piccoli wrote:
> On 13/01/2022 08:50, Petr Mladek wrote:
> >> @@ -249,7 +252,7 @@ void panic(const char *fmt, ...)
> >> * show some extra information on kernel log if it was set...
> >> */
> >> if (kexec_crash_loaded())
> >> - panic_print_sys_info();
> >> + panic_print_sys_info(false);
> >
> > panic_print_sys_info(false) will be called twice when both
> > kexec_crash_loaded() and _crash_kexec_post_notifiers are true.
> >
> > Do we really need to call panic_print_sys_info() here? All information
> > provided by panic_print_sys_info(false) can be found also in
> > the crash dump.
> >
> >> /*
> >> * If we have crashed and we have a crash kernel loaded let it handle
> >> @@ -283,6 +286,8 @@ void panic(const char *fmt, ...)
> >> */
> >> atomic_notifier_call_chain(&panic_notifier_list, 0, buf);
> >>
> >> + panic_print_sys_info(false);
> >
> > This is where the info might be printed 2nd time.
> >
> >> +
> >> kmsg_dump(KMSG_DUMP_PANIC);
> Thanks for catching this issue - indeed, if
> "_crash_kexec_post_notifiers" is true, with this patch we print stuff
> twice. I will submit a V3 that guards against that, using a bool, makes
> sense to you?
It might be possible to check kexec_crash_loaded() on the two
locations. But I think about even easier solution, see below.
> The interesting question here is:
> > Do we really need to call panic_print_sys_info() here? All information
> > provided by panic_print_sys_info(false) can be found also in
> > the crash dump.
>
> So, we indeed need that in our use case. Crash is meant to be used
> post-mortem, i.e., you made a full vmcore collection and then, of
> course, you have basically all the data you need accessible though the
> crash tool.
>
> Problem is: in our use case, we want more data than a regular dmesg in a
> panic event (hence we use panic_print), but we don't collect a full
> crash dump, due to its big size. Also, as you can imagine, we do favor
> pstore over kdump, but it might fail due to a variety of reasons (like
> not having a free RAM buffer for ramoops), so kdump is our fallback.
> Hence, we'd like to be able to use panic_print with both kdump and
> pstore, and for that, both patches are needed.
Fair enough.
OK, do we have any specific reason why panic_print_sys_info()
should get called right before kmsg_dump() when this code patch
is used?
Alternative solution would be to remove the check of
kexec_crash_loaded() and always call panic_print_sys_info(false)
at the beginning (after kgdb_panic(buf)).
The advantage is that panic_print_sys_info(false) will be always
called on the same location. It will give the same results
in all code paths so that it will be easier to interpret them.
And it will have the same problems so it should be easier
to debug and maintain.
It is possible that it will not work for some users. Also it is
possible that it might cause some problems. But it is hard to
guess at least for me.
I think that we might try it and see if anyone complains.
Honestly, I think that only few people use panic_printk_sys_info().
And your use-case makes sense.
Best Regards,
Petr
On 13/01/2022 11:22, Petr Mladek wrote:
> [...]
> OK, do we have any specific reason why panic_print_sys_info()
> should get called right before kmsg_dump() when this code patch
> is used?
>
> Alternative solution would be to remove the check of
> kexec_crash_loaded() and always call panic_print_sys_info(false)
> at the beginning (after kgdb_panic(buf)).
>
> The advantage is that panic_print_sys_info(false) will be always
> called on the same location. It will give the same results
> in all code paths so that it will be easier to interpret them.
> And it will have the same problems so it should be easier
> to debug and maintain.
>
> It is possible that it will not work for some users. Also it is
> possible that it might cause some problems. But it is hard to
> guess at least for me.
>
> I think that we might try it and see if anyone complains.
> Honestly, I think that only few people use panic_printk_sys_info().
> And your use-case makes sense.
>
> Best Regards,
> Petr
Hi Petr, thanks for your idea - it's simple and effective, would resolve
the problems in a straightforward way. But there are some cons, let me
detail more.
Currently (in linux-next), if users set panic_print but not kdump, the
panic_print_sys_info() is called *after* the panic notifiers. If user
has kdump configured and still sets panic_print (which is our use case),
then panic_print_sys_info() is called _before_ the panic notifiers. In
other words, the behavior for non-kdump users is the same as before,
since the merge of panic_print functionality.
Your idea brings the printing earlier, always before the panic
notifiers. This isn't terrible, but might lead to unfortunate and
hard-to-debug problems; for example, some panic notifiers are
rcu_panic() and hung_task_panic(), both are simple functions to disable
RCU warnings and hung task detector in panic scenarios. If we go with
your idea, these won't get called before panic_print_sys_info(), even if
kdump is not set. So, since the panic_print triggers a lot of printing
in the console, we could face a stall and trigger RCU messages, maybe
intermixed with the panic_print information.
Again, this is not the end of the world, we could live with that. But
keeping two calls - and guarding against double print using
kexec_crash_loaded() as per your idea - is completely non-invasive and
doesn't change the current behavior.
So, let me know if we should go with your idea or keep both calls but
guarding against double printing to keep the current behavior untouched.
Both ways work for me, I'm slightly inclined to the latter, but I can
rework the patch if you prefer and use your idea!
Cheers,
Guilherme
On Thu 2022-01-13 12:15:08, Guilherme G. Piccoli wrote:
> On 13/01/2022 11:22, Petr Mladek wrote:
> > [...]
> > OK, do we have any specific reason why panic_print_sys_info()
> > should get called right before kmsg_dump() when this code patch
> > is used?
> >
> > Alternative solution would be to remove the check of
> > kexec_crash_loaded() and always call panic_print_sys_info(false)
> > at the beginning (after kgdb_panic(buf)).
> >
> > The advantage is that panic_print_sys_info(false) will be always
> > called on the same location. It will give the same results
> > in all code paths so that it will be easier to interpret them.
> > And it will have the same problems so it should be easier
> > to debug and maintain.
> >
> > It is possible that it will not work for some users. Also it is
> > possible that it might cause some problems. But it is hard to
> > guess at least for me.
> >
> > I think that we might try it and see if anyone complains.
> > Honestly, I think that only few people use panic_printk_sys_info().
> > And your use-case makes sense.
> >
> > Best Regards,
> > Petr
>
> Hi Petr, thanks for your idea - it's simple and effective, would resolve
> the problems in a straightforward way. But there are some cons, let me
> detail more.
>
> Currently (in linux-next), if users set panic_print but not kdump, the
> panic_print_sys_info() is called *after* the panic notifiers. If user
> has kdump configured and still sets panic_print (which is our use case),
> then panic_print_sys_info() is called _before_ the panic notifiers. In
> other words, the behavior for non-kdump users is the same as before,
> since the merge of panic_print functionality.
>
> Your idea brings the printing earlier, always before the panic
> notifiers. This isn't terrible, but might lead to unfortunate and
> hard-to-debug problems; for example, some panic notifiers are
> rcu_panic() and hung_task_panic(), both are simple functions to disable
> RCU warnings and hung task detector in panic scenarios. If we go with
> your idea, these won't get called before panic_print_sys_info(), even if
> kdump is not set. So, since the panic_print triggers a lot of printing
> in the console, we could face a stall and trigger RCU messages, maybe
> intermixed with the panic_print information.
I see. OK, it makes sense to call it after the panic notifiers when
they are used. It would be nice to mention the above in the commit
message and explain why the 2nd call is there.
Just an idea. It might be better to move the 1st call below
if (!_crash_kexec_post_notifiers). It would make it more
clear that it is intended for this code path. I mean:
if (!_crash_kexec_post_notifiers) {
/* ... */
if (kexec_crash_loaded())
panic_print_sys_info(false);
__crash_kexec(NULL);
...
Best Regards,
Petr
On 14/01/2022 09:26, Petr Mladek wrote:
> [...]
> I see. OK, it makes sense to call it after the panic notifiers when
> they are used. It would be nice to mention the above in the commit
> message and explain why the 2nd call is there.
>
> Just an idea. It might be better to move the 1st call below
> if (!_crash_kexec_post_notifiers). It would make it more
> clear that it is intended for this code path. I mean:
>
> if (!_crash_kexec_post_notifiers) {
> /* ... */
> if (kexec_crash_loaded())
> panic_print_sys_info(false);
>
> __crash_kexec(NULL);
> ...
>
> Best Regards,
> Petr
Perfect Petr, thanks again for the very good ideas! I'll work on V3 and
submit today.
Cheers,
Guilherme