2022-01-08 15:35:33

by Guilherme G. Piccoli

[permalink] [raw]
Subject: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

The kernel notifier infrastructure allows function callbacks to be
added in multiple lists, which are then called in the proper time,
like in a reboot or panic event. The panic_notifier_list specifically
contains the callbacks that are executed during a panic event. As any
other notifier list, the panic one has no filtering and all functions
previously registered are executed.

The kdump infrastructure, on the other hand, enables users to set
a crash kernel that is kexec'ed in a panic event, and vmcore/logs
are collected in such crash kernel. When kdump is set, by default
the panic notifiers are ignored - the kexec jumps to the crash kernel
before the list is checked and callbacks executed.

There are some cases though in which kdump users might want to
allow panic notifier callbacks to execute _before_ the kexec to
the crash kernel, for a variety of reasons - for example, users
may think kexec is very prone to fail and want to give a chance
to kmsg dumpers to run (and save logs using pstore), or maybe
some panic notifier is required to properly quiesce some hardware
that must be used to the crash kernel. For these cases, we have
the kernel parameter "crash_kexec_post_notifiers".

But there's a problem: currently it's an "all-or-nothing" situation,
the kdump user choice is either to execute all panic notifiers or
none of them. Given that panic notifiers may increase the risk of a
kdump failure, this is a tough decision and may affect the debug of
hard to reproduce bugs, if for some reason the user choice is to
enable panic notifiers, but kdump then fails.

So, this patch aims to ease this decision: we hereby introduce a filter
for the panic notifier list, in which users may select specifically
which callbacks they wish to run, allowing a safer kdump. The allowlist
should be provided using the parameter "panic_notifier_filter=a,b,..."
where a, b are valid callback names. Invalid symbols are discarded.

Currently up to 16 symbols may be passed in this list, we consider
that this numbers allows enough flexibility (and no matter what
architecture is used, at most 30 panic callbacks are registered).
In an experiment using a qemu x86 virtual machine, by default only
six callbacks are registered in the panic notifier list.
Once a valid callback name is provided in the list, such function
is allowed to be registered/unregistered in the panic_notifier_list;
all other panic callbacks are ignored. Notice that this filter is
only for the panic notifiers and has no effect in the other notifiers.

Signed-off-by: Guilherme G. Piccoli <[email protected]>
---



V4:

* Add some more clean-up suggestion from Andy (thanks).


V3:

* Implemented Alan's suggestion (thanks!), simplifying the check code
in the notifiers register/unregister functions. Notice that the
suggestion was missing a negative in the check function, I even
renamed it now, to be more clear:
s/is_panic_notifier_filtered/should_register_panic_notifier

The condition is !(A && B && C), and C is the check function, that
returns true when a symbol *is found* in the notifier filter; hence
we need to invert that here, as you can see in the code.


* Implemented Andy's suggestion (thanks!), to reduce the code in
the parameter parsing loop. Notice that strsep() modifies the buffer,
so not sure if it was a typo but the correct code here is:

single_param = strsep(&full_param_buffer, ",");


* "Bumped" the log output of the parsing function: users should be
warned in the errors (invalid symbol or exceeded entries) and
informed (pr_info) in case the parsing succeeded - I think pr_debug
was useless there.

Cheers,

Guilherme



.../admin-guide/kernel-parameters.txt | 14 +++++-
include/linux/panic_notifier.h | 10 +++++
kernel/notifier.c | 44 +++++++++++++++++--
kernel/panic.c | 39 ++++++++++++++++
4 files changed, 102 insertions(+), 5 deletions(-)

diff --git a/Documentation/admin-guide/kernel-parameters.txt b/Documentation/admin-guide/kernel-parameters.txt
index 2fba82431efb..2dc4e98823ae 100644
--- a/Documentation/admin-guide/kernel-parameters.txt
+++ b/Documentation/admin-guide/kernel-parameters.txt
@@ -3727,13 +3727,25 @@
panic_on_warn panic() instead of WARN(). Useful to cause kdump
on a WARN().

+ panic_notifier_filter=[function-list]
+ Limit the functions registered by the panic notifier
+ infrastructure. This allowlist is composed by function
+ names, comma separated (invalid symbols are filtered
+ out). Such functionality is useful for kdump users
+ that set "crash_kexec_post_notifiers" in order to
+ execute panic notifiers, but at the same time wish to
+ have just a subset of notifiers, not all of them. The
+ list of functions is limited to 16 entries currently.
+
crash_kexec_post_notifiers
Run kdump after running panic-notifiers and dumping
kmsg. This only for the users who doubt kdump always
succeeds in any situation.
Note that this also increases risks of kdump failure,
because some panic notifiers can make the crashed
- kernel more unstable.
+ kernel more unstable. See the "panic_notifier_filter"
+ parameter to have more control of which notifiers to
+ execute.

parkbd.port= [HW] Parallel port number the keyboard adapter is
connected to, default is 0.
diff --git a/include/linux/panic_notifier.h b/include/linux/panic_notifier.h
index 41e32483d7a7..9a96753e96d8 100644
--- a/include/linux/panic_notifier.h
+++ b/include/linux/panic_notifier.h
@@ -5,6 +5,16 @@
#include <linux/notifier.h>
#include <linux/types.h>

+/*
+ * The panic notifier filter infrastructure - each array element holds a
+ * function address, to be checked against panic_notifier register/unregister
+ * operations; these functions are allowed to be registered in the panic
+ * notifier list. This setting is useful for kdump, since users may want
+ * some panic notifiers to execute, but not all of them.
+ */
+extern unsigned long panic_nf_functions[];
+extern int panic_nf_count;
+
extern struct atomic_notifier_head panic_notifier_list;

extern bool crash_kexec_post_notifiers;
diff --git a/kernel/notifier.c b/kernel/notifier.c
index b8251dc0bc0f..4fc450bbf677 100644
--- a/kernel/notifier.c
+++ b/kernel/notifier.c
@@ -1,4 +1,5 @@
// SPDX-License-Identifier: GPL-2.0-only
+#include <linux/panic_notifier.h>
#include <linux/kdebug.h>
#include <linux/kprobes.h>
#include <linux/export.h>
@@ -127,12 +128,34 @@ static int notifier_call_chain_robust(struct notifier_block **nl,
* use a spinlock, and call_chain is synchronized by RCU (no locks).
*/

+/*
+ * The following helper is part of the panic notifier filter infrastructure;
+ * users can filter what functions they wish to allow being registered in the
+ * notifier system, restricted to the panic notifier. This is useful for kdump
+ * for example, when some notifiers are relevant but running all of them imposes
+ * risks to the kdump kernel reliability.
+ */
+static bool should_register_panic_notifier(struct notifier_block *n)
+{
+ int i;
+
+ for (i = 0; i < panic_nf_count; i++) {
+ if ((unsigned long)(n->notifier_call) == panic_nf_functions[i])
+ return true;
+ }
+
+ return false;
+}
+
/**
* atomic_notifier_chain_register - Add notifier to an atomic notifier chain
* @nh: Pointer to head of the atomic notifier chain
* @n: New entry in notifier chain
*
* Adds a notifier to an atomic notifier chain.
+ * If "panic_notifier_filter" is provided, we hereby filter the
+ * panic_notifier_list and only allow registering the functions
+ * that are present in the filter.
*
* Currently always returns zero.
*/
@@ -140,10 +163,15 @@ int atomic_notifier_chain_register(struct atomic_notifier_head *nh,
struct notifier_block *n)
{
unsigned long flags;
- int ret;
+ int ret = 0;

spin_lock_irqsave(&nh->lock, flags);
- ret = notifier_chain_register(&nh->head, n);
+
+ if (!(nh == &panic_notifier_list &&
+ (panic_nf_count > 0) &&
+ !should_register_panic_notifier(n)))
+ ret = notifier_chain_register(&nh->head, n);
+
spin_unlock_irqrestore(&nh->lock, flags);
return ret;
}
@@ -155,6 +183,9 @@ EXPORT_SYMBOL_GPL(atomic_notifier_chain_register);
* @n: Entry to remove from notifier chain
*
* Removes a notifier from an atomic notifier chain.
+ * If "panic_notifier_filter" is provided, we hereby filter the
+ * panic_notifier_list and only allow unregistering the functions
+ * that are present in the filter.
*
* Returns zero on success or %-ENOENT on failure.
*/
@@ -162,10 +193,15 @@ int atomic_notifier_chain_unregister(struct atomic_notifier_head *nh,
struct notifier_block *n)
{
unsigned long flags;
- int ret;
+ int ret = 0;

spin_lock_irqsave(&nh->lock, flags);
- ret = notifier_chain_unregister(&nh->head, n);
+
+ if (!(nh == &panic_notifier_list &&
+ (panic_nf_count > 0) &&
+ !should_register_panic_notifier(n)))
+ ret = notifier_chain_unregister(&nh->head, n);
+
spin_unlock_irqrestore(&nh->lock, flags);
synchronize_rcu();
return ret;
diff --git a/kernel/panic.c b/kernel/panic.c
index cefd7d82366f..a06fcc4b1d6e 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -31,6 +31,7 @@
#include <linux/console.h>
#include <linux/bug.h>
#include <linux/ratelimit.h>
+#include <linux/kallsyms.h>
#include <linux/debugfs.h>
#include <asm/sections.h>

@@ -67,6 +68,16 @@ EXPORT_SYMBOL_GPL(panic_timeout);
#define PANIC_PRINT_ALL_PRINTK_MSG 0x00000020
unsigned long panic_print;

+/*
+ * Kernel has currently < 30 panic handlers no matter the arch,
+ * based on some code counting; so 16 items seems a good amount;
+ * users that are filtering panic notifiers shouldn't add all
+ * of them in theory, that doesn't make sense...
+ */
+#define PANIC_NF_MAX 16
+unsigned long panic_nf_functions[PANIC_NF_MAX];
+int panic_nf_count;
+
ATOMIC_NOTIFIER_HEAD(panic_notifier_list);

EXPORT_SYMBOL(panic_notifier_list);
@@ -146,6 +157,34 @@ void nmi_panic(struct pt_regs *regs, const char *msg)
}
EXPORT_SYMBOL(nmi_panic);

+static int __init panic_notifier_filter_setup(char *buf)
+{
+ char *func;
+ unsigned long addr;
+
+ while ((func = strsep(&buf, ","))) {
+ addr = kallsyms_lookup_name(func);
+ if (!addr) {
+ pr_warn("panic_notifier_filter: invalid symbol %s\n", func);
+ continue;
+ }
+
+ if (panic_nf_count < PANIC_NF_MAX) {
+ panic_nf_functions[panic_nf_count] = addr;
+ panic_nf_count++;
+ pr_info("panic_notifier_filter: added symbol %s\n", func);
+ } else {
+ pr_warn("panic_notifier_filter: exceeded maximum notifiers (%d), aborting\n",
+ PANIC_NF_MAX);
+ panic_nf_count = 0;
+ break;
+ }
+ }
+
+ return 0;
+}
+early_param("panic_notifier_filter", panic_notifier_filter_setup);
+
static void panic_print_sys_info(void)
{
if (panic_print & PANIC_PRINT_ALL_PRINTK_MSG)
--
2.34.1



2022-01-14 23:04:01

by Guilherme G. Piccoli

[permalink] [raw]
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

Hey folks, sorry for the ping.
But is there any extra reviews? All comments are much appreciated!

Dave, what do you think about the patch? I remember we talked about it
in [0], seems you considered that a good idea right?

Thanks in advance,


Guilherme


[0]
https://lore.kernel.org/lkml/[email protected]/

2022-01-17 07:08:35

by Baoquan He

[permalink] [raw]
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

On 01/08/22 at 12:34pm, Guilherme G. Piccoli wrote:
......
> So, this patch aims to ease this decision: we hereby introduce a filter
> for the panic notifier list, in which users may select specifically
> which callbacks they wish to run, allowing a safer kdump. The allowlist
> should be provided using the parameter "panic_notifier_filter=a,b,..."
> where a, b are valid callback names. Invalid symbols are discarded.
>
> Currently up to 16 symbols may be passed in this list, we consider
> that this numbers allows enough flexibility (and no matter what
> architecture is used, at most 30 panic callbacks are registered).
> In an experiment using a qemu x86 virtual machine, by default only
> six callbacks are registered in the panic notifier list.
> Once a valid callback name is provided in the list, such function
> is allowed to be registered/unregistered in the panic_notifier_list;
> all other panic callbacks are ignored. Notice that this filter is
> only for the panic notifiers and has no effect in the other notifiers.
>
> Signed-off-by: Guilherme G. Piccoli <[email protected]>

This patch looks good to me, thx.

Acked-by: Baoquan He <[email protected]>

2022-01-18 02:26:04

by Guilherme G. Piccoli

[permalink] [raw]
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

On 16/01/2022 10:11, Baoquan He wrote:
> [...]
> This patch looks good to me, thx.
>
> Acked-by: Baoquan He <[email protected]>
>

Thanks a lot Baoquan He !

2022-01-20 10:41:18

by Guilherme G. Piccoli

[permalink] [raw]
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

On 17/01/2022 23:31, Dave Young wrote:
> [...]
> Hi Guilherme,  thank you for making it a formal patch!  Yes, it is a
> nice improvement.
> I'm sorry I did not get time to review the code,  I will leave it to
> other people to review :)

Hi Dave, no worries - thanks for your opinion, much appreciated!
Cheers,


Guilherme

2022-01-21 22:19:48

by Petr Mladek

[permalink] [raw]
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

Adding some more people into Cc. Some modified the logic in the past.
Some are familiar with some interesting areas where the panic
notfiers are used.

On Sat 2022-01-08 12:34:51, Guilherme G. Piccoli wrote:
> The kernel notifier infrastructure allows function callbacks to be
> added in multiple lists, which are then called in the proper time,
> like in a reboot or panic event. The panic_notifier_list specifically
> contains the callbacks that are executed during a panic event. As any
> other notifier list, the panic one has no filtering and all functions
> previously registered are executed.
>
> The kdump infrastructure, on the other hand, enables users to set
> a crash kernel that is kexec'ed in a panic event, and vmcore/logs
> are collected in such crash kernel. When kdump is set, by default
> the panic notifiers are ignored - the kexec jumps to the crash kernel
> before the list is checked and callbacks executed.
>
> There are some cases though in which kdump users might want to
> allow panic notifier callbacks to execute _before_ the kexec to
> the crash kernel, for a variety of reasons - for example, users
> may think kexec is very prone to fail and want to give a chance
> to kmsg dumpers to run (and save logs using pstore),

Yes, this seems to be original intention for the
"crash_kexec_post_notifiers" option, see the commit
f06e5153f4ae2e2f3b0300f ("kernel/panic.c: add
"crash_kexec_post_notifiers" option for kdump after panic_notifiers")


> some panic notifier is required to properly quiesce some hardware
> that must be used to the crash kernel.

Do you have any example, please? The above mentioned commit
says "crash_kexec_post_notifiers" actually increases risk
of kdump failure.

Note that kmsg_dump() is called after the notifiers only because
some are printing more information, see the commit
6723734cdff15211bb78a ("panic: call panic handlers before kmsg_dump").
They might still increase the change that kmsg_dump() will never
be called.


> But there's a problem: currently it's an "all-or-nothing" situation,
> the kdump user choice is either to execute all panic notifiers or
> none of them. Given that panic notifiers may increase the risk of a
> kdump failure, this is a tough decision and may affect the debug of
> hard to reproduce bugs, if for some reason the user choice is to
> enable panic notifiers, but kdump then fails.
>
> So, this patch aims to ease this decision: we hereby introduce a filter
> for the panic notifier list, in which users may select specifically
> which callbacks they wish to run, allowing a safer kdump. The allowlist
> should be provided using the parameter "panic_notifier_filter=a,b,..."
> where a, b are valid callback names. Invalid symbols are discarded.

I am afraid that this is almost unusable solution:

+ requires deep knowledge of what each notifier does
+ might need debugging what notifier causes problems
+ the list might need to be updated when new notifiers are added
+ function names are implementation detail and might change
+ requires kallsyms


It is only workaround for a real problem. The problem is that
"panic_notifier_list" is used for many purposes that break
each other.

I checked some notifiers and found few groups:

+ disable watchdogs:
+ hung_task_panic()
+ rcu_panic()

+ dump information:
+ kernel_offset_notifier()
+ trace_panic_handler() (duplicate of panic_print=0x10)

+ inform hypervisor
+ xen_panic_event()
+ pvpanic_panic_notify()
+ hyperv_panic_event()

+ misc cleanup / flush / blinking
+ panic_event() in ipmi_msghandler.c
+ panic_happened() in heartbeat.c
+ led_trigger_panic_notifier()


IMHO, the right solution is to split the callbacks into 2 or more
notifier list. Then we might rework panic() to do:

void panic(void)
{
[...]

/* stop watchdogs + extra info */
atomic_notifier_call_chain(&panic_disable_watchdogs_notifier_list, 0, buf);
atomic_notifier_call_chain(&panic_info_notifier_list, 0, buf);
panic_print_sys_info();

/* crash_kexec + kmsg_dump in configurable order */
if (!_crash_kexec_post_kmsg_dump) {
__crash_kexec(NULL);
smp_send_stop();
} else {
crash_smp_send_stop();
}

kmsg_dump();
if (_crash_kexec_post_kmsg_dump)
__crash_kexec(NULL);

/* infinite loop or reboot */
atomic_notifier_call_chain(&panic_hypervisor_notifier_list, 0, buf);
atomic_notifier_call_chain(&panic_rest_notifier_list, 0, buf);

console_flush_on_panic(CONSOLE_FLUSH_PENDING);

if (panic_timeout >= 0) {
timeout();
emergency_restart();
}

for (i = 0; ; i += PANIC_TIMER_STEP) {
if (i >= i_next) {
i += panic_blink(state ^= 1);
i_next = i + 3600 / PANIC_BLINK_SPD;
}
mdelay(PANIC_TIMER_STEP);
}
}

Two notifier lists might be enough in the above scenario. I would call
them:

panic_pre_dump_notifier_list
panic_post_dump_notifier_list


It is a real solution that will help everyone. It is more complicated now
but it will makes things much easier in the long term. And it might be done
step by step:

1. introduce the two notifier lists
2. convert all users: one by one
3. remove the original notifier list when there is no user

Best Regards,
Petr

2022-01-22 08:07:39

by Guilherme G. Piccoli

[permalink] [raw]
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

Hi Petr, thanks for the great response, and for CCing more (potentially)
interested parties! Some comments inline below; also, I'm CCing Michael
Kelley as well.


On 20/01/2022 12:14, Petr Mladek wrote:
> Adding some more people into Cc. Some modified the logic in the past.
> Some are familiar with some interesting areas where the panic
> notfiers are used.
>
> On Sat 2022-01-08 12:34:51, Guilherme G. Piccoli wrote:
>> [...]
>> There are some cases though in which kdump users might want to
>> allow panic notifier callbacks to execute _before_ the kexec to
>> the crash kernel, for a variety of reasons - for example, users
>> may think kexec is very prone to fail and want to give a chance
>> to kmsg dumpers to run (and save logs using pstore),
>
> Yes, this seems to be original intention for the
> "crash_kexec_post_notifiers" option, see the commit
> f06e5153f4ae2e2f3b0300f ("kernel/panic.c: add
> "crash_kexec_post_notifiers" option for kdump after panic_notifiers")
>
>> some panic notifier is required to properly quiesce some hardware
>> that must be used to the crash kernel.
>
> Do you have any example, please? The above mentioned commit
> says "crash_kexec_post_notifiers" actually increases risk
> of kdump failure.
>
> Note that kmsg_dump() is called after the notifiers only because
> some are printing more information, see the commit
> 6723734cdff15211bb78a ("panic: call panic handlers before kmsg_dump").
> They might still increase the change that kmsg_dump() will never
> be called.
>

Sure! I guess Michael Kelley's response here is the perfect example:
https://lore.kernel.org/lkml/MWHPR21MB1593A32A3433F5F262796FCFD75B9@MWHPR21MB1593.namprd21.prod.outlook.com/

In my understanding, he is referring the function hyperv_panic_event().
But I also found another 2 examples in a quick look: bcm_vk_on_panic()
and brcmstb_pm_panic_notify().


>> [...]
>> So, this patch aims to ease this decision: we hereby introduce a filter
>> for the panic notifier list, in which users may select specifically
>> which callbacks they wish to run, allowing a safer kdump. The allowlist
>> should be provided using the parameter "panic_notifier_filter=a,b,..."
>> where a, b are valid callback names. Invalid symbols are discarded.
>
> I am afraid that this is almost unusable solution:
>
> + requires deep knowledge of what each notifier does
> + might need debugging what notifier causes problems
> + the list might need to be updated when new notifiers are added
> + function names are implementation detail and might change
> + requires kallsyms
>
>
> It is only workaround for a real problem. The problem is that
> "panic_notifier_list" is used for many purposes that break
> each other.
>
> I checked some notifiers and found few groups:
>
> + disable watchdogs:
> + hung_task_panic()
> + rcu_panic()
>
> + dump information:
> + kernel_offset_notifier()
> + trace_panic_handler() (duplicate of panic_print=0x10)
>
> + inform hypervisor
> + xen_panic_event()
> + pvpanic_panic_notify()
> + hyperv_panic_event()
>
> + misc cleanup / flush / blinking
> + panic_event() in ipmi_msghandler.c
> + panic_happened() in heartbeat.c
> + led_trigger_panic_notifier()
>
>
> IMHO, the right solution is to split the callbacks into 2 or more
> notifier list. Then we might rework panic() to do:
>
> void panic(void)
> {
> [...]
>
> /* stop watchdogs + extra info */
> atomic_notifier_call_chain(&panic_disable_watchdogs_notifier_list, 0, buf);
> atomic_notifier_call_chain(&panic_info_notifier_list, 0, buf);
> panic_print_sys_info();
>
> /* crash_kexec + kmsg_dump in configurable order */
> if (!_crash_kexec_post_kmsg_dump) {
> __crash_kexec(NULL);
> smp_send_stop();
> } else {
> crash_smp_send_stop();
> }
>
> kmsg_dump();
> if (_crash_kexec_post_kmsg_dump)
> __crash_kexec(NULL);
>
> /* infinite loop or reboot */
> atomic_notifier_call_chain(&panic_hypervisor_notifier_list, 0, buf);
> atomic_notifier_call_chain(&panic_rest_notifier_list, 0, buf);
>
> console_flush_on_panic(CONSOLE_FLUSH_PENDING);
> [...]
> Two notifier lists might be enough in the above scenario. I would call
> them:
>
> panic_pre_dump_notifier_list
> panic_post_dump_notifier_list
>
>
> It is a real solution that will help everyone. It is more complicated now
> but it will makes things much easier in the long term. And it might be done
> step by step:
>
> 1. introduce the two notifier lists
> 2. convert all users: one by one
> 3. remove the original notifier list when there is no user

That's a great idea! I'm into it, if we have a consensus. The thing that
scares me most here is that this is a big change and consumes time to
implement - I'd not risk such time if somebody is really against that.
So, let's see more opinions, maybe the kdump maintainers have good input.

Also, I'd be interested in still keeping a filter for the pre_dump list,
could be a blacklist by function name for example; since the post_dump
is conditioned to "crash_kexec_post_notifiers" and most of information
output will be in the first notifier, I don't see a strong reason
anymore for filtering the second notifier.

Cheers,


Guilherme

2022-01-23 15:05:00

by Baoquan He

[permalink] [raw]
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

On 01/21/22 at 05:31pm, Guilherme G. Piccoli wrote:
......
> > IMHO, the right solution is to split the callbacks into 2 or more
> > notifier list. Then we might rework panic() to do:
> >
> > void panic(void)
> > {
> > [...]
> >
> > /* stop watchdogs + extra info */
> > atomic_notifier_call_chain(&panic_disable_watchdogs_notifier_list, 0, buf);
> > atomic_notifier_call_chain(&panic_info_notifier_list, 0, buf);
> > panic_print_sys_info();
> >
> > /* crash_kexec + kmsg_dump in configurable order */
> > if (!_crash_kexec_post_kmsg_dump) {
> > __crash_kexec(NULL);
> > smp_send_stop();
> > } else {
> > crash_smp_send_stop();
> > }
> >
> > kmsg_dump();
> > if (_crash_kexec_post_kmsg_dump)
> > __crash_kexec(NULL);
> >
> > /* infinite loop or reboot */
> > atomic_notifier_call_chain(&panic_hypervisor_notifier_list, 0, buf);
> > atomic_notifier_call_chain(&panic_rest_notifier_list, 0, buf);
> >
> > console_flush_on_panic(CONSOLE_FLUSH_PENDING);
> > [...]
> > Two notifier lists might be enough in the above scenario. I would call
> > them:
> >
> > panic_pre_dump_notifier_list
> > panic_post_dump_notifier_list
> >
> >
> > It is a real solution that will help everyone. It is more complicated now
> > but it will makes things much easier in the long term. And it might be done
> > step by step:
> >
> > 1. introduce the two notifier lists
> > 2. convert all users: one by one
> > 3. remove the original notifier list when there is no user
>
> That's a great idea! I'm into it, if we have a consensus. The thing that
> scares me most here is that this is a big change and consumes time to
> implement - I'd not risk such time if somebody is really against that.
> So, let's see more opinions, maybe the kdump maintainers have good input.

I am fine with it. As long as thing is made clear, glad to see code is
refactored to be more understandable and improved. Earlier, during several
rounds of discussion between you and Petr, seveal pitfalls have been
pointed out and avoided.

Meanwhile, I would suggest Masa and HATAYAMA to help give input about
panic_notifier usage and refactory. AFAIK, they contributed code and use
panic_notifier in their product or environment a lot, that will be very
helpful to get the first hand information from them.

Hi Masa, HATAYANA,

Any comment on this? (Please ignore this if it's not in your care.)

2022-01-24 08:44:07

by Masami Hiramatsu

[permalink] [raw]
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

On Sat, 22 Jan 2022 18:55:14 +0800
Baoquan He <[email protected]> wrote:

> On 01/21/22 at 05:31pm, Guilherme G. Piccoli wrote:
> ......
> > > IMHO, the right solution is to split the callbacks into 2 or more
> > > notifier list. Then we might rework panic() to do:
> > >
> > > void panic(void)
> > > {
> > > [...]
> > >
> > > /* stop watchdogs + extra info */
> > > atomic_notifier_call_chain(&panic_disable_watchdogs_notifier_list, 0, buf);
> > > atomic_notifier_call_chain(&panic_info_notifier_list, 0, buf);
> > > panic_print_sys_info();
> > >
> > > /* crash_kexec + kmsg_dump in configurable order */
> > > if (!_crash_kexec_post_kmsg_dump) {
> > > __crash_kexec(NULL);
> > > smp_send_stop();
> > > } else {
> > > crash_smp_send_stop();
> > > }
> > >
> > > kmsg_dump();
> > > if (_crash_kexec_post_kmsg_dump)
> > > __crash_kexec(NULL);
> > >
> > > /* infinite loop or reboot */
> > > atomic_notifier_call_chain(&panic_hypervisor_notifier_list, 0, buf);
> > > atomic_notifier_call_chain(&panic_rest_notifier_list, 0, buf);
> > >
> > > console_flush_on_panic(CONSOLE_FLUSH_PENDING);
> > > [...]
> > > Two notifier lists might be enough in the above scenario. I would call
> > > them:
> > >
> > > panic_pre_dump_notifier_list
> > > panic_post_dump_notifier_list
> > >
> > >
> > > It is a real solution that will help everyone. It is more complicated now
> > > but it will makes things much easier in the long term. And it might be done
> > > step by step:
> > >
> > > 1. introduce the two notifier lists
> > > 2. convert all users: one by one
> > > 3. remove the original notifier list when there is no user
> >
> > That's a great idea! I'm into it, if we have a consensus. The thing that
> > scares me most here is that this is a big change and consumes time to
> > implement - I'd not risk such time if somebody is really against that.
> > So, let's see more opinions, maybe the kdump maintainers have good input.
>
> I am fine with it. As long as thing is made clear, glad to see code is
> refactored to be more understandable and improved. Earlier, during several
> rounds of discussion between you and Petr, seveal pitfalls have been
> pointed out and avoided.
>
> Meanwhile, I would suggest Masa and HATAYAMA to help give input about
> panic_notifier usage and refactory. AFAIK, they contributed code and use
> panic_notifier in their product or environment a lot, that will be very
> helpful to get the first hand information from them.
>
> Hi Masa, HATAYANA,
>
> Any comment on this? (Please ignore this if it's not in your care.)

No, that looks good idea to me. BTW, the 'dump' in the new notifieers
means both kmsg_dump and crash dump, right?

Thank you,

--
Masami Hiramatsu <[email protected]>

2022-01-24 19:08:01

by [email protected]

[permalink] [raw]
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter


> On 01/21/22 at 05:31pm, Guilherme G. Piccoli wrote:
> ......
> > > IMHO, the right solution is to split the callbacks into 2 or more
> > > notifier list. Then we might rework panic() to do:
> > >
> > > void panic(void)
> > > {
> > > [...]
> > >
> > > /* stop watchdogs + extra info */
> > > atomic_notifier_call_chain(&panic_disable_watchdogs_notifier_list, 0, buf);
> > > atomic_notifier_call_chain(&panic_info_notifier_list, 0, buf);
> > > panic_print_sys_info();
> > >
> > > /* crash_kexec + kmsg_dump in configurable order */
> > > if (!_crash_kexec_post_kmsg_dump) {
> > > __crash_kexec(NULL);
> > > smp_send_stop();
> > > } else {
> > > crash_smp_send_stop();
> > > }
> > >
> > > kmsg_dump();
> > > if (_crash_kexec_post_kmsg_dump)
> > > __crash_kexec(NULL);
> > >
> > > /* infinite loop or reboot */
> > > atomic_notifier_call_chain(&panic_hypervisor_notifier_list, 0, buf);
> > > atomic_notifier_call_chain(&panic_rest_notifier_list, 0, buf);
> > >
> > > console_flush_on_panic(CONSOLE_FLUSH_PENDING);
> > > [...]
> > > Two notifier lists might be enough in the above scenario. I would call
> > > them:
> > >
> > > panic_pre_dump_notifier_list
> > > panic_post_dump_notifier_list
> > >
> > >
> > > It is a real solution that will help everyone. It is more complicated now
> > > but it will makes things much easier in the long term. And it might be done
> > > step by step:
> > >
> > > 1. introduce the two notifier lists
> > > 2. convert all users: one by one
> > > 3. remove the original notifier list when there is no user
> >
> > That's a great idea! I'm into it, if we have a consensus. The thing that
> > scares me most here is that this is a big change and consumes time to
> > implement - I'd not risk such time if somebody is really against that.
> > So, let's see more opinions, maybe the kdump maintainers have good input.
>
> I am fine with it. As long as thing is made clear, glad to see code is
> refactored to be more understandable and improved. Earlier, during several
> rounds of discussion between you and Petr, seveal pitfalls have been
> pointed out and avoided.
>
> Meanwhile, I would suggest Masa and HATAYAMA to help give input about
> panic_notifier usage and refactory. AFAIK, they contributed code and use
> panic_notifier in their product or environment a lot, that will be very
> helpful to get the first hand information from them.
>
> Hi Masa, HATAYANA,
>
> Any comment on this? (Please ignore this if it's not in your care.)
>

Thanks for CCing to me. I like this patch set. I have same motivation.

For example, when I used crash_kexec_post_notifiers, I sometimes ran into
deadlock in printk's exclusion logic during the call of panic notifiers since
kaslr outputs kernel offset at panic by dump_kernel_offset() via panic notifers
(although this might never happen now thanks to lockless implementation).

The problem is that in the current design, we have to run all the
tasks registered, although most of them are actually unnecessary for
other users' requirements. Each user wants to call only their own handlers
in order to keep kdump as reliable as possible.

I've just started reading this patch set and have no other comments for now.

Thanks.
HATAYAMA, Daisuke

2022-01-24 19:22:25

by Baoquan He

[permalink] [raw]
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

On 01/24/22 at 11:43am, [email protected] wrote:
>
> > On 01/21/22 at 05:31pm, Guilherme G. Piccoli wrote:
> > ......
> > > > IMHO, the right solution is to split the callbacks into 2 or more
> > > > notifier list. Then we might rework panic() to do:
> > > >
> > > > void panic(void)
> > > > {
> > > > [...]
> > > >
> > > > /* stop watchdogs + extra info */
> > > > atomic_notifier_call_chain(&panic_disable_watchdogs_notifier_list, 0, buf);
> > > > atomic_notifier_call_chain(&panic_info_notifier_list, 0, buf);
> > > > panic_print_sys_info();
> > > >
> > > > /* crash_kexec + kmsg_dump in configurable order */
> > > > if (!_crash_kexec_post_kmsg_dump) {
> > > > __crash_kexec(NULL);
> > > > smp_send_stop();
> > > > } else {
> > > > crash_smp_send_stop();
> > > > }
> > > >
> > > > kmsg_dump();
> > > > if (_crash_kexec_post_kmsg_dump)
> > > > __crash_kexec(NULL);
> > > >
> > > > /* infinite loop or reboot */
> > > > atomic_notifier_call_chain(&panic_hypervisor_notifier_list, 0, buf);
> > > > atomic_notifier_call_chain(&panic_rest_notifier_list, 0, buf);
> > > >
> > > > console_flush_on_panic(CONSOLE_FLUSH_PENDING);
> > > > [...]
> > > > Two notifier lists might be enough in the above scenario. I would call
> > > > them:
> > > >
> > > > panic_pre_dump_notifier_list
> > > > panic_post_dump_notifier_list
> > > >
> > > >
> > > > It is a real solution that will help everyone. It is more complicated now
> > > > but it will makes things much easier in the long term. And it might be done
> > > > step by step:
> > > >
> > > > 1. introduce the two notifier lists
> > > > 2. convert all users: one by one
> > > > 3. remove the original notifier list when there is no user
> > >
> > > That's a great idea! I'm into it, if we have a consensus. The thing that
> > > scares me most here is that this is a big change and consumes time to
> > > implement - I'd not risk such time if somebody is really against that.
> > > So, let's see more opinions, maybe the kdump maintainers have good input.
> >
> > I am fine with it. As long as thing is made clear, glad to see code is
> > refactored to be more understandable and improved. Earlier, during several
> > rounds of discussion between you and Petr, seveal pitfalls have been
> > pointed out and avoided.
> >
> > Meanwhile, I would suggest Masa and HATAYAMA to help give input about
> > panic_notifier usage and refactory. AFAIK, they contributed code and use
> > panic_notifier in their product or environment a lot, that will be very
> > helpful to get the first hand information from them.
> >
> > Hi Masa, HATAYANA,
> >
> > Any comment on this? (Please ignore this if it's not in your care.)
> >
>
> Thanks for CCing to me. I like this patch set. I have same motivation.
>
> For example, when I used crash_kexec_post_notifiers, I sometimes ran into
> deadlock in printk's exclusion logic during the call of panic notifiers since
> kaslr outputs kernel offset at panic by dump_kernel_offset() via panic notifers
> (although this might never happen now thanks to lockless implementation).
>
> The problem is that in the current design, we have to run all the
> tasks registered, although most of them are actually unnecessary for
> other users' requirements. Each user wants to call only their own handlers
> in order to keep kdump as reliable as possible.

If I unerstand you correclty, you are expressing your favour to the
panic_notifier filter this patch adds. I personally like it very much
either because I know users only expect to run those one or several
handlers they added or cared about, from discussing reported cases
realted to them, just as you said.

Now comments to patch lean to split and classify the current panic
notifiers list into two or several sub-lists and execute them in
different order. I think this improvement will benefit people who
defaults to execute all panic notifiers, while the panic notifier
fileter is also very helpful if can be added.

Hope I got you right, HATAYAMA. And thanks a lot for your quick
response.

2022-01-24 19:22:43

by Baoquan He

[permalink] [raw]
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

On 01/23/22 at 10:07pm, Masami Hiramatsu wrote:
> On Sat, 22 Jan 2022 18:55:14 +0800
> Baoquan He <[email protected]> wrote:
>
> > On 01/21/22 at 05:31pm, Guilherme G. Piccoli wrote:
> > ......
> > > > IMHO, the right solution is to split the callbacks into 2 or more
> > > > notifier list. Then we might rework panic() to do:
> > > >
> > > > void panic(void)
> > > > {
> > > > [...]
> > > >
> > > > /* stop watchdogs + extra info */
> > > > atomic_notifier_call_chain(&panic_disable_watchdogs_notifier_list, 0, buf);
> > > > atomic_notifier_call_chain(&panic_info_notifier_list, 0, buf);
> > > > panic_print_sys_info();
> > > >
> > > > /* crash_kexec + kmsg_dump in configurable order */
> > > > if (!_crash_kexec_post_kmsg_dump) {
> > > > __crash_kexec(NULL);
> > > > smp_send_stop();
> > > > } else {
> > > > crash_smp_send_stop();
> > > > }
> > > >
> > > > kmsg_dump();
> > > > if (_crash_kexec_post_kmsg_dump)
> > > > __crash_kexec(NULL);
> > > >
> > > > /* infinite loop or reboot */
> > > > atomic_notifier_call_chain(&panic_hypervisor_notifier_list, 0, buf);
> > > > atomic_notifier_call_chain(&panic_rest_notifier_list, 0, buf);
> > > >
> > > > console_flush_on_panic(CONSOLE_FLUSH_PENDING);
> > > > [...]
> > > > Two notifier lists might be enough in the above scenario. I would call
> > > > them:
> > > >
> > > > panic_pre_dump_notifier_list
> > > > panic_post_dump_notifier_list
> > > >
> > > >
> > > > It is a real solution that will help everyone. It is more complicated now
> > > > but it will makes things much easier in the long term. And it might be done
> > > > step by step:
> > > >
> > > > 1. introduce the two notifier lists
> > > > 2. convert all users: one by one
> > > > 3. remove the original notifier list when there is no user
> > >
> > > That's a great idea! I'm into it, if we have a consensus. The thing that
> > > scares me most here is that this is a big change and consumes time to
> > > implement - I'd not risk such time if somebody is really against that.
> > > So, let's see more opinions, maybe the kdump maintainers have good input.
> >
> > I am fine with it. As long as thing is made clear, glad to see code is
> > refactored to be more understandable and improved. Earlier, during several
> > rounds of discussion between you and Petr, seveal pitfalls have been
> > pointed out and avoided.
> >
> > Meanwhile, I would suggest Masa and HATAYAMA to help give input about
> > panic_notifier usage and refactory. AFAIK, they contributed code and use
> > panic_notifier in their product or environment a lot, that will be very
> > helpful to get the first hand information from them.
> >
> > Hi Masa, HATAYANA,
> >
> > Any comment on this? (Please ignore this if it's not in your care.)
>
> No, that looks good idea to me. BTW, the 'dump' in the new notifieers
> means both kmsg_dump and crash dump, right?

Thanks for quick response, Masa.

I guess it's crash dump, namely kdump.

About pre_dump, if the dump is crash dump, hope those pre_dump notifiers
will be executed under conditional check, e.g only if 'crash_kexec_post_notifiers'
is specified in kernel cmdline.

2022-01-24 19:24:00

by Guilherme G. Piccoli

[permalink] [raw]
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

On 24/01/2022 10:59, Baoquan He wrote:
> [...]
> About pre_dump, if the dump is crash dump, hope those pre_dump notifiers
> will be executed under conditional check, e.g only if 'crash_kexec_post_notifiers'
> is specified in kernel cmdline.

Hi Baoquan, based on Petr's suggestion, I think pre_dump would be
responsible for really *non-intrusive/non-risky* tasks and should be
always executed in the panic path (before kdump), regardless of
"crash_kexec_post_notifiers".

The idea is that the majority of the notifiers would be executed in the
post_dump portion, and for that, we have the
"crash_kexec_post_notifiers" conditional. I also suggest we have
blacklist options (based on function names) for both notifiers, in order
to make kdump issues debug easier.

Do you agree with that? Feel free to comment with suggestions!
Cheers,


Guilherme

2022-01-25 16:46:05

by [email protected]

[permalink] [raw]
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

> @@ -146,6 +157,34 @@ void nmi_panic(struct pt_regs *regs, const char *msg)
> }
> EXPORT_SYMBOL(nmi_panic);
>
> +static int __init panic_notifier_filter_setup(char *buf)
> +{
> + char *func;
> + unsigned long addr;
> +
> + while ((func = strsep(&buf, ","))) {
> + addr = kallsyms_lookup_name(func);
> + if (!addr) {
> + pr_warn("panic_notifier_filter: invalid symbol %s\n", func);
> + continue;
> + }

Could you remove this check?

panic_notifier_list is exported to kernel modules and this check
prevents such users from using this feature.

Thanks.
HATAYAMA, Daisuke

2022-01-25 16:57:34

by Guilherme G. Piccoli

[permalink] [raw]
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

On 25/01/2022 08:50, [email protected] wrote:
>> + while ((func = strsep(&buf, ","))) {
>> + addr = kallsyms_lookup_name(func);
>> + if (!addr) {
>> + pr_warn("panic_notifier_filter: invalid symbol %s\n", func);
>> + continue;
>> + }
>
> Could you remove this check?
>
> panic_notifier_list is exported to kernel modules and this check
> prevents such users from using this feature.
>
> Thanks.
> HATAYAMA, Daisuke


Hi, thanks for the review. First of all, notice that it's very likely
this patch isn't gonna get merged this way, we are considering a
refactor that included 2 panic notifiers: one a bit earlier (pre_dump),
that includes functions less risky, as watchdog unloaders, kernel offset
dump, etc, and the second panic notifier (post_dump) will keep the
majority of callbacks, and can be conditionally executed on kdump
through the usage of "crash_kexec_post_notifiers".

Anyway, I'm curious with your code review - how can we use this filter
with modules, if the filter setup is invoked as early_param(), before
modules load? In that case, module functions won't have a valid address,
correct? So, in that moment, this lookup fails, we cannot record an
unloaded module address in such list. Please, correct me if I'm wrong.

Cheers,


Guilherme

2022-01-25 17:39:49

by [email protected]

[permalink] [raw]
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

> Hi, thanks for the review. First of all, notice that it's very likely
> this patch isn't gonna get merged this way, we are considering a
> refactor that included 2 panic notifiers: one a bit earlier (pre_dump),
> that includes functions less risky, as watchdog unloaders, kernel offset
> dump, etc, and the second panic notifier (post_dump) will keep the
> majority of callbacks, and can be conditionally executed on kdump
> through the usage of "crash_kexec_post_notifiers".

But the pre_dump cannot avoid calling multiple unnecessary handlers, right?
It's more risky than the previous idea...

> Anyway, I'm curious with your code review - how can we use this filter
> with modules, if the filter setup is invoked as early_param(), before
> modules load? In that case, module functions won't have a valid address,
> correct? So, in that moment, this lookup fails, we cannot record an
> unloaded module address in such list. Please, correct me if I'm wrong.

For example, how about simply maintaining function symbol names in the list
as string, not address.

Thanks.
HATAYAMA, Daisuke

2022-01-26 17:01:44

by Baoquan He

[permalink] [raw]
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

On 01/24/22 at 11:48am, Guilherme G. Piccoli wrote:
> On 24/01/2022 10:59, Baoquan He wrote:
> > [...]
> > About pre_dump, if the dump is crash dump, hope those pre_dump notifiers
> > will be executed under conditional check, e.g only if 'crash_kexec_post_notifiers'
> > is specified in kernel cmdline.
>
> Hi Baoquan, based on Petr's suggestion, I think pre_dump would be
> responsible for really *non-intrusive/non-risky* tasks and should be
> always executed in the panic path (before kdump), regardless of
> "crash_kexec_post_notifiers".
>
> The idea is that the majority of the notifiers would be executed in the
> post_dump portion, and for that, we have the
> "crash_kexec_post_notifiers" conditional. I also suggest we have
> blacklist options (based on function names) for both notifiers, in order
> to make kdump issues debug easier.
>
> Do you agree with that? Feel free to comment with suggestions!
> Cheers,

I would say "please NO" cautiously.

As Petr said, kdump mostly works only if people configure it correctly.
That's because we try best to switch to kdump kernel from the fragile
panicked kernel immediately. When we try to add anthing before the switching,
please consider carefully and ask if that adding is mandatory, otherwise
switching into kdump kernel may fail. If the answer is yes, the adding
is needed and welcomed. Othewise, any unnecessary action, including any
"non-intrusive/non-risky" tasks, would be unwelcomed.

Surely, we don't oppose the "non-intrusive/non-risky" or completely
"intrusive/risky" action adding before kdump kernel switching, with a
conditional limitation. When we handle customers' kdump support, we
explicitly declare we only support normal and default kdump operation.
If any action which is done before switching into kdump kernel is specified,
e.g panic_notifier, panic_print, they need take care of their own kdump
failure.

2022-01-26 21:14:54

by [email protected]

[permalink] [raw]
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

> On 01/24/22 at 11:48am, Guilherme G. Piccoli wrote:
> > On 24/01/2022 10:59, Baoquan He wrote:
> > > [...]
> > > About pre_dump, if the dump is crash dump, hope those pre_dump notifiers
> > > will be executed under conditional check, e.g only if 'crash_kexec_post_notifiers'
> > > is specified in kernel cmdline.
> >
> > Hi Baoquan, based on Petr's suggestion, I think pre_dump would be
> > responsible for really *non-intrusive/non-risky* tasks and should be
> > always executed in the panic path (before kdump), regardless of
> > "crash_kexec_post_notifiers".
> >
> > The idea is that the majority of the notifiers would be executed in the
> > post_dump portion, and for that, we have the
> > "crash_kexec_post_notifiers" conditional. I also suggest we have
> > blacklist options (based on function names) for both notifiers, in order
> > to make kdump issues debug easier.
> >
> > Do you agree with that? Feel free to comment with suggestions!
> > Cheers,
>
> I would say "please NO" cautiously.
>
> As Petr said, kdump mostly works only if people configure it correctly.
> That's because we try best to switch to kdump kernel from the fragile
> panicked kernel immediately. When we try to add anthing before the switching,
> please consider carefully and ask if that adding is mandatory, otherwise
> switching into kdump kernel may fail. If the answer is yes, the adding
> is needed and welcomed. Othewise, any unnecessary action, including any
> "non-intrusive/non-risky" tasks, would be unwelcomed.
>
> Surely, we don't oppose the "non-intrusive/non-risky" or completely
> "intrusive/risky" action adding before kdump kernel switching, with a
> conditional limitation. When we handle customers' kdump support, we
> explicitly declare we only support normal and default kdump operation.
> If any action which is done before switching into kdump kernel is specified,
> e.g panic_notifier, panic_print, they need take care of their own kdump
> failure.

Sorry for bringing back the past discussion, but how about
reconsidering the following design?

- kdump-specific notifier list within crash_kexec()

I don't think that the current implementation of
crash_kexec_post_notifiers is ideal, where panic() and panic notifier
are used, which contains the code that is unnecessary for kdump, and
it unnecessarily decreases kdump's reliability. The presence of kdump
code in panic() also conversely makes panic()'s code unnecessarily
complicated.

I use crash_kexec_post_notifiers with the understanding that it
affects kdump's reliability to some degree, but at the same time I
want it to be as reliable as possible.

I think introducing kdump-specific notifier list will resolve the
issues caused by dependencies to panic()'s code.

In addition, as I already said in another mail, I want to avoid
invoking any other handlers passed by users other than me, in order to
keep kdump's reliability. For such purpose, I think some feature like
Guilherme's panic notifier filter is needed.

2022-01-26 21:19:10

by Petr Mladek

[permalink] [raw]
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

On Wed 2022-01-26 11:10:39, Baoquan He wrote:
> On 01/24/22 at 11:48am, Guilherme G. Piccoli wrote:
> > On 24/01/2022 10:59, Baoquan He wrote:
> > > [...]
> > > About pre_dump, if the dump is crash dump, hope those pre_dump notifiers
> > > will be executed under conditional check, e.g only if 'crash_kexec_post_notifiers'
> > > is specified in kernel cmdline.
> >
> > Hi Baoquan, based on Petr's suggestion, I think pre_dump would be
> > responsible for really *non-intrusive/non-risky* tasks and should be
> > always executed in the panic path (before kdump), regardless of
> > "crash_kexec_post_notifiers".
> >
> > The idea is that the majority of the notifiers would be executed in the
> > post_dump portion, and for that, we have the
> > "crash_kexec_post_notifiers" conditional. I also suggest we have
> > blacklist options (based on function names) for both notifiers, in order
> > to make kdump issues debug easier.
> >
> > Do you agree with that? Feel free to comment with suggestions!
> > Cheers,
>
> I would say "please NO" cautiously.
>
> As Petr said, kdump mostly works only if people configure it correctly.
> That's because we try best to switch to kdump kernel from the fragile
> panicked kernel immediately. When we try to add anthing before the switching,
> please consider carefully and ask if that adding is mandatory, otherwise
> switching into kdump kernel may fail. If the answer is yes, the adding
> is needed and welcomed. Othewise, any unnecessary action, including any
> "non-intrusive/non-risky" tasks, would be unwelcomed.

I still do not have the complete picture. But it seems that some
actions make always sense even for kdump:

+ Super safe operations that might disable churn from broken
system. For examle, disabling watchdogs by setting a single
variable, see rcu_panic() notifier

+ Actions needed that allow to kexec the crash kernel a safe
way under some hypervisor, see
https://lore.kernel.org/r/MWHPR21MB15933573F5C81C5250BF6A1CD75E9@MWHPR21MB1593.namprd21.prod.outlook.com


> Surely, we don't oppose the "non-intrusive/non-risky" or completely
> "intrusive/risky" action adding before kdump kernel switching, with a
> conditional limitation. When we handle customers' kdump support, we
> explicitly declare we only support normal and default kdump operation.
> If any action which is done before switching into kdump kernel is specified,
> e.g panic_notifier, panic_print, they need take care of their own kdump
> failure.

All this actually started because of kmsg_dump. It might make sense to
allow both kmsg_dump and kdump together. The messages stored by
kmsg_dump might be better than nothing when kdump fails.

It actually seems to be the main motivation to introduce
"crash_kexec_post_notifier" parameter, see the commit
f06e5153f4ae2e2f3b03 ("kernel/panic.c: add "crash_kexec_post_notifiers"
option for kdump after panic_notifers").

And this patch introduces panic_notifier_filter that tries to select
notifiers that are helpful and harmful. IMHO, it is almost unusable.
It seems that even kernel developers do not understand what exactly
some notifiers do and why they are needed. Usually only the author
and people familiar with the subsystem have some idea. It makes
it pretty hard for anyone to create a reasonable filter.

I am pretty sure that we could do better. I propose to add more
notifier lists that will be called at various places with reasonable
rules and restrictions. Then the susbsystem maintainers could decide
where exactly a given action must be done.

The result might be that we will need only few options that will
enable/disable some well defined optional features.

Best Regards,
Petr

2022-01-28 11:23:05

by Guilherme G. Piccoli

[permalink] [raw]
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

On 25/01/2022 10:06, [email protected] wrote:
>
> But the pre_dump cannot avoid calling multiple unnecessary handlers, right?
> It's more risky than the previous idea...
>

I think we could have 2 kernel parameters then:

crash_kernel_disable_pre_notitifers (of course we can think in some
better name here heh)

crash_kernel_enable_post_notifiers (which is the same as the current
"crash_kernel_post_notifiers", we can keep it)

The point being (if I understand correctly): some callbacks are really
simple and don't introduce risks for kdump, like the RCU; a bunch of
them just set one variable. Those could be enable by default, before the
kdump.

The majority would fit in the 2nd group, meaning they are not enabled by
default, requiring some parameter for that.

Petr, let me know if that makes sense and is aligned with your suggestion.


> For example, how about simply maintaining function symbol names in the list
> as string, not address.
>

I considered that before, it was my first idea but it's not great due to
memory allocation. We'd need to use memblock to allocate a struct to
hold function names, and the comparison on register time is slower, I
guess... so it's much easier to pre-allocate some handlers and only
track the addresses of the function. I personally do not see much use in
this filter for module callbacks, but if that's a use case, we can think
on how to do that. But notice that the current implementation of the
filter wont hold if we end-up following the suggestions in this thread,
not sure even if we're gonna have a filter...

Cheers,


Guilherme

2022-01-31 11:04:26

by Petr Mladek

[permalink] [raw]
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

On Thu 2022-01-27 14:16:20, Guilherme G. Piccoli wrote:
> On 25/01/2022 10:06, [email protected] wrote:
> >
> > But the pre_dump cannot avoid calling multiple unnecessary handlers, right?
> > It's more risky than the previous idea...
> >
>
> I think we could have 2 kernel parameters then:
>
> crash_kernel_disable_pre_notitifers (of course we can think in some
> better name here heh)
>
> crash_kernel_enable_post_notifiers (which is the same as the current
> "crash_kernel_post_notifiers", we can keep it)
>
> The point being (if I understand correctly): some callbacks are really
> simple and don't introduce risks for kdump, like the RCU; a bunch of
> them just set one variable. Those could be enable by default, before the
> kdump.
>
> The majority would fit in the 2nd group, meaning they are not enabled by
> default, requiring some parameter for that.
>
> Petr, let me know if that makes sense and is aligned with your suggestion.

First, I am sorry for the very long mail. But the problem is really
complicated. I did my best to describe it a clean way.

I have discussed these problems with a colleague and he had some good
points. And my view evolved even further.

There are two groups of people interested in panic() behavior:

1. Users wants to get as reliable as possible: kdump, kmsg_dump,
console log, useful last message on screen, reboot, hypervisor
notification.

Different users have different priorities according to the use case.


2. Subsystem maintainers and developers that need to do something
special in panic(). They have to deal with the user requirements
and bug reports.

Most operations in panic() have unclear results because the system
is in unclear state. Maintainers and developers wants to make their
life easier. They do not want to deal with problems caused by
others. So that they want to disable others or run as early as
possible.

It is nicely visible. kdump maintainer is afraid of calling
anything before kdump. Many people support the idea of filtering
because it moves problems to the user side.


I see two basic problems here: ordering and reliability:

1. Ordering problems are partly solved by configuration and partly by
definition. I mean that:

+ kdump, kmsg_dump, panic_print_sys_info() are optional
+ console output is the best effort; more risky in final flush
+ reboot, infinite loop are the very last step

IMHO, the ordering should be pretty clear:

+ panic_print_sys_info(), kmsg_dump(), kdump(), console flush, reboot

Why?

+ panic_print_sys_info(), kmsg_dump(), kdump() are all optional
and disabled by default
+ Users might want panic_print_sys_info() in kmsg_dump() and kdump()
+ Users might prefer kmsg_dump() over kdump()
+ kdump() is the last operation when enabled

Where are panic notifiers in this picture?
Where are CPUs stopped?


2. Reliability is the best effort and any subsystem should do its
best.

Users need to be aware (documentation, warning) that:

+ kmsg_dump() is less reliable when panic_print_sys_info() is enabled
+ kdump() is less reliable when panic_print_sys_info() and/or
kmsg_dump() is enabled.

Where are panic notifiers in this picture?
How stopped CPUs affect reliability?


Regarding panic notifiers. They look like a problematic black box:

+ ordering against other operations is not clear
+ are not safe enough
+ various users require some and do not need others
+ some are too complex so that only few people know what
they do


So far, we have two proposals how to handle panic notifiers:

1. Allow to filter them with parameter:

+ pros:
+ it allows users to customize and work around problems

+ cons:
+ ordering is still not clear

+ user has to know what he does; note that sometimes only
few people know what their notifier does

+ hard to use in the long term; callbacks names are
implementation detail; new notifiers are added

+ lower motivation to solve problems; easy to wave them with
"just disable it when it does not work for you..."


2. Split notifiers into more lists:

+ pros:
+ might solve ordering problems

+ subsystem maintainers could find the proper and more safe
location


+ cons:
+ subsystem maintainers tend to think that their problem is
the most important one; they will tend to do the operation
as early as possible; so that even dangerous operations
might be done early => the original problem is still there

+ it might not motivate developers enough to make the notifiers as
safe as possible

+ some might still need to be optional; for example, it should
be possible to disable hypervisor notifier when it breaks
kdump


Regarding stopped CPUs, it looks like a good idea most of the time:

+ They should stop all tasks and reduce further damage of the
system.

+ It should also reduce noise (messages) produced by other CPUs.

+ But a special handling is needed when it is done before crash
dump.


Sigh, it looks really really complicated. We should be careful.

OK, the original problems are:

+ allow to call this order: panic_print_sys_info(), kmsg_dump(), kdump()
+ make it more safe with problematic notifiers


My opinion:

+ allow the desired ordering makes sense

+ something should be done with notifiers:

+ adding filer looks like a workaround that is not much
usable; it is not easy to use; it does not motivate people
fix problems so that is might make things worse in
the long term

+ splitting might make sense but it is not clear how

+ some notifiers make always sense before kmsg_dump;
some should be optional

+ we need a compromise to keep the panic() code sane and can't
support all combinations


I think about the following solution:

+ split the notifiers into three lists:

+ info: stop watchdogs, provide extra info
+ hypervisor: poke hypervisor
+ reboot: actions needed only when crash dump did not happen

+ allow to call hypervisor notifiers before or after kdump

+ stop CPUs before kdump when either hypervisor notifiers or
kmsg_dump is enabled

Note that it still allows to call kdump as the first action when
hypervisor notifiers are called after kdump and no kmsg dumper
is registered.


void panic(void)
{
[...]

if (crash_kexec_post_hypervisor || panic_print || enabled_kmsg_dump()) {
/*
* Stop CPUs when some extra action is required before
* crash dump. We will need architecture dependent extra
* works in addition to stopping other CPUs.
*/
crash_smp_send_stop();
cpus_stopped = true;
}

if (crash_kexec_post_hypervisor) {
/* Tell hypervisor about the panic */
atomic_notifier_call_chain(&panic_hypervisor_notifier_list, 0, buf);
}

if (enabled_kmsg_dump) {
/*
* Print extra info by notifiers.
* Prevent rumors, for example, by stopping watchdogs.
*/
atomic_notifier_call_chain(&panic_info_notifier_list, 0, buf);
}

/* Optional extra info */
panic_printk_sys_info();

/* No dumper by default */
kmsg_dump();

/* Used only when crash kernel loaded */
__crash_kexec(NULL);

if (!cpus_stopped) {
/*
* Note smp_send_stop is the usual smp shutdown function, which
* unfortunately means it may not be hardened to work in a
* panic situation.
*/
smp_send_stop();
}

if (!crash_kexec_post_hypervisor) {
/* Tell hypervisor about the panic */
atomic_notifier_call_chain(&panic_hypervisor_notifier_list, 0, buf);
}

if (!enabled_kmsg_dump) {
/*
* Print extra info by notifiers.
* Prevent rumors, for example, by stopping watchdogs.
*/
atomic_notifier_call_chain(&panic_info_notifier_list, 0, buf);
}

/*
* Help to reboot a safe way.
*/
atomic_notifier_call_chain(&panic_reboot_notifier_list, 0, buf);

[...]
}

Any opinion?
Do the notifier list names make sense?

Best Regards,
Petr

PS: I have vacation the following week. I'll continue in the
discussion when I am back.

2022-02-01 10:14:18

by Baoquan He

[permalink] [raw]
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

On 01/26/22 at 02:20pm, Petr Mladek wrote:
> On Wed 2022-01-26 11:10:39, Baoquan He wrote:
> > On 01/24/22 at 11:48am, Guilherme G. Piccoli wrote:
> > > On 24/01/2022 10:59, Baoquan He wrote:
> > > > [...]
> > > > About pre_dump, if the dump is crash dump, hope those pre_dump notifiers
> > > > will be executed under conditional check, e.g only if 'crash_kexec_post_notifiers'
> > > > is specified in kernel cmdline.
> > >
> > > Hi Baoquan, based on Petr's suggestion, I think pre_dump would be
> > > responsible for really *non-intrusive/non-risky* tasks and should be
> > > always executed in the panic path (before kdump), regardless of
> > > "crash_kexec_post_notifiers".
> > >
> > > The idea is that the majority of the notifiers would be executed in the
> > > post_dump portion, and for that, we have the
> > > "crash_kexec_post_notifiers" conditional. I also suggest we have
> > > blacklist options (based on function names) for both notifiers, in order
> > > to make kdump issues debug easier.
> > >
> > > Do you agree with that? Feel free to comment with suggestions!
> > > Cheers,
> >
> > I would say "please NO" cautiously.
> >
> > As Petr said, kdump mostly works only if people configure it correctly.
> > That's because we try best to switch to kdump kernel from the fragile
> > panicked kernel immediately. When we try to add anthing before the switching,
> > please consider carefully and ask if that adding is mandatory, otherwise
> > switching into kdump kernel may fail. If the answer is yes, the adding
> > is needed and welcomed. Othewise, any unnecessary action, including any
> > "non-intrusive/non-risky" tasks, would be unwelcomed.
>
> I still do not have the complete picture. But it seems that some
> actions make always sense even for kdump:
>
> + Super safe operations that might disable churn from broken
> system. For examle, disabling watchdogs by setting a single
> variable, see rcu_panic() notifier
>
> + Actions needed that allow to kexec the crash kernel a safe
> way under some hypervisor, see
> https://lore.kernel.org/r/MWHPR21MB15933573F5C81C5250BF6A1CD75E9@MWHPR21MB1593.namprd21.prod.outlook.com

Yes, I agree with this after going through threads of discussion again.
There is much space we can do something for panic_notifier, and it might
be a good time to do now with these discussion and some clarification.

>
>
> > Surely, we don't oppose the "non-intrusive/non-risky" or completely
> > "intrusive/risky" action adding before kdump kernel switching, with a
> > conditional limitation. When we handle customers' kdump support, we
> > explicitly declare we only support normal and default kdump operation.
> > If any action which is done before switching into kdump kernel is specified,
> > e.g panic_notifier, panic_print, they need take care of their own kdump
> > failure.
>
> All this actually started because of kmsg_dump. It might make sense to
> allow both kmsg_dump and kdump together. The messages stored by
> kmsg_dump might be better than nothing when kdump fails.

I think this can be done later, after panics notifiers are combed and
tidied up.

>
> It actually seems to be the main motivation to introduce
> "crash_kexec_post_notifier" parameter, see the commit
> f06e5153f4ae2e2f3b03 ("kernel/panic.c: add "crash_kexec_post_notifiers"
> option for kdump after panic_notifers").

From discussion with Hitachi and FJ engineers, they use
crash_kexec_post_notifiers when 1st kernel panicked and kdump kernel is
not so stable to function. In that case, the captured information
with best effort after panic in 1st kernel can help analyze what
happened in 1st kernel, and also might give hint on why kdump kernel
is unstbale. But they will not add crash_kexec_post_notifiers in kernel
cmdline by default. The unstable kdump kernel rarely happened, need be
debugged and investigated.

>
> And this patch introduces panic_notifier_filter that tries to select
> notifiers that are helpful and harmful. IMHO, it is almost unusable.
> It seems that even kernel developers do not understand what exactly
> some notifiers do and why they are needed. Usually only the author
> and people familiar with the subsystem have some idea. It makes
> it pretty hard for anyone to create a reasonable filter.

Then people can select the notifiers they know to execute. E.g for
Hyper-V, they can run the notifiers related. And as HATAYAMA mentioned,
they have the same expection. This kind of filter is not for people to
blindly pick and run, but for the professional.

I think the chaos is from not being monitored entirely. People can
freely register one when they need. The priority, impaction to the
entirety is not considered by subsystem developer.

>
> I am pretty sure that we could do better. I propose to add more
> notifier lists that will be called at various places with reasonable
> rules and restrictions. Then the susbsystem maintainers could decide
> where exactly a given action must be done.

Agree that we can do something to improve.

2022-02-09 05:41:03

by Guilherme G. Piccoli

[permalink] [raw]
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

On 28/01/2022 10:38, Petr Mladek wrote:
> [...] On Thu 2022-01-27 14:16:20, Guilherme G. Piccoli wrote:
> First, I am sorry for the very long mail. But the problem is really
> complicated. I did my best to describe it a clean way.
>
> I have discussed these problems with a colleague and he had some good
> points. And my view evolved even further.

Thanks Petr for the very comprehensive and detailed email - this helps a
lot in shaping the future of panic notifier(s)!


> [...]
> I think about the following solution:
>
> + split the notifiers into three lists:
>
> + info: stop watchdogs, provide extra info
> + hypervisor: poke hypervisor
> + reboot: actions needed only when crash dump did not happen
>
> + allow to call hypervisor notifiers before or after kdump
>
> + stop CPUs before kdump when either hypervisor notifiers or
> kmsg_dump is enabled
>
> Note that it still allows to call kdump as the first action when
> hypervisor notifiers are called after kdump and no kmsg dumper
> is registered.
>
>
> void panic(void)
> {
> [...]
>
> if (crash_kexec_post_hypervisor || panic_print || enabled_kmsg_dump()) {
> /*
> * Stop CPUs when some extra action is required before
> * crash dump. We will need architecture dependent extra
> * works in addition to stopping other CPUs.
> */
> crash_smp_send_stop();
> cpus_stopped = true;
> }
>
> if (crash_kexec_post_hypervisor) {
> /* Tell hypervisor about the panic */
> atomic_notifier_call_chain(&panic_hypervisor_notifier_list, 0, buf);
> }
>
> if (enabled_kmsg_dump) {
> /*
> * Print extra info by notifiers.
> * Prevent rumors, for example, by stopping watchdogs.
> */
> atomic_notifier_call_chain(&panic_info_notifier_list, 0, buf);
> }
>
> /* Optional extra info */
> panic_printk_sys_info();
>
> /* No dumper by default */
> kmsg_dump();
>
> /* Used only when crash kernel loaded */
> __crash_kexec(NULL);
>
> if (!cpus_stopped) {
> /*
> * Note smp_send_stop is the usual smp shutdown function, which
> * unfortunately means it may not be hardened to work in a
> * panic situation.
> */
> smp_send_stop();
> }
>
> if (!crash_kexec_post_hypervisor) {
> /* Tell hypervisor about the panic */
> atomic_notifier_call_chain(&panic_hypervisor_notifier_list, 0, buf);
> }
>
> if (!enabled_kmsg_dump) {
> /*
> * Print extra info by notifiers.
> * Prevent rumors, for example, by stopping watchdogs.
> */
> atomic_notifier_call_chain(&panic_info_notifier_list, 0, buf);
> }
>
> /*
> * Help to reboot a safe way.
> */
> atomic_notifier_call_chain(&panic_reboot_notifier_list, 0, buf);
>
> [...]
> }
>
> Any opinion?
> Do the notifier list names make sense?
>

This was exposed very clearly, thanks. I agree with you, it's a good
approach, and we can evolve that during the implementation phase, like
"function A is not good in the hypervisor list because of this and
that", so we move it to the reboot list. Also, name of the lists is not
so relevant, might evolve in the implementation phase - I personally
liked them, specially the "info" and "hypervisor" ones (reboot seems
good but not great heh).

So, what are the opinions from kdump maintainers about this idea?
Baoquan / Vivek / Dave, does it make sense to you? Do you have any
suggestions/concerns to add on top of Petr draft?

I prefer this refactor than the filter, certainly. If nobody else
working on that, I can try implementing that - it's very interesting.
The only thing I'd like to have first is an ACK from the kdump
maintainers about the general idea.

Cheers,


Guilherme

2022-02-09 08:24:54

by Baoquan He

[permalink] [raw]
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

On 02/08/22 at 03:51pm, Guilherme G. Piccoli wrote:
> On 28/01/2022 10:38, Petr Mladek wrote:
> > [...] On Thu 2022-01-27 14:16:20, Guilherme G. Piccoli wrote:
> > First, I am sorry for the very long mail. But the problem is really
> > complicated. I did my best to describe it a clean way.
> >
> > I have discussed these problems with a colleague and he had some good
> > points. And my view evolved even further.
>
> Thanks Petr for the very comprehensive and detailed email - this helps a
> lot in shaping the future of panic notifier(s)!
>
>
> > [...]
> > I think about the following solution:
> >
> > + split the notifiers into three lists:
> >
> > + info: stop watchdogs, provide extra info
> > + hypervisor: poke hypervisor
> > + reboot: actions needed only when crash dump did not happen
> >
> > + allow to call hypervisor notifiers before or after kdump
> >
> > + stop CPUs before kdump when either hypervisor notifiers or
> > kmsg_dump is enabled
> >
> > Note that it still allows to call kdump as the first action when
> > hypervisor notifiers are called after kdump and no kmsg dumper
> > is registered.
> >
> >
> > void panic(void)
> > {
> > [...]
> >
> > if (crash_kexec_post_hypervisor || panic_print || enabled_kmsg_dump()) {
> > /*
> > * Stop CPUs when some extra action is required before
> > * crash dump. We will need architecture dependent extra
> > * works in addition to stopping other CPUs.
> > */
> > crash_smp_send_stop();
> > cpus_stopped = true;
> > }
> >
> > if (crash_kexec_post_hypervisor) {
> > /* Tell hypervisor about the panic */
> > atomic_notifier_call_chain(&panic_hypervisor_notifier_list, 0, buf);
> > }
> >
> > if (enabled_kmsg_dump) {
> > /*
> > * Print extra info by notifiers.
> > * Prevent rumors, for example, by stopping watchdogs.
> > */
> > atomic_notifier_call_chain(&panic_info_notifier_list, 0, buf);
> > }
> >
> > /* Optional extra info */
> > panic_printk_sys_info();
> >
> > /* No dumper by default */
> > kmsg_dump();
> >
> > /* Used only when crash kernel loaded */
> > __crash_kexec(NULL);
> >
> > if (!cpus_stopped) {
> > /*
> > * Note smp_send_stop is the usual smp shutdown function, which
> > * unfortunately means it may not be hardened to work in a
> > * panic situation.
> > */
> > smp_send_stop();
> > }
> >
> > if (!crash_kexec_post_hypervisor) {
> > /* Tell hypervisor about the panic */
> > atomic_notifier_call_chain(&panic_hypervisor_notifier_list, 0, buf);
> > }
> >
> > if (!enabled_kmsg_dump) {
> > /*
> > * Print extra info by notifiers.
> > * Prevent rumors, for example, by stopping watchdogs.
> > */
> > atomic_notifier_call_chain(&panic_info_notifier_list, 0, buf);
> > }
> >
> > /*
> > * Help to reboot a safe way.
> > */
> > atomic_notifier_call_chain(&panic_reboot_notifier_list, 0, buf);
> >
> > [...]
> > }
> >
> > Any opinion?
> > Do the notifier list names make sense?
> >
>
> This was exposed very clearly, thanks. I agree with you, it's a good
> approach, and we can evolve that during the implementation phase, like
> "function A is not good in the hypervisor list because of this and
> that", so we move it to the reboot list. Also, name of the lists is not
> so relevant, might evolve in the implementation phase - I personally
> liked them, specially the "info" and "hypervisor" ones (reboot seems
> good but not great heh).
>
> So, what are the opinions from kdump maintainers about this idea?
> Baoquan / Vivek / Dave, does it make sense to you? Do you have any
> suggestions/concerns to add on top of Petr draft?

Yeah, it's reasonable. As I replied to Michael in another thread, I
think splitting the current notifier list is a good idea. At least the
code to archieve hyper-V's goal with panic_notifier is a little odd and
should be taken out and execute w/o conditional before kdump, and maybe
some others Petr has combed out.

For those which will be switched on with the need of adding panic_notifier
or panic_print into cmdline, the heavy users like HATAYAMA and Masa can
help check.

For Petr's draft code, does it mean hyper-V need another knob to trigger
the needed notifiers? Will you go with the draft direclty? Hyper-V now
runs panic notifiers by default, just a reminder.

>
> I prefer this refactor than the filter, certainly. If nobody else
> working on that, I can try implementing that - it's very interesting.
> The only thing I'd like to have first is an ACK from the kdump
> maintainers about the general idea.
>
> Cheers,
>
>
> Guilherme
>


2022-02-10 19:14:04

by Michael Kelley (LINUX)

[permalink] [raw]
Subject: RE: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

From: Guilherme G. Piccoli <[email protected]> Sent: Thursday, February 10, 2022 8:40 AM
>
> On 08/02/2022 21:31, [email protected] wrote:
> > [...]
> >> So, what are the opinions from kdump maintainers about this idea?
> >> Baoquan / Vivek / Dave, does it make sense to you? Do you have any
> >> suggestions/concerns to add on top of Petr draft?
> >
> > Yeah, it's reasonable. As I replied to Michael in another thread, I
> > think splitting the current notifier list is a good idea. At least the
> > code to archieve hyper-V's goal with panic_notifier is a little odd and
> > should be taken out and execute w/o conditional before kdump, and maybe
> > some others Petr has combed out.
> >
> > For those which will be switched on with the need of adding panic_notifier
> > or panic_print into cmdline, the heavy users like HATAYAMA and Masa can
> > help check.
> >
> > For Petr's draft code, does it mean hyper-V need another knob to trigger
> > the needed notifiers? Will you go with the draft direclty? Hyper-V now
> > runs panic notifiers by default, just a reminder.
> >
>
> Hi Baoquan, thanks for your comments.
>
> I'll need to study the Hyper-V code and how it's done today -

Let me know if you need any assistance or explanation as you look
at the Hyper-V code.

Michael Kelley
Principal SW Engineer
Linux Systems Group
Microsoft Corporation

> I guess
> most part of this implementation will be studying the notifiers we have
> currently, split them among the 3 new notifiers and comb them into
> patches, so they can be reviewed for all relevant maintainers (who know
> the code we are changing).
>
> I'm not sure if I go directly with the draft, likely it'll have some
> changes, but the draft should be the skeleton of the new implementation.
> Specially if you/other kdump maintainers agree it's a good idea =)
>
> Cheers,
>
>
> Guilherme

2022-02-11 02:00:39

by Guilherme G. Piccoli

[permalink] [raw]
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

On 08/02/2022 21:31, [email protected] wrote:
> [...]
>> So, what are the opinions from kdump maintainers about this idea?
>> Baoquan / Vivek / Dave, does it make sense to you? Do you have any
>> suggestions/concerns to add on top of Petr draft?
>
> Yeah, it's reasonable. As I replied to Michael in another thread, I
> think splitting the current notifier list is a good idea. At least the
> code to archieve hyper-V's goal with panic_notifier is a little odd and
> should be taken out and execute w/o conditional before kdump, and maybe
> some others Petr has combed out.
>
> For those which will be switched on with the need of adding panic_notifier
> or panic_print into cmdline, the heavy users like HATAYAMA and Masa can
> help check.
>
> For Petr's draft code, does it mean hyper-V need another knob to trigger
> the needed notifiers? Will you go with the draft direclty? Hyper-V now
> runs panic notifiers by default, just a reminder.
>

Hi Baoquan, thanks for your comments.

I'll need to study the Hyper-V code and how it's done today - I guess
most part of this implementation will be studying the notifiers we have
currently, split them among the 3 new notifiers and comb them into
patches, so they can be reviewed for all relevant maintainers (who know
the code we are changing).

I'm not sure if I go directly with the draft, likely it'll have some
changes, but the draft should be the skeleton of the new implementation.
Specially if you/other kdump maintainers agree it's a good idea =)

Cheers,


Guilherme

2022-02-11 10:32:35

by Guilherme G. Piccoli

[permalink] [raw]
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

On 10/02/2022 14:26, Michael Kelley (LINUX) wrote:
> From: Guilherme G. Piccoli <[email protected]> Sent: Thursday, February 10, 2022 8:40 AM
> [...]>> I'll need to study the Hyper-V code and how it's done today -
>
> Let me know if you need any assistance or explanation as you look
> at the Hyper-V code.
>
Perfect Michael, thanks a lot! Much appreciated =)

2022-03-07 02:45:02

by Guilherme G. Piccoli

[permalink] [raw]
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

On 28/01/2022 10:38, Petr Mladek wrote:
> [...]
> I think about the following solution:
>
> + split the notifiers into three lists:
>
> + info: stop watchdogs, provide extra info
> + hypervisor: poke hypervisor
> + reboot: actions needed only when crash dump did not happen
>
> + allow to call hypervisor notifiers before or after kdump
>
> + stop CPUs before kdump when either hypervisor notifiers or
> kmsg_dump is enabled
>
> Note that it still allows to call kdump as the first action when
> hypervisor notifiers are called after kdump and no kmsg dumper
> is registered.
>
>
> void panic(void)
> {
> [...]
>
> if (crash_kexec_post_hypervisor || panic_print || enabled_kmsg_dump()) {
> /*
> * Stop CPUs when some extra action is required before
> * crash dump. We will need architecture dependent extra
> * works in addition to stopping other CPUs.
> */
> crash_smp_send_stop();
> cpus_stopped = true;
> }
>
> if (crash_kexec_post_hypervisor) {
> /* Tell hypervisor about the panic */
> atomic_notifier_call_chain(&panic_hypervisor_notifier_list, 0, buf);
> }
>
> if (enabled_kmsg_dump) {
> /*
> * Print extra info by notifiers.
> * Prevent rumors, for example, by stopping watchdogs.
> */
> atomic_notifier_call_chain(&panic_info_notifier_list, 0, buf);
> }
>
> /* Optional extra info */
> panic_printk_sys_info();
>
> /* No dumper by default */
> kmsg_dump();
>
> /* Used only when crash kernel loaded */
> __crash_kexec(NULL);
>
> if (!cpus_stopped) {
> /*
> * Note smp_send_stop is the usual smp shutdown function, which
> * unfortunately means it may not be hardened to work in a
> * panic situation.
> */
> smp_send_stop();
> }
>
> if (!crash_kexec_post_hypervisor) {
> /* Tell hypervisor about the panic */
> atomic_notifier_call_chain(&panic_hypervisor_notifier_list, 0, buf);
> }
>
> if (!enabled_kmsg_dump) {
> /*
> * Print extra info by notifiers.
> * Prevent rumors, for example, by stopping watchdogs.
> */
> atomic_notifier_call_chain(&panic_info_notifier_list, 0, buf);
> }
>
> /*
> * Help to reboot a safe way.
> */
> atomic_notifier_call_chain(&panic_reboot_notifier_list, 0, buf);
>
> [...]
> }
>
> Any opinion?
> Do the notifier list names make sense?
>
> Best Regards,
> Petr


Hi folks, I'm working on this now, and while looking into it I've
noticed that we have the concept of "priority" in the notifiers list.
Basically, you can order the calls the way it fits best, priority is an
integer and must the set in the moment of registration, it's up to the
users of the notifiers to set it and enforce the ordering.

So what I'm thinking is: currently, only 3 or 4 panic notifiers make use
of that. What if, since we're re-working this, we add a priority for
*all* notifiers and enforce its usage? This way we guarantee
consistency, it'd make debug easier and maybe even more important:
having all the notifiers and their priorities in a list present in the
header file would be great documentation about all the existing
notifiers and how they are called - today this information is quite
obscure and requires lots of code grepping!

Let me know your thoughts Petr / Baoquan - it would add slightly more
code / complexity, but in my opinion the payback is very good.
Cheers,


Guilherme

2022-03-07 06:53:08

by Baoquan He

[permalink] [raw]
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

On 03/06/22 at 11:21am, Guilherme G. Piccoli wrote:
> On 28/01/2022 10:38, Petr Mladek wrote:
> > [...]
> > I think about the following solution:
> >
> > + split the notifiers into three lists:
> >
> > + info: stop watchdogs, provide extra info
> > + hypervisor: poke hypervisor
> > + reboot: actions needed only when crash dump did not happen
> >
> > + allow to call hypervisor notifiers before or after kdump
> >
> > + stop CPUs before kdump when either hypervisor notifiers or
> > kmsg_dump is enabled
> >
> > Note that it still allows to call kdump as the first action when
> > hypervisor notifiers are called after kdump and no kmsg dumper
> > is registered.
> >
> >
> > void panic(void)
> > {
> > [...]
> >
> > if (crash_kexec_post_hypervisor || panic_print || enabled_kmsg_dump()) {
> > /*
> > * Stop CPUs when some extra action is required before
> > * crash dump. We will need architecture dependent extra
> > * works in addition to stopping other CPUs.
> > */
> > crash_smp_send_stop();
> > cpus_stopped = true;
> > }
> >
> > if (crash_kexec_post_hypervisor) {
> > /* Tell hypervisor about the panic */
> > atomic_notifier_call_chain(&panic_hypervisor_notifier_list, 0, buf);
> > }
> >
> > if (enabled_kmsg_dump) {
> > /*
> > * Print extra info by notifiers.
> > * Prevent rumors, for example, by stopping watchdogs.
> > */
> > atomic_notifier_call_chain(&panic_info_notifier_list, 0, buf);
> > }
> >
> > /* Optional extra info */
> > panic_printk_sys_info();
> >
> > /* No dumper by default */
> > kmsg_dump();
> >
> > /* Used only when crash kernel loaded */
> > __crash_kexec(NULL);
> >
> > if (!cpus_stopped) {
> > /*
> > * Note smp_send_stop is the usual smp shutdown function, which
> > * unfortunately means it may not be hardened to work in a
> > * panic situation.
> > */
> > smp_send_stop();
> > }
> >
> > if (!crash_kexec_post_hypervisor) {
> > /* Tell hypervisor about the panic */
> > atomic_notifier_call_chain(&panic_hypervisor_notifier_list, 0, buf);
> > }
> >
> > if (!enabled_kmsg_dump) {
> > /*
> > * Print extra info by notifiers.
> > * Prevent rumors, for example, by stopping watchdogs.
> > */
> > atomic_notifier_call_chain(&panic_info_notifier_list, 0, buf);
> > }
> >
> > /*
> > * Help to reboot a safe way.
> > */
> > atomic_notifier_call_chain(&panic_reboot_notifier_list, 0, buf);
> >
> > [...]
> > }
> >
> > Any opinion?
> > Do the notifier list names make sense?
> >
> > Best Regards,
> > Petr
>
>
> Hi folks, I'm working on this now, and while looking into it I've
> noticed that we have the concept of "priority" in the notifiers list.
> Basically, you can order the calls the way it fits best, priority is an
> integer and must the set in the moment of registration, it's up to the
> users of the notifiers to set it and enforce the ordering.
>
> So what I'm thinking is: currently, only 3 or 4 panic notifiers make use
> of that. What if, since we're re-working this, we add a priority for
> *all* notifiers and enforce its usage? This way we guarantee
> consistency, it'd make debug easier and maybe even more important:
> having all the notifiers and their priorities in a list present in the
> header file would be great documentation about all the existing
> notifiers and how they are called - today this information is quite
> obscure and requires lots of code grepping!
>
> Let me know your thoughts Petr / Baoquan - it would add slightly more
> code / complexity, but in my opinion the payback is very good.
> Cheers,

The ideal situation is each panic notifier has an order or index to
indicate its priority. Wondering how to make it. What I think of is
copying initcall. We have several priorities, at the same priority,
execution sequence is not important. Not sure if I get your point.

~~~~~~~
#define core_initcall(fn) __define_initcall(fn, 1)
#define core_initcall_sync(fn) __define_initcall(fn, 1s)
......
#define late_initcall(fn) __define_initcall(fn, 7)
#define late_initcall_sync(fn) __define_initcall(fn, 7s)

2022-03-07 14:23:50

by Guilherme G. Piccoli

[permalink] [raw]
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

On 07/03/2022 00:42, [email protected] wrote:
> [...]
>> Let me know your thoughts Petr / Baoquan - it would add slightly more
>> code / complexity, but in my opinion the payback is very good.
>> Cheers,
>
> The ideal situation is each panic notifier has an order or index to
> indicate its priority. Wondering how to make it. What I think of is
> copying initcall. We have several priorities, at the same priority,
> execution sequence is not important. Not sure if I get your point.
>
> ~~~~~~~
> #define core_initcall(fn) __define_initcall(fn, 1)
> #define core_initcall_sync(fn) __define_initcall(fn, 1s)
> ......
> #define late_initcall(fn) __define_initcall(fn, 7)
> #define late_initcall_sync(fn) __define_initcall(fn, 7s)
>

Hi Baoquan, thanks for you consideration! In fact, the notifiers
infrastructure already have a mechanism of ordering, my idea is to make
use of that. It's not that different from the initcall system...

For instance, the code in the notifier register function checks for the
priority field in the notifier block:
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/notifier.c#n31
.

For example, the Xen panic notifier is one of the few blocks that make
use of that, currently:

static struct notifier_block xen_panic_block = {
.notifier_call = xen_panic_event,
.priority = INT_MIN
};

(see
https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/xen/enlighten.c#n286)

In this case, xen_panic_event() will be the latest one to run. My idea
is to make use of that in *all* panic notifiers, having a table/list on
panic_notifier.h (defines or enum, I'll think about that when writing
this part) so all notifiers are documented and the ordering is clear and
enforced.

Makes sense to you?
Cheers,


Guilherme

2022-03-08 12:48:09

by Baoquan He

[permalink] [raw]
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

On 03/07/22 at 10:11am, Guilherme G. Piccoli wrote:
> On 07/03/2022 00:42, [email protected] wrote:
> > [...]
> >> Let me know your thoughts Petr / Baoquan - it would add slightly more
> >> code / complexity, but in my opinion the payback is very good.
> >> Cheers,
> >
> > The ideal situation is each panic notifier has an order or index to
> > indicate its priority. Wondering how to make it. What I think of is
> > copying initcall. We have several priorities, at the same priority,
> > execution sequence is not important. Not sure if I get your point.
> >
> > ~~~~~~~
> > #define core_initcall(fn) __define_initcall(fn, 1)
> > #define core_initcall_sync(fn) __define_initcall(fn, 1s)
> > ......
> > #define late_initcall(fn) __define_initcall(fn, 7)
> > #define late_initcall_sync(fn) __define_initcall(fn, 7s)
> >
>
> Hi Baoquan, thanks for you consideration! In fact, the notifiers
> infrastructure already have a mechanism of ordering, my idea is to make
> use of that. It's not that different from the initcall system...

Ah, sorry, I even didn't notice that. That's awesome if we can make use
of that. While I still have concerns:

1) about those we have decided to take out from panic notifier list and
put before kdump, e.g the Hypver-V notifier, how will we do with it? Are
we going to handle them as we have discussed?

2) Combing and settling priority for all existing panic notifier looks
great, even though it will take some effort. How about the later newly
added one? How can we guarantee that those new notifiers are getting
appropriate priority to mark their order? Sometime we even don't know
a new panic notifier is added since code change may be made in any
component or driver.

>
> For instance, the code in the notifier register function checks for the
> priority field in the notifier block:
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/kernel/notifier.c#n31
> .
>
> For example, the Xen panic notifier is one of the few blocks that make
> use of that, currently:
>
> static struct notifier_block xen_panic_block = {
> .notifier_call = xen_panic_event,
> .priority = INT_MIN
> };
>
> (see
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/xen/enlighten.c#n286)
>
> In this case, xen_panic_event() will be the latest one to run. My idea
> is to make use of that in *all* panic notifiers, having a table/list on
> panic_notifier.h (defines or enum, I'll think about that when writing
> this part) so all notifiers are documented and the ordering is clear and
> enforced.
>
> Makes sense to you?
> Cheers,
>
>
> Guilherme
>

2022-03-08 16:40:58

by Petr Mladek

[permalink] [raw]
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

On Mon 2022-03-07 11:25:30, Guilherme G. Piccoli wrote:
> On 07/03/2022 11:04, [email protected] wrote:
> > [...]
> > Ah, sorry, I even didn't notice that. That's awesome if we can make use
> > of that. While I still have concerns:
> >
>
> Thanks, nice that you liked the idea.
>
> > 1) about those we have decided to take out from panic notifier list and
> > put before kdump, e.g the Hypver-V notifier, how will we do with it? Are
> > we going to handle them as we have discussed?
> >
>
> While implementing that I will think of something, but if
> understood/remember correctly Hyper-V gonna be one of the first to run
> in the first notifier list proposed by Petr - so we might still use
> ordering by priority there, having Hyper-V being the first heh

My understanding is that the problem is not a priority but an ordering
against other operations.

Namely, Hyper-V must be called even before crash dump. Some others before
kmsg_dump(). And the rest only when the crash dump is not called at all.


> > 2) Combing and settling priority for all existing panic notifier looks
> > great, even though it will take some effort. How about the later newly
> > added one? How can we guarantee that those new notifiers are getting
> > appropriate priority to mark their order? Sometime we even don't know
> > a new panic notifier is added since code change may be made in any
> > component or driver.
> >
>
> This is a great point! How to do it? One idea is to have a special
> registering function for panic notifiers that checks for priority field
> missing, and good documentation is a good idea as well, always.
>
> But if you / others have other suggestions, let me know - appreciate that.
> Cheers,

Honestly, I am not that keen about enforcing the priorities.

It would make sense only when the ordering is really important.
Then there should be some rules. It should be obvious why
something has to be done earlier than something else.

From my POV, it is just another complexity. Someone will need to
assign the priority to the existing notifiers. People will wonder
about it for newly added notifiers.

Reproducibility seems to be the only motivation. Is it really a
problem? Do the notifiers affect each other?

Also the notifiers are typically registered by some early boot
code. It will define some ordering out of box.

Best Regards,
Petr

2022-03-09 00:41:28

by Guilherme G. Piccoli

[permalink] [raw]
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

On 07/03/2022 11:04, [email protected] wrote:
> [...]
> Ah, sorry, I even didn't notice that. That's awesome if we can make use
> of that. While I still have concerns:
>

Thanks, nice that you liked the idea.

> 1) about those we have decided to take out from panic notifier list and
> put before kdump, e.g the Hypver-V notifier, how will we do with it? Are
> we going to handle them as we have discussed?
>

While implementing that I will think of something, but if
understood/remember correctly Hyper-V gonna be one of the first to run
in the first notifier list proposed by Petr - so we might still use
ordering by priority there, having Hyper-V being the first heh

> 2) Combing and settling priority for all existing panic notifier looks
> great, even though it will take some effort. How about the later newly
> added one? How can we guarantee that those new notifiers are getting
> appropriate priority to mark their order? Sometime we even don't know
> a new panic notifier is added since code change may be made in any
> component or driver.
>

This is a great point! How to do it? One idea is to have a special
registering function for panic notifiers that checks for priority field
missing, and good documentation is a good idea as well, always.

But if you / others have other suggestions, let me know - appreciate that.
Cheers,


Guilherme

2022-03-09 00:52:56

by Guilherme G. Piccoli

[permalink] [raw]
Subject: Re: [PATCH V4] notifier/panic: Introduce panic_notifier_filter

On 08/03/2022 09:54, Petr Mladek wrote:
> [...]
> Honestly, I am not that keen about enforcing the priorities.
>
> It would make sense only when the ordering is really important.
> Then there should be some rules. It should be obvious why
> something has to be done earlier than something else.
>
> From my POV, it is just another complexity. Someone will need to
> assign the priority to the existing notifiers. People will wonder
> about it for newly added notifiers.
>
> Reproducibility seems to be the only motivation. Is it really a
> problem? Do the notifiers affect each other?
>
> Also the notifiers are typically registered by some early boot
> code. It will define some ordering out of box.
>
> Best Regards,
> Petr

OK Petr, makes sense. I feel that one the pros of the approach would be
to document all the notifiers in the header, but I guess nothing
prevents me to do that heh
I don't need enforcing the priorities for this.

I liked the idea of having everything "reproducible" as you said, but we
could face the problem of how to order some independent notifiers...
would be something "ad-hoc".

I'll progress here without priorities for now, if somebody see an
advantage for that, we can use it.

Thanks for your points,


Guilherme