2022-01-31 09:37:04

by Tiezhu Yang

Subject: [PATCH 0/5] Update doc and fix some issues about kdump

Tiezhu Yang (5):
docs: kdump: update description about sysfs file system support
docs: kdump: add scp sample to write out the dump file
kcsan: unset panic_on_warn before calling panic()
sched: unset panic_on_warn before calling panic()
kfence: unset panic_on_warn before calling panic()

Documentation/admin-guide/kdump/kdump.rst | 10 +++++++---
kernel/kcsan/report.c | 10 +++++++++-
kernel/sched/core.c | 11 ++++++++++-
mm/kfence/report.c | 10 +++++++++-
4 files changed, 35 insertions(+), 6 deletions(-)

--
2.1.0


2022-01-31 09:37:42

by Tiezhu Yang

Subject: [PATCH 2/5] docs: kdump: add scp sample to write out the dump file

Except cp and makedumpfile, add scp sample to write out the dump file.

Signed-off-by: Tiezhu Yang <[email protected]>
---
Documentation/admin-guide/kdump/kdump.rst | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/Documentation/admin-guide/kdump/kdump.rst b/Documentation/admin-guide/kdump/kdump.rst
index d187df2..a748e7e 100644
--- a/Documentation/admin-guide/kdump/kdump.rst
+++ b/Documentation/admin-guide/kdump/kdump.rst
@@ -533,6 +533,10 @@ the following command::

cp /proc/vmcore <dump-file>

+or use scp to write out the dump file between hosts on a network, e.g::
+
+ scp /proc/vmcore remote_username@remote_ip:<dump-file>
+
You can also use makedumpfile utility to write out the dump file
with specified options to filter out unwanted contents, e.g::

--
2.1.0

2022-01-31 09:40:28

by Tiezhu Yang

Subject: [PATCH 4/5] sched: unset panic_on_warn before calling panic()

As done in the full WARN() handler, panic_on_warn needs to be cleared
before calling panic() to avoid recursive panics.

Signed-off-by: Tiezhu Yang <[email protected]>
---
kernel/sched/core.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 848eaa0..f5b0886 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5524,8 +5524,17 @@ static noinline void __schedule_bug(struct task_struct *prev)
pr_err("Preemption disabled at:");
print_ip_sym(KERN_ERR, preempt_disable_ip);
}
- if (panic_on_warn)
+
+ if (panic_on_warn) {
+ /*
+ * This thread may hit another WARN() in the panic path.
+ * Resetting this prevents additional WARN() from panicking the
+ * system on this thread. Other threads are blocked by the
+ * panic_mutex in panic().
+ */
+ panic_on_warn = 0;
panic("scheduling while atomic\n");
+ }

dump_stack();
add_taint(TAINT_WARN, LOCKDEP_STILL_OK);
--
2.1.0

2022-01-31 10:00:51

by Marco Elver

Subject: Re: [PATCH 4/5] sched: unset panic_on_warn before calling panic()

On Fri, 28 Jan 2022 at 12:42, Tiezhu Yang <[email protected]> wrote:
>
> As done in the full WARN() handler, panic_on_warn needs to be cleared
> before calling panic() to avoid recursive panics.
>
> Signed-off-by: Tiezhu Yang <[email protected]>
> ---
> kernel/sched/core.c | 11 ++++++++++-
> 1 file changed, 10 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 848eaa0..f5b0886 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -5524,8 +5524,17 @@ static noinline void __schedule_bug(struct task_struct *prev)
> pr_err("Preemption disabled at:");
> print_ip_sym(KERN_ERR, preempt_disable_ip);
> }
> - if (panic_on_warn)
> +
> + if (panic_on_warn) {
> + /*
> + * This thread may hit another WARN() in the panic path.
> + * Resetting this prevents additional WARN() from panicking the
> + * system on this thread. Other threads are blocked by the
> + * panic_mutex in panic().
> + */
> + panic_on_warn = 0;
> panic("scheduling while atomic\n");

I agree this is worth fixing.

But: Why can't the "panic_on_warn = 0" just be moved inside panic(),
instead of copy-pasting this all over the place?

I may be missing something obvious why this hasn't been done before...

Thanks,
-- Marco
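
To make the recursion concern concrete, here is a small user-space model of the flag handling discussed above. It is only a sketch, not kernel code: model_panic(), model_warn(), run_model(), MAX_DEPTH and the reset_in_panic switch are all hypothetical names invented for illustration, and the depth cap exists only so that the unfixed case terminates.

```c
#include <stdbool.h>

#define MAX_DEPTH 8		/* hypothetical cap so the unfixed case terminates */

static bool panic_on_warn;	/* stand-in for the kernel's panic_on_warn flag */
static int panic_calls;		/* number of times model_panic() was entered */

void model_warn(bool reset_in_panic);

/* Stand-in for panic(): optionally clears the flag before doing any work. */
void model_panic(bool reset_in_panic)
{
	panic_calls++;
	if (reset_in_panic)
		panic_on_warn = false;
	/* Model the panic path tripping over another WARN(). */
	if (panic_calls < MAX_DEPTH)
		model_warn(reset_in_panic);
}

/* Stand-in for a WARN() site that does not clear the flag itself. */
void model_warn(bool reset_in_panic)
{
	if (panic_on_warn)
		model_panic(reset_in_panic);
}

/* Run one scenario and return how many times the panic path was entered. */
int run_model(bool reset_in_panic)
{
	panic_on_warn = true;
	panic_calls = 0;
	model_warn(reset_in_panic);
	return panic_calls;
}
```

In this model, run_model(true) enters the panic path once, because the flag is cleared centrally before the nested warning fires. run_model(false) re-enters it until the artificial MAX_DEPTH cap is reached, which is the recursive behaviour the series is trying to avoid at each call site.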

2022-02-01 09:10:56

by Tiezhu Yang

Subject: Re: [PATCH 4/5] sched: unset panic_on_warn before calling panic()



On 1/28/22 19:52, Marco Elver wrote:
> On Fri, 28 Jan 2022 at 12:42, Tiezhu Yang <[email protected]> wrote:
>>
>> As done in the full WARN() handler, panic_on_warn needs to be cleared
>> before calling panic() to avoid recursive panics.
>>
>> Signed-off-by: Tiezhu Yang <[email protected]>
>> ---
>> kernel/sched/core.c | 11 ++++++++++-
>> 1 file changed, 10 insertions(+), 1 deletion(-)
>>
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 848eaa0..f5b0886 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -5524,8 +5524,17 @@ static noinline void __schedule_bug(struct task_struct *prev)
>> pr_err("Preemption disabled at:");
>> print_ip_sym(KERN_ERR, preempt_disable_ip);
>> }
>> - if (panic_on_warn)
>> +
>> + if (panic_on_warn) {
>> + /*
>> + * This thread may hit another WARN() in the panic path.
>> + * Resetting this prevents additional WARN() from panicking the
>> + * system on this thread. Other threads are blocked by the
>> + * panic_mutex in panic().
>> + */
>> + panic_on_warn = 0;
>> panic("scheduling while atomic\n");
>
> I agree this is worth fixing.
>
> But: Why can't the "panic_on_warn = 0" just be moved inside panic(),
> instead of copy-pasting this all over the place?

OK, that looks better.

I will wait a few days. If there are no more comments, I will send v2,
which moves "panic_on_warn = 0" inside panic() and removes it from
the other places, like this:

diff --git a/kernel/panic.c b/kernel/panic.c
index 55b50e052ec3..95ba825522dd 100644
--- a/kernel/panic.c
+++ b/kernel/panic.c
@@ -185,6 +185,16 @@ void panic(const char *fmt, ...)
int old_cpu, this_cpu;
bool _crash_kexec_post_notifiers = crash_kexec_post_notifiers;

+ if (panic_on_warn) {
+ /*
+ * This thread may hit another WARN() in the panic path.
+ * Resetting this prevents additional WARN() from panicking the
+ * system on this thread. Other threads are blocked by the
+ * panic_mutex in panic().
+ */
+ panic_on_warn = 0;
+ }
+
/*
* Disable local interrupts. This will prevent panic_smp_self_stop
* from deadlocking the first cpu that invokes the panic, since
@@ -576,16 +586,8 @@ void __warn(const char *file, int line, void *caller, unsigned taint,
if (regs)
show_regs(regs);

- if (panic_on_warn) {
- /*
- * This thread may hit another WARN() in the panic path.
- * Resetting this prevents additional WARN() from panicking the
- * system on this thread. Other threads are blocked by the
- * panic_mutex in panic().
- */
- panic_on_warn = 0;
+ if (panic_on_warn)
panic("panic_on_warn set ...\n");
- }

if (!regs)
dump_stack();
diff --git a/lib/ubsan.c b/lib/ubsan.c
index bdc380ff5d5c..36bd75e33426 100644
--- a/lib/ubsan.c
+++ b/lib/ubsan.c
@@ -154,16 +154,8 @@ static void ubsan_epilogue(void)

current->in_ubsan--;

- if (panic_on_warn) {
- /*
- * This thread may hit another WARN() in the panic path.
- * Resetting this prevents additional WARN() from panicking the
- * system on this thread. Other threads are blocked by the
- * panic_mutex in panic().
- */
- panic_on_warn = 0;
+ if (panic_on_warn)
panic("panic_on_warn set ...\n");
- }
}

void __ubsan_handle_divrem_overflow(void *_data, void *lhs, void *rhs)
diff --git a/mm/kasan/report.c b/mm/kasan/report.c
index 3ad9624dcc56..f14146563d41 100644
--- a/mm/kasan/report.c
+++ b/mm/kasan/report.c
@@ -117,16 +117,8 @@ static void end_report(unsigned long *flags, unsigned long addr)

pr_err("==================================================================\n");
add_taint(TAINT_BAD_PAGE, LOCKDEP_NOW_UNRELIABLE);
spin_unlock_irqrestore(&report_lock, *flags);
- if (panic_on_warn && !test_bit(KASAN_BIT_MULTI_SHOT, &kasan_flags)) {
- /*
- * This thread may hit another WARN() in the panic path.
- * Resetting this prevents additional WARN() from panicking the
- * system on this thread. Other threads are blocked by the
- * panic_mutex in panic().
- */
- panic_on_warn = 0;
+ if (panic_on_warn && !test_bit(KASAN_BIT_MULTI_SHOT, &kasan_flags))
panic("panic_on_warn set ...\n");
- }
if (kasan_arg_fault == KASAN_ARG_FAULT_PANIC)
panic("kasan.fault=panic set ...\n");
kasan_enable_current();

Thanks,
Tiezhu

>
> I may be missing something obvious why this hasn't been done before...
>
> Thanks,
> -- Marco

2022-02-01 09:36:51

by Baoquan He

Subject: Re: [PATCH 2/5] docs: kdump: add scp sample to write out the dump file

On 01/28/22 at 07:42pm, Tiezhu Yang wrote:
> Except cp and makedumpfile, add scp sample to write out the dump file.
                                      ~~~~~~? You mean example?

I think we only give examples here rather than list every case. Still,
adding scp does no harm. Anyway, apart from the concern about 'sample':

Acked-by: Baoquan He <[email protected]>

>
> Signed-off-by: Tiezhu Yang <[email protected]>
> ---
> Documentation/admin-guide/kdump/kdump.rst | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/Documentation/admin-guide/kdump/kdump.rst b/Documentation/admin-guide/kdump/kdump.rst
> index d187df2..a748e7e 100644
> --- a/Documentation/admin-guide/kdump/kdump.rst
> +++ b/Documentation/admin-guide/kdump/kdump.rst
> @@ -533,6 +533,10 @@ the following command::
>
> cp /proc/vmcore <dump-file>
>
> +or use scp to write out the dump file between hosts on a network, e.g::
> +
> + scp /proc/vmcore remote_username@remote_ip:<dump-file>
> +
> You can also use makedumpfile utility to write out the dump file
> with specified options to filter out unwanted contents, e.g::
>
> --
> 2.1.0
>