2021-06-27 09:38:17

Subject: [PATCH] riscv: stacktrace: fix dump_backtrace/walk_stackframe with NULL task

Some callers request a backtrace with a NULL task and expect it to be
treated as 'current', for example dump_stack()->show_stack(NULL, ...).
The stacktrace code should therefore handle this case.

Here is an oops triggered by this issue when the NULL task is accessed:

[ 15.180813] Kernel panic - not syncing: No working init found. Try passing init= option to kernel. See Linux Documentation/admin-guide/init.rst for guidance.
[ 15.182382] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.13.0-rc7-00111-g625acffd7ae2-dirty #18
[ 15.183431] Hardware name: riscv-virtio,qemu (DT)
[ 15.184253] Call Trace:
[ 15.223617] Unable to handle kernel paging request at virtual address 0000000000001590
[ 15.267378] Oops [#1]
[ 15.268215] Modules linked in:
[ 15.272027] CPU: 3 PID: 1 Comm: swapper/0 Not tainted 5.13.0-rc7-00111-g625acffd7ae2-dirty #18
[ 15.273997] Hardware name: riscv-virtio,qemu (DT)
[ 15.275134] epc : walk_stackframe+0xc4/0xdc
[ 15.280146] ra : dump_backtrace+0x30/0x38
[ 15.280799] epc : ffffffff8000597e ra : ffffffff800059c6 sp : ffffffe002383d60
[ 15.281622] gp : ffffffff8179ad18 tp : ffffffe002378000 t0 : ffffffff81bc1a3f
[ 15.282574] t1 : 0000000000000001 t2 : 0000000000000000 s0 : ffffffe002383dc0
[ 15.283782] s1 : ffffffff812b7d18 a0 : 0000000000001000 a1 : 0000000000000000
[ 15.285115] a2 : ffffffff807ec668 a3 : ffffffff812b7d18 a4 : c76c00cabf08b500
[ 15.286213] a5 : 0000000000001000 a6 : 000000001a9ef260 a7 : 0000000000000000
[ 15.287317] s2 : 0000000000000000 s3 : 0000000000000000 s4 : 0000000000000000
[ 15.288323] s5 : ffffffff807ec668 s6 : ffffffff812b7d18 s7 : 0000000000000000
[ 15.289530] s8 : 0000000000000000 s9 : 0000000000000000 s10: 0000000000000000
[ 15.290995] s11: 0000000000000000 t3 : 0000000000000001 t4 : 0000000000000000
[ 15.292465] t5 : 206f74206e6f6974 t6 : ffffffe002383b28
[ 15.293859] status: 0000000000000100 badaddr: 0000000000001590 cause: 000000000000000d
[ 15.296035] [<ffffffff8000597e>] walk_stackframe+0xc4/0xdc
[ 15.297342] [<ffffffff800059c6>] dump_backtrace+0x30/0x38
[ 15.298333] [<ffffffff807ec6e0>] show_stack+0x40/0x4c
[ 15.299765] [<ffffffff807f07ac>] dump_stack+0x7c/0x96
[ 15.300553] [<ffffffff807ec8be>] panic+0x118/0x300
[ 15.301147] [<ffffffff807f61e8>] kernel_init+0x12c/0x138
[ 15.302056] [<ffffffff80003a22>] ret_from_exception+0x0/0xc
[ 15.338628] ---[ end trace 0a3fa0cc7f3393cd ]---
[ 15.339919] note: swapper/0[1] exited with preempt_count 1
[ 15.341995] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
[ 15.343889] SMP: stopping secondary CPUs
[ 16.802836] SMP: failed to stop secondary CPUs 0-3
[ 16.806264] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---
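
For readers decoding the register dump above: "cause: 000000000000000d" is
exception code 13, which the RISC-V privileged specification defines as a
load page fault, and "exitcode=0x0000000b" is signal 11 (SIGSEGV) on Linux.
That matches a read through a bad task pointer in walk_stackframe(). The
helper below is only an illustrative sketch; the function name is made up
here and is not a kernel API.

```c
#include <assert.h>
#include <string.h>

/*
 * Synchronous exception codes from the RISC-V privileged specification
 * (scause with the interrupt bit clear). Only the page-fault codes are
 * listed; every other value maps to "other".
 *
 * NOTE: riscv_page_fault_name() is a hypothetical helper for this
 * example, not a function that exists in the kernel.
 */
static const char *riscv_page_fault_name(unsigned long cause)
{
	switch (cause) {
	case 12:
		return "instruction page fault";
	case 13:
		return "load page fault";
	case 15:
		return "store/AMO page fault";
	default:
		return "other";
	}
}
```

Calling riscv_page_fault_name(0xd) with the value from the log selects the
"load page fault" entry.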

This patch fixes it by setting the task to current if it's NULL before
accessing it.

Signed-off-by: Changbin Du <[email protected]>
Fixes: 5d8544e2d0 ("RISC-V: Generic library routines and assembly")
---
arch/riscv/kernel/stacktrace.c | 6 ++++++
1 file changed, 6 insertions(+)

diff --git a/arch/riscv/kernel/stacktrace.c b/arch/riscv/kernel/stacktrace.c
index bde85fc53357..788b65eba965 100644
--- a/arch/riscv/kernel/stacktrace.c
+++ b/arch/riscv/kernel/stacktrace.c
@@ -23,6 +23,9 @@ void notrace walk_stackframe(struct task_struct *task, struct pt_regs *regs,
 {
 	unsigned long fp, sp, pc;
 
+	if (!task)
+		task = current;
+
 	if (regs) {
 		fp = frame_pointer(regs);
 		sp = user_stack_pointer(regs);
@@ -73,6 +76,9 @@ void notrace walk_stackframe(struct task_struct *task,
 	unsigned long sp, pc;
 	unsigned long *ksp;
 
+	if (!task)
+		task = current;
+
 	if (regs) {
 		sp = user_stack_pointer(regs);
 		pc = instruction_pointer(regs);
--
2.30.2
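
The pattern the patch applies can be tried outside the kernel. What follows
is a minimal userspace sketch, not the kernel code: struct task, init_task,
current_task and walk_start_sp() are stand-ins invented for this example
(the kernel uses struct task_struct and the 'current' macro), and the walker
is reduced to returning one saved field.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct task_struct. */
struct task {
	int pid;
	unsigned long saved_sp;	/* loosely models task->thread.sp */
};

/* Hypothetical stand-in for the kernel's 'current' macro. */
static struct task init_task = { .pid = 1, .saved_sp = 0x1000 };
static struct task *current_task = &init_task;

/*
 * Simplified model of a stack walker entry point. A NULL task means
 * "the caller wants the current task", so it is normalised before the
 * first dereference, as the two hunks in the patch do.
 */
static unsigned long walk_start_sp(struct task *task)
{
	if (!task)
		task = current_task;

	return task->saved_sp;
}
```

With the check in place, walk_start_sp(NULL) reads from current_task instead
of dereferencing a NULL pointer, while a non-NULL argument is used unchanged.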

2021-06-28 05:48:11

by Jisheng Zhang

Subject: Re: [PATCH] riscv: stacktrace: fix dump_backtrace/walk_stackframe with NULL task

On Sun, 27 Jun 2021 17:26:59 +0800
Changbin Du <[email protected]> wrote:

>
>
> Some places try to show backtrace with NULL task, and expect the task is
> 'current'. For example, dump_stack()->show_stack(NULL,...). So the
> stacktrace code should take care of this case.

I fixed this issue one week ago:

http://lists.infradead.org/pipermail/linux-riscv/2021-June/007258.html

> Signed-off-by: Changbin Du <[email protected]>
> Fixes: 5d8544e2d0 ("RISC-V: Generic library routines and assembly")

Hmm, the Fixes tag should be:
Fixes: eac2f3059e02 ("riscv: stacktrace: fix the riscv stacktrace when CONFIG_FRAME_POINTER enabled")

2021-06-29 00:39:18

by Changbin Du

Subject: Re: [PATCH] riscv: stacktrace: fix dump_backtrace/walk_stackframe with NULL task

On Mon, Jun 28, 2021 at 01:44:04PM +0800, Jisheng Zhang wrote:
> On Sun, 27 Jun 2021 17:26:59 +0800
> Changbin Du <[email protected]> wrote:
>
>
> >
> >
> > Some places try to show backtrace with NULL task, and expect the task is
> > 'current'. For example, dump_stack()->show_stack(NULL,...). So the
> > stacktrace code should take care of this case.
>
> I fixed this issue one week ago:
>
> http://lists.infradead.org/pipermail/linux-riscv/2021-June/007258.html
>
Good to know. Thanks!

--
Cheers,
Changbin Du

2021-07-27 22:17:55

by Changbin Du

Subject: Re: [PATCH] riscv: stacktrace: fix dump_backtrace/walk_stackframe with NULL task

2021-07-28 13:53:39

by Jisheng Zhang

Subject: Re: [PATCH] riscv: stacktrace: fix dump_backtrace/walk_stackframe with NULL task

On Wed, 28 Jul 2021 06:16:56 +0800
Changbin Du <[email protected]> wrote:

> On Mon, Jun 28, 2021 at 01:44:04PM +0800, Jisheng Zhang wrote:
> > On Sun, 27 Jun 2021 17:26:59 +0800
> > Changbin Du <[email protected]> wrote:
> >
> >
> > >
> > >
> > > Some places try to show backtrace with NULL task, and expect the task is
> > > 'current'. For example, dump_stack()->show_stack(NULL,...). So the
> > > stacktrace code should take care of this case.
> >
> > I fixed this issue one week ago:
> >
> > http://lists.infradead.org/pipermail/linux-riscv/2021-June/007258.html
>
> I still see this issue on mainline. Is your fix merged? Thanks!

Nope, the fix has been missed twice. Palmer has added the fix patch to the
fix branch; I hope it will be in the next rc.

Regards