2024-04-10 10:27:47

by Naresh Kamboju

Subject: BUG: KASAN: null-ptr-deref in _raw_spin_lock_irq next-20240410

The following kernel crash was noticed on the Linux next-20240410 tag while
running KUnit tests on qemu-arm64 and qemu-x86_64.

Reported-by: Linux Kernel Functional Testing <[email protected]>

Crash log on qemu-arm64:
----------------
<3>[ 30.465716] BUG: KASAN: null-ptr-deref in _raw_spin_lock_irq (include/linux/instrumented.h:96 include/linux/atomic/atomic-instrumented.h:1301 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:120 kernel/locking/spinlock.c:170)
<3>[ 30.467097] Write of size 4 at addr 0000000000000008 by task swapper/0/1
<3>[ 30.468059]
<3>[ 30.468393] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G B N 6.9.0-rc3-next-20240410 #1
<3>[ 30.469209] Hardware name: linux,dummy-virt (DT)
<3>[ 30.469645] Call trace:
<3>[ 30.469919] dump_backtrace (arch/arm64/kernel/stacktrace.c:319)
<3>[ 30.471622] show_stack (arch/arm64/kernel/stacktrace.c:326)
<3>[ 30.472124] dump_stack_lvl (lib/dump_stack.c:117)
<3>[ 30.472947] print_report (mm/kasan/report.c:493)
<3>[ 30.473755] kasan_report (mm/kasan/report.c:603)
<3>[ 30.474524] kasan_check_range (mm/kasan/generic.c:175 mm/kasan/generic.c:189)
<3>[ 30.475094] __kasan_check_write (mm/kasan/shadow.c:38)
<3>[ 30.475683] _raw_spin_lock_irq (include/linux/instrumented.h:96 include/linux/atomic/atomic-instrumented.h:1301 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:120 kernel/locking/spinlock.c:170)
<3>[ 30.476257] wait_for_completion_timeout (kernel/sched/completion.c:84 kernel/sched/completion.c:116 kernel/sched/completion.c:127 kernel/sched/completion.c:167)
<3>[ 30.476909] kunit_try_catch_run (lib/kunit/try-catch.c:86)
<3>[ 30.477628] kunit_run_case_catch_errors (lib/kunit/test.c:544)
<3>[ 30.478311] kunit_run_tests (lib/kunit/test.c:635)
<3>[ 30.478865] __kunit_test_suites_init (lib/kunit/test.c:729 (discriminator 1))
<3>[ 30.479482] kunit_run_all_tests (lib/kunit/executor.c:276 lib/kunit/executor.c:392)
<3>[ 30.480079] kernel_init_freeable (init/main.c:1578)
<3>[ 30.480747] kernel_init (init/main.c:1465)
<3>[ 30.481474] ret_from_fork (arch/arm64/kernel/entry.S:861)
<3>[ 30.482080] ==================================================================
<1>[ 30.484503] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
<1>[ 30.485369] Mem abort info:
<1>[ 30.485923] ESR = 0x000000009600006b
<1>[ 30.486943] EC = 0x25: DABT (current EL), IL = 32 bits
<1>[ 30.487540] SET = 0, FnV = 0
<1>[ 30.488007] EA = 0, S1PTW = 0
<1>[ 30.488509] FSC = 0x2b: level -1 translation fault
<1>[ 30.489150] Data abort info:
<1>[ 30.489610] ISV = 0, ISS = 0x0000006b, ISS2 = 0x00000000
<1>[ 30.490360] CM = 0, WnR = 1, TnD = 0, TagAccess = 0
<1>[ 30.491057] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
<1>[ 30.491822] [0000000000000008] user address but active_mm is swapper
<0>[ 30.493008] Internal error: Oops: 000000009600006b [#1] PREEMPT SMP
<4>[ 30.494105] Modules linked in:
<4>[ 30.496244] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G B N 6.9.0-rc3-next-20240410 #1
<4>[ 30.497171] Hardware name: linux,dummy-virt (DT)
<4>[ 30.497905] pstate: 224000c9 (nzCv daIF +PAN -UAO +TCO -DIT -SSBS BTYPE=--)
<4>[ 30.498895] pc : _raw_spin_lock_irq (arch/arm64/include/asm/atomic_lse.h:271 arch/arm64/include/asm/cmpxchg.h:120 arch/arm64/include/asm/cmpxchg.h:169 include/linux/atomic/atomic-arch-fallback.h:2055 include/linux/atomic/atomic-arch-fallback.h:2173 include/linux/atomic/atomic-instrumented.h:1302 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:120 kernel/locking/spinlock.c:170)
<4>[ 30.499542] lr : _raw_spin_lock_irq (include/linux/atomic/atomic-arch-fallback.h:2172 (discriminator 1) include/linux/atomic/atomic-instrumented.h:1302 (discriminator 1) include/asm-generic/qspinlock.h:111 (discriminator 1) include/linux/spinlock.h:187 (discriminator 1) include/linux/spinlock_api_smp.h:120 (discriminator 1) kernel/locking/spinlock.c:170 (discriminator 1))

<trim>

<4>[ 30.511022] Call trace:
<4>[ 30.511437] _raw_spin_lock_irq (arch/arm64/include/asm/atomic_lse.h:271 arch/arm64/include/asm/cmpxchg.h:120 arch/arm64/include/asm/cmpxchg.h:169 include/linux/atomic/atomic-arch-fallback.h:2055 include/linux/atomic/atomic-arch-fallback.h:2173 include/linux/atomic/atomic-instrumented.h:1302 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:120 kernel/locking/spinlock.c:170)
<4>[ 30.512013] wait_for_completion_timeout (kernel/sched/completion.c:84 kernel/sched/completion.c:116 kernel/sched/completion.c:127 kernel/sched/completion.c:167)
<4>[ 30.512627] kunit_try_catch_run (lib/kunit/try-catch.c:86)
<4>[ 30.513188] kunit_run_case_catch_errors (lib/kunit/test.c:544)
<4>[ 30.513801] kunit_run_tests (lib/kunit/test.c:635)
<4>[ 30.514674] __kunit_test_suites_init (lib/kunit/test.c:729 (discriminator 1))
<4>[ 30.515259] kunit_run_all_tests (lib/kunit/executor.c:276 lib/kunit/executor.c:392)
<4>[ 30.515831] kernel_init_freeable (init/main.c:1578)
<4>[ 30.516384] kernel_init (init/main.c:1465)
<4>[ 30.516900] ret_from_fork (arch/arm64/kernel/entry.S:861)
<0>[ 30.518151] Code: 93407c02 d503201f 2a0003e1 52800022 (88e17e62)
All code
========
0: 93407c02 sxtw x2, w0
4: d503201f nop
8: 2a0003e1 mov w1, w0
c: 52800022 mov w2, #0x1 // #1
10:* 88e17e62 casa w1, w2, [x19] <-- trapping instruction

Code starting with the faulting instruction
===========================================
0: 88e17e62 casa w1, w2, [x19]
<4>[ 30.519501] ---[ end trace 0000000000000000 ]---
<6>[ 30.520317] note: swapper/0[1] exited with irqs disabled
<6>[ 30.521355] note: swapper/0[1] exited with preempt_count 1
<0>[ 30.523129] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
<2>[ 30.524397] SMP: stopping secondary CPUs
<0>[ 30.525553] Kernel Offset: 0x25148d400000 from 0xffff800080000000
<0>[ 30.528341] PHYS_OFFSET: 0x40000000
<0>[ 30.529003] CPU features: 0x0,00000006,8f17bd7c,6766773f
<0>[ 30.530313] Memory Limit: none
<0>[ 30.531319] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---

Steps to reproduce:
---
https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/2etdCz631GU6PILJzs8reteba8i/reproducer

Links:
- https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20240410/testrun/23381881/suite/log-parser-test/tests/
- https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/2etdCz631GU6PILJzs8reteba8i
- https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/2etdDjlx3eRFhhK9cy2UsEHAXTr

--
Linaro LKFT
https://lkft.linaro.org


2024-04-10 15:24:40

by Will Deacon

Subject: Re: BUG: KASAN: null-ptr-deref in _raw_spin_lock_irq next-20240410

On Wed, Apr 10, 2024 at 03:57:10PM +0530, Naresh Kamboju wrote:
> Following kernel crash noticed on Linux next-20240410 tag while running
> kunit testing on qemu-arm64 and qemu-x86_64.
>
> Reported-by: Linux Kernel Functional Testing <[email protected]>
>
> Crash log on qemu-arm64:
> ----------------
> <3>[ 30.465716] BUG: KASAN: null-ptr-deref in _raw_spin_lock_irq (include/linux/instrumented.h:96 include/linux/atomic/atomic-instrumented.h:1301 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:120 kernel/locking/spinlock.c:170)
> <3>[ 30.467097] Write of size 4 at addr 0000000000000008 by task swapper/0/1
> <3>[ 30.468059]
> <3>[ 30.468393] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G B N 6.9.0-rc3-next-20240410 #1
> <3>[ 30.469209] Hardware name: linux,dummy-virt (DT)
> <3>[ 30.469645] Call trace:
> <3>[ 30.469919] dump_backtrace (arch/arm64/kernel/stacktrace.c:319)
> <3>[ 30.471622] show_stack (arch/arm64/kernel/stacktrace.c:326)
> <3>[ 30.472124] dump_stack_lvl (lib/dump_stack.c:117)
> <3>[ 30.472947] print_report (mm/kasan/report.c:493)
> <3>[ 30.473755] kasan_report (mm/kasan/report.c:603)
> <3>[ 30.474524] kasan_check_range (mm/kasan/generic.c:175 mm/kasan/generic.c:189)
> <3>[ 30.475094] __kasan_check_write (mm/kasan/shadow.c:38)
> <3>[ 30.475683] _raw_spin_lock_irq (include/linux/instrumented.h:96 include/linux/atomic/atomic-instrumented.h:1301 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:120 kernel/locking/spinlock.c:170)
> <3>[ 30.476257] wait_for_completion_timeout (kernel/sched/completion.c:84 kernel/sched/completion.c:116 kernel/sched/completion.c:127 kernel/sched/completion.c:167)
> <3>[ 30.476909] kunit_try_catch_run (lib/kunit/try-catch.c:86)
> <3>[ 30.477628] kunit_run_case_catch_errors (lib/kunit/test.c:544)
> <3>[ 30.478311] kunit_run_tests (lib/kunit/test.c:635)
> <3>[ 30.478865] __kunit_test_suites_init (lib/kunit/test.c:729 (discriminator 1))
> <3>[ 30.479482] kunit_run_all_tests (lib/kunit/executor.c:276 lib/kunit/executor.c:392)
> <3>[ 30.480079] kernel_init_freeable (init/main.c:1578)
> <3>[ 30.480747] kernel_init (init/main.c:1465)
> <3>[ 30.481474] ret_from_fork (arch/arm64/kernel/entry.S:861)
> <3>[ 30.482080] ==================================================================
> <1>[ 30.484503] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
> <1>[ 30.485369] Mem abort info:
> <1>[ 30.485923] ESR = 0x000000009600006b
> <1>[ 30.486943] EC = 0x25: DABT (current EL), IL = 32 bits
> <1>[ 30.487540] SET = 0, FnV = 0
> <1>[ 30.488007] EA = 0, S1PTW = 0
> <1>[ 30.488509] FSC = 0x2b: level -1 translation fault
> <1>[ 30.489150] Data abort info:
> <1>[ 30.489610] ISV = 0, ISS = 0x0000006b, ISS2 = 0x00000000
> <1>[ 30.490360] CM = 0, WnR = 1, TnD = 0, TagAccess = 0
> <1>[ 30.491057] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> <1>[ 30.491822] [0000000000000008] user address but active_mm is swapper
> <0>[ 30.493008] Internal error: Oops: 000000009600006b [#1] PREEMPT SMP
> <4>[ 30.494105] Modules linked in:
> <4>[ 30.496244] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G B N 6.9.0-rc3-next-20240410 #1
> <4>[ 30.497171] Hardware name: linux,dummy-virt (DT)
> <4>[ 30.497905] pstate: 224000c9 (nzCv daIF +PAN -UAO +TCO -DIT -SSBS BTYPE=--)
> <4>[ 30.498895] pc : _raw_spin_lock_irq (arch/arm64/include/asm/atomic_lse.h:271 arch/arm64/include/asm/cmpxchg.h:120 arch/arm64/include/asm/cmpxchg.h:169 include/linux/atomic/atomic-arch-fallback.h:2055 include/linux/atomic/atomic-arch-fallback.h:2173 include/linux/atomic/atomic-instrumented.h:1302 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:120 kernel/locking/spinlock.c:170)
> <4>[ 30.499542] lr : _raw_spin_lock_irq (include/linux/atomic/atomic-arch-fallback.h:2172 (discriminator 1) include/linux/atomic/atomic-instrumented.h:1302 (discriminator 1) include/asm-generic/qspinlock.h:111 (discriminator 1) include/linux/spinlock.h:187 (discriminator 1) include/linux/spinlock_api_smp.h:120 (discriminator 1) kernel/locking/spinlock.c:170 (discriminator 1))
>
> <trim>

It's a shame that you have trimmed the register dump here.

> <4>[ 30.511022] Call trace:
> <4>[ 30.511437] _raw_spin_lock_irq (arch/arm64/include/asm/atomic_lse.h:271 arch/arm64/include/asm/cmpxchg.h:120 arch/arm64/include/asm/cmpxchg.h:169 include/linux/atomic/atomic-arch-fallback.h:2055 include/linux/atomic/atomic-arch-fallback.h:2173 include/linux/atomic/atomic-instrumented.h:1302 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:120 kernel/locking/spinlock.c:170)
> <4>[ 30.512013] wait_for_completion_timeout (kernel/sched/completion.c:84 kernel/sched/completion.c:116 kernel/sched/completion.c:127 kernel/sched/completion.c:167)
> <4>[ 30.512627] kunit_try_catch_run (lib/kunit/try-catch.c:86)
> <4>[ 30.513188] kunit_run_case_catch_errors (lib/kunit/test.c:544)
> <4>[ 30.513801] kunit_run_tests (lib/kunit/test.c:635)

Ok, so 'task_struct->vfork_done' is NULL. Looks like this code was added
recently, so adding Mickaël to cc.

Will

> <4>[ 30.514674] __kunit_test_suites_init (lib/kunit/test.c:729 (discriminator 1))
> <4>[ 30.515259] kunit_run_all_tests (lib/kunit/executor.c:276 lib/kunit/executor.c:392)
> <4>[ 30.515831] kernel_init_freeable (init/main.c:1578)
> <4>[ 30.516384] kernel_init (init/main.c:1465)
> <4>[ 30.516900] ret_from_fork (arch/arm64/kernel/entry.S:861)
> <0>[ 30.518151] Code: 93407c02 d503201f 2a0003e1 52800022 (88e17e62)
> All code
> ========
> 0: 93407c02 sxtw x2, w0
> 4: d503201f nop
> 8: 2a0003e1 mov w1, w0
> c: 52800022 mov w2, #0x1 // #1
> 10:* 88e17e62 casa w1, w2, [x19] <-- trapping instruction
>
> Code starting with the faulting instruction
> ===========================================
> 0: 88e17e62 casa w1, w2, [x19]
> <4>[ 30.519501] ---[ end trace 0000000000000000 ]---
> <6>[ 30.520317] note: swapper/0[1] exited with irqs disabled
> <6>[ 30.521355] note: swapper/0[1] exited with preempt_count 1
> <0>[ 30.523129] Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b
> <2>[ 30.524397] SMP: stopping secondary CPUs
> <0>[ 30.525553] Kernel Offset: 0x25148d400000 from 0xffff800080000000
> <0>[ 30.528341] PHYS_OFFSET: 0x40000000
> <0>[ 30.529003] CPU features: 0x0,00000006,8f17bd7c,6766773f
> <0>[ 30.530313] Memory Limit: none
> <0>[ 30.531319] ---[ end Kernel panic - not syncing: Attempted to kill init! exitcode=0x0000000b ]---
>
> Steps to reproduce:
> ---
> https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/2etdCz631GU6PILJzs8reteba8i/reproducer
>
> Links:
> - https://qa-reports.linaro.org/lkft/linux-next-master/build/next-20240410/testrun/23381881/suite/log-parser-test/tests/
> - https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/2etdCz631GU6PILJzs8reteba8i
> - https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/2etdDjlx3eRFhhK9cy2UsEHAXTr
>
> --
> Linaro LKFT
> https://lkft.linaro.org

2024-04-10 17:17:13

by Naresh Kamboju

Subject: Re: BUG: KASAN: null-ptr-deref in _raw_spin_lock_irq next-20240410

On Wed, 10 Apr 2024 at 20:53, Will Deacon <[email protected]> wrote:
>
> On Wed, Apr 10, 2024 at 03:57:10PM +0530, Naresh Kamboju wrote:
> > Following kernel crash noticed on Linux next-20240410 tag while running
> > kunit testing on qemu-arm64 and qemu-x86_64.
> >
> > Reported-by: Linux Kernel Functional Testing <[email protected]>
> >
> > Crash log on qemu-arm64:
> > ----------------
> > <3>[ 30.465716] BUG: KASAN: null-ptr-deref in _raw_spin_lock_irq (include/linux/instrumented.h:96 include/linux/atomic/atomic-instrumented.h:1301 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:120 kernel/locking/spinlock.c:170)

> It's a shame that you have trimmed the register dump here.

My apologies for that. The detailed crash log is attached, and
the links are provided at the end of this email.

>
> > <4>[ 30.511022] Call trace:
> > <4>[ 30.511437] _raw_spin_lock_irq (arch/arm64/include/asm/atomic_lse.h:271 arch/arm64/include/asm/cmpxchg.h:120 arch/arm64/include/asm/cmpxchg.h:169 include/linux/atomic/atomic-arch-fallback.h:2055 include/linux/atomic/atomic-arch-fallback.h:2173 include/linux/atomic/atomic-instrumented.h:1302 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:120 kernel/locking/spinlock.c:170)
> > <4>[ 30.512013] wait_for_completion_timeout (kernel/sched/completion.c:84 kernel/sched/completion.c:116 kernel/sched/completion.c:127 kernel/sched/completion.c:167)
> > <4>[ 30.512627] kunit_try_catch_run (lib/kunit/try-catch.c:86)
> > <4>[ 30.513188] kunit_run_case_catch_errors (lib/kunit/test.c:544)
> > <4>[ 30.513801] kunit_run_tests (lib/kunit/test.c:635)
>
> Ok, so 'task_struct->vfork_done' is NULL. Looks like this code was added
> recently, so adding Mickaël to cc.
>
> Will

Thank you.

- Naresh


Attachments:
output-kasan-kernel-crash.txt (6.71 kB)

2024-04-10 17:25:16

by Naresh Kamboju

Subject: Re: BUG: KASAN: null-ptr-deref in _raw_spin_lock_irq next-20240410

On Wed, 10 Apr 2024 at 22:44, Naresh Kamboju <[email protected]> wrote:
>
> On Wed, 10 Apr 2024 at 20:53, Will Deacon <[email protected]> wrote:
> >
> > On Wed, Apr 10, 2024 at 03:57:10PM +0530, Naresh Kamboju wrote:
> > > Following kernel crash noticed on Linux next-20240410 tag while running
> > > kunit testing on qemu-arm64 and qemu-x86_64.
> > >
> > > Reported-by: Linux Kernel Functional Testing <[email protected]>
> > >
> > > Crash log on qemu-arm64:
> > > ----------------
> > > <3>[ 30.465716] BUG: KASAN: null-ptr-deref in _raw_spin_lock_irq (include/linux/instrumented.h:96 include/linux/atomic/atomic-instrumented.h:1301 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:120 kernel/locking/spinlock.c:170)
>
> > It's a shame that you have trimmed the register dump here.
>
> My apologies for that, the detailed crash log is attached and
> the links provided in the tail of this email.
>
> >
> > > <4>[ 30.511022] Call trace:
> > > <4>[ 30.511437] _raw_spin_lock_irq (arch/arm64/include/asm/atomic_lse.h:271 arch/arm64/include/asm/cmpxchg.h:120 arch/arm64/include/asm/cmpxchg.h:169 include/linux/atomic/atomic-arch-fallback.h:2055 include/linux/atomic/atomic-arch-fallback.h:2173 include/linux/atomic/atomic-instrumented.h:1302 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:120 kernel/locking/spinlock.c:170)
> > > <4>[ 30.512013] wait_for_completion_timeout (kernel/sched/completion.c:84 kernel/sched/completion.c:116 kernel/sched/completion.c:127 kernel/sched/completion.c:167)
> > > <4>[ 30.512627] kunit_try_catch_run (lib/kunit/try-catch.c:86)
> > > <4>[ 30.513188] kunit_run_case_catch_errors (lib/kunit/test.c:544)
> > > <4>[ 30.513801] kunit_run_tests (lib/kunit/test.c:635)
> >
> > Ok, so 'task_struct->vfork_done' is NULL. Looks like this code was added
> > recently, so adding Mickaël to cc.
> >
> > Will

The decoded stack trace dump file for arm64 is attached.

- Naresh


Attachments:
output-kasan-kunit-kernel-crash.txt (6.79 kB)

2024-04-11 04:27:59

by David Gow

Subject: Re: BUG: KASAN: null-ptr-deref in _raw_spin_lock_irq next-20240410

On Wed, 10 Apr 2024 at 23:23, Will Deacon <[email protected]> wrote:
>
> On Wed, Apr 10, 2024 at 03:57:10PM +0530, Naresh Kamboju wrote:
> > Following kernel crash noticed on Linux next-20240410 tag while running
> > kunit testing on qemu-arm64 and qemu-x86_64.
> >
> > Reported-by: Linux Kernel Functional Testing <[email protected]>
> >
> > Crash log on qemu-arm64:
> > ----------------
> > <3>[ 30.465716] BUG: KASAN: null-ptr-deref in _raw_spin_lock_irq (include/linux/instrumented.h:96 include/linux/atomic/atomic-instrumented.h:1301 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:120 kernel/locking/spinlock.c:170)
> > <3>[ 30.467097] Write of size 4 at addr 0000000000000008 by task swapper/0/1
> > <3>[ 30.468059]
> > <3>[ 30.468393] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G B N 6.9.0-rc3-next-20240410 #1
> > <3>[ 30.469209] Hardware name: linux,dummy-virt (DT)
> > <3>[ 30.469645] Call trace:
> > <3>[ 30.469919] dump_backtrace (arch/arm64/kernel/stacktrace.c:319)
> > <3>[ 30.471622] show_stack (arch/arm64/kernel/stacktrace.c:326)
> > <3>[ 30.472124] dump_stack_lvl (lib/dump_stack.c:117)
> > <3>[ 30.472947] print_report (mm/kasan/report.c:493)
> > <3>[ 30.473755] kasan_report (mm/kasan/report.c:603)
> > <3>[ 30.474524] kasan_check_range (mm/kasan/generic.c:175 mm/kasan/generic.c:189)
> > <3>[ 30.475094] __kasan_check_write (mm/kasan/shadow.c:38)
> > <3>[ 30.475683] _raw_spin_lock_irq (include/linux/instrumented.h:96 include/linux/atomic/atomic-instrumented.h:1301 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:120 kernel/locking/spinlock.c:170)
> > <3>[ 30.476257] wait_for_completion_timeout (kernel/sched/completion.c:84 kernel/sched/completion.c:116 kernel/sched/completion.c:127 kernel/sched/completion.c:167)
> > <3>[ 30.476909] kunit_try_catch_run (lib/kunit/try-catch.c:86)
> > <3>[ 30.477628] kunit_run_case_catch_errors (lib/kunit/test.c:544)
> > <3>[ 30.478311] kunit_run_tests (lib/kunit/test.c:635)
> > <3>[ 30.478865] __kunit_test_suites_init (lib/kunit/test.c:729 (discriminator 1))
> > <3>[ 30.479482] kunit_run_all_tests (lib/kunit/executor.c:276 lib/kunit/executor.c:392)
> > <3>[ 30.480079] kernel_init_freeable (init/main.c:1578)
> > <3>[ 30.480747] kernel_init (init/main.c:1465)
> > <3>[ 30.481474] ret_from_fork (arch/arm64/kernel/entry.S:861)
> > <3>[ 30.482080] ==================================================================
> > <1>[ 30.484503] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
> > <1>[ 30.485369] Mem abort info:
> > <1>[ 30.485923] ESR = 0x000000009600006b
> > <1>[ 30.486943] EC = 0x25: DABT (current EL), IL = 32 bits
> > <1>[ 30.487540] SET = 0, FnV = 0
> > <1>[ 30.488007] EA = 0, S1PTW = 0
> > <1>[ 30.488509] FSC = 0x2b: level -1 translation fault
> > <1>[ 30.489150] Data abort info:
> > <1>[ 30.489610] ISV = 0, ISS = 0x0000006b, ISS2 = 0x00000000
> > <1>[ 30.490360] CM = 0, WnR = 1, TnD = 0, TagAccess = 0
> > <1>[ 30.491057] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> > <1>[ 30.491822] [0000000000000008] user address but active_mm is swapper
> > <0>[ 30.493008] Internal error: Oops: 000000009600006b [#1] PREEMPT SMP
> > <4>[ 30.494105] Modules linked in:
> > <4>[ 30.496244] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G B N 6.9.0-rc3-next-20240410 #1
> > <4>[ 30.497171] Hardware name: linux,dummy-virt (DT)
> > <4>[ 30.497905] pstate: 224000c9 (nzCv daIF +PAN -UAO +TCO -DIT -SSBS BTYPE=--)
> > <4>[ 30.498895] pc : _raw_spin_lock_irq (arch/arm64/include/asm/atomic_lse.h:271 arch/arm64/include/asm/cmpxchg.h:120 arch/arm64/include/asm/cmpxchg.h:169 include/linux/atomic/atomic-arch-fallback.h:2055 include/linux/atomic/atomic-arch-fallback.h:2173 include/linux/atomic/atomic-instrumented.h:1302 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:120 kernel/locking/spinlock.c:170)
> > <4>[ 30.499542] lr : _raw_spin_lock_irq (include/linux/atomic/atomic-arch-fallback.h:2172 (discriminator 1) include/linux/atomic/atomic-instrumented.h:1302 (discriminator 1) include/asm-generic/qspinlock.h:111 (discriminator 1) include/linux/spinlock.h:187 (discriminator 1) include/linux/spinlock_api_smp.h:120 (discriminator 1) kernel/locking/spinlock.c:170 (discriminator 1))
> >
> > <trim>
>
> It's a shame that you have trimmed the register dump here.
>
> > <4>[ 30.511022] Call trace:
> > <4>[ 30.511437] _raw_spin_lock_irq (arch/arm64/include/asm/atomic_lse.h:271 arch/arm64/include/asm/cmpxchg.h:120 arch/arm64/include/asm/cmpxchg.h:169 include/linux/atomic/atomic-arch-fallback.h:2055 include/linux/atomic/atomic-arch-fallback.h:2173 include/linux/atomic/atomic-instrumented.h:1302 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:120 kernel/locking/spinlock.c:170)
> > <4>[ 30.512013] wait_for_completion_timeout (kernel/sched/completion.c:84 kernel/sched/completion.c:116 kernel/sched/completion.c:127 kernel/sched/completion.c:167)
> > <4>[ 30.512627] kunit_try_catch_run (lib/kunit/try-catch.c:86)
> > <4>[ 30.513188] kunit_run_case_catch_errors (lib/kunit/test.c:544)
> > <4>[ 30.513801] kunit_run_tests (lib/kunit/test.c:635)
>
> Ok, so 'task_struct->vfork_done' is NULL. Looks like this code was added
> recently, so adding Mickaël to cc.
>

Thanks. This looks like a race condition where the KUnit test kthread
can terminate before we wait on it.

Mickaël, does this seem like a correct fix to you?
---
From: David Gow <[email protected]>
Date: Thu, 11 Apr 2024 12:07:47 +0800
Subject: [PATCH] kunit: Fix race condition in try-catch completion

KUnit's try-catch infrastructure now uses vfork_done, which is always
set to a valid completion when a kthread is crated, but which is set to
NULL once the thread terminates. This creates a race condition, where
the kthread exits before we can wait on it.

Keep a copy of vfork_done, which is taken before we wake_up_process()
and so valid, and wait on that instead.

Fixes: 4de2a8e4cca4 ("kunit: Handle test faults")
Reported-by: Linux Kernel Functional Testing <[email protected]>
Signed-off-by: David Gow <[email protected]>
---
lib/kunit/try-catch.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/lib/kunit/try-catch.c b/lib/kunit/try-catch.c
index fa687278ccc9..fc6cd4d7e80f 100644
--- a/lib/kunit/try-catch.c
+++ b/lib/kunit/try-catch.c
@@ -63,6 +63,7 @@ void kunit_try_catch_run(struct kunit_try_catch *try_catch, void *context)
 {
 	struct kunit *test = try_catch->test;
 	struct task_struct *task_struct;
+	struct completion *task_done;
 	int exit_code, time_remaining;
 
 	try_catch->context = context;
@@ -75,13 +76,14 @@ void kunit_try_catch_run(struct kunit_try_catch *try_catch, void *context)
 		return;
 	}
 	get_task_struct(task_struct);
+	task_done = task_struct->vfork_done;
 	wake_up_process(task_struct);
 	/*
 	 * As for a vfork(2), task_struct->vfork_done (pointing to the
 	 * underlying kthread->exited) can be used to wait for the end of a
 	 * kernel thread.
 	 */
-	time_remaining = wait_for_completion_timeout(task_struct->vfork_done,
+	time_remaining = wait_for_completion_timeout(task_done,
 						     kunit_test_timeout());
 	if (time_remaining == 0) {
 		try_catch->try_result = -ETIMEDOUT;
--


Attachments:
smime.p7s (3.92 kB)
S/MIME Cryptographic Signature
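
The ordering problem described in the patch above can be modeled in plain
userspace C. This is only a sketch under stated assumptions, not kernel code:
fake_completion, fake_task and every helper below are hypothetical stand-ins
invented for illustration, and the "thread" is modeled sequentially as the
worst-case interleaving in which it has already exited by the time the waiter
runs. It shows why re-reading vfork_done after the wake-up can yield NULL,
while a copy taken before the wake-up stays valid.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct completion: just a "done" flag. */
struct fake_completion {
	bool done;
};

/* Hypothetical stand-in for the relevant parts of task_struct/kthread. */
struct fake_task {
	struct fake_completion exited;      /* plays the role of kthread->exited */
	struct fake_completion *vfork_done; /* points at exited while alive */
};

/* Models thread creation: vfork_done starts out valid. */
static void fake_task_create(struct fake_task *t)
{
	t->exited.done = false;
	t->vfork_done = &t->exited;
}

/* Models the thread running to completion: signal, then clear the pointer. */
static void fake_task_run_and_exit(struct fake_task *t)
{
	t->vfork_done->done = true;
	t->vfork_done = NULL;
}

/* Returns 1 if completed, 0 on "timeout", -1 if handed a NULL pointer. */
static int fake_wait(const struct fake_completion *c)
{
	if (c == NULL)
		return -1; /* the real kernel would dereference NULL here */
	return c->done ? 1 : 0;
}

/* Buggy ordering: read task->vfork_done only after the thread has exited. */
static int wait_buggy(struct fake_task *t)
{
	fake_task_run_and_exit(t);
	return fake_wait(t->vfork_done);
}

/* Fixed ordering: take a copy before the thread is allowed to run. */
static int wait_fixed(struct fake_task *t)
{
	struct fake_completion *task_done = t->vfork_done;

	fake_task_run_and_exit(t);
	return fake_wait(task_done);
}
```

In this model the copy stays usable because the completion object itself
outlives the pointer being cleared; in the kernel, that lifetime is presumably
what the get_task_struct() reference in the patch context is relied on for.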

2024-04-11 08:49:04

by Mickaël Salaün

Subject: Re: BUG: KASAN: null-ptr-deref in _raw_spin_lock_irq next-20240410

On Thu, Apr 11, 2024 at 12:25:40PM +0800, David Gow wrote:
> On Wed, 10 Apr 2024 at 23:23, Will Deacon <[email protected]> wrote:
> >
> > On Wed, Apr 10, 2024 at 03:57:10PM +0530, Naresh Kamboju wrote:
> > > Following kernel crash noticed on Linux next-20240410 tag while running
> > > kunit testing on qemu-arm64 and qemu-x86_64.
> > >
> > > Reported-by: Linux Kernel Functional Testing <[email protected]>
> > >
> > > Crash log on qemu-arm64:
> > > ----------------
> > > <3>[ 30.465716] BUG: KASAN: null-ptr-deref in _raw_spin_lock_irq (include/linux/instrumented.h:96 include/linux/atomic/atomic-instrumented.h:1301 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:120 kernel/locking/spinlock.c:170)
> > > <3>[ 30.467097] Write of size 4 at addr 0000000000000008 by task swapper/0/1
> > > <3>[ 30.468059]
> > > <3>[ 30.468393] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G B N 6.9.0-rc3-next-20240410 #1
> > > <3>[ 30.469209] Hardware name: linux,dummy-virt (DT)
> > > <3>[ 30.469645] Call trace:
> > > <3>[ 30.469919] dump_backtrace (arch/arm64/kernel/stacktrace.c:319)
> > > <3>[ 30.471622] show_stack (arch/arm64/kernel/stacktrace.c:326)
> > > <3>[ 30.472124] dump_stack_lvl (lib/dump_stack.c:117)
> > > <3>[ 30.472947] print_report (mm/kasan/report.c:493)
> > > <3>[ 30.473755] kasan_report (mm/kasan/report.c:603)
> > > <3>[ 30.474524] kasan_check_range (mm/kasan/generic.c:175 mm/kasan/generic.c:189)
> > > <3>[ 30.475094] __kasan_check_write (mm/kasan/shadow.c:38)
> > > <3>[ 30.475683] _raw_spin_lock_irq (include/linux/instrumented.h:96 include/linux/atomic/atomic-instrumented.h:1301 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:120 kernel/locking/spinlock.c:170)
> > > <3>[ 30.476257] wait_for_completion_timeout (kernel/sched/completion.c:84 kernel/sched/completion.c:116 kernel/sched/completion.c:127 kernel/sched/completion.c:167)
> > > <3>[ 30.476909] kunit_try_catch_run (lib/kunit/try-catch.c:86)
> > > <3>[ 30.477628] kunit_run_case_catch_errors (lib/kunit/test.c:544)
> > > <3>[ 30.478311] kunit_run_tests (lib/kunit/test.c:635)
> > > <3>[ 30.478865] __kunit_test_suites_init (lib/kunit/test.c:729 (discriminator 1))
> > > <3>[ 30.479482] kunit_run_all_tests (lib/kunit/executor.c:276 lib/kunit/executor.c:392)
> > > <3>[ 30.480079] kernel_init_freeable (init/main.c:1578)
> > > <3>[ 30.480747] kernel_init (init/main.c:1465)
> > > <3>[ 30.481474] ret_from_fork (arch/arm64/kernel/entry.S:861)
> > > <3>[ 30.482080] ==================================================================
> > > <1>[ 30.484503] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000008
> > > <1>[ 30.485369] Mem abort info:
> > > <1>[ 30.485923] ESR = 0x000000009600006b
> > > <1>[ 30.486943] EC = 0x25: DABT (current EL), IL = 32 bits
> > > <1>[ 30.487540] SET = 0, FnV = 0
> > > <1>[ 30.488007] EA = 0, S1PTW = 0
> > > <1>[ 30.488509] FSC = 0x2b: level -1 translation fault
> > > <1>[ 30.489150] Data abort info:
> > > <1>[ 30.489610] ISV = 0, ISS = 0x0000006b, ISS2 = 0x00000000
> > > <1>[ 30.490360] CM = 0, WnR = 1, TnD = 0, TagAccess = 0
> > > <1>[ 30.491057] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
> > > <1>[ 30.491822] [0000000000000008] user address but active_mm is swapper
> > > <0>[ 30.493008] Internal error: Oops: 000000009600006b [#1] PREEMPT SMP
> > > <4>[ 30.494105] Modules linked in:
> > > <4>[ 30.496244] CPU: 0 PID: 1 Comm: swapper/0 Tainted: G B N 6.9.0-rc3-next-20240410 #1
> > > <4>[ 30.497171] Hardware name: linux,dummy-virt (DT)
> > > <4>[ 30.497905] pstate: 224000c9 (nzCv daIF +PAN -UAO +TCO -DIT -SSBS BTYPE=--)
> > > <4>[ 30.498895] pc : _raw_spin_lock_irq (arch/arm64/include/asm/atomic_lse.h:271 arch/arm64/include/asm/cmpxchg.h:120 arch/arm64/include/asm/cmpxchg.h:169 include/linux/atomic/atomic-arch-fallback.h:2055 include/linux/atomic/atomic-arch-fallback.h:2173 include/linux/atomic/atomic-instrumented.h:1302 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:120 kernel/locking/spinlock.c:170)
> > > <4>[ 30.499542] lr : _raw_spin_lock_irq (include/linux/atomic/atomic-arch-fallback.h:2172 (discriminator 1) include/linux/atomic/atomic-instrumented.h:1302 (discriminator 1) include/asm-generic/qspinlock.h:111 (discriminator 1) include/linux/spinlock.h:187 (discriminator 1) include/linux/spinlock_api_smp.h:120 (discriminator 1) kernel/locking/spinlock.c:170 (discriminator 1))
> > >
> > > <trim>
> >
> > It's a shame that you have trimmed the register dump here.
> >
> > > <4>[ 30.511022] Call trace:
> > > <4>[ 30.511437] _raw_spin_lock_irq (arch/arm64/include/asm/atomic_lse.h:271 arch/arm64/include/asm/cmpxchg.h:120 arch/arm64/include/asm/cmpxchg.h:169 include/linux/atomic/atomic-arch-fallback.h:2055 include/linux/atomic/atomic-arch-fallback.h:2173 include/linux/atomic/atomic-instrumented.h:1302 include/asm-generic/qspinlock.h:111 include/linux/spinlock.h:187 include/linux/spinlock_api_smp.h:120 kernel/locking/spinlock.c:170)
> > > <4>[ 30.512013] wait_for_completion_timeout (kernel/sched/completion.c:84 kernel/sched/completion.c:116 kernel/sched/completion.c:127 kernel/sched/completion.c:167)
> > > <4>[ 30.512627] kunit_try_catch_run (lib/kunit/try-catch.c:86)
> > > <4>[ 30.513188] kunit_run_case_catch_errors (lib/kunit/test.c:544)
> > > <4>[ 30.513801] kunit_run_tests (lib/kunit/test.c:635)
> >
> > Ok, so 'task_struct->vfork_done' is NULL. Looks like this code was added
> > recently, so adding Mickaël to cc.
> >
>
> Thanks. This looks like a race condition where the KUnit test kthread
> can terminate before we wait on it.
>
> Mickaël, does this seem like a correct fix to you?
> ---
> From: David Gow <[email protected]>
> Date: Thu, 11 Apr 2024 12:07:47 +0800
> Subject: [PATCH] kunit: Fix race condition in try-catch completion
>
> KUnit's try-catch infrastructure now uses vfork_done, which is always
> set to a valid completion when a kthread is crated, but which is set to

s/crated/created/

> NULL once the thread terminates. This creates a race condition, where
> the kthread exits before we can wait on it.
>
> Keep a copy of vfork_done, which is taken before we wake_up_process()
> and so valid, and wait on that instead.
>
> Fixes: 4de2a8e4cca4 ("kunit: Handle test faults")
> Reported-by: Linux Kernel Functional Testing <[email protected]>
> Signed-off-by: David Gow <[email protected]>

Minor suggestions, but it looks good. Thanks!

Acked-by: Mickaël Salaün <[email protected]>


> ---
> lib/kunit/try-catch.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/lib/kunit/try-catch.c b/lib/kunit/try-catch.c
> index fa687278ccc9..fc6cd4d7e80f 100644
> --- a/lib/kunit/try-catch.c
> +++ b/lib/kunit/try-catch.c
> @@ -63,6 +63,7 @@ void kunit_try_catch_run(struct kunit_try_catch
> *try_catch, void *context)
> {
> struct kunit *test = try_catch->test;
> struct task_struct *task_struct;
> + struct completion *task_done;
> int exit_code, time_remaining;
>
> try_catch->context = context;
> @@ -75,13 +76,14 @@ void kunit_try_catch_run(struct kunit_try_catch
> *try_catch, void *context)
> return;
> }
> get_task_struct(task_struct);
> + task_done = task_struct->vfork_done;
> wake_up_process(task_struct);

> /*
> * As for a vfork(2), task_struct->vfork_done (pointing to the
> * underlying kthread->exited) can be used to wait for the end of a
> * kernel thread.

"kernel thread. It is set to NULL when the thread ends."

> */

This block comment can now be moved up where task_done is set.

> - time_remaining = wait_for_completion_timeout(task_struct->vfork_done,
> + time_remaining = wait_for_completion_timeout(task_done,
> kunit_test_timeout());
> if (time_remaining == 0) {
> try_catch->try_result = -ETIMEDOUT;
> --
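
For readers following along, the pattern in the patch can be illustrated outside the kernel. What follows is a minimal userspace sketch under stated assumptions, not the KUnit code: `struct completion`, `struct task`, `done_ptr`, `worker()` and `run_and_wait()` are hypothetical stand-ins built on pthreads, with `done_ptr` playing the role of `vfork_done`. The worker clears the shared pointer when it exits, so the waiter takes its copy before the worker is started and waits on that copy.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the kernel's struct completion. */
struct completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	bool done;
};

/* Hypothetical stand-in for task_struct; done_ptr mimics vfork_done. */
struct task {
	struct completion exited;
	struct completion *done_ptr;
};

static void complete(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done = true;
	pthread_cond_signal(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

static void wait_for_completion(struct completion *c)
{
	pthread_mutex_lock(&c->lock);
	while (!c->done)
		pthread_cond_wait(&c->cond, &c->lock);
	pthread_mutex_unlock(&c->lock);
}

/* The worker clears the shared pointer on exit, then signals completion. */
static void *worker(void *arg)
{
	struct task *t = arg;
	struct completion *c = t->done_ptr;

	assert(c != NULL);
	t->done_ptr = NULL;
	complete(c);
	return NULL;
}

/* Returns 0 on success; *cleared reports whether done_ptr ended up NULL. */
static int run_and_wait(bool *cleared)
{
	struct task t;
	pthread_t thread;
	struct completion *task_done;
	int ret = 0;

	pthread_mutex_init(&t.exited.lock, NULL);
	pthread_cond_init(&t.exited.cond, NULL);
	t.exited.done = false;
	t.done_ptr = &t.exited;

	/* Snapshot the pointer before the worker can run and clear it. */
	task_done = t.done_ptr;

	if (pthread_create(&thread, NULL, worker, &t) != 0) {
		ret = -1;
		goto out;
	}

	/* Waiting on t.done_ptr here instead could hand NULL to the wait. */
	wait_for_completion(task_done);
	pthread_join(thread, NULL);
	*cleared = (t.done_ptr == NULL);
out:
	pthread_cond_destroy(&t.exited.cond);
	pthread_mutex_destroy(&t.exited.lock);
	return ret;
}
```

Calling `run_and_wait()` should return 0 and report that the shared pointer was cleared by the time the wait finished, which is the state the original code could observe too early. The completion object itself outlives the pointer in this sketch, which is why waiting on the saved copy is safe here.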



2024-04-11 14:45:09

by Naresh Kamboju

Subject: Re: BUG: KASAN: null-ptr-deref in _raw_spin_lock_irq next-20240410

On Thu, 11 Apr 2024 at 09:55, David Gow <[email protected]> wrote:
>
> On Wed, 10 Apr 2024 at 23:23, Will Deacon <[email protected]> wrote:
> >
> > On Wed, Apr 10, 2024 at 03:57:10PM +0530, Naresh Kamboju wrote:
> > > Following kernel crash noticed on Linux next-20240410 tag while running
> > > kunit testing on qemu-arm64 and qemu-x86_64.
> > >
> > > Reported-by: Linux Kernel Functional Testing <[email protected]>
> > >

<trim>

> >
> > Ok, so 'task_struct->vfork_done' is NULL. Looks like this code was added
> > recently, so adding Mickaël to cc.
> >
>
> Thanks. This looks like a race condition where the KUnit test kthread
> can terminate before we wait on it.
>
> Mickaël, does this seem like a correct fix to you?
> ---
> From: David Gow <[email protected]>
> Date: Thu, 11 Apr 2024 12:07:47 +0800
> Subject: [PATCH] kunit: Fix race condition in try-catch completion
>
> KUnit's try-catch infrastructure now uses vfork_done, which is always
> set to a valid completion when a kthread is crated, but which is set to
> NULL once the thread terminates. This creates a race condition, where
> the kthread exits before we can wait on it.
>
> Keep a copy of vfork_done, which is taken before we wake_up_process()
> and so valid, and wait on that instead.
>
> Fixes: 4de2a8e4cca4 ("kunit: Handle test faults")
> Reported-by: Linux Kernel Functional Testing <[email protected]>
> Signed-off-by: David Gow <[email protected]>

I tested this patch on top of Linux next, and the reported issues are fixed.

Tested-by: Linux Kernel Functional Testing <[email protected]>

> ---
> lib/kunit/try-catch.c | 4 +++-
> 1 file changed, 3 insertions(+), 1 deletion(-)
>
> diff --git a/lib/kunit/try-catch.c b/lib/kunit/try-catch.c
> index fa687278ccc9..fc6cd4d7e80f 100644
> --- a/lib/kunit/try-catch.c
> +++ b/lib/kunit/try-catch.c
> @@ -63,6 +63,7 @@ void kunit_try_catch_run(struct kunit_try_catch
> *try_catch, void *context)
> {
> struct kunit *test = try_catch->test;
> struct task_struct *task_struct;
> + struct completion *task_done;
> int exit_code, time_remaining;
>
> try_catch->context = context;
> @@ -75,13 +76,14 @@ void kunit_try_catch_run(struct kunit_try_catch
> *try_catch, void *context)
> return;
> }
> get_task_struct(task_struct);
> + task_done = task_struct->vfork_done;
> wake_up_process(task_struct);
> /*
> * As for a vfork(2), task_struct->vfork_done (pointing to the
> * underlying kthread->exited) can be used to wait for the end of a
> * kernel thread.
> */
> - time_remaining = wait_for_completion_timeout(task_struct->vfork_done,
> + time_remaining = wait_for_completion_timeout(task_done,
> kunit_test_timeout());
> if (time_remaining == 0) {
> try_catch->try_result = -ETIMEDOUT;
> --

--
Linaro LKFT
https://lkft.linaro.org

2024-04-11 15:01:41

by Dan Carpenter

Subject: Re: BUG: KASAN: null-ptr-deref in _raw_spin_lock_irq next-20240410

On Thu, Apr 11, 2024 at 08:20:55PM +0530, Naresh Kamboju wrote:
>
> I use to notice kernel panic while running kunit tests
> now I have noticed this
>
> Unable to handle kernel paging request at virtual address
> KASAN: null-ptr-deref in range
> pc : kunit_test_null_dereference (lib/kunit/kunit-test.c:119)
> lr : kunit_generic_run_threadfn_adapter (lib/kunit/try-catch.c:31)
>
> The kunit tests run to completion and the system is stable.
> Kernel did not panic.
>

[ Snip ]

> <0>[ 76.808597] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
> <4>[ 76.809876] Modules linked in:
> <4>[ 76.812055] CPU: 1 PID: 567 Comm: kunit_try_catch Tainted: G
> B N 6.9.0-rc3-next-20240410 #1
> <4>[ 76.812987] Hardware name: linux,dummy-virt (DT)
> <4>[ 76.814123] pstate: 12400009 (nzcV daif +PAN -UAO +TCO -DIT
> -SSBS BTYPE=--)
> <4>[ 76.814947] pc : kunit_test_null_dereference (lib/kunit/kunit-test.c:119)
^^^^^^^^^^^^^^^^^^^^^^^^^^^
This is a new intentional NULL dereference test that was added yesterday.

Maybe these should have a big printk, "Intentional NULL dereference
coming up!\n".

regards,
dan carpenter


2024-04-11 15:06:33

by Guenter Roeck

Subject: Re: BUG: KASAN: null-ptr-deref in _raw_spin_lock_irq next-20240410

On Thu, Apr 11, 2024 at 06:00:25PM +0300, Dan Carpenter wrote:
> On Thu, Apr 11, 2024 at 08:20:55PM +0530, Naresh Kamboju wrote:
> >
> > I use to notice kernel panic while running kunit tests
> > now I have noticed this
> >
> > Unable to handle kernel paging request at virtual address
> > KASAN: null-ptr-deref in range
> > pc : kunit_test_null_dereference (lib/kunit/kunit-test.c:119)
> > lr : kunit_generic_run_threadfn_adapter (lib/kunit/try-catch.c:31)
> >
> > The kunit tests run to completion and the system is stable.
> > Kernel did not panic.
> >
>
> [ Snip ]
>
> > <0>[ 76.808597] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
> > <4>[ 76.809876] Modules linked in:
> > <4>[ 76.812055] CPU: 1 PID: 567 Comm: kunit_try_catch Tainted: G
> > B N 6.9.0-rc3-next-20240410 #1
> > <4>[ 76.812987] Hardware name: linux,dummy-virt (DT)
> > <4>[ 76.814123] pstate: 12400009 (nzcV daif +PAN -UAO +TCO -DIT
> > -SSBS BTYPE=--)
> > <4>[ 76.814947] pc : kunit_test_null_dereference (lib/kunit/kunit-test.c:119)
> ^^^^^^^^^^^^^^^^^^^^^^^^^^^
> This is a new intentional NULL dereferencer that was added yesterday.
>
> Maybe these should have a big printk, "Intentional NULL dereference
> coming up!\n".
>

Can the backtrace be suppressed, similar to the warning suppression I am
working on?

Thanks,
Guenter

2024-04-11 15:15:48

by Naresh Kamboju

Subject: Re: BUG: KASAN: null-ptr-deref in _raw_spin_lock_irq next-20240410

On Thu, 11 Apr 2024 at 20:12, Naresh Kamboju <[email protected]> wrote:
>
> On Thu, 11 Apr 2024 at 09:55, David Gow <[email protected]> wrote:
> >
> > On Wed, 10 Apr 2024 at 23:23, Will Deacon <[email protected]> wrote:
> > >
> > > On Wed, Apr 10, 2024 at 03:57:10PM +0530, Naresh Kamboju wrote:
> > > > Following kernel crash noticed on Linux next-20240410 tag while running
> > > > kunit testing on qemu-arm64 and qemu-x86_64.
> > > >
> > > > Reported-by: Linux Kernel Functional Testing <[email protected]>
> > > >
>
> <trim>
>
> > >
> > > Ok, so 'task_struct->vfork_done' is NULL. Looks like this code was added
> > > recently, so adding Mickaël to cc.
> > >
> >
> > Thanks. This looks like a race condition where the KUnit test kthread
> > can terminate before we wait on it.
> >
> > Mickaël, does this seem like a correct fix to you?
> > ---
> > From: David Gow <[email protected]>
> > Date: Thu, 11 Apr 2024 12:07:47 +0800
> > Subject: [PATCH] kunit: Fix race condition in try-catch completion
> >
> > KUnit's try-catch infrastructure now uses vfork_done, which is always
> > set to a valid completion when a kthread is crated, but which is set to
> > NULL once the thread terminates. This creates a race condition, where
> > the kthread exits before we can wait on it.
> >
> > Keep a copy of vfork_done, which is taken before we wake_up_process()
> > and so valid, and wait on that instead.
> >
> > Fixes: 4de2a8e4cca4 ("kunit: Handle test faults")
> > Reported-by: Linux Kernel Functional Testing <[email protected]>
> > Signed-off-by: David Gow <[email protected]>
>
> This patch tested on top of Linux next and reported issues fixed.
>
> Tested-by: Linux Kernel Functional Testing <[email protected]>



>
> > ---
> > lib/kunit/try-catch.c | 4 +++-
> > 1 file changed, 3 insertions(+), 1 deletion(-)
> >
> > diff --git a/lib/kunit/try-catch.c b/lib/kunit/try-catch.c
> > index fa687278ccc9..fc6cd4d7e80f 100644
> > --- a/lib/kunit/try-catch.c
> > +++ b/lib/kunit/try-catch.c
> > @@ -63,6 +63,7 @@ void kunit_try_catch_run(struct kunit_try_catch
> > *try_catch, void *context)
> > {
> > struct kunit *test = try_catch->test;
> > struct task_struct *task_struct;
> > + struct completion *task_done;
> > int exit_code, time_remaining;
> >
> > try_catch->context = context;
> > @@ -75,13 +76,14 @@ void kunit_try_catch_run(struct kunit_try_catch
> > *try_catch, void *context)
> > return;
> > }
> > get_task_struct(task_struct);
> > + task_done = task_struct->vfork_done;
> > wake_up_process(task_struct);
> > /*
> > * As for a vfork(2), task_struct->vfork_done (pointing to the
> > * underlying kthread->exited) can be used to wait for the end of a
> > * kernel thread.
> > */
> > - time_remaining = wait_for_completion_timeout(task_struct->vfork_done,
> > + time_remaining = wait_for_completion_timeout(task_done,
> > kunit_test_timeout());
> > if (time_remaining == 0) {
> > try_catch->try_result = -ETIMEDOUT;
> > --

I used to see a kernel panic while running the kunit tests.
With this patch applied, I now see the following instead:

Unable to handle kernel paging request at virtual address
KASAN: null-ptr-deref in range
pc : kunit_test_null_dereference (lib/kunit/kunit-test.c:119)
lr : kunit_generic_run_threadfn_adapter (lib/kunit/try-catch.c:31)

The kunit tests run to completion and the system is stable.
Kernel did not panic.

kunit test log:
------
<6>[ 76.784878] # Subtest: kunit_fault
<6>[ 76.785527] # module: kunit_test
<6>[ 76.785785] 1..1
<1>[ 76.794318] Unable to handle kernel paging request at virtual address dfff800000000000
<1>[ 76.796137] KASAN: null-ptr-deref in range [0x0000000000000000-0x0000000000000007]
<1>[ 76.796970] Mem abort info:
<1>[ 76.797685] ESR = 0x0000000096000005
<1>[ 76.798868] EC = 0x25: DABT (current EL), IL = 32 bits
<1>[ 76.800355] SET = 0, FnV = 0
<1>[ 76.800893] EA = 0, S1PTW = 0
<1>[ 76.801715] FSC = 0x05: level 1 translation fault
<1>[ 76.802654] Data abort info:
<1>[ 76.803713] ISV = 0, ISS = 0x00000005, ISS2 = 0x00000000
<1>[ 76.804362] CM = 0, WnR = 0, TnD = 0, TagAccess = 0
<1>[ 76.805278] GCS = 0, Overlay = 0, DirtyBit = 0, Xs = 0
<1>[ 76.806302] [dfff800000000000] address between user and kernel address ranges
<0>[ 76.808597] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
<4>[ 76.809876] Modules linked in:
<4>[ 76.812055] CPU: 1 PID: 567 Comm: kunit_try_catch Tainted: G B N 6.9.0-rc3-next-20240410 #1
<4>[ 76.812987] Hardware name: linux,dummy-virt (DT)
<4>[ 76.814123] pstate: 12400009 (nzcV daif +PAN -UAO +TCO -DIT -SSBS BTYPE=--)
<4>[ 76.814947] pc : kunit_test_null_dereference (lib/kunit/kunit-test.c:119)
<4>[ 76.815862] lr : kunit_generic_run_threadfn_adapter (lib/kunit/try-catch.c:31)
<4>[ 76.816765] sp : ffff800083137dc0
<4>[ 76.817473] x29: ffff800083137e20 x28: 0000000000000000 x27: 0000000000000000
<4>[ 76.818684] x26: 0000000000000000 x25: 0000000000000000 x24: fff00000c1b30c00
<4>[ 76.819798] x23: ffffa76fb372e348 x22: ffffa76fb3736550 x21: fff00000c1b30c08
<4>[ 76.820900] x20: 1ffff00010626fb8 x19: ffff8000800879f0 x18: 0000000000000068
<4>[ 76.822008] x17: 0000000000000000 x16: fff00000da132180 x15: ffffa76fb36f3b04
<4>[ 76.823125] x14: ffffa76fb2e3cc28 x13: 1ffe0000181547e4 x12: fffd80001832511a
<4>[ 76.824229] x11: 1ffe000018325119 x10: fffd800018325119 x9 : ffffa76fb372e3d0
<4>[ 76.825409] x8 : ffff800083137cb8 x7 : 0000000000000000 x6 : 0000000041b58ab3
<4>[ 76.826532] x5 : ffff700010626fb8 x4 : 00000000f1f1f1f1 x3 : 0000000000000003
<4>[ 76.827653] x2 : dfff800000000000 x1 : fff00000c1928000 x0 : ffff8000800879f0
<4>[ 76.828829] Call trace:
<4>[ 76.829410] kunit_test_null_dereference (lib/kunit/kunit-test.c:119)
<4>[ 76.830294] kunit_generic_run_threadfn_adapter (lib/kunit/try-catch.c:31)
<4>[ 76.831168] kthread (kernel/kthread.c:389)
<4>[ 76.831870] ret_from_fork (arch/arm64/kernel/entry.S:861)
<0>[ 76.833252] Code: b90004a3 d5384101 52800063 aa0003f3 (39c00042)
All code
========
0: b90004a3 str w3, [x5, #4]
4: d5384101 mrs x1, sp_el0
8: 52800063 mov w3, #0x3 // #3
c: aa0003f3 mov x19, x0
10:* 39c00042 ldrsb w2, [x2] <-- trapping instruction

Code starting with the faulting instruction
===========================================
0: 39c00042 ldrsb w2, [x2]
<4>[ 76.834489] ---[ end trace 0000000000000000 ]---

Links:
- https://tuxapi.tuxsuite.com/v1/groups/linaro/projects/lkft/tests/2exQ84OHGOdSUQMBfFMxJoo8tAo
--
Linaro LKFT
https://lkft.linaro.org

2024-04-12 03:26:51

by David Gow

Subject: Re: BUG: KASAN: null-ptr-deref in _raw_spin_lock_irq next-20240410

On Thu, 11 Apr 2024 at 23:05, Guenter Roeck <[email protected]> wrote:
>
> On Thu, Apr 11, 2024 at 06:00:25PM +0300, Dan Carpenter wrote:
> > On Thu, Apr 11, 2024 at 08:20:55PM +0530, Naresh Kamboju wrote:
> > >
> > > I use to notice kernel panic while running kunit tests
> > > now I have noticed this
> > >
> > > Unable to handle kernel paging request at virtual address
> > > KASAN: null-ptr-deref in range
> > > pc : kunit_test_null_dereference (lib/kunit/kunit-test.c:119)
> > > lr : kunit_generic_run_threadfn_adapter (lib/kunit/try-catch.c:31)
> > >
> > > The kunit tests run to completion and the system is stable.
> > > Kernel did not panic.
> > >
> >
> > [ Snip ]
> >
> > > <0>[ 76.808597] Internal error: Oops: 0000000096000005 [#1] PREEMPT SMP
> > > <4>[ 76.809876] Modules linked in:
> > > <4>[ 76.812055] CPU: 1 PID: 567 Comm: kunit_try_catch Tainted: G
> > > B N 6.9.0-rc3-next-20240410 #1
> > > <4>[ 76.812987] Hardware name: linux,dummy-virt (DT)
> > > <4>[ 76.814123] pstate: 12400009 (nzcV daif +PAN -UAO +TCO -DIT
> > > -SSBS BTYPE=--)
> > > <4>[ 76.814947] pc : kunit_test_null_dereference (lib/kunit/kunit-test.c:119)
> > ^^^^^^^^^^^^^^^^^^^^^^^^^^^
> > This is a new intentional NULL dereferencer that was added yesterday.
> >
> > Maybe these should have a big printk, "Intentional NULL dereference
> > coming up!\n".
> >
>
> Can the backtrace be suppressed, similar to the warnings suppression I am
> working on ?
>

I'd like to do that. Of course, this isn't a warning, so the warning
suppression doesn't work as-is (and it'd be harder to pass things like
the function name through), but it seems like a worthwhile feature to
have going forward.
We did have some similar stuff for trapping KASAN errors as part of
the KASAN tests a while ago: that's also something to look at.

I have been playing with the warning suppression with the fortify
test, and that seems to be working well.

Cheers,
-- David

