2022-11-02 15:04:47

by Naresh Kamboju

Subject: KASAN / KUNIT: testing ran on qemu-arm and list of failures

This is a report to get a quick update on kasan on qemu-arm.

The KASAN / KUNIT testing ran on qemu-arm and the following test cases failed
and the kernel crashed.

Following tests failed,
kasan_strings - FAILED
vmalloc_oob - FAILED
kasan_memchr - FAILED
kasan - FAILED
kasan_bitops_generic - FAILED

Reported-by: Linux Kernel Functional Testing <[email protected]>

Boot and test log:
[ 0.000000] Linux version 6.0.7-rc1 (tuxmake@tuxmake) (arm-linux-gnueabihf-gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2) #1 SMP @1667356522
[ 0.000000] CPU: ARMv7 Processor [410fd034] revision 4 (ARMv7), cr=10c5383d
...
[ 0.000000] kasan: Mapping kernel virtual memory block: c0000000-f0000000 at shadow: b7000000-bd000000
[ 0.000000] kasan: Mapping kernel virtual memory block: bfe00000-c0000000 at shadow: b6fc0000-b7000000
[ 0.000000] kasan: Kernel address sanitizer initialized
...
[ 81.058636] ok 41 - kmem_cache_double_destroy
[ 81.059932] # kasan_memchr: EXPECTATION FAILED at lib/test_kasan.c:920
[ 81.059932] KASAN failure expected in "kasan_ptr_result = memchr(ptr, '1', size + 1)", but none occurred
[ 81.063106] not ok 42 - kasan_memchr
...
[ 81.221595] # kasan_strings: EXPECTATION FAILED at lib/test_kasan.c:975
[ 81.221595] KASAN failure expected in "kasan_ptr_result = strchr(ptr, '1')", but none occurred
[ 81.223903] # kasan_strings: EXPECTATION FAILED at lib/test_kasan.c:977
[ 81.223903] KASAN failure expected in "kasan_ptr_result = strrchr(ptr, '1')", but none occurred
...
[ 429.920201] Insufficient stack space to handle exception!
[ 429.920232] Task stack: [0xfa000000..0xfa004000]
[ 429.925226] IRQ stack: [0xf0808000..0xf080c000]
[ 429.927424] Overflow stack: [0xc4190000..0xc4191000]
[ 429.929785] Internal error: kernel stack overflow: 0 [#1] SMP ARM
[ 429.933101] Modules linked in: usbtest pci_endpoint_test pci_epf_test preemptirq_delay_test soc_utils_test(N) snd_soc_core ac97_bus snd_pcm_dmaengine snd_pcm snd_timer snd soundcore cfg80211 bluetooth crc32_arm_ce sha2_arm_ce sha256_arm sha1_arm_ce sha1_arm aes_arm_ce crypto_simd
[ 429.946324] CPU: 1 PID: 3390 Comm: grep Tainted: G B N 6.0.7-rc1 #1
[ 429.950389] Hardware name: Generic DT based system
[ 429.952979] PC is at trace_hardirqs_off+0x0/0x16c
[ 429.955349] LR is at __dabt_svc+0x48/0x80
...
[ 902.927481] Insufficient stack space to handle exception!
[ 902.927520] Task stack: [0xfa138000..0xfa13c000]
[ 902.932386] IRQ stack: [0xf0800000..0xf0804000]
[ 902.934770] Overflow stack: [0xc418f000..0xc4190000]
[ 902.937770] Internal error: kernel stack overflow: 0 [#3] SMP ARM
[ 902.941255] Modules linked in: usbtest pci_endpoint_test pci_epf_test preemptirq_delay_test soc_utils_test(N) snd_soc_core ac97_bus snd_pcm_dmaengine snd_pcm snd_timer snd soundcore cfg80211 bluetooth crc32_arm_ce sha2_arm_ce sha256_arm sha1_arm_ce sha1_arm aes_arm_ce crypto_simd
[ 902.954667] CPU: 0 PID: 3440 Comm: agetty Tainted: G B D N 6.0.7-rc1 #1
[ 902.959155] Hardware name: Generic DT based system
[ 902.961688] PC is at trace_hardirqs_off+0x0/0x16c
[ 902.964151] LR is at __dabt_svc+0x48/0x80
[ 902.966393] pc : [<c04c98fc>] lr : [<c0300b28>] psr: 400c0193
[ 902.969925] sp : fa138008 ip : 00000051 fp : fa13be74
[ 902.973008] r10: c6175180 r9 : ce082c00 r8 : fa1380b8
[ 902.976025] r7 : fa13803c r6 : ffffffff r5 : 200c0193 r4 : c05eb00c
[ 902.979718] r3 : c1b29438 r2 : fa138054 r1 : be42701f r0 : 00000051
[ 902.983275] Flags: nZcv IRQs off FIQs on Mode SVC_32 ISA ARM Segment none
[ 902.986903] Control: 10c5383d Table: 498d806a DAC: 00000051
[ 902.989830] Register r0 information: non-paged memory
[ 902.992732] Register r1 information: non-paged memory
[ 902.995532] Register r2 information: 4-page vmalloc region starting at 0xfa138000 allocated at kernel_clone+0xb0/0x53c
[ 903.001090] Register r3 information: non-slab/vmalloc memory
[ 903.004149] Register r4 information: non-slab/vmalloc memory
[ 903.007262] Register r5 information: non-paged memory
[ 903.010091] Register r6 information: non-paged memory
[ 903.012782] Register r7 information: 4-page vmalloc region starting at 0xfa138000 allocated at kernel_clone+0xb0/0x53c
[ 903.018743] Register r8 information: 4-page vmalloc region starting at 0xfa138000 allocated at kernel_clone+0xb0/0x53c
[ 903.024227] Register r9 information: slab task_struct start ce082c00 pointer offset 0
[ 903.028265] Register r10 information: slab mm_struct start c6175180 pointer offset 0 size 168
[ 903.033051] Register r11 information: 4-page vmalloc region starting at 0xfa138000 allocated at kernel_clone+0xb0/0x53c
[ 903.038677] Register r12 information: non-paged memory
[ 903.041458] Process agetty (pid: 3440, stack limit = 0xa4d91b13)
...
[ 905.331477] trace_hardirqs_off from __dabt_svc+0x48/0x80
[ 905.334275] Exception stack(0xfa138008 to 0xfa138050)
[ 905.337192] 8000: fa1380f8 be42701f fa1380f8 00000003 be427035 fa1380b8
[ 905.341804] 8020: c31192e0 00000005 fa1380b8 c3119330 c6175180 fa13be74 00000051 fa138054
[ 905.346074] 8040: c1b29438 c05eb00c 200c0193 ffffffff
[ 905.348746] __dabt_svc from __asan_load4+0x30/0x88
[ 905.351332] __asan_load4 from do_translation_fault+0x34/0x124
[ 905.354779] do_translation_fault from do_DataAbort+0x54/0xf4
[ 905.358091] do_DataAbort from __dabt_svc+0x50/0x80
[ 905.360842] Exception stack(0xfa1380b8 to 0xfa138100)
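
As a cross-check of the two "kasan: Mapping kernel virtual memory block"
lines in the log above: generic KASAN maps an address to its shadow byte
as (addr >> 3) + offset. The Python sketch below derives that offset from
the logged ranges. The helper names are hypothetical, and the offset is
inferred from this log only, not read from the kernel config.

```python
# Ranges copied from the "kasan: Mapping kernel virtual memory block" lines.
# Each entry: (region start, region end, shadow start, shadow end).
MAPPINGS = [
    (0xC0000000, 0xF0000000, 0xB7000000, 0xBD000000),
    (0xBFE00000, 0xC0000000, 0xB6FC0000, 0xB7000000),
]

SCALE_SHIFT = 3  # generic KASAN: one shadow byte covers 8 bytes of memory


def derive_offset(mappings):
    """Return the one offset that satisfies shadow = (addr >> 3) + offset."""
    offsets = set()
    for start, end, shadow_start, shadow_end in mappings:
        offsets.add(shadow_start - (start >> SCALE_SHIFT))
        offsets.add(shadow_end - (end >> SCALE_SHIFT))
    if len(offsets) != 1:
        raise ValueError("logged ranges do not share a single shadow offset")
    return offsets.pop()


def mem_to_shadow(addr, offset):
    """Shadow address for addr (hypothetical helper mirroring the formula)."""
    return (addr >> SCALE_SHIFT) + offset


offset = derive_offset(MAPPINGS)
print(hex(offset))  # prints 0x9f000000
```

Both logged ranges give the same offset, so any address from the later
oops can be converted to its shadow address with mem_to_shadow().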

[1] https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.0.y/build/v6.0.6-241-g436175d0f780/testrun/12809413/suite/log-parser-test/test/check-kernel-bug/details/
[2] https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.0.y/build/v6.0.6-241-g436175d0f780/testrun/12809413/suite/log-parser-test/test/check-kernel-bug/log

metadata:
git_ref: linux-6.0.y
git_repo: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc
git_sha: 436175d0f780af8302164b3102ecf0ff99f7a376
git_describe: v6.0.6-241-g436175d0f780
kernel_version: 6.0.7-rc1
kernel-config: https://builds.tuxbuild.com/2GyMQxdakmLexUwkh1d3VjAfSgv/config
build-url: https://gitlab.com/Linaro/lkft/mirrors/stable/linux-stable-rc/-/pipelines/683032123
artifact-location: https://builds.tuxbuild.com/2GyMQxdakmLexUwkh1d3VjAfSgv
toolchain: gcc-10


--
Linaro LKFT
https://lkft.linaro.org


2022-11-08 14:28:31

by Linus Walleij

Subject: Re: KASAN / KUNIT: testing ran on qemu-arm and list of failures

On Wed, Nov 2, 2022 at 3:15 PM Naresh Kamboju <[email protected]> wrote:

> This is a report to get a quick update on kasan on qemu-arm.
>
> The KASAN / KUNIT testing ran on qemu-arm and the following test cases failed
> and the kernel crashed.
>
> Following tests failed,
> kasan_strings - FAILED
> vmalloc_oob - FAILED
> kasan_memchr - FAILED
> kasan - FAILED
> kasan_bitops_generic - FAILED
>
> Reported-by: Linux Kernel Functional Testing <[email protected]>

Which isn't very strange since:

> [ 429.920201] Insufficient stack space to handle exception!

the system ran out of stack. With VMAP stack and IRQSTACKS
there is really not much more memory we can provide.

When I discussed this with syzbot it seemed they were using some
really big userspace program written in Go that just used up all
the virtual memory :P

I don't know the nature of this test though. Using a lot of memory??

Yours,
Linus Walleij

2022-11-08 14:44:34

by Arnd Bergmann

Subject: Re: KASAN / KUNIT: testing ran on qemu-arm and list of failures

On Tue, Nov 8, 2022, at 14:51, Linus Walleij wrote:
> On Wed, Nov 2, 2022 at 3:15 PM Naresh Kamboju <[email protected]> wrote:
>
>> This is a report to get a quick update on kasan on qemu-arm.
>>
>> The KASAN / KUNIT testing ran on qemu-arm and the following test cases failed
>> and the kernel crashed.
>>
>> Following tests failed,
>> kasan_strings - FAILED
>> vmalloc_oob - FAILED
>> kasan_memchr - FAILED
>> kasan - FAILED
>> kasan_bitops_generic - FAILED
>>
>> Reported-by: Linux Kernel Functional Testing <[email protected]>
>
> Which isn't very strange since:
>
>> [ 429.920201] Insufficient stack space to handle exception!
>
> the system ran out of stack. With VMAP stack and IRQSTACKS
> there is really not much more memory we can provide.
>
> When I discussed this with syzbot it seemed they were using some
> really big userspace program written in Go that just used up all
> the virtual memory :P
>
> I don't know the nature of this test though. Using a lot of memory??

From the log file[1], I see that the actual problem is a
recursive data abort. The problem is not that any particular
piece uses too much memory, it's that each time it tries to
handle the fault, it causes a new fault:

Citing more of the output:

[ 429.920201] Insufficient stack space to handle exception!
[ 429.920232] Task stack: [0xfa000000..0xfa004000]
[ 429.925226] IRQ stack: [0xf0808000..0xf080c000]
[ 429.927424] Overflow stack: [0xc4190000..0xc4191000]
[ 429.929785] Internal error: kernel stack overflow: 0 [#1] SMP ARM
[ 429.933101] Modules linked in: usbtest pci_endpoint_test pci_epf_test preemptirq_delay_test soc_utils_test(N) snd_soc_core ac97_bus snd_pcm_dmaengine snd_pcm snd_timer snd soundcore cfg80211 bluetooth crc32_arm_ce sha2_arm_ce sha256_arm sha1_arm_ce sha1_arm aes_arm_ce crypto_simd
[ 429.946324] CPU: 1 PID: 3390 Comm: grep Tainted: G B N 6.0.7-rc1 #1
[ 429.950389] Hardware name: Generic DT based system
[ 429.952979] PC is at trace_hardirqs_off+0x0/0x16c
[ 429.955349] LR is at __dabt_svc+0x48/0x80
[ 429.957676] pc : [<c04c98fc>] lr : [<c0300b28>] psr: 400f0193
[ 429.961073] sp : fa000008 ip : 00000051 fp : fa003a54
[ 429.963850] r10: c44f6c80 r9 : cc7aa100 r8 : fa0000b8
[ 429.966725] r7 : fa00003c r6 : ffffffff r5 : 200f0193 r4 : c05eb00c
[ 429.970284] r3 : c1b29438 r2 : fa000054 r1 : be40001f r0 : 00000051
[ 429.973596] Flags: nZcv IRQs off FIQs on Mode SVC_32 ISA ARM Segment none
[ 429.977644] Control: 10c5383d Table: 4df4406a DAC: 00000051
...
[ 430.032192] Process grep (pid: 3390, stack limit = 0xfab94337)
[ 430.035496] Stack: (0xfa000008 to 0xfa004000)
[ 430.037882] 0000: fa0000f8 be40001f fa0000f8 00000003 be400035 fa0000b8
[ 430.041998] 0020: c31192e0 00000005 fa0000b8 c3119330 c44f6c80 fa003a54 00000051 fa000054
...
[ 432.224241] 3fc0: 00000001 aeca02b0 00000000 000000f8 aec9ee5c 00000001 00000000 aec9f384
[ 432.228111] 3fe0: 000000f8 b6669814 aec16049 aebb1ae6 600b0030 00000000 00000000 00000000
[ 432.232008] trace_hardirqs_off from __dabt_svc+0x48/0x80
[ 432.234649] Exception stack(0xfa000008 to 0xfa000050)
[ 432.237101] 0000: fa0000f8 be40001f fa0000f8 00000003 be400035 fa0000b8
[ 432.240995] 0020: c31192e0 00000005 fa0000b8 c3119330 c44f6c80 fa003a54 00000051 fa000054
[ 432.244861] 0040: c1b29438 c05eb00c 200f0193 ffffffff
[ 432.247286] __dabt_svc from __asan_load4+0x30/0x88
[ 432.249665] __asan_load4 from do_translation_fault+0x34/0x124
[ 432.252456] do_translation_fault from do_DataAbort+0x54/0xf4
[ 432.255278] do_DataAbort from __dabt_svc+0x50/0x80
[ 432.258226] Exception stack(0xfa0000b8 to 0xfa000100)
[ 432.260998] 00a0: fa0001a8 be400035
[ 432.265219] 00c0: fa0001a8 00000003 be40004b fa000168 c31192e0 00000005 fa000168 c3119330
[ 432.269670] 00e0: c44f6c80 fa003a54 00000051 fa000104 c1b29438 c05eb00c 200f0193 ffffffff
[ 432.273872] __dabt_svc from __asan_load4+0x30/0x88
[ 432.276409] __asan_load4 from do_translation_fault+0x34/0x124
[ 432.279577] do_translation_fault from do_DataAbort+0x54/0xf4
[ 432.282646] do_DataAbort from __dabt_svc+0x50/0x80
...
[ 434.328344] Exception stack(0xfa0037b8 to 0xfa003800)
[ 434.331100] 37a0: fa0038a8 be400715
[ 434.335411] 37c0: fa0038a8 00000003 be40072b fa003868 c31192e0 00000005 fa003868 c3119330
[ 434.339676] 37e0: c44f6c80 fa003a54 00000051 fa003804 c1b29438 c05eb00c 200f0193 ffffffff
[ 434.344067] __dabt_svc from __asan_load4+0x30/0x88
[ 434.346673] __asan_load4 from do_translation_fault+0x34/0x124
[ 434.350064] do_translation_fault from do_DataAbort+0x54/0xf4
[ 434.353242] do_DataAbort from __dabt_svc+0x50/0x80
[ 434.355711] Exception stack(0xfa003868 to 0xfa0038b0)
[ 434.358544] 3860: fa003958 be40072b fa003958 00000003 be400738 fa003918
[ 434.362770] 3880: c31192e0 00000805 fa003918 c3119330 c44f6c80 fa003a54 00000051 fa0038b4
[ 434.366943] 38a0: c1b29438 c05eb00c 200f0193 ffffffff
[ 434.369838] __dabt_svc from __asan_load4+0x30/0x88
[ 434.372308] __asan_load4 from do_translation_fault+0x34/0x124
[ 434.375566] do_translation_fault from do_DataAbort+0x54/0xf4
[ 434.378889] do_DataAbort from __dabt_svc+0x50/0x80
[ 434.381535] Exception stack(0xfa003918 to 0xfa003960)
[ 434.384064] 3900: e8476d80 00000000
[ 434.388732] 3920: be400738 00000000 c30f2044 cb4f0b00 cc7aa100 2537d000 ca86ca00 ca86ca00
[ 434.393063] 3940: c44f6c80 fa003a54 fa003968 fa003968 c03a81ac c1b1d4c4 200f0113 ffffffff
[ 434.397387] __dabt_svc from __schedule+0x590/0xfc0
[ 434.399943] __schedule from __cond_resched+0x50/0x6c
[ 434.402576] __cond_resched from zap_pte_range+0x56c/0xa08
[ 434.405862] zap_pte_range from unmap_page_range+0x12c/0x364
[ 434.409044] unmap_page_range from unmap_vmas+0x124/0x178
[ 434.411851] unmap_vmas from exit_mmap+0x128/0x304
[ 434.414529] exit_mmap from __mmput+0x34/0x188
[ 434.416946] __mmput from do_exit+0x508/0xef8
[ 434.419338] do_exit from do_group_exit+0x50/0x108
[ 434.421858] do_group_exit from __wake_up_parent+0x0/0x34
[ 434.424729] Code: e2840d41 e2800030 e8bd41f0 eaff8cce (e92d47f0)
[ 434.428055] ---[ end trace 0000000000000000 ]---
[ 434.430356] Fixing recursive fault but reboot is needed!

Note how the "Exception stack(0xfa003918 to 0xfa003960)" addresses
shrink slightly with each iteration. I can see that both
__schedule+0x590/0xfc0 and __asan_load4+0x30/0x88 trigger an
unexpected exception here. The latter of those is what causes
the recursion. Presumably both of them try to access the same
invalid pointer, but I have not disassembled the vmlinux
yet to see what it's actually trying to do here.
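
To put numbers on the shrinking exception stack, here is a small Python
sketch using only values quoted above. It assumes the first two saved
words of each frame are r0 and r1 (the pt_regs layout), that the frame
stride stays constant across the part of the log that is elided, and
that the KASAN shadow offset is 0x9f000000, which is what the
"kasan: Mapping" lines in the original report imply
(0xb7000000 - (0xc0000000 >> 3)). The names are hypothetical.

```python
SHADOW_OFFSET = 0x9F000000             # assumption, inferred from the boot log
TASK_STACK = (0xFA000000, 0xFA004000)  # from the "Task stack" line

# (frame start, saved r0, saved r1), newest first, as far as the log shows.
# The frames between 0xfa0000b8 and 0xfa0037b8 are elided in the quoted log.
FRAMES = [
    (0xFA000008, 0xFA0000F8, 0xBE40001F),
    (0xFA0000B8, 0xFA0001A8, 0xBE400035),
    (0xFA0037B8, 0xFA0038A8, 0xBE400715),
    (0xFA003868, 0xFA003958, 0xBE40072B),
]
OLDEST_FRAME = 0xFA003918  # frame saved when __schedule took the first abort


def mem_to_shadow(addr):
    """Generic KASAN mapping: one shadow byte per 8 bytes of memory."""
    return (addr >> 3) + SHADOW_OFFSET


# Distance between two adjacent frames, and how many such steps fit
# between the oldest and the newest frame.
stride = FRAMES[1][0] - FRAMES[0][0]
depth, remainder = divmod(OLDEST_FRAME - FRAMES[0][0], stride)
headroom = FRAMES[0][0] - TASK_STACK[0]

print(hex(stride), depth, remainder, headroom)  # prints 0xb0 83 0 8

for frame, r0, r1 in FRAMES:
    in_task_stack = TASK_STACK[0] <= r0 < TASK_STACK[1]
    print(hex(frame), mem_to_shadow(r0) == r1, in_task_stack)
```

Under those assumptions each iteration consumes 0xb0 (176) bytes, and 83
nested aborts on top of the first one leave 8 bytes at the bottom of the
16 KiB task stack. In every frame shown, the saved r1 equals the shadow
address of the saved r0, and r0 points into the task stack, which lies
outside the two shadow ranges quoted from the boot log. That would be
consistent with __asan_load4 faulting on the shadow of the stack itself,
but it is an inference from the log, not a confirmed diagnosis.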

Arnd

[1] https://qa-reports.linaro.org/lkft/linux-stable-rc-linux-6.0.y/build/v6.0.6-241-g436175d0f780/testrun/12809413/suite/log-parser-test/test/check-kernel-bug/log