2024-02-02 02:20:36

by Pengfei Xu

Subject: [Syzkaller & bisect] There is sys_bpf related out_of_memory WARNING in v6.8-rc2

Hi Joanne Koong and bpf experts,

Greetings!

There is a sys_bpf-related out_of_memory WARNING in v6.8-rc2 in a guest:

All detailed info: https://github.com/xupengfe/syzkaller_logs/tree/main/240201_154123_out_of_memory_bpf_related_issue
Syzkaller reproducer code: https://github.com/xupengfe/syzkaller_logs/blob/main/240201_154123_out_of_memory_bpf_related_issue/repro.c
Syzkaller repro syscall steps: https://github.com/xupengfe/syzkaller_logs/blob/main/240201_154123_out_of_memory_bpf_related_issue/repro.prog
Kconfig_origin(make olddefconfig): https://github.com/xupengfe/syzkaller_logs/blob/main/240201_154123_out_of_memory_bpf_related_issue/kconfig_origin
Bisect info: https://github.com/xupengfe/syzkaller_logs/blob/main/240201_154123_out_of_memory_bpf_related_issue/bisect_info.log
Issue dmesg: https://github.com/xupengfe/syzkaller_logs/blob/main/240201_154123_out_of_memory_bpf_related_issue/41bccc98fb7931d63d03f326a746ac4d429c1dd3_dmesg.log
bzImage_v6.8-rc2: https://github.com/xupengfe/syzkaller_logs/raw/main/240201_154123_out_of_memory_bpf_related_issue/bzImage_v6.8-rc2.tar.gz

Bisection found the first bad commit:
"
9330986c0300 bpf: Add bloom filter map implementation
"

Syzkaller repro.report: https://github.com/xupengfe/syzkaller_logs/blob/main/240201_154123_out_of_memory_bpf_related_issue/repro.report
"
Out of memory: Killed process 4572 (syz-executor378) total-vm:19292kB, anon-rss:0kB, file-rss:128kB, shmem-rss:0kB, UID:0 pgtables:52kB oom_score_adj:1000
------------[ cut here ]------------
WARNING: CPU: 0 PID: 4585 at arch/x86/mm/pat/memtype.c:1060 untrack_pfn+0x466/0x590 arch/x86/mm/pat/memtype.c:1060
Modules linked in:
CPU: 0 PID: 4585 Comm: syz-executor378 Not tainted 6.8.0-rc2-f31cefc5e516+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
RIP: 0010:untrack_pfn+0x466/0x590 arch/x86/mm/pat/memtype.c:1060
Code: 48 b8 00 00 00 00 00 fc ff df 48 89 fa 48 c1 ea 03 80 3c 02 00 0f 84 e0 fd ff ff e8 d4 6a a1 00 e9 d6 fd ff ff e8 3a f0 40 00 <0f> 0b e9 d2 fd ff ff e8 2e f0 40 00 49 8d bc 24 a0 01 00 00 31 f6
RSP: 0000:ffff8880598df710 EFLAGS: 00010293
RAX: 0000000000000000 RBX: ffff888052477a50 RCX: ffffffff8121265c
RDX: ffff888030efca00 RSI: ffffffff81212966 RDI: 0000000000000005
RBP: ffff8880598df7d0 R08: 0000000000000001 R09: 0000000000000002
R10: 00000000ffffffea R11: 0000000000000001 R12: 00000000ffffffea
R13: 1ffff1100b31bee5 R14: 0000000000000000 R15: ffff8880598df7a8
FS: 0000000000000000(0000) GS:ffff88806cc00000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007f240f7bc1b0 CR3: 0000000006a7e004 CR4: 0000000000770ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000ffff07f0 DR7: 0000000000000400
PKRU: 55555554
Call Trace:
<TASK>
unmap_single_vma+0x1d9/0x2d0 mm/memory.c:1692
unmap_vmas+0x210/0x470 mm/memory.c:1758
exit_mmap+0x19b/0xac0 mm/mmap.c:3284
__mmput+0xde/0x3e0 kernel/fork.c:1343
mmput+0x74/0x90 kernel/fork.c:1365
exit_mm kernel/exit.c:569 [inline]
do_exit+0xa59/0x28c0 kernel/exit.c:858
do_group_exit+0xe5/0x2c0 kernel/exit.c:1020
get_signal+0x2715/0x27d0 kernel/signal.c:2893
arch_do_signal_or_restart+0x8e/0x7e0 arch/x86/kernel/signal.c:310
exit_to_user_mode_loop kernel/entry/common.c:105 [inline]
exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline]
__syscall_exit_to_user_mode_work kernel/entry/common.c:201 [inline]
syscall_exit_to_user_mode+0x140/0x210 kernel/entry/common.c:212
do_syscall_64+0x82/0x150 arch/x86/entry/common.c:89
entry_SYSCALL_64_after_hwframe+0x6e/0x76
RIP: 0033:0x7f022b83ee5d
Code: Unable to access opcode bytes at 0x7f022b83ee33.
RSP: 002b:00007f022bab0d68 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
RAX: fffffffffffffff4 RBX: 00000000004051a8 RCX: 00007f022b83ee5d
RDX: 0000000000000020 RSI: 0000000020000180 RDI: 0000000000000002
RBP: 00000000004051a0 R08: 00007f022bab1640 R09: 0000000000000000
R10: 00007f022bab1640 R11: 0000000000000246 R12: 00000000004051ac
R13: 0000000000000011 R14: 00007f022b89f560 R15: 0000000000000000
</TASK>
irq event stamp: 17161
hardirqs last enabled at (17169): [<ffffffff81431b45>] __up_console_sem kernel/printk/printk.c:341 [inline]
hardirqs last enabled at (17169): [<ffffffff81431b45>] __console_unlock kernel/printk/printk.c:2706 [inline]
hardirqs last enabled at (17169): [<ffffffff81431b45>] console_unlock+0x2d5/0x310 kernel/printk/printk.c:3038
hardirqs last disabled at (17200): [<ffffffff81431b2a>] __up_console_sem kernel/printk/printk.c:339 [inline]
hardirqs last disabled at (17200): [<ffffffff81431b2a>] __console_unlock kernel/printk/printk.c:2706 [inline]
hardirqs last disabled at (17200): [<ffffffff81431b2a>] console_unlock+0x2ba/0x310 kernel/printk/printk.c:3038
softirqs last enabled at (17224): [<ffffffff8126c9b8>] invoke_softirq kernel/softirq.c:427 [inline]
softirqs last enabled at (17224): [<ffffffff8126c9b8>] __irq_exit_rcu+0xa8/0x110 kernel/softirq.c:632
softirqs last disabled at (17235): [<ffffffff8126c9b8>] invoke_softirq kernel/softirq.c:427 [inline]
softirqs last disabled at (17235): [<ffffffff8126c9b8>] __irq_exit_rcu+0xa8/0x110 kernel/softirq.c:632
---[ end trace 0000000000000000 ]---
syz-executor378 invoked oom-killer: gfp_mask=0x102cc2(GFP_HIGHUSER|__GFP_NOWARN), order=0, oom_score_adj=1000
CPU: 0 PID: 4635 Comm: syz-executor378 Tainted: G W 6.8.0-rc2-f31cefc5e516+ #1
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.16.0-0-gd239552ce722-prebuilt.qemu.org 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xe1/0x110 lib/dump_stack.c:106
dump_stack+0x19/0x20 lib/dump_stack.c:113
dump_header+0x111/0x8f0 mm/oom_kill.c:461
oom_kill_process+0x287/0xa70 mm/oom_kill.c:1032
out_of_memory+0x34e/0x1720 mm/oom_kill.c:1170
__alloc_pages_may_oom mm/page_alloc.c:3483 [inline]
__alloc_pages_slowpath.constprop.0+0x182a/0x2140 mm/page_alloc.c:4243
__alloc_pages+0x45a/0x530 mm/page_alloc.c:4580
alloc_pages_mpol+0x278/0x580 mm/mempolicy.c:2133
alloc_pages+0x140/0x160 mm/mempolicy.c:2204
vm_area_alloc_pages mm/vmalloc.c:3063 [inline]
__vmalloc_area_node mm/vmalloc.c:3139 [inline]
__vmalloc_node_range+0xb7c/0x1570 mm/vmalloc.c:3320
kvmalloc_node+0x1be/0x240 mm/util.c:642
kvmalloc include/linux/slab.h:728 [inline]
kvmemdup_bpfptr include/linux/bpfptr.h:70 [inline]
map_update_elem kernel/bpf/syscall.c:1547 [inline]
__sys_bpf+0x426e/0x55c0 kernel/bpf/syscall.c:5445
__do_sys_bpf kernel/bpf/syscall.c:5561 [inline]
__se_sys_bpf kernel/bpf/syscall.c:5559 [inline]
__x64_sys_bpf+0x7d/0xc0 kernel/bpf/syscall.c:5559
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0x73/0x150 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x6e/0x76
RIP: 0033:0x7f022b83ee5d
Code: Unable to access opcode bytes at 0x7f022b83ee33.
RSP: 002b:00007f022bab0d68 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
RAX: ffffffffffffffda RBX: 00000000004051a8 RCX: 00007f022b83ee5d
RDX: 0000000000000020 RSI: 0000000020000180 RDI: 0000000000000002
RBP: 00000000004051a0 R08: 00007f022bab1640 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00000000004051ac
R13: 0000000000000011 R14: 00007f022b89f560 R15: 0000000000000000
</TASK>
"
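
For readers less used to these register dumps, here is a minimal sketch of how the user-space registers in the two traces above can be decoded. It is not part of the reproducer. The helper names (reg_to_errno, is_bpf_syscall, bpf_cmd_name) are hypothetical, and the constants are hard-coded on the assumption of an x86_64 guest, as in the report.

```c
/* Sketch only: decode a few user-space register values from the traces.
 * Assumes x86_64; helper names are made up for this illustration. */
#include <stdint.h>

#define X86_64_NR_BPF 321 /* __NR_bpf on x86_64, i.e. ORIG_RAX 0x141 */

/* On failure a syscall leaves -errno in RAX. Reinterpret the raw
 * 64-bit value as signed and return the positive errno, or 0 if the
 * value is not in the kernel's error range (-4095..-1). */
static long reg_to_errno(uint64_t rax)
{
    int64_t v = (int64_t)rax;
    return (v < 0 && v >= -4095) ? (long)-v : 0;
}

/* ORIG_RAX holds the syscall number that was entered. */
static int is_bpf_syscall(uint64_t orig_rax)
{
    return orig_rax == X86_64_NR_BPF;
}

/* RDI is the first syscall argument; for bpf(2) that is the command.
 * Only the first few values of enum bpf_cmd are listed here. */
static const char *bpf_cmd_name(uint64_t rdi)
{
    static const char *const names[] = {
        "BPF_MAP_CREATE",       /* 0 */
        "BPF_MAP_LOOKUP_ELEM",  /* 1 */
        "BPF_MAP_UPDATE_ELEM",  /* 2 */
        "BPF_MAP_DELETE_ELEM",  /* 3 */
        "BPF_MAP_GET_NEXT_KEY", /* 4 */
        "BPF_PROG_LOAD",        /* 5 */
    };
    return rdi < sizeof(names) / sizeof(names[0]) ? names[rdi] : "other";
}
```

With the values from the report: ORIG_RAX 0x141 is the bpf syscall, RDI 0x2 is BPF_MAP_UPDATE_ELEM (matching map_update_elem in the second call trace), RAX fffffffffffffff4 in the first trace decodes to errno 12 (ENOMEM), and RAX ffffffffffffffda in the second trace decodes to errno 38 (ENOSYS), which is the placeholder value the saved RAX holds while a syscall is still in progress.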

Hope it's helpful.

---

If you do not need the following environment to reproduce the problem, or if you
already have a reproduction environment, please ignore the information below.

How to reproduce:
git clone https://gitlab.com/xupengfe/repro_vm_env.git
cd repro_vm_env
tar -xvf repro_vm_env.tar.gz
cd repro_vm_env; ./start3.sh // it needs qemu-system-x86_64; I used v7.1.0
// start3.sh will load the bzImage_2241ab53cbb5cdb08a6b2d4688feb13971058f65 v6.2-rc5 kernel
// You can change the bzImage_xxx as needed
// You may need to remove the line "-drive if=pflash,format=raw,readonly=on,file=./OVMF_CODE.fd \" for a different qemu version
You can use the command below to log in; there is no password for root.
ssh -p 10023 root@localhost

After logging in to the VM (virtual machine) successfully, you can build the
reproducer, transfer the binary to the VM as follows, and reproduce the problem there:
gcc -pthread -o repro repro.c
scp -P 10023 repro root@localhost:/root/

Get the bzImage for the target kernel:
Copy the target kconfig to kernel_src/.config
make olddefconfig
make -jx bzImage // x should be equal to or less than the number of CPUs your PC has

Set the bzImage file in start3.sh above to load the target kernel in the VM.


Tips:
If you already have qemu-system-x86_64, please ignore the information below.
If you want to install qemu v7.1.0:
git clone https://github.com/qemu/qemu.git
cd qemu
git checkout -f v7.1.0
mkdir build
cd build
yum install -y ninja-build.x86_64
yum -y install libslirp-devel.x86_64
../configure --target-list=x86_64-softmmu --enable-kvm --enable-vnc --enable-gtk --enable-sdl --enable-usb-redir --enable-slirp
make
make install

Best Regards,
Thanks!