2022-12-22 23:04:04

by syzbot

Subject: [syzbot] [erofs?] WARNING: CPU: NUM PID: NUM at mm/page_alloc.c:LINE get_page_from_freeli

Hello,

syzbot found the following issue on:

HEAD commit: f9ff5644bcc0 Merge tag 'hsi-for-6.2' of git://git.kernel.o..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=15aa58f7880000
kernel config: https://syzkaller.appspot.com/x/.config?x=827916bd156c2ec6
dashboard link: https://syzkaller.appspot.com/bug?extid=c3729cda01706a04fb98
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11bdd020480000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=12c53ab3880000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/0c8a5f06ceb3/disk-f9ff5644.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/be222e852ae2/vmlinux-f9ff5644.xz
kernel image: https://storage.googleapis.com/syzbot-assets/d9f42a53b05e/bzImage-f9ff5644.xz
mounted in repro: https://storage.googleapis.com/syzbot-assets/7f2f76b76cd2/mount_0.gz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]

------------[ cut here ]------------
WARNING: CPU: 1 PID: 4386 at mm/page_alloc.c:3829 get_page_from_freeli


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
syzbot can test patches for this issue, for details see:
https://goo.gl/tpsmEJ#testing-patches


2023-01-05 10:20:43

by Gao Xiang

Subject: Re: [syzbot] [erofs?] WARNING: CPU: NUM PID: NUM at mm/page_alloc.c:LINE get_page_from_freeli

Hi,

On 2022/12/23 06:55, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: f9ff5644bcc0 Merge tag 'hsi-for-6.2' of git://git.kernel.o..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=15aa58f7880000
> kernel config: https://syzkaller.appspot.com/x/.config?x=827916bd156c2ec6

I wasn't able to build the kernel with this kernel config; the build fails with:
"...
FATAL: modpost: vmlinux.o is truncated. sechdrs[i].sh_offset=1399394064 > sizeof(*hrd)=64
make[2]: *** [Module.symvers] Error 1
make[1]: *** [modpost] Error 2
make: *** [__sub-make] Error 2
"

I'm not sure what happened, but it seems someone else has reported this
before:
https://lore.kernel.org/r/CAAGKmqL9k87xw68zwH9ZM7fQFFsgMnA7V=RB+tQ-M2WS6CZg4A@mail.gmail.com/


> dashboard link: https://syzkaller.appspot.com/bug?extid=c3729cda01706a04fb98
> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=11bdd020480000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=12c53ab3880000

Then I tried the C reproducer with my own kernel config, and
it didn't show anything strange.


>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/0c8a5f06ceb3/disk-f9ff5644.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/be222e852ae2/vmlinux-f9ff5644.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/d9f42a53b05e/bzImage-f9ff5644.xz

Finally, I tried the original kernel image. It hit a different, seemingly
random bug while booting and then rebooted:

[ 36.991123][ T1] ==================================================================
[ 36.991800][ T1] BUG: KASAN: slab-out-of-bounds in copy_array+0x96/0x100
[ 36.992438][ T1] Write of size 32 at addr ffff888018c34640 by task systemd/1
[ 36.993032][ T1]
[ 36.993249][ T1] CPU: 2 PID: 1 Comm: systemd Not tainted 6.1.0-syzkaller-13139-gf9ff5644bcc0 #0
[ 36.993980][ T1] Hardware name: Red Hat KVM, BIOS 0.5.1 01/01/2011
[ 36.994520][ T1] Call Trace:
[ 36.994806][ T1] <TASK>
[ 36.995060][ T1] dump_stack_lvl+0xd1/0x138
[ 36.995488][ T1] print_report+0x15e/0x45d
[ 36.995891][ T1] ? __phys_addr+0xc8/0x140
[ 36.996293][ T1] ? copy_array+0x96/0x100
[ 36.996677][ T1] kasan_report+0xbf/0x1f0
[ 36.997063][ T1] ? copy_array+0x96/0x100
[ 36.997448][ T1] kasan_check_range+0x141/0x190
[ 36.997865][ T1] memcpy+0x3d/0x60
[ 36.998196][ T1] copy_array+0x96/0x100
[ 36.998561][ T1] copy_verifier_state+0xa9/0xc60
[ 36.998985][ T1] ? bpf_log+0x270/0x270
[ 36.999343][ T1] ? check_buffer_access.constprop.0+0x2e0/0x2e0
[ 36.999867][ T1] pop_stack+0x8c/0x2f0
[ 37.000223][ T1] do_check_common+0x5663/0xbca0
[ 37.000654][ T1] ? _raw_spin_unlock_irqrestore+0x54/0x70
[ 37.001152][ T1] ? check_helper_call+0x8ef0/0x8ef0
[ 37.001614][ T1] ? kvfree+0x46/0x50
[ 37.001963][ T1] ? check_cfg+0x6aa/0xb20
[ 37.002353][ T1] bpf_check+0x7348/0xacc0
[ 37.002751][ T1] ? find_held_lock+0x2d/0x110
[ 37.003168][ T1] ? lockdep_hardirqs_on_prepare+0x410/0x410
[ 37.003676][ T1] ? bpf_get_btf_vmlinux+0x20/0x20
[ 37.004104][ T1] ? find_held_lock+0x2d/0x110
[ 37.004524][ T1] ? bpf_prog_load+0x1486/0x2230
[ 37.004935][ T1] ? lock_downgrade+0x6e0/0x6e0
[ 37.005341][ T1] ? __might_fault+0xd9/0x180
[ 37.005747][ T1] ? memset+0x24/0x50
[ 37.006087][ T1] ? bpf_obj_name_cpy+0x148/0x1a0
[ 37.006541][ T1] bpf_prog_load+0x1543/0x2230
[ 37.006943][ T1] ? __bpf_prog_put.constprop.0+0x220/0x220
[ 37.007438][ T1] ? find_held_lock+0x2d/0x110
[ 37.007854][ T1] ? __might_fault+0xd9/0x180
[ 37.008261][ T1] ? lock_downgrade+0x6e0/0x6e0
[ 37.008673][ T1] ? bpf_lsm_bpf+0x9/0x10
[ 37.009055][ T1] __sys_bpf+0x1436/0x4ff0
[ 37.009434][ T1] ? bpf_perf_link_attach+0x520/0x520
[ 37.009875][ T1] ? lock_downgrade+0x6e0/0x6e0
[ 37.010299][ T1] __x64_sys_bpf+0x79/0xc0
[ 37.010678][ T1] ? syscall_enter_from_user_mode+0x26/0xb0
[ 37.011171][ T1] do_syscall_64+0x39/0xb0
[ 37.011561][ T1] entry_SYSCALL_64_after_hwframe+0x63/0xcd
[ 37.012058][ T1] RIP: 0033:0x7fee6fa1c5a9
...

May I ask whether it is reproducible on the latest -rc kernel?

Thanks,
Gao Xiang


> mounted in repro: https://storage.googleapis.com/syzbot-assets/7f2f76b76cd2/mount_0.gz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: [email protected]
>
> ------------[ cut here ]------------
> WARNING: CPU: 1 PID: 4386 at mm/page_alloc.c:3829 get_page_from_freeli

2023-01-05 12:13:07

by Aleksandr Nogikh

Subject: Re: [syzbot] [erofs?] WARNING: CPU: NUM PID: NUM at mm/page_alloc.c:LINE get_page_from_freeli

Hi,

On Thu, Jan 5, 2023 at 11:54 AM Xiang Gao <[email protected]> wrote:

> I wasn't able to build the kernel with this kernel config; the build fails with:
> "...
> FATAL: modpost: vmlinux.o is truncated. sechdrs[i].sh_offset=1399394064 > sizeof(*hrd)=64
> make[2]: *** [Module.symvers] Error 1
> make[1]: *** [modpost] Error 2
> make: *** [__sub-make] Error 2
> "

Could you please tell me which exact compiler/linker version you used?


> >
> > Downloadable assets:
> > disk image: https://storage.googleapis.com/syzbot-assets/0c8a5f06ceb3/disk-f9ff5644.raw.xz
> > vmlinux: https://storage.googleapis.com/syzbot-assets/be222e852ae2/vmlinux-f9ff5644.xz
> > kernel image: https://storage.googleapis.com/syzbot-assets/d9f42a53b05e/bzImage-f9ff5644.xz
>
> Finally, I tried the original kernel image. It hit a different, seemingly
> random bug while booting and then rebooted:
>
> [ 36.991123][ T1] ==================================================================
> [ 36.991800][ T1] BUG: KASAN: slab-out-of-bounds in copy_array+0x96/0x100
> [ 36.992438][ T1] Write of size 32 at addr ffff888018c34640 by task systemd/1
< .. >

Interesting!
I've just tried to boot it with qemu and it was fine.

qemu-system-x86_64 -smp 2,sockets=2,cores=1 -m 4G \
  -drive file=disk-f9ff5644.raw,format=raw -snapshot -nographic -enable-kvm

So it looks like it's some difference between these VMMs that causes
that bug to fire.

>
> May I ask whether it is reproducible on the latest -rc kernel?

We can ask syzbot about v6.2-rc2:

#syz test git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
88603b6dc419445847923fcb7fe5080067a30f98

>
> Thanks,
> Gao Xiang
>

--
Aleksandr

2023-01-05 14:40:20

by syzbot

Subject: Re: [syzbot] [erofs?] WARNING: CPU: NUM PID: NUM at mm/page_alloc.c:LINE get_page_from_freeli

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
WARNING in get_page_from_freelist

------------[ cut here ]------------
WARNING: CPU: 1 PID: 4385 at mm/page_alloc.c:3829 rmqueue mm/page_alloc.c:3829 [inline]
WARNING: CPU: 1 PID: 4385 at mm/page_alloc.c:3829 get_page_from_freelist+0xbf3/0x2ce0 mm/page_alloc.c:4280
Modules linked in:
CPU: 1 PID: 4385 Comm: kworker/u5:1 Not tainted 6.2.0-rc2-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/26/2022
Workqueue: erofs_unzipd z_erofs_decompressqueue_work
RIP: 0010:rmqueue mm/page_alloc.c:3829 [inline]
RIP: 0010:get_page_from_freelist+0xbf3/0x2ce0 mm/page_alloc.c:4280
Code: 48 c1 e8 03 42 80 3c 28 00 0f 85 18 1f 00 00 48 8b 03 f7 84 24 d8 00 00 00 00 80 00 00 48 89 44 24 68 74 08 41 83 fe 01 76 02 <0f> 0b 41 83 fe 09 0f 94 c2 41 83 fe 03 0f 96 c0 08 c2 88 54 24 50
RSP: 0018:ffffc900055e74d8 EFLAGS: 00010202
RAX: ffff88813fffae00 RBX: ffff88813fffc300 RCX: ffff88813fffabe8
RDX: 0000000000000000 RSI: 0000000000000008 RDI: ffffc900055e7718
RBP: 0000000000000002 R08: 0000000000002b49 R09: 0000000000078534
R10: 0000000000002b48 R11: 0000000000000000 R12: 0000000000002b48
R13: dffffc0000000000 R14: 0000000000000009 R15: ffff88813fffa700
FS: 0000000000000000(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ff496515829 CR3: 000000000c48e000 CR4: 0000000000350ee0
Call Trace:
<TASK>
__alloc_pages+0x1cb/0x5b0 mm/page_alloc.c:5549
alloc_pages+0x1aa/0x270 mm/mempolicy.c:2286
vm_area_alloc_pages mm/vmalloc.c:2989 [inline]
__vmalloc_area_node mm/vmalloc.c:3057 [inline]
__vmalloc_node_range+0x978/0x13c0 mm/vmalloc.c:3227
kvmalloc_node+0x156/0x1a0 mm/util.c:606
kvmalloc include/linux/slab.h:737 [inline]
kvmalloc_array include/linux/slab.h:755 [inline]
kvcalloc include/linux/slab.h:760 [inline]
z_erofs_decompress_pcluster fs/erofs/zdata.c:1035 [inline]
z_erofs_decompress_queue+0x6e2/0x3060 fs/erofs/zdata.c:1141
z_erofs_decompressqueue_work+0x77/0xb0 fs/erofs/zdata.c:1153
process_one_work+0x9bf/0x1710 kernel/workqueue.c:2289
worker_thread+0x669/0x1090 kernel/workqueue.c:2436
kthread+0x2e8/0x3a0 kernel/kthread.c:376
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
</TASK>


Tested on:

commit: 88603b6d Linux 6.2-rc2
git tree: git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
console output: https://syzkaller.appspot.com/x/log.txt?x=1193edc6480000
kernel config: https://syzkaller.appspot.com/x/.config?x=46221e8203c7aca6
dashboard link: https://syzkaller.appspot.com/bug?extid=c3729cda01706a04fb98
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2

Note: no patches were applied.

2023-01-05 16:09:22

by Gao Xiang

Subject: Re: [syzbot] [erofs?] WARNING: CPU: NUM PID: NUM at mm/page_alloc.c:LINE get_page_from_freeli


Hi Aleksandr,

On 2023/1/5 19:14, Aleksandr Nogikh wrote:
> Hi,
>
> On Thu, Jan 5, 2023 at 11:54 AM Xiang Gao <[email protected]> wrote:
>
>> I wasn't able to build the kernel with this kernel config; the build fails with:
>> "...
>> FATAL: modpost: vmlinux.o is truncated. sechdrs[i].sh_offset=1399394064 > sizeof(*hrd)=64
>> make[2]: *** [Module.symvers] Error 1
>> make[1]: *** [modpost] Error 2
>> make: *** [__sub-make] Error 2
>> "
>
> Could you please tell me which exact compiler/linker version you used?

Thanks for your help.

GCC 9.2.1 on my development server.

>
>
>>>
>>> Downloadable assets:
>>> disk image: https://storage.googleapis.com/syzbot-assets/0c8a5f06ceb3/disk-f9ff5644.raw.xz
>>> vmlinux: https://storage.googleapis.com/syzbot-assets/be222e852ae2/vmlinux-f9ff5644.xz
>>> kernel image: https://storage.googleapis.com/syzbot-assets/d9f42a53b05e/bzImage-f9ff5644.xz
>>
>> Finally, I tried the original kernel image. It hit a different, seemingly
>> random bug while booting and then rebooted:
>>
>> [ 36.991123][ T1] ==================================================================
>> [ 36.991800][ T1] BUG: KASAN: slab-out-of-bounds in copy_array+0x96/0x100
>> [ 36.992438][ T1] Write of size 32 at addr ffff888018c34640 by task systemd/1
> < .. >
>
> Interesting!
> I've just tried to boot it with qemu and it was fine.
>
> qemu-system-x86_64 -smp 2,sockets=2,cores=1 -m 4G -drive
> file=disk-f9ff5644.raw,format=raw -snapshot -nographic -enable-kvm
>
> So it looks like it's some difference between these VMMs that causes
> that bug to fire.

I think the reason is that the rootfs I used runs a more complicated
workload than the given one.

>
>>
>> May I ask whether it is reproducible on the latest -rc kernel?
>
> We can ask syzbot about v6.2-rc2:
>
> #syz test git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git
> 88603b6dc419445847923fcb7fe5080067a30f98

I think I know the root cause: it seems that kvcalloc() doesn't support
__GFP_NOFAIL, so I will use kcalloc() directly instead.
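
For readers following the trace, here is a minimal userspace sketch of the
condition that seems to be firing. It assumes that the warning at
mm/page_alloc.c:3829 rejects __GFP_NOFAIL requests above order 1, that
R14 = 9 in the register dump is the requested order, and that the flag bit
is 0x8000 (the constant tested in the "Code:" bytes). The sketch_* names
are hypothetical and are not kernel APIs.

```c
#include <stdbool.h>

/* Assumed bit value, inferred from the 0x8000 test in the "Code:" bytes. */
#define SKETCH_GFP_NOFAIL 0x8000u

/*
 * Model of the check assumed to sit at mm/page_alloc.c:3829:
 * a no-fail allocation request larger than order 1 triggers the warning.
 */
static bool sketch_nofail_order_warns(unsigned int gfp_flags, unsigned int order)
{
    return (gfp_flags & SKETCH_GFP_NOFAIL) && order > 1;
}

/* Smallest order such that 2^order pages cover nr_pages (capped at 63). */
static unsigned int sketch_order_for_pages(unsigned long long nr_pages)
{
    unsigned int order = 0;

    while (order < 63 && (1ULL << order) < nr_pages)
        order++;
    return order;
}
```

Under these assumptions, sketch_nofail_order_warns(SKETCH_GFP_NOFAIL, 9)
is true, while the same flag at order 0 or 1 passes silently. That would be
consistent with the warning only showing up when the allocation goes
through the vmalloc path seen in the trace.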

Thanks,
Gao Xiang

>
>>
>> Thanks,
>> Gao Xiang
>>
>
> --
> Aleksandr