Hello,
syzbot found the following issue on:
HEAD commit: 6dc544b66971 Add linux-next specific files for 20240528
git tree: linux-next
console output: https://syzkaller.appspot.com/x/log.txt?x=14c7f806980000
kernel config: https://syzkaller.appspot.com/x/.config?x=6a363b35598e573d
dashboard link: https://syzkaller.appspot.com/bug?extid=981b8efffb3d71c46bef
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
Unfortunately, I don't have any reproducer for this issue yet.
Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/334699ab67f8/disk-6dc544b6.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/4ca32b2218ce/vmlinux-6dc544b6.xz
kernel image: https://storage.googleapis.com/syzbot-assets/400bc5f019b3/bzImage-6dc544b6.xz
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]
------------[ cut here ]------------
UBSAN: shift-out-of-bounds in mm/shrinker.c:406:18
shift exponent -1 is negative
CPU: 0 PID: 5278 Comm: syz-executor.1 Not tainted 6.10.0-rc1-next-20240528-syzkaller #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
ubsan_epilogue lib/ubsan.c:231 [inline]
__ubsan_handle_shift_out_of_bounds+0x3c8/0x420 lib/ubsan.c:468
do_shrink_slab+0xe26/0x1160 mm/shrinker.c:406
shrink_slab_memcg mm/shrinker.c:548 [inline]
shrink_slab+0x87c/0x14d0 mm/shrinker.c:626
shrink_node_memcgs mm/vmscan.c:5923 [inline]
shrink_node+0xb82/0x4150 mm/vmscan.c:5961
shrink_zones mm/vmscan.c:6205 [inline]
do_try_to_free_pages+0x789/0x1cb0 mm/vmscan.c:6267
try_to_free_mem_cgroup_pages+0x48f/0xb10 mm/vmscan.c:6598
try_charge_memcg+0x704/0x1850 mm/memcontrol.c:2946
obj_cgroup_charge_pages mm/memcontrol.c:3420 [inline]
__memcg_kmem_charge_page+0xe2/0x250 mm/memcontrol.c:3446
__alloc_pages_noprof+0x28c/0x6c0 mm/page_alloc.c:4712
__alloc_pages_node_noprof include/linux/gfp.h:269 [inline]
alloc_pages_node_noprof include/linux/gfp.h:296 [inline]
bpf_ringbuf_area_alloc kernel/bpf/ringbuf.c:122 [inline]
bpf_ringbuf_alloc+0xcb/0x420 kernel/bpf/ringbuf.c:170
ringbuf_map_alloc+0x1d7/0x2f0 kernel/bpf/ringbuf.c:204
map_create+0x90c/0x1200 kernel/bpf/syscall.c:1333
__sys_bpf+0x6d1/0x810 kernel/bpf/syscall.c:5669
__do_sys_bpf kernel/bpf/syscall.c:5794 [inline]
__se_sys_bpf kernel/bpf/syscall.c:5792 [inline]
__x64_sys_bpf+0x7c/0x90 kernel/bpf/syscall.c:5792
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf3/0x230 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7efea107cee9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007efea1de60c8 EFLAGS: 00000246 ORIG_RAX: 0000000000000141
RAX: ffffffffffffffda RBX: 00007efea11b3fa0 RCX: 00007efea107cee9
RDX: 0000000000000048 RSI: 00000000200002c0 RDI: 0000000000000000
RBP: 00007efea10c947f R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000000b R14: 00007efea11b3fa0 R15: 00007fff6651b5d8
</TASK>
---[ end trace ]---
---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].
syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title
If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)
If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report
If you want to undo deduplication, reply with:
#syz undup
On Sat, Jun 01, 2024 at 12:08:25AM -0700, syzbot wrote:
> ------------[ cut here ]------------
> UBSAN: shift-out-of-bounds in mm/shrinker.c:406:18
> shift exponent -1 is negative
> CPU: 0 PID: 5278 Comm: syz-executor.1 Not tainted 6.10.0-rc1-next-20240528-syzkaller #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/02/2024
> Call Trace:
> <TASK>
> __dump_stack lib/dump_stack.c:88 [inline]
> dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
> ubsan_epilogue lib/ubsan.c:231 [inline]
> __ubsan_handle_shift_out_of_bounds+0x3c8/0x420 lib/ubsan.c:468
> do_shrink_slab+0xe26/0x1160 mm/shrinker.c:406

	total_scan = nr >> priority;

OK, that means the shrinker has been passed a priority of -1 from
the core memory reclaim code. So it is more likely that something
has gone wrong with the higher level struct scan_control
sc->priority handling, not something in the shrinker code itself.

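To illustrate why UBSAN flags that line: in C, a shift by a negative
count (or by a count at least as wide as the type) is undefined
behaviour. Below is a minimal userspace sketch, not kernel code; the
helper names shift_count_is_valid() and checked_shift_right() are
made up for illustration.

```c
#include <limits.h>
#include <stdbool.h>

/* Hypothetical helper: true if 'count' is a defined shift count for unsigned long. */
static bool shift_count_is_valid(int count)
{
	return count >= 0 && count < (int)(sizeof(unsigned long) * CHAR_BIT);
}

/*
 * Hypothetical helper: performs nr >> priority only when the shift is
 * defined. Returns false (and leaves *out untouched) otherwise.
 */
static bool checked_shift_right(unsigned long nr, int priority, unsigned long *out)
{
	if (!shift_count_is_valid(priority))
		return false;
	*out = nr >> priority;
	return true;
}
```

With priority == -1, as in the report, checked_shift_right() refuses
the shift. That is only a way to see the problem; it does not say
where the bad priority came from.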
> shrink_slab_memcg mm/shrinker.c:548 [inline]
> shrink_slab+0x87c/0x14d0 mm/shrinker.c:626
> shrink_node_memcgs mm/vmscan.c:5923 [inline]
> shrink_node+0xb82/0x4150 mm/vmscan.c:5961
> shrink_zones mm/vmscan.c:6205 [inline]
> do_try_to_free_pages+0x789/0x1cb0 mm/vmscan.c:6267

This has a loop that does:

	do {
		.....
		shrink_zones(zonelist, sc);
		.....
	} while (--sc->priority >= 0);

and all the callers initialise sc->priority to DEF_PRIORITY. Hence
I can't see how shrink_zones() gets called with sc->priority == -1
from here or anywhere else that decrements sc->priority. This
needs someone with more core mm reclaim expertise than I have to
triage this further.
-Dave.
--
Dave Chinner
[email protected]
Hi,
I think this bug was introduced by commit 6be5e186fd65
("mm: vmscan: restore incremental cgroup iteration"), and
can be fixed by commit 9c8805439853 ("mm: vmscan: reset sc->priority on
retry").
Thanks,
Qi
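
For readers following along, here is a rough userspace model of how a
retry path can hand -1 to the shrinker, and how resetting the priority
avoids it. It is a sketch based only on the commit titles above and
the loop quoted earlier in the thread, not the actual kernel code:
struct scan_control_model, the full_walk flag and
lowest_priority_seen() are hypothetical names.

```c
#include <stdbool.h>

#define DEF_PRIORITY 12	/* assumed to match the kernel's DEF_PRIORITY */

/* Hypothetical stand-in for the relevant bits of struct scan_control. */
struct scan_control_model {
	int priority;
	bool full_walk;
};

/*
 * Runs the reclaim loop once, retries it once, and returns the lowest
 * priority value the loop body (where shrink_zones() would run) saw.
 */
static int lowest_priority_seen(bool reset_on_retry)
{
	struct scan_control_model sc = { .priority = DEF_PRIORITY, .full_walk = false };
	int initial_priority = sc.priority;
	int lowest = sc.priority;

retry:
	do {
		/* shrink_zones() would be called here with sc.priority. */
		if (sc.priority < lowest)
			lowest = sc.priority;
	} while (--sc.priority >= 0);

	/* The loop always exits with sc.priority == -1. */
	if (!sc.full_walk) {
		if (reset_on_retry)
			sc.priority = initial_priority;
		sc.full_walk = true;
		goto retry;
	}
	return lowest;
}
```

In this model lowest_priority_seen(false) returns -1, because the
retry re-enters the loop body with the value left over from the first
pass, while lowest_priority_seen(true) returns 0.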
On Mon, Jun 03, 2024 at 11:25:42AM +0800, Qi Zheng wrote:
> Hi,
>
> I think this bug was introduced by commit 6be5e186fd65
> ("mm: vmscan: restore incremental cgroup iteration"), and
> can be fixed by commit 9c8805439853 ("mm: vmscan: reset sc->priority on
> retry").
I'm almost sure it's the same issue.
Thanks