Hello,
syzbot found the following issue on:
HEAD commit: 04b8076df253 Merge tag 'firewire-fixes-6.8-rc7' of git://g..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=150add9a180000
kernel config: https://syzkaller.appspot.com/x/.config?x=be0288b26c967205
dashboard link: https://syzkaller.appspot.com/bug?extid=d7c7a495a5e466c031b6
compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
Unfortunately, I don't have any reproducer for this issue yet.
Downloadable assets:
disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7bc7510fe41f/non_bootable_disk-04b8076d.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/926d19cdf690/vmlinux-04b8076d.xz
kernel image: https://storage.googleapis.com/syzbot-assets/c0754e78c2bc/bzImage-04b8076d.xz
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+d7c7a495a5e466c031b6@syzkaller.appspotmail.com
==================================================================
BUG: KASAN: slab-use-after-free in p9_fid_destroy+0xb5/0xd0 net/9p/client.c:884
Read of size 8 at addr ffff888064295880 by task kworker/u16:0/11
CPU: 0 PID: 11 Comm: kworker/u16:0 Not tainted 6.8.0-rc6-syzkaller-00250-g04b8076df253 #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Workqueue: events_unbound v9fs_upload_to_server_worker
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xd9/0x1b0 lib/dump_stack.c:106
print_address_description mm/kasan/report.c:377 [inline]
print_report+0xc4/0x620 mm/kasan/report.c:488
kasan_report+0xda/0x110 mm/kasan/report.c:601
p9_fid_destroy+0xb5/0xd0 net/9p/client.c:884
p9_client_clunk+0x12a/0x170 net/9p/client.c:1456
p9_fid_put include/net/9p/client.h:278 [inline]
v9fs_free_request+0xdc/0x110 fs/9p/vfs_addr.c:128
netfs_free_request+0x225/0x670 fs/netfs/objects.c:97
netfs_put_request+0x19b/0x1f0 fs/netfs/objects.c:130
netfs_free_subrequest fs/netfs/objects.c:178 [inline]
netfs_put_subrequest+0x3be/0x600 fs/netfs/objects.c:192
v9fs_upload_to_server fs/9p/vfs_addr.c:36 [inline]
v9fs_upload_to_server_worker+0x182/0x360 fs/9p/vfs_addr.c:44
process_one_work+0x889/0x15e0 kernel/workqueue.c:2633
process_scheduled_works kernel/workqueue.c:2706 [inline]
worker_thread+0x8b9/0x12a0 kernel/workqueue.c:2787
kthread+0x2c6/0x3b0 kernel/kthread.c:388
ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1b/0x30 arch/x86/entry/entry_64.S:243
</TASK>
Allocated by task 14429:
kasan_save_stack+0x33/0x60 mm/kasan/common.c:47
kasan_save_track+0x14/0x30 mm/kasan/common.c:68
poison_kmalloc_redzone mm/kasan/common.c:370 [inline]
__kasan_kmalloc+0xaa/0xb0 mm/kasan/common.c:387
kmalloc include/linux/slab.h:590 [inline]
kzalloc include/linux/slab.h:711 [inline]
p9_fid_create+0x45/0x470 net/9p/client.c:853
p9_client_walk+0xc7/0x550 net/9p/client.c:1154
clone_fid fs/9p/fid.h:23 [inline]
v9fs_fid_clone fs/9p/fid.h:33 [inline]
v9fs_file_open+0x623/0xc30 fs/9p/vfs_file.c:56
do_dentry_open+0x8da/0x18c0 fs/open.c:953
do_open fs/namei.c:3645 [inline]
path_openat+0x1e00/0x29a0 fs/namei.c:3802
do_filp_open+0x1de/0x440 fs/namei.c:3829
do_sys_openat2+0x17a/0x1e0 fs/open.c:1404
do_sys_open fs/open.c:1419 [inline]
__do_sys_openat fs/open.c:1435 [inline]
__se_sys_openat fs/open.c:1430 [inline]
__x64_sys_openat+0x175/0x210 fs/open.c:1430
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xd5/0x270 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x6f/0x77
Freed by task 18115:
kasan_save_stack+0x33/0x60 mm/kasan/common.c:47
kasan_save_track+0x14/0x30 mm/kasan/common.c:68
kasan_save_free_info+0x3b/0x60 mm/kasan/generic.c:589
poison_slab_object mm/kasan/common.c:240 [inline]
__kasan_slab_free+0x11d/0x1a0 mm/kasan/common.c:256
kasan_slab_free include/linux/kasan.h:184 [inline]
slab_free_hook mm/slub.c:2121 [inline]
slab_free mm/slub.c:4299 [inline]
kfree+0x124/0x370 mm/slub.c:4409
p9_client_destroy+0x14c/0x480 net/9p/client.c:1070
v9fs_session_close+0x49/0x2d0 fs/9p/v9fs.c:506
v9fs_kill_super+0x4d/0xa0 fs/9p/vfs_super.c:223
deactivate_locked_super+0xbe/0x1a0 fs/super.c:472
deactivate_super+0xde/0x100 fs/super.c:505
cleanup_mnt+0x222/0x450 fs/namespace.c:1267
task_work_run+0x14f/0x250 kernel/task_work.c:180
resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
exit_to_user_mode_loop kernel/entry/common.c:108 [inline]
exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline]
__syscall_exit_to_user_mode_work kernel/entry/common.c:201 [inline]
syscall_exit_to_user_mode+0x278/0x2a0 kernel/entry/common.c:212
do_syscall_64+0xe5/0x270 arch/x86/entry/common.c:89
entry_SYSCALL_64_after_hwframe+0x6f/0x77
The buggy address belongs to the object at ffff888064295880
which belongs to the cache kmalloc-96 of size 96
The buggy address is located 0 bytes inside of
freed 96-byte region [ffff888064295880, ffff8880642958e0)
The buggy address belongs to the physical page:
page:ffffea000190a540 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x64295
ksm flags: 0xfff00000000800(slab|node=0|zone=1|lastcpupid=0x7ff)
page_type: 0xffffffff()
raw: 00fff00000000800 ffff888014c42780 ffffea0000c8c240 dead000000000007
raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 0, migratetype Unmovable, gfp_mask 0x112c40(GFP_NOFS|__GFP_NOWARN|__GFP_NORETRY|__GFP_HARDWALL), pid 10902, tgid 10901 (syz-executor.3), ts 360371748345, free_ts 360348940469
set_page_owner include/linux/page_owner.h:31 [inline]
post_alloc_hook+0x2d4/0x350 mm/page_alloc.c:1533
prep_new_page mm/page_alloc.c:1540 [inline]
get_page_from_freelist+0xa28/0x3780 mm/page_alloc.c:3311
__alloc_pages+0x22f/0x2440 mm/page_alloc.c:4567
__alloc_pages_node include/linux/gfp.h:238 [inline]
alloc_pages_node include/linux/gfp.h:261 [inline]
alloc_slab_page mm/slub.c:2190 [inline]
allocate_slab mm/slub.c:2354 [inline]
new_slab+0xcc/0x3a0 mm/slub.c:2407
___slab_alloc+0x4af/0x19a0 mm/slub.c:3540
__slab_alloc.constprop.0+0x56/0xb0 mm/slub.c:3625
__slab_alloc_node mm/slub.c:3678 [inline]
slab_alloc_node mm/slub.c:3850 [inline]
__do_kmalloc_node mm/slub.c:3980 [inline]
__kmalloc+0x3b8/0x440 mm/slub.c:3994
kmalloc_array include/linux/slab.h:627 [inline]
kcalloc include/linux/slab.h:658 [inline]
ext4_find_extent+0x95c/0xce0 fs/ext4/extents.c:914
ext4_ext_map_blocks+0x26b/0x5bb0 fs/ext4/extents.c:4143
ext4_map_blocks+0x61d/0x17d0 fs/ext4/inode.c:623
ext4_iomap_alloc fs/ext4/inode.c:3318 [inline]
ext4_iomap_begin+0x472/0x7d0 fs/ext4/inode.c:3368
iomap_iter+0x48b/0xff0 fs/iomap/iter.c:91
__iomap_dio_rw+0x6c4/0x1bd0 fs/iomap/direct-io.c:658
iomap_dio_rw+0x40/0xa0 fs/iomap/direct-io.c:748
ext4_dio_write_iter fs/ext4/file.c:577 [inline]
ext4_file_write_iter+0x12c6/0x1960 fs/ext4/file.c:696
call_write_iter include/linux/fs.h:2087 [inline]
iter_file_splice_write+0x908/0x10b0 fs/splice.c:743
page last free pid 5164 tgid 5164 stack trace:
reset_page_owner include/linux/page_owner.h:24 [inline]
free_pages_prepare mm/page_alloc.c:1140 [inline]
free_unref_page_prepare+0x527/0xb10 mm/page_alloc.c:2346
free_unref_page+0x33/0x3c0 mm/page_alloc.c:2486
skb_free_frag include/linux/skbuff.h:3270 [inline]
skb_free_head+0xa6/0x1b0 net/core/skbuff.c:996
skb_release_data+0x5c0/0x880 net/core/skbuff.c:1028
skb_release_all net/core/skbuff.c:1094 [inline]
__kfree_skb+0x51/0x70 net/core/skbuff.c:1108
tcp_rcv_established+0xd72/0x20f0 net/ipv4/tcp_input.c:6080
tcp_v4_do_rcv+0x6ab/0xa50 net/ipv4/tcp_ipv4.c:1906
sk_backlog_rcv include/net/sock.h:1092 [inline]
__release_sock+0x31b/0x400 net/core/sock.c:2972
release_sock+0x5a/0x220 net/core/sock.c:3538
tcp_sendmsg+0x38/0x50 net/ipv4/tcp.c:1342
inet_sendmsg+0xb9/0x140 net/ipv4/af_inet.c:850
sock_sendmsg_nosec net/socket.c:730 [inline]
__sock_sendmsg net/socket.c:745 [inline]
sock_write_iter+0x4b8/0x5c0 net/socket.c:1160
call_write_iter include/linux/fs.h:2087 [inline]
new_sync_write fs/read_write.c:497 [inline]
vfs_write+0x6de/0x1110 fs/read_write.c:590
ksys_write+0x1f8/0x260 fs/read_write.c:643
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xd5/0x270 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x6f/0x77
Memory state around the buggy address:
 ffff888064295780: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
 ffff888064295800: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
>ffff888064295880: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
                   ^
 ffff888064295900: 00 00 00 00 00 00 00 00 03 fc fc fc fc fc fc fc
 ffff888064295980: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
==================================================================
---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at syzkaller@googlegroups.com.
syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title
If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)
If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report
If you want to undo deduplication, reply with:
#syz undup
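For readers following the stacks in the report: the fid is freed during unmount in p9_client_destroy(), and is then read again when the workqueue drops the last reference through p9_fid_put() -> p9_client_clunk() -> p9_fid_destroy(). Below is a minimal userspace sketch of that ordering. It is only a model, not kernel code: the model_* names, the fixed-size table, and the "freed" flag (standing in for kfree()) are all invented for illustration, and no real locking or concurrency is modelled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_FIDS 4

/* Hypothetical stand-in for struct p9_fid; not the kernel's layout. */
struct model_fid {
	int refcount;	/* references still held, e.g. by a netfs request */
	bool in_table;	/* still registered in the client's fid table */
	bool freed;	/* set where the kernel would call kfree() */
};

struct model_client {
	struct model_fid fids[MODEL_MAX_FIDS];
	int stale_accesses;	/* times an already freed fid was touched */
};

/* Models p9_fid_destroy(): unregister the fid, then free it. */
static void model_fid_destroy(struct model_client *c, struct model_fid *f)
{
	if (f->freed) {
		/* The real kernel would dereference freed memory here. */
		c->stale_accesses++;
		return;
	}
	f->in_table = false;
	f->freed = true;
}

/* Models p9_client_destroy(): frees every fid left, ignoring refcounts. */
static void model_client_destroy(struct model_client *c)
{
	for (size_t i = 0; i < MODEL_MAX_FIDS; i++)
		if (c->fids[i].in_table)
			model_fid_destroy(c, &c->fids[i]);
}

/* Models p9_fid_put(): dropping the last reference destroys the fid. */
static void model_fid_put(struct model_client *c, struct model_fid *f)
{
	assert(f->refcount > 0);
	if (--f->refcount == 0)
		model_fid_destroy(c, f);
}

/* Runs one fid through both paths and returns the stale access count. */
static int model_run_race(bool teardown_first)
{
	struct model_client c = {0};
	struct model_fid *f = &c.fids[0];

	f->refcount = 1;
	f->in_table = true;

	if (teardown_first) {
		model_client_destroy(&c);	/* unmount path */
		model_fid_put(&c, f);		/* worker drops its ref */
	} else {
		model_fid_put(&c, f);
		model_client_destroy(&c);
	}
	return c.stale_accesses;
}
```

With teardown first, model_run_race(true) counts one stale access, which corresponds to the KASAN read; with the put first, model_run_race(false) counts none. The model does not cover the case where both paths run truly concurrently.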
syzbot has found a reproducer for the following issue on:
HEAD commit: ea5f6ad9ad96 Merge tag 'platform-drivers-x86-v6.10-1' of g..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=17340168980000
kernel config: https://syzkaller.appspot.com/x/.config?x=f1cd4092753f97c5
dashboard link: https://syzkaller.appspot.com/bug?extid=d7c7a495a5e466c031b6
compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=1065b4f0980000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11df3084980000
Downloadable assets:
disk image (non-bootable): https://storage.googleapis.com/syzbot-assets/7bc7510fe41f/non_bootable_disk-ea5f6ad9.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/a551265cc1bb/vmlinux-ea5f6ad9.xz
kernel image: https://storage.googleapis.com/syzbot-assets/b814900d9571/bzImage-ea5f6ad9.xz
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: syzbot+d7c7a495a5e466c031b6@syzkaller.appspotmail.com
9pnet: Found fid 3 not clunked
==================================================================
BUG: KASAN: slab-use-after-free in p9_fid_destroy+0xb5/0xd0 net/9p/client.c:885
Read of size 8 at addr ffff888023d14a00 by task syz-executor145/5220
CPU: 3 PID: 5220 Comm: syz-executor145 Not tainted 6.9.0-syzkaller-08284-gea5f6ad9ad96 #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:114
print_address_description mm/kasan/report.c:377 [inline]
print_report+0xc3/0x620 mm/kasan/report.c:488
kasan_report+0xd9/0x110 mm/kasan/report.c:601
p9_fid_destroy+0xb5/0xd0 net/9p/client.c:885
p9_client_destroy+0x14c/0x480 net/9p/client.c:1071
v9fs_session_close+0x49/0x2d0 fs/9p/v9fs.c:506
v9fs_kill_super+0x4d/0xa0 fs/9p/vfs_super.c:196
deactivate_locked_super+0xbe/0x1a0 fs/super.c:472
deactivate_super+0xde/0x100 fs/super.c:505
cleanup_mnt+0x222/0x450 fs/namespace.c:1267
task_work_run+0x14e/0x250 kernel/task_work.c:180
resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
exit_to_user_mode_loop kernel/entry/common.c:114 [inline]
exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline]
__syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
syscall_exit_to_user_mode+0x278/0x2a0 kernel/entry/common.c:218
do_syscall_64+0xdc/0x260 arch/x86/entry/common.c:89
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f373e54ddb7
Code: 07 00 48 83 c4 08 5b 5d c3 66 2e 0f 1f 84 00 00 00 00 00 c3 66 2e 0f 1f 84 00 00 00 00 00 0f 1f 44 00 00 b8 a6 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 c7 c2 b8 ff ff ff f7 d8 64 89 02 b8
RSP: 002b:00007ffe809881b8 EFLAGS: 00000206 ORIG_RAX: 00000000000000a6
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007f373e54ddb7
RDX: 0000000000000000 RSI: 0000000000000009 RDI: 00007ffe80988270
RBP: 00007ffe80988270 R08: 0000000000000000 R09: 0000000000000000
R10: 00000000ffffffff R11: 0000000000000206 R12: 00007ffe809892e0
R13: 0000555582e557c0 R14: 431bde82d7b634db R15: 00007ffe80989300
</TASK>
Allocated by task 10647:
kasan_save_stack+0x33/0x60 mm/kasan/common.c:47
kasan_save_track+0x14/0x30 mm/kasan/common.c:68
poison_kmalloc_redzone mm/kasan/common.c:370 [inline]
__kasan_kmalloc+0xaa/0xb0 mm/kasan/common.c:387
kmalloc include/linux/slab.h:628 [inline]
kzalloc include/linux/slab.h:749 [inline]
p9_fid_create+0x45/0x470 net/9p/client.c:854
p9_client_walk+0xc6/0x550 net/9p/client.c:1155
clone_fid fs/9p/fid.h:23 [inline]
v9fs_fid_clone fs/9p/fid.h:33 [inline]
v9fs_file_open+0x5b5/0xae0 fs/9p/vfs_file.c:56
do_dentry_open+0x8da/0x18c0 fs/open.c:955
do_open fs/namei.c:3650 [inline]
path_openat+0x1dfb/0x2990 fs/namei.c:3807
do_filp_open+0x1dc/0x430 fs/namei.c:3834
do_sys_openat2+0x17a/0x1e0 fs/open.c:1406
do_sys_open fs/open.c:1421 [inline]
__do_sys_creat fs/open.c:1497 [inline]
__se_sys_creat fs/open.c:1491 [inline]
__x64_sys_creat+0xcd/0x120 fs/open.c:1491
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xcf/0x260 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Freed by task 1091:
kasan_save_stack+0x33/0x60 mm/kasan/common.c:47
kasan_save_track+0x14/0x30 mm/kasan/common.c:68
kasan_save_free_info+0x3b/0x60 mm/kasan/generic.c:579
poison_slab_object mm/kasan/common.c:240 [inline]
__kasan_slab_free+0x11d/0x1a0 mm/kasan/common.c:256
kasan_slab_free include/linux/kasan.h:184 [inline]
slab_free_hook mm/slub.c:2121 [inline]
slab_free mm/slub.c:4353 [inline]
kfree+0x129/0x3a0 mm/slub.c:4463
p9_client_clunk+0x12a/0x170 net/9p/client.c:1457
p9_fid_put include/net/9p/client.h:280 [inline]
v9fs_free_request+0xdc/0x110 fs/9p/vfs_addr.c:138
netfs_free_request+0x22c/0x690 fs/netfs/objects.c:133
netfs_put_request+0x19b/0x1f0 fs/netfs/objects.c:165
netfs_write_collection_worker+0x19d0/0x59e0 fs/netfs/write_collect.c:701
process_one_work+0x9fb/0x1b60 kernel/workqueue.c:3231
process_scheduled_works kernel/workqueue.c:3312 [inline]
worker_thread+0x6c8/0xf70 kernel/workqueue.c:3393
kthread+0x2c1/0x3a0 kernel/kthread.c:389
ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
The buggy address belongs to the object at ffff888023d14a00
which belongs to the cache kmalloc-96 of size 96
The buggy address is located 0 bytes inside of
freed 96-byte region [ffff888023d14a00, ffff888023d14a60)
The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x23d14
anon flags: 0xfff00000000800(slab|node=0|zone=1|lastcpupid=0x7ff)
page_type: 0xffffffff()
raw: 00fff00000000800 ffff888015442280 0000000000000000 dead000000000001
raw: 0000000000000000 0000000000200020 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 0, migratetype Unmovable, gfp_mask 0x52cc0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP), pid 1, tgid 1 (swapper/0), ts 12927324375, free_ts 10880195922
set_page_owner include/linux/page_owner.h:32 [inline]
post_alloc_hook+0x2d4/0x350 mm/page_alloc.c:1534
prep_new_page mm/page_alloc.c:1541 [inline]
get_page_from_freelist+0xa28/0x3780 mm/page_alloc.c:3317
__alloc_pages+0x22b/0x2460 mm/page_alloc.c:4575
__alloc_pages_node include/linux/gfp.h:238 [inline]
alloc_pages_node include/linux/gfp.h:261 [inline]
alloc_slab_page mm/slub.c:2190 [inline]
allocate_slab mm/slub.c:2353 [inline]
new_slab+0xcc/0x3a0 mm/slub.c:2406
___slab_alloc+0xd28/0x1810 mm/slub.c:3592
__slab_alloc.constprop.0+0x56/0xb0 mm/slub.c:3682
__slab_alloc_node mm/slub.c:3735 [inline]
slab_alloc_node mm/slub.c:3908 [inline]
kmalloc_trace+0x306/0x340 mm/slub.c:4065
kmalloc include/linux/slab.h:628 [inline]
kzalloc include/linux/slab.h:749 [inline]
dev_pm_qos_expose_flags+0x96/0x310 drivers/base/power/qos.c:782
usb_hub_create_port_device+0x8fd/0xde0 drivers/usb/core/port.c:812
hub_configure drivers/usb/core/hub.c:1710 [inline]
hub_probe+0x1e31/0x3210 drivers/usb/core/hub.c:1965
usb_probe_interface+0x309/0x9d0 drivers/usb/core/driver.c:399
call_driver_probe drivers/base/dd.c:578 [inline]
really_probe+0x23e/0xa90 drivers/base/dd.c:656
__driver_probe_device+0x1de/0x440 drivers/base/dd.c:798
driver_probe_device+0x4c/0x1b0 drivers/base/dd.c:828
__device_attach_driver+0x1df/0x310 drivers/base/dd.c:956
bus_for_each_drv+0x157/0x1e0 drivers/base/bus.c:457
page last free pid 62 tgid 62 stack trace:
reset_page_owner include/linux/page_owner.h:25 [inline]
free_pages_prepare mm/page_alloc.c:1141 [inline]
free_unref_page_prepare+0x527/0xb10 mm/page_alloc.c:2347
free_unref_page+0x33/0x3c0 mm/page_alloc.c:2487
vfree+0x181/0x7a0 mm/vmalloc.c:3340
delayed_vfree_work+0x56/0x70 mm/vmalloc.c:3261
process_one_work+0x9fb/0x1b60 kernel/workqueue.c:3231
process_scheduled_works kernel/workqueue.c:3312 [inline]
worker_thread+0x6c8/0xf70 kernel/workqueue.c:3393
kthread+0x2c1/0x3a0 kernel/kthread.c:389
ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
Memory state around the buggy address:
 ffff888023d14900: 00 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc
 ffff888023d14980: 00 00 00 00 00 00 00 00 00 00 fc fc fc fc fc fc
>ffff888023d14a00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
                   ^
 ffff888023d14a80: 00 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc
 ffff888023d14b00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
==================================================================
---
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.
On Fri, 17 May 2024 04:31:28 -0700
> syzbot has found a reproducer for the following issue on:
>
> HEAD commit: ea5f6ad9ad96 Merge tag 'platform-drivers-x86-v6.10-1' of g..
> git tree: upstream
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11df3084980000
#syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git ea5f6ad9ad96
--- x/include/net/9p/client.h
+++ y/include/net/9p/client.h
@@ -11,6 +11,7 @@
 #include <linux/utsname.h>
 #include <linux/idr.h>
+#include <linux/mutex.h>
 #include <linux/tracepoint-defs.h>
 /* Number of requests per row */
@@ -122,6 +123,7 @@ struct p9_client {
 	struct idr fids;
 	struct idr reqs;
+	struct mutex destroy_mutex;
 	char name[__NEW_UTS_LEN + 1];
 };
--- x/net/9p/client.c
+++ y/net/9p/client.c
@@ -1041,6 +1041,7 @@ struct p9_client *p9_client_create(const
 					   0, 0, P9_HDRSZ + 4,
 					   clnt->msize - (P9_HDRSZ + 4),
 					   NULL);
+	mutex_init(&clnt->destroy_mutex);
 	return clnt;
@@ -1065,11 +1066,13 @@ void p9_client_destroy(struct p9_client
 		clnt->trans_mod->close(clnt);
 	v9fs_put_trans(clnt->trans_mod);
+	mutex_lock(&clnt->destroy_mutex);
 	idr_for_each_entry(&clnt->fids, fid, id) {
 		pr_info("Found fid %d not clunked\n", fid->fid);
 		p9_fid_destroy(fid);
 	}
+	mutex_unlock(&clnt->destroy_mutex);
 	p9_tag_cleanup(clnt);
@@ -1454,7 +1457,10 @@ error:
 		if (retries++ == 0)
 			goto again;
 	} else {
-		p9_fid_destroy(fid);
+		if (mutex_trylock(&clnt->destroy_mutex)) {
+			p9_fid_destroy(fid);
+			mutex_unlock(&clnt->destroy_mutex);
+		}
 	}
 	return err;
 }
--
Hello,
syzbot has tested the proposed patch but the reproducer is still triggering an issue:
KASAN: slab-use-after-free Read in p9_fid_destroy
==================================================================
BUG: KASAN: slab-use-after-free in p9_fid_destroy+0xb5/0xd0 net/9p/client.c:885
Read of size 8 at addr ffff88801d65ea80 by task kworker/u32:10/1148
CPU: 1 PID: 1148 Comm: kworker/u32:10 Not tainted 6.9.0-syzkaller-08284-gea5f6ad9ad96-dirty #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Workqueue: events_unbound netfs_write_collection_worker
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x116/0x1f0 lib/dump_stack.c:114
print_address_description mm/kasan/report.c:377 [inline]
print_report+0xc3/0x620 mm/kasan/report.c:488
kasan_report+0xd9/0x110 mm/kasan/report.c:601
p9_fid_destroy+0xb5/0xd0 net/9p/client.c:885
p9_client_clunk+0x175/0x190 net/9p/client.c:1461
p9_fid_put include/net/9p/client.h:282 [inline]
v9fs_free_request+0xdc/0x110 fs/9p/vfs_addr.c:138
netfs_free_request+0x22c/0x690 fs/netfs/objects.c:133
netfs_put_request+0x19b/0x1f0 fs/netfs/objects.c:165
netfs_write_collection_worker+0x19d0/0x59e0 fs/netfs/write_collect.c:701
process_one_work+0x9fb/0x1b60 kernel/workqueue.c:3231
process_scheduled_works kernel/workqueue.c:3312 [inline]
worker_thread+0x6c8/0xf70 kernel/workqueue.c:3393
kthread+0x2c1/0x3a0 kernel/kthread.c:389
ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>
Allocated by task 10976:
kasan_save_stack+0x33/0x60 mm/kasan/common.c:47
kasan_save_track+0x14/0x30 mm/kasan/common.c:68
poison_kmalloc_redzone mm/kasan/common.c:370 [inline]
__kasan_kmalloc+0xaa/0xb0 mm/kasan/common.c:387
kmalloc include/linux/slab.h:628 [inline]
kzalloc include/linux/slab.h:749 [inline]
p9_fid_create+0x45/0x470 net/9p/client.c:854
p9_client_walk+0xc6/0x550 net/9p/client.c:1158
clone_fid fs/9p/fid.h:23 [inline]
v9fs_fid_clone fs/9p/fid.h:33 [inline]
v9fs_file_open+0x5b5/0xae0 fs/9p/vfs_file.c:56
do_dentry_open+0x8da/0x18c0 fs/open.c:955
do_open fs/namei.c:3650 [inline]
path_openat+0x1dfb/0x2990 fs/namei.c:3807
do_filp_open+0x1dc/0x430 fs/namei.c:3834
do_sys_openat2+0x17a/0x1e0 fs/open.c:1406
do_sys_open fs/open.c:1421 [inline]
__do_sys_creat fs/open.c:1497 [inline]
__se_sys_creat fs/open.c:1491 [inline]
__x64_sys_creat+0xcd/0x120 fs/open.c:1491
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xcf/0x260 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Freed by task 5340:
kasan_save_stack+0x33/0x60 mm/kasan/common.c:47
kasan_save_track+0x14/0x30 mm/kasan/common.c:68
kasan_save_free_info+0x3b/0x60 mm/kasan/generic.c:579
poison_slab_object mm/kasan/common.c:240 [inline]
__kasan_slab_free+0x11d/0x1a0 mm/kasan/common.c:256
kasan_slab_free include/linux/kasan.h:184 [inline]
slab_free_hook mm/slub.c:2121 [inline]
slab_free mm/slub.c:4353 [inline]
kfree+0x129/0x3a0 mm/slub.c:4463
p9_client_destroy+0x160/0x4a0 net/9p/client.c:1073
v9fs_session_close+0x49/0x2d0 fs/9p/v9fs.c:506
v9fs_kill_super+0x4d/0xa0 fs/9p/vfs_super.c:196
deactivate_locked_super+0xbe/0x1a0 fs/super.c:472
deactivate_super+0xde/0x100 fs/super.c:505
cleanup_mnt+0x222/0x450 fs/namespace.c:1267
task_work_run+0x14e/0x250 kernel/task_work.c:180
resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
exit_to_user_mode_loop kernel/entry/common.c:114 [inline]
exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline]
__syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
syscall_exit_to_user_mode+0x278/0x2a0 kernel/entry/common.c:218
do_syscall_64+0xdc/0x260 arch/x86/entry/common.c:89
entry_SYSCALL_64_after_hwframe+0x77/0x7f
The buggy address belongs to the object at ffff88801d65ea80
which belongs to the cache kmalloc-96 of size 96
The buggy address is located 0 bytes inside of
freed 96-byte region [ffff88801d65ea80, ffff88801d65eae0)
The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0xffff88801d65ef00 pfn:0x1d65e
flags: 0xfff00000000a00(workingset|slab|node=0|zone=1|lastcpupid=0x7ff)
page_type: 0xffffffff()
raw: 00fff00000000a00 ffff888015442280 ffffea0000b5d350 ffffea0000890c50
raw: ffff88801d65ef00 000000000020001d 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 0, migratetype Unmovable, gfp_mask 0x152c40(GFP_NOFS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_HARDWALL), pid 5340, tgid 5340 (syz-executor.2), ts 92315884435, free_ts 92313381805
set_page_owner include/linux/page_owner.h:32 [inline]
post_alloc_hook+0x2d4/0x350 mm/page_alloc.c:1534
prep_new_page mm/page_alloc.c:1541 [inline]
get_page_from_freelist+0xa28/0x3780 mm/page_alloc.c:3317
__alloc_pages+0x22b/0x2460 mm/page_alloc.c:4575
__alloc_pages_node include/linux/gfp.h:238 [inline]
alloc_pages_node include/linux/gfp.h:261 [inline]
alloc_slab_page mm/slub.c:2190 [inline]
allocate_slab mm/slub.c:2353 [inline]
new_slab+0xcc/0x3a0 mm/slub.c:2406
___slab_alloc+0xd28/0x1810 mm/slub.c:3592
__slab_alloc.constprop.0+0x56/0xb0 mm/slub.c:3682
__slab_alloc_node mm/slub.c:3735 [inline]
slab_alloc_node mm/slub.c:3908 [inline]
__do_kmalloc_node mm/slub.c:4038 [inline]
__kmalloc+0x3bf/0x440 mm/slub.c:4052
kmalloc include/linux/slab.h:632 [inline]
kzalloc include/linux/slab.h:749 [inline]
tomoyo_encode2+0x100/0x3e0 security/tomoyo/realpath.c:45
tomoyo_encode+0x29/0x50 security/tomoyo/realpath.c:80
tomoyo_realpath_from_path+0x19d/0x720 security/tomoyo/realpath.c:283
tomoyo_get_realpath security/tomoyo/file.c:151 [inline]
tomoyo_path_perm+0x273/0x450 security/tomoyo/file.c:822
tomoyo_path_unlink+0x92/0xe0 security/tomoyo/tomoyo.c:162
security_path_unlink+0x100/0x170 security/security.c:1857
do_unlinkat+0x55b/0x750 fs/namei.c:4404
__do_sys_unlink fs/namei.c:4455 [inline]
__se_sys_unlink fs/namei.c:4453 [inline]
__x64_sys_unlink+0xc7/0x110 fs/namei.c:4453
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xcf/0x260 arch/x86/entry/common.c:83
page last free pid 5494 tgid 5493 stack trace:
reset_page_owner include/linux/page_owner.h:25 [inline]
free_pages_prepare mm/page_alloc.c:1141 [inline]
free_unref_page_prepare+0x527/0xb10 mm/page_alloc.c:2347
free_unref_page+0x33/0x3c0 mm/page_alloc.c:2487
tlb_batch_list_free mm/mmu_gather.c:159 [inline]
tlb_finish_mmu+0x237/0x7b0 mm/mmu_gather.c:468
exit_mmap+0x3da/0xb90 mm/mmap.c:3282
__mmput+0x12a/0x4d0 kernel/fork.c:1346
mmput+0x62/0x70 kernel/fork.c:1368
exit_mm kernel/exit.c:569 [inline]
do_exit+0x999/0x2c10 kernel/exit.c:865
do_group_exit+0xd3/0x2a0 kernel/exit.c:1027
get_signal+0x2616/0x2710 kernel/signal.c:2911
arch_do_signal_or_restart+0x90/0x7e0 arch/x86/kernel/signal.c:310
exit_to_user_mode_loop kernel/entry/common.c:111 [inline]
exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline]
__syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
syscall_exit_to_user_mode+0x14a/0x2a0 kernel/entry/common.c:218
do_syscall_64+0xdc/0x260 arch/x86/entry/common.c:89
entry_SYSCALL_64_after_hwframe+0x77/0x7f
Memory state around the buggy address:
 ffff88801d65e980: 00 00 00 00 00 00 00 00 00 fc fc fc fc fc fc fc
 ffff88801d65ea00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
>ffff88801d65ea80: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
                   ^
 ffff88801d65eb00: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
 ffff88801d65eb80: fa fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc
==================================================================
Tested on:
commit: ea5f6ad9 Merge tag 'platform-drivers-x86-v6.10-1' of g..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
console output: https://syzkaller.appspot.com/x/log.txt?x=17d86268980000
kernel config: https://syzkaller.appspot.com/x/.config?x=f1cd4092753f97c5
dashboard link: https://syzkaller.appspot.com/bug?extid=d7c7a495a5e466c031b6
compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=15216e04980000
On Fri, 17 May 2024 04:31:28 -0700
> syzbot has found a reproducer for the following issue on:
>
> HEAD commit: ea5f6ad9ad96 Merge tag 'platform-drivers-x86-v6.10-1' of g..
> git tree: upstream
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11df3084980000
#syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git ea5f6ad9ad96
--- x/include/net/9p/client.h
+++ y/include/net/9p/client.h
@@ -11,6 +11,7 @@
 #include <linux/utsname.h>
 #include <linux/idr.h>
+#include <linux/mutex.h>
 #include <linux/tracepoint-defs.h>
 /* Number of requests per row */
@@ -122,6 +123,7 @@ struct p9_client {
 	struct idr fids;
 	struct idr reqs;
+	struct mutex destroy_mutex;
 	char name[__NEW_UTS_LEN + 1];
 };
--- x/net/9p/client.c
+++ y/net/9p/client.c
@@ -1041,6 +1041,7 @@ struct p9_client *p9_client_create(const
 					   0, 0, P9_HDRSZ + 4,
 					   clnt->msize - (P9_HDRSZ + 4),
 					   NULL);
+	mutex_init(&clnt->destroy_mutex);
 	return clnt;
@@ -1058,6 +1059,7 @@ void p9_client_destroy(struct p9_client
 {
 	struct p9_fid *fid;
 	int id;
+	int clean;
 	p9_debug(P9_DEBUG_MUX, "clnt %p\n", clnt);
@@ -1066,9 +1068,18 @@ void p9_client_destroy(struct p9_client
 	v9fs_put_trans(clnt->trans_mod);
+again:
+	clean = 1;
+	mutex_lock(&clnt->destroy_mutex);
 	idr_for_each_entry(&clnt->fids, fid, id) {
 		pr_info("Found fid %d not clunked\n", fid->fid);
-		p9_fid_destroy(fid);
+		clean = 0;
+		break;
+	}
+	mutex_unlock(&clnt->destroy_mutex);
+	if (!clean) {
+		schedule_timeout_idle(2);
+		goto again;
 	}
 	p9_tag_cleanup(clnt);
@@ -1454,7 +1465,9 @@ error:
 		if (retries++ == 0)
 			goto again;
 	} else {
+		mutex_lock(&clnt->destroy_mutex);
 		p9_fid_destroy(fid);
+		mutex_unlock(&clnt->destroy_mutex);
 	}
 	return err;
 }
--
Hello,
syzbot has tested the proposed patch and the reproducer did not trigger any issue:
Reported-and-tested-by: syzbot+d7c7a495a5e466c031b6@syzkaller.appspotmail.com
Tested on:
commit: ea5f6ad9 Merge tag 'platform-drivers-x86-v6.10-1' of g..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
console output: https://syzkaller.appspot.com/x/log.txt?x=16900458980000
kernel config: https://syzkaller.appspot.com/x/.config?x=f1cd4092753f97c5
dashboard link: https://syzkaller.appspot.com/bug?extid=d7c7a495a5e466c031b6
compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=140d6b5c980000
Note: testing is done by a robot and is best-effort only.
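The patch tested above stops p9_client_destroy() from freeing leftover fids itself. Instead it polls the fid table under destroy_mutex and sleeps with schedule_timeout_idle(2) until whoever still holds a fid has released it. A rough single-threaded sketch of that "wait until drained" loop follows. Everything named drain_* is hypothetical, the worker is simulated by a plain function call, and the max_polls bound is an addition of this sketch: the patch as posted loops without a limit.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DRAIN_MAX_FIDS 4

/* Hypothetical model of the client's fid table. */
struct drain_client {
	bool live[DRAIN_MAX_FIDS];	/* true while a fid is registered */
};

static bool drain_table_empty(const struct drain_client *c)
{
	for (size_t i = 0; i < DRAIN_MAX_FIDS; i++)
		if (c->live[i])
			return false;
	return true;
}

/* Simulated worker: releases at most one fid each time it gets to run. */
static void drain_worker_step(struct drain_client *c)
{
	for (size_t i = 0; i < DRAIN_MAX_FIDS; i++) {
		if (c->live[i]) {
			c->live[i] = false;
			return;
		}
	}
}

/*
 * Models the retry loop: keep polling until the table is empty.
 * Returns the number of polls needed, or -1 if max_polls ran out.
 */
static int drain_destroy(struct drain_client *c, int max_polls)
{
	int polls = 0;

	assert(max_polls >= 0);
	while (!drain_table_empty(c)) {
		if (polls == max_polls)
			return -1;
		/* Stands in for sleeping so that the worker can run. */
		drain_worker_step(c);
		polls++;
	}
	return polls;
}

/* Registers nfids fids, then tears the client down. */
static int drain_demo(int nfids, int max_polls)
{
	struct drain_client c = {0};

	assert(nfids >= 0 && nfids <= DRAIN_MAX_FIDS);
	for (int i = 0; i < nfids; i++)
		c.live[i] = true;
	return drain_destroy(&c, max_polls);
}
```

In this model teardown never frees a fid on its own, so the only free comes from the holder of the last reference. The cost is that teardown blocks for as long as any fid stays registered, which is why the sketch adds a bound.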
On Fri, 17 May 2024 04:31:28 -0700
> syzbot has found a reproducer for the following issue on:
>
> HEAD commit: ea5f6ad9ad96 Merge tag 'platform-drivers-x86-v6.10-1' of g..
> git tree: upstream
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11df3084980000
#syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git ea5f6ad9ad96
--- x/net/9p/client.c
+++ y/net/9p/client.c
@@ -879,15 +879,19 @@ static void p9_fid_destroy(struct p9_fid
 {
 	struct p9_client *clnt;
 	unsigned long flags;
+	bool empty;
 	p9_debug(P9_DEBUG_FID, "fid %d\n", fid->fid);
 	trace_9p_fid_ref(fid, P9_FID_REF_DESTROY);
 	clnt = fid->clnt;
 	spin_lock_irqsave(&clnt->lock, flags);
 	idr_remove(&clnt->fids, fid->fid);
+	empty = idr_is_empty(&clnt->fids) && clnt->status == Hung + 1;
 	spin_unlock_irqrestore(&clnt->lock, flags);
 	kfree(fid->rdir);
 	kfree(fid);
+	if (empty)
+		kfree(clnt);
 }
 /* We also need to export tracepoint symbols for tracepoint_enabled() */
@@ -1057,6 +1061,8 @@ EXPORT_SYMBOL(p9_client_create);
 void p9_client_destroy(struct p9_client *clnt)
 {
 	struct p9_fid *fid;
+	unsigned long flags;
+	bool empty;
 	int id;
 	p9_debug(P9_DEBUG_MUX, "clnt %p\n", clnt);
@@ -1068,13 +1074,18 @@ void p9_client_destroy(struct p9_client
 	idr_for_each_entry(&clnt->fids, fid, id) {
 		pr_info("Found fid %d not clunked\n", fid->fid);
-		p9_fid_destroy(fid);
 	}
 	p9_tag_cleanup(clnt);
 	kmem_cache_destroy(clnt->fcall_cache);
-	kfree(clnt);
+	spin_lock_irqsave(&clnt->lock, flags);
+	clnt->status = Hung + 1;
+	empty = idr_is_empty(&clnt->fids);
+	spin_unlock_irqrestore(&clnt->lock, flags);
+
+	if (empty)
+		kfree(clnt);
 }
 EXPORT_SYMBOL(p9_client_destroy);
--
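The patch above takes a different route from the mutex variants: p9_client_destroy() no longer frees leftover fids, it marks the client as going away (clnt->status = Hung + 1), and whichever side finds the fid table empty last does the kfree(clnt). A small sketch of that "last one out frees" rule follows, with hypothetical defer_* names and a counter in place of the real kfree(). It models only the lifetime of the client structure. The other state released earlier in p9_client_destroy(), which per the diff context includes p9_tag_cleanup() and kmem_cache_destroy(), is not modelled.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the client; not the kernel's struct p9_client. */
struct defer_client {
	int nr_fids;	/* fids still registered with the client */
	bool dying;	/* models clnt->status == Hung + 1 */
	int frees;	/* how often the client itself was "freed" */
};

/* Models p9_fid_destroy(): the last fid of a dying client frees it. */
static void defer_fid_destroy(struct defer_client *c)
{
	assert(c->nr_fids > 0);
	c->nr_fids--;
	if (c->nr_fids == 0 && c->dying)
		c->frees++;
}

/* Models p9_client_destroy(): free now only if no fid remains. */
static void defer_client_destroy(struct defer_client *c)
{
	c->dying = true;
	if (c->nr_fids == 0)
		c->frees++;
}

/* Returns how many times the client was freed for a given ordering. */
static int defer_run(int nr_fids, bool teardown_first)
{
	struct defer_client c = { .nr_fids = nr_fids };

	if (teardown_first)
		defer_client_destroy(&c);
	for (int i = 0; i < nr_fids; i++)
		defer_fid_destroy(&c);
	if (!teardown_first)
		defer_client_destroy(&c);
	return c.frees;
}
```

In both orderings the model frees the client exactly once. It says nothing about requests that are still in flight on a client whose tags and fcall cache have already been released.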
Hello,
syzbot has tested the proposed patch but the reproducer is still triggering an issue:
WARNING: refcount bug in p9_req_put
------------[ cut here ]------------
refcount_t: underflow; use-after-free.
WARNING: CPU: 1 PID: 90 at lib/refcount.c:28 refcount_warn_saturate+0x14a/0x210 lib/refcount.c:28
Modules linked in:
CPU: 1 PID: 90 Comm: kworker/u32:4 Not tainted 6.9.0-syzkaller-08284-gea5f6ad9ad96-dirty #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Workqueue: events_unbound netfs_write_collection_worker
RIP: 0010:refcount_warn_saturate+0x14a/0x210 lib/refcount.c:28
Code: ff 89 de e8 98 2d 0d fd 84 db 0f 85 66 ff ff ff e8 0b 33 0d fd c6 05 97 cc 4c 0b 01 90 48 c7 c7 00 24 8f 8b e8 f7 47 cf fc 90 <0f> 0b 90 90 e9 43 ff ff ff e8 e8 32 0d fd 0f b6 1d 72 cc 4c 0b 31
RSP: 0018:ffffc9000163f830 EFLAGS: 00010282
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff814ff319
RDX: ffff88801c180000 RSI: ffffffff814ff326 RDI: 0000000000000001
RBP: ffff8880127f4bb8 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000002 R12: ffff8880127f4bb0
R13: ffff8880127f4bb8 R14: ffff88802ad24400 R15: 00000000ffffffea
FS: 0000000000000000(0000) GS:ffff88806b100000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000000020001000 CR3: 000000002c226000 CR4: 0000000000350ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
__refcount_sub_and_test include/linux/refcount.h:275 [inline]
__refcount_dec_and_test include/linux/refcount.h:307 [inline]
refcount_dec_and_test include/linux/refcount.h:325 [inline]
p9_req_put+0x1f4/0x250 net/9p/client.c:402
p9_client_rpc+0x591/0xc10 net/9p/client.c:759
p9_client_clunk+0x93/0x170 net/9p/client.c:1450
p9_fid_put include/net/9p/client.h:280 [inline]
v9fs_free_request+0xdc/0x110 fs/9p/vfs_addr.c:138
netfs_free_request+0x22c/0x690 fs/netfs/objects.c:133
netfs_put_request+0x19b/0x1f0 fs/netfs/objects.c:165
netfs_write_collection_worker+0x19d0/0x59e0 fs/netfs/write_collect.c:701
process_one_work+0x9fb/0x1b60 kernel/workqueue.c:3231
process_scheduled_works kernel/workqueue.c:3312 [inline]
worker_thread+0x6c8/0xf70 kernel/workqueue.c:3393
kthread+0x2c1/0x3a0 kernel/kthread.c:389
ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>
Tested on:
commit: ea5f6ad9 Merge tag 'platform-drivers-x86-v6.10-1' of g..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
console output: https://syzkaller.appspot.com/x/log.txt?x=16c7da34980000
kernel config: https://syzkaller.appspot.com/x/.config?x=f1cd4092753f97c5
dashboard link: https://syzkaller.appspot.com/bug?extid=d7c7a495a5e466c031b6
compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=1097adf0980000
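For readers following along: the "refcount_t: underflow; use-after-free" warning in the report above suggests that p9_req_put() ran on a request that had no references left, i.e. the puts and gets no longer balance with the test patch applied. A minimal userspace sketch of how a counter can flag such an extra put follows. The toy_ref type and its helpers are hypothetical names made up for illustration; this is not the kernel's refcount_t implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical toy counter, for illustration only. */
struct toy_ref {
    int refs;      /* references currently held */
    bool warned;   /* set when a put arrives with no reference left */
};

static void toy_ref_init(struct toy_ref *r, int n)
{
    r->refs = n;
    r->warned = false;
}

/*
 * Drop one reference. Returns true only for the put that takes the
 * count from 1 to 0, i.e. for the caller that must free the object.
 */
static bool toy_ref_put(struct toy_ref *r)
{
    if (r->refs <= 0) {
        /* Unbalanced put: the object may already have been freed. */
        r->warned = true;
        fprintf(stderr, "toy_ref: underflow; use-after-free.\n");
        return false;
    }
    return --r->refs == 0;
}

/* Two references are held, but three puts are issued. */
bool toy_double_put_detected(void)
{
    struct toy_ref r;
    bool first, second, third;

    toy_ref_init(&r, 2);
    first = toy_ref_put(&r);    /* 2 -> 1 */
    second = toy_ref_put(&r);   /* 1 -> 0, this caller frees */
    third = toy_ref_put(&r);    /* extra put, flagged */

    assert(!first && second && !third);
    return r.warned;
}
```

Calling toy_double_put_detected() returns true because the third put finds the count already at zero. Removing one put in one place (as the test patches do) only helps if every path still drops exactly the references it took.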
On Fri, 17 May 2024 04:31:28 -0700
> syzbot has found a reproducer for the following issue on:
>
> HEAD commit: ea5f6ad9ad96 Merge tag 'platform-drivers-x86-v6.10-1' of g..
> git tree: upstream
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11df3084980000
#syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git ea5f6ad9ad96
--- x/net/9p/client.c
+++ y/net/9p/client.c
@@ -703,8 +703,6 @@ p9_client_rpc(struct p9_client *c, int8_
err = c->trans_mod->request(c, req);
if (err < 0) {
- /* write won't happen */
- p9_req_put(c, req);
if (err != -ERESTARTSYS && err != -EFAULT)
c->status = Disconnected;
goto recalc_sigpending;
@@ -879,15 +877,19 @@ static void p9_fid_destroy(struct p9_fid
{
struct p9_client *clnt;
unsigned long flags;
+ bool empty;
p9_debug(P9_DEBUG_FID, "fid %d\n", fid->fid);
trace_9p_fid_ref(fid, P9_FID_REF_DESTROY);
clnt = fid->clnt;
spin_lock_irqsave(&clnt->lock, flags);
idr_remove(&clnt->fids, fid->fid);
+ empty = idr_is_empty(&clnt->fids) && clnt->status == Hung + 1;
spin_unlock_irqrestore(&clnt->lock, flags);
kfree(fid->rdir);
kfree(fid);
+ if (empty)
+ kfree(clnt);
}
/* We also need to export tracepoint symbols for tracepoint_enabled() */
@@ -1057,6 +1059,8 @@ EXPORT_SYMBOL(p9_client_create);
void p9_client_destroy(struct p9_client *clnt)
{
struct p9_fid *fid;
+ unsigned long flags;
+ bool empty;
int id;
p9_debug(P9_DEBUG_MUX, "clnt %p\n", clnt);
@@ -1068,13 +1072,18 @@ void p9_client_destroy(struct p9_client
idr_for_each_entry(&clnt->fids, fid, id) {
pr_info("Found fid %d not clunked\n", fid->fid);
- p9_fid_destroy(fid);
}
p9_tag_cleanup(clnt);
kmem_cache_destroy(clnt->fcall_cache);
- kfree(clnt);
+ spin_lock_irqsave(&clnt->lock, flags);
+ clnt->status = Hung + 1;
+ empty = idr_is_empty(&clnt->fids);
+ spin_unlock_irqrestore(&clnt->lock, flags);
+
+ if (empty)
+ kfree(clnt);
}
EXPORT_SYMBOL(p9_client_destroy);
--
Hello,
syzbot has tested the proposed patch but the reproducer is still triggering an issue:
WARNING: refcount bug in p9_req_put
------------[ cut here ]------------
refcount_t: underflow; use-after-free.
WARNING: CPU: 3 PID: 1092 at lib/refcount.c:28 refcount_warn_saturate+0x14a/0x210 lib/refcount.c:28
Modules linked in:
CPU: 3 PID: 1092 Comm: kworker/u32:8 Not tainted 6.9.0-syzkaller-08284-gea5f6ad9ad96-dirty #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
Workqueue: events_unbound netfs_write_collection_worker
RIP: 0010:refcount_warn_saturate+0x14a/0x210 lib/refcount.c:28
Code: ff 89 de e8 98 2d 0d fd 84 db 0f 85 66 ff ff ff e8 0b 33 0d fd c6 05 97 cc 4c 0b 01 90 48 c7 c7 00 24 8f 8b e8 f7 47 cf fc 90 <0f> 0b 90 90 e9 43 ff ff ff e8 e8 32 0d fd 0f b6 1d 72 cc 4c 0b 31
RSP: 0018:ffffc90003a5f830 EFLAGS: 00010282
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff814ff319
RDX: ffff888023018000 RSI: ffffffff814ff326 RDI: 0000000000000001
RBP: ffff88804070b768 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000002 R12: ffff88804070b760
R13: ffff88804070b768 R14: ffff88802d844c00 R15: 00000000ffffffea
FS: 0000000000000000(0000) GS:ffff88806b300000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffc1bfd2c38 CR3: 000000000d97a000 CR4: 0000000000350ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
__refcount_sub_and_test include/linux/refcount.h:275 [inline]
__refcount_dec_and_test include/linux/refcount.h:307 [inline]
refcount_dec_and_test include/linux/refcount.h:325 [inline]
p9_req_put+0x1f4/0x250 net/9p/client.c:402
p9_client_rpc+0x5fd/0xc00 net/9p/client.c:757
p9_client_clunk+0x93/0x170 net/9p/client.c:1448
p9_fid_put include/net/9p/client.h:280 [inline]
v9fs_free_request+0xdc/0x110 fs/9p/vfs_addr.c:138
netfs_free_request+0x22c/0x690 fs/netfs/objects.c:133
netfs_put_request+0x19b/0x1f0 fs/netfs/objects.c:165
netfs_write_collection_worker+0x19d0/0x59e0 fs/netfs/write_collect.c:701
process_one_work+0x9fb/0x1b60 kernel/workqueue.c:3231
process_scheduled_works kernel/workqueue.c:3312 [inline]
worker_thread+0x6c8/0xf70 kernel/workqueue.c:3393
kthread+0x2c1/0x3a0 kernel/kthread.c:389
ret_from_fork+0x45/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>
Tested on:
commit: ea5f6ad9 Merge tag 'platform-drivers-x86-v6.10-1' of g..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
console output: https://syzkaller.appspot.com/x/log.txt?x=140cc3d0980000
kernel config: https://syzkaller.appspot.com/x/.config?x=f1cd4092753f97c5
dashboard link: https://syzkaller.appspot.com/bug?extid=d7c7a495a5e466c031b6
compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=1755df84980000
On Fri, 17 May 2024 04:31:28 -0700
> syzbot has found a reproducer for the following issue on:
>
> HEAD commit: ea5f6ad9ad96 Merge tag 'platform-drivers-x86-v6.10-1' of g..
> git tree: upstream
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11df3084980000
#syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git ea5f6ad9ad96
--- x/net/9p/client.c
+++ y/net/9p/client.c
@@ -703,8 +703,6 @@ p9_client_rpc(struct p9_client *c, int8_
err = c->trans_mod->request(c, req);
if (err < 0) {
- /* write won't happen */
- p9_req_put(c, req);
if (err != -ERESTARTSYS && err != -EFAULT)
c->status = Disconnected;
goto recalc_sigpending;
@@ -755,8 +753,8 @@ recalc_sigpending:
trace_9p_client_res(c, type, req->rc.tag, err);
if (!err)
return req;
-reterr:
p9_req_put(c, req);
+reterr:
return ERR_PTR(safe_errno(err));
}
@@ -879,15 +877,19 @@ static void p9_fid_destroy(struct p9_fid
{
struct p9_client *clnt;
unsigned long flags;
+ bool empty;
p9_debug(P9_DEBUG_FID, "fid %d\n", fid->fid);
trace_9p_fid_ref(fid, P9_FID_REF_DESTROY);
clnt = fid->clnt;
spin_lock_irqsave(&clnt->lock, flags);
idr_remove(&clnt->fids, fid->fid);
+ empty = idr_is_empty(&clnt->fids) && clnt->status == Hung + 1;
spin_unlock_irqrestore(&clnt->lock, flags);
kfree(fid->rdir);
kfree(fid);
+ if (empty)
+ kfree(clnt);
}
/* We also need to export tracepoint symbols for tracepoint_enabled() */
@@ -1057,6 +1059,8 @@ EXPORT_SYMBOL(p9_client_create);
void p9_client_destroy(struct p9_client *clnt)
{
struct p9_fid *fid;
+ unsigned long flags;
+ bool empty;
int id;
p9_debug(P9_DEBUG_MUX, "clnt %p\n", clnt);
@@ -1068,13 +1072,18 @@ void p9_client_destroy(struct p9_client
idr_for_each_entry(&clnt->fids, fid, id) {
pr_info("Found fid %d not clunked\n", fid->fid);
- p9_fid_destroy(fid);
}
p9_tag_cleanup(clnt);
kmem_cache_destroy(clnt->fcall_cache);
- kfree(clnt);
+ spin_lock_irqsave(&clnt->lock, flags);
+ clnt->status = Hung + 1;
+ empty = idr_is_empty(&clnt->fids);
+ spin_unlock_irqrestore(&clnt->lock, flags);
+
+ if (empty)
+ kfree(clnt);
}
EXPORT_SYMBOL(p9_client_destroy);
--
Hello,
syzbot has tested the proposed patch but the reproducer is still triggering an issue:
WARNING: refcount bug in p9_req_put
9pnet: Found fid 3 not clunked
9pnet: Tag 0 still in use
------------[ cut here ]------------
refcount_t: underflow; use-after-free.
WARNING: CPU: 3 PID: 5345 at lib/refcount.c:28 refcount_warn_saturate+0x14a/0x210 lib/refcount.c:28
Modules linked in:
CPU: 3 PID: 5345 Comm: syz-executor.3 Not tainted 6.9.0-syzkaller-08284-gea5f6ad9ad96-dirty #0
Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
RIP: 0010:refcount_warn_saturate+0x14a/0x210 lib/refcount.c:28
Code: ff 89 de e8 98 2d 0d fd 84 db 0f 85 66 ff ff ff e8 0b 33 0d fd c6 05 97 cc 4c 0b 01 90 48 c7 c7 00 24 8f 8b e8 f7 47 cf fc 90 <0f> 0b 90 90 e9 43 ff ff ff e8 e8 32 0d fd 0f b6 1d 72 cc 4c 0b 31
RSP: 0018:ffffc900037bfc78 EFLAGS: 00010282
RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffffff814ff319
RDX: ffff88801f798000 RSI: ffffffff814ff326 RDI: 0000000000000001
RBP: ffff88802c045108 R08: 0000000000000001 R09: 0000000000000000
R10: 0000000000000001 R11: 0000000000000001 R12: ffff88802b369870
R13: ffff88802c045108 R14: ffff88802b369800 R15: ffff888025704c80
FS: 000055556f75f480(0000) GS:ffff88806b300000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 0000555569357788 CR3: 00000000278a4000 CR4: 0000000000350ef0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
__refcount_sub_and_test include/linux/refcount.h:275 [inline]
__refcount_dec_and_test include/linux/refcount.h:307 [inline]
refcount_dec_and_test include/linux/refcount.h:325 [inline]
p9_req_put+0x1f4/0x250 net/9p/client.c:402
p9_tag_cleanup net/9p/client.c:429 [inline]
p9_client_destroy+0x219/0x540 net/9p/client.c:1077
v9fs_session_close+0x49/0x2d0 fs/9p/v9fs.c:506
v9fs_kill_super+0x4d/0xa0 fs/9p/vfs_super.c:196
deactivate_locked_super+0xbe/0x1a0 fs/super.c:472
deactivate_super+0xde/0x100 fs/super.c:505
cleanup_mnt+0x222/0x450 fs/namespace.c:1267
task_work_run+0x14e/0x250 kernel/task_work.c:180
resume_user_mode_work include/linux/resume_user_mode.h:50 [inline]
exit_to_user_mode_loop kernel/entry/common.c:114 [inline]
exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline]
__syscall_exit_to_user_mode_work kernel/entry/common.c:207 [inline]
syscall_exit_to_user_mode+0x278/0x2a0 kernel/entry/common.c:218
do_syscall_64+0xdc/0x260 arch/x86/entry/common.c:89
entry_SYSCALL_64_after_hwframe+0x77/0x7f
RIP: 0033:0x7f936f47e217
Code: b0 ff ff ff f7 d8 64 89 01 48 83 c8 ff c3 0f 1f 44 00 00 31 f6 e9 09 00 00 00 66 0f 1f 84 00 00 00 00 00 b8 a6 00 00 00 0f 05 <48> 3d 00 f0 ff ff 77 01 c3 48 c7 c2 b0 ff ff ff f7 d8 64 89 02 b8
RSP: 002b:00007ffeefbb92f8 EFLAGS: 00000246 ORIG_RAX: 00000000000000a6
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 00007f936f47e217
RDX: 0000000000000000 RSI: 0000000000000009 RDI: 00007ffeefbb93b0
RBP: 00007ffeefbb93b0 R08: 0000000000000000 R09: 0000000000000000
R10: 00000000ffffffff R11: 0000000000000246 R12: 00007ffeefbba470
R13: 00007f936f4c8336 R14: 000000000001951c R15: 0000000000000005
</TASK>
Tested on:
commit: ea5f6ad9 Merge tag 'platform-drivers-x86-v6.10-1' of g..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
console output: https://syzkaller.appspot.com/x/log.txt?x=14b71442980000
kernel config: https://syzkaller.appspot.com/x/.config?x=f1cd4092753f97c5
dashboard link: https://syzkaller.appspot.com/bug?extid=d7c7a495a5e466c031b6
compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=160f3b68980000
On Fri, 17 May 2024 04:31:28 -0700
> syzbot has found a reproducer for the following issue on:
>
> HEAD commit: ea5f6ad9ad96 Merge tag 'platform-drivers-x86-v6.10-1' of g..
> git tree: upstream
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11df3084980000
#syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git ea5f6ad9ad96
--- x/net/9p/client.c
+++ y/net/9p/client.c
@@ -426,9 +426,6 @@ static void p9_tag_cleanup(struct p9_cli
rcu_read_lock();
idr_for_each_entry(&c->reqs, req, id) {
pr_info("Tag %d still in use\n", id);
- if (p9_req_put(c, req) == 0)
- pr_warn("Packet with tag %d has still references",
- req->tc.tag);
}
rcu_read_unlock();
}
@@ -879,15 +876,19 @@ static void p9_fid_destroy(struct p9_fid
{
struct p9_client *clnt;
unsigned long flags;
+ bool empty;
p9_debug(P9_DEBUG_FID, "fid %d\n", fid->fid);
trace_9p_fid_ref(fid, P9_FID_REF_DESTROY);
clnt = fid->clnt;
spin_lock_irqsave(&clnt->lock, flags);
idr_remove(&clnt->fids, fid->fid);
+ empty = idr_is_empty(&clnt->fids) && clnt->status == Hung + 1;
spin_unlock_irqrestore(&clnt->lock, flags);
kfree(fid->rdir);
kfree(fid);
+ if (empty)
+ kfree(clnt);
}
/* We also need to export tracepoint symbols for tracepoint_enabled() */
@@ -1057,6 +1058,8 @@ EXPORT_SYMBOL(p9_client_create);
void p9_client_destroy(struct p9_client *clnt)
{
struct p9_fid *fid;
+ unsigned long flags;
+ bool empty;
int id;
p9_debug(P9_DEBUG_MUX, "clnt %p\n", clnt);
@@ -1068,13 +1071,17 @@ void p9_client_destroy(struct p9_client
idr_for_each_entry(&clnt->fids, fid, id) {
pr_info("Found fid %d not clunked\n", fid->fid);
- p9_fid_destroy(fid);
}
p9_tag_cleanup(clnt);
kmem_cache_destroy(clnt->fcall_cache);
- kfree(clnt);
+ spin_lock_irqsave(&clnt->lock, flags);
+ clnt->status = Hung + 1;
+ empty = idr_is_empty(&clnt->fids);
+ spin_unlock_irqrestore(&clnt->lock, flags);
+ if (empty)
+ kfree(clnt);
}
EXPORT_SYMBOL(p9_client_destroy);
--
Hello,
syzbot has tested the proposed patch and the reproducer did not trigger any issue:
Reported-and-tested-by: [email protected]
Tested on:
commit: ea5f6ad9 Merge tag 'platform-drivers-x86-v6.10-1' of g..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
console output: https://syzkaller.appspot.com/x/log.txt?x=12d37adc980000
kernel config: https://syzkaller.appspot.com/x/.config?x=f1cd4092753f97c5
dashboard link: https://syzkaller.appspot.com/bug?extid=d7c7a495a5e466c031b6
compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=149fbcb2980000
Note: testing is done by a robot and is best-effort only.
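As read from the diff, the test patch above stops p9_client_destroy() from forcibly destroying leftover fids and instead lets whichever side finishes last free the client: p9_client_destroy() marks the client (status = Hung + 1) and frees it only if no fid remains, while p9_fid_destroy() frees it when it removes the last fid of a marked client. A minimal userspace sketch of that "last one out frees" scheme, under stated assumptions: the toy_client type, its helpers and the freed test hook are all hypothetical, and a pthread mutex stands in for clnt->lock.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct p9_client. */
struct toy_client {
    pthread_mutex_t lock;
    int nr_fids;    /* stands in for the fids IDR */
    bool dying;     /* stands in for status == Hung + 1 */
    bool *freed;    /* test hook: set just before the client is freed */
};

static struct toy_client *toy_client_create(bool *freed)
{
    struct toy_client *c = calloc(1, sizeof(*c));

    assert(c != NULL);
    pthread_mutex_init(&c->lock, NULL);
    c->freed = freed;
    *freed = false;
    return c;
}

static void toy_client_free(struct toy_client *c)
{
    *c->freed = true;
    pthread_mutex_destroy(&c->lock);
    free(c);
}

static void toy_fid_add(struct toy_client *c)
{
    pthread_mutex_lock(&c->lock);
    c->nr_fids++;
    pthread_mutex_unlock(&c->lock);
}

/* Last fid of a dying client frees the client. */
static void toy_fid_destroy(struct toy_client *c)
{
    bool last;

    pthread_mutex_lock(&c->lock);
    c->nr_fids--;
    last = c->nr_fids == 0 && c->dying;
    pthread_mutex_unlock(&c->lock);
    if (last)
        toy_client_free(c);
}

/* Mark the client dying; free it now only if no fid is left. */
static void toy_client_destroy(struct toy_client *c)
{
    bool empty;

    pthread_mutex_lock(&c->lock);
    c->dying = true;
    empty = c->nr_fids == 0;
    pthread_mutex_unlock(&c->lock);
    if (empty)
        toy_client_free(c);
}

/* Returns true if the free was deferred until the last fid went away. */
bool toy_deferred_free_demo(void)
{
    bool freed;
    struct toy_client *c = toy_client_create(&freed);

    toy_fid_add(c);
    toy_client_destroy(c);
    if (freed)
        return false;   /* a fid was still live, so this would be a bug */
    toy_fid_destroy(c);
    return freed;
}
```

The decision is taken under the lock and the free happens after unlocking, so exactly one of the two paths sees "dying and empty" and frees the object.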
On Fri, 17 May 2024 04:31:28 -0700
> syzbot has found a reproducer for the following issue on:
>
> HEAD commit: ea5f6ad9ad96 Merge tag 'platform-drivers-x86-v6.10-1' of g..
> git tree: upstream
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=11df3084980000
Test David's patch [1]
[1] https://lore.kernel.org/lkml/[email protected]/
#syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git ea5f6ad9ad96
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -353,6 +353,7 @@ void v9fs_evict_inode(struct inode *inod
version = cpu_to_le32(v9inode->qid.version);
netfs_clear_inode_writeback(inode, &version);
+ netfs_wait_for_outstanding_io(inode);
clear_inode(inode);
filemap_fdatawrite(&inode->i_data);
@@ -360,8 +361,10 @@ void v9fs_evict_inode(struct inode *inod
if (v9fs_inode_cookie(v9inode))
fscache_relinquish_cookie(v9fs_inode_cookie(v9inode), false);
#endif
- } else
+ } else {
+ netfs_wait_for_outstanding_io(inode);
clear_inode(inode);
+ }
}
struct inode *
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -652,6 +652,7 @@ void afs_evict_inode(struct inode *inode
afs_set_cache_aux(vnode, &aux);
netfs_clear_inode_writeback(inode, &aux);
+ netfs_wait_for_outstanding_io(inode);
clear_inode(inode);
while (!list_empty(&vnode->wb_keys)) {
--- a/fs/netfs/objects.c
+++ b/fs/netfs/objects.c
@@ -72,6 +72,7 @@ struct netfs_io_request *netfs_alloc_req
}
}
+ atomic_inc(&ctx->io_count);
trace_netfs_rreq_ref(rreq->debug_id, 1, netfs_rreq_trace_new);
netfs_proc_add_rreq(rreq);
netfs_stat(&netfs_n_rh_rreq);
@@ -124,6 +125,7 @@ static void netfs_free_request(struct wo
{
struct netfs_io_request *rreq =
container_of(work, struct netfs_io_request, work);
+ struct netfs_inode *ictx = netfs_inode(rreq->inode);
unsigned int i;
trace_netfs_rreq(rreq, netfs_rreq_trace_free);
@@ -142,6 +144,9 @@ static void netfs_free_request(struct wo
}
kvfree(rreq->direct_bv);
}
+
+ if (atomic_dec_and_test(&ictx->io_count))
+ wake_up_var(&ictx->io_count);
call_rcu(&rreq->rcu, netfs_free_request_rcu);
}
--- a/fs/smb/client/cifsfs.c
+++ b/fs/smb/client/cifsfs.c
@@ -435,6 +435,7 @@ cifs_evict_inode(struct inode *inode)
if (inode->i_state & I_PINNING_NETFS_WB)
cifs_fscache_unuse_inode_cookie(inode, true);
cifs_fscache_release_inode_cookie(inode);
+ netfs_wait_for_outstanding_io(inode);
clear_inode(inode);
}
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -68,6 +68,7 @@ struct netfs_inode {
loff_t remote_i_size; /* Size of the remote file */
loff_t zero_point; /* Size after which we assume there's no data
* on the server */
+ atomic_t io_count; /* Number of outstanding reqs */
unsigned long flags;
#define NETFS_ICTX_ODIRECT 0 /* The file has DIO in progress */
#define NETFS_ICTX_UNBUFFERED 1 /* I/O should not use the pagecache */
@@ -472,6 +473,7 @@ static inline void netfs_inode_init(stru
ctx->remote_i_size = i_size_read(&ctx->inode);
ctx->zero_point = LLONG_MAX;
ctx->flags = 0;
+ atomic_set(&ctx->io_count, 0);
#if IS_ENABLED(CONFIG_FSCACHE)
ctx->cache = NULL;
#endif
@@ -515,4 +517,20 @@ static inline struct fscache_cookie *net
#endif
}
+/**
+ * netfs_wait_for_outstanding_io - Wait for outstanding I/O to complete
+ * @inode: The netfs inode to wait on
+ *
+ * Wait for outstanding I/O requests of any type to complete. This is intended
+ * to be called from inode eviction routines. This makes sure that any
+ * resources held by those requests are cleaned up before we let the inode get
+ * cleaned up.
+ */
+static inline void netfs_wait_for_outstanding_io(struct inode *inode)
+{
+ struct netfs_inode *ictx = netfs_inode(inode);
+
+ wait_var_event(&ictx->io_count, atomic_read(&ictx->io_count) == 0);
+}
+
#endif /* _LINUX_NETFS_H */
--
Hello,
syzbot has tested the proposed patch and the reproducer did not trigger any issue:
Reported-and-tested-by: [email protected]
Tested on:
commit: ea5f6ad9 Merge tag 'platform-drivers-x86-v6.10-1' of g..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git
console output: https://syzkaller.appspot.com/x/log.txt?x=107aef84980000
kernel config: https://syzkaller.appspot.com/x/.config?x=f1cd4092753f97c5
dashboard link: https://syzkaller.appspot.com/bug?extid=d7c7a495a5e466c031b6
compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=14b12b3f180000
Note: testing is done by a robot and is best-effort only.
#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
netfs, 9p: Fix race between umount and async request completion [v2]
There's a problem in 9p's interaction with netfslib whereby a crash occurs
because the 9p_fid structs get forcibly destroyed during client teardown
(without paying attention to their refcounts) before netfslib has finished
with them. However, it's not a simple case of deferring the clunking that
p9_fid_put() does as that requires the client.
The problem is that netfslib has to unlock pages and clear the IN_PROGRESS
flag before destroying the objects involved - including the pid - and, in
any case, nothing checks to see if writeback completed barring looking at
the page flags.
Fix this by keeping a count of outstanding I/O requests (of any type) and
waiting for it to quiesce during inode eviction.
Reported-by: [email protected]
Link: https://lore.kernel.org/all/[email protected]/
Reported-by: [email protected]
Link: https://lore.kernel.org/all/[email protected]/
Reported-by: [email protected]
Link: https://lore.kernel.org/all/[email protected]/
Signed-off-by: David Howells <[email protected]>
cc: Eric Van Hensbergen <[email protected]>
cc: Latchesar Ionkov <[email protected]>
cc: Dominique Martinet <[email protected]>
cc: Christian Schoenebeck <[email protected]>
cc: Jeff Layton <[email protected]>
cc: Steve French <[email protected]>
cc: Hillf Danton <[email protected]>
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
cc: [email protected]
Notes:
Changes
=======
ver #2)
- Wait for outstanding I/O before clobbering the pagecache.
---
fs/9p/vfs_inode.c | 1 +
fs/afs/inode.c | 1 +
fs/netfs/objects.c | 5 +++++
fs/smb/client/cifsfs.c | 1 +
include/linux/netfs.h | 18 ++++++++++++++++++
5 files changed, 26 insertions(+)
diff --git a/fs/9p/vfs_inode.c b/fs/9p/vfs_inode.c
index 8c9a896d691e..effb3aa1f3ed 100644
--- a/fs/9p/vfs_inode.c
+++ b/fs/9p/vfs_inode.c
@@ -349,6 +349,7 @@ void v9fs_evict_inode(struct inode *inode)
__le32 __maybe_unused version;
if (!is_bad_inode(inode)) {
+ netfs_wait_for_outstanding_io(inode);
truncate_inode_pages_final(&inode->i_data);
version = cpu_to_le32(v9inode->qid.version);
diff --git a/fs/afs/inode.c b/fs/afs/inode.c
index 94fc049aff58..15bb7989c387 100644
--- a/fs/afs/inode.c
+++ b/fs/afs/inode.c
@@ -648,6 +648,7 @@ void afs_evict_inode(struct inode *inode)
ASSERTCMP(inode->i_ino, ==, vnode->fid.vnode);
+ netfs_wait_for_outstanding_io(inode);
truncate_inode_pages_final(&inode->i_data);
afs_set_cache_aux(vnode, &aux);
diff --git a/fs/netfs/objects.c b/fs/netfs/objects.c
index c90d482b1650..f4a642727479 100644
--- a/fs/netfs/objects.c
+++ b/fs/netfs/objects.c
@@ -72,6 +72,7 @@ struct netfs_io_request *netfs_alloc_request(struct address_space *mapping,
}
}
+ atomic_inc(&ctx->io_count);
trace_netfs_rreq_ref(rreq->debug_id, 1, netfs_rreq_trace_new);
netfs_proc_add_rreq(rreq);
netfs_stat(&netfs_n_rh_rreq);
@@ -124,6 +125,7 @@ static void netfs_free_request(struct work_struct *work)
{
struct netfs_io_request *rreq =
container_of(work, struct netfs_io_request, work);
+ struct netfs_inode *ictx = netfs_inode(rreq->inode);
unsigned int i;
trace_netfs_rreq(rreq, netfs_rreq_trace_free);
@@ -142,6 +144,9 @@ static void netfs_free_request(struct work_struct *work)
}
kvfree(rreq->direct_bv);
}
+
+ if (atomic_dec_and_test(&ictx->io_count))
+ wake_up_var(&ictx->io_count);
call_rcu(&rreq->rcu, netfs_free_request_rcu);
}
diff --git a/fs/smb/client/cifsfs.c b/fs/smb/client/cifsfs.c
index ec5b639f421a..14810ffd15c8 100644
--- a/fs/smb/client/cifsfs.c
+++ b/fs/smb/client/cifsfs.c
@@ -431,6 +431,7 @@ cifs_free_inode(struct inode *inode)
static void
cifs_evict_inode(struct inode *inode)
{
+ netfs_wait_for_outstanding_io(inode);
truncate_inode_pages_final(&inode->i_data);
if (inode->i_state & I_PINNING_NETFS_WB)
cifs_fscache_unuse_inode_cookie(inode, true);
diff --git a/include/linux/netfs.h b/include/linux/netfs.h
index d2d291a9cdad..3ca3906bb8da 100644
--- a/include/linux/netfs.h
+++ b/include/linux/netfs.h
@@ -68,6 +68,7 @@ struct netfs_inode {
loff_t remote_i_size; /* Size of the remote file */
loff_t zero_point; /* Size after which we assume there's no data
* on the server */
+ atomic_t io_count; /* Number of outstanding reqs */
unsigned long flags;
#define NETFS_ICTX_ODIRECT 0 /* The file has DIO in progress */
#define NETFS_ICTX_UNBUFFERED 1 /* I/O should not use the pagecache */
@@ -474,6 +475,7 @@ static inline void netfs_inode_init(struct netfs_inode *ctx,
ctx->remote_i_size = i_size_read(&ctx->inode);
ctx->zero_point = LLONG_MAX;
ctx->flags = 0;
+ atomic_set(&ctx->io_count, 0);
#if IS_ENABLED(CONFIG_FSCACHE)
ctx->cache = NULL;
#endif
@@ -517,4 +519,20 @@ static inline struct fscache_cookie *netfs_i_cookie(struct netfs_inode *ctx)
#endif
}
+/**
+ * netfs_wait_for_outstanding_io - Wait for outstanding I/O to complete
+ * @inode: The netfs inode to wait on
+ *
+ * Wait for outstanding I/O requests of any type to complete. This is intended
+ * to be called from inode eviction routines. This makes sure that any
+ * resources held by those requests are cleaned up before we let the inode get
+ * cleaned up.
+ */
+static inline void netfs_wait_for_outstanding_io(struct inode *inode)
+{
+ struct netfs_inode *ictx = netfs_inode(inode);
+
+ wait_var_event(&ictx->io_count, atomic_read(&ictx->io_count) == 0);
+}
+
#endif /* _LINUX_NETFS_H */
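The counting scheme in the patch above (increment when a request is allocated, decrement and wake when it is freed, wait for zero during eviction) can be sketched in userspace roughly as follows. This is an illustration under stated assumptions, not the kernel code: the toy_* names are hypothetical, and a pthread mutex and condition variable stand in for atomic_t, wake_up_var() and wait_var_event().

```c
#include <assert.h>
#include <pthread.h>

#define TOY_NR_REQUESTS 4

/* Hypothetical stand-in for struct netfs_inode. */
struct toy_inode {
    pthread_mutex_t lock;
    pthread_cond_t idle;   /* signalled when io_count drops to zero */
    int io_count;          /* number of outstanding requests */
};

static void toy_inode_init(struct toy_inode *ino)
{
    pthread_mutex_init(&ino->lock, NULL);
    pthread_cond_init(&ino->idle, NULL);
    ino->io_count = 0;
}

/* Counterpart of the increment added to request allocation. */
static void toy_request_start(struct toy_inode *ino)
{
    pthread_mutex_lock(&ino->lock);
    ino->io_count++;
    pthread_mutex_unlock(&ino->lock);
}

/* Counterpart of the decrement-and-wake added to request freeing. */
static void toy_request_done(struct toy_inode *ino)
{
    pthread_mutex_lock(&ino->lock);
    if (--ino->io_count == 0)
        pthread_cond_broadcast(&ino->idle);
    pthread_mutex_unlock(&ino->lock);
}

/* Counterpart of the wait placed at the start of inode eviction. */
static void toy_wait_for_outstanding_io(struct toy_inode *ino)
{
    pthread_mutex_lock(&ino->lock);
    while (ino->io_count != 0)
        pthread_cond_wait(&ino->idle, &ino->lock);
    pthread_mutex_unlock(&ino->lock);
}

/* Each worker models one asynchronous request completing. */
static void *toy_worker(void *arg)
{
    toy_request_done(arg);
    return NULL;
}

/* Returns the number of requests still outstanding once the wait returns. */
int toy_evict_demo(void)
{
    struct toy_inode ino;
    pthread_t workers[TOY_NR_REQUESTS];
    int i, rc, left;

    toy_inode_init(&ino);
    for (i = 0; i < TOY_NR_REQUESTS; i++) {
        toy_request_start(&ino);
        rc = pthread_create(&workers[i], NULL, toy_worker, &ino);
        assert(rc == 0);
        (void)rc;
    }

    toy_wait_for_outstanding_io(&ino);   /* "eviction" may proceed after this */
    left = ino.io_count;

    for (i = 0; i < TOY_NR_REQUESTS; i++)
        pthread_join(workers[i], NULL);
    pthread_cond_destroy(&ino.idle);
    pthread_mutex_destroy(&ino.lock);
    return left;
}
```

The sketch keeps the counter under the mutex rather than using an atomic, which avoids a lost wakeup between the zero check and the sleep at the cost of a lock on every request.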
Hello,
syzbot has tested the proposed patch and the reproducer did not trigger any issue:
Reported-and-tested-by: [email protected]
Tested on:
commit: c760b372 Merge tag 'mm-nonmm-stable-2024-05-22-17-30' ..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=1456cc14980000
kernel config: https://syzkaller.appspot.com/x/.config?x=966dbeb548ca6926
dashboard link: https://syzkaller.appspot.com/bug?extid=d7c7a495a5e466c031b6
compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=1632668a980000
Note: testing is done by a robot and is best-effort only.
David Howells wrote on Thu, May 23, 2024 at 03:37:38PM +0100:
> #syz test: git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
>
> netfs, 9p: Fix race between umount and async request completion [v2]
>
> There's a problem in 9p's interaction with netfslib whereby a crash occurs
> because the 9p_fid structs get forcibly destroyed during client teardown
> (without paying attention to their refcounts) before netfslib has finished
> with them. However, it's not a simple case of deferring the clunking that
> p9_fid_put() does as that requires the client.
"as that requires the client" doesn't parse
> The problem is that netfslib has to unlock pages and clear the IN_PROGRESS
> flag before destroying the objects involved - including the pid - and, in
s/pid/fid/
> any case, nothing checks to see if writeback completed barring looking at
> the page flags.
>
> Fix this by keeping a count of outstanding I/O requests (of any type) and
> waiting for it to quiesce during inode eviction.
>
> Reported-by: [email protected]
> Link: https://lore.kernel.org/all/[email protected]/
> Reported-by: [email protected]
> Link: https://lore.kernel.org/all/[email protected]/
> Reported-by: [email protected]
> Link: https://lore.kernel.org/all/[email protected]/
> Signed-off-by: David Howells <[email protected]>
> cc: Eric Van Hensbergen <[email protected]>
> cc: Latchesar Ionkov <[email protected]>
> cc: Dominique Martinet <[email protected]>
With these two nitpicks in commit message addressed, looks good to me,
thanks!
Reviewed-by: Dominique Martinet <[email protected]>
--
Dominique Martinet | Asmadeus
[email protected] wrote:
> > There's a problem in 9p's interaction with netfslib whereby a crash occurs
> > because the 9p_fid structs get forcibly destroyed during client teardown
> > (without paying attention to their refcounts) before netfslib has finished
> > with them. However, it's not a simple case of deferring the clunking that
> > p9_fid_put() does as that requires the client.
>
> "as that requires the client" doesn't parse
"... as that requires the p9_client record to still be present."?
David
David Howells wrote on Thu, May 23, 2024 at 07:07:05PM +0100:
> [email protected] wrote:
>
> > > There's a problem in 9p's interaction with netfslib whereby a crash occurs
> > > because the 9p_fid structs get forcibly destroyed during client teardown
> > > (without paying attention to their refcounts) before netfslib has finished
> > > with them. However, it's not a simple case of deferring the clunking that
> > > p9_fid_put() does as that requires the client.
> >
> > "as that requires the client" doesn't parse
>
> "... as that requires the p9_client record to still be present."?
Ah! yes, that works.
'as that uses/depends on the client' would also work.
Thanks,
--
Dominique Martinet | Asmadeus