Hello,
syzbot found the following issue on:
HEAD commit: 5ad3cb0ed525 Merge tag 'for-v6.8-rc2' of git://git.kernel...
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=16e39306180000
kernel config: https://syzkaller.appspot.com/x/.config?x=fad652894fc96962
dashboard link: https://syzkaller.appspot.com/bug?extid=b91eb2ed18f599dd3c31
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=149defac180000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=13877412180000
Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/16eec3abf3df/disk-5ad3cb0e.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/3a72615a6fd4/vmlinux-5ad3cb0e.xz
kernel image: https://storage.googleapis.com/syzbot-assets/6f728d92d6c1/bzImage-5ad3cb0e.xz
The issue was bisected to:
commit 54cbc058d86beca3515c994039b5c0f0a34f53dd
Author: Bart Van Assche <[email protected]>
Date: Thu Feb 15 20:47:39 2024 +0000
fs/aio: Make io_cancel() generate completions again
bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=1748740e180000
final oops: https://syzkaller.appspot.com/x/report.txt?x=14c8740e180000
console output: https://syzkaller.appspot.com/x/log.txt?x=10c8740e180000
IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]
Fixes: 54cbc058d86b ("fs/aio: Make io_cancel() generate completions again")
==================================================================
BUG: KASAN: slab-use-after-free in __do_sys_io_cancel fs/aio.c:2207 [inline]
BUG: KASAN: slab-use-after-free in __se_sys_io_cancel+0x2c7/0x2d0 fs/aio.c:2174
Read of size 4 at addr ffff88801f857020 by task syz-executor142/5065
CPU: 0 PID: 5065 Comm: syz-executor142 Not tainted 6.8.0-rc6-syzkaller-00238-g5ad3cb0ed525 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1e7/0x2e0 lib/dump_stack.c:106
print_address_description mm/kasan/report.c:377 [inline]
print_report+0x167/0x540 mm/kasan/report.c:488
kasan_report+0x142/0x180 mm/kasan/report.c:601
__do_sys_io_cancel fs/aio.c:2207 [inline]
__se_sys_io_cancel+0x2c7/0x2d0 fs/aio.c:2174
do_syscall_64+0xf9/0x240
entry_SYSCALL_64_after_hwframe+0x6f/0x77
RIP: 0033:0x7f1c0c3bd3e9
Code: 48 83 c4 28 c3 e8 37 17 00 00 0f 1f 80 00 00 00 00 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b8 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007ffe5ffcdc98 EFLAGS: 00000246 ORIG_RAX: 00000000000000d2
RAX: ffffffffffffffda RBX: 00007ffe5ffcde68 RCX: 00007f1c0c3bd3e9
RDX: 0000000000000000 RSI: 00000000200001c0 RDI: 00007f1c0c349000
RBP: 00007f1c0c430610 R08: 00007ffe5ffcde68 R09: 00007ffe5ffcde68
R10: 00007ffe5ffcde68 R11: 0000000000000246 R12: 0000000000000001
R13: 00007ffe5ffcde58 R14: 0000000000000001 R15: 0000000000000001
</TASK>
Allocated by task 5065:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
unpoison_slab_object mm/kasan/common.c:312 [inline]
__kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:338
kasan_slab_alloc include/linux/kasan.h:201 [inline]
slab_post_alloc_hook mm/slub.c:3813 [inline]
slab_alloc_node mm/slub.c:3860 [inline]
kmem_cache_alloc+0x16f/0x340 mm/slub.c:3867
aio_get_req fs/aio.c:1060 [inline]
io_submit_one+0x154/0x18b0 fs/aio.c:2050
__do_sys_io_submit fs/aio.c:2113 [inline]
__se_sys_io_submit+0x17f/0x300 fs/aio.c:2083
do_syscall_64+0xf9/0x240
entry_SYSCALL_64_after_hwframe+0x6f/0x77
Freed by task 4812:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:589
poison_slab_object+0xa6/0xe0 mm/kasan/common.c:240
__kasan_slab_free+0x37/0x60 mm/kasan/common.c:256
kasan_slab_free include/linux/kasan.h:184 [inline]
slab_free_hook mm/slub.c:2121 [inline]
slab_free mm/slub.c:4299 [inline]
kmem_cache_free+0x102/0x2a0 mm/slub.c:4363
aio_poll_complete_work+0x467/0x670 fs/aio.c:1767
process_one_work kernel/workqueue.c:2633 [inline]
process_scheduled_works+0x913/0x1420 kernel/workqueue.c:2706
worker_thread+0xa5f/0x1000 kernel/workqueue.c:2787
kthread+0x2ef/0x390 kernel/kthread.c:388
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1b/0x30 arch/x86/entry/entry_64.S:243
Last potentially related work creation:
kasan_save_stack+0x3f/0x60 mm/kasan/common.c:47
__kasan_record_aux_stack+0xac/0xc0 mm/kasan/generic.c:551
insert_work+0x3e/0x330 kernel/workqueue.c:1653
__queue_work+0xbf4/0x1000 kernel/workqueue.c:1802
queue_work_on+0x14f/0x250 kernel/workqueue.c:1837
queue_work include/linux/workqueue.h:548 [inline]
schedule_work include/linux/workqueue.h:609 [inline]
aio_poll_cancel+0xbb/0x130 fs/aio.c:1779
__do_sys_io_cancel fs/aio.c:2196 [inline]
__se_sys_io_cancel+0x126/0x2d0 fs/aio.c:2174
do_syscall_64+0xf9/0x240
entry_SYSCALL_64_after_hwframe+0x6f/0x77
The buggy address belongs to the object at ffff88801f857000
which belongs to the cache aio_kiocb of size 216
The buggy address is located 32 bytes inside of
freed 216-byte region [ffff88801f857000, ffff88801f8570d8)
The buggy address belongs to the physical page:
page:ffffea00007e15c0 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1f857
flags: 0xfff00000000800(slab|node=0|zone=1|lastcpupid=0x7ff)
page_type: 0xffffffff()
raw: 00fff00000000800 ffff88801a6a9500 dead000000000122 0000000000000000
raw: 0000000000000000 00000000800c000c 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 0, migratetype Unmovable, gfp_mask 0x12cc0(GFP_KERNEL|__GFP_NOWARN|__GFP_NORETRY), pid 5065, tgid 5065 (syz-executor142), ts 69523425470, free_ts 63195641417
set_page_owner include/linux/page_owner.h:31 [inline]
post_alloc_hook+0x1ea/0x210 mm/page_alloc.c:1533
prep_new_page mm/page_alloc.c:1540 [inline]
get_page_from_freelist+0x33ea/0x3580 mm/page_alloc.c:3311
__alloc_pages+0x255/0x680 mm/page_alloc.c:4567
__alloc_pages_node include/linux/gfp.h:238 [inline]
alloc_pages_node include/linux/gfp.h:261 [inline]
alloc_slab_page+0x5f/0x160 mm/slub.c:2190
allocate_slab mm/slub.c:2354 [inline]
new_slab+0x84/0x2f0 mm/slub.c:2407
___slab_alloc+0xd17/0x13e0 mm/slub.c:3540
__slab_alloc mm/slub.c:3625 [inline]
__slab_alloc_node mm/slub.c:3678 [inline]
slab_alloc_node mm/slub.c:3850 [inline]
kmem_cache_alloc+0x24d/0x340 mm/slub.c:3867
aio_get_req fs/aio.c:1060 [inline]
io_submit_one+0x154/0x18b0 fs/aio.c:2050
__do_sys_io_submit fs/aio.c:2113 [inline]
__se_sys_io_submit+0x17f/0x300 fs/aio.c:2083
do_syscall_64+0xf9/0x240
entry_SYSCALL_64_after_hwframe+0x6f/0x77
page last free pid 4504 tgid 4504 stack trace:
reset_page_owner include/linux/page_owner.h:24 [inline]
free_pages_prepare mm/page_alloc.c:1140 [inline]
free_unref_page_prepare+0x968/0xa90 mm/page_alloc.c:2346
free_unref_page+0x37/0x3f0 mm/page_alloc.c:2486
__slab_free+0x349/0x410 mm/slub.c:4211
qlink_free mm/kasan/quarantine.c:163 [inline]
qlist_free_all+0x5e/0xc0 mm/kasan/quarantine.c:179
kasan_quarantine_reduce+0x14f/0x170 mm/kasan/quarantine.c:286
__kasan_slab_alloc+0x23/0x80 mm/kasan/common.c:322
kasan_slab_alloc include/linux/kasan.h:201 [inline]
slab_post_alloc_hook mm/slub.c:3813 [inline]
slab_alloc_node mm/slub.c:3860 [inline]
kmem_cache_alloc+0x16f/0x340 mm/slub.c:3867
getname_flags+0xbc/0x4f0 fs/namei.c:140
do_sys_openat2+0xd2/0x1d0 fs/open.c:1398
do_sys_open fs/open.c:1419 [inline]
__do_sys_openat fs/open.c:1435 [inline]
__se_sys_openat fs/open.c:1430 [inline]
__x64_sys_openat+0x247/0x2a0 fs/open.c:1430
do_syscall_64+0xf9/0x240
entry_SYSCALL_64_after_hwframe+0x6f/0x77
Memory state around the buggy address:
ffff88801f856f00: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
ffff88801f856f80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
>ffff88801f857000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff88801f857080: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc
ffff88801f857100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].
syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title
If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.
If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)
If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report
If you want to undo deduplication, reply with:
#syz undup
please test uaf in sys_io_cancel
#syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
diff --git a/fs/aio.c b/fs/aio.c
index 28223f511931..0fed22ed9eb8 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1762,9 +1762,8 @@ static void aio_poll_complete_work(struct work_struct *work)
} /* else, POLLFREE has freed the waitqueue, so we must complete */
list_del_init(&iocb->ki_list);
iocb->ki_res.res = mangle_poll(mask);
- spin_unlock_irq(&ctx->ctx_lock);
-
iocb_put(iocb);
+ spin_unlock_irq(&ctx->ctx_lock);
}
/* assumes we are called with irqs disabled */
@@ -2198,7 +2197,6 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb,
break;
}
}
- spin_unlock_irq(&ctx->ctx_lock);
/*
* The result argument is no longer used - the io_event is always
@@ -2206,6 +2204,7 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb,
*/
if (ret == 0 && kiocb->rw.ki_flags & IOCB_AIO_RW)
aio_complete_rw(&kiocb->rw, -EINTR);
+ spin_unlock_irq(&ctx->ctx_lock);
percpu_ref_put(&ctx->users);
Hello,
syzbot has tested the proposed patch and the reproducer did not trigger any issue:
Reported-and-tested-by: [email protected]
Tested on:
commit: 04b8076d Merge tag 'firewire-fixes-6.8-rc7' of git://g..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
console output: https://syzkaller.appspot.com/x/log.txt?x=14602c6a180000
kernel config: https://syzkaller.appspot.com/x/.config?x=fad652894fc96962
dashboard link: https://syzkaller.appspot.com/bug?extid=b91eb2ed18f599dd3c31
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=1575e754180000
Note: testing is done by a robot and is best-effort only.
please test uaf in sys_io_cancel
#syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
diff --git a/fs/aio.c b/fs/aio.c
index 28223f511931..38c556514198 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1762,6 +1762,8 @@ static void aio_poll_complete_work(struct work_struct *work)
} /* else, POLLFREE has freed the waitqueue, so we must complete */
list_del_init(&iocb->ki_list);
iocb->ki_res.res = mangle_poll(mask);
+ if (refcount_read(&iocb->ki_refcnt) == 1)
+ iocb->ki_res.obj = -EINVAL;
spin_unlock_irq(&ctx->ctx_lock);
iocb_put(iocb);
Hello,
syzbot has tested the proposed patch but the reproducer is still triggering an issue:
KASAN: slab-use-after-free Read in sys_io_cancel
==================================================================
BUG: KASAN: slab-use-after-free in __do_sys_io_cancel fs/aio.c:2209 [inline]
BUG: KASAN: slab-use-after-free in __se_sys_io_cancel+0x2c7/0x2d0 fs/aio.c:2176
Read of size 4 at addr ffff88802d7b6020 by task syz-executor.0/5486
CPU: 0 PID: 5486 Comm: syz-executor.0 Not tainted 6.8.0-rc6-syzkaller-g04b8076df253-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1e7/0x2e0 lib/dump_stack.c:106
print_address_description mm/kasan/report.c:377 [inline]
print_report+0x167/0x540 mm/kasan/report.c:488
kasan_report+0x142/0x180 mm/kasan/report.c:601
__do_sys_io_cancel fs/aio.c:2209 [inline]
__se_sys_io_cancel+0x2c7/0x2d0 fs/aio.c:2176
do_syscall_64+0xf9/0x240
entry_SYSCALL_64_after_hwframe+0x6f/0x77
RIP: 0033:0x7f693e07dda9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f693ed4f0c8 EFLAGS: 00000246 ORIG_RAX: 00000000000000d2
RAX: ffffffffffffffda RBX: 00007f693e1abf80 RCX: 00007f693e07dda9
RDX: 0000000000000000 RSI: 0000000020000180 RDI: 00007f693ed2e000
RBP: 00007f693e0ca47a R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000000b R14: 00007f693e1abf80 R15: 00007ffc96ae9b58
</TASK>
Allocated by task 5486:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
unpoison_slab_object mm/kasan/common.c:312 [inline]
__kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:338
kasan_slab_alloc include/linux/kasan.h:201 [inline]
slab_post_alloc_hook mm/slub.c:3813 [inline]
slab_alloc_node mm/slub.c:3860 [inline]
kmem_cache_alloc+0x16f/0x340 mm/slub.c:3867
aio_get_req fs/aio.c:1060 [inline]
io_submit_one+0x154/0x18b0 fs/aio.c:2052
__do_sys_io_submit fs/aio.c:2115 [inline]
__se_sys_io_submit+0x17f/0x300 fs/aio.c:2085
do_syscall_64+0xf9/0x240
entry_SYSCALL_64_after_hwframe+0x6f/0x77
Freed by task 4850:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:589
poison_slab_object+0xa6/0xe0 mm/kasan/common.c:240
__kasan_slab_free+0x37/0x60 mm/kasan/common.c:256
kasan_slab_free include/linux/kasan.h:184 [inline]
slab_free_hook mm/slub.c:2121 [inline]
slab_free mm/slub.c:4299 [inline]
kmem_cache_free+0x102/0x2a0 mm/slub.c:4363
aio_poll_complete_work+0x4e7/0x710 fs/aio.c:1769
process_one_work kernel/workqueue.c:2633 [inline]
process_scheduled_works+0x913/0x1420 kernel/workqueue.c:2706
worker_thread+0xa5f/0x1000 kernel/workqueue.c:2787
kthread+0x2ef/0x390 kernel/kthread.c:388
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1b/0x30 arch/x86/entry/entry_64.S:243
Last potentially related work creation:
kasan_save_stack+0x3f/0x60 mm/kasan/common.c:47
__kasan_record_aux_stack+0xac/0xc0 mm/kasan/generic.c:551
insert_work+0x3e/0x330 kernel/workqueue.c:1653
__queue_work+0xbf4/0x1000 kernel/workqueue.c:1802
queue_work_on+0x14f/0x250 kernel/workqueue.c:1837
queue_work include/linux/workqueue.h:548 [inline]
schedule_work include/linux/workqueue.h:609 [inline]
aio_poll_cancel+0xbb/0x130 fs/aio.c:1781
__do_sys_io_cancel fs/aio.c:2198 [inline]
__se_sys_io_cancel+0x126/0x2d0 fs/aio.c:2176
do_syscall_64+0xf9/0x240
entry_SYSCALL_64_after_hwframe+0x6f/0x77
The buggy address belongs to the object at ffff88802d7b6000
which belongs to the cache aio_kiocb of size 216
The buggy address is located 32 bytes inside of
freed 216-byte region [ffff88802d7b6000, ffff88802d7b60d8)
The buggy address belongs to the physical page:
page:ffffea0000b5ed80 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x2d7b6
flags: 0xfff00000000800(slab|node=0|zone=1|lastcpupid=0x7ff)
page_type: 0xffffffff()
raw: 00fff00000000800 ffff88801676ab40 dead000000000122 0000000000000000
raw: 0000000000000000 00000000800c000c 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 0, migratetype Unmovable, gfp_mask 0x112cc0(GFP_USER|__GFP_NOWARN|__GFP_NORETRY), pid 5486, tgid 5484 (syz-executor.0), ts 88796805839, free_ts 88787850821
set_page_owner include/linux/page_owner.h:31 [inline]
post_alloc_hook+0x1ea/0x210 mm/page_alloc.c:1533
prep_new_page mm/page_alloc.c:1540 [inline]
get_page_from_freelist+0x33ea/0x3580 mm/page_alloc.c:3311
__alloc_pages+0x255/0x680 mm/page_alloc.c:4567
__alloc_pages_node include/linux/gfp.h:238 [inline]
alloc_pages_node include/linux/gfp.h:261 [inline]
alloc_slab_page+0x5f/0x160 mm/slub.c:2190
allocate_slab mm/slub.c:2354 [inline]
new_slab+0x84/0x2f0 mm/slub.c:2407
___slab_alloc+0xd17/0x13e0 mm/slub.c:3540
__slab_alloc mm/slub.c:3625 [inline]
__slab_alloc_node mm/slub.c:3678 [inline]
slab_alloc_node mm/slub.c:3850 [inline]
kmem_cache_alloc+0x24d/0x340 mm/slub.c:3867
aio_get_req fs/aio.c:1060 [inline]
io_submit_one+0x154/0x18b0 fs/aio.c:2052
__do_sys_io_submit fs/aio.c:2115 [inline]
__se_sys_io_submit+0x17f/0x300 fs/aio.c:2085
do_syscall_64+0xf9/0x240
entry_SYSCALL_64_after_hwframe+0x6f/0x77
page last free pid 5485 tgid 5485 stack trace:
reset_page_owner include/linux/page_owner.h:24 [inline]
free_pages_prepare mm/page_alloc.c:1140 [inline]
free_unref_page_prepare+0x968/0xa90 mm/page_alloc.c:2346
free_unref_page+0x37/0x3f0 mm/page_alloc.c:2486
discard_slab mm/slub.c:2453 [inline]
__put_partials+0xeb/0x130 mm/slub.c:2922
put_cpu_partial+0x17b/0x250 mm/slub.c:2997
__slab_free+0x302/0x410 mm/slub.c:4166
qlink_free mm/kasan/quarantine.c:163 [inline]
qlist_free_all+0x5e/0xc0 mm/kasan/quarantine.c:179
kasan_quarantine_reduce+0x14f/0x170 mm/kasan/quarantine.c:286
__kasan_slab_alloc+0x23/0x80 mm/kasan/common.c:322
kasan_slab_alloc include/linux/kasan.h:201 [inline]
slab_post_alloc_hook mm/slub.c:3813 [inline]
slab_alloc_node mm/slub.c:3860 [inline]
kmem_cache_alloc+0x16f/0x340 mm/slub.c:3867
getname_flags+0xbc/0x4f0 fs/namei.c:140
user_path_at_empty+0x2c/0x60 fs/namei.c:2923
user_path_at include/linux/namei.h:57 [inline]
__do_sys_chdir fs/open.c:556 [inline]
__se_sys_chdir+0xbf/0x220 fs/open.c:550
do_syscall_64+0xf9/0x240
entry_SYSCALL_64_after_hwframe+0x6f/0x77
Memory state around the buggy address:
ffff88802d7b5f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff88802d7b5f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88802d7b6000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff88802d7b6080: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc
ffff88802d7b6100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
Tested on:
commit: 04b8076d Merge tag 'firewire-fixes-6.8-rc7' of git://g..
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
console output: https://syzkaller.appspot.com/x/log.txt?x=11d548e4180000
kernel config: https://syzkaller.appspot.com/x/.config?x=fad652894fc96962
dashboard link: https://syzkaller.appspot.com/bug?extid=b91eb2ed18f599dd3c31
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=1243e082180000
The aio poll work aio_poll_complete_work() need to be synchronized with syscall
io_cancel(). Otherwise, when poll work executes first, syscall may access the
released aio_kiocb object.
Fixes: 54cbc058d86b ("fs/aio: Make io_cancel() generate completions again")
Reported-and-tested-by: [email protected]
Signed-off-by: Edward Adam Davis <[email protected]>
---
fs/aio.c | 5 ++---
1 file changed, 2 insertions(+), 3 deletions(-)
diff --git a/fs/aio.c b/fs/aio.c
index 28223f511931..0fed22ed9eb8 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -1762,9 +1762,8 @@ static void aio_poll_complete_work(struct work_struct *work)
} /* else, POLLFREE has freed the waitqueue, so we must complete */
list_del_init(&iocb->ki_list);
iocb->ki_res.res = mangle_poll(mask);
- spin_unlock_irq(&ctx->ctx_lock);
-
iocb_put(iocb);
+ spin_unlock_irq(&ctx->ctx_lock);
}
/* assumes we are called with irqs disabled */
@@ -2198,7 +2197,6 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb,
break;
}
}
- spin_unlock_irq(&ctx->ctx_lock);
/*
* The result argument is no longer used - the io_event is always
@@ -2206,6 +2204,7 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb,
*/
if (ret == 0 && kiocb->rw.ki_flags & IOCB_AIO_RW)
aio_complete_rw(&kiocb->rw, -EINTR);
+ spin_unlock_irq(&ctx->ctx_lock);
percpu_ref_put(&ctx->users);
--
2.43.0
On Sat, 02 Mar 2024 23:29:23 -0800
> syzbot found the following issue on:
>
> HEAD commit: 5ad3cb0ed525 Merge tag 'for-v6.8-rc2' of git://git.kernel...
> git tree: upstream
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=13877412180000
#syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
--- x/fs/aio.c
+++ y/fs/aio.c
@@ -2195,6 +2195,11 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t
if (kiocb->ki_res.obj == obj) {
ret = kiocb->ki_cancel(&kiocb->rw);
list_del_init(&kiocb->ki_list);
+ if (ret == 0 && kiocb->rw.ki_flags & IOCB_AIO_RW) {
+ struct aio_kiocb *iocb = container_of(&kiocb->rw,
+ struct aio_kiocb, rw);
+ refcount_inc(&iocb->ki_refcnt);
+ }
break;
}
}
--
Hello,
syzbot has tested the proposed patch but the reproducer is still triggering an issue:
KASAN: slab-use-after-free Read in sys_io_cancel
==================================================================
BUG: KASAN: slab-use-after-free in __do_sys_io_cancel fs/aio.c:2212 [inline]
BUG: KASAN: slab-use-after-free in __se_sys_io_cancel+0x3a2/0x3b0 fs/aio.c:2174
Read of size 4 at addr ffff88801fe56020 by task syz-executor.0/5491
CPU: 0 PID: 5491 Comm: syz-executor.0 Not tainted 6.8.0-rc7-syzkaller-g90d35da658da-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/25/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x1e7/0x2e0 lib/dump_stack.c:106
print_address_description mm/kasan/report.c:377 [inline]
print_report+0x167/0x540 mm/kasan/report.c:488
kasan_report+0x142/0x180 mm/kasan/report.c:601
__do_sys_io_cancel fs/aio.c:2212 [inline]
__se_sys_io_cancel+0x3a2/0x3b0 fs/aio.c:2174
do_syscall_64+0xf9/0x240
entry_SYSCALL_64_after_hwframe+0x6f/0x77
RIP: 0033:0x7f963407dda9
Code: 28 00 00 00 75 05 48 83 c4 28 c3 e8 e1 20 00 00 90 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f9634ed70c8 EFLAGS: 00000246 ORIG_RAX: 00000000000000d2
RAX: ffffffffffffffda RBX: 00007f96341abf80 RCX: 00007f963407dda9
RDX: 0000000000000000 RSI: 0000000020000180 RDI: 00007f9634eb6000
RBP: 00007f96340ca47a R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000000b R14: 00007f96341abf80 R15: 00007ffc37361bc8
</TASK>
Allocated by task 5491:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
unpoison_slab_object mm/kasan/common.c:312 [inline]
__kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:338
kasan_slab_alloc include/linux/kasan.h:201 [inline]
slab_post_alloc_hook mm/slub.c:3813 [inline]
slab_alloc_node mm/slub.c:3860 [inline]
kmem_cache_alloc+0x16f/0x340 mm/slub.c:3867
aio_get_req fs/aio.c:1060 [inline]
io_submit_one+0x154/0x18b0 fs/aio.c:2050
__do_sys_io_submit fs/aio.c:2113 [inline]
__se_sys_io_submit+0x17f/0x300 fs/aio.c:2083
do_syscall_64+0xf9/0x240
entry_SYSCALL_64_after_hwframe+0x6f/0x77
Freed by task 5082:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:589
poison_slab_object+0xa6/0xe0 mm/kasan/common.c:240
__kasan_slab_free+0x37/0x60 mm/kasan/common.c:256
kasan_slab_free include/linux/kasan.h:184 [inline]
slab_free_hook mm/slub.c:2121 [inline]
slab_free mm/slub.c:4299 [inline]
kmem_cache_free+0x102/0x2a0 mm/slub.c:4363
aio_poll_complete_work+0x467/0x670 fs/aio.c:1767
process_one_work kernel/workqueue.c:2633 [inline]
process_scheduled_works+0x913/0x1420 kernel/workqueue.c:2706
worker_thread+0xa5f/0x1000 kernel/workqueue.c:2787
kthread+0x2ef/0x390 kernel/kthread.c:388
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1b/0x30 arch/x86/entry/entry_64.S:243
Last potentially related work creation:
kasan_save_stack+0x3f/0x60 mm/kasan/common.c:47
__kasan_record_aux_stack+0xac/0xc0 mm/kasan/generic.c:551
insert_work+0x3e/0x330 kernel/workqueue.c:1653
__queue_work+0xbf4/0x1000 kernel/workqueue.c:1802
queue_work_on+0x14f/0x250 kernel/workqueue.c:1837
queue_work include/linux/workqueue.h:548 [inline]
schedule_work include/linux/workqueue.h:609 [inline]
aio_poll_cancel+0xbb/0x130 fs/aio.c:1779
__do_sys_io_cancel fs/aio.c:2196 [inline]
__se_sys_io_cancel+0x122/0x3b0 fs/aio.c:2174
do_syscall_64+0xf9/0x240
entry_SYSCALL_64_after_hwframe+0x6f/0x77
The buggy address belongs to the object at ffff88801fe56000
which belongs to the cache aio_kiocb of size 216
The buggy address is located 32 bytes inside of
freed 216-byte region [ffff88801fe56000, ffff88801fe560d8)
The buggy address belongs to the physical page:
page:ffffea00007f9580 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x1fe56
flags: 0xfff00000000800(slab|node=0|zone=1|lastcpupid=0x7ff)
page_type: 0xffffffff()
raw: 00fff00000000800 ffff88801772adc0 dead000000000122 0000000000000000
raw: 0000000000000000 00000000800c000c 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 0, migratetype Unmovable, gfp_mask 0x112cc0(GFP_USER|__GFP_NOWARN|__GFP_NORETRY), pid 5491, tgid 5490 (syz-executor.0), ts 90500359314, free_ts 90328660331
set_page_owner include/linux/page_owner.h:31 [inline]
post_alloc_hook+0x1ea/0x210 mm/page_alloc.c:1533
prep_new_page mm/page_alloc.c:1540 [inline]
get_page_from_freelist+0x33ea/0x3580 mm/page_alloc.c:3311
__alloc_pages+0x255/0x680 mm/page_alloc.c:4567
__alloc_pages_node include/linux/gfp.h:238 [inline]
alloc_pages_node include/linux/gfp.h:261 [inline]
alloc_slab_page+0x5f/0x160 mm/slub.c:2190
allocate_slab mm/slub.c:2354 [inline]
new_slab+0x84/0x2f0 mm/slub.c:2407
___slab_alloc+0xd17/0x13e0 mm/slub.c:3540
__slab_alloc mm/slub.c:3625 [inline]
__slab_alloc_node mm/slub.c:3678 [inline]
slab_alloc_node mm/slub.c:3850 [inline]
kmem_cache_alloc+0x24d/0x340 mm/slub.c:3867
aio_get_req fs/aio.c:1060 [inline]
io_submit_one+0x154/0x18b0 fs/aio.c:2050
__do_sys_io_submit fs/aio.c:2113 [inline]
__se_sys_io_submit+0x17f/0x300 fs/aio.c:2083
do_syscall_64+0xf9/0x240
entry_SYSCALL_64_after_hwframe+0x6f/0x77
page last free pid 5470 tgid 5470 stack trace:
reset_page_owner include/linux/page_owner.h:24 [inline]
free_pages_prepare mm/page_alloc.c:1140 [inline]
free_unref_page_prepare+0x968/0xa90 mm/page_alloc.c:2346
free_unref_page_list+0x5a3/0x850 mm/page_alloc.c:2532
release_pages+0x2744/0x2a80 mm/swap.c:1042
tlb_batch_pages_flush mm/mmu_gather.c:98 [inline]
tlb_flush_mmu_free mm/mmu_gather.c:293 [inline]
tlb_flush_mmu+0x34c/0x4e0 mm/mmu_gather.c:300
tlb_finish_mmu+0xd4/0x200 mm/mmu_gather.c:392
exit_mmap+0x4b6/0xd40 mm/mmap.c:3292
__mmput+0x115/0x3c0 kernel/fork.c:1343
exit_mm+0x21f/0x310 kernel/exit.c:569
do_exit+0x9af/0x2740 kernel/exit.c:858
do_group_exit+0x206/0x2c0 kernel/exit.c:1020
__do_sys_exit_group kernel/exit.c:1031 [inline]
__se_sys_exit_group kernel/exit.c:1029 [inline]
__x64_sys_exit_group+0x3f/0x40 kernel/exit.c:1029
do_syscall_64+0xf9/0x240
entry_SYSCALL_64_after_hwframe+0x6f/0x77
Memory state around the buggy address:
ffff88801fe55f00: 00 00 00 00 00 fc fc fc fc fc fc fc fc fc fc fc
ffff88801fe55f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
>ffff88801fe56000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff88801fe56080: fb fb fb fb fb fb fb fb fb fb fb fc fc fc fc fc
ffff88801fe56100: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
==================================================================
Tested on:
commit: 90d35da6 Linux 6.8-rc7
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
console output: https://syzkaller.appspot.com/x/log.txt?x=1665e3ca180000
kernel config: https://syzkaller.appspot.com/x/.config?x=c11c5c676adb61f0
dashboard link: https://syzkaller.appspot.com/bug?extid=b91eb2ed18f599dd3c31
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=12d5b14c180000
On Sat, 02 Mar 2024 23:29:23 -0800
> syzbot found the following issue on:
>
> HEAD commit: 5ad3cb0ed525 Merge tag 'for-v6.8-rc2' of git://git.kernel...
> git tree: upstream
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=13877412180000
#syz test https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
--- x/fs/aio.c
+++ y/fs/aio.c
@@ -2194,6 +2194,8 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t
list_for_each_entry(kiocb, &ctx->active_reqs, ki_list) {
if (kiocb->ki_res.obj == obj) {
ret = kiocb->ki_cancel(&kiocb->rw);
+ if (ret == 0)
+ refcount_inc(&kiocb->ki_refcnt);
list_del_init(&kiocb->ki_list);
break;
}
@@ -2204,8 +2206,11 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t
* The result argument is no longer used - the io_event is always
* delivered via the ring buffer.
*/
- if (ret == 0 && kiocb->rw.ki_flags & IOCB_AIO_RW)
- aio_complete_rw(&kiocb->rw, -EINTR);
+ if (ret == 0)
+ if (kiocb->rw.ki_flags & IOCB_AIO_RW)
+ aio_complete_rw(&kiocb->rw, -EINTR);
+ else
+ iocb_put(kiocb);
percpu_ref_put(&ctx->users);
--
Hello,
syzbot has tested the proposed patch and the reproducer did not trigger any issue:
Reported-and-tested-by: [email protected]
Tested on:
commit: 90d35da6 Linux 6.8-rc7
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git master
console output: https://syzkaller.appspot.com/x/log.txt?x=111f9306180000
kernel config: https://syzkaller.appspot.com/x/.config?x=c11c5c676adb61f0
dashboard link: https://syzkaller.appspot.com/bug?extid=b91eb2ed18f599dd3c31
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=16594d6a180000
Note: testing is done by a robot and is best-effort only.
On 3/3/24 04:21, Edward Adam Davis wrote:
> The aio poll work aio_poll_complete_work() need to be synchronized with syscall
> io_cancel(). Otherwise, when poll work executes first, syscall may access the
> released aio_kiocb object.
>
> Fixes: 54cbc058d86b ("fs/aio: Make io_cancel() generate completions again")
> Reported-and-tested-by: [email protected]
> Signed-off-by: Edward Adam Davis <[email protected]>
> ---
> fs/aio.c | 5 ++---
> 1 file changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/fs/aio.c b/fs/aio.c
> index 28223f511931..0fed22ed9eb8 100644
> --- a/fs/aio.c
> +++ b/fs/aio.c
> @@ -1762,9 +1762,8 @@ static void aio_poll_complete_work(struct work_struct *work)
> } /* else, POLLFREE has freed the waitqueue, so we must complete */
> list_del_init(&iocb->ki_list);
> iocb->ki_res.res = mangle_poll(mask);
> - spin_unlock_irq(&ctx->ctx_lock);
> -
> iocb_put(iocb);
> + spin_unlock_irq(&ctx->ctx_lock);
> }
>
> /* assumes we are called with irqs disabled */
> @@ -2198,7 +2197,6 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb,
> break;
> }
> }
> - spin_unlock_irq(&ctx->ctx_lock);
>
> /*
> * The result argument is no longer used - the io_event is always
> @@ -2206,6 +2204,7 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id, struct iocb __user *, iocb,
> */
> if (ret == 0 && kiocb->rw.ki_flags & IOCB_AIO_RW)
> aio_complete_rw(&kiocb->rw, -EINTR);
> + spin_unlock_irq(&ctx->ctx_lock);
>
> percpu_ref_put(&ctx->users);
I'm not enthusiast about the above patch because it increases the amount
of code executed with the ctx_lock held. Wouldn't something like the
untested patch below be a better solution?
Thanks,
Bart.
diff --git a/fs/aio.c b/fs/aio.c
index 28223f511931..c6fb10321e48 100644
--- a/fs/aio.c
+++ b/fs/aio.c
@@ -2177,6 +2177,7 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id,
struct iocb __user *, iocb,
struct kioctx *ctx;
struct aio_kiocb *kiocb;
int ret = -EINVAL;
+ bool is_cancelled_rw = false;
u32 key;
u64 obj = (u64)(unsigned long)iocb;
@@ -2193,6 +2194,7 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id,
struct iocb __user *, iocb,
/* TODO: use a hash or array, this sucks. */
list_for_each_entry(kiocb, &ctx->active_reqs, ki_list) {
if (kiocb->ki_res.obj == obj) {
+ is_cancelled_rw = kiocb->rw.ki_flags & IOCB_AIO_RW;
ret = kiocb->ki_cancel(&kiocb->rw);
list_del_init(&kiocb->ki_list);
break;
@@ -2204,7 +2206,7 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id,
struct iocb __user *, iocb,
* The result argument is no longer used - the io_event is always
* delivered via the ring buffer.
*/
- if (ret == 0 && kiocb->rw.ki_flags & IOCB_AIO_RW)
+ if (ret == 0 && is_cancelled_rw)
aio_complete_rw(&kiocb->rw, -EINTR);
percpu_ref_put(&ctx->users);
On Mon, Mar 04, 2024 at 09:15:47AM -0800, Bart Van Assche wrote:
> On 3/4/24 09:03, Benjamin LaHaise wrote:
> >This is just so wrong there aren't even words to describe it. I
> >recommending reverting all of Bart's patches since they were not reviewed
> >by anyone with a sufficient level of familiarity with fs/aio.c to get it
> >right.
>
> Where were you while my patches were posted for review on the fsdevel
> mailing list and before these were sent to Linus?
That doesn't negate the need for someone with experience in a given
subsystem to sign off on the patches. There are at least 2 other people I
would trust to properly review these patches, and none of their signoffs
are present either.
> A revert request should include a detailed explanation of why the revert
> should happen. Just claiming that something is wrong is not sufficient
> to motivate a revert.
A revert is justified when a series of patches is buggy and had
insufficient review prior to merging. Mainline is not the place to be
testing half baked changes. Getting cancellation semantics and locking
right is *VERY HARD*, and the results of getting it wrong are very subtle
and user exploitable. Using the "a kernel warning hit" approach for work
on cancellation is very much a sign that the patches were half baked.
What testing did you do on your patch series? The commit messages show
nothing of interest regarding testing. Why are you touching the kiocb
after ownership has already been passed on to another entity? This is as
bad as touching memory that has been freed. You clearly do not understand
how that code works.
-ben
--
"Thought is the essence of where you are now."
On Mon, Mar 04, 2024 at 08:15:15AM -0800, Bart Van Assche wrote:
> On 3/3/24 04:21, Edward Adam Davis wrote:
> >The aio poll work aio_poll_complete_work() need to be synchronized with
> >syscall
> >io_cancel(). Otherwise, when poll work executes first, syscall may access
> >the
> >released aio_kiocb object.
> >
> >Fixes: 54cbc058d86b ("fs/aio: Make io_cancel() generate completions again")
> >Reported-and-tested-by:
> >[email protected]
> >Signed-off-by: Edward Adam Davis <[email protected]>
> >---
> > fs/aio.c | 5 ++---
> > 1 file changed, 2 insertions(+), 3 deletions(-)
> >
> >diff --git a/fs/aio.c b/fs/aio.c
> >index 28223f511931..0fed22ed9eb8 100644
> >--- a/fs/aio.c
> >+++ b/fs/aio.c
> >@@ -1762,9 +1762,8 @@ static void aio_poll_complete_work(struct
> >work_struct *work)
> > } /* else, POLLFREE has freed the waitqueue, so we must complete */
> > list_del_init(&iocb->ki_list);
> > iocb->ki_res.res = mangle_poll(mask);
> >- spin_unlock_irq(&ctx->ctx_lock);
> >-
> > iocb_put(iocb);
> >+ spin_unlock_irq(&ctx->ctx_lock);
> > }
> >
> > /* assumes we are called with irqs disabled */
> >@@ -2198,7 +2197,6 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id,
> >struct iocb __user *, iocb,
> > break;
> > }
> > }
> >- spin_unlock_irq(&ctx->ctx_lock);
> >
> > /*
> > * The result argument is no longer used - the io_event is always
> >@@ -2206,6 +2204,7 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id,
> >struct iocb __user *, iocb,
> > */
> > if (ret == 0 && kiocb->rw.ki_flags & IOCB_AIO_RW)
> > aio_complete_rw(&kiocb->rw, -EINTR);
> >+ spin_unlock_irq(&ctx->ctx_lock);
> >
> > percpu_ref_put(&ctx->users);
This is just so wrong there aren't even words to describe it. I
recommending reverting all of Bart's patches since they were not reviewed
by anyone with a sufficient level of familiarity with fs/aio.c to get it
right.
-ben
> I'm not enthusiast about the above patch because it increases the amount
> of code executed with the ctx_lock held. Wouldn't something like the
> untested patch below be a better solution?
>
> Thanks,
>
> Bart.
>
>
> diff --git a/fs/aio.c b/fs/aio.c
> index 28223f511931..c6fb10321e48 100644
> --- a/fs/aio.c
> +++ b/fs/aio.c
> @@ -2177,6 +2177,7 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id,
> struct iocb __user *, iocb,
> struct kioctx *ctx;
> struct aio_kiocb *kiocb;
> int ret = -EINVAL;
> + bool is_cancelled_rw = false;
> u32 key;
> u64 obj = (u64)(unsigned long)iocb;
>
> @@ -2193,6 +2194,7 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id,
> struct iocb __user *, iocb,
> /* TODO: use a hash or array, this sucks. */
> list_for_each_entry(kiocb, &ctx->active_reqs, ki_list) {
> if (kiocb->ki_res.obj == obj) {
> + is_cancelled_rw = kiocb->rw.ki_flags & IOCB_AIO_RW;
> ret = kiocb->ki_cancel(&kiocb->rw);
> list_del_init(&kiocb->ki_list);
> break;
> @@ -2204,7 +2206,7 @@ SYSCALL_DEFINE3(io_cancel, aio_context_t, ctx_id,
> struct iocb __user *, iocb,
> * The result argument is no longer used - the io_event is always
> * delivered via the ring buffer.
> */
> - if (ret == 0 && kiocb->rw.ki_flags & IOCB_AIO_RW)
> + if (ret == 0 && is_cancelled_rw)
> aio_complete_rw(&kiocb->rw, -EINTR);
>
> percpu_ref_put(&ctx->users);
>
>
--
"Thought is the essence of where you are now."
On 3/4/24 09:31, Benjamin LaHaise wrote:
> A revert is justified when a series of patches is buggy and had
> insufficient review prior to merging.
That's not how Linux kernel development works. If a bug can get fixed
easily, a fix is preferred instead of reverting + reapplying a patch.
> Using the "a kernel warning hit" approach for work on cancellation is
> very much a sign that the patches were half baked.
Is there perhaps a misunderstanding? My patches fix a kernel warning and
did not introduce any new WARN*() statements.
> Why are you touching the kiocb after ownership has already been
> passed on to another entity?
Touching the kiocb after ownership has been passed is the result of an
oversight. Whether or not kiocb->ki_cancel() transfers ownership depends
on the I/O type. The use-after-free was not introduced on purpose.
Bart.
On 3/4/24 09:03, Benjamin LaHaise wrote:
> This is just so wrong there aren't even words to describe it. I
> recommending reverting all of Bart's patches since they were not reviewed
> by anyone with a sufficient level of familiarity with fs/aio.c to get it
> right.
Where were you while my patches were posted for review on the fsdevel
mailing list and before these were sent to Linus?
A revert request should include a detailed explanation of why the revert
should happen. Just claiming that something is wrong is not sufficient
to motivate a revert.
Bart.
On Mon, Mar 04, 2024 at 09:40:35AM -0800, Bart Van Assche wrote:
> On 3/4/24 09:31, Benjamin LaHaise wrote:
> >A revert is justified when a series of patches is buggy and had
> >insufficient review prior to merging.
>
> That's not how Linux kernel development works. If a bug can get fixed
> easily, a fix is preferred instead of reverting + reapplying a patch.
Your original "fix" is not right, and it wasn't properly tested. Commit
54cbc058d86beca3515c994039b5c0f0a34f53dd needs to be reverted.
> >Using the "a kernel warning hit" approach for work on cancellation is
> >very much a sign that the patches were half baked.
> Is there perhaps a misunderstanding? My patches fix a kernel warning and
> did not introduce any new WARN*() statements.
The change that introduced that callback by you was incorrect and should
be reverted.
> >Why are you touching the kiocb after ownership has already been
> >passed on to another entity?
> Touching the kiocb after ownership has been passed is the result of an
> oversight. Whether or not kiocb->ki_cancel() transfers ownership depends
> on the I/O type. The use-after-free was not introduced on purpose.
Your fix is still incorrect. You're still touching memory that you don't
own. The event should be generated via the ->ki_cancel method, not in the
io_cancel() syscall.
-ben
> Bart.
>
>
--
"Thought is the essence of where you are now."
On 3/4/24 09:47, Benjamin LaHaise wrote:
> On Mon, Mar 04, 2024 at 09:40:35AM -0800, Bart Van Assche wrote:
>> On 3/4/24 09:31, Benjamin LaHaise wrote:
>>> A revert is justified when a series of patches is buggy and had
>>> insufficient review prior to merging.
>>
>> That's not how Linux kernel development works. If a bug can get fixed
>> easily, a fix is preferred instead of reverting + reapplying a patch.
>
> Your original "fix" is not right, and it wasn't properly tested. Commit
> 54cbc058d86beca3515c994039b5c0f0a34f53dd needs to be reverted.
As I explained before, the above reply is not sufficiently detailed to
motivate a revert.
Bart.
On Mon, Mar 04, 2024 at 09:58:37AM -0800, Bart Van Assche wrote:
> On 3/4/24 09:47, Benjamin LaHaise wrote:
> >On Mon, Mar 04, 2024 at 09:40:35AM -0800, Bart Van Assche wrote:
> >>On 3/4/24 09:31, Benjamin LaHaise wrote:
> >>>A revert is justified when a series of patches is buggy and had
> >>>insufficient review prior to merging.
> >>
> >>That's not how Linux kernel development works. If a bug can get fixed
> >>easily, a fix is preferred instead of reverting + reapplying a patch.
> >
> >Your original "fix" is not right, and it wasn't properly tested. Commit
> >54cbc058d86beca3515c994039b5c0f0a34f53dd needs to be reverted.
>
> As I explained before, the above reply is not sufficiently detailed to
> motivate a revert.
You have introduced a use-after-free. You have not corrected the
underlying cause of that use-after-free.
Once you call ->ki_cancel(), you can't touch the kiocb. The call into
->ki_cancel() can result in a subsequent aio_complete() happening on that
kiocb. Your change is wrong, your "fix" is wrong, and you are refusing to
understand *why* your change was wrong in the first place.
You haven't even given me a test case justifying your change. You need to
justify your change to the maintainer, not the other way around.
Revert 54cbc058d86beca3515c994039b5c0f0a34f53dd and the problem goes away.
-ben
> Bart.
>
--
"Thought is the essence of where you are now."