2020-09-11 22:02:22

by Qian Cai

[permalink] [raw]
Subject: slab-out-of-bounds in iov_iter_revert()

Super easy to reproduce on today's mainline by just fuzzing for a few minutes
on virtiofs (if it ever matters). Any thoughts?

[ 511.089112] BUG: KASAN: slab-out-of-bounds in iov_iter_revert+0xd8/0x3c0
iov_iter_revert at lib/iov_iter.c:1135
(inlined by) iov_iter_revert at lib/iov_iter.c:1080
[ 511.092650] Read of size 8 at addr ffff88869e11dff8 by task trinity-c1/11868
[ 511.096178]
[ 511.096897] CPU: 20 PID: 11868 Comm: trinity-c1 Not tainted 5.9.0-rc4+ #1
[ 511.100257] Hardware name: Red Hat KVM, BIOS 1.14.0-1.module+el8.3.0+7638+07cf13d2 04/01/2014
[ 511.103999] Call Trace:
[ 511.105002] dump_stack+0x7c/0xb0
[ 511.106329] ? iov_iter_revert+0xd8/0x3c0
[ 511.107915] print_address_description.constprop.7+0x1e/0x230
[ 511.110193] ? kmsg_dump_rewind_nolock+0x59/0x59
[ 511.112038] ? _raw_write_lock_irqsave+0xe0/0xe0
[ 511.113890] ? iov_iter_revert+0xd8/0x3c0
[ 511.115469] ? iov_iter_revert+0xd8/0x3c0
[ 511.117082] kasan_report.cold.9+0x37/0x86
[ 511.118711] ? do_readv+0x20/0x1b0
[ 511.120078] ? iov_iter_revert+0xd8/0x3c0
[ 511.122614] iov_iter_revert+0xd8/0x3c0
[ 511.124673] generic_file_read_iter+0x139/0x220
[ 511.127386] fuse_file_read_iter+0x239/0x270 [fuse]
[ 511.130229] ? fuse_direct_IO+0x600/0x600 [fuse]
[ 511.133491] ? rwsem_optimistic_spin+0x3d0/0x3d0
[ 511.137177] ? wake_up_q+0x92/0xd0
[ 511.139702] ? kasan_unpoison_shadow+0x30/0x40
[ 511.142518] do_iter_readv_writev+0x307/0x350
[ 511.144850] ? no_seek_end_llseek_size+0x20/0x20
[ 511.147155] do_iter_read+0x13f/0x2e0
[ 511.148696] vfs_readv+0xcc/0x130
[ 511.150118] ? compat_rw_copy_check_uvector+0x1e0/0x1e0
[ 511.152300] ? enqueue_hrtimer+0x60/0x100
[ 511.154043] ? hrtimer_start_range_ns+0x32f/0x4c0
[ 511.157561] ? hrtimer_run_softirq+0x100/0x100
[ 511.161514] ? _raw_spin_lock_irq+0x7b/0xd0
[ 511.164570] ? _raw_write_unlock_irqrestore+0x20/0x20
[ 511.167568] ? hrtimer_active+0x71/0xa0
[ 511.169331] ? mutex_lock+0x8e/0xe0
[ 511.171694] ? __mutex_lock_slowpath+0x10/0x10
[ 511.174580] ? perf_call_bpf_enter.isra.21+0x110/0x110
[ 511.177926] ? __fget_light+0xa3/0x100
[ 511.179916] do_readv+0xc1/0x1b0
[ 511.181331] ? vfs_readv+0x130/0x130
[ 511.182867] ? ktime_get_coarse_real_ts64+0x4a/0x70
[ 511.185455] do_syscall_64+0x33/0x40
[ 511.188008] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 511.191314] RIP: 0033:0x7f11e9b4578d
[ 511.193639] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d cb 56 2c 00 f7 d8 64 89 08
[ 511.202148] RSP: 002b:00007fff9b5eec58 EFLAGS: 00000246 ORIG_RAX: 0000000000000013
[ 511.205620] RAX: ffffffffffffffda RBX: 0000000000000013 RCX: 00007f11e9b4578d
[ 511.210533] RDX: 0000000000000091 RSI: 0000000002c49450 RDI: 00000000000000e1
[ 511.214992] RBP: 0000000000000013 R08: 000000008d8d8d8d R09: 00000000000002d2
[ 511.218631] R10: 00000020845754a0 R11: 0000000000000246 R12: 0000000000000002
[ 511.221595] R13: 00007f11ea227058 R14: 00007f11ea2356c0 R15: 00007f11ea227000
[ 511.225949]
[ 511.227008] Allocated by task 11748:
[ 511.229204] kasan_save_stack+0x19/0x40
[ 511.231404] __kasan_kmalloc.constprop.8+0xc1/0xd0
[ 511.234647] perf_event_mmap+0x28f/0x5f0
[ 511.237170] mmap_region+0x1cc/0xa50
[ 511.239192] do_mmap+0x3e5/0x6a0
[ 511.241337] vm_mmap_pgoff+0x15f/0x1b0
[ 511.243586] ksys_mmap_pgoff+0x2d3/0x320
[ 511.245903] do_syscall_64+0x33/0x40
[ 511.247914] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 511.250139]
[ 511.250797] Freed by task 11748:
[ 511.252160] kasan_save_stack+0x19/0x40
[ 511.253775] kasan_set_track+0x1c/0x30
[ 511.255348] kasan_set_free_info+0x1b/0x30
[ 511.257072] __kasan_slab_free+0x108/0x150
[ 511.258785] kfree+0x95/0x380
[ 511.260050] perf_event_mmap+0x4aa/0x5f0
[ 511.261694] mmap_region+0x1cc/0xa50
[ 511.263198] do_mmap+0x3e5/0x6a0
[ 511.264564] vm_mmap_pgoff+0x15f/0x1b0
[ 511.266133] ksys_mmap_pgoff+0x2d3/0x320
[ 511.267773] do_syscall_64+0x33/0x40
[ 511.269276] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 511.272756]
[ 511.273583] The buggy address belongs to the object at ffff88869e11c000
[ 511.273583] which belongs to the cache kmalloc-4k of size 4096
[ 511.281456] The buggy address is located 4088 bytes to the right of
[ 511.281456] 4096-byte region [ffff88869e11c000, ffff88869e11d000)
[ 511.288473] The buggy address belongs to the page:
[ 511.291093] page:0000000073d20fbc refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x69e118
[ 511.296681] head:0000000073d20fbc order:3 compound_mapcount:0 compound_pincount:0
[ 511.301118] flags: 0x17ffffc0010200(slab|head)
[ 511.303426] raw: 0017ffffc0010200 0000000000000000 0000000300000001 ffff888107c4ef80
[ 511.307482] raw: 0000000000000000 0000000000040004 00000001ffffffff 0000000000000000
[ 511.310957] page dumped because: kasan: bad access detected
[ 511.313233]
[ 511.313867] Memory state around the buggy address:
[ 511.315849] ffff88869e11de80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 511.318933] ffff88869e11df00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 511.322715] >ffff88869e11df80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 511.325993] ^
[ 511.330020] ffff88869e11e000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 511.334333] ffff88869e11e080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00


2020-09-11 23:56:41

by Al Viro

[permalink] [raw]
Subject: Re: slab-out-of-bounds in iov_iter_revert()

On Fri, Sep 11, 2020 at 05:59:04PM -0400, Qian Cai wrote:
> Super easy to reproduce on today's mainline by just fuzzing for a few minutes
> on virtiofs (if it ever matters). Any thoughts?

Usually happens when ->direct_IO() fucks up and reports the wrong amount
of data written/read. We had several bugs like that in the past - see
e.g. 85128b2be673 (fix nfs O_DIRECT advancing iov_iter too much).

Had there been any recent O_DIRECT-related patches on the filesystems
involved?

2020-09-16 22:06:23

by Qian Cai

[permalink] [raw]
Subject: Re: slab-out-of-bounds in iov_iter_revert()

On Sat, 2020-09-12 at 00:55 +0100, Al Viro wrote:
> On Fri, Sep 11, 2020 at 05:59:04PM -0400, Qian Cai wrote:
> > Super easy to reproduce on today's mainline by just fuzzing for a few
> > minutes
> > on virtiofs (if it ever matters). Any thoughts?
>
> Usually happens when ->direct_IO() fucks up and reports the wrong amount
> of data written/read. We had several bugs like that in the past - see
> e.g. 85128b2be673 (fix nfs O_DIRECT advancing iov_iter too much).
>
> Had there been any recent O_DIRECT-related patches on the filesystems
> involved?

This is only reproducible using FUSE/virtiofs so far, so I will stare at
fuse_direct_IO() until someone can beat me to it.

2020-09-17 02:08:11

by Al Viro

[permalink] [raw]
Subject: Re: slab-out-of-bounds in iov_iter_revert()

On Wed, Sep 16, 2020 at 05:09:49PM -0400, Qian Cai wrote:
> On Sat, 2020-09-12 at 00:55 +0100, Al Viro wrote:
> > On Fri, Sep 11, 2020 at 05:59:04PM -0400, Qian Cai wrote:
> > > Super easy to reproduce on today's mainline by just fuzzing for a few
> > > minutes
> > > on virtiofs (if it ever matters). Any thoughts?
> >
> > Usually happens when ->direct_IO() fucks up and reports the wrong amount
> > of data written/read. We had several bugs like that in the past - see
> > e.g. 85128b2be673 (fix nfs O_DIRECT advancing iov_iter too much).
> >
> > Had there been any recent O_DIRECT-related patches on the filesystems
> > involved?
>
> This is only reproducible using FUSE/virtiofs so far, so I will stare at
> fuse_direct_IO() until someone can beat me to it.

What happens there is that it tries to play with iov_iter_truncate() in
->direct_IO() without a corresponding iov_iter_reexpand(). Could you
check if the following helps?

Signed-off-by: Al Viro <[email protected]>
---
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 6611ef3269a8..d3eec2e11975 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -3095,7 +3095,7 @@ fuse_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
loff_t pos = 0;
struct inode *inode;
loff_t i_size;
- size_t count = iov_iter_count(iter);
+ size_t count = iov_iter_count(iter), shortened;
loff_t offset = iocb->ki_pos;
struct fuse_io_priv *io;

@@ -3111,7 +3111,8 @@ fuse_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
if (offset >= i_size)
return 0;
iov_iter_truncate(iter, fuse_round_up(ff->fc, i_size - offset));
- count = iov_iter_count(iter);
+ shortened = count - iov_iter_count(iter);
+ count -= shortened;
}

io = kmalloc(sizeof(struct fuse_io_priv), GFP_KERNEL);
@@ -3177,6 +3178,8 @@ fuse_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
else if (ret < 0 && offset + count > i_size)
fuse_do_truncate(file);
}
+ if (shortened)
+ iov_iter_reexpand(iter, shortened);

return ret;
}

2020-09-17 02:16:29

by Al Viro

[permalink] [raw]
Subject: Re: slab-out-of-bounds in iov_iter_revert()

On Thu, Sep 17, 2020 at 03:04:40AM +0100, Al Viro wrote:
> On Wed, Sep 16, 2020 at 05:09:49PM -0400, Qian Cai wrote:
> > On Sat, 2020-09-12 at 00:55 +0100, Al Viro wrote:
> > > On Fri, Sep 11, 2020 at 05:59:04PM -0400, Qian Cai wrote:
> > > > Super easy to reproduce on today's mainline by just fuzzing for a few
> > > > minutes
> > > > on virtiofs (if it ever matters). Any thoughts?
> > >
> > > Usually happens when ->direct_IO() fucks up and reports the wrong amount
> > > of data written/read. We had several bugs like that in the past - see
> > > e.g. 85128b2be673 (fix nfs O_DIRECT advancing iov_iter too much).
> > >
> > > Had there been any recent O_DIRECT-related patches on the filesystems
> > > involved?
> >
> > This is only reproducible using FUSE/virtiofs so far, so I will stare at
> > fuse_direct_IO() until someone can beat me to it.
>
> What happens there is that it tries to play with iov_iter_truncate() in
> ->direct_IO() without a corresponding iov_iter_reexpand(). Could you
> check if the following helps?

Gyah... Sorry, that should be

Signed-off-by: Al Viro <[email protected]>
---
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 6611ef3269a8..92de6b9b06b0 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -3095,7 +3095,7 @@ fuse_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
loff_t pos = 0;
struct inode *inode;
loff_t i_size;
- size_t count = iov_iter_count(iter);
+ size_t count = iov_iter_count(iter), shortened;
loff_t offset = iocb->ki_pos;
struct fuse_io_priv *io;

@@ -3111,7 +3111,8 @@ fuse_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
if (offset >= i_size)
return 0;
iov_iter_truncate(iter, fuse_round_up(ff->fc, i_size - offset));
- count = iov_iter_count(iter);
+ shortened = count - iov_iter_count(iter);
+ count -= shortened;
}

io = kmalloc(sizeof(struct fuse_io_priv), GFP_KERNEL);
@@ -3177,6 +3178,7 @@ fuse_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
else if (ret < 0 && offset + count > i_size)
fuse_do_truncate(file);
}
+ iov_iter_reexpand(iter, iov_iter_count(iter) + shortened);

return ret;
}

2020-09-17 14:17:12

by Qian Cai

[permalink] [raw]
Subject: Re: slab-out-of-bounds in iov_iter_revert()

On Thu, 2020-09-17 at 03:14 +0100, Al Viro wrote:
> On Thu, Sep 17, 2020 at 03:04:40AM +0100, Al Viro wrote:
> > On Wed, Sep 16, 2020 at 05:09:49PM -0400, Qian Cai wrote:
> > > On Sat, 2020-09-12 at 00:55 +0100, Al Viro wrote:
> > > > On Fri, Sep 11, 2020 at 05:59:04PM -0400, Qian Cai wrote:
> > > > > Super easy to reproduce on today's mainline by just fuzzing for a few
> > > > > minutes
> > > > > on virtiofs (if it ever matters). Any thoughts?
> > > >
> > > > Usually happens when ->direct_IO() fucks up and reports the wrong amount
> > > > of data written/read. We had several bugs like that in the past - see
> > > > e.g. 85128b2be673 (fix nfs O_DIRECT advancing iov_iter too much).
> > > >
> > > > Had there been any recent O_DIRECT-related patches on the filesystems
> > > > involved?
> > >
> > > This is only reproducible using FUSE/virtiofs so far, so I will stare at
> > > fuse_direct_IO() until someone can beat me to it.
> >
> > What happens there is that it tries to play with iov_iter_truncate() in
> > ->direct_IO() without a corresponding iov_iter_reexpand(). Could you
> > check if the following helps?
>
> Gyah... Sorry, that should be
>
> Signed-off-by: Al Viro <[email protected]>

This is now triggering:

[ 81.942804] WARNING: CPU: 24 PID: 1545 at lib/iov_iter.c:1084 iov_iter_revert+0x245/0x8c0
[ 81.942805] Modules linked in: af_key mpls_router ip_tunnel hidp cmtp kernelcapi bnep rfcomm bluetooth ecdh_generic ecc can_bcm can_raw can pptp gre l2tp_ppp l2tp_netlink l2tp_core ip6_udp_tunnel udp_tunnel pppoe pppox ppp_generic slhc crypto_user ib_core nfnetlink scsi_transport_iscsi atm sctp rfkill nls_utf8 isofs intel_rapl_msr intel_rapl_common sb_edac kvm_intel kvm bochs_drm drm_vram_helper irqbypass drm_ttm_helper ttm rapl drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops drm i2c_piix4 joydev pcspkr vfat fat ip_tables xfs libcrc32c sr_mod sd_mod cdrom t10_pi sg ata_generic crct10dif_pclmul crc32_pclmul crc32c_intel ata_piix virtiofs libata ghash_clmulni_intel fuse e1000 serio_raw sunrpc dm_mirror dm_region_hash dm_log dm_mod
[ 81.942870] CPU: 24 PID: 1545 Comm: trinity-c0 Not tainted 5.9.0-rc5-iter+ #2
[ 81.942871] Hardware name: Red Hat KVM, BIOS 1.14.0-1.module+el8.3.0+7638+07cf13d2 04/01/2014
[ 81.942879] RIP: 0010:iov_iter_revert+0x245/0x8c0
[ 81.942882] Code: 89 fa 48 c1 ea 03 80 3c 02 00 0f 85 ce 05 00 00 49 29 f5 48 89 6b 18 4c 89 6b 08 48 83 c4 30 5b 5d 41 5c 41 5d 41 5e 41 5f c3 <0f> 0b e9 3e ff ff ff 48 b8 00 00 00 00 00 fc ff df 48 8d 7b 18 48
[ 81.942884] RSP: 0018:ffff8886bf637b60 EFLAGS: 00010286
[ 81.942886] RAX: 000000000000028b RBX: ffff8886bf637dc8 RCX: 1ffff110d7ec6faf
[ 81.942888] RDX: dffffc0000000000 RSI: ffffffffbe4a754d RDI: ffff8886bf637d68
[ 81.942889] RBP: ffff8886bf637d68 R08: 0000000000000004 R09: ffffed10d7ec6ee2
[ 81.942890] R10: 0000000000000003 R11: ffffed10d7ec6ee2 R12: 0000000000000000
[ 81.942891] R13: ffff8886eaa62d80 R14: ffff8886bf637dd0 R15: ffff8886bf637d78
[ 81.942893] FS: 00007fc78f78d740(0000) GS:ffff8887d2000000(0000) knlGS:0000000000000000
[ 81.942897] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 81.942898] CR2: 0000000000000008 CR3: 00000007cadba005 CR4: 0000000000170ee0
[ 81.942899] Call Trace:
[ 81.942909] generic_file_read_iter+0x23b/0x4b0
[ 81.942918] fuse_file_read_iter+0x280/0x4e0 [fuse]
[ 81.942931] ? fuse_direct_IO+0xd30/0xd30 [fuse]
[ 81.942949] ? _raw_spin_lock_irqsave+0x80/0xe0
[ 81.942957] ? timerqueue_add+0x15e/0x280
[ 81.942960] ? _raw_spin_lock_irqsave+0x80/0xe0
[ 81.942966] new_sync_read+0x3b7/0x620
[ 81.942968] ? __ia32_sys_llseek+0x2e0/0x2e0
[ 81.942971] ==================================================================
[ 81.942984] BUG: KASAN: slab-out-of-bounds in iov_iter_revert+0x693/0x8c0
[ 81.942995] ? hrtimer_start_range_ns+0x495/0x900
[ 81.942997] ? hrtimer_try_to_cancel+0x6d/0x330
[ 81.943002] Read of size 8 at addr ffff88867e6a6ff8 by task trinity-c21/1600
[ 81.943005] ? hrtimer_forward+0x1b0/0x1b0

[ 81.943026] ? __fsnotify_update_child_dentry_flags.part.12+0x290/0x290
[ 81.943030] ? _cond_resched+0x15/0x30
[ 81.943041] ? __inode_security_revalidate+0x9d/0xc0
[ 81.943047] CPU: 12 PID: 1600 Comm: trinity-c21 Not tainted 5.9.0-rc5-iter+ #2
[ 81.943050] Hardware name: Red Hat KVM, BIOS 1.14.0-1.module+el8.3.0+7638+07cf13d2 04/01/2014
[ 81.943053] vfs_read+0x22b/0x460
[ 81.943056] ksys_read+0xe5/0x1b0
[ 81.943058] Call Trace:
[ 81.943061] ? vfs_write+0x5e0/0x5e0
[ 81.943069] dump_stack+0x7c/0xb0
[ 81.943074] do_syscall_64+0x33/0x40
[ 81.943077] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 81.943081] RIP: 0033:0x7fc78f09d78d
[ 81.943084] ? iov_iter_revert+0x693/0x8c0
[ 81.943097] print_address_description.constprop.7+0x1e/0x230
[ 81.943100] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d cb 56 2c 00 f7 d8 64 89 01 48
[ 81.943117] ? kmsg_dump_rewind_nolock+0xd9/0xd9
[ 81.943119] RSP: 002b:00007ffe01a9e168 EFLAGS: 00000246
[ 81.943128] ? _raw_write_lock_irqsave+0xe0/0xe0
[ 81.943129] ORIG_RAX: 0000000000000000
[ 81.943133] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fc78f09d78d
[ 81.943136] RDX: 000000000000028b RSI: 00007fc78d282000 RDI: 00000000000000f2
[ 81.943140] ? iov_iter_revert+0x693/0x8c0
[ 81.943142] RBP: 0000000000000000 R08: 19d504e75f9b0c9b R09: 00000000000000ff
[ 81.943146] ? iov_iter_revert+0x693/0x8c0
[ 81.943148] R10: 00000000d9d9d9d9 R11: 0000000000000246 R12: 0000000000000002
[ 81.943153] kasan_report.cold.9+0x37/0x7c
[ 81.943155] R13: 00007fc78f786058 R14: 00007fc78f78d6c0 R15: 00007fc78f786000
[ 81.943158] ? iov_iter_revert+0x693/0x8c0
[ 81.943161] iov_iter_revert+0x693/0x8c0
[ 81.943163] ---[ end trace ee32e92b96589675 ]---
[ 81.943172] ? dentry_needs_remove_privs.part.30+0x40/0x40
[ 81.943181] ? generic_write_checks+0x1d2/0x3d0
[ 81.943185] generic_file_direct_write+0x2ed/0x430
[ 81.943195] fuse_file_write_iter+0x5d8/0xb10 [fuse]
[ 81.943209] ? fuse_perform_write+0xed0/0xed0 [fuse]
[ 81.943215] ? kasan_unpoison_shadow+0x30/0x40
[ 81.943221] do_iter_readv_writev+0x3a6/0x700
[ 81.943224] ? no_seek_end_llseek_size+0x20/0x20
[ 81.943228] do_iter_write+0x12f/0x5f0
[ 81.943233] ? timerqueue_add+0x15e/0x280
[ 81.943236] vfs_writev+0x172/0x2d0
[ 81.943240] ? vfs_iter_write+0xc0/0xc0
[ 81.943245] ? hrtimer_start_range_ns+0x495/0x900
[ 81.943248] ? hrtimer_run_softirq+0x210/0x210
[ 81.943252] ? _raw_spin_lock_irq+0x7b/0xd0
[ 81.943256] ? _raw_write_unlock_irqrestore+0x50/0x50
[ 81.943269] ? perf_syscall_enter+0x136/0x8a0
[ 81.943276] ? perf_call_bpf_enter.isra.21+0x1a0/0x1a0
[ 81.943283] ? alarm_setitimer+0xa0/0x110
[ 81.943287] do_pwritev+0x151/0x200
[ 81.943291] ? __ia32_sys_writev+0xb0/0xb0
[ 81.943296] do_syscall_64+0x33/0x40
[ 81.943301] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 81.943306] RIP: 0033:0x7fc78f09d78d
[ 81.943312] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d cb 56 2c 00 f7 d8 64 89 01 48
[ 81.943314] RSP: 002b:00007ffe01a9e168 EFLAGS: 00000246 ORIG_RAX: 0000000000000148
[ 81.943318] RAX: ffffffffffffffda RBX: 0000000000000148 RCX: 00007fc78f09d78d
[ 81.943320] RDX: 000000000000004b RSI: 0000000001ae7790 RDI: 00000000000000f2
[ 81.943323] RBP: 0000000000000148 R08: 0000000075f74000 R09: 0000000000000001
[ 81.943325] R10: 00830624a1148006 R11: 0000000000000246 R12: 0000000000000002
[ 81.943328] R13: 00007fc78f6f3058 R14: 00007fc78f78d6c0 R15: 00007fc78f6f3000

[ 81.943333] Allocated by task 0:
[ 81.943334] (stack is not available)

[ 81.943339] The buggy address belongs to the object at ffff88867e6a6000
which belongs to the cache kmalloc-2k of size 2048
[ 81.943342] The buggy address is located 2040 bytes to the right of
2048-byte region [ffff88867e6a6000, ffff88867e6a6800)
[ 81.943344] The buggy address belongs to the page:
[ 81.943350] page:000000007cdca305 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x67e6a0
[ 81.943354] head:000000007cdca305 order:3 compound_mapcount:0 compound_pincount:0
[ 81.943358] flags: 0x17ffffc0010200(slab|head)
[ 81.943365] raw: 0017ffffc0010200 dead000000000100 dead000000000122 ffff888107c4f0c0
[ 81.943370] raw: 0000000000000000 0000000080080008 00000001ffffffff 0000000000000000
[ 81.943372] page dumped because: kasan: bad access detected

[ 81.943374] Memory state around the buggy address:
[ 81.943379] ffff88867e6a6e80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 81.943382] ffff88867e6a6f00: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 81.943384] >ffff88867e6a6f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
[ 81.943386] ^
[ 81.943388] ffff88867e6a7000: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 81.943400] ffff88867e6a7080: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
[ 81.943402] ==================================================================
[ 81.943403] Disabling lock debugging due to kernel taint

> ---
> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> index 6611ef3269a8..92de6b9b06b0 100644
> --- a/fs/fuse/file.c
> +++ b/fs/fuse/file.c
> @@ -3095,7 +3095,7 @@ fuse_direct_IO(struct kiocb *iocb, struct iov_iter
> *iter)
> loff_t pos = 0;
> struct inode *inode;
> loff_t i_size;
> - size_t count = iov_iter_count(iter);
> + size_t count = iov_iter_count(iter), shortened;
> loff_t offset = iocb->ki_pos;
> struct fuse_io_priv *io;
>
> @@ -3111,7 +3111,8 @@ fuse_direct_IO(struct kiocb *iocb, struct iov_iter
> *iter)
> if (offset >= i_size)
> return 0;
> iov_iter_truncate(iter, fuse_round_up(ff->fc, i_size - offset));
> - count = iov_iter_count(iter);
> + shortened = count - iov_iter_count(iter);
> + count -= shortened;
> }
>
> io = kmalloc(sizeof(struct fuse_io_priv), GFP_KERNEL);
> @@ -3177,6 +3178,7 @@ fuse_direct_IO(struct kiocb *iocb, struct iov_iter
> *iter)
> else if (ret < 0 && offset + count > i_size)
> fuse_do_truncate(file);
> }
> + iov_iter_reexpand(iter, iov_iter_count(iter) + shortened);
>
> return ret;
> }
>

2020-09-17 16:49:35

by Al Viro

[permalink] [raw]
Subject: Re: slab-out-of-bounds in iov_iter_revert()

On Thu, Sep 17, 2020 at 10:10:27AM -0400, Qian Cai wrote:

> [ 81.942909] generic_file_read_iter+0x23b/0x4b0
> [ 81.942918] fuse_file_read_iter+0x280/0x4e0 [fuse]
> [ 81.942931] ? fuse_direct_IO+0xd30/0xd30 [fuse]
> [ 81.942949] ? _raw_spin_lock_irqsave+0x80/0xe0
> [ 81.942957] ? timerqueue_add+0x15e/0x280
> [ 81.942960] ? _raw_spin_lock_irqsave+0x80/0xe0
> [ 81.942966] new_sync_read+0x3b7/0x620
> [ 81.942968] ? __ia32_sys_llseek+0x2e0/0x2e0

Interesting... Basic logics in there:
->direct_IO() might consume more (on iov_iter_get_pages()
and friends) than it actually reads. We want to revert the
excess. Suppose by the time we call ->direct_IO() we had
N bytes already consumed and C bytes left. We expect that
after ->direct_IO() returns K, we have C' bytes left, N + (C - C')
consumed and N + K out of those actually read. So we revert by
C - K - C'. You end up trying to revert beyond the beginning.

Use of iov_iter_truncate() is problematic here, since it
changes the amount of data left without having consumed anything.
Basically, it changes the position of end, and the logics in the
caller expects that to remain unchanged. iov_iter_reexpand() use
should restore the position of end.

How much IO does it take to trigger that on your reproducer?

2020-09-17 17:50:19

by Qian Cai

[permalink] [raw]
Subject: Re: slab-out-of-bounds in iov_iter_revert()

On Thu, 2020-09-17 at 17:44 +0100, Al Viro wrote:
> On Thu, Sep 17, 2020 at 10:10:27AM -0400, Qian Cai wrote:
>
> > [ 81.942909] generic_file_read_iter+0x23b/0x4b0
> > [ 81.942918] fuse_file_read_iter+0x280/0x4e0 [fuse]
> > [ 81.942931] ? fuse_direct_IO+0xd30/0xd30 [fuse]
> > [ 81.942949] ? _raw_spin_lock_irqsave+0x80/0xe0
> > [ 81.942957] ? timerqueue_add+0x15e/0x280
> > [ 81.942960] ? _raw_spin_lock_irqsave+0x80/0xe0
> > [ 81.942966] new_sync_read+0x3b7/0x620
> > [ 81.942968] ? __ia32_sys_llseek+0x2e0/0x2e0
>
> Interesting... Basic logics in there:
> ->direct_IO() might consume more (on iov_iter_get_pages()
> and friends) than it actually reads. We want to revert the
> excess. Suppose by the time we call ->direct_IO() we had
> N bytes already consumed and C bytes left. We expect that
> after ->direct_IO() returns K, we have C' bytes left, N + (C - C')
> consumed and N + K out of those actually read. So we revert by
> C - K - C'. You end up trying to revert beyond the beginning.
>
> Use of iov_iter_truncate() is problematic here, since it
> changes the amount of data left without having consumed anything.
> Basically, it changes the position of end, and the logics in the
> caller expects that to remain unchanged. iov_iter_reexpand() use
> should restore the position of end.
>
> How much IO does it take to trigger that on your reproducer?

That is something I don't know for sure because it is always reproducible by
running the trinity fuzzer for a few seconds (32 threads). I did another run
below (still with your patch applied) and then tried to capture some logs here:

http://people.redhat.com/qcai/iov_iter_revert/

- virtiofsd.txt: fuse server side log until it triggered.
- trinity-post-mortem.log: AFAICT, it was the final syscall of each child.
- trinity-child9.log: the child actually triggered it.

The last syscall of child9 is:
pwritev2(fd=230, vec=0x293d3f0, vlen=1, pos_l=120, pos_h=0x200000, flags=0x3)

[ 77.125816] BUG: KASAN: stack-out-of-bounds in iov_iter_revert+0x693/0x8c0
[ 77.127079] Read of size 8 at addr ffff8886efd5fda8 by task trinity-c9/1593
[ 77.128292]
[ 77.128581] CPU: 10 PID: 1593 Comm: trinity-c9 Not tainted 5.9.0-rc5-iter+ #2
[ 77.129798] Hardware name: Red Hat KVM, BIOS 1.14.0-1.module+el8.3.0+7638+07cf13d2 04/01/2014
[ 77.131175] Call Trace:
[ 77.131934] dump_stack+0x7c/0xb0
[ 77.132500] ? iov_iter_revert+0x693/0x8c0
[ 77.133254] print_address_description.constprop.7+0x1e/0x230
[ 77.134287] ? kmsg_dump_rewind_nolock+0xd9/0xd9
[ 77.135103] ? _raw_write_lock_irqsave+0xe0/0xe0
[ 77.135933] ? iov_iter_revert+0x693/0x8c0
[ 77.136651] ? iov_iter_revert+0x693/0x8c0
[ 77.137345] kasan_report.cold.9+0x37/0x7c
[ 77.138033] ? iov_iter_revert+0x693/0x8c0
[ 77.138768] iov_iter_revert+0x693/0x8c0
[ 77.139433] ? dentry_needs_remove_privs.part.30+0x40/0x40
[ 77.140684] ? generic_write_checks+0x1d2/0x3d0
[ 77.141477] generic_file_direct_write+0x2ed/0x430
[ 77.142601] fuse_file_write_iter+0x5d8/0xb10 [fuse]
[ 77.143857] ? fuse_perform_write+0xed0/0xed0 [fuse]
[ 77.145062] do_iter_readv_writev+0x3a6/0x700
[ 77.146071] ? no_seek_end_llseek_size+0x20/0x20
[ 77.146568] kexec-bzImage64: File is too short to be a bzImage
[ 77.147277] do_iter_write+0x12f/0x5f0
[ 77.149501] ? timerqueue_add+0x15e/0x280
[ 77.150187] vfs_writev+0x172/0x2d0
[ 77.150837] ? vfs_iter_write+0xc0/0xc0
[ 77.151493] ? hrtimer_start_range_ns+0x495/0x900
[ 77.152316] ? hrtimer_run_softirq+0x210/0x210
[ 77.153087] ? _raw_spin_lock_irq+0x7b/0xd0
[ 77.153842] ? _raw_write_unlock_irqrestore+0x50/0x50
[ 77.154733] ? do_setitimer+0x2e5/0x590
[ 77.155393] ? alarm_setitimer+0xa0/0x110
[ 77.156059] do_pwritev+0x151/0x200
[ 77.156657] ? __ia32_sys_writev+0xb0/0xb0
[ 77.157336] do_syscall_64+0x33/0x40
[ 77.157928] entry_SYSCALL_64_after_hwframe+0x44/0xa9
[ 77.158762] RIP: 0033:0x7fb8beed278d
[ 77.159316] Code: 00 c3 66 2e 0f 1f 84 00 00 00 00 00 90 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 8b 0d cb 56 2c 00 f7 d8 64 89 08
[ 77.162391] RSP: 002b:00007ffcc8dcabc8 EFLAGS: 00000246 ORIG_RAX: 0000000000000148
[ 77.163672] RAX: ffffffffffffffda RBX: 0000000000000148 RCX: 00007fb8beed278d
[ 77.164921] RDX: 0000000000000001 RSI: 000000000293d3f0 RDI: 00000000000000e6
[ 77.166115] RBP: 0000000000000148 R08: 0000000000200000 R09: 0000000000000003
[ 77.167315] R10: 0000000000000078 R11: 0000000000000246 R12: 0000000000000002
[ 77.168494] R13: 00007fb8bf57c058 R14: 00007fb8bf5c26c0 R15: 00007fb8bf57c000
[ 77.169656]
[ 77.169914] The buggy address belongs to the page:
[ 77.170723] page:000000006280f5ba refcount:0 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x6efd5f
[ 77.172271] flags: 0x17ffffc0000000()
[ 77.172906] raw: 0017ffffc0000000 0000000000000000 ffffea001bb87308 0000000000000000
[ 77.174299] raw: 0000000000000000 0000000000000000 00000000ffffffff 0000000000000000
[ 77.175651] page dumped because: kasan: bad access detected
[ 77.176583]
[ 77.176848] addr ffff8886efd5fda8 is located in stack of task trinity-c9/1593 at offset 184 in frame:
[ 77.178352] vfs_writev+0x0/0x2d0
[ 77.178924]
[ 77.179158] this frame has 3 objects:
[ 77.179823] [32, 40) 'iov'
[ 77.179824] [96, 136) 'iter'
[ 77.180269] [192, 320) 'iovstack'
[ 77.180788]
[ 77.181768] Memory state around the buggy address:
[ 77.182927] ffff8886efd5fc80: 00 00 00 00 00 00 00 00 00 00 00 00 00 00 f1 f1
[ 77.184704] ffff8886efd5fd00: f1 f1 00 f2 f2 f2 f2 f2 f2 f2 00 00 00 00 00 f2
[ 77.186457] >ffff8886efd5fd80: f2 f2 f2 f2 f2 f2 00 00 00 00 00 00 00 00 00 00
[ 77.188214] ^
[ 77.189388] ffff8886efd5fe00: 00 00 00 00 00 00 f3 f3 f3 f3 00 00 00 00 00 00
[ 77.191019] ffff8886efd5fe80: 00 00 00 00 00 00 f1 f1 f1 f1 00 f2 f2 f2 00 00
[ 77.192846] ==================================================================
[ 77.194428] Disabling lock debugging due to kernel taint

2020-09-17 18:48:05

by Qian Cai

[permalink] [raw]
Subject: Re: slab-out-of-bounds in iov_iter_revert()

On Thu, 2020-09-17 at 17:44 +0100, Al Viro wrote:
> On Thu, Sep 17, 2020 at 10:10:27AM -0400, Qian Cai wrote:
>
> > [ 81.942909] generic_file_read_iter+0x23b/0x4b0
> > [ 81.942918] fuse_file_read_iter+0x280/0x4e0 [fuse]
> > [ 81.942931] ? fuse_direct_IO+0xd30/0xd30 [fuse]
> > [ 81.942949] ? _raw_spin_lock_irqsave+0x80/0xe0
> > [ 81.942957] ? timerqueue_add+0x15e/0x280
> > [ 81.942960] ? _raw_spin_lock_irqsave+0x80/0xe0
> > [ 81.942966] new_sync_read+0x3b7/0x620
> > [ 81.942968] ? __ia32_sys_llseek+0x2e0/0x2e0
>
> Interesting... Basic logics in there:
> ->direct_IO() might consume more (on iov_iter_get_pages()
> and friends) than it actually reads. We want to revert the
> excess. Suppose by the time we call ->direct_IO() we had
> N bytes already consumed and C bytes left. We expect that
> after ->direct_IO() returns K, we have C' bytes left, N + (C - C')
> consumed and N + K out of those actually read. So we revert by
> C - K - C'. You end up trying to revert beyond the beginning.
>
> Use of iov_iter_truncate() is problematic here, since it
> changes the amount of data left without having consumed anything.
> Basically, it changes the position of end, and the logics in the
> caller expects that to remain unchanged. iov_iter_reexpand() use
> should restore the position of end.
>
> How much IO does it take to trigger that on your reproducer?

I can even reproduce this with a single child of the trinity:

https://people.redhat.com/qcai/iov_iter_revert/single/

[ 77.841021] BUG: KASAN: stack-out-of-bounds in iov_iter_revert+0x693/0x8c0
[ 77.842055] Read of size 8 at addr ffff8886efe47d98 by task trinity-c0/1449

2020-09-17 18:48:10

by Al Viro

[permalink] [raw]
Subject: Re: slab-out-of-bounds in iov_iter_revert()

On Thu, Sep 17, 2020 at 01:42:44PM -0400, Qian Cai wrote:
> >
> > How much IO does it take to trigger that on your reproducer?
>
> That is something I don't know for sure because it is always reproducible by
> running the trinity fuzzer for a few seconds (32 threads). I did another run
> below (still with your patch applied) and then tried to capture some logs here:
>
> http://people.redhat.com/qcai/iov_iter_revert/

FWIW, there were several bugs in that patch:
* 'shortened' possibly left uninitialized
* possible error returns with reexpand not done

Could you try this instead?

Signed-off-by: Al Viro <[email protected]>
---
diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 6611ef3269a8..43c165e796da 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -3091,11 +3091,10 @@ fuse_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
ssize_t ret = 0;
struct file *file = iocb->ki_filp;
struct fuse_file *ff = file->private_data;
- bool async_dio = ff->fc->async_dio;
loff_t pos = 0;
struct inode *inode;
loff_t i_size;
- size_t count = iov_iter_count(iter);
+ size_t count = iov_iter_count(iter), shortened = 0;
loff_t offset = iocb->ki_pos;
struct fuse_io_priv *io;

@@ -3103,17 +3102,9 @@ fuse_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
inode = file->f_mapping->host;
i_size = i_size_read(inode);

- if ((iov_iter_rw(iter) == READ) && (offset > i_size))
+ if ((iov_iter_rw(iter) == READ) && (offset >= i_size))
return 0;

- /* optimization for short read */
- if (async_dio && iov_iter_rw(iter) != WRITE && offset + count > i_size) {
- if (offset >= i_size)
- return 0;
- iov_iter_truncate(iter, fuse_round_up(ff->fc, i_size - offset));
- count = iov_iter_count(iter);
- }
-
io = kmalloc(sizeof(struct fuse_io_priv), GFP_KERNEL);
if (!io)
return -ENOMEM;
@@ -3129,15 +3120,22 @@ fuse_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
* By default, we want to optimize all I/Os with async request
* submission to the client filesystem if supported.
*/
- io->async = async_dio;
+ io->async = ff->fc->async_dio;
io->iocb = iocb;
io->blocking = is_sync_kiocb(iocb);

+ /* optimization for short read */
+ if (io->async && !io->write && offset + count > i_size) {
+ iov_iter_truncate(iter, fuse_round_up(ff->fc, i_size - offset));
+ shortened = count - iov_iter_count(iter);
+ count -= shortened;
+ }
+
/*
* We cannot asynchronously extend the size of a file.
* In such case the aio will behave exactly like sync io.
*/
- if ((offset + count > i_size) && iov_iter_rw(iter) == WRITE)
+ if ((offset + count > i_size) && io->write)
io->blocking = true;

if (io->async && io->blocking) {
@@ -3155,6 +3153,7 @@ fuse_direct_IO(struct kiocb *iocb, struct iov_iter *iter)
} else {
ret = __fuse_direct_read(io, iter, &pos);
}
+ iov_iter_reexpand(iter, iov_iter_count(iter) + shortened);

if (io->async) {
bool blocking = io->blocking;

2020-09-17 20:18:40

by Qian Cai

[permalink] [raw]
Subject: Re: slab-out-of-bounds in iov_iter_revert()

On Thu, 2020-09-17 at 19:45 +0100, Al Viro wrote:
> On Thu, Sep 17, 2020 at 01:42:44PM -0400, Qian Cai wrote:
> > > How much IO does it take to trigger that on your reproducer?
> >
> > That is something I don't know for sure because it is always reproducible by
> > running the trinity fuzzer for a few seconds (32 threads). I did another run
> > below (still with your patch applied) and then tried to capture some logs
> > here:
> >
> > http://people.redhat.com/qcai/iov_iter_revert/
>
> FWIW, there were several bugs in that patch:
> * 'shortened' possibly left uninitialized
> * possible error returns with reexpand not done
>
> Could you try this instead?

This works fine. Thanks for taking care of this, Al.

>
> Signed-off-by: Al Viro <[email protected]>
> ---
> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> index 6611ef3269a8..43c165e796da 100644
> --- a/fs/fuse/file.c
> +++ b/fs/fuse/file.c
> @@ -3091,11 +3091,10 @@ fuse_direct_IO(struct kiocb *iocb, struct iov_iter
> *iter)
> ssize_t ret = 0;
> struct file *file = iocb->ki_filp;
> struct fuse_file *ff = file->private_data;
> - bool async_dio = ff->fc->async_dio;
> loff_t pos = 0;
> struct inode *inode;
> loff_t i_size;
> - size_t count = iov_iter_count(iter);
> + size_t count = iov_iter_count(iter), shortened = 0;
> loff_t offset = iocb->ki_pos;
> struct fuse_io_priv *io;
>
> @@ -3103,17 +3102,9 @@ fuse_direct_IO(struct kiocb *iocb, struct iov_iter
> *iter)
> inode = file->f_mapping->host;
> i_size = i_size_read(inode);
>
> - if ((iov_iter_rw(iter) == READ) && (offset > i_size))
> + if ((iov_iter_rw(iter) == READ) && (offset >= i_size))
> return 0;
>
> - /* optimization for short read */
> - if (async_dio && iov_iter_rw(iter) != WRITE && offset + count > i_size)
> {
> - if (offset >= i_size)
> - return 0;
> - iov_iter_truncate(iter, fuse_round_up(ff->fc, i_size - offset));
> - count = iov_iter_count(iter);
> - }
> -
> io = kmalloc(sizeof(struct fuse_io_priv), GFP_KERNEL);
> if (!io)
> return -ENOMEM;
> @@ -3129,15 +3120,22 @@ fuse_direct_IO(struct kiocb *iocb, struct iov_iter
> *iter)
> * By default, we want to optimize all I/Os with async request
> * submission to the client filesystem if supported.
> */
> - io->async = async_dio;
> + io->async = ff->fc->async_dio;
> io->iocb = iocb;
> io->blocking = is_sync_kiocb(iocb);
>
> + /* optimization for short read */
> + if (io->async && !io->write && offset + count > i_size) {
> + iov_iter_truncate(iter, fuse_round_up(ff->fc, i_size - offset));
> + shortened = count - iov_iter_count(iter);
> + count -= shortened;
> + }
> +
> /*
> * We cannot asynchronously extend the size of a file.
> * In such case the aio will behave exactly like sync io.
> */
> - if ((offset + count > i_size) && iov_iter_rw(iter) == WRITE)
> + if ((offset + count > i_size) && io->write)
> io->blocking = true;
>
> if (io->async && io->blocking) {
> @@ -3155,6 +3153,7 @@ fuse_direct_IO(struct kiocb *iocb, struct iov_iter
> *iter)
> } else {
> ret = __fuse_direct_read(io, iter, &pos);
> }
> + iov_iter_reexpand(iter, iov_iter_count(iter) + shortened);
>
> if (io->async) {
> bool blocking = io->blocking;
>