Hello,
syzbot found the following crash on:
HEAD commit: 2d63ba3e Merge tag 'pm-5.3-rc5' of git://git.kernel.org/pu..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=165d3302600000
kernel config: https://syzkaller.appspot.com/x/.config?x=3ff364e429585cf2
dashboard link: https://syzkaller.appspot.com/bug?extid=82defefbbd8527e1c2cb
compiler: gcc (GCC) 9.0.0 20181231 (experimental)
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16c8ab3c600000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=16be0c4c600000
Bisection is inconclusive: the bug happens on the oldest tested release.
bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=11de3622600000
console output: https://syzkaller.appspot.com/x/log.txt?x=15de3622600000
IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: [email protected]
------------[ cut here ]------------
refcount_t: increment on 0; use-after-free.
WARNING: CPU: 1 PID: 11828 at lib/refcount.c:156 refcount_inc_checked
lib/refcount.c:156 [inline]
WARNING: CPU: 1 PID: 11828 at lib/refcount.c:156
refcount_inc_checked+0x61/0x70 lib/refcount.c:154
Kernel panic - not syncing: panic_on_warn set ...
CPU: 1 PID: 11828 Comm: syz-executor746 Not tainted 5.3.0-rc4+ #112
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS
Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x172/0x1f0 lib/dump_stack.c:113
panic+0x2dc/0x755 kernel/panic.c:219
__warn.cold+0x20/0x4c kernel/panic.c:576
report_bug+0x263/0x2b0 lib/bug.c:186
fixup_bug arch/x86/kernel/traps.c:179 [inline]
fixup_bug arch/x86/kernel/traps.c:174 [inline]
do_error_trap+0x11b/0x200 arch/x86/kernel/traps.c:272
do_invalid_op+0x37/0x50 arch/x86/kernel/traps.c:291
invalid_op+0x23/0x30 arch/x86/entry/entry_64.S:1028
RIP: 0010:refcount_inc_checked lib/refcount.c:156 [inline]
RIP: 0010:refcount_inc_checked+0x61/0x70 lib/refcount.c:154
Code: 1d 8e c6 64 06 31 ff 89 de e8 ab 9c 35 fe 84 db 75 dd e8 62 9b 35 fe
48 c7 c7 00 05 c6 87 c6 05 6e c6 64 06 01 e8 67 26 07 fe <0f> 0b eb c1 90
90 90 90 90 90 90 90 90 90 90 55 48 89 e5 41 57 41
RSP: 0018:ffff8880907d78b8 EFLAGS: 00010282
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: ffffffff815c2466 RDI: ffffed10120faf09
RBP: ffff8880907d78c8 R08: ffff8880a771a200 R09: fffffbfff134ae48
R10: fffffbfff134ae47 R11: ffffffff89a5723f R12: ffff88809ea2bb80
R13: 0000000000000000 R14: ffff88809ff6cd40 R15: ffff8880a1c56480
kref_get include/linux/kref.h:45 [inline]
kobject_get+0x66/0xc0 lib/kobject.c:644
cdev_get+0x60/0xb0 fs/char_dev.c:355
chrdev_open+0xb0/0x6b0 fs/char_dev.c:400
do_dentry_open+0x4df/0x1250 fs/open.c:797
vfs_open+0xa0/0xd0 fs/open.c:906
do_last fs/namei.c:3416 [inline]
path_openat+0x10e9/0x4630 fs/namei.c:3533
do_filp_open+0x1a1/0x280 fs/namei.c:3563
do_sys_open+0x3fe/0x5d0 fs/open.c:1089
__do_sys_open fs/open.c:1107 [inline]
__se_sys_open fs/open.c:1102 [inline]
__x64_sys_open+0x7e/0xc0 fs/open.c:1102
do_syscall_64+0xfd/0x6a0 arch/x86/entry/common.c:296
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x406311
Code: 75 14 b8 02 00 00 00 0f 05 48 3d 01 f0 ff ff 0f 83 a4 18 00 00 c3 48
83 ec 08 e8 0a fc ff ff 48 89 04 24 b8 02 00 00 00 0f 05 <48> 8b 3c 24 48
89 c2 e8 53 fc ff ff 48 89 d0 48 83 c4 08 48 3d 01
RSP: 002b:00007f047e1c0960 EFLAGS: 00000293 ORIG_RAX: 0000000000000002
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 0000000000406311
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 00007f047e1c0970
RBP: 6666666666666667 R08: 000000000000000f R09: 00007f047e1c1700
R10: 00007f047e1c19d0 R11: 0000000000000293 R12: 00000000006dbc3c
R13: 0000000000000000 R14: 0000000000000000 R15: 00000000317a7973
Kernel Offset: disabled
Rebooting in 86400 seconds..
---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].
syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about bisection process see: https://goo.gl/tpsmEJ#bisection
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches
Hi all,
[+Hillf, +akpm, +Greg]
On Tue, Aug 20, 2019 at 03:58:06PM -0700, syzbot wrote:
> syzbot found the following crash on:
>
> HEAD commit: 2d63ba3e Merge tag 'pm-5.3-rc5' of git://git.kernel.org/pu..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=165d3302600000
> kernel config: https://syzkaller.appspot.com/x/.config?x=3ff364e429585cf2
> dashboard link: https://syzkaller.appspot.com/bug?extid=82defefbbd8527e1c2cb
> compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16c8ab3c600000
> C reproducer: https://syzkaller.appspot.com/x/repro.c?x=16be0c4c600000
>
> Bisection is inconclusive: the bug happens on the oldest tested release.
>
> bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=11de3622600000
> console output: https://syzkaller.appspot.com/x/log.txt?x=15de3622600000
>
> IMPORTANT: if you fix the bug, please add the following tag to the commit:
> Reported-by: [email protected]
>
> ------------[ cut here ]------------
> refcount_t: increment on 0; use-after-free.
> WARNING: CPU: 1 PID: 11828 at lib/refcount.c:156 refcount_inc_checked
> lib/refcount.c:156 [inline]
> WARNING: CPU: 1 PID: 11828 at lib/refcount.c:156
> refcount_inc_checked+0x61/0x70 lib/refcount.c:154
> Kernel panic - not syncing: panic_on_warn set ...
[...]
> RIP: 0010:refcount_inc_checked lib/refcount.c:156 [inline]
> RIP: 0010:refcount_inc_checked+0x61/0x70 lib/refcount.c:154
> Code: 1d 8e c6 64 06 31 ff 89 de e8 ab 9c 35 fe 84 db 75 dd e8 62 9b 35 fe
> 48 c7 c7 00 05 c6 87 c6 05 6e c6 64 06 01 e8 67 26 07 fe <0f> 0b eb c1 90 90
> 90 90 90 90 90 90 90 90 90 55 48 89 e5 41 57 41
> RSP: 0018:ffff8880907d78b8 EFLAGS: 00010282
> RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> RDX: 0000000000000000 RSI: ffffffff815c2466 RDI: ffffed10120faf09
> RBP: ffff8880907d78c8 R08: ffff8880a771a200 R09: fffffbfff134ae48
> R10: fffffbfff134ae47 R11: ffffffff89a5723f R12: ffff88809ea2bb80
> R13: 0000000000000000 R14: ffff88809ff6cd40 R15: ffff8880a1c56480
> kref_get include/linux/kref.h:45 [inline]
> kobject_get+0x66/0xc0 lib/kobject.c:644
> cdev_get+0x60/0xb0 fs/char_dev.c:355
> chrdev_open+0xb0/0x6b0 fs/char_dev.c:400
> do_dentry_open+0x4df/0x1250 fs/open.c:797
> vfs_open+0xa0/0xd0 fs/open.c:906
> do_last fs/namei.c:3416 [inline]
> path_openat+0x10e9/0x4630 fs/namei.c:3533
> do_filp_open+0x1a1/0x280 fs/namei.c:3563
> do_sys_open+0x3fe/0x5d0 fs/open.c:1089
FWIW, we've run into this same crash on arm64 so it would be nice to see it
fixed upstream. It looks like Hillf's reply (which included a patch) didn't
make it to the kernel mailing lists for some reason, but it is available
here:
https://groups.google.com/forum/#!original/syzkaller-bugs/PnQNxBrWv_8/X1ygj8d8DgAJ
A simpler fix would just be to use kobject_get_unless_zero() directly in
cdev_get(), but that looks odd in this specific case because chrdev_open()
holds the 'cdev_lock' and you'd hope that finding the kobject in the inode
with that held would mean that it's not being freed at the same time.
Cheers,
Will
On Wed, Dec 04, 2019 at 11:50:56AM +0000, Will Deacon wrote:
> Hi all,
>
> [+Hillf, +akpm, +Greg]
>
> On Tue, Aug 20, 2019 at 03:58:06PM -0700, syzbot wrote:
> > syzbot found the following crash on:
> >
> > HEAD commit: 2d63ba3e Merge tag 'pm-5.3-rc5' of git://git.kernel.org/pu..
> > git tree: upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=165d3302600000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=3ff364e429585cf2
> > dashboard link: https://syzkaller.appspot.com/bug?extid=82defefbbd8527e1c2cb
> > compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16c8ab3c600000
> > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=16be0c4c600000
> >
> > Bisection is inconclusive: the bug happens on the oldest tested release.
> >
> > bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=11de3622600000
> > console output: https://syzkaller.appspot.com/x/log.txt?x=15de3622600000
> >
> > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > Reported-by: [email protected]
> >
> > ------------[ cut here ]------------
> > refcount_t: increment on 0; use-after-free.
> > WARNING: CPU: 1 PID: 11828 at lib/refcount.c:156 refcount_inc_checked
> > lib/refcount.c:156 [inline]
> > WARNING: CPU: 1 PID: 11828 at lib/refcount.c:156
> > refcount_inc_checked+0x61/0x70 lib/refcount.c:154
> > Kernel panic - not syncing: panic_on_warn set ...
>
> [...]
>
> > RIP: 0010:refcount_inc_checked lib/refcount.c:156 [inline]
> > RIP: 0010:refcount_inc_checked+0x61/0x70 lib/refcount.c:154
> > Code: 1d 8e c6 64 06 31 ff 89 de e8 ab 9c 35 fe 84 db 75 dd e8 62 9b 35 fe
> > 48 c7 c7 00 05 c6 87 c6 05 6e c6 64 06 01 e8 67 26 07 fe <0f> 0b eb c1 90 90
> > 90 90 90 90 90 90 90 90 90 55 48 89 e5 41 57 41
> > RSP: 0018:ffff8880907d78b8 EFLAGS: 00010282
> > RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> > RDX: 0000000000000000 RSI: ffffffff815c2466 RDI: ffffed10120faf09
> > RBP: ffff8880907d78c8 R08: ffff8880a771a200 R09: fffffbfff134ae48
> > R10: fffffbfff134ae47 R11: ffffffff89a5723f R12: ffff88809ea2bb80
> > R13: 0000000000000000 R14: ffff88809ff6cd40 R15: ffff8880a1c56480
> > kref_get include/linux/kref.h:45 [inline]
> > kobject_get+0x66/0xc0 lib/kobject.c:644
> > cdev_get+0x60/0xb0 fs/char_dev.c:355
> > chrdev_open+0xb0/0x6b0 fs/char_dev.c:400
> > do_dentry_open+0x4df/0x1250 fs/open.c:797
> > vfs_open+0xa0/0xd0 fs/open.c:906
> > do_last fs/namei.c:3416 [inline]
> > path_openat+0x10e9/0x4630 fs/namei.c:3533
> > do_filp_open+0x1a1/0x280 fs/namei.c:3563
> > do_sys_open+0x3fe/0x5d0 fs/open.c:1089
>
> FWIW, we've run into this same crash on arm64 so it would be nice to see it
> fixed upstream. It looks like Hillf's reply (which included a patch) didn't
> make it to the kernel mailing lists for some reason, but it is available
> here:
>
> https://groups.google.com/forum/#!original/syzkaller-bugs/PnQNxBrWv_8/X1ygj8d8DgAJ
No one is going to go and dig a patch out of google groups :(
> A simpler fix would just be to use kobject_get_unless_zero() directly in
> cdev_get(), but that looks odd in this specific case because chrdev_open()
> holds the 'cdev_lock' and you'd hope that finding the kobject in the inode
> with that held would mean that it's not being freed at the same time.
When using kref_get_unless_zero() that implies that a lock is not being
used and you are relying on the kobject only instead.
But I thought we had a lock in play here, so why would changing this
actually fix anything?
This code hasn't changed in 15+ years, what suddenly changed that causes
problems here?
thanks,
greg k-h
Hi Greg,
On Wed, Dec 04, 2019 at 01:31:48PM +0100, Greg KH wrote:
> On Wed, Dec 04, 2019 at 11:50:56AM +0000, Will Deacon wrote:
> > On Tue, Aug 20, 2019 at 03:58:06PM -0700, syzbot wrote:
> > > syzbot found the following crash on:
> > >
> > > HEAD commit: 2d63ba3e Merge tag 'pm-5.3-rc5' of git://git.kernel.org/pu..
> > > git tree: upstream
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=165d3302600000
> > > kernel config: https://syzkaller.appspot.com/x/.config?x=3ff364e429585cf2
> > > dashboard link: https://syzkaller.appspot.com/bug?extid=82defefbbd8527e1c2cb
> > > compiler: gcc (GCC) 9.0.0 20181231 (experimental)
> > > syz repro: https://syzkaller.appspot.com/x/repro.syz?x=16c8ab3c600000
> > > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=16be0c4c600000
> > >
> > > Bisection is inconclusive: the bug happens on the oldest tested release.
> > >
> > > bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=11de3622600000
> > > console output: https://syzkaller.appspot.com/x/log.txt?x=15de3622600000
> > >
> > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > Reported-by: [email protected]
> > >
> > > ------------[ cut here ]------------
> > > refcount_t: increment on 0; use-after-free.
> > > WARNING: CPU: 1 PID: 11828 at lib/refcount.c:156 refcount_inc_checked
> > > lib/refcount.c:156 [inline]
> > > WARNING: CPU: 1 PID: 11828 at lib/refcount.c:156
> > > refcount_inc_checked+0x61/0x70 lib/refcount.c:154
> > > Kernel panic - not syncing: panic_on_warn set ...
> >
> > [...]
> >
> > > RIP: 0010:refcount_inc_checked lib/refcount.c:156 [inline]
> > > RIP: 0010:refcount_inc_checked+0x61/0x70 lib/refcount.c:154
> > > Code: 1d 8e c6 64 06 31 ff 89 de e8 ab 9c 35 fe 84 db 75 dd e8 62 9b 35 fe
> > > 48 c7 c7 00 05 c6 87 c6 05 6e c6 64 06 01 e8 67 26 07 fe <0f> 0b eb c1 90 90
> > > 90 90 90 90 90 90 90 90 90 55 48 89 e5 41 57 41
> > > RSP: 0018:ffff8880907d78b8 EFLAGS: 00010282
> > > RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
> > > RDX: 0000000000000000 RSI: ffffffff815c2466 RDI: ffffed10120faf09
> > > RBP: ffff8880907d78c8 R08: ffff8880a771a200 R09: fffffbfff134ae48
> > > R10: fffffbfff134ae47 R11: ffffffff89a5723f R12: ffff88809ea2bb80
> > > R13: 0000000000000000 R14: ffff88809ff6cd40 R15: ffff8880a1c56480
> > > kref_get include/linux/kref.h:45 [inline]
> > > kobject_get+0x66/0xc0 lib/kobject.c:644
> > > cdev_get+0x60/0xb0 fs/char_dev.c:355
> > > chrdev_open+0xb0/0x6b0 fs/char_dev.c:400
> > > do_dentry_open+0x4df/0x1250 fs/open.c:797
> > > vfs_open+0xa0/0xd0 fs/open.c:906
> > > do_last fs/namei.c:3416 [inline]
> > > path_openat+0x10e9/0x4630 fs/namei.c:3533
> > > do_filp_open+0x1a1/0x280 fs/namei.c:3563
> > > do_sys_open+0x3fe/0x5d0 fs/open.c:1089
> >
> > FWIW, we've run into this same crash on arm64 so it would be nice to see it
> > fixed upstream. It looks like Hillf's reply (which included a patch) didn't
> > make it to the kernel mailing lists for some reason, but it is available
> > here:
> >
> > https://groups.google.com/forum/#!original/syzkaller-bugs/PnQNxBrWv_8/X1ygj8d8DgAJ
>
> No one is going to go and dig a patch out of google groups :(
Sure, just thought it was worth mentioning after digging up the history.
> > A simpler fix would just be to use kobject_get_unless_zero() directly in
> > cdev_get(), but that looks odd in this specific case because chrdev_open()
> > holds the 'cdev_lock' and you'd hope that finding the kobject in the inode
> > with that held would mean that it's not being freed at the same time.
>
> When using kref_get_unless_zero() that implies that a lock is not being
> used and you are relying on the kobject only instead.
>
> But I thought we had a lock in play here, so why would changing this
> actually fix anything?
I don't think the lock is always used. For example, look at chrdev_open(),
which appears in the backtrace; the locked code is:
spin_lock(&cdev_lock);
p = inode->i_cdev;
if (!p) {
struct kobject *kobj;
int idx;
spin_unlock(&cdev_lock);
kobj = kobj_lookup(cdev_map, inode->i_rdev, &idx);
if (!kobj)
return -ENXIO;
new = container_of(kobj, struct cdev, kobj);
spin_lock(&cdev_lock);
/* Check i_cdev again in case somebody beat us to it while
we dropped the lock. */
p = inode->i_cdev;
if (!p) {
inode->i_cdev = p = new;
list_add(&inode->i_devices, &p->list);
new = NULL;
} else if (!cdev_get(p))
ret = -ENXIO;
} else if (!cdev_get(p))
ret = -ENXIO;
spin_unlock(&cdev_lock);
cdev_put(new);
So the idea is that multiple threads serialise on the 'cdev_lock' and then
check 'inode->i_cdev' to figure out if the device has already been probed,
taking a reference to it if it's available or probing it via kobj_lookup()
otherwise. I think that's backwards with respect to things like cdev_put(),
where the refcount is dropped *before* 'inode->i_cdev' is cleared to NULL.
In which case, if a concurrent call to cdev_put() can drop the refcount
to zero without 'cdev_lock' held, then you could get a use-after-free on
this path thanks to a dangling pointer in 'inode->i_cdev'..
Looking slightly ahead in this same function, there are error paths which
appear to do exactly that:
fops = fops_get(p->ops);
if (!fops)
goto out_cdev_put;
replace_fops(filp, fops);
if (filp->f_op->open) {
ret = filp->f_op->open(inode, filp);
if (ret)
goto out_cdev_put;
}
return 0;
out_cdev_put:
cdev_put(p);
return ret;
In which case the thread which installed 'inode->i_cdev' earlier on can
now drop its refcount to zero without the lock held if, for example, the
filp->f_op->open() call fails.
But note, this is purely based on code inspection -- the C reproducer from
syzkaller doesn't work for me, so I've not been able to test any fixes either.
It's also worth noting that cdev_put() is called from __fput(), but I think the
reference counting on the file means we're ok there.
> This code hasn't changed in 15+ years, what suddenly changed that causes
> problems here?
I suppose one thing to consider is that the refcount code is relatively new,
so it could be that the actual use-after-free is extremely rare, but we're
now seeing that it's at least potentially an issue.
Thoughts?
Will
On Wed, Dec 18, 2019 at 07:20:26PM +0100, Greg KH wrote:
> On Wed, Dec 18, 2019 at 05:08:55PM +0000, Will Deacon wrote:
> > On Tue, Dec 10, 2019 at 11:44:45AM +0000, Will Deacon wrote:
> > > On Wed, Dec 04, 2019 at 01:31:48PM +0100, Greg KH wrote:
> > > > This code hasn't changed in 15+ years, what suddenly changed that causes
> > > > problems here?
> > >
> > > I suppose one thing to consider is that the refcount code is relatively new,
> > > so it could be that the actual use-after-free is extremely rare, but we're
> > > now seeing that it's at least potentially an issue.
> > >
> > > Thoughts?
> >
> > FWIW, I added some mdelay()s to make this race more likely, and I can now
> > trigger it reasonably reliably. See below.
> >
> > --->8
> >
> > [ 89.512353] ------------[ cut here ]------------
> > [ 89.513350] refcount_t: addition on 0; use-after-free.
> > [ 89.513977] WARNING: CPU: 2 PID: 6385 at lib/refcount.c:25 refcount_warn_saturate+0x6d/0xf0
[...]
> No hint as to _where_ you put the mdelay()? :)
I threw it in the release function to maximise the period where the refcount
is 0 but the inode 'i_cdev' pointer is non-NULL. I also hacked chrdev_open()
so that the fops->open() call appears to fail most of the time (I guess
syzkaller uses error injection to do something similar). Nasty hack below.
I'll send a patch, given that I've managed to "reproduce" this.
Will
--->8
diff --git a/fs/char_dev.c b/fs/char_dev.c
index 00dfe17871ac..e2e48fcd0435 100644
--- a/fs/char_dev.c
+++ b/fs/char_dev.c
@@ -375,7 +375,7 @@ static int chrdev_open(struct inode *inode, struct file *filp)
const struct file_operations *fops;
struct cdev *p;
struct cdev *new = NULL;
- int ret = 0;
+ int ret = 0, first = 0;
spin_lock(&cdev_lock);
p = inode->i_cdev;
@@ -395,6 +395,7 @@ static int chrdev_open(struct inode *inode, struct file *filp)
inode->i_cdev = p = new;
list_add(&inode->i_devices, &p->list);
new = NULL;
+ first = 1;
} else if (!cdev_get(p))
ret = -ENXIO;
} else if (!cdev_get(p))
@@ -411,6 +412,10 @@ static int chrdev_open(struct inode *inode, struct file *filp)
replace_fops(filp, fops);
if (filp->f_op->open) {
+ if (first && (get_cycles() & 0x3)) {
+ ret = -EINTR;
+ goto out_cdev_put;
+ }
ret = filp->f_op->open(inode, filp);
if (ret)
goto out_cdev_put;
@@ -594,12 +599,14 @@ void cdev_del(struct cdev *p)
kobject_put(&p->kobj);
}
+#include <linux/delay.h>
static void cdev_default_release(struct kobject *kobj)
{
struct cdev *p = container_of(kobj, struct cdev, kobj);
struct kobject *parent = kobj->parent;
+ mdelay(50);
cdev_purge(p);
kobject_put(parent);
}
Hi Will,
I am facing same issue while syzkaller fault injection code is causing
failure in filp->f_op->open() from chrdev_open().
I believe we need to rely on refcount as cdev_lock() is not sufficient
in this case.
Patch mentioned in
https://groups.google.com/forum/#!original/syzkaller-bugs/PnQNxBrWv_8/X1ygj8d8DgAJ
seems good.
Please share your opinion the same.
Regards
Prateek
On 12/19/2019 5:29 PM, Will Deacon wrote:
> On Wed, Dec 18, 2019 at 07:20:26PM +0100, Greg KH wrote:
>> On Wed, Dec 18, 2019 at 05:08:55PM +0000, Will Deacon wrote:
>>> On Tue, Dec 10, 2019 at 11:44:45AM +0000, Will Deacon wrote:
>>>> On Wed, Dec 04, 2019 at 01:31:48PM +0100, Greg KH wrote:
>>>>> This code hasn't changed in 15+ years, what suddenly changed that causes
>>>>> problems here?
>>>> I suppose one thing to consider is that the refcount code is relatively new,
>>>> so it could be that the actual use-after-free is extremely rare, but we're
>>>> now seeing that it's at least potentially an issue.
>>>>
>>>> Thoughts?
>>> FWIW, I added some mdelay()s to make this race more likely, and I can now
>>> trigger it reasonably reliably. See below.
>>>
>>> --->8
>>>
>>> [ 89.512353] ------------[ cut here ]------------
>>> [ 89.513350] refcount_t: addition on 0; use-after-free.
>>> [ 89.513977] WARNING: CPU: 2 PID: 6385 at lib/refcount.c:25 refcount_warn_saturate+0x6d/0xf0
> [...]
>
>> No hint as to _where_ you put the mdelay()? :)
> I threw it in the release function to maximise the period where the refcount
> is 0 but the inode 'i_cdev' pointer is non-NULL. I also hacked chrdev_open()
> so that the fops->open() call appears to fail most of the time (I guess
> syzkaller uses error injection to do something similar). Nasty hack below.
>
> I'll send a patch, given that I've managed to "reproduce" this.
>
> Will
>
> --->8
>
> diff --git a/fs/char_dev.c b/fs/char_dev.c
> index 00dfe17871ac..e2e48fcd0435 100644
> --- a/fs/char_dev.c
> +++ b/fs/char_dev.c
> @@ -375,7 +375,7 @@ static int chrdev_open(struct inode *inode, struct file *filp)
> const struct file_operations *fops;
> struct cdev *p;
> struct cdev *new = NULL;
> - int ret = 0;
> + int ret = 0, first = 0;
>
> spin_lock(&cdev_lock);
> p = inode->i_cdev;
> @@ -395,6 +395,7 @@ static int chrdev_open(struct inode *inode, struct file *filp)
> inode->i_cdev = p = new;
> list_add(&inode->i_devices, &p->list);
> new = NULL;
> + first = 1;
> } else if (!cdev_get(p))
> ret = -ENXIO;
> } else if (!cdev_get(p))
> @@ -411,6 +412,10 @@ static int chrdev_open(struct inode *inode, struct file *filp)
>
> replace_fops(filp, fops);
> if (filp->f_op->open) {
> + if (first && (get_cycles() & 0x3)) {
> + ret = -EINTR;
> + goto out_cdev_put;
> + }
> ret = filp->f_op->open(inode, filp);
> if (ret)
> goto out_cdev_put;
> @@ -594,12 +599,14 @@ void cdev_del(struct cdev *p)
> kobject_put(&p->kobj);
> }
>
> +#include <linux/delay.h>
>
> static void cdev_default_release(struct kobject *kobj)
> {
> struct cdev *p = container_of(kobj, struct cdev, kobj);
> struct kobject *parent = kobj->parent;
>
> + mdelay(50);
> cdev_purge(p);
> kobject_put(parent);
> }
--
Qualcomm India Private Limited, on behalf of Qualcomm Innovation
Center, Inc., is a member of Code Aurora Forum, a Linux Foundation
Collaborative Project