2022-09-07 20:41:14

by Christian A. Ehrhardt

[permalink] [raw]
Subject: [PATCH] kernfs: fix use-after-free in __kernfs_remove

Concurrent calls to __kernfs_remove can race on the removal
of the root node: The race occurs if the root node(kn) is freed
during kernfs_drain. The child node(pos) is explicitly protected
with an additional ref count. Do the same for the root node.

Found by syzkaller with the following reproducer:

syz_mount_image$ext4(0x0, &(0x7f0000000100)='./file0\x00', 0x100000, 0x0, 0x0, 0x0, 0x0)
r0 = openat(0xffffffffffffff9c, &(0x7f0000000080)='/proc/self/exe\x00', 0x0, 0x0)
close(r0)
pipe2(&(0x7f0000000140)={0xffffffffffffffff, <r1=>0xffffffffffffffff}, 0x800)
mount$9p_fd(0x0, &(0x7f0000000040)='./file0\x00', &(0x7f00000000c0), 0x408, &(0x7f0000000280)={'trans=fd,', {'rfdno', 0x3d, r0}, 0x2c, {'wfdno', 0x3d, r1}, 0x2c, {[{@cache_loose}, {@mmap}, {@loose}, {@loose}, {@mmap}], [{@mask={'mask', 0x3d, '^MAY_EXEC'}}, {@fsmagic={'fsmagic', 0x3d, 0x10001}}, {@dont_hash}]}})

Sample report:

==================================================================
BUG: KASAN: use-after-free in kernfs_type include/linux/kernfs.h:335 [inline]
BUG: KASAN: use-after-free in kernfs_leftmost_descendant fs/kernfs/dir.c:1261 [inline]
BUG: KASAN: use-after-free in __kernfs_remove.part.0+0x843/0x960 fs/kernfs/dir.c:1369
Read of size 2 at addr ffff8880088807f0 by task syz-executor.2/857

CPU: 0 PID: 857 Comm: syz-executor.2 Not tainted 6.0.0-rc3-00363-g7726d4c3e60b #5
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x6e/0x91 lib/dump_stack.c:106
print_address_description mm/kasan/report.c:317 [inline]
print_report.cold+0x5e/0x5e5 mm/kasan/report.c:433
kasan_report+0xa3/0x130 mm/kasan/report.c:495
kernfs_type include/linux/kernfs.h:335 [inline]
kernfs_leftmost_descendant fs/kernfs/dir.c:1261 [inline]
__kernfs_remove.part.0+0x843/0x960 fs/kernfs/dir.c:1369
__kernfs_remove fs/kernfs/dir.c:1356 [inline]
kernfs_remove_by_name_ns+0x108/0x190 fs/kernfs/dir.c:1589
sysfs_slab_add+0x133/0x1e0 mm/slub.c:5943
__kmem_cache_create+0x3e0/0x550 mm/slub.c:4899
create_cache mm/slab_common.c:229 [inline]
kmem_cache_create_usercopy+0x167/0x2a0 mm/slab_common.c:335
p9_client_create+0xd4d/0x1190 net/9p/client.c:993
v9fs_session_init+0x1e6/0x13c0 fs/9p/v9fs.c:408
v9fs_mount+0xb9/0xbd0 fs/9p/vfs_super.c:126
legacy_get_tree+0xf1/0x200 fs/fs_context.c:610
vfs_get_tree+0x85/0x2e0 fs/super.c:1530
do_new_mount fs/namespace.c:3040 [inline]
path_mount+0x675/0x1d00 fs/namespace.c:3370
do_mount fs/namespace.c:3383 [inline]
__do_sys_mount fs/namespace.c:3591 [inline]
__se_sys_mount fs/namespace.c:3568 [inline]
__x64_sys_mount+0x282/0x300 fs/namespace.c:3568
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f725f983aed
Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f725f0f7028 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 00007f725faa3f80 RCX: 00007f725f983aed
RDX: 00000000200000c0 RSI: 0000000020000040 RDI: 0000000000000000
RBP: 00007f725f9f419c R08: 0000000020000280 R09: 0000000000000000
R10: 0000000000000408 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000006 R14: 00007f725faa3f80 R15: 00007f725f0d7000
</TASK>

Allocated by task 855:
kasan_save_stack+0x1e/0x40 mm/kasan/common.c:38
kasan_set_track mm/kasan/common.c:45 [inline]
set_alloc_info mm/kasan/common.c:437 [inline]
__kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:470
kasan_slab_alloc include/linux/kasan.h:224 [inline]
slab_post_alloc_hook mm/slab.h:727 [inline]
slab_alloc_node mm/slub.c:3243 [inline]
slab_alloc mm/slub.c:3251 [inline]
__kmem_cache_alloc_lru mm/slub.c:3258 [inline]
kmem_cache_alloc+0xbf/0x200 mm/slub.c:3268
kmem_cache_zalloc include/linux/slab.h:723 [inline]
__kernfs_new_node+0xd4/0x680 fs/kernfs/dir.c:593
kernfs_new_node fs/kernfs/dir.c:655 [inline]
kernfs_create_dir_ns+0x9c/0x220 fs/kernfs/dir.c:1010
sysfs_create_dir_ns+0x127/0x290 fs/sysfs/dir.c:59
create_dir lib/kobject.c:63 [inline]
kobject_add_internal+0x24a/0x8d0 lib/kobject.c:223
kobject_add_varg lib/kobject.c:358 [inline]
kobject_init_and_add+0x101/0x160 lib/kobject.c:441
sysfs_slab_add+0x156/0x1e0 mm/slub.c:5954
__kmem_cache_create+0x3e0/0x550 mm/slub.c:4899
create_cache mm/slab_common.c:229 [inline]
kmem_cache_create_usercopy+0x167/0x2a0 mm/slab_common.c:335
p9_client_create+0xd4d/0x1190 net/9p/client.c:993
v9fs_session_init+0x1e6/0x13c0 fs/9p/v9fs.c:408
v9fs_mount+0xb9/0xbd0 fs/9p/vfs_super.c:126
legacy_get_tree+0xf1/0x200 fs/fs_context.c:610
vfs_get_tree+0x85/0x2e0 fs/super.c:1530
do_new_mount fs/namespace.c:3040 [inline]
path_mount+0x675/0x1d00 fs/namespace.c:3370
do_mount fs/namespace.c:3383 [inline]
__do_sys_mount fs/namespace.c:3591 [inline]
__se_sys_mount fs/namespace.c:3568 [inline]
__x64_sys_mount+0x282/0x300 fs/namespace.c:3568
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

Freed by task 857:
kasan_save_stack+0x1e/0x40 mm/kasan/common.c:38
kasan_set_track+0x21/0x30 mm/kasan/common.c:45
kasan_set_free_info+0x20/0x40 mm/kasan/generic.c:370
____kasan_slab_free mm/kasan/common.c:367 [inline]
____kasan_slab_free mm/kasan/common.c:329 [inline]
__kasan_slab_free+0x108/0x190 mm/kasan/common.c:375
kasan_slab_free include/linux/kasan.h:200 [inline]
slab_free_hook mm/slub.c:1754 [inline]
slab_free_freelist_hook mm/slub.c:1780 [inline]
slab_free mm/slub.c:3534 [inline]
kmem_cache_free+0x9c/0x340 mm/slub.c:3551
kernfs_put.part.0+0x2b2/0x520 fs/kernfs/dir.c:547
kernfs_put+0x42/0x50 fs/kernfs/dir.c:521
__kernfs_remove.part.0+0x72d/0x960 fs/kernfs/dir.c:1407
__kernfs_remove fs/kernfs/dir.c:1356 [inline]
kernfs_remove_by_name_ns+0x108/0x190 fs/kernfs/dir.c:1589
sysfs_slab_add+0x133/0x1e0 mm/slub.c:5943
__kmem_cache_create+0x3e0/0x550 mm/slub.c:4899
create_cache mm/slab_common.c:229 [inline]
kmem_cache_create_usercopy+0x167/0x2a0 mm/slab_common.c:335
p9_client_create+0xd4d/0x1190 net/9p/client.c:993
v9fs_session_init+0x1e6/0x13c0 fs/9p/v9fs.c:408
v9fs_mount+0xb9/0xbd0 fs/9p/vfs_super.c:126
legacy_get_tree+0xf1/0x200 fs/fs_context.c:610
vfs_get_tree+0x85/0x2e0 fs/super.c:1530
do_new_mount fs/namespace.c:3040 [inline]
path_mount+0x675/0x1d00 fs/namespace.c:3370
do_mount fs/namespace.c:3383 [inline]
__do_sys_mount fs/namespace.c:3591 [inline]
__se_sys_mount fs/namespace.c:3568 [inline]
__x64_sys_mount+0x282/0x300 fs/namespace.c:3568
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

The buggy address belongs to the object at ffff888008880780
which belongs to the cache kernfs_node_cache of size 128
The buggy address is located 112 bytes inside of
128-byte region [ffff888008880780, ffff888008880800)

The buggy address belongs to the physical page:
page:00000000732833f8 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8880
flags: 0x100000000000200(slab|node=0|zone=1)
raw: 0100000000000200 0000000000000000 dead000000000122 ffff888001147280
raw: 0000000000000000 0000000000150015 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
ffff888008880680: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
ffff888008880700: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
>ffff888008880780: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff888008880800: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
ffff888008880880: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
==================================================================

cc: Greg Kroah-Hartman <[email protected]>
cc: Tejun Heo <[email protected]>
Signed-off-by: Christian A. Ehrhardt <[email protected]>
---
fs/kernfs/dir.c | 2 ++
1 file changed, 2 insertions(+)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 1cc88ba6de90..66ccf8480592 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -1365,6 +1365,7 @@ static void __kernfs_remove(struct kernfs_node *kn)
atomic_add(KN_DEACTIVATED_BIAS, &pos->active);

/* deactivate and unlink the subtree node-by-node */
+ kernfs_get(kn);
do {
pos = kernfs_leftmost_descendant(kn);

@@ -1406,6 +1407,7 @@ static void __kernfs_remove(struct kernfs_node *kn)

kernfs_put(pos);
} while (pos != kn);
+ kernfs_put(kn);
}

/**
--
2.34.1


2022-09-08 17:36:39

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH] kernfs: fix use-after-free in __kernfs_remove

Hello, Christian.

On Wed, Sep 07, 2022 at 10:08:11PM +0200, Christian A. Ehrhardt wrote:
> Concurrent calls to __kernfs_remove can race on the removal
> of the root node: The race occurs if the root node(kn) is freed
> during kernfs_drain. The child node(pos) is explicitly protected
> with an additional ref count. Do the same for the root node.

I don't think this is right. We don't support parallel invocations of
__kernfs_remove() this way. If @kn can be freed during kernfs_drain(), it
also means that it can be freed before kernfs_rwsem is grabbed in
kernfs_remove(). The caller must be responsible for ensuring that @kn
remains allocated. Otherwise, it can't be made reliable.

Thanks.

--
tejun

2022-09-08 22:31:35

by Christian A. Ehrhardt

[permalink] [raw]
Subject: Re: [PATCH] kernfs: fix use-after-free in __kernfs_remove


Hello Tejun,

On Thu, Sep 08, 2022 at 07:14:43AM -1000, Tejun Heo wrote:
> Hello, Christian.
>
> On Wed, Sep 07, 2022 at 10:08:11PM +0200, Christian A. Ehrhardt wrote:
> > Concurrent calls to __kernfs_remove can race on the removal
> > of the root node: The race occurs if the root node(kn) is freed
> > during kernfs_drain. The child node(pos) is explicitly protected
> > with an additional ref count. Do the same for the root node.
>
> I don't think this is right. We don't support parallel invocations of
> __kernfs_remove() this way. If @kn can be freed during kernfs_drain(), it
> also means that it can be freed before kernfs_rwsem is grabbed in
> kernfs_remove().

Point taken. However, the syzkaller reproducer reliably triggers
the bug without the patch and the bug is gone with the patch.

> The caller must be responsible for ensuring that @kn
> remains allocated. Otherwise, it can't be made reliable.

In this case the caller of __kernfs_remove is not kernfs_remove but
kernfs_remove_by_name_ns and it fails to take a reference for the
node that it looks up and deletes. Thus a second call to
kernfs_remove_by_name_ns can remove the node while kernfs_drain
drops the semaphore.

I'll post an updated patch tomorrow.

regards Christian

2022-09-12 21:38:29

by Christian A. Ehrhardt

[permalink] [raw]
Subject: Re: [PATCH] kernfs: fix use-after-free in __kernfs_remove


Hello, Tejun,

On Fri, Sep 09, 2022 at 12:25:36AM +0200, Christian A. Ehrhardt wrote:
>
> Hello Tejun,
>
> On Thu, Sep 08, 2022 at 07:14:43AM -1000, Tejun Heo wrote:
> > Hello, Christian.
> >
> > On Wed, Sep 07, 2022 at 10:08:11PM +0200, Christian A. Ehrhardt wrote:
> > > Concurrent calls to __kernfs_remove can race on the removal
> > > of the root node: The race occurs if the root node(kn) is freed
> > > during kernfs_drain. The child node(pos) is explicitly protected
> > > with an additional ref count. Do the same for the root node.
> >
> > I don't think this is right. We don't support parallel invocations of
> > __kernfs_remove() this way. If @kn can be freed during kernfs_drain(), it
> > also means that it can be freed before kernfs_rwsem is grabbed in
> > kernfs_remove().
>
> Point taken. However, the syzkaller reproducer reliably triggers
> the bug without the patch and the bug is gone with the patch.
>
> > The caller must be responsible for ensuring that @kn
> > remains allocated. Otherwise, it can't be made reliable.
>
> In this case the caller of __kernfs_remove is not kernfs_remove but
> kernfs_remove_by_name_ns and it fails to take a reference for the
> node that it looks up and deletes. Thus a second call to
> kernfs_remove_by_name_ns can remove the node while kernfs_drain
> drops the semaphore.
>
> I'll post an updated patch tomorrow.

Sorry, no patch (yet). But here's the whole story of the initial
syzkaller report (form top to bottom). I'm not sure where we go wrong
but I think several places could do better here:

The code in net/9p/client.c creates one kmem-cache per client session.
All of these kmem caches use the same name ("9p-fcall-cache").
Is it ok to create several kmem caches with the same name? My feeling is
that this is somewhat unexpected but should probably be allowed.

In the setup in question slab caches are not merged. Thus the slub
code uses the kmem cache name directly to create the sysfs directory for
the slub cache. If we allow multiple kmem caches with the same name the
slub code should somehow make the sysfs names unique (e.g. by adding a
serial numer to the name), right?

Before creating the sysfs directory the slub code uses sysfs_remove_link
(aka kernfs_remove_by_name) with the following comment:
"If we have a leftover link remove it." In fact these "leftover"s
are the sysfs files of an active kmem cache with the same name.

Additionally, sysfs_remove_link() looks like a bit of a misnomer
as it removes whatever exists under the given name. This may be a
symlink but it can be an entire directory hierarchy, too.
Is this intentional? At least it's been like that forever.

If kmem cache creation is done in parallel we can now have
concurrent invocations of kernfs_remove_by_name_ns() for the same
parent and the same name. This is what eventually causes the race.

The race is possible as kernfs_remove_by_name_ns() looks up the
name of the target in its parent but does not acquire a ref count
on the target before calling __kernfs_remove(). __kernfs_remove()
may drop the kernfs_rwsem in kernfs_drain(). Thus a concurrent call
to __kernfs_remove can complete the removal except for the nodes
that the first instance of __kernfs_remove() holds refs for.
As explained above no ref is held on the root of the removed tree.
This causes the use-after-free that KASAN sees and complains about.

For kernfs_remove it is reasonable to expect the caller to hold
some kind of reference to prevent this type of race and from a quick
check, the callers seem to get this correct. The only exception that
I could find is kernfs_remove_by_name_ns. This is easy to fix if
kernfs_remove_by_name_ns() hold a reference on the root node across
the call to __kernfs_remove().
IMHO this should be done. Should I sent a patch?

Thanks for bearing with me. Oppinons?

regards Christian

2022-09-12 22:42:25

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH] kernfs: fix use-after-free in __kernfs_remove

Hello,

On Mon, Sep 12, 2022 at 11:24:52PM +0200, Christian A. Ehrhardt wrote:
> Sorry, no patch (yet). But here's the whole story of the initial
> syzkaller report (form top to bottom). I'm not sure where we go wrong
> but I think several places could do better here:
>
> The code in net/9p/client.c creates one kmem-cache per client session.
> All of these kmem caches use the same name ("9p-fcall-cache").
> Is it ok to create several kmem caches with the same name? My feeling is
> that this is somewhat unexpected but should probably be allowed.

I don't think that's supported. kmem_caches are exposed in sysfs and having
the same name is gonna cause issues.

> In the setup in question slab caches are not merged. Thus the slub
> code uses the kmem cache name directly to create the sysfs directory for
> the slub cache. If we allow multiple kmem caches with the same name the
> slub code should somehow make the sysfs names unique (e.g. by adding a
> serial numer to the name), right?

I think we're just in uncharted terriotory. Maybe some things will work
while others don't. Nobody really thought about or tested this usage.

> Before creating the sysfs directory the slub code uses sysfs_remove_link
> (aka kernfs_remove_by_name) with the following comment:
> "If we have a leftover link remove it." In fact these "leftover"s
> are the sysfs files of an active kmem cache with the same name.

Hahahaha

> Additionally, sysfs_remove_link() looks like a bit of a misnomer
> as it removes whatever exists under the given name. This may be a
> symlink but it can be an entire directory hierarchy, too.
> Is this intentional? At least it's been like that forever.

It's a historical artifact. The backend implementation has changed while the
wrapping sysfs interface remained the same.

> If kmem cache creation is done in parallel we can now have
> concurrent invocations of kernfs_remove_by_name_ns() for the same
> parent and the same name. This is what eventually causes the race.
>
> The race is possible as kernfs_remove_by_name_ns() looks up the
> name of the target in its parent but does not acquire a ref count
> on the target before calling __kernfs_remove(). __kernfs_remove()
> may drop the kernfs_rwsem in kernfs_drain(). Thus a concurrent call
> to __kernfs_remove can complete the removal except for the nodes
> that the first instance of __kernfs_remove() holds refs for.
> As explained above no ref is held on the root of the removed tree.
> This causes the use-after-free that KASAN sees and complains about.
>
> For kernfs_remove it is reasonable to expect the caller to hold
> some kind of reference to prevent this type of race and from a quick
> check, the callers seem to get this correct. The only exception that
> I could find is kernfs_remove_by_name_ns. This is easy to fix if
> kernfs_remove_by_name_ns() hold a reference on the root node across
> the call to __kernfs_remove().
> IMHO this should be done. Should I sent a patch?

So, no matter what, I think it'd be a good idea to make remove_by_name hold
onto the kn it's removing, so please send a patch to do so. For the rest of
the situation, I think the right thing to do would be updating 9p so that it
doesn't create multiple kmem_caches with the same name.

Thanks.

--
tejun

2022-09-13 12:55:24

by Christian A. Ehrhardt

[permalink] [raw]
Subject: [PATCH v2] kernfs: fix use-after-free in __kernfs_remove

Syzkaller managed to trigger concurrent calls to
kernfs_remove_by_name_ns() for the same file resulting in
a KASAN detected use-after-free. The race occurs when the root
node is freed during kernfs_drain().

To prevent this acquire an additional reference for the root
of the tree that is removed before calling __kernfs_remove().

Found by syzkaller with the following reproducer (slab_nomerge is
required):

syz_mount_image$ext4(0x0, &(0x7f0000000100)='./file0\x00', 0x100000, 0x0, 0x0, 0x0, 0x0)
r0 = openat(0xffffffffffffff9c, &(0x7f0000000080)='/proc/self/exe\x00', 0x0, 0x0)
close(r0)
pipe2(&(0x7f0000000140)={0xffffffffffffffff, <r1=>0xffffffffffffffff}, 0x800)
mount$9p_fd(0x0, &(0x7f0000000040)='./file0\x00', &(0x7f00000000c0), 0x408, &(0x7f0000000280)={'trans=fd,', {'rfdno', 0x3d, r0}, 0x2c, {'wfdno', 0x3d, r1}, 0x2c, {[{@cache_loose}, {@mmap}, {@loose}, {@loose}, {@mmap}], [{@mask={'mask', 0x3d, '^MAY_EXEC'}}, {@fsmagic={'fsmagic', 0x3d, 0x10001}}, {@dont_hash}]}})

Sample report:

==================================================================
BUG: KASAN: use-after-free in kernfs_type include/linux/kernfs.h:335 [inline]
BUG: KASAN: use-after-free in kernfs_leftmost_descendant fs/kernfs/dir.c:1261 [inline]
BUG: KASAN: use-after-free in __kernfs_remove.part.0+0x843/0x960 fs/kernfs/dir.c:1369
Read of size 2 at addr ffff8880088807f0 by task syz-executor.2/857

CPU: 0 PID: 857 Comm: syz-executor.2 Not tainted 6.0.0-rc3-00363-g7726d4c3e60b #5
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.15.0-1 04/01/2014
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x6e/0x91 lib/dump_stack.c:106
print_address_description mm/kasan/report.c:317 [inline]
print_report.cold+0x5e/0x5e5 mm/kasan/report.c:433
kasan_report+0xa3/0x130 mm/kasan/report.c:495
kernfs_type include/linux/kernfs.h:335 [inline]
kernfs_leftmost_descendant fs/kernfs/dir.c:1261 [inline]
__kernfs_remove.part.0+0x843/0x960 fs/kernfs/dir.c:1369
__kernfs_remove fs/kernfs/dir.c:1356 [inline]
kernfs_remove_by_name_ns+0x108/0x190 fs/kernfs/dir.c:1589
sysfs_slab_add+0x133/0x1e0 mm/slub.c:5943
__kmem_cache_create+0x3e0/0x550 mm/slub.c:4899
create_cache mm/slab_common.c:229 [inline]
kmem_cache_create_usercopy+0x167/0x2a0 mm/slab_common.c:335
p9_client_create+0xd4d/0x1190 net/9p/client.c:993
v9fs_session_init+0x1e6/0x13c0 fs/9p/v9fs.c:408
v9fs_mount+0xb9/0xbd0 fs/9p/vfs_super.c:126
legacy_get_tree+0xf1/0x200 fs/fs_context.c:610
vfs_get_tree+0x85/0x2e0 fs/super.c:1530
do_new_mount fs/namespace.c:3040 [inline]
path_mount+0x675/0x1d00 fs/namespace.c:3370
do_mount fs/namespace.c:3383 [inline]
__do_sys_mount fs/namespace.c:3591 [inline]
__se_sys_mount fs/namespace.c:3568 [inline]
__x64_sys_mount+0x282/0x300 fs/namespace.c:3568
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7f725f983aed
Code: 02 b8 ff ff ff ff c3 66 0f 1f 44 00 00 f3 0f 1e fa 48 89 f8 48 89 f7 48 89 d6 48 89 ca 4d 89 c2 4d 89 c8 4c 8b 4c 24 08 0f 05 <48> 3d 01 f0 ff ff 73 01 c3 48 c7 c1 b0 ff ff ff f7 d8 64 89 01 48
RSP: 002b:00007f725f0f7028 EFLAGS: 00000246 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 00007f725faa3f80 RCX: 00007f725f983aed
RDX: 00000000200000c0 RSI: 0000000020000040 RDI: 0000000000000000
RBP: 00007f725f9f419c R08: 0000000020000280 R09: 0000000000000000
R10: 0000000000000408 R11: 0000000000000246 R12: 0000000000000000
R13: 0000000000000006 R14: 00007f725faa3f80 R15: 00007f725f0d7000
</TASK>

Allocated by task 855:
kasan_save_stack+0x1e/0x40 mm/kasan/common.c:38
kasan_set_track mm/kasan/common.c:45 [inline]
set_alloc_info mm/kasan/common.c:437 [inline]
__kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:470
kasan_slab_alloc include/linux/kasan.h:224 [inline]
slab_post_alloc_hook mm/slab.h:727 [inline]
slab_alloc_node mm/slub.c:3243 [inline]
slab_alloc mm/slub.c:3251 [inline]
__kmem_cache_alloc_lru mm/slub.c:3258 [inline]
kmem_cache_alloc+0xbf/0x200 mm/slub.c:3268
kmem_cache_zalloc include/linux/slab.h:723 [inline]
__kernfs_new_node+0xd4/0x680 fs/kernfs/dir.c:593
kernfs_new_node fs/kernfs/dir.c:655 [inline]
kernfs_create_dir_ns+0x9c/0x220 fs/kernfs/dir.c:1010
sysfs_create_dir_ns+0x127/0x290 fs/sysfs/dir.c:59
create_dir lib/kobject.c:63 [inline]
kobject_add_internal+0x24a/0x8d0 lib/kobject.c:223
kobject_add_varg lib/kobject.c:358 [inline]
kobject_init_and_add+0x101/0x160 lib/kobject.c:441
sysfs_slab_add+0x156/0x1e0 mm/slub.c:5954
__kmem_cache_create+0x3e0/0x550 mm/slub.c:4899
create_cache mm/slab_common.c:229 [inline]
kmem_cache_create_usercopy+0x167/0x2a0 mm/slab_common.c:335
p9_client_create+0xd4d/0x1190 net/9p/client.c:993
v9fs_session_init+0x1e6/0x13c0 fs/9p/v9fs.c:408
v9fs_mount+0xb9/0xbd0 fs/9p/vfs_super.c:126
legacy_get_tree+0xf1/0x200 fs/fs_context.c:610
vfs_get_tree+0x85/0x2e0 fs/super.c:1530
do_new_mount fs/namespace.c:3040 [inline]
path_mount+0x675/0x1d00 fs/namespace.c:3370
do_mount fs/namespace.c:3383 [inline]
__do_sys_mount fs/namespace.c:3591 [inline]
__se_sys_mount fs/namespace.c:3568 [inline]
__x64_sys_mount+0x282/0x300 fs/namespace.c:3568
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

Freed by task 857:
kasan_save_stack+0x1e/0x40 mm/kasan/common.c:38
kasan_set_track+0x21/0x30 mm/kasan/common.c:45
kasan_set_free_info+0x20/0x40 mm/kasan/generic.c:370
____kasan_slab_free mm/kasan/common.c:367 [inline]
____kasan_slab_free mm/kasan/common.c:329 [inline]
__kasan_slab_free+0x108/0x190 mm/kasan/common.c:375
kasan_slab_free include/linux/kasan.h:200 [inline]
slab_free_hook mm/slub.c:1754 [inline]
slab_free_freelist_hook mm/slub.c:1780 [inline]
slab_free mm/slub.c:3534 [inline]
kmem_cache_free+0x9c/0x340 mm/slub.c:3551
kernfs_put.part.0+0x2b2/0x520 fs/kernfs/dir.c:547
kernfs_put+0x42/0x50 fs/kernfs/dir.c:521
__kernfs_remove.part.0+0x72d/0x960 fs/kernfs/dir.c:1407
__kernfs_remove fs/kernfs/dir.c:1356 [inline]
kernfs_remove_by_name_ns+0x108/0x190 fs/kernfs/dir.c:1589
sysfs_slab_add+0x133/0x1e0 mm/slub.c:5943
__kmem_cache_create+0x3e0/0x550 mm/slub.c:4899
create_cache mm/slab_common.c:229 [inline]
kmem_cache_create_usercopy+0x167/0x2a0 mm/slab_common.c:335
p9_client_create+0xd4d/0x1190 net/9p/client.c:993
v9fs_session_init+0x1e6/0x13c0 fs/9p/v9fs.c:408
v9fs_mount+0xb9/0xbd0 fs/9p/vfs_super.c:126
legacy_get_tree+0xf1/0x200 fs/fs_context.c:610
vfs_get_tree+0x85/0x2e0 fs/super.c:1530
do_new_mount fs/namespace.c:3040 [inline]
path_mount+0x675/0x1d00 fs/namespace.c:3370
do_mount fs/namespace.c:3383 [inline]
__do_sys_mount fs/namespace.c:3591 [inline]
__se_sys_mount fs/namespace.c:3568 [inline]
__x64_sys_mount+0x282/0x300 fs/namespace.c:3568
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x38/0x90 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

The buggy address belongs to the object at ffff888008880780
which belongs to the cache kernfs_node_cache of size 128
The buggy address is located 112 bytes inside of
128-byte region [ffff888008880780, ffff888008880800)

The buggy address belongs to the physical page:
page:00000000732833f8 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x8880
flags: 0x100000000000200(slab|node=0|zone=1)
raw: 0100000000000200 0000000000000000 dead000000000122 ffff888001147280
raw: 0000000000000000 0000000000150015 00000001ffffffff 0000000000000000
page dumped because: kasan: bad access detected

Memory state around the buggy address:
ffff888008880680: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
ffff888008880700: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
>ffff888008880780: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff888008880800: fc fc fc fc fc fc fc fc fa fb fb fb fb fb fb fb
ffff888008880880: fb fb fb fb fb fb fb fb fc fc fc fc fc fc fc fc
==================================================================

cc: Greg Kroah-Hartman <[email protected]>
cc: Tejun Heo <[email protected]>
Signed-off-by: Christian A. Ehrhardt <[email protected]>
---
fs/kernfs/dir.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/kernfs/dir.c b/fs/kernfs/dir.c
index 1cc88ba6de90..2e9313988871 100644
--- a/fs/kernfs/dir.c
+++ b/fs/kernfs/dir.c
@@ -1585,8 +1585,11 @@ int kernfs_remove_by_name_ns(struct kernfs_node *parent, const char *name,
down_write(&root->kernfs_rwsem);

kn = kernfs_find_ns(parent, name, ns);
- if (kn)
+ if (kn) {
+ kernfs_get(kn);
__kernfs_remove(kn);
+ kernfs_put(kn);
+ }

up_write(&root->kernfs_rwsem);

--
2.34.1

2022-09-19 17:51:48

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH v2] kernfs: fix use-after-free in __kernfs_remove

On Tue, Sep 13, 2022 at 02:17:23PM +0200, Christian A. Ehrhardt wrote:
> Syzkaller managed to trigger concurrent calls to
> kernfs_remove_by_name_ns() for the same file resulting in
> a KASAN detected use-after-free. The race occurs when the root
> node is freed during kernfs_drain().
>
> To prevent this acquire an additional reference for the root
> of the tree that is removed before calling __kernfs_remove().
...
> cc: Greg Kroah-Hartman <[email protected]>
> cc: Tejun Heo <[email protected]>
> Signed-off-by: Christian A. Ehrhardt <[email protected]>

Acked-by: Tejun Heo <[email protected]>

Thanks.

--
tejun