2023-05-12 07:34:34

by syzbot

[permalink] [raw]
Subject: [syzbot] [rdma?] KASAN: slab-use-after-free Read in siw_query_port

Hello,

syzbot found the following issue on:

HEAD commit: 16a8829130ca nfs: fix another case of NULL/IS_ERR confusio..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=162c0566280000
kernel config: https://syzkaller.appspot.com/x/.config?x=8bc832f563d8bf38
dashboard link: https://syzkaller.appspot.com/bug?extid=79f283f1f4ccc6e8b624
compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/f8c18a31ba47/disk-16a88291.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/03a18f29b7e7/vmlinux-16a88291.xz
kernel image: https://storage.googleapis.com/syzbot-assets/1db2407ade1e/bzImage-16a88291.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]

xfrm0 speed is unknown, defaulting to 1000
==================================================================
BUG: KASAN: slab-use-after-free in siw_query_port+0x37b/0x3e0 drivers/infiniband/sw/siw/siw_verbs.c:177
Read of size 4 at addr ffff888034efa0e8 by task kworker/1:4/24211

CPU: 1 PID: 24211 Comm: kworker/1:4 Not tainted 6.4.0-rc1-syzkaller-00012-g16a8829130ca #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/14/2023
Workqueue: infiniband ib_cache_event_task
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0xd9/0x150 lib/dump_stack.c:106
print_address_description.constprop.0+0x2c/0x3c0 mm/kasan/report.c:351
print_report mm/kasan/report.c:462 [inline]
kasan_report+0x11c/0x130 mm/kasan/report.c:572
siw_query_port+0x37b/0x3e0 drivers/infiniband/sw/siw/siw_verbs.c:177
iw_query_port drivers/infiniband/core/device.c:2049 [inline]
ib_query_port drivers/infiniband/core/device.c:2090 [inline]
ib_query_port+0x3c4/0x8f0 drivers/infiniband/core/device.c:2082
ib_cache_update.part.0+0xcf/0x920 drivers/infiniband/core/cache.c:1487
ib_cache_update drivers/infiniband/core/cache.c:1561 [inline]
ib_cache_event_task+0x1b1/0x270 drivers/infiniband/core/cache.c:1561
process_one_work+0x99a/0x15e0 kernel/workqueue.c:2405
worker_thread+0x67d/0x10c0 kernel/workqueue.c:2552
kthread+0x344/0x440 kernel/kthread.c:379
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
</TASK>

Allocated by task 14304:
kasan_save_stack+0x22/0x40 mm/kasan/common.c:45
kasan_set_track+0x25/0x30 mm/kasan/common.c:52
____kasan_kmalloc mm/kasan/common.c:374 [inline]
____kasan_kmalloc mm/kasan/common.c:333 [inline]
__kasan_kmalloc+0xa2/0xb0 mm/kasan/common.c:383
kasan_kmalloc include/linux/kasan.h:196 [inline]
__do_kmalloc_node mm/slab_common.c:966 [inline]
__kmalloc_node+0x61/0x1a0 mm/slab_common.c:973
kmalloc_node include/linux/slab.h:579 [inline]
kvmalloc_node+0xa2/0x1a0 mm/util.c:604
kvmalloc include/linux/slab.h:697 [inline]
kvzalloc include/linux/slab.h:705 [inline]
alloc_netdev_mqs+0x9c/0x1250 net/core/dev.c:10626
rtnl_create_link+0xbeb/0xee0 net/core/rtnetlink.c:3315
rtnl_newlink_create net/core/rtnetlink.c:3433 [inline]
__rtnl_newlink+0xfd4/0x1840 net/core/rtnetlink.c:3660
rtnl_newlink+0x68/0xa0 net/core/rtnetlink.c:3673
rtnetlink_rcv_msg+0x43d/0xd50 net/core/rtnetlink.c:6395
netlink_rcv_skb+0x165/0x440 net/netlink/af_netlink.c:2546
netlink_unicast_kernel net/netlink/af_netlink.c:1339 [inline]
netlink_unicast+0x547/0x7f0 net/netlink/af_netlink.c:1365
netlink_sendmsg+0x925/0xe30 net/netlink/af_netlink.c:1913
sock_sendmsg_nosec net/socket.c:724 [inline]
sock_sendmsg+0xde/0x190 net/socket.c:747
__sys_sendto+0x23a/0x340 net/socket.c:2144
__do_sys_sendto net/socket.c:2156 [inline]
__se_sys_sendto net/socket.c:2152 [inline]
__x64_sys_sendto+0xe1/0x1b0 net/socket.c:2152
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

Freed by task 5268:
kasan_save_stack+0x22/0x40 mm/kasan/common.c:45
kasan_set_track+0x25/0x30 mm/kasan/common.c:52
kasan_save_free_info+0x2e/0x40 mm/kasan/generic.c:521
____kasan_slab_free mm/kasan/common.c:236 [inline]
____kasan_slab_free+0x160/0x1c0 mm/kasan/common.c:200
kasan_slab_free include/linux/kasan.h:162 [inline]
slab_free_hook mm/slub.c:1781 [inline]
slab_free_freelist_hook+0x8b/0x1c0 mm/slub.c:1807
slab_free mm/slub.c:3786 [inline]
__kmem_cache_free+0xaf/0x2d0 mm/slub.c:3799
kvfree+0x46/0x50 mm/util.c:650
device_release+0xa3/0x240 drivers/base/core.c:2484
kobject_cleanup lib/kobject.c:683 [inline]
kobject_release lib/kobject.c:714 [inline]
kref_put include/linux/kref.h:65 [inline]
kobject_put+0x1c2/0x4d0 lib/kobject.c:731
netdev_run_todo+0x762/0x1100 net/core/dev.c:10400
xfrmi_exit_batch_net+0x2c7/0x3e0 net/xfrm/xfrm_interface_core.c:984
ops_exit_list+0x125/0x170 net/core/net_namespace.c:175
cleanup_net+0x4ee/0xb10 net/core/net_namespace.c:614
process_one_work+0x99a/0x15e0 kernel/workqueue.c:2405
worker_thread+0x67d/0x10c0 kernel/workqueue.c:2552
kthread+0x344/0x440 kernel/kthread.c:379
ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308

The buggy address belongs to the object at ffff888034efa000
which belongs to the cache kmalloc-cg-4k of size 4096
The buggy address is located 232 bytes inside of
freed 4096-byte region [ffff888034efa000, ffff888034efb000)

The buggy address belongs to the physical page:
page:ffffea0000d3be00 refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x34ef8
head:ffffea0000d3be00 order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0
memcg:ffff88807d3a3741
flags: 0xfff00000010200(slab|head|node=0|zone=1|lastcpupid=0x7ff)
page_type: 0xffffffff()
raw: 00fff00000010200 ffff88801244f500 ffffea00005adc00 dead000000000002
raw: 0000000000000000 0000000000040004 00000001ffffffff ffff88807d3a3741
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 3, migratetype Unmovable, gfp_mask 0x1d20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC|__GFP_HARDWALL), pid 4668, tgid 4668 (dhcpcd), ts 914718336210, free_ts 914216762552
set_page_owner include/linux/page_owner.h:31 [inline]
post_alloc_hook+0x2db/0x350 mm/page_alloc.c:1731
prep_new_page mm/page_alloc.c:1738 [inline]
get_page_from_freelist+0xf41/0x2c00 mm/page_alloc.c:3502
__alloc_pages+0x1cb/0x4a0 mm/page_alloc.c:4768
alloc_pages+0x1aa/0x270 mm/mempolicy.c:2279
alloc_slab_page mm/slub.c:1851 [inline]
allocate_slab+0x25f/0x390 mm/slub.c:1998
new_slab mm/slub.c:2051 [inline]
___slab_alloc+0xa91/0x1400 mm/slub.c:3192
__slab_alloc.constprop.0+0x56/0xa0 mm/slub.c:3291
__slab_alloc_node mm/slub.c:3344 [inline]
slab_alloc_node mm/slub.c:3441 [inline]
__kmem_cache_alloc_node+0x136/0x320 mm/slub.c:3490
__do_kmalloc_node mm/slab_common.c:965 [inline]
__kmalloc_node+0x51/0x1a0 mm/slab_common.c:973
kmalloc_node include/linux/slab.h:579 [inline]
kvmalloc_node+0xa2/0x1a0 mm/util.c:604
kvmalloc include/linux/slab.h:697 [inline]
seq_buf_alloc fs/seq_file.c:38 [inline]
seq_read_iter+0x7fb/0x12d0 fs/seq_file.c:210
kernfs_fop_read_iter+0x4ce/0x690 fs/kernfs/file.c:279
call_read_iter include/linux/fs.h:1862 [inline]
new_sync_read fs/read_write.c:389 [inline]
vfs_read+0x4b1/0x8a0 fs/read_write.c:470
ksys_read+0x12b/0x250 fs/read_write.c:613
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
page last free stack trace:
reset_page_owner include/linux/page_owner.h:24 [inline]
free_pages_prepare mm/page_alloc.c:1302 [inline]
free_unref_page_prepare+0x62e/0xcb0 mm/page_alloc.c:2564
free_unref_page+0x33/0x370 mm/page_alloc.c:2659
__unfreeze_partials+0x17c/0x1a0 mm/slub.c:2636
qlink_free mm/kasan/quarantine.c:166 [inline]
qlist_free_all+0x6a/0x170 mm/kasan/quarantine.c:185
kasan_quarantine_reduce+0x195/0x220 mm/kasan/quarantine.c:292
__kasan_slab_alloc+0x63/0x90 mm/kasan/common.c:305
kasan_slab_alloc include/linux/kasan.h:186 [inline]
slab_post_alloc_hook mm/slab.h:711 [inline]
slab_alloc_node mm/slub.c:3451 [inline]
slab_alloc mm/slub.c:3459 [inline]
__kmem_cache_alloc_lru mm/slub.c:3466 [inline]
kmem_cache_alloc+0x17c/0x3b0 mm/slub.c:3475
getname_flags.part.0+0x50/0x4f0 fs/namei.c:140
getname_flags+0x9e/0xe0 include/linux/audit.h:321
vfs_fstatat+0x77/0xb0 fs/stat.c:275
__do_sys_newfstatat+0x8a/0x110 fs/stat.c:446
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x39/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd

Memory state around the buggy address:
ffff888034ef9f80: fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc fc
ffff888034efa000: fa fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff888034efa080: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
^
ffff888034efa100: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
ffff888034efa180: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the bug is already fixed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to change bug's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the bug is a duplicate of another bug, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup


2023-05-12 10:24:00

by syzbot

[permalink] [raw]
Subject: Re: [syzbot] [rdma?] KASAN: slab-use-after-free Read in siw_query_port

>
>
> On 5/12/23 15:10, syzbot wrote:
>> Hello,
>>
>> syzbot found the following issue on:
>>
>> HEAD commit: 16a8829130ca nfs: fix another case of NULL/IS_ERR confusio..
>> git tree: upstream
>> console output: https://syzkaller.appspot.com/x/log.txt?x=162c0566280000
>> kernel config: https://syzkaller.appspot.com/x/.config?x=8bc832f563d8bf38
>> dashboard link: https://syzkaller.appspot.com/bug?extid=79f283f1f4ccc6e8b624
>> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
>>
>> Unfortunately, I don't have any reproducer for this issue yet.
>>
>> Downloadable assets:
>> disk image: https://storage.googleapis.com/syzbot-assets/f8c18a31ba47/disk-16a88291.raw.xz
>> vmlinux: https://storage.googleapis.com/syzbot-assets/03a18f29b7e7/vmlinux-16a88291.xz
>> kernel image: https://storage.googleapis.com/syzbot-assets/1db2407ade1e/bzImage-16a88291.xz
>>
>> IMPORTANT: if you fix the issue, please add the following tag to the commit:
>> Reported-by: [email protected]
>>
>> xfrm0 speed is unknown, defaulting to 1000
>> ==================================================================
>> BUG: KASAN: slab-use-after-free in siw_query_port+0x37b/0x3e0 drivers/infiniband/sw/siw/siw_verbs.c:177
>> Read of size 4 at addr ffff888034efa0e8 by task kworker/1:4/24211
>>
>> CPU: 1 PID: 24211 Comm: kworker/1:4 Not tainted 6.4.0-rc1-syzkaller-00012-g16a8829130ca #0
>> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/14/2023
>> Workqueue: infiniband ib_cache_event_task
>> Call Trace:
>> <TASK>
>> __dump_stack lib/dump_stack.c:88 [inline]
>> dump_stack_lvl+0xd9/0x150 lib/dump_stack.c:106
>> print_address_description.constprop.0+0x2c/0x3c0 mm/kasan/report.c:351
>> print_report mm/kasan/report.c:462 [inline]
>> kasan_report+0x11c/0x130 mm/kasan/report.c:572
>> siw_query_port+0x37b/0x3e0 drivers/infiniband/sw/siw/siw_verbs.c:177
>> iw_query_port drivers/infiniband/core/device.c:2049 [inline]
>> ib_query_port drivers/infiniband/core/device.c:2090 [inline]
>> ib_query_port+0x3c4/0x8f0 drivers/infiniband/core/device.c:2082
>> ib_cache_update.part.0+0xcf/0x920 drivers/infiniband/core/cache.c:1487
>> ib_cache_update drivers/infiniband/core/cache.c:1561 [inline]
>> ib_cache_event_task+0x1b1/0x270 drivers/infiniband/core/cache.c:1561
>> process_one_work+0x99a/0x15e0 kernel/workqueue.c:2405
>> worker_thread+0x67d/0x10c0 kernel/workqueue.c:2552
>> kthread+0x344/0x440 kernel/kthread.c:379
>> ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
>> </TASK>
>
> This might be similar as 390d3fdcae2d,  let me play with syzbot a bit ????
>
> #syz test: git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git

This crash does not have a reproducer. I cannot test it.

> for-rc
>
> diff --git a/drivers/infiniband/core/device.c
> b/drivers/infiniband/core/device.c
> index a666847bd714..9dd59f8d5f05 100644
> --- a/drivers/infiniband/core/device.c
> +++ b/drivers/infiniband/core/device.c
> @@ -2016,6 +2016,7 @@static int iw_query_port(struct ib_device *device,
> {
>        struct in_device *inetdev;
>        struct net_device *netdev;
> +int ret;
>
>        memset(port_attr, 0, sizeof(*port_attr));
>
> @@ -2045,8 +2046,9 @@static int iw_query_port(struct ib_device *device,
>                rcu_read_unlock();
>        }
>
> +ret = device->ops.query_port(device, port_num, port_attr);
>        dev_put(netdev);
> -       return device->ops.query_port(device, port_num, port_attr);
> +return ret;
> }
>
> static int __ib_query_port(struct ib_device *device,

2023-05-12 10:25:56

by Guoqing Jiang

[permalink] [raw]
Subject: Re: [syzbot] [rdma?] KASAN: slab-use-after-free Read in siw_query_port



On 5/12/23 15:10, syzbot wrote:
> Hello,
>
> syzbot found the following issue on:
>
> HEAD commit: 16a8829130ca nfs: fix another case of NULL/IS_ERR confusio..
> git tree: upstream
> console output: https://syzkaller.appspot.com/x/log.txt?x=162c0566280000
> kernel config: https://syzkaller.appspot.com/x/.config?x=8bc832f563d8bf38
> dashboard link: https://syzkaller.appspot.com/bug?extid=79f283f1f4ccc6e8b624
> compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
>
> Unfortunately, I don't have any reproducer for this issue yet.
>
> Downloadable assets:
> disk image: https://storage.googleapis.com/syzbot-assets/f8c18a31ba47/disk-16a88291.raw.xz
> vmlinux: https://storage.googleapis.com/syzbot-assets/03a18f29b7e7/vmlinux-16a88291.xz
> kernel image: https://storage.googleapis.com/syzbot-assets/1db2407ade1e/bzImage-16a88291.xz
>
> IMPORTANT: if you fix the issue, please add the following tag to the commit:
> Reported-by: [email protected]
>
> xfrm0 speed is unknown, defaulting to 1000
> ==================================================================
> BUG: KASAN: slab-use-after-free in siw_query_port+0x37b/0x3e0 drivers/infiniband/sw/siw/siw_verbs.c:177
> Read of size 4 at addr ffff888034efa0e8 by task kworker/1:4/24211
>
> CPU: 1 PID: 24211 Comm: kworker/1:4 Not tainted 6.4.0-rc1-syzkaller-00012-g16a8829130ca #0
> Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/14/2023
> Workqueue: infiniband ib_cache_event_task
> Call Trace:
> <TASK>
> __dump_stack lib/dump_stack.c:88 [inline]
> dump_stack_lvl+0xd9/0x150 lib/dump_stack.c:106
> print_address_description.constprop.0+0x2c/0x3c0 mm/kasan/report.c:351
> print_report mm/kasan/report.c:462 [inline]
> kasan_report+0x11c/0x130 mm/kasan/report.c:572
> siw_query_port+0x37b/0x3e0 drivers/infiniband/sw/siw/siw_verbs.c:177
> iw_query_port drivers/infiniband/core/device.c:2049 [inline]
> ib_query_port drivers/infiniband/core/device.c:2090 [inline]
> ib_query_port+0x3c4/0x8f0 drivers/infiniband/core/device.c:2082
> ib_cache_update.part.0+0xcf/0x920 drivers/infiniband/core/cache.c:1487
> ib_cache_update drivers/infiniband/core/cache.c:1561 [inline]
> ib_cache_event_task+0x1b1/0x270 drivers/infiniband/core/cache.c:1561
> process_one_work+0x99a/0x15e0 kernel/workqueue.c:2405
> worker_thread+0x67d/0x10c0 kernel/workqueue.c:2552
> kthread+0x344/0x440 kernel/kthread.c:379
> ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
> </TASK>

This might be similar as 390d3fdcae2d,  let me play with syzbot a bit ????

#syz test: git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git
for-rc

diff --git a/drivers/infiniband/core/device.c
b/drivers/infiniband/core/device.c
index a666847bd714..9dd59f8d5f05 100644
--- a/drivers/infiniband/core/device.c
+++ b/drivers/infiniband/core/device.c
@@ -2016,6 +2016,7 @@static int iw_query_port(struct ib_device *device,
{
       struct in_device *inetdev;
       struct net_device *netdev;
+int ret;

       memset(port_attr, 0, sizeof(*port_attr));

@@ -2045,8 +2046,9 @@static int iw_query_port(struct ib_device *device,
               rcu_read_unlock();
       }

+ret = device->ops.query_port(device, port_num, port_attr);
       dev_put(netdev);
-       return device->ops.query_port(device, port_num, port_attr);
+return ret;
}

static int __ib_query_port(struct ib_device *device,

2023-05-18 20:49:47

by Fedor Pchelkin

[permalink] [raw]
Subject: Re: [syzbot] [rdma?] KASAN: slab-use-after-free Read in siw_query_port

On Fri, May 12, 2023 at 06:16:26PM +0800, Guoqing Jiang wrote:
>
>
> On 5/12/23 15:10, syzbot wrote:
> > Hello,
> >
> > syzbot found the following issue on:
> >
> > HEAD commit: 16a8829130ca nfs: fix another case of NULL/IS_ERR confusio..
> > git tree: upstream
> > console output: https://syzkaller.appspot.com/x/log.txt?x=162c0566280000
> > kernel config: https://syzkaller.appspot.com/x/.config?x=8bc832f563d8bf38
> > dashboard link: https://syzkaller.appspot.com/bug?extid=79f283f1f4ccc6e8b624
> > compiler: gcc (Debian 10.2.1-6) 10.2.1 20210110, GNU ld (GNU Binutils for Debian) 2.35.2
> >
> > Unfortunately, I don't have any reproducer for this issue yet.
> >
> > Downloadable assets:
> > disk image: https://storage.googleapis.com/syzbot-assets/f8c18a31ba47/disk-16a88291.raw.xz
> > vmlinux: https://storage.googleapis.com/syzbot-assets/03a18f29b7e7/vmlinux-16a88291.xz
> > kernel image: https://storage.googleapis.com/syzbot-assets/1db2407ade1e/bzImage-16a88291.xz
> >
> > IMPORTANT: if you fix the issue, please add the following tag to the commit:
> > Reported-by: [email protected]
> >
> > xfrm0 speed is unknown, defaulting to 1000
> > ==================================================================
> > BUG: KASAN: slab-use-after-free in siw_query_port+0x37b/0x3e0 drivers/infiniband/sw/siw/siw_verbs.c:177
> > Read of size 4 at addr ffff888034efa0e8 by task kworker/1:4/24211
> >
> > CPU: 1 PID: 24211 Comm: kworker/1:4 Not tainted 6.4.0-rc1-syzkaller-00012-g16a8829130ca #0
> > Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 04/14/2023
> > Workqueue: infiniband ib_cache_event_task
> > Call Trace:
> > <TASK>
> > __dump_stack lib/dump_stack.c:88 [inline]
> > dump_stack_lvl+0xd9/0x150 lib/dump_stack.c:106
> > print_address_description.constprop.0+0x2c/0x3c0 mm/kasan/report.c:351
> > print_report mm/kasan/report.c:462 [inline]
> > kasan_report+0x11c/0x130 mm/kasan/report.c:572
> > siw_query_port+0x37b/0x3e0 drivers/infiniband/sw/siw/siw_verbs.c:177
> > iw_query_port drivers/infiniband/core/device.c:2049 [inline]
> > ib_query_port drivers/infiniband/core/device.c:2090 [inline]
> > ib_query_port+0x3c4/0x8f0 drivers/infiniband/core/device.c:2082
> > ib_cache_update.part.0+0xcf/0x920 drivers/infiniband/core/cache.c:1487
> > ib_cache_update drivers/infiniband/core/cache.c:1561 [inline]
> > ib_cache_event_task+0x1b1/0x270 drivers/infiniband/core/cache.c:1561
> > process_one_work+0x99a/0x15e0 kernel/workqueue.c:2405
> > worker_thread+0x67d/0x10c0 kernel/workqueue.c:2552
> > kthread+0x344/0x440 kernel/kthread.c:379
> > ret_from_fork+0x1f/0x30 arch/x86/entry/entry_64.S:308
> > </TASK>
>
> This might be similar as 390d3fdcae2d, let me play with syzbot a bit ????
>
> #syz test: git://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git
> for-rc
>
> diff --git a/drivers/infiniband/core/device.c
> b/drivers/infiniband/core/device.c
> index a666847bd714..9dd59f8d5f05 100644
> --- a/drivers/infiniband/core/device.c
> +++ b/drivers/infiniband/core/device.c
> @@ -2016,6 +2016,7 @@static int iw_query_port(struct ib_device *device,
> {
> struct in_device *inetdev;
> struct net_device *netdev;
> +int ret;
>
> memset(port_attr, 0, sizeof(*port_attr));
>
> @@ -2045,8 +2046,9 @@static int iw_query_port(struct ib_device *device,
> rcu_read_unlock();
> }
>
> +ret = device->ops.query_port(device, port_num, port_attr);
> dev_put(netdev);
> - return device->ops.query_port(device, port_num, port_attr);
> +return ret;
> }
>
> static int __ib_query_port(struct ib_device *device,

The issue can be reproduced with an attached C repro [1]. For much greater
possibility it is better to include a delay after dev_put() and before
calling query_port().

On our local Syzkaller instance the bug started to be caught after
266e9b3475ba ("RDMA/siw: Remove namespace check from siw_netdev_event()")
so CC'ing Tetsuo Handa if maybe he would be also interested in the bug.
This fix seems to be good and perhaps it just made a bigger opportunity
for the UAF bug to happen. Actually, the C repro was taken from there [2].

With your suggested solution the UAF is not reproduced. I don't know the
exact reasons why dev_put() was placed before calling query_port() but the
context implies that netdev can be freed in that period. And some of
->query_port() realizations may touch netdev. So it seems reasonable to
move ref count put after performing query_port().

[2]: https://syzkaller.appspot.com/bug?extid=5e70d01ee8985ae62a3b
[1]: https://syzkaller.appspot.com/text?tag=ReproC&x=1193ae64f00000

// https://syzkaller.appspot.com/bug?id=f35e12774aa3a888f874e944cda3d6e5c9e95b48
// autogenerated by syzkaller (https://github.com/google/syzkaller)

#define _GNU_SOURCE

#include <arpa/inet.h>
#include <dirent.h>
#include <endian.h>
#include <errno.h>
#include <fcntl.h>
#include <net/if.h>
#include <net/if_arp.h>
#include <netinet/in.h>
#include <sched.h>
#include <signal.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/ioctl.h>
#include <sys/mount.h>
#include <sys/prctl.h>
#include <sys/resource.h>
#include <sys/socket.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <sys/time.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#include <linux/capability.h>
#include <linux/genetlink.h>
#include <linux/if_addr.h>
#include <linux/if_ether.h>
#include <linux/if_link.h>
#include <linux/if_tun.h>
#include <linux/in6.h>
#include <linux/ip.h>
#include <linux/neighbour.h>
#include <linux/net.h>
#include <linux/netlink.h>
#include <linux/rtnetlink.h>
#include <linux/tcp.h>
#include <linux/veth.h>

static unsigned long long procid;

static void sleep_ms(uint64_t ms)
{
usleep(ms * 1000);
}

static uint64_t current_time_ms(void)
{
struct timespec ts;
if (clock_gettime(CLOCK_MONOTONIC, &ts))
exit(1);
return (uint64_t)ts.tv_sec * 1000 + (uint64_t)ts.tv_nsec / 1000000;
}

static bool write_file(const char* file, const char* what, ...)
{
char buf[1024];
va_list args;
va_start(args, what);
vsnprintf(buf, sizeof(buf), what, args);
va_end(args);
buf[sizeof(buf) - 1] = 0;
int len = strlen(buf);
int fd = open(file, O_WRONLY | O_CLOEXEC);
if (fd == -1)
return false;
if (write(fd, buf, len) != len) {
int err = errno;
close(fd);
errno = err;
return false;
}
close(fd);
return true;
}

struct nlmsg {
char* pos;
int nesting;
struct nlattr* nested[8];
char buf[4096];
};

static void netlink_init(struct nlmsg* nlmsg, int typ, int flags,
const void* data, int size)
{
memset(nlmsg, 0, sizeof(*nlmsg));
struct nlmsghdr* hdr = (struct nlmsghdr*)nlmsg->buf;
hdr->nlmsg_type = typ;
hdr->nlmsg_flags = NLM_F_REQUEST | NLM_F_ACK | flags;
memcpy(hdr + 1, data, size);
nlmsg->pos = (char*)(hdr + 1) + NLMSG_ALIGN(size);
}

static void netlink_attr(struct nlmsg* nlmsg, int typ, const void* data,
int size)
{
struct nlattr* attr = (struct nlattr*)nlmsg->pos;
attr->nla_len = sizeof(*attr) + size;
attr->nla_type = typ;
if (size > 0)
memcpy(attr + 1, data, size);
nlmsg->pos += NLMSG_ALIGN(attr->nla_len);
}

static void netlink_nest(struct nlmsg* nlmsg, int typ)
{
struct nlattr* attr = (struct nlattr*)nlmsg->pos;
attr->nla_type = typ;
nlmsg->pos += sizeof(*attr);
nlmsg->nested[nlmsg->nesting++] = attr;
}

static void netlink_done(struct nlmsg* nlmsg)
{
struct nlattr* attr = nlmsg->nested[--nlmsg->nesting];
attr->nla_len = nlmsg->pos - (char*)attr;
}

static int netlink_send_ext(struct nlmsg* nlmsg, int sock, uint16_t reply_type,
int* reply_len, bool dofail)
{
if (nlmsg->pos > nlmsg->buf + sizeof(nlmsg->buf) || nlmsg->nesting)
exit(1);
struct nlmsghdr* hdr = (struct nlmsghdr*)nlmsg->buf;
hdr->nlmsg_len = nlmsg->pos - nlmsg->buf;
struct sockaddr_nl addr;
memset(&addr, 0, sizeof(addr));
addr.nl_family = AF_NETLINK;
ssize_t n = sendto(sock, nlmsg->buf, hdr->nlmsg_len, 0,
(struct sockaddr*)&addr, sizeof(addr));
if (n != (ssize_t)hdr->nlmsg_len) {
if (dofail)
exit(1);
return -1;
}
n = recv(sock, nlmsg->buf, sizeof(nlmsg->buf), 0);
if (reply_len)
*reply_len = 0;
if (n < 0) {
if (dofail)
exit(1);
return -1;
}
if (n < (ssize_t)sizeof(struct nlmsghdr)) {
errno = EINVAL;
if (dofail)
exit(1);
return -1;
}
if (hdr->nlmsg_type == NLMSG_DONE)
return 0;
if (reply_len && hdr->nlmsg_type == reply_type) {
*reply_len = n;
return 0;
}
if (n < (ssize_t)(sizeof(struct nlmsghdr) + sizeof(struct nlmsgerr))) {
errno = EINVAL;
if (dofail)
exit(1);
return -1;
}
if (hdr->nlmsg_type != NLMSG_ERROR) {
errno = EINVAL;
if (dofail)
exit(1);
return -1;
}
errno = -((struct nlmsgerr*)(hdr + 1))->error;
return -errno;
}

static int netlink_send(struct nlmsg* nlmsg, int sock)
{
return netlink_send_ext(nlmsg, sock, 0, NULL, true);
}

static int netlink_query_family_id(struct nlmsg* nlmsg, int sock,
const char* family_name, bool dofail)
{
struct genlmsghdr genlhdr;
memset(&genlhdr, 0, sizeof(genlhdr));
genlhdr.cmd = CTRL_CMD_GETFAMILY;
netlink_init(nlmsg, GENL_ID_CTRL, 0, &genlhdr, sizeof(genlhdr));
netlink_attr(nlmsg, CTRL_ATTR_FAMILY_NAME, family_name,
strnlen(family_name, GENL_NAMSIZ - 1) + 1);
int n = 0;
int err = netlink_send_ext(nlmsg, sock, GENL_ID_CTRL, &n, dofail);
if (err < 0) {
return -1;
}
uint16_t id = 0;
struct nlattr* attr = (struct nlattr*)(nlmsg->buf + NLMSG_HDRLEN +
NLMSG_ALIGN(sizeof(genlhdr)));
for (; (char*)attr < nlmsg->buf + n;
attr = (struct nlattr*)((char*)attr + NLMSG_ALIGN(attr->nla_len))) {
if (attr->nla_type == CTRL_ATTR_FAMILY_ID) {
id = *(uint16_t*)(attr + 1);
break;
}
}
if (!id) {
errno = EINVAL;
return -1;
}
recv(sock, nlmsg->buf, sizeof(nlmsg->buf), 0);
return id;
}

static int netlink_next_msg(struct nlmsg* nlmsg, unsigned int offset,
unsigned int total_len)
{
struct nlmsghdr* hdr = (struct nlmsghdr*)(nlmsg->buf + offset);
if (offset == total_len || offset + hdr->nlmsg_len > total_len)
return -1;
return hdr->nlmsg_len;
}

static void netlink_add_device_impl(struct nlmsg* nlmsg, const char* type,
const char* name)
{
struct ifinfomsg hdr;
memset(&hdr, 0, sizeof(hdr));
netlink_init(nlmsg, RTM_NEWLINK, NLM_F_EXCL | NLM_F_CREATE, &hdr,
sizeof(hdr));
if (name)
netlink_attr(nlmsg, IFLA_IFNAME, name, strlen(name));
netlink_nest(nlmsg, IFLA_LINKINFO);
netlink_attr(nlmsg, IFLA_INFO_KIND, type, strlen(type));
}

static void netlink_add_device(struct nlmsg* nlmsg, int sock, const char* type,
const char* name)
{
netlink_add_device_impl(nlmsg, type, name);
netlink_done(nlmsg);
int err = netlink_send(nlmsg, sock);
if (err < 0) {
}
}

static void netlink_add_veth(struct nlmsg* nlmsg, int sock, const char* name,
const char* peer)
{
netlink_add_device_impl(nlmsg, "veth", name);
netlink_nest(nlmsg, IFLA_INFO_DATA);
netlink_nest(nlmsg, VETH_INFO_PEER);
nlmsg->pos += sizeof(struct ifinfomsg);
netlink_attr(nlmsg, IFLA_IFNAME, peer, strlen(peer));
netlink_done(nlmsg);
netlink_done(nlmsg);
netlink_done(nlmsg);
int err = netlink_send(nlmsg, sock);
if (err < 0) {
}
}

static void netlink_add_hsr(struct nlmsg* nlmsg, int sock, const char* name,
const char* slave1, const char* slave2)
{
netlink_add_device_impl(nlmsg, "hsr", name);
netlink_nest(nlmsg, IFLA_INFO_DATA);
int ifindex1 = if_nametoindex(slave1);
netlink_attr(nlmsg, IFLA_HSR_SLAVE1, &ifindex1, sizeof(ifindex1));
int ifindex2 = if_nametoindex(slave2);
netlink_attr(nlmsg, IFLA_HSR_SLAVE2, &ifindex2, sizeof(ifindex2));
netlink_done(nlmsg);
netlink_done(nlmsg);
int err = netlink_send(nlmsg, sock);
if (err < 0) {
}
}

static void netlink_add_linked(struct nlmsg* nlmsg, int sock, const char* type,
const char* name, const char* link)
{
netlink_add_device_impl(nlmsg, type, name);
netlink_done(nlmsg);
int ifindex = if_nametoindex(link);
netlink_attr(nlmsg, IFLA_LINK, &ifindex, sizeof(ifindex));
int err = netlink_send(nlmsg, sock);
if (err < 0) {
}
}

static void netlink_add_vlan(struct nlmsg* nlmsg, int sock, const char* name,
const char* link, uint16_t id, uint16_t proto)
{
netlink_add_device_impl(nlmsg, "vlan", name);
netlink_nest(nlmsg, IFLA_INFO_DATA);
netlink_attr(nlmsg, IFLA_VLAN_ID, &id, sizeof(id));
netlink_attr(nlmsg, IFLA_VLAN_PROTOCOL, &proto, sizeof(proto));
netlink_done(nlmsg);
netlink_done(nlmsg);
int ifindex = if_nametoindex(link);
netlink_attr(nlmsg, IFLA_LINK, &ifindex, sizeof(ifindex));
int err = netlink_send(nlmsg, sock);
if (err < 0) {
}
}

static void netlink_add_macvlan(struct nlmsg* nlmsg, int sock, const char* name,
const char* link)
{
netlink_add_device_impl(nlmsg, "macvlan", name);
netlink_nest(nlmsg, IFLA_INFO_DATA);
uint32_t mode = MACVLAN_MODE_BRIDGE;
netlink_attr(nlmsg, IFLA_MACVLAN_MODE, &mode, sizeof(mode));
netlink_done(nlmsg);
netlink_done(nlmsg);
int ifindex = if_nametoindex(link);
netlink_attr(nlmsg, IFLA_LINK, &ifindex, sizeof(ifindex));
int err = netlink_send(nlmsg, sock);
if (err < 0) {
}
}

static void netlink_add_geneve(struct nlmsg* nlmsg, int sock, const char* name,
uint32_t vni, struct in_addr* addr4,
struct in6_addr* addr6)
{
netlink_add_device_impl(nlmsg, "geneve", name);
netlink_nest(nlmsg, IFLA_INFO_DATA);
netlink_attr(nlmsg, IFLA_GENEVE_ID, &vni, sizeof(vni));
if (addr4)
netlink_attr(nlmsg, IFLA_GENEVE_REMOTE, addr4, sizeof(*addr4));
if (addr6)
netlink_attr(nlmsg, IFLA_GENEVE_REMOTE6, addr6, sizeof(*addr6));
netlink_done(nlmsg);
netlink_done(nlmsg);
int err = netlink_send(nlmsg, sock);
if (err < 0) {
}
}

#define IFLA_IPVLAN_FLAGS 2
#define IPVLAN_MODE_L3S 2
#undef IPVLAN_F_VEPA
#define IPVLAN_F_VEPA 2

static void netlink_add_ipvlan(struct nlmsg* nlmsg, int sock, const char* name,
const char* link, uint16_t mode, uint16_t flags)
{
netlink_add_device_impl(nlmsg, "ipvlan", name);
netlink_nest(nlmsg, IFLA_INFO_DATA);
netlink_attr(nlmsg, IFLA_IPVLAN_MODE, &mode, sizeof(mode));
netlink_attr(nlmsg, IFLA_IPVLAN_FLAGS, &flags, sizeof(flags));
netlink_done(nlmsg);
netlink_done(nlmsg);
int ifindex = if_nametoindex(link);
netlink_attr(nlmsg, IFLA_LINK, &ifindex, sizeof(ifindex));
int err = netlink_send(nlmsg, sock);
if (err < 0) {
}
}

static void netlink_device_change(struct nlmsg* nlmsg, int sock,
const char* name, bool up, const char* master,
const void* mac, int macsize,
const char* new_name)
{
struct ifinfomsg hdr;
memset(&hdr, 0, sizeof(hdr));
if (up)
hdr.ifi_flags = hdr.ifi_change = IFF_UP;
hdr.ifi_index = if_nametoindex(name);
netlink_init(nlmsg, RTM_NEWLINK, 0, &hdr, sizeof(hdr));
if (new_name)
netlink_attr(nlmsg, IFLA_IFNAME, new_name, strlen(new_name));
if (master) {
int ifindex = if_nametoindex(master);
netlink_attr(nlmsg, IFLA_MASTER, &ifindex, sizeof(ifindex));
}
if (macsize)
netlink_attr(nlmsg, IFLA_ADDRESS, mac, macsize);
int err = netlink_send(nlmsg, sock);
if (err < 0) {
}
}

static int netlink_add_addr(struct nlmsg* nlmsg, int sock, const char* dev,
const void* addr, int addrsize)
{
struct ifaddrmsg hdr;
memset(&hdr, 0, sizeof(hdr));
hdr.ifa_family = addrsize == 4 ? AF_INET : AF_INET6;
hdr.ifa_prefixlen = addrsize == 4 ? 24 : 120;
hdr.ifa_scope = RT_SCOPE_UNIVERSE;
hdr.ifa_index = if_nametoindex(dev);
netlink_init(nlmsg, RTM_NEWADDR, NLM_F_CREATE | NLM_F_REPLACE, &hdr,
sizeof(hdr));
netlink_attr(nlmsg, IFA_LOCAL, addr, addrsize);
netlink_attr(nlmsg, IFA_ADDRESS, addr, addrsize);
return netlink_send(nlmsg, sock);
}

static void netlink_add_addr4(struct nlmsg* nlmsg, int sock, const char* dev,
const char* addr)
{
struct in_addr in_addr;
inet_pton(AF_INET, addr, &in_addr);
int err = netlink_add_addr(nlmsg, sock, dev, &in_addr, sizeof(in_addr));
if (err < 0) {
}
}

static void netlink_add_addr6(struct nlmsg* nlmsg, int sock, const char* dev,
const char* addr)
{
struct in6_addr in6_addr;
inet_pton(AF_INET6, addr, &in6_addr);
int err = netlink_add_addr(nlmsg, sock, dev, &in6_addr, sizeof(in6_addr));
if (err < 0) {
}
}

static struct nlmsg nlmsg;

#define DEVLINK_FAMILY_NAME "devlink"

#define DEVLINK_CMD_PORT_GET 5
#define DEVLINK_ATTR_BUS_NAME 1
#define DEVLINK_ATTR_DEV_NAME 2
#define DEVLINK_ATTR_NETDEV_NAME 7

static struct nlmsg nlmsg2;

static void initialize_devlink_ports(const char* bus_name, const char* dev_name,
const char* netdev_prefix)
{
struct genlmsghdr genlhdr;
int len, total_len, id, err, offset;
uint16_t netdev_index;
int sock = socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC);
if (sock == -1)
exit(1);
int rtsock = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
if (rtsock == -1)
exit(1);
id = netlink_query_family_id(&nlmsg, sock, DEVLINK_FAMILY_NAME, true);
if (id == -1)
goto error;
memset(&genlhdr, 0, sizeof(genlhdr));
genlhdr.cmd = DEVLINK_CMD_PORT_GET;
netlink_init(&nlmsg, id, NLM_F_DUMP, &genlhdr, sizeof(genlhdr));
netlink_attr(&nlmsg, DEVLINK_ATTR_BUS_NAME, bus_name, strlen(bus_name) + 1);
netlink_attr(&nlmsg, DEVLINK_ATTR_DEV_NAME, dev_name, strlen(dev_name) + 1);
err = netlink_send_ext(&nlmsg, sock, id, &total_len, true);
if (err < 0) {
goto error;
}
offset = 0;
netdev_index = 0;
while ((len = netlink_next_msg(&nlmsg, offset, total_len)) != -1) {
struct nlattr* attr = (struct nlattr*)(nlmsg.buf + offset + NLMSG_HDRLEN +
NLMSG_ALIGN(sizeof(genlhdr)));
for (; (char*)attr < nlmsg.buf + offset + len;
attr = (struct nlattr*)((char*)attr + NLMSG_ALIGN(attr->nla_len))) {
if (attr->nla_type == DEVLINK_ATTR_NETDEV_NAME) {
char* port_name;
char netdev_name[IFNAMSIZ];
port_name = (char*)(attr + 1);
snprintf(netdev_name, sizeof(netdev_name), "%s%d", netdev_prefix,
netdev_index);
netlink_device_change(&nlmsg2, rtsock, port_name, true, 0, 0, 0,
netdev_name);
break;
}
}
offset += len;
netdev_index++;
}
error:
close(rtsock);
close(sock);
}

#define DEV_IPV4 "172.20.20.%d"
#define DEV_IPV6 "fe80::%02x"
#define DEV_MAC 0x00aaaaaaaaaa

static void netdevsim_add(unsigned int addr, unsigned int port_count)
{
char buf[16];
sprintf(buf, "%u %u", addr, port_count);
if (write_file("/sys/bus/netdevsim/new_device", buf)) {
snprintf(buf, sizeof(buf), "netdevsim%d", addr);
initialize_devlink_ports("netdevsim", buf, "netdevsim");
}
}

#define WG_GENL_NAME "wireguard"
enum wg_cmd {
WG_CMD_GET_DEVICE,
WG_CMD_SET_DEVICE,
};
enum wgdevice_attribute {
WGDEVICE_A_UNSPEC,
WGDEVICE_A_IFINDEX,
WGDEVICE_A_IFNAME,
WGDEVICE_A_PRIVATE_KEY,
WGDEVICE_A_PUBLIC_KEY,
WGDEVICE_A_FLAGS,
WGDEVICE_A_LISTEN_PORT,
WGDEVICE_A_FWMARK,
WGDEVICE_A_PEERS,
};
enum wgpeer_attribute {
WGPEER_A_UNSPEC,
WGPEER_A_PUBLIC_KEY,
WGPEER_A_PRESHARED_KEY,
WGPEER_A_FLAGS,
WGPEER_A_ENDPOINT,
WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL,
WGPEER_A_LAST_HANDSHAKE_TIME,
WGPEER_A_RX_BYTES,
WGPEER_A_TX_BYTES,
WGPEER_A_ALLOWEDIPS,
WGPEER_A_PROTOCOL_VERSION,
};
enum wgallowedip_attribute {
WGALLOWEDIP_A_UNSPEC,
WGALLOWEDIP_A_FAMILY,
WGALLOWEDIP_A_IPADDR,
WGALLOWEDIP_A_CIDR_MASK,
};

static void netlink_wireguard_setup(void)
{
const char ifname_a[] = "wg0";
const char ifname_b[] = "wg1";
const char ifname_c[] = "wg2";
const char private_a[] =
"\xa0\x5c\xa8\x4f\x6c\x9c\x8e\x38\x53\xe2\xfd\x7a\x70\xae\x0f\xb2\x0f\xa1"
"\x52\x60\x0c\xb0\x08\x45\x17\x4f\x08\x07\x6f\x8d\x78\x43";
const char private_b[] =
"\xb0\x80\x73\xe8\xd4\x4e\x91\xe3\xda\x92\x2c\x22\x43\x82\x44\xbb\x88\x5c"
"\x69\xe2\x69\xc8\xe9\xd8\x35\xb1\x14\x29\x3a\x4d\xdc\x6e";
const char private_c[] =
"\xa0\xcb\x87\x9a\x47\xf5\xbc\x64\x4c\x0e\x69\x3f\xa6\xd0\x31\xc7\x4a\x15"
"\x53\xb6\xe9\x01\xb9\xff\x2f\x51\x8c\x78\x04\x2f\xb5\x42";
const char public_a[] =
"\x97\x5c\x9d\x81\xc9\x83\xc8\x20\x9e\xe7\x81\x25\x4b\x89\x9f\x8e\xd9\x25"
"\xae\x9f\x09\x23\xc2\x3c\x62\xf5\x3c\x57\xcd\xbf\x69\x1c";
const char public_b[] =
"\xd1\x73\x28\x99\xf6\x11\xcd\x89\x94\x03\x4d\x7f\x41\x3d\xc9\x57\x63\x0e"
"\x54\x93\xc2\x85\xac\xa4\x00\x65\xcb\x63\x11\xbe\x69\x6b";
const char public_c[] =
"\xf4\x4d\xa3\x67\xa8\x8e\xe6\x56\x4f\x02\x02\x11\x45\x67\x27\x08\x2f\x5c"
"\xeb\xee\x8b\x1b\xf5\xeb\x73\x37\x34\x1b\x45\x9b\x39\x22";
const uint16_t listen_a = 20001;
const uint16_t listen_b = 20002;
const uint16_t listen_c = 20003;
const uint16_t af_inet = AF_INET;
const uint16_t af_inet6 = AF_INET6;
const struct sockaddr_in endpoint_b_v4 = {
.sin_family = AF_INET,
.sin_port = htons(listen_b),
.sin_addr = {htonl(INADDR_LOOPBACK)}};
const struct sockaddr_in endpoint_c_v4 = {
.sin_family = AF_INET,
.sin_port = htons(listen_c),
.sin_addr = {htonl(INADDR_LOOPBACK)}};
struct sockaddr_in6 endpoint_a_v6 = {.sin6_family = AF_INET6,
.sin6_port = htons(listen_a)};
endpoint_a_v6.sin6_addr = in6addr_loopback;
struct sockaddr_in6 endpoint_c_v6 = {.sin6_family = AF_INET6,
.sin6_port = htons(listen_c)};
endpoint_c_v6.sin6_addr = in6addr_loopback;
const struct in_addr first_half_v4 = {0};
const struct in_addr second_half_v4 = {(uint32_t)htonl(128 << 24)};
const struct in6_addr first_half_v6 = {{{0}}};
const struct in6_addr second_half_v6 = {{{0x80}}};
const uint8_t half_cidr = 1;
const uint16_t persistent_keepalives[] = {1, 3, 7, 9, 14, 19};
struct genlmsghdr genlhdr = {.cmd = WG_CMD_SET_DEVICE, .version = 1};
int sock;
int id, err;
sock = socket(AF_NETLINK, SOCK_RAW, NETLINK_GENERIC);
if (sock == -1) {
return;
}
id = netlink_query_family_id(&nlmsg, sock, WG_GENL_NAME, true);
if (id == -1)
goto error;
netlink_init(&nlmsg, id, 0, &genlhdr, sizeof(genlhdr));
netlink_attr(&nlmsg, WGDEVICE_A_IFNAME, ifname_a, strlen(ifname_a) + 1);
netlink_attr(&nlmsg, WGDEVICE_A_PRIVATE_KEY, private_a, 32);
netlink_attr(&nlmsg, WGDEVICE_A_LISTEN_PORT, &listen_a, 2);
netlink_nest(&nlmsg, NLA_F_NESTED | WGDEVICE_A_PEERS);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGPEER_A_PUBLIC_KEY, public_b, 32);
netlink_attr(&nlmsg, WGPEER_A_ENDPOINT, &endpoint_b_v4,
sizeof(endpoint_b_v4));
netlink_attr(&nlmsg, WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL,
&persistent_keepalives[0], 2);
netlink_nest(&nlmsg, NLA_F_NESTED | WGPEER_A_ALLOWEDIPS);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGALLOWEDIP_A_FAMILY, &af_inet, 2);
netlink_attr(&nlmsg, WGALLOWEDIP_A_IPADDR, &first_half_v4,
sizeof(first_half_v4));
netlink_attr(&nlmsg, WGALLOWEDIP_A_CIDR_MASK, &half_cidr, 1);
netlink_done(&nlmsg);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGALLOWEDIP_A_FAMILY, &af_inet6, 2);
netlink_attr(&nlmsg, WGALLOWEDIP_A_IPADDR, &first_half_v6,
sizeof(first_half_v6));
netlink_attr(&nlmsg, WGALLOWEDIP_A_CIDR_MASK, &half_cidr, 1);
netlink_done(&nlmsg);
netlink_done(&nlmsg);
netlink_done(&nlmsg);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGPEER_A_PUBLIC_KEY, public_c, 32);
netlink_attr(&nlmsg, WGPEER_A_ENDPOINT, &endpoint_c_v6,
sizeof(endpoint_c_v6));
netlink_attr(&nlmsg, WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL,
&persistent_keepalives[1], 2);
netlink_nest(&nlmsg, NLA_F_NESTED | WGPEER_A_ALLOWEDIPS);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGALLOWEDIP_A_FAMILY, &af_inet, 2);
netlink_attr(&nlmsg, WGALLOWEDIP_A_IPADDR, &second_half_v4,
sizeof(second_half_v4));
netlink_attr(&nlmsg, WGALLOWEDIP_A_CIDR_MASK, &half_cidr, 1);
netlink_done(&nlmsg);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGALLOWEDIP_A_FAMILY, &af_inet6, 2);
netlink_attr(&nlmsg, WGALLOWEDIP_A_IPADDR, &second_half_v6,
sizeof(second_half_v6));
netlink_attr(&nlmsg, WGALLOWEDIP_A_CIDR_MASK, &half_cidr, 1);
netlink_done(&nlmsg);
netlink_done(&nlmsg);
netlink_done(&nlmsg);
netlink_done(&nlmsg);
err = netlink_send(&nlmsg, sock);
if (err < 0) {
}
netlink_init(&nlmsg, id, 0, &genlhdr, sizeof(genlhdr));
netlink_attr(&nlmsg, WGDEVICE_A_IFNAME, ifname_b, strlen(ifname_b) + 1);
netlink_attr(&nlmsg, WGDEVICE_A_PRIVATE_KEY, private_b, 32);
netlink_attr(&nlmsg, WGDEVICE_A_LISTEN_PORT, &listen_b, 2);
netlink_nest(&nlmsg, NLA_F_NESTED | WGDEVICE_A_PEERS);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGPEER_A_PUBLIC_KEY, public_a, 32);
netlink_attr(&nlmsg, WGPEER_A_ENDPOINT, &endpoint_a_v6,
sizeof(endpoint_a_v6));
netlink_attr(&nlmsg, WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL,
&persistent_keepalives[2], 2);
netlink_nest(&nlmsg, NLA_F_NESTED | WGPEER_A_ALLOWEDIPS);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGALLOWEDIP_A_FAMILY, &af_inet, 2);
netlink_attr(&nlmsg, WGALLOWEDIP_A_IPADDR, &first_half_v4,
sizeof(first_half_v4));
netlink_attr(&nlmsg, WGALLOWEDIP_A_CIDR_MASK, &half_cidr, 1);
netlink_done(&nlmsg);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGALLOWEDIP_A_FAMILY, &af_inet6, 2);
netlink_attr(&nlmsg, WGALLOWEDIP_A_IPADDR, &first_half_v6,
sizeof(first_half_v6));
netlink_attr(&nlmsg, WGALLOWEDIP_A_CIDR_MASK, &half_cidr, 1);
netlink_done(&nlmsg);
netlink_done(&nlmsg);
netlink_done(&nlmsg);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGPEER_A_PUBLIC_KEY, public_c, 32);
netlink_attr(&nlmsg, WGPEER_A_ENDPOINT, &endpoint_c_v4,
sizeof(endpoint_c_v4));
netlink_attr(&nlmsg, WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL,
&persistent_keepalives[3], 2);
netlink_nest(&nlmsg, NLA_F_NESTED | WGPEER_A_ALLOWEDIPS);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGALLOWEDIP_A_FAMILY, &af_inet, 2);
netlink_attr(&nlmsg, WGALLOWEDIP_A_IPADDR, &second_half_v4,
sizeof(second_half_v4));
netlink_attr(&nlmsg, WGALLOWEDIP_A_CIDR_MASK, &half_cidr, 1);
netlink_done(&nlmsg);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGALLOWEDIP_A_FAMILY, &af_inet6, 2);
netlink_attr(&nlmsg, WGALLOWEDIP_A_IPADDR, &second_half_v6,
sizeof(second_half_v6));
netlink_attr(&nlmsg, WGALLOWEDIP_A_CIDR_MASK, &half_cidr, 1);
netlink_done(&nlmsg);
netlink_done(&nlmsg);
netlink_done(&nlmsg);
netlink_done(&nlmsg);
err = netlink_send(&nlmsg, sock);
if (err < 0) {
}
netlink_init(&nlmsg, id, 0, &genlhdr, sizeof(genlhdr));
netlink_attr(&nlmsg, WGDEVICE_A_IFNAME, ifname_c, strlen(ifname_c) + 1);
netlink_attr(&nlmsg, WGDEVICE_A_PRIVATE_KEY, private_c, 32);
netlink_attr(&nlmsg, WGDEVICE_A_LISTEN_PORT, &listen_c, 2);
netlink_nest(&nlmsg, NLA_F_NESTED | WGDEVICE_A_PEERS);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGPEER_A_PUBLIC_KEY, public_a, 32);
netlink_attr(&nlmsg, WGPEER_A_ENDPOINT, &endpoint_a_v6,
sizeof(endpoint_a_v6));
netlink_attr(&nlmsg, WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL,
&persistent_keepalives[4], 2);
netlink_nest(&nlmsg, NLA_F_NESTED | WGPEER_A_ALLOWEDIPS);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGALLOWEDIP_A_FAMILY, &af_inet, 2);
netlink_attr(&nlmsg, WGALLOWEDIP_A_IPADDR, &first_half_v4,
sizeof(first_half_v4));
netlink_attr(&nlmsg, WGALLOWEDIP_A_CIDR_MASK, &half_cidr, 1);
netlink_done(&nlmsg);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGALLOWEDIP_A_FAMILY, &af_inet6, 2);
netlink_attr(&nlmsg, WGALLOWEDIP_A_IPADDR, &first_half_v6,
sizeof(first_half_v6));
netlink_attr(&nlmsg, WGALLOWEDIP_A_CIDR_MASK, &half_cidr, 1);
netlink_done(&nlmsg);
netlink_done(&nlmsg);
netlink_done(&nlmsg);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGPEER_A_PUBLIC_KEY, public_b, 32);
netlink_attr(&nlmsg, WGPEER_A_ENDPOINT, &endpoint_b_v4,
sizeof(endpoint_b_v4));
netlink_attr(&nlmsg, WGPEER_A_PERSISTENT_KEEPALIVE_INTERVAL,
&persistent_keepalives[5], 2);
netlink_nest(&nlmsg, NLA_F_NESTED | WGPEER_A_ALLOWEDIPS);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGALLOWEDIP_A_FAMILY, &af_inet, 2);
netlink_attr(&nlmsg, WGALLOWEDIP_A_IPADDR, &second_half_v4,
sizeof(second_half_v4));
netlink_attr(&nlmsg, WGALLOWEDIP_A_CIDR_MASK, &half_cidr, 1);
netlink_done(&nlmsg);
netlink_nest(&nlmsg, NLA_F_NESTED | 0);
netlink_attr(&nlmsg, WGALLOWEDIP_A_FAMILY, &af_inet6, 2);
netlink_attr(&nlmsg, WGALLOWEDIP_A_IPADDR, &second_half_v6,
sizeof(second_half_v6));
netlink_attr(&nlmsg, WGALLOWEDIP_A_CIDR_MASK, &half_cidr, 1);
netlink_done(&nlmsg);
netlink_done(&nlmsg);
netlink_done(&nlmsg);
netlink_done(&nlmsg);
err = netlink_send(&nlmsg, sock);
if (err < 0) {
}

error:
close(sock);
}
static void initialize_netdevices(void)
{
char netdevsim[16];
sprintf(netdevsim, "netdevsim%d", (int)procid);
struct {
const char* type;
const char* dev;
} devtypes[] = {
{"ip6gretap", "ip6gretap0"}, {"bridge", "bridge0"},
{"vcan", "vcan0"}, {"bond", "bond0"},
{"team", "team0"}, {"dummy", "dummy0"},
{"nlmon", "nlmon0"}, {"caif", "caif0"},
{"batadv", "batadv0"}, {"vxcan", "vxcan1"},
{"netdevsim", netdevsim}, {"veth", 0},
{"xfrm", "xfrm0"}, {"wireguard", "wg0"},
{"wireguard", "wg1"}, {"wireguard", "wg2"},
};
const char* devmasters[] = {"bridge", "bond", "team", "batadv"};
struct {
const char* name;
int macsize;
bool noipv6;
} devices[] = {
{"lo", ETH_ALEN},
{"sit0", 0},
{"bridge0", ETH_ALEN},
{"vcan0", 0, true},
{"tunl0", 0},
{"gre0", 0},
{"gretap0", ETH_ALEN},
{"ip_vti0", 0},
{"ip6_vti0", 0},
{"ip6tnl0", 0},
{"ip6gre0", 0},
{"ip6gretap0", ETH_ALEN},
{"erspan0", ETH_ALEN},
{"bond0", ETH_ALEN},
{"veth0", ETH_ALEN},
{"veth1", ETH_ALEN},
{"team0", ETH_ALEN},
{"veth0_to_bridge", ETH_ALEN},
{"veth1_to_bridge", ETH_ALEN},
{"veth0_to_bond", ETH_ALEN},
{"veth1_to_bond", ETH_ALEN},
{"veth0_to_team", ETH_ALEN},
{"veth1_to_team", ETH_ALEN},
{"veth0_to_hsr", ETH_ALEN},
{"veth1_to_hsr", ETH_ALEN},
{"hsr0", 0},
{"dummy0", ETH_ALEN},
{"nlmon0", 0},
{"vxcan0", 0, true},
{"vxcan1", 0, true},
{"caif0", ETH_ALEN},
{"batadv0", ETH_ALEN},
{netdevsim, ETH_ALEN},
{"xfrm0", ETH_ALEN},
{"veth0_virt_wifi", ETH_ALEN},
{"veth1_virt_wifi", ETH_ALEN},
{"virt_wifi0", ETH_ALEN},
{"veth0_vlan", ETH_ALEN},
{"veth1_vlan", ETH_ALEN},
{"vlan0", ETH_ALEN},
{"vlan1", ETH_ALEN},
{"macvlan0", ETH_ALEN},
{"macvlan1", ETH_ALEN},
{"ipvlan0", ETH_ALEN},
{"ipvlan1", ETH_ALEN},
{"veth0_macvtap", ETH_ALEN},
{"veth1_macvtap", ETH_ALEN},
{"macvtap0", ETH_ALEN},
{"macsec0", ETH_ALEN},
{"veth0_to_batadv", ETH_ALEN},
{"veth1_to_batadv", ETH_ALEN},
{"batadv_slave_0", ETH_ALEN},
{"batadv_slave_1", ETH_ALEN},
{"geneve0", ETH_ALEN},
{"geneve1", ETH_ALEN},
{"wg0", 0},
{"wg1", 0},
{"wg2", 0},
};
int sock = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
if (sock == -1)
exit(1);
unsigned i;
for (i = 0; i < sizeof(devtypes) / sizeof(devtypes[0]); i++)
netlink_add_device(&nlmsg, sock, devtypes[i].type, devtypes[i].dev);
for (i = 0; i < sizeof(devmasters) / (sizeof(devmasters[0])); i++) {
char master[32], slave0[32], veth0[32], slave1[32], veth1[32];
sprintf(slave0, "%s_slave_0", devmasters[i]);
sprintf(veth0, "veth0_to_%s", devmasters[i]);
netlink_add_veth(&nlmsg, sock, slave0, veth0);
sprintf(slave1, "%s_slave_1", devmasters[i]);
sprintf(veth1, "veth1_to_%s", devmasters[i]);
netlink_add_veth(&nlmsg, sock, slave1, veth1);
sprintf(master, "%s0", devmasters[i]);
netlink_device_change(&nlmsg, sock, slave0, false, master, 0, 0, NULL);
netlink_device_change(&nlmsg, sock, slave1, false, master, 0, 0, NULL);
}
netlink_device_change(&nlmsg, sock, "bridge_slave_0", true, 0, 0, 0, NULL);
netlink_device_change(&nlmsg, sock, "bridge_slave_1", true, 0, 0, 0, NULL);
netlink_add_veth(&nlmsg, sock, "hsr_slave_0", "veth0_to_hsr");
netlink_add_veth(&nlmsg, sock, "hsr_slave_1", "veth1_to_hsr");
netlink_add_hsr(&nlmsg, sock, "hsr0", "hsr_slave_0", "hsr_slave_1");
netlink_device_change(&nlmsg, sock, "hsr_slave_0", true, 0, 0, 0, NULL);
netlink_device_change(&nlmsg, sock, "hsr_slave_1", true, 0, 0, 0, NULL);
netlink_add_veth(&nlmsg, sock, "veth0_virt_wifi", "veth1_virt_wifi");
netlink_add_linked(&nlmsg, sock, "virt_wifi", "virt_wifi0",
"veth1_virt_wifi");
netlink_add_veth(&nlmsg, sock, "veth0_vlan", "veth1_vlan");
netlink_add_vlan(&nlmsg, sock, "vlan0", "veth0_vlan", 0, htons(ETH_P_8021Q));
netlink_add_vlan(&nlmsg, sock, "vlan1", "veth0_vlan", 1, htons(ETH_P_8021AD));
netlink_add_macvlan(&nlmsg, sock, "macvlan0", "veth1_vlan");
netlink_add_macvlan(&nlmsg, sock, "macvlan1", "veth1_vlan");
netlink_add_ipvlan(&nlmsg, sock, "ipvlan0", "veth0_vlan", IPVLAN_MODE_L2, 0);
netlink_add_ipvlan(&nlmsg, sock, "ipvlan1", "veth0_vlan", IPVLAN_MODE_L3S,
IPVLAN_F_VEPA);
netlink_add_veth(&nlmsg, sock, "veth0_macvtap", "veth1_macvtap");
netlink_add_linked(&nlmsg, sock, "macvtap", "macvtap0", "veth0_macvtap");
netlink_add_linked(&nlmsg, sock, "macsec", "macsec0", "veth1_macvtap");
char addr[32];
sprintf(addr, DEV_IPV4, 14 + 10);
struct in_addr geneve_addr4;
if (inet_pton(AF_INET, addr, &geneve_addr4) <= 0)
exit(1);
struct in6_addr geneve_addr6;
if (inet_pton(AF_INET6, "fc00::01", &geneve_addr6) <= 0)
exit(1);
netlink_add_geneve(&nlmsg, sock, "geneve0", 0, &geneve_addr4, 0);
netlink_add_geneve(&nlmsg, sock, "geneve1", 1, 0, &geneve_addr6);
netdevsim_add((int)procid, 4);
netlink_wireguard_setup();
for (i = 0; i < sizeof(devices) / (sizeof(devices[0])); i++) {
char addr[32];
sprintf(addr, DEV_IPV4, i + 10);
netlink_add_addr4(&nlmsg, sock, devices[i].name, addr);
if (!devices[i].noipv6) {
sprintf(addr, DEV_IPV6, i + 10);
netlink_add_addr6(&nlmsg, sock, devices[i].name, addr);
}
uint64_t macaddr = DEV_MAC + ((i + 10ull) << 40);
netlink_device_change(&nlmsg, sock, devices[i].name, true, 0, &macaddr,
devices[i].macsize, NULL);
}
close(sock);
}
static void initialize_netdevices_init(void)
{
int sock = socket(AF_NETLINK, SOCK_RAW, NETLINK_ROUTE);
if (sock == -1)
exit(1);
struct {
const char* type;
int macsize;
bool noipv6;
bool noup;
} devtypes[] = {
{"nr", 7, true},
{"rose", 5, true, true},
};
unsigned i;
for (i = 0; i < sizeof(devtypes) / sizeof(devtypes[0]); i++) {
char dev[32], addr[32];
sprintf(dev, "%s%d", devtypes[i].type, (int)procid);
sprintf(addr, "172.30.%d.%d", i, (int)procid + 1);
netlink_add_addr4(&nlmsg, sock, dev, addr);
if (!devtypes[i].noipv6) {
sprintf(addr, "fe88::%02x:%02x", i, (int)procid + 1);
netlink_add_addr6(&nlmsg, sock, dev, addr);
}
int macsize = devtypes[i].macsize;
uint64_t macaddr = 0xbbbbbb +
((unsigned long long)i << (8 * (macsize - 2))) +
(procid << (8 * (macsize - 1)));
netlink_device_change(&nlmsg, sock, dev, !devtypes[i].noup, 0, &macaddr,
macsize, NULL);
}
close(sock);
}

#define MAX_FDS 30

static void setup_common()
{
if (mount(0, "/sys/fs/fuse/connections", "fusectl", 0, 0)) {
}
}

static void setup_binderfs()
{
if (mkdir("/dev/binderfs", 0777)) {
}
if (mount("binder", "/dev/binderfs", "binder", 0, NULL)) {
}
if (symlink("/dev/binderfs", "./binderfs")) {
}
}

static void loop();

static void sandbox_common()
{
prctl(PR_SET_PDEATHSIG, SIGKILL, 0, 0, 0);
setsid();
struct rlimit rlim;
rlim.rlim_cur = rlim.rlim_max = (200 << 20);
setrlimit(RLIMIT_AS, &rlim);
rlim.rlim_cur = rlim.rlim_max = 32 << 20;
setrlimit(RLIMIT_MEMLOCK, &rlim);
rlim.rlim_cur = rlim.rlim_max = 136 << 20;
setrlimit(RLIMIT_FSIZE, &rlim);
rlim.rlim_cur = rlim.rlim_max = 1 << 20;
setrlimit(RLIMIT_STACK, &rlim);
rlim.rlim_cur = rlim.rlim_max = 0;
setrlimit(RLIMIT_CORE, &rlim);
rlim.rlim_cur = rlim.rlim_max = 256;
setrlimit(RLIMIT_NOFILE, &rlim);
if (unshare(CLONE_NEWNS)) {
}
if (mount(NULL, "/", NULL, MS_REC | MS_PRIVATE, NULL)) {
}
if (unshare(CLONE_NEWIPC)) {
}
if (unshare(0x02000000)) {
}
if (unshare(CLONE_NEWUTS)) {
}
if (unshare(CLONE_SYSVSEM)) {
}
typedef struct {
const char* name;
const char* value;
} sysctl_t;
static const sysctl_t sysctls[] = {
{"/proc/sys/kernel/shmmax", "16777216"},
{"/proc/sys/kernel/shmall", "536870912"},
{"/proc/sys/kernel/shmmni", "1024"},
{"/proc/sys/kernel/msgmax", "8192"},
{"/proc/sys/kernel/msgmni", "1024"},
{"/proc/sys/kernel/msgmnb", "1024"},
{"/proc/sys/kernel/sem", "1024 1048576 500 1024"},
};
unsigned i;
for (i = 0; i < sizeof(sysctls) / sizeof(sysctls[0]); i++)
write_file(sysctls[i].name, sysctls[i].value);
}

static int wait_for_loop(int pid)
{
if (pid < 0)
exit(1);
int status = 0;
while (waitpid(-1, &status, __WALL) != pid) {
}
return WEXITSTATUS(status);
}

static void drop_caps(void)
{
struct __user_cap_header_struct cap_hdr = {};
struct __user_cap_data_struct cap_data[2] = {};
cap_hdr.version = _LINUX_CAPABILITY_VERSION_3;
cap_hdr.pid = getpid();
if (syscall(SYS_capget, &cap_hdr, &cap_data))
exit(1);
const int drop = (1 << CAP_SYS_PTRACE) | (1 << CAP_SYS_NICE);
cap_data[0].effective &= ~drop;
cap_data[0].permitted &= ~drop;
cap_data[0].inheritable &= ~drop;
if (syscall(SYS_capset, &cap_hdr, &cap_data))
exit(1);
}

static int do_sandbox_none(void)
{
if (unshare(CLONE_NEWPID)) {
}
int pid = fork();
if (pid != 0)
return wait_for_loop(pid);
setup_common();
sandbox_common();
drop_caps();
initialize_netdevices_init();
if (unshare(CLONE_NEWNET)) {
}
initialize_netdevices();
setup_binderfs();
loop();
exit(1);
}

static void kill_and_wait(int pid, int* status)
{
kill(-pid, SIGKILL);
kill(pid, SIGKILL);
for (int i = 0; i < 100; i++) {
if (waitpid(-1, status, WNOHANG | __WALL) == pid)
return;
usleep(1000);
}
DIR* dir = opendir("/sys/fs/fuse/connections");
if (dir) {
for (;;) {
struct dirent* ent = readdir(dir);
if (!ent)
break;
if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
continue;
char abort[300];
snprintf(abort, sizeof(abort), "/sys/fs/fuse/connections/%s/abort",
ent->d_name);
int fd = open(abort, O_WRONLY);
if (fd == -1) {
continue;
}
if (write(fd, abort, 1) < 0) {
}
close(fd);
}
closedir(dir);
} else {
}
while (waitpid(-1, status, __WALL) != pid) {
}
}

static void setup_test()
{
prctl(PR_SET_PDEATHSIG, SIGKILL, 0, 0, 0);
setpgrp();
write_file("/proc/self/oom_score_adj", "1000");
}

static void close_fds()
{
for (int fd = 3; fd < MAX_FDS; fd++)
close(fd);
}

static void execute_one(void);

#define WAIT_FLAGS __WALL

static void loop(void)
{
int iter = 0;
for (;; iter++) {
int pid = fork();
if (pid < 0)
exit(1);
if (pid == 0) {
setup_test();
execute_one();
close_fds();
exit(0);
}
int status = 0;
uint64_t start = current_time_ms();
for (;;) {
if (waitpid(-1, &status, WNOHANG | WAIT_FLAGS) == pid)
break;
sleep_ms(1);
if (current_time_ms() - start < 5000)
continue;
kill_and_wait(pid, &status);
break;
}
}
}

uint64_t r[2] = {0xffffffffffffffff, 0xffffffffffffffff};

void execute_one(void)
{
intptr_t res = 0;
res = syscall(__NR_socket, 0x10ul, 3ul, 0x14);
if (res != -1)
r[0] = res;
*(uint64_t*)0x20000200 = 0;
*(uint32_t*)0x20000208 = 0;
*(uint64_t*)0x20000210 = 0x200001c0;
*(uint64_t*)0x200001c0 = 0x20000180;
memcpy((void*)0x20000180,
"\x38\x00\x00\x00\x03\x14\x01\x00\x00\x00\x00\x00\x00\x00\x00\x00\x09"
"\x00\x02\x00\x73\x79\x6a\x31\x00\x00\x00\x00\x08\x00\x41\x00\x73\x69"
"\x77\x00\x14\x00\x33\x00\x76\x6c\x61\x6e\x30",
45);
*(uint64_t*)0x200001c8 = 0x38;
*(uint64_t*)0x20000218 = 1;
*(uint64_t*)0x20000220 = 0;
*(uint64_t*)0x20000228 = 0;
*(uint32_t*)0x20000230 = 0;
syscall(__NR_sendmsg, r[0], 0x20000200ul, 0ul);
res = syscall(__NR_socket, 0x11ul, 3ul, 0x300);
if (res != -1)
r[1] = res;
syscall(__NR_bind, r[1], 0ul, 0ul);
*(uint32_t*)0x20000040 = 1;
memcpy((void*)0x20000044,
"vlan0\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000\000"
"\000\000\000",
24);
*(uint32_t*)0x2000005c = 0;
*(uint16_t*)0x20000074 = 6;
syscall(__NR_ioctl, r[1], 0x8982, 0x20000040ul);
}
int main(void)
{
syscall(__NR_mmap, 0x1ffff000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
syscall(__NR_mmap, 0x20000000ul, 0x1000000ul, 7ul, 0x32ul, -1, 0ul);
syscall(__NR_mmap, 0x21000000ul, 0x1000ul, 0ul, 0x32ul, -1, 0ul);
do_sandbox_none();
return 0;
}

2023-05-18 22:24:20

by Tetsuo Handa

[permalink] [raw]
Subject: Re: [syzbot] [rdma?] KASAN: slab-use-after-free Read in siw_query_port

On 2023/05/19 5:21, Fedor Pchelkin wrote:
> On our local Syzkaller instance the bug started to be caught after
> 266e9b3475ba ("RDMA/siw: Remove namespace check from siw_netdev_event()")
> so CC'ing Tetsuo Handa if maybe he would be also interested in the bug.

UAF could not be observed until that commit because hung task was observed
until that commit because syzkaller is testing non init_net namespace.

> This fix seems to be good and perhaps it just made a bigger opportunity
> for the UAF bug to happen. Actually, the C repro was taken from there [2].
>
> With your suggested solution the UAF is not reproduced. I don't know the
> exact reasons why dev_put() was placed before calling query_port() but the
> context implies that netdev can be freed in that period. And some of
> ->query_port() realizations may touch netdev. So it seems reasonable to
> move ref count put after performing query_port().

Since ib_device_get_netdev() calls dev_hold() on success, I think that
we need to call dev_put() after query_port(). Please send as a formal patch.