2024-05-08 18:16:59

Subject: [syzbot] [kernel?] KASAN: slab-use-after-free Read in kill_orphaned_pgrp (2)

Hello,

syzbot found the following issue on:

HEAD commit: dccb07f2914c Merge tag 'for-6.9-rc7-tag' of git://git.kern..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=12928970980000
kernel config: https://syzkaller.appspot.com/x/.config?x=6d14c12b661fb43
dashboard link: https://syzkaller.appspot.com/bug?extid=68619f9e9e69accd8e0a
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40

Unfortunately, I don't have any reproducer for this issue yet.

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/bc129693f2cc/disk-dccb07f2.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/cf12611cfdc7/vmlinux-dccb07f2.xz
kernel image: https://storage.googleapis.com/syzbot-assets/311fbc1afd69/bzImage-dccb07f2.xz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]

==================================================================
BUG: KASAN: slab-use-after-free in task_pgrp include/linux/sched/signal.h:689 [inline]
BUG: KASAN: slab-use-after-free in kill_orphaned_pgrp+0x41/0x560 kernel/exit.c:379
Read of size 8 at addr ffff888064362708 by task vhost-9668/9669

CPU: 0 PID: 9669 Comm: vhost-9668 Not tainted 6.9.0-rc7-syzkaller-00012-gdccb07f2914c #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 03/27/2024
Call Trace:
<TASK>
__dump_stack lib/dump_stack.c:88 [inline]
dump_stack_lvl+0x241/0x360 lib/dump_stack.c:114
print_address_description mm/kasan/report.c:377 [inline]
print_report+0x169/0x550 mm/kasan/report.c:488
kasan_report+0x143/0x180 mm/kasan/report.c:601
task_pgrp include/linux/sched/signal.h:689 [inline]
kill_orphaned_pgrp+0x41/0x560 kernel/exit.c:379
exit_notify kernel/exit.c:739 [inline]
do_exit+0x1673/0x27e0 kernel/exit.c:898
vhost_task_fn+0x2ff/0x320 kernel/vhost_task.c:61
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244
</TASK>

Allocated by task 8534:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
unpoison_slab_object mm/kasan/common.c:312 [inline]
__kasan_slab_alloc+0x66/0x80 mm/kasan/common.c:338
kasan_slab_alloc include/linux/kasan.h:201 [inline]
slab_post_alloc_hook mm/slub.c:3804 [inline]
slab_alloc_node mm/slub.c:3851 [inline]
kmem_cache_alloc_node+0x194/0x390 mm/slub.c:3894
alloc_task_struct_node kernel/fork.c:176 [inline]
dup_task_struct+0x57/0x7d0 kernel/fork.c:1107
copy_process+0x5d1/0x3df0 kernel/fork.c:2220
kernel_clone+0x226/0x8f0 kernel/fork.c:2797
__do_sys_clone kernel/fork.c:2940 [inline]
__se_sys_clone kernel/fork.c:2924 [inline]
__x64_sys_clone+0x258/0x2a0 kernel/fork.c:2924
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f

Freed by task 5326:
kasan_save_stack mm/kasan/common.c:47 [inline]
kasan_save_track+0x3f/0x80 mm/kasan/common.c:68
kasan_save_free_info+0x40/0x50 mm/kasan/generic.c:579
poison_slab_object+0xa6/0xe0 mm/kasan/common.c:240
__kasan_slab_free+0x37/0x60 mm/kasan/common.c:256
kasan_slab_free include/linux/kasan.h:184 [inline]
slab_free_hook mm/slub.c:2111 [inline]
slab_free mm/slub.c:4286 [inline]
kmem_cache_free+0x10b/0x2d0 mm/slub.c:4350
put_task_struct include/linux/sched/task.h:138 [inline]
delayed_put_task_struct+0x125/0x2f0 kernel/exit.c:229
rcu_do_batch kernel/rcu/tree.c:2196 [inline]
rcu_core+0xafd/0x1830 kernel/rcu/tree.c:2471
handle_softirqs+0x2d6/0x990 kernel/softirq.c:554
__do_softirq kernel/softirq.c:588 [inline]
invoke_softirq kernel/softirq.c:428 [inline]
__irq_exit_rcu+0xf4/0x1c0 kernel/softirq.c:637
irq_exit_rcu+0x9/0x30 kernel/softirq.c:649
instr_sysvec_apic_timer_interrupt arch/x86/kernel/apic/apic.c:1043 [inline]
sysvec_apic_timer_interrupt+0xa6/0xc0 arch/x86/kernel/apic/apic.c:1043
asm_sysvec_apic_timer_interrupt+0x1a/0x20 arch/x86/include/asm/idtentry.h:702

Last potentially related work creation:
kasan_save_stack+0x3f/0x60 mm/kasan/common.c:47
__kasan_record_aux_stack+0xac/0xc0 mm/kasan/generic.c:541
__call_rcu_common kernel/rcu/tree.c:2734 [inline]
call_rcu+0x167/0xa70 kernel/rcu/tree.c:2838
release_task+0x16d2/0x1810
de_thread fs/exec.c:1172 [inline]
begin_new_exec+0xfde/0x1ce0 fs/exec.c:1278
load_elf_binary+0xb12/0x2e40 fs/binfmt_elf.c:996
search_binary_handler fs/exec.c:1778 [inline]
exec_binprm fs/exec.c:1820 [inline]
bprm_execve+0xaf8/0x17c0 fs/exec.c:1872
do_execveat_common+0x553/0x700 fs/exec.c:1979
do_execve fs/exec.c:2053 [inline]
__do_sys_execve fs/exec.c:2129 [inline]
__se_sys_execve fs/exec.c:2124 [inline]
__x64_sys_execve+0x92/0xb0 fs/exec.c:2124
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f

The buggy address belongs to the object at ffff888064361e00
which belongs to the cache task_struct of size 7424
The buggy address is located 2312 bytes inside of
freed 7424-byte region [ffff888064361e00, ffff888064363b00)

The buggy address belongs to the physical page:
page: refcount:1 mapcount:0 mapping:0000000000000000 index:0x0 pfn:0x64360
head: order:3 entire_mapcount:0 nr_pages_mapped:0 pincount:0
memcg:ffff88801ed27d81
flags: 0xfff00000000840(slab|head|node=0|zone=1|lastcpupid=0x7ff)
page_type: 0xffffffff()
raw: 00fff00000000840 ffff888015eea500 ffffea00008c9e00 dead000000000002
raw: 0000000000000000 0000000000040004 00000001ffffffff ffff88801ed27d81
head: 00fff00000000840 ffff888015eea500 ffffea00008c9e00 dead000000000002
head: 0000000000000000 0000000000040004 00000001ffffffff ffff88801ed27d81
head: 00fff00000000003 ffffea000190d801 dead000000000122 00000000ffffffff
head: 0000000800000000 0000000000000000 00000000ffffffff 0000000000000000
page dumped because: kasan: bad access detected
page_owner tracks the page as allocated
page last allocated via order 3, migratetype Unmovable, gfp_mask 0xd20c0(__GFP_IO|__GFP_FS|__GFP_NOWARN|__GFP_NORETRY|__GFP_COMP|__GFP_NOMEMALLOC), pid 4538, tgid -361552985 (udevd), ts 4538, free_ts 24654065279
set_page_owner include/linux/page_owner.h:32 [inline]
post_alloc_hook+0x1ea/0x210 mm/page_alloc.c:1534
prep_new_page mm/page_alloc.c:1541 [inline]
get_page_from_freelist+0x3410/0x35b0 mm/page_alloc.c:3317
__alloc_pages+0x256/0x6c0 mm/page_alloc.c:4575
__alloc_pages_node include/linux/gfp.h:238 [inline]
alloc_pages_node include/linux/gfp.h:261 [inline]
alloc_slab_page+0x5f/0x160 mm/slub.c:2180
allocate_slab mm/slub.c:2343 [inline]
new_slab+0x84/0x2f0 mm/slub.c:2396
___slab_alloc+0xc73/0x1260 mm/slub.c:3530
__slab_alloc mm/slub.c:3615 [inline]
__slab_alloc_node mm/slub.c:3668 [inline]
slab_alloc_node mm/slub.c:3841 [inline]
kmem_cache_alloc_node+0x24a/0x390 mm/slub.c:3894
alloc_task_struct_node kernel/fork.c:176 [inline]
dup_task_struct+0x57/0x7d0 kernel/fork.c:1107
copy_process+0x5d1/0x3df0 kernel/fork.c:2220
kernel_clone+0x226/0x8f0 kernel/fork.c:2797
__do_sys_clone kernel/fork.c:2940 [inline]
__se_sys_clone kernel/fork.c:2924 [inline]
__x64_sys_clone+0x258/0x2a0 kernel/fork.c:2924
do_syscall_x64 arch/x86/entry/common.c:52 [inline]
do_syscall_64+0xf5/0x240 arch/x86/entry/common.c:83
entry_SYSCALL_64_after_hwframe+0x77/0x7f
page last free pid 1 tgid 1 stack trace:
reset_page_owner include/linux/page_owner.h:25 [inline]
free_pages_prepare mm/page_alloc.c:1141 [inline]
free_unref_page_prepare+0x986/0xab0 mm/page_alloc.c:2347
free_unref_page+0x37/0x3f0 mm/page_alloc.c:2487
free_contig_range+0x9e/0x160 mm/page_alloc.c:6572
destroy_args+0x8a/0x890 mm/debug_vm_pgtable.c:1036
debug_vm_pgtable+0x4be/0x550 mm/debug_vm_pgtable.c:1416
do_one_initcall+0x248/0x880 init/main.c:1245
do_initcall_level+0x157/0x210 init/main.c:1307
do_initcalls+0x3f/0x80 init/main.c:1323
kernel_init_freeable+0x435/0x5d0 init/main.c:1555
kernel_init+0x1d/0x2b0 init/main.c:1444
ret_from_fork+0x4b/0x80 arch/x86/kernel/process.c:147
ret_from_fork_asm+0x1a/0x30 arch/x86/entry/entry_64.S:244

Memory state around the buggy address:
 ffff888064362600: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff888064362680: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
>ffff888064362700: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
                      ^
 ffff888064362780: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
 ffff888064362800: fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb fb
==================================================================
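
The numbers in the report above can be cross-checked by hand. The sketch below is only an illustration, not part of the syzbot output: it recomputes the offset of the buggy address inside the task_struct object and locates the shadow byte that the "Memory state" dump marks. It assumes generic KASAN, where one shadow byte covers 8 bytes of memory and 0xfb marks a freed slab object. The helper names (object_offset, shadow_row_base, shadow_column) are made up for this example.

```c
#include <assert.h>
#include <stdint.h>

/* Values copied from the KASAN report above. */
#define BUGGY_ADDR   0xffff888064362708ULL
#define OBJECT_START 0xffff888064361e00ULL
#define OBJECT_SIZE  7424ULL	/* object size of the task_struct cache */

/* Assumption: generic KASAN, 8 bytes of memory per shadow byte. */
#define SHADOW_GRANULE   8ULL
/* Each "Memory state" row in this report prints 16 shadow bytes. */
#define SHADOW_ROW_BYTES (16ULL * SHADOW_GRANULE)

/* Offset of addr inside the object [start, start + size). */
static uint64_t object_offset(uint64_t addr, uint64_t start, uint64_t size)
{
	assert(addr >= start && addr < start + size);
	return addr - start;
}

/* Address printed at the start of the dump row that covers addr. */
static uint64_t shadow_row_base(uint64_t addr)
{
	return addr & ~(SHADOW_ROW_BYTES - 1);
}

/* Index (0..15) of the shadow byte in that row that covers addr. */
static unsigned int shadow_column(uint64_t addr)
{
	return (unsigned int)((addr - shadow_row_base(addr)) / SHADOW_GRANULE);
}
```

With the values from the report, object_offset() gives 2312 (0x908), OBJECT_START + OBJECT_SIZE is ffff888064363b00, and the faulting address falls in the row starting at ffff888064362700, under its second shadow byte.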

---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup

2024-05-11 14:45:28

by lee bruce

Subject: Re: [syzbot] [kernel?] KASAN: slab-use-after-free Read in kill_orphaned_pgrp (2)

Hello, I found a reproducer for this bug.

If you fix this issue, please add the following tag to the commit:
Reported-by: xingwei lee <[email protected]>
Reported-by: lingfei cheng <[email protected]>

I used the same kernel as the syzbot instance:
Kernel Commit: upstream dccb07f2914cdab2ac3a5b6c98406f765acab803
Kernel Config: https://syzkaller.appspot.com/text?tag=KernelConfig&x=6d14c12b661fb43
with KASAN enabled

Since a bug with the same title was also triggered in
https://syzkaller.appspot.com/bug?id=70492b96ff47ff70cfc433be100586119310670b,
I did a simple root cause analysis (RCA).
On the old syzbot instance the bug is still reported with the title "KASAN:
slab-use-after-free Read in kill_orphaned_pgrp", while on the latest
syzbot instance it is reported as

TITLE: WARNING in signal_wake_up_state
------------[ cut here ]------------
WARNING: CPU: 3 PID: 8591 at kernel/signal.c:762
signal_wake_up_state+0xf8/0x130 kernel/signal.c:762
Modules linked in:
CPU: 3 PID: 8591 Comm: file0 Not tainted 6.9.0-rc7-00012-gdccb07f2914c #6
Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS
1.16.0-2.fc37 04/01/2014
RIP: 0010:signal_wake_up_state+0xf8/0x130 kernel/signal.c:762
Code: 31 c0 31 c9 31 f6 e9 b2 1f 73 0a e8 42 6f 3a 00 48 89 df 5b 41
5e 41 5f 5d 31 c0 31 c9 31 f6 e9 ce 27 0a 00 ec
RSP: 0000:ffffc900154af918 EFLAGS: 00010046
RAX: 0000000000000000 RBX: ffff8880239a0000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: ffff8880239a0000
R13: ffff88802bd80908 R14: 0000000000000108 R15: dffffc0000000000
FS: 0000000000000000(0000) GS:ffff88823bc80000(0000) knlGS:0000000000000000
CS: 0010 DS: 002b ES: 002b CR0: 0000000080050033
CR2: 0000000000000000 CR3: 00000000259fa000 CR4: 0000000000750ef0
PKRU: 55555554
Call Trace:
<TASK>
signal_wake_up include/linux/sched/signal.h:448 [inline]
zap_process fs/coredump.c:373 [inline]
zap_threads fs/coredump.c:392 [inline]
coredump_wait fs/coredump.c:410 [inline]
do_coredump+0x8ff/0x2b60 fs/coredump.c:571
get_signal+0x13fa/0x1740 kernel/signal.c:2896
arch_do_signal_or_restart+0x96/0x870 arch/x86/kernel/signal.c:310
exit_to_user_mode_loop kernel/entry/common.c:111 [inline]
exit_to_user_mode_prepare include/linux/entry-common.h:328 [inline]
irqentry_exit_to_user_mode+0x79/0x280 kernel/entry/common.c:231
exc_page_fault+0x577/0x8b0 arch/x86/mm/fault.c:1535
asm_exc_page_fault+0x26/0x30 arch/x86/include/asm/idtentry.h:623
RIP: 0033:0x0
Code: Unable to access opcode bytes at 0xffffffffffffffd6.
RSP: 002b:00000000ff9bfe10 EFLAGS: 00010202
RAX: 0000000000000000 RBX: 0000000000000000 RCX: 0000000000000000
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000000000 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000000
R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
</TASK>

This is also the reason why there are so many people in my cc list.

I debugged this for a while and found that when our PoC process sets up
vhost, the call chain is:
vhost_dev_set_owner
vhost_worker_create
copy_process

copy_process creates a "vhost-xxxx" worker thread that runs
vhost_task_fn.
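
For reference, the call chain above is entered from user space through the ioctl that calls 3 and 4 of the reproducer issue on /dev/vhost-net. A minimal sketch of that step follows. It is not the reproducer itself: demo_vhost_claim is a hypothetical helper name, and the request number is the 0xaf01 value taken from the reproducer (defined as VHOST_SET_OWNER in <linux/vhost.h>).

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Same request number the reproducer passes to ioctl (cmd=0xaf01).
 * <linux/vhost.h> names it VHOST_SET_OWNER; it is spelled out here so
 * the sketch does not depend on kernel headers being installed.
 */
#define DEMO_VHOST_SET_OWNER 0xaf01UL

/*
 * Open a vhost device node read-write and claim ownership of it.
 * Returns the open fd on success, -1 on any failure.
 */
static int demo_vhost_claim(const char *path)
{
	int fd;

	assert(path != NULL);

	fd = open(path, O_RDWR);
	if (fd < 0)
		return -1;

	/* On success the kernel spawns the "vhost-<pid>" worker thread. */
	if (ioctl(fd, DEMO_VHOST_SET_OWNER, 0) < 0) {
		close(fd);
		return -1;
	}
	return fd;
}
```

Calling demo_vhost_claim("/dev/vhost-net") and keeping the returned fd open should leave a vhost worker thread attached to the calling process, which is the state the PoC is in before it calls execve.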

The PoC also creates an invalid ELF file "file0" and then executes it.
Presumably the execve syscall fails and the task kills itself, ending up
in the put_task_struct path.
Meanwhile, the "vhost-xxxx" worker thread is running vhost_task_fn:

static int vhost_task_fn(void *data)
{
	struct vhost_task *vtsk = data;
	bool dead = false;

	for (;;) {
		bool did_work;

		if (!dead && signal_pending(current)) {
			struct ksignal ksig;
			/*
			 * Calling get_signal will block in SIGSTOP,
			 * or clear fatal_signal_pending, but remember
			 * what was set.
			 *
			 * This thread won't actually exit until all
			 * of the file descriptors are closed, and
			 * the release function is called.
			 */
			dead = get_signal(&ksig);
			if (dead)
				clear_thread_flag(TIF_SIGPENDING);
		}

		/* mb paired w/ vhost_task_stop */
		set_current_state(TASK_INTERRUPTIBLE);

		if (test_bit(VHOST_TASK_FLAGS_STOP, &vtsk->flags)) {
			__set_current_state(TASK_RUNNING);
			break;
		}

		did_work = vtsk->fn(vtsk->data);
		if (!did_work)
			schedule();
	}

	complete(&vtsk->exited);
	do_exit(0);
}

I found the following, which seems a bit strange, though I'm not sure:

gef> bt
#0 kill_orphaned_pgrp (tsk=0xffff888107833780, parent=parent@entry=0x0
<fixed_percpu_data>) at ./include/linux/sched/signal.h:694
#1 0xffffffff811ff54d in exit_notify (group_dead=0x1,
tsk=0xffff888245e0d340) at kernel/exit.c:737
#2 do_exit (code=code@entry=0x0) at kernel/exit.c:894
#3 0xffffffff812331ea in vhost_task_fn (data=0xffff88824570c700) at
kernel/vhost_task.c:61
#4 0xffffffff810d7f7c in ret_from_fork (prev=<optimized out>,
regs=0xffffc90007eaff58, fn=0xffffffff81233110 <vhost_task_fn>,
fn_arg=0xffff88824570c700)
at arch/x86/kernel/process.c:147
#5 0xffffffff81002431 in ret_from_fork_asm () at arch/x86/entry/entry_64.S:304
#6 0x00007fff1f50a958 in ?? ()
#7 0x00007fff1f50a870 in ?? ()
#8 0x0000000000000011 in fixed_percpu_data ()
#9 0xffffffffffffffb0 in ?? ()
#10 0x00007fe268ae3220 in ?? ()
#11 0x00007fe268ae36c0 in ?? ()
#12 0x0000000000000203 in ?? ()
#13 0x0000000000000000 in ?? ()
gef> p tsk->comm
$64 = "PoC", '\000' <repeats 12 times>

vhost_task_fn ends up calling kill_orphaned_pgrp and triggers a UAF,
because the PoC task was already freed (by idle, according to the KASAN
report).

I don't know if my analysis is correct, please feel free to correct me.

=* repro.c =*
#define _GNU_SOURCE

#include <dirent.h>
#include <endian.h>
#include <errno.h>
#include <fcntl.h>
#include <linux/futex.h>
#include <pthread.h>
#include <signal.h>
#include <stdarg.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/prctl.h>
#include <sys/stat.h>
#include <sys/syscall.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

static void sleep_ms(uint64_t ms) { usleep(ms * 1000); }

static uint64_t current_time_ms(void) {
struct timespec ts;
if (clock_gettime(CLOCK_MONOTONIC, &ts)) exit(1);
return (uint64_t)ts.tv_sec * 1000 + (uint64_t)ts.tv_nsec / 1000000;
}

static void thread_start(void* (*fn)(void*), void* arg) {
pthread_t th;
pthread_attr_t attr;
pthread_attr_init(&attr);
pthread_attr_setstacksize(&attr, 128 << 10);
int i = 0;
for (; i < 100; i++) {
if (pthread_create(&th, &attr, fn, arg) == 0) {
pthread_attr_destroy(&attr);
return;
}
if (errno == EAGAIN) {
usleep(50);
continue;
}
break;
}
exit(1);
}

typedef struct {
int state;
} event_t;

static void event_init(event_t* ev) { ev->state = 0; }

static void event_reset(event_t* ev) { ev->state = 0; }

static void event_set(event_t* ev) {
if (ev->state) exit(1);
__atomic_store_n(&ev->state, 1, __ATOMIC_RELEASE);
syscall(SYS_futex, &ev->state, FUTEX_WAKE | FUTEX_PRIVATE_FLAG, 1000000);
}

static void event_wait(event_t* ev) {
while (!__atomic_load_n(&ev->state, __ATOMIC_ACQUIRE))
syscall(SYS_futex, &ev->state, FUTEX_WAIT | FUTEX_PRIVATE_FLAG, 0, 0);
}

static int event_isset(event_t* ev) {
return __atomic_load_n(&ev->state, __ATOMIC_ACQUIRE);
}

static int event_timedwait(event_t* ev, uint64_t timeout) {
uint64_t start = current_time_ms();
uint64_t now = start;
for (;;) {
uint64_t remain = timeout - (now - start);
struct timespec ts;
ts.tv_sec = remain / 1000;
ts.tv_nsec = (remain % 1000) * 1000 * 1000;
syscall(SYS_futex, &ev->state, FUTEX_WAIT | FUTEX_PRIVATE_FLAG, 0, &ts);
if (__atomic_load_n(&ev->state, __ATOMIC_ACQUIRE)) return 1;
now = current_time_ms();
if (now - start > timeout) return 0;
}
}

static bool write_file(const char* file, const char* what, ...) {
char buf[1024];
va_list args;
va_start(args, what);
vsnprintf(buf, sizeof(buf), what, args);
va_end(args);
buf[sizeof(buf) - 1] = 0;
int len = strlen(buf);
int fd = open(file, O_WRONLY | O_CLOEXEC);
if (fd == -1) return false;
if (write(fd, buf, len) != len) {
int err = errno;
close(fd);
errno = err;
return false;
}
close(fd);
return true;
}

static void kill_and_wait(int pid, int* status) {
kill(-pid, SIGKILL);
kill(pid, SIGKILL);
for (int i = 0; i < 100; i++) {
if (waitpid(-1, status, WNOHANG | __WALL) == pid) return;
usleep(1000);
}
DIR* dir = opendir("/sys/fs/fuse/connections");
if (dir) {
for (;;) {
struct dirent* ent = readdir(dir);
if (!ent) break;
if (strcmp(ent->d_name, ".") == 0 || strcmp(ent->d_name, "..") == 0)
continue;
char abort[300];
snprintf(abort, sizeof(abort), "/sys/fs/fuse/connections/%s/abort",
ent->d_name);
int fd = open(abort, O_WRONLY);
if (fd == -1) {
continue;
}
if (write(fd, abort, 1) < 0) {
}
close(fd);
}
closedir(dir);
} else {
}
while (waitpid(-1, status, __WALL) != pid) {
}
}

static void setup_test() {
prctl(PR_SET_PDEATHSIG, SIGKILL, 0, 0, 0);
setpgrp();
write_file("/proc/self/oom_score_adj", "1000");
}

struct thread_t {
int created, call;
event_t ready, done;
};

static struct thread_t threads[16];
static void execute_call(int call);
static int running;

static void* thr(void* arg) {
struct thread_t* th = (struct thread_t*)arg;
for (;;) {
event_wait(&th->ready);
event_reset(&th->ready);
execute_call(th->call);
__atomic_fetch_sub(&running, 1, __ATOMIC_RELAXED);
event_set(&th->done);
}
return 0;
}

static void execute_one(void) {
int i, call, thread;
for (call = 0; call < 6; call++) {
for (thread = 0; thread < (int)(sizeof(threads) / sizeof(threads[0]));
thread++) {
struct thread_t* th = &threads[thread];
if (!th->created) {
th->created = 1;
event_init(&th->ready);
event_init(&th->done);
event_set(&th->done);
thread_start(thr, th);
}
if (!event_isset(&th->done)) continue;
event_reset(&th->done);
th->call = call;
__atomic_fetch_add(&running, 1, __ATOMIC_RELAXED);
event_set(&th->ready);
event_timedwait(&th->done, 50);
break;
}
}
for (i = 0; i < 100 && __atomic_load_n(&running, __ATOMIC_RELAXED); i++)
sleep_ms(1);
}

static void execute_one(void);

#define WAIT_FLAGS __WALL

static void loop(void) {
int iter = 0;
for (;; iter++) {
int pid = fork();
if (pid < 0) exit(1);
if (pid == 0) {
setup_test();
execute_one();
exit(0);
}
int status = 0;
uint64_t start = current_time_ms();
for (;;) {
if (waitpid(-1, &status, WNOHANG | WAIT_FLAGS) == pid) break;
sleep_ms(1);
if (current_time_ms() - start < 5000) continue;
kill_and_wait(pid, &status);
break;
}
}
}

uint64_t r[2] = {0xffffffffffffffff, 0xffffffffffffffff};

void execute_call(int call) {
intptr_t res = 0;
switch (call) {
case 0:
memcpy((void*)0x20000280, "./file0\000", 8);
res = syscall(__NR_creat, /*file=*/0x20000280ul,
/*mode=*/0xecf86c37d53049ccul);
if (res != -1) r[0] = res;
break;
case 1:
*(uint8_t*)0x20000440 = 0x7f;
*(uint8_t*)0x20000441 = 0x45;
*(uint8_t*)0x20000442 = 0x4c;
*(uint8_t*)0x20000443 = 0x46;
*(uint8_t*)0x20000444 = 0;
*(uint8_t*)0x20000445 = 0;
*(uint8_t*)0x20000446 = 0;
*(uint8_t*)0x20000447 = 0;
*(uint64_t*)0x20000448 = 0;
*(uint16_t*)0x20000450 = 2;
*(uint16_t*)0x20000452 = 0x3e;
*(uint32_t*)0x20000454 = 0;
*(uint32_t*)0x20000458 = 0;
*(uint32_t*)0x2000045c = 0x38;
*(uint32_t*)0x20000460 = 0;
*(uint32_t*)0x20000464 = 0;
*(uint16_t*)0x20000468 = 0xeb0;
*(uint16_t*)0x2000046a = 0x20;
*(uint16_t*)0x2000046c = 2;
*(uint16_t*)0x2000046e = 0;
*(uint16_t*)0x20000470 = 0;
*(uint16_t*)0x20000472 = 0;
*(uint32_t*)0x20000478 = 0;
*(uint32_t*)0x2000047c = 0;
*(uint32_t*)0x20000480 = 0;
*(uint32_t*)0x20000484 = 0;
*(uint32_t*)0x20000488 = 0;
*(uint32_t*)0x2000048c = 0;
*(uint32_t*)0x20000490 = 0;
*(uint32_t*)0x20000494 = 0;
memset((void*)0x20000498, 0, 256);
memset((void*)0x20000598, 0, 256);
memset((void*)0x20000698, 0, 256);
memset((void*)0x20000798, 0, 256);
memset((void*)0x20000898, 0, 256);
memset((void*)0x20000998, 0, 256);
memset((void*)0x20000a98, 0, 256);
syscall(__NR_write, /*fd=*/r[0], /*data=*/0x20000440ul, /*len=*/0x758ul);
break;
case 2:
syscall(__NR_close, /*fd=*/r[0]);
break;
case 3:
memcpy((void*)0x20000000, "/dev/vhost-net\000", 15);
res = syscall(__NR_openat, /*fd=*/0xffffffffffffff9cul,
/*file=*/0x20000000ul, /*flags=*/2ul, /*mode=*/0ul);
if (res != -1) r[1] = res;
break;
case 4:
syscall(__NR_ioctl, /*fd=*/r[1], /*cmd=*/0xaf01, /*v=*/0ul);
break;
case 5:
memcpy((void*)0x20000400, "./file0\000", 8);
syscall(__NR_execve, /*file=*/0x20000400ul, /*argv=*/0ul, /*envp=*/0ul);
break;
}
}
int main(void) {
syscall(__NR_mmap, /*addr=*/0x1ffff000ul, /*len=*/0x1000ul, /*prot=*/0ul,
/*flags=*/0x32ul, /*fd=*/-1, /*offset=*/0ul);
syscall(__NR_mmap, /*addr=*/0x20000000ul, /*len=*/0x1000000ul, /*prot=*/7ul,
/*flags=*/0x32ul, /*fd=*/-1, /*offset=*/0ul);
syscall(__NR_mmap, /*addr=*/0x21000000ul, /*len=*/0x1000ul, /*prot=*/0ul,
/*flags=*/0x32ul, /*fd=*/-1, /*offset=*/0ul);
loop();
return 0;
}

=* repro.txt =*
r0 = creat(&(0x7f0000000280)='./file0\x00', 0xecf86c37d53049cc)
write$binfmt_elf32(r0, &(0x7f0000000440)={{0x7f, 0x45, 0x4c, 0x46,
0x0, 0x0, 0x0, 0x0, 0x0, 0x2, 0x3e, 0x0, 0x0, 0x38, 0x0, 0x0, 0xeb0,
0x20, 0x2}, [{}], "", ['\x00', '\x00', '\x00', '\x00', '\x00', '\x00',
'\x00']}, 0x758)
close(r0)
r1 = openat$vnet(0xffffffffffffff9c, &(0x7f0000000000), 0x2, 0x0)
ioctl$int_in(r1, 0x40000000af01, 0x0)
execve(&(0x7f0000000400)='./file0\x00', 0x0, 0x0)
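
The write$binfmt_elf32 argument above, like case 1 of repro.c, lays out an ELF header in the 32-bit format. As a readability aid, the same values can be expressed with the Elf32_Ehdr type from <elf.h>. This is only a sketch of the header part: build_repro_ehdr is a hypothetical helper name, and the program header and zero padding that the reproducer also writes are left out.

```c
#include <assert.h>
#include <elf.h>
#include <stddef.h>
#include <string.h>

/*
 * Fill an Elf32_Ehdr with the values the reproducer stores at
 * 0x20000440..0x20000473. Fields not listed stay zero, as in repro.c.
 */
static void build_repro_ehdr(Elf32_Ehdr *h)
{
	assert(h != NULL);
	memset(h, 0, sizeof(*h));

	h->e_ident[EI_MAG0] = ELFMAG0;	/* 0x7f */
	h->e_ident[EI_MAG1] = ELFMAG1;	/* 'E' */
	h->e_ident[EI_MAG2] = ELFMAG2;	/* 'L' */
	h->e_ident[EI_MAG3] = ELFMAG3;	/* 'F' */
	/* EI_CLASS is left at 0 (ELFCLASSNONE), as in the reproducer. */

	h->e_type = ET_EXEC;		/* 2 */
	h->e_machine = EM_X86_64;	/* 0x3e */
	h->e_phoff = 0x38;		/* program headers at 0x20000478 */
	h->e_ehsize = 0xeb0;		/* does not match sizeof(Elf32_Ehdr) */
	h->e_phentsize = 0x20;		/* sizeof(Elf32_Phdr) */
	h->e_phnum = 2;
}
```

The header therefore combines a 32-bit layout with the x86-64 machine type, an unset class byte, and an e_ehsize that does not match the real header size.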

See also:
https://gist.github.com/xrivendell7/c1540d905cdbcc43fc134a76dd364f6d

I hope it helps.
Best regards

2024-05-11 23:34:38

by Hillf Danton

Subject: Re: [syzbot] [kernel?] KASAN: slab-use-after-free Read in kill_orphaned_pgrp (2)

On Sat, 11 May 2024 22:45:06 +0800 lee bruce <[email protected]>
> Hello, I found a reproducer for this bug.
>
Thanks for your report.

> If you fix this issue, please add the following tag to the commit:
> Reported-by: xingwei lee <[email protected]>
> Reported-by: lingfei cheng <[email protected]>
>
> I use the same kernel as syzbot instance
> Kernel Commit: upstream dccb07f2914cdab2ac3a5b6c98406f765acab803
> Kernel Config: https://syzkaller.appspot.com/text?tag=KernelConfig&x=6d14c12b661fb43
> with KASAN enabled
>
> Since the same title bug is triggered in
> https://syzkaller.appspot.com/bug?id=70492b96ff47ff70cfc433be100586119310670b.
> I make a simple RCA.
> In the old-syzbot instance the bug still trigger the title "KASAN:
> slab-use-after-free Read in kill_orphaned_pgrp" and in the lastest
> syzbot the bug report as
>
> TITLE: WARNING in signal_wake_up_state
> ------------[ cut here ]------------
> WARNING: CPU: 3 PID: 8591 at kernel/signal.c:762
> signal_wake_up_state+0xf8/0x130 kernel/signal.c:762
> Modules linked in:
> CPU: 3 PID: 8591 Comm: file0 Not tainted 6.9.0-rc7-00012-gdccb07f2914c #6

Could you reproduce it in the next tree, because of d558664602d3 ("vhost_task:
Handle SIGKILL by flushing work and exiting") adding reaction to signal?

2024-05-12 01:40:49

by lee bruce

Subject: Re: [syzbot] [kernel?] KASAN: slab-use-after-free Read in kill_orphaned_pgrp (2)

Hi.

Hillf Danton <[email protected]> wrote on Sun, May 12, 2024 at 07:34:
>
> On Sat, 11 May 2024 22:45:06 +0800 lee bruce <[email protected]>
> > Hello, I found a reproducer for this bug.
> >
> Thanks for your report.
>
> > If you fix this issue, please add the following tag to the commit:
> > Reported-by: xingwei lee <[email protected]>
> > Reported-by: lingfei cheng <[email protected]>
> >
> > I use the same kernel as syzbot instance
> > Kernel Commit: upstream dccb07f2914cdab2ac3a5b6c98406f765acab803
> > Kernel Config: https://syzkaller.appspot.com/text?tag=KernelConfig&x=6d14c12b661fb43
> > with KASAN enabled
> >
> > Since the same title bug is triggered in
> > https://syzkaller.appspot.com/bug?id=70492b96ff47ff70cfc433be100586119310670b.
> > I make a simple RCA.
> > In the old-syzbot instance the bug still trigger the title "KASAN:
> > slab-use-after-free Read in kill_orphaned_pgrp" and in the lastest
> > syzbot the bug report as
> >
> > TITLE: WARNING in signal_wake_up_state
> > ------------[ cut here ]------------
> > WARNING: CPU: 3 PID: 8591 at kernel/signal.c:762
> > signal_wake_up_state+0xf8/0x130 kernel/signal.c:762
> > Modules linked in:
> > CPU: 3 PID: 8591 Comm: file0 Not tainted 6.9.0-rc7-00012-gdccb07f2914c #6
>
> Could you reproduce it in the next tree, because of d558664602d3 ("vhost_task:
> Handle SIGKILL by flushing work and exiting") adding reaction to signal?
Ok, I'll try.

Best Regards,
xingwei lee

2024-05-12 13:37:19

by lee bruce

Subject: Re: [syzbot] [kernel?] KASAN: slab-use-after-free Read in kill_orphaned_pgrp (2)

Hi.

> On May 12, 2024, at 09:40, lee bruce <[email protected]> wrote:
>
> Hi.
>
> Hillf Danton <[email protected]> wrote on Sun, May 12, 2024 at 07:34:
> >
> > On Sat, 11 May 2024 22:45:06 +0800 lee bruce <[email protected]>
> > > Hello, I found a reproducer for this bug.
> > >
> > Thanks for your report.
> >
> > > If you fix this issue, please add the following tag to the commit:
> > > Reported-by: xingwei lee <[email protected]>
> > > Reported-by: lingfei cheng <[email protected]>
> > >
> > > I use the same kernel as syzbot instance
> > > Kernel Commit: upstream dccb07f2914cdab2ac3a5b6c98406f765acab803
> > > Kernel Config: https://syzkaller.appspot.com/text?tag=KernelConfig&x=6d14c12b661fb43
> > > with KASAN enabled
> > >
> > > Since the same title bug is triggered in
> > > https://syzkaller.appspot.com/bug?id=70492b96ff47ff70cfc433be100586119310670b.
> > > I make a simple RCA.
> > > In the old-syzbot instance the bug still trigger the title "KASAN:
> > > slab-use-after-free Read in kill_orphaned_pgrp" and in the lastest
> > > syzbot the bug report as
> > >
> > > TITLE: WARNING in signal_wake_up_state
> > > ------------[ cut here ]------------
> > > WARNING: CPU: 3 PID: 8591 at kernel/signal.c:762
> > > signal_wake_up_state+0xf8/0x130 kernel/signal.c:762
> > > Modules linked in:
> > > CPU: 3 PID: 8591 Comm: file0 Not tainted 6.9.0-rc7-00012-gdccb07f2914c #6
> >
> > Could you reproduce it in the next tree, because of d558664602d3 ("vhost_task:
> > Handle SIGKILL by flushing work and exiting") adding reaction to signal?

I tested linux-next at d558664602d3906866e604a618dcf67f66d79967
and confirmed that the reproducer did not trigger any crashes.
I noticed this bug may be a duplicate of
https://syzkaller.appspot.com/bug?extid=98edc2df894917b3431f and
https://syzkaller.appspot.com/bug?extid=c6d438f2d77f96cae7c2.
Should we close them, tag them as duplicates, or mark them as fixed?

> Ok, I'll try.
>
> Best Regards,
> xingwei lee

2024-05-12 22:23:54

by Hillf Danton

Subject: Re: [syzbot] [kernel?] KASAN: slab-use-after-free Read in kill_orphaned_pgrp (2)

On Sun, 12 May 2024 21:36:56 +0800 lee bruce <[email protected]>
>
> I test the kernel linux-next: d558664602d3906866e604a618dcf67f66d79967
> and comfired reproducer and didn’t trigger any crashes.

Thank you for testing on top of the next tree.

> I notice maybe this bug is duplicated with
> https://syzkaller.appspot.com/bug?extid=98edc2df894917b3431f and
> https://syzkaller.appspot.com/bug?extid=c6d438f2d77f96cae7c2.
> Should we need to close or tag duplicated or fix to them?

Feel free to do so.

BTW, git send-email is one of the tools for posting to lore.