2023-11-01 05:37:21

by syzbot

[permalink] [raw]
Subject: [syzbot] [ext4?] general protection fault in hrtimer_nanosleep

Hello,

syzbot found the following issue on:

HEAD commit: 888cf78c29e2 Merge tag 'iommu-fix-v6.6-rc7' of git://git.k..
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=10339673680000
kernel config: https://syzkaller.appspot.com/x/.config?x=7d1f30869bb78ec6
dashboard link: https://syzkaller.appspot.com/bug?extid=b408cd9b40ec25380ee1
compiler: gcc (Debian 12.2.0-14) 12.2.0, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=165bbce3680000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/2e776d64243c/disk-888cf78c.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/9ce776a2bcfc/vmlinux-888cf78c.xz
kernel image: https://storage.googleapis.com/syzbot-assets/86a6c193c013/bzImage-888cf78c.xz
mounted in repro: https://storage.googleapis.com/syzbot-assets/8021bba287f0/mount_0.gz

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]

general protection fault, probably for non-canonical address 0xdffffc003ffff113: 0000 [#1] PREEMPT SMP KASAN
KASAN: probably user-memory-access in range [0x00000001ffff8898-0x00000001ffff889f]
CPU: 1 PID: 5308 Comm: syz-executor.4 Not tainted 6.6.0-rc7-syzkaller-00142-g888cf78c29e2 #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023
RIP: 0010:lookup_object lib/debugobjects.c:195 [inline]
RIP: 0010:lookup_object_or_alloc lib/debugobjects.c:564 [inline]
RIP: 0010:__debug_object_init+0xf3/0x2b0 lib/debugobjects.c:634
Code: d8 48 c1 e8 03 42 80 3c 20 00 0f 85 85 01 00 00 48 8b 1b 48 85 db 0f 84 9f 00 00 00 48 8d 7b 18 83 c5 01 48 89 f8 48 c1 e8 03 <42> 80 3c 20 00 0f 85 4c 01 00 00 4c 3b 73 18 75 c3 48 8d 7b 10 48
RSP: 0018:ffffc900050e7d08 EFLAGS: 00010012
RAX: 000000003ffff113 RBX: 00000001ffff8880 RCX: ffffffff8169123e
RDX: 1ffffffff249b149 RSI: 0000000000000004 RDI: 00000001ffff8898
RBP: 0000000000000003 R08: 0000000000000001 R09: 0000000000000216
R10: 0000000000000003 R11: 0000000000000000 R12: dffffc0000000000
R13: ffffffff924d8a48 R14: ffffc900050e7d90 R15: ffffffff924d8a50
FS: 0000555556eec480(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa23ab065ee CR3: 000000007e5c1000 CR4: 0000000000350ee0
Call Trace:
<TASK>
hrtimer_init_sleeper_on_stack kernel/time/hrtimer.c:447 [inline]
hrtimer_nanosleep+0x122/0x440 kernel/time/hrtimer.c:2098
common_nsleep+0xa1/0xc0 kernel/time/posix-timers.c:1350
__do_sys_clock_nanosleep kernel/time/posix-timers.c:1396 [inline]
__se_sys_clock_nanosleep kernel/time/posix-timers.c:1373 [inline]
__x64_sys_clock_nanosleep+0x344/0x490 kernel/time/posix-timers.c:1373
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x38/0xb0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7ff1a56a7ef5
Code: 24 0c 89 3c 24 48 89 4c 24 18 e8 f6 b9 ff ff 4c 8b 54 24 18 48 8b 54 24 10 41 89 c0 8b 74 24 0c 8b 3c 24 b8 e6 00 00 00 0f 05 <44> 89 c7 48 89 04 24 e8 4f ba ff ff 48 8b 04 24 48 83 c4 28 f7 d8
RSP: 002b:00007ffe80c6ee30 EFLAGS: 00000293 ORIG_RAX: 00000000000000e6
RAX: ffffffffffffffda RBX: 00007ff1a579bf80 RCX: 00007ff1a56a7ef5
RDX: 00007ffe80c6ee70 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 00007ff1a579d980 R08: 0000000000000000 R09: 0000000000000001
R10: 0000000000000000 R11: 0000000000000293 R12: 000000000000fef3
R13: ffffffffffffffff R14: 00007ff1a5200000 R15: 000000000000fbb2
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:lookup_object lib/debugobjects.c:195 [inline]
RIP: 0010:lookup_object_or_alloc lib/debugobjects.c:564 [inline]
RIP: 0010:__debug_object_init+0xf3/0x2b0 lib/debugobjects.c:634
Code: d8 48 c1 e8 03 42 80 3c 20 00 0f 85 85 01 00 00 48 8b 1b 48 85 db 0f 84 9f 00 00 00 48 8d 7b 18 83 c5 01 48 89 f8 48 c1 e8 03 <42> 80 3c 20 00 0f 85 4c 01 00 00 4c 3b 73 18 75 c3 48 8d 7b 10 48
RSP: 0018:ffffc900050e7d08 EFLAGS: 00010012

RAX: 000000003ffff113 RBX: 00000001ffff8880 RCX: ffffffff8169123e
RDX: 1ffffffff249b149 RSI: 0000000000000004 RDI: 00000001ffff8898
RBP: 0000000000000003 R08: 0000000000000001 R09: 0000000000000216
R10: 0000000000000003 R11: 0000000000000000 R12: dffffc0000000000
R13: ffffffff924d8a48 R14: ffffc900050e7d90 R15: ffffffff924d8a50
FS: 0000555556eec480(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fa23ab065ee CR3: 000000007e5c1000 CR4: 0000000000350ee0
----------------
Code disassembly (best guess):
0: d8 48 c1 fmuls -0x3f(%rax)
3: e8 03 42 80 3c call 0x3c80420b
8: 20 00 and %al,(%rax)
a: 0f 85 85 01 00 00 jne 0x195
10: 48 8b 1b mov (%rbx),%rbx
13: 48 85 db test %rbx,%rbx
16: 0f 84 9f 00 00 00 je 0xbb
1c: 48 8d 7b 18 lea 0x18(%rbx),%rdi
20: 83 c5 01 add $0x1,%ebp
23: 48 89 f8 mov %rdi,%rax
26: 48 c1 e8 03 shr $0x3,%rax
* 2a: 42 80 3c 20 00 cmpb $0x0,(%rax,%r12,1) <-- trapping instruction
2f: 0f 85 4c 01 00 00 jne 0x181
35: 4c 3b 73 18 cmp 0x18(%rbx),%r14
39: 75 c3 jne 0xfffffffe
3b: 48 8d 7b 10 lea 0x10(%rbx),%rdi
3f: 48 rex.W


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.

If the report is already addressed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite report's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the report is a duplicate of another one, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup


2023-11-02 15:57:16

by Thomas Gleixner

[permalink] [raw]
Subject: Re: [syzbot] [ext4?] general protection fault in hrtimer_nanosleep

On Thu, Nov 02 2023 at 13:08, Aleksandr Nogikh wrote:
> On Wed, Nov 1, 2023 at 1:58 PM Thomas Gleixner <[email protected]> wrote:
>> Unfortunately repro.syz does not hold up to its name and refuses to
>> reproduce.
>
> For me, on a locally built kernel (gcc 13.2.0) it didn't work either.
>
> But, interestingly, it does reproduce using the syzbot-built kernel
> shared via the "Downloadable assets" [1] in the original report. The
> repro crashed the kernel in ~1 minute.
>
> [1] https://github.com/google/syzkaller/blob/master/docs/syzbot_assets.md
>
> [ 125.919060][ C0] BUG: KASAN: stack-out-of-bounds in rb_next+0x10a/0x130
> [ 125.921169][ C0] Read of size 8 at addr ffffc900048e7c60 by task
> kworker/0:1/9
> [ 125.923235][ C0]
> [ 125.923243][ C0] CPU: 0 PID: 9 Comm: kworker/0:1 Not tainted
> 6.6.0-rc7-syzkaller-00142-g888cf78c29e2 #0
> [ 125.924546][ C0] Hardware name: QEMU Standard PC (Q35 + ICH9,
> 2009), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
> [ 125.926915][ C0] Workqueue: events nsim_dev_trap_report_work
> [ 125.929333][ C0]
> [ 125.929341][ C0] Call Trace:
> [ 125.929350][ C0] <IRQ>
> [ 125.929356][ C0] dump_stack_lvl+0xd9/0x1b0
> [ 125.931302][ C0] print_report+0xc4/0x620
> [ 125.932115][ C0] ? __virt_addr_valid+0x5e/0x2d0
> [ 125.933194][ C0] kasan_report+0xda/0x110
> [ 125.934814][ C0] ? rb_next+0x10a/0x130
> [ 125.936521][ C0] ? rb_next+0x10a/0x130
> [ 125.936544][ C0] rb_next+0x10a/0x130
> [ 125.936565][ C0] timerqueue_del+0xd4/0x140
> [ 125.936590][ C0] __remove_hrtimer+0x99/0x290
> [ 125.936613][ C0] __hrtimer_run_queues+0x55b/0xc10
> [ 125.936638][ C0] ? enqueue_hrtimer+0x310/0x310
> [ 125.936659][ C0] ? ktime_get_update_offsets_now+0x3bc/0x610
> [ 125.936688][ C0] hrtimer_interrupt+0x31b/0x800
> [ 125.936715][ C0] __sysvec_apic_timer_interrupt+0x105/0x3f0
> [ 125.936737][ C0] sysvec_apic_timer_interrupt+0x8e/0xc0
> [ 125.936755][ C0] </IRQ>
> [ 125.936759][ C0] <TASK>

Which is a completely different failure mode.

It explodes in the hrtimer interrupt when dequeuing an hrtimer for
expiry. That means the corresponding embedded rb_node is corrupted,
which points to random data corruption.

As you can reproduce (it still fails here with the provided assets),
does the failure change when you run it several times?

Thanks,

tglx