2023-10-29 17:09:38

by syzbot

Subject: [syzbot] [mm?] general protection fault in __hugetlb_zap_begin

Hello,

syzbot found the following issue on:

HEAD commit: 3a568e3a961b Merge tag 'soc-fixes-6.7-3' of git://git.kern..
git tree: upstream
console+strace: https://syzkaller.appspot.com/x/log.txt?x=159d9cdd680000
kernel config: https://syzkaller.appspot.com/x/.config?x=174a257c5ae6b4fd
dashboard link: https://syzkaller.appspot.com/bug?extid=ec9435c038e451be48ff
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=14e433eb680000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=16808d35680000

Downloadable assets:
disk image: https://storage.googleapis.com/syzbot-assets/70fcba190275/disk-3a568e3a.raw.xz
vmlinux: https://storage.googleapis.com/syzbot-assets/65a6a954dd61/vmlinux-3a568e3a.xz
kernel image: https://storage.googleapis.com/syzbot-assets/e4d7963284af/bzImage-3a568e3a.xz

The issue was bisected to:

commit bf4916922c60f43efaa329744b3eef539aa6a2b2
Author: Rik van Riel <[email protected]>
Date: Fri Oct 6 03:59:07 2023 +0000

hugetlbfs: extend hugetlb_vma_lock to private VMAs

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=17d92743680000
final oops: https://syzkaller.appspot.com/x/report.txt?x=14392743680000
console output: https://syzkaller.appspot.com/x/log.txt?x=10392743680000

IMPORTANT: if you fix the issue, please add the following tag to the commit:
Reported-by: [email protected]
Fixes: bf4916922c60 ("hugetlbfs: extend hugetlb_vma_lock to private VMAs")

general protection fault, probably for non-canonical address 0xdffffc000000001d: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x00000000000000e8-0x00000000000000ef]
CPU: 1 PID: 5048 Comm: syz-executor261 Not tainted 6.6.0-rc7-syzkaller-00123-g3a568e3a961b #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023
RIP: 0010:__lock_acquire+0x10d/0x7f70 kernel/locking/lockdep.c:5004
Code: 85 75 18 00 00 83 3d fd 93 2c 0d 00 48 89 9c 24 10 01 00 00 0f 84 f8 0f 00 00 83 3d 5c 8c b2 0b 00 74 34 48 89 d0 48 c1 e8 03 <42> 80 3c 00 00 74 1a 48 89 d7 e8 24 c0 7b 00 48 8b 94 24 80 00 00
RSP: 0018:ffffc90003abf440 EFLAGS: 00010006
RAX: 000000000000001d RBX: 1ffff92000757eac RCX: 0000000000000000
RDX: 00000000000000e8 RSI: 0000000000000000 RDI: 00000000000000e8
RBP: ffffc90003abf708 R08: dffffc0000000000 R09: 0000000000000000
R10: dffffc0000000000 R11: fffffbfff1d32d6e R12: 0000000000000000
R13: 0000000000000001 R14: 0000000000000000 R15: ffff88807b34bb80
FS: 0000000000000000(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fbcd77db0d0 CR3: 0000000029b05000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
lock_acquire+0x1e3/0x520 kernel/locking/lockdep.c:5753
down_write+0x3a/0x50 kernel/locking/rwsem.c:1573
__hugetlb_zap_begin+0x2e0/0x380 mm/hugetlb.c:5447
hugetlb_zap_begin include/linux/hugetlb.h:258 [inline]
unmap_vmas+0x364/0x5c0 mm/memory.c:1733
exit_mmap+0x297/0xc50 mm/mmap.c:3230
__mmput+0x115/0x3c0 kernel/fork.c:1349
exit_mm+0x21f/0x300 kernel/exit.c:567
do_exit+0x9af/0x2650 kernel/exit.c:861
__do_sys_exit kernel/exit.c:991 [inline]
__se_sys_exit kernel/exit.c:989 [inline]
__x64_sys_exit+0x40/0x40 kernel/exit.c:989
do_syscall_x64 arch/x86/entry/common.c:50 [inline]
do_syscall_64+0x41/0xc0 arch/x86/entry/common.c:80
entry_SYSCALL_64_after_hwframe+0x63/0xcd
RIP: 0033:0x7fbcd7764af9
Code: Unable to access opcode bytes at 0x7fbcd7764acf.
RSP: 002b:00007ffe3ee5cb58 EFLAGS: 00000246
ORIG_RAX: 000000000000003c
RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007fbcd7764af9
RDX: 00007fbcd779e433 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 0000000000011f97 R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 00007ffe3ee5cc7c
R13: 431bde82d7b634db R14: 0000000000000001 R15: 0000000000000001
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:__lock_acquire+0x10d/0x7f70 kernel/locking/lockdep.c:5004
Code: 85 75 18 00 00 83 3d fd 93 2c 0d 00 48 89 9c 24 10 01 00 00 0f 84 f8 0f 00 00 83 3d 5c 8c b2 0b 00 74 34 48 89 d0 48 c1 e8 03 <42> 80 3c 00 00 74 1a 48 89 d7 e8 24 c0 7b 00 48 8b 94 24 80 00 00
RSP: 0018:ffffc90003abf440 EFLAGS: 00010006

RAX: 000000000000001d RBX: 1ffff92000757eac RCX: 0000000000000000
RDX: 00000000000000e8 RSI: 0000000000000000 RDI: 00000000000000e8
RBP: ffffc90003abf708 R08: dffffc0000000000 R09: 0000000000000000
R10: dffffc0000000000 R11: fffffbfff1d32d6e R12: 0000000000000000
R13: 0000000000000001 R14: 0000000000000000 R15: ffff88807b34bb80
FS: 0000000000000000(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007fbcd77db0d0 CR3: 0000000029b05000 CR4: 00000000003506e0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
----------------
Code disassembly (best guess):
0: 85 75 18 test %esi,0x18(%rbp)
3: 00 00 add %al,(%rax)
5: 83 3d fd 93 2c 0d 00 cmpl $0x0,0xd2c93fd(%rip) # 0xd2c9409
c: 48 89 9c 24 10 01 00 mov %rbx,0x110(%rsp)
13: 00
14: 0f 84 f8 0f 00 00 je 0x1012
1a: 83 3d 5c 8c b2 0b 00 cmpl $0x0,0xbb28c5c(%rip) # 0xbb28c7d
21: 74 34 je 0x57
23: 48 89 d0 mov %rdx,%rax
26: 48 c1 e8 03 shr $0x3,%rax
* 2a: 42 80 3c 00 00 cmpb $0x0,(%rax,%r8,1) <-- trapping instruction
2f: 74 1a je 0x4b
31: 48 89 d7 mov %rdx,%rdi
34: e8 24 c0 7b 00 call 0x7bc05d
39: 48 rex.W
3a: 8b .byte 0x8b
3b: 94 xchg %eax,%esp
3c: 24 80 and $0x80,%al


---
This report is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].

syzbot will keep track of this issue. See:
https://goo.gl/tpsmEJ#status for how to communicate with syzbot.
For information about the bisection process, see: https://goo.gl/tpsmEJ#bisection

If the bug is already fixed, let syzbot know by replying with:
#syz fix: exact-commit-title

If you want syzbot to run the reproducer, reply with:
#syz test: git://repo/address.git branch-or-commit-hash
If you attach or paste a git patch, syzbot will apply it before testing.

If you want to overwrite the bug's subsystems, reply with:
#syz set subsystems: new-subsystem
(See the list of subsystem names on the web dashboard)

If the bug is a duplicate of another bug, reply with:
#syz dup: exact-subject-of-another-report

If you want to undo deduplication, reply with:
#syz undup


2023-11-03 11:36:44

by syzbot

Subject: Re: [syzbot] [PATCH] Test for 2030579113a1

For archival purposes, forwarding an incoming command email to
[email protected].

***

Subject: [PATCH] Test for 2030579113a1
Author: [email protected]

please test BUG: corrupted list in ptp_open

#syz test https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git 2dac75696c6d

diff --git a/drivers/ptp/ptp_chardev.c b/drivers/ptp/ptp_chardev.c
index 282cd7d24077..6e9762a54b14 100644
--- a/drivers/ptp/ptp_chardev.c
+++ b/drivers/ptp/ptp_chardev.c
@@ -119,8 +119,13 @@ int ptp_open(struct posix_clock_context *pccontext, fmode_t fmode)
 	}
 	bitmap_set(queue->mask, 0, PTP_MAX_CHANNELS);
 	spin_lock_init(&queue->lock);
+	if (mutex_lock_interruptible(&ptp->tsevq_mux)) {
+		kfree(queue);
+		return -ERESTARTSYS;
+	}
 	list_add_tail(&queue->qlist, &ptp->tsevqs);
 	pccontext->private_clkdata = queue;
+	mutex_unlock(&ptp->tsevq_mux);

 	/* Debugfs contents */
 	sprintf(debugfsname, "0x%p", queue);
@@ -138,14 +143,19 @@ int ptp_open(struct posix_clock_context *pccontext, fmode_t fmode)
 int ptp_release(struct posix_clock_context *pccontext)
 {
 	struct timestamp_event_queue *queue = pccontext->private_clkdata;
+	struct ptp_clock *ptp =
+		container_of(pccontext->clk, struct ptp_clock, clock);
 	unsigned long flags;

 	if (queue) {
+		if (mutex_lock_interruptible(&ptp->tsevq_mux))
+			return -ERESTARTSYS;
 		debugfs_remove(queue->debugfs_instance);
 		pccontext->private_clkdata = NULL;
 		spin_lock_irqsave(&queue->lock, flags);
 		list_del(&queue->qlist);
 		spin_unlock_irqrestore(&queue->lock, flags);
+		mutex_unlock(&ptp->tsevq_mux);
 		bitmap_free(queue->mask);
 		kfree(queue);
 	}
@@ -585,7 +595,5 @@ ssize_t ptp_read(struct posix_clock_context *pccontext, uint rdflags,
 free_event:
 	kfree(event);
 exit:
-	if (result < 0)
-		ptp_release(pccontext);
 	return result;
 }
diff --git a/drivers/ptp/ptp_clock.c b/drivers/ptp/ptp_clock.c
index 3d1b0a97301c..7930db6ec18d 100644
--- a/drivers/ptp/ptp_clock.c
+++ b/drivers/ptp/ptp_clock.c
@@ -176,6 +176,7 @@ static void ptp_clock_release(struct device *dev)

 	ptp_cleanup_pin_groups(ptp);
 	kfree(ptp->vclock_index);
+	mutex_destroy(&ptp->tsevq_mux);
 	mutex_destroy(&ptp->pincfg_mux);
 	mutex_destroy(&ptp->n_vclocks_mux);
 	/* Delete first entry */
@@ -247,6 +248,7 @@ struct ptp_clock *ptp_clock_register(struct ptp_clock_info *info,
 	if (!queue)
 		goto no_memory_queue;
 	list_add_tail(&queue->qlist, &ptp->tsevqs);
+	mutex_init(&ptp->tsevq_mux);
 	queue->mask = bitmap_alloc(PTP_MAX_CHANNELS, GFP_KERNEL);
 	if (!queue->mask)
 		goto no_memory_bitmap;
@@ -356,6 +358,7 @@ struct ptp_clock *ptp_clock_register(struct ptp_clock_info *info,
 	if (ptp->kworker)
 		kthread_destroy_worker(ptp->kworker);
 kworker_err:
+	mutex_destroy(&ptp->tsevq_mux);
 	mutex_destroy(&ptp->pincfg_mux);
 	mutex_destroy(&ptp->n_vclocks_mux);
 	bitmap_free(queue->mask);
diff --git a/drivers/ptp/ptp_private.h b/drivers/ptp/ptp_private.h
index 52f87e394aa6..1525bd2059ba 100644
--- a/drivers/ptp/ptp_private.h
+++ b/drivers/ptp/ptp_private.h
@@ -44,6 +44,7 @@ struct ptp_clock {
 	struct pps_device *pps_source;
 	long dialed_frequency; /* remembers the frequency adjustment */
 	struct list_head tsevqs; /* timestamp fifo list */
+	struct mutex tsevq_mux; /* one process at a time reading the fifo */
 	struct mutex pincfg_mux; /* protect concurrent info->pin_config access */
 	wait_queue_head_t tsev_wq;
 	int defunct; /* tells readers to go away when clock is being removed */
--
2.25.1

2023-11-03 11:52:14

by syzbot

Subject: Re: [syzbot] [mm?] general protection fault in __hugetlb_zap_begin

Hello,

syzbot has tested the proposed patch but the reproducer is still triggering an issue:
general protection fault in __hugetlb_zap_begin

general protection fault, probably for non-canonical address 0xdffffc000000001d: 0000 [#1] PREEMPT SMP KASAN
KASAN: null-ptr-deref in range [0x00000000000000e8-0x00000000000000ef]
CPU: 1 PID: 5726 Comm: syz-executor.3 Not tainted 6.6.0-rc6-next-20231018-syzkaller-dirty #0
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 10/09/2023
RIP: 0010:__lock_acquire+0x10d/0x7f70 kernel/locking/lockdep.c:5004
Code: 85 75 18 00 00 83 3d fd 68 4d 0d 00 48 89 9c 24 10 01 00 00 0f 84 f8 0f 00 00 83 3d 0c 61 d2 0b 00 74 34 48 89 d0 48 c1 e8 03 <42> 80 3c 00 00 74 1a 48 89 d7 e8 b4 76 7d 00 48 8b 94 24 80 00 00
RSP: 0018:ffffc90005a37440 EFLAGS: 00010006
RAX: 000000000000001d RBX: 1ffff92000b46eac RCX: 0000000000000000
RDX: 00000000000000e8 RSI: 0000000000000000 RDI: 00000000000000e8
RBP: ffffc90005a37708 R08: dffffc0000000000 R09: 0000000000000000
R10: dffffc0000000000 R11: fffffbfff1d74f1e R12: 0000000000000000
R13: 0000000000000001 R14: 0000000000000000 R15: ffff888028368000
FS: 0000000000000000(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffe062f6d48 CR3: 00000000282bd000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
Call Trace:
<TASK>
lock_acquire+0x1e3/0x520 kernel/locking/lockdep.c:5753
down_write+0x3a/0x50 kernel/locking/rwsem.c:1579
__hugetlb_zap_begin+0x2e0/0x380 mm/hugetlb.c:5707
hugetlb_zap_begin include/linux/hugetlb.h:258 [inline]
unmap_vmas+0x364/0x5c0 mm/memory.c:1742
exit_mmap+0x297/0xc50 mm/mmap.c:3308
__mmput+0x115/0x3c0 kernel/fork.c:1349
exit_mm+0x21f/0x300 kernel/exit.c:567
do_exit+0x9b7/0x2750 kernel/exit.c:858
__do_sys_exit kernel/exit.c:988 [inline]
__se_sys_exit kernel/exit.c:986 [inline]
__x64_sys_exit+0x40/0x40 kernel/exit.c:986
do_syscall_x64 arch/x86/entry/common.c:51 [inline]
do_syscall_64+0x44/0x110 arch/x86/entry/common.c:82
entry_SYSCALL_64_after_hwframe+0x63/0x6b
RIP: 0033:0x7f67fe27cae9
Code: Unable to access opcode bytes at 0x7f67fe27cabf.
RSP: 002b:00007f67fef73078 EFLAGS: 00000246
ORIG_RAX: 000000000000003c
RAX: ffffffffffffffda RBX: 00007f67fe39bf80 RCX: 00007f67fe27cae9
RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
RBP: 00007f67fe2c847a R08: 0000000000000000 R09: 0000000000000000
R10: 0000000000000000 R11: 0000000000000246 R12: 0000000000000000
R13: 000000000000000b R14: 00007f67fe39bf80 R15: 00007ffdf414ed58
</TASK>
Modules linked in:
---[ end trace 0000000000000000 ]---
RIP: 0010:__lock_acquire+0x10d/0x7f70 kernel/locking/lockdep.c:5004
Code: 85 75 18 00 00 83 3d fd 68 4d 0d 00 48 89 9c 24 10 01 00 00 0f 84 f8 0f 00 00 83 3d 0c 61 d2 0b 00 74 34 48 89 d0 48 c1 e8 03 <42> 80 3c 00 00 74 1a 48 89 d7 e8 b4 76 7d 00 48 8b 94 24 80 00 00
RSP: 0018:ffffc90005a37440 EFLAGS: 00010006

RAX: 000000000000001d RBX: 1ffff92000b46eac RCX: 0000000000000000
RDX: 00000000000000e8 RSI: 0000000000000000 RDI: 00000000000000e8
RBP: ffffc90005a37708 R08: dffffc0000000000 R09: 0000000000000000
R10: dffffc0000000000 R11: fffffbfff1d74f1e R12: 0000000000000000
R13: 0000000000000001 R14: 0000000000000000 R15: ffff888028368000
FS: 0000000000000000(0000) GS:ffff8880b9900000(0000) knlGS:0000000000000000
CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
CR2: 00007ffe062f6d48 CR3: 00000000282bd000 CR4: 00000000003506f0
DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
----------------
Code disassembly (best guess):
0: 85 75 18 test %esi,0x18(%rbp)
3: 00 00 add %al,(%rax)
5: 83 3d fd 68 4d 0d 00 cmpl $0x0,0xd4d68fd(%rip) # 0xd4d6909
c: 48 89 9c 24 10 01 00 mov %rbx,0x110(%rsp)
13: 00
14: 0f 84 f8 0f 00 00 je 0x1012
1a: 83 3d 0c 61 d2 0b 00 cmpl $0x0,0xbd2610c(%rip) # 0xbd2612d
21: 74 34 je 0x57
23: 48 89 d0 mov %rdx,%rax
26: 48 c1 e8 03 shr $0x3,%rax
* 2a: 42 80 3c 00 00 cmpb $0x0,(%rax,%r8,1) <-- trapping instruction
2f: 74 1a je 0x4b
31: 48 89 d7 mov %rdx,%rdi
34: e8 b4 76 7d 00 call 0x7d76ed
39: 48 rex.W
3a: 8b .byte 0x8b
3b: 94 xchg %eax,%esp
3c: 24 80 and $0x80,%al


Tested on:

commit: 2dac7569 Add linux-next specific files for 20231018
git tree: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
console output: https://syzkaller.appspot.com/x/log.txt?x=14bc60d7680000
kernel config: https://syzkaller.appspot.com/x/.config?x=29e8e23689e6210c
dashboard link: https://syzkaller.appspot.com/bug?extid=ec9435c038e451be48ff
compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
patch: https://syzkaller.appspot.com/x/patch.diff?x=1088a55f680000

2023-11-03 18:14:17

by Mike Kravetz

Subject: Re: [syzbot] [mm?] general protection fault in __hugetlb_zap_begin

On 11/03/23 04:52, syzbot wrote:
> Hello,
>
> syzbot has tested the proposed patch but the reproducer is still triggering an issue:
> general protection fault in __hugetlb_zap_begin
>
> general protection fault, probably for non-canonical address 0xdffffc000000001d: 0000 [#1] PREEMPT SMP KASAN
> KASAN: null-ptr-deref in range [0x00000000000000e8-0x00000000000000ef]

<snip>

> Tested on:
>
> commit: 2dac7569 Add linux-next specific files for 20231018
> git tree: https://git.kernel.org/pub/scm/linux/kernel/git/next/linux-next.git
> console output: https://syzkaller.appspot.com/x/log.txt?x=14bc60d7680000
> kernel config: https://syzkaller.appspot.com/x/.config?x=29e8e23689e6210c
> dashboard link: https://syzkaller.appspot.com/bug?extid=ec9435c038e451be48ff
> compiler: Debian clang version 15.0.6, GNU ld (GNU Binutils for Debian) 2.40
> patch: https://syzkaller.appspot.com/x/patch.diff?x=1088a55f680000

Unless I am missing something, I do not believe the tested patch was
proposed for the general protection fault in __hugetlb_zap_begin issue.
--
Mike Kravetz

2024-01-09 18:17:18

by syzbot

Subject: Re: [syzbot] [mm?] general protection fault in __hugetlb_zap_begin

syzbot suspects this issue was fixed by commit:

commit 187da0f8250aa94bd96266096aef6f694e0b4cd2
Author: Mike Kravetz <[email protected]>
Date: Tue Nov 14 01:20:33 2023 +0000

hugetlb: fix null-ptr-deref in hugetlb_vma_lock_write

bisection log: https://syzkaller.appspot.com/x/bisect.txt?x=17f5054de80000
start commit: 9b6de136b5f0 Merge tag 'loongarch-fixes-6.7-1' of git://gi..
git tree: upstream
kernel config: https://syzkaller.appspot.com/x/.config?x=52c9552def2a0fdd
dashboard link: https://syzkaller.appspot.com/bug?extid=ec9435c038e451be48ff
syz repro: https://syzkaller.appspot.com/x/repro.syz?x=150a257ce80000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=13481ff0e80000

If the result looks correct, please mark the issue as fixed by replying with:

#syz fix: hugetlb: fix null-ptr-deref in hugetlb_vma_lock_write

For information about the bisection process, see: https://goo.gl/tpsmEJ#bisection