Hello,
syzbot found the following crash on:
HEAD commit: 203ec2fed17a Merge tag 'armsoc-fixes' of git://git.kernel...
git tree: upstream
console output: https://syzkaller.appspot.com/x/log.txt?x=11c1ad77800000
kernel config: https://syzkaller.appspot.com/x/.config?x=f3b4e30da84ec1ed
dashboard link: https://syzkaller.appspot.com/bug?extid=568245b88fbaedcb1959
compiler: gcc (GCC) 8.0.1 20180413 (experimental)
syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=122c7427800000
C reproducer: https://syzkaller.appspot.com/x/repro.c?x=10387057800000
IMPORTANT: if you fix the bug, please add the following tag to the commit:
Reported-by: [email protected]
(ptrval): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
(ptrval): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
XFS (loop0): metadata I/O error in "xfs_trans_read_buf_map" at daddr 0x2 len 1 error 117
XFS (loop0): xfs_imap_lookup: xfs_ialloc_read_agi() returned error -117, agno 0
XFS (loop0): failed to read root inode
INFO: task syz-executor060:4501 blocked for more than 120 seconds.
Not tainted 4.17.0-rc5+ #60
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
syz-executor060 D17168 4501 4499 0x00000000
Call Trace:
context_switch kernel/sched/core.c:2859 [inline]
__schedule+0x801/0x1e30 kernel/sched/core.c:3501
schedule+0xef/0x430 kernel/sched/core.c:3545
xlog_grant_head_wait+0x260/0xf80 fs/xfs/xfs_log.c:275
xlog_grant_head_check+0x4d6/0x550 fs/xfs/xfs_log.c:337
xfs_log_reserve+0x398/0xd20 fs/xfs/xfs_log.c:466
xfs_log_unmount_write+0x2c4/0xfb0 fs/xfs/xfs_log.c:885
xfs_log_quiesce+0xf9/0x130 fs/xfs/xfs_log.c:1011
xfs_log_unmount+0x22/0xb0 fs/xfs/xfs_log.c:1025
xfs_log_mount_cancel+0x44/0x60 fs/xfs/xfs_log.c:828
xfs_mountfs+0x17d9/0x2b80 fs/xfs/xfs_mount.c:1041
xfs_fs_fill_super+0xdef/0x1560 fs/xfs/xfs_super.c:1734
mount_bdev+0x30c/0x3e0 fs/super.c:1164
xfs_fs_mount+0x34/0x40 fs/xfs/xfs_super.c:1801
mount_fs+0xae/0x328 fs/super.c:1267
vfs_kern_mount.part.34+0xd4/0x4d0 fs/namespace.c:1037
vfs_kern_mount fs/namespace.c:1027 [inline]
do_new_mount fs/namespace.c:2518 [inline]
do_mount+0x564/0x3070 fs/namespace.c:2848
ksys_mount+0x12d/0x140 fs/namespace.c:3064
__do_sys_mount fs/namespace.c:3078 [inline]
__se_sys_mount fs/namespace.c:3075 [inline]
__x64_sys_mount+0xbe/0x150 fs/namespace.c:3075
do_syscall_64+0x1b1/0x800 arch/x86/entry/common.c:287
entry_SYSCALL_64_after_hwframe+0x49/0xbe
RIP: 0033:0x442d9a
RSP: 002b:00007ffc7a9dde88 EFLAGS: 00000202 ORIG_RAX: 00000000000000a5
RAX: ffffffffffffffda RBX: 0000000000000003 RCX: 0000000000442d9a
RDX: 0000000020000040 RSI: 0000000020000100 RDI: 00007ffc7a9dde90
RBP: 0000000000000004 R08: 00000000200001c0 R09: 000000000000000a
R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000401c00
R13: 0000000000401c90 R14: 0000000000000000 R15: 0000000000000000
Showing all locks held in the system:
2 locks held by khungtaskd/893:
#0: (ptrval) (rcu_read_lock){....}, at: check_hung_uninterruptible_tasks kernel/hung_task.c:175 [inline]
#0: (ptrval) (rcu_read_lock){....}, at: watchdog+0x1ff/0xf60 kernel/hung_task.c:249
#1: (ptrval) (tasklist_lock){.+.+}, at: debug_show_all_locks+0xde/0x34a kernel/locking/lockdep.c:4470
1 lock held by rsyslogd/4384:
#0: (ptrval) (&f->f_pos_lock){+.+.}, at: __fdget_pos+0x1a9/0x1e0 fs/file.c:766
2 locks held by getty/4474:
#0: (ptrval) (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
#1: (ptrval) (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x321/0x1cc0 drivers/tty/n_tty.c:2131
2 locks held by getty/4475:
#0: (ptrval) (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
#1: (ptrval) (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x321/0x1cc0 drivers/tty/n_tty.c:2131
2 locks held by getty/4476:
#0: (ptrval) (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
#1: (ptrval) (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x321/0x1cc0 drivers/tty/n_tty.c:2131
2 locks held by getty/4477:
#0: (ptrval) (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
#1: (ptrval) (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x321/0x1cc0 drivers/tty/n_tty.c:2131
2 locks held by getty/4478:
#0: (ptrval) (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
#1: (ptrval) (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x321/0x1cc0 drivers/tty/n_tty.c:2131
2 locks held by getty/4479:
#0: (ptrval) (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
#1: (ptrval) (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x321/0x1cc0 drivers/tty/n_tty.c:2131
2 locks held by getty/4480:
#0: (ptrval) (&tty->ldisc_sem){++++}, at: ldsem_down_read+0x37/0x40 drivers/tty/tty_ldsem.c:365
#1: (ptrval) (&ldata->atomic_read_lock){+.+.}, at: n_tty_read+0x321/0x1cc0 drivers/tty/n_tty.c:2131
1 lock held by syz-executor060/4501:
#0: (ptrval) (&type->s_umount_key#36/1){+.+.}, at: alloc_super fs/super.c:213 [inline]
#0: (ptrval) (&type->s_umount_key#36/1){+.+.}, at: sget_userns+0x2dd/0xf00 fs/super.c:506
=============================================
NMI backtrace for cpu 0
CPU: 0 PID: 893 Comm: khungtaskd Not tainted 4.17.0-rc5+ #60
Hardware name: Google Google Compute Engine/Google Compute Engine, BIOS Google 01/01/2011
Call Trace:
__dump_stack lib/dump_stack.c:77 [inline]
dump_stack+0x1b9/0x294 lib/dump_stack.c:113
nmi_cpu_backtrace.cold.4+0x19/0xce lib/nmi_backtrace.c:103
nmi_trigger_cpumask_backtrace+0x151/0x192 lib/nmi_backtrace.c:62
arch_trigger_cpumask_backtrace+0x14/0x20 arch/x86/kernel/apic/hw_nmi.c:38
trigger_all_cpu_backtrace include/linux/nmi.h:138 [inline]
check_hung_task kernel/hung_task.c:132 [inline]
check_hung_uninterruptible_tasks kernel/hung_task.c:190 [inline]
watchdog+0xc10/0xf60 kernel/hung_task.c:249
kthread+0x345/0x410 kernel/kthread.c:240
ret_from_fork+0x3a/0x50 arch/x86/entry/entry_64.S:412
Sending NMI from CPU 0 to CPUs 1:
NMI backtrace for cpu 1 skipped: idling at native_safe_halt+0x6/0x10 arch/x86/include/asm/irqflags.h:54
---
This bug is generated by a bot. It may contain errors.
See https://goo.gl/tpsmEJ for more information about syzbot.
syzbot engineers can be reached at [email protected].
syzbot will keep track of this bug report. See:
https://goo.gl/tpsmEJ#bug-status-tracking for how to communicate with
syzbot.
syzbot can test patches for this bug, for details see:
https://goo.gl/tpsmEJ#testing-patches
On Mon, May 21, 2018 at 10:55:02AM -0700, syzbot wrote:
> (ptrval): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> (ptrval): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 ................
> XFS (loop0): metadata I/O error in "xfs_trans_read_buf_map" at daddr 0x2 len 1 error 117
> XFS (loop0): xfs_imap_lookup: xfs_ialloc_read_agi() returned error -117, agno 0
> XFS (loop0): failed to read root inode
FWIW, the initial console output is actually:
[ 448.028253] XFS (loop0): Mounting V4 Filesystem
[ 448.033540] XFS (loop0): Log size 9371840 blocks too large, maximum size is 1048576 blocks
[ 448.042287] XFS (loop0): Log size out of supported range.
[ 448.047841] XFS (loop0): Continuing onwards, but if log hangs are experienced then please report this message in the bug report.
[ 448.060712] XFS (loop0): totally zeroed log
... which warns about an oversized log and resulting log hangs. Not
having dug into the details of why this occurs so quickly in this mount
failure path, it does look like we'd never have got past this point on a
v5 fs (i.e., the above warning would become an error and we'd not enter
the xfs_log_mount_cancel() path).
Brian
On Tue, May 22, 2018 at 08:31:08AM -0400, Brian Foster wrote:
> FWIW, the initial console output is actually:
>
> [ 448.028253] XFS (loop0): Mounting V4 Filesystem
> [ 448.033540] XFS (loop0): Log size 9371840 blocks too large, maximum size is 1048576 blocks
> [ 448.042287] XFS (loop0): Log size out of supported range.
> [ 448.047841] XFS (loop0): Continuing onwards, but if log hangs are experienced then please report this message in the bug report.
> [ 448.060712] XFS (loop0): totally zeroed log
>
> ... which warns about an oversized log and resulting log hangs. Not
> having dug into the details of why this occurs so quickly in this mount
> failure path,
I suspect that it is a head and/or log tail pointer overflow, so when it
tries to do the first trans reserve of the mount - to write the
unmount record - it says "no log space available, please wait".
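To illustrate the kind of failure suspected here, the following is a toy model only. It assumes, purely for illustration, a 4096-byte block size and a 32-bit field for byte offsets; neither is stated in this thread, none of the names are XFS identifiers, and it does not try to reproduce the exact state in this report. It shows how truncated head/tail values can make a nearly empty circular log look full, so that a small reservation would wait:

```python
# Toy model of space accounting in a circular log. Hypothetical names;
# this is not the XFS grant head code.

FIELD_BITS = 32                      # assumed width of the byte-offset field
FIELD_MASK = (1 << FIELD_BITS) - 1


def truncate(value):
    """Value as it would appear after being squeezed into the fixed-width field."""
    return value & FIELD_MASK


def free_space(log_bytes, head, tail):
    """Free bytes in a circular log, judged only from the head and tail given."""
    if head >= tail:
        return log_bytes - (head - tail)
    # head appears to have wrapped around and sits just behind the tail
    return tail - head


def can_reserve(log_bytes, head, tail, need):
    """A reservation proceeds only if enough space appears to be free."""
    return free_space(log_bytes, head, tail) >= need


LOG_BYTES = 9371840 * 4096           # log size from the console output, assumed 4096-byte blocks
tail = 4096
head = tail + (1 << 32) - 512        # needs more than 32 bits to represent
need = 1024

print(can_reserve(LOG_BYTES, head, tail, need))                      # full-width values
print(can_reserve(LOG_BYTES, truncate(head), truncate(tail), need))  # truncated values
```

With full-width values the reservation fits easily; with the truncated values only 512 bytes appear free, so the caller would sleep waiting for space that nothing is going to release.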
> it does look like we'd never have got past this point on a
> v5 fs (i.e., the above warning would become an error and we'd not enter
> the xfs_log_mount_cancel() path).
And this comes back to my repeated comments about fuzzers needing
to fuzz properly made V5 filesystems as we catch and error out on
things like this. Fuzzing random collections of v4 filesystem
fragments will continue to trip over problems we've avoided with v5
filesystems, and this is further evidence to point to that.
I'd suggest that at this point, syzbot XFS reports should be
redirected to /dev/null. It's not worth our time to triage
unreviewed bot generated bug reports until the syzbot developers
start listening and acting on what we have been telling them
about fuzzing filesystems and reproducing bugs that are meaningful
and useful to us.
Cheers,
Dave.
--
Dave Chinner
[email protected]
On Wed, May 23, 2018 at 08:26:20AM +1000, Dave Chinner wrote:
> On Tue, May 22, 2018 at 08:31:08AM -0400, Brian Foster wrote:
> > FWIW, the initial console output is actually:
> >
> > [ 448.028253] XFS (loop0): Mounting V4 Filesystem
> > [ 448.033540] XFS (loop0): Log size 9371840 blocks too large, maximum size is 1048576 blocks
> > [ 448.042287] XFS (loop0): Log size out of supported range.
> > [ 448.047841] XFS (loop0): Continuing onwards, but if log hangs are experienced then please report this message in the bug report.
> > [ 448.060712] XFS (loop0): totally zeroed log
> >
> > ... which warns about an oversized log and resulting log hangs. Not
> > having dug into the details of why this occurs so quickly in this mount
> > failure path,
>
> I suspect that it is a head and/or log tail pointer overflow, so when it
> tries to do the first trans reserve of the mount - to write the
> unmount record - it says "no log space available, please wait".
>
> > it does look like we'd never have got past this point on a
> > v5 fs (i.e., the above warning would become an error and we'd not enter
> > the xfs_log_mount_cancel() path).
>
> And this comes back to my repeated comments about fuzzers needing
> to fuzz properly made V5 filesystems as we catch and error out on
> things like this. Fuzzing random collections of v4 filesystem
> fragments will continue to trip over problems we've avoided with v5
> filesystems, and this is further evidence to point to that.
>
>
> I'd suggest that at this point, syzbot XFS reports should be
> redirected to /dev/null. It's not worth our time to triage
> unreviewed bot generated bug reports until the syzbot developers
> start listening and acting on what we have been telling them
> about fuzzing filesystems and reproducing bugs that are meaningful
> and useful to us.
The whole point of fuzzing is to provide improper inputs. A kernel bug is a
kernel bug, even if it's in deprecated/unmaintained code, or involves userspace
doing something unexpected. If you have known buggy code in XFS that you refuse
to fix, then please provide a kernel config option so that users can disable the
unmaintained XFS formats/features, leaving the maintained ones. As-is, you seem
to be forcing everyone who enables CONFIG_XFS_FS to build known
buggy/unmaintained code into their kernel.
- Eric
On Tue, May 22, 2018 at 03:52:08PM -0700, Eric Biggers wrote:
> On Wed, May 23, 2018 at 08:26:20AM +1000, Dave Chinner wrote:
> > On Tue, May 22, 2018 at 08:31:08AM -0400, Brian Foster wrote:
> > > FWIW, the initial console output is actually:
> > >
> > > [ 448.028253] XFS (loop0): Mounting V4 Filesystem
> > > [ 448.033540] XFS (loop0): Log size 9371840 blocks too large, maximum size is 1048576 blocks
> > > [ 448.042287] XFS (loop0): Log size out of supported range.
> > > [ 448.047841] XFS (loop0): Continuing onwards, but if log hangs are experienced then please report this message in the bug report.
> > > [ 448.060712] XFS (loop0): totally zeroed log
> > >
> > > ... which warns about an oversized log and resulting log hangs. Not
> > > having dug into the details of why this occurs so quickly in this mount
> > > failure path,
> >
> > I suspect that it is a head and/or log tail pointer overflow, so when it
> > tries to do the first trans reserve of the mount - to write the
> > unmount record - it says "no log space available, please wait".
> >
> > > it does look like we'd never have got past this point on a
> > > v5 fs (i.e., the above warning would become an error and we'd not enter
> > > the xfs_log_mount_cancel() path).
> >
> > And this comes back to my repeated comments about fuzzers needing
> > to fuzz properly made V5 filesystems as we catch and error out on
> > things like this. Fuzzing random collections of v4 filesystem
> > fragments will continue to trip over problems we've avoided with v5
> > filesystems, and this is further evidence to point to that.
> >
> >
> > I'd suggest that at this point, syzbot XFS reports should be
> > redirected to /dev/null. It's not worth our time to triage
> > unreviewed bot generated bug reports until the syzbot developers
> > start listening and acting on what we have been telling them
> > about fuzzing filesystems and reproducing bugs that are meaningful
> > and useful to us.
>
> The whole point of fuzzing is to provide improper inputs.
Eric, we know what fuzzing is.
If people listened to us rather than just throwing stuff over the wall at
us, they'd already know that our own fuzzing code works on v5
filesystems and that it has found a lot more problems in recent
times than syzbot has.
And they'd also know that our own fuzzing stuff provides us with
easily debuggable, reproducible test cases instead of opaque,
difficult to analyse reports of things we already know about and
can't fix in a legacy on-disk format. i.e. it already does all the
things we're asking from the syzbot fuzzing.
> A kernel bug is a
> kernel bug, even if it's in deprecated/unmaintained code, or involves userspace
> doing something unexpected.
Yup, but then the severity and impact of the problem the bug exposes
has to be weighed against the risk it poses to the userbase, and the
impact the fix will have on the userbase. We went through this
process several years ago for this specific problem, like we do for
all on-disk format bugs.
Keep in mind that filesystems are persistent structures that have
lifetimes of tens of years. We have to support users with old
formats, regardless of the unfixable problems they may have. We do
what we can to mitigate those issues for them and encourage users to
upgrade their kernels and on-disk formats, but we can't just shut
off access to the old formats in new kernels because a new fuzzer
found an old problem we've known about for years.
[ FYI, this report is for an on-disk v4 format bug that was
introduced into mkfs about 15 years ago. It has since been fixed
for both v4 and v5 filesystems (~4 years ago, IIRC), but still
leaves us with a really, really long tail of production v4
format filesystems with the on-disk format bug present in them.
IOWs, when we came across this problem, we had the choice of two
things when initially validating the log size:
- only warning users of v4 filesystems and leaving them exposed to a
bug that could cause runtime hangs in very rare corner cases
(very low risk); or
- preventing them from accessing their filesystems and data on
kernel upgrade, thereby directly affecting millions of existing
filesystems in production around the world.
In this case, the *choice of least harm* was to warn about the
problem for v4 filesystems and continue onwards, but to reject
mounts that failed log size validation on the new v5 filesystem
because the on-disk format bug is fixed in v5 filesystems. ]
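
A minimal sketch of the policy described above, for readers who want it spelled out. The function and constant names are hypothetical and are not XFS kernel identifiers; the limit is the "maximum size is 1048576 blocks" figure from the console output quoted earlier, and only the oversized-log case is modelled:

```python
# Hypothetical model of the mount-time log size policy: warn and continue
# on v4, reject the mount on v5. Not the actual XFS validation code.

MAX_LOG_BLOCKS = 1048576  # taken from the console output in this thread


def check_log_size(log_blocks, fs_version):
    """Return (mount_allowed, message) for a given log size and format version."""
    if log_blocks <= MAX_LOG_BLOCKS:
        return True, ""
    msg = (f"Log size {log_blocks} blocks too large, "
           f"maximum size is {MAX_LOG_BLOCKS} blocks")
    if fs_version >= 5:
        # v5: the on-disk format bug is fixed, so a bad size is a hard error
        return False, msg
    # v4: warn only, so existing filesystems stay accessible
    return True, msg


allowed, message = check_log_size(9371840, fs_version=4)
print(allowed, message)
```

The same oversized log passed with `fs_version=5` is refused, which matches the observation earlier in the thread that a v5 filesystem would not have reached the xfs_log_mount_cancel() path.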
> If you have known buggy code in XFS that you refuse to fix, then
> please provide a kernel config option so that users can disable
> the unmaintained XFS formats/features, leaving the maintained
> ones.
What's the point of doing that when the attacker can just move to
some other exploitable filesystem e.g. ext2/3/4, vfat, btrfs, hfs,
etc? They have known vulnerable and exploitable on disk formats,
too.
i.e. this isn't an "XFS problem", this is an "all current block
device based kernel filesystems are built on top of a trust model
that breaks down when 3rd party storage access is allowed" problem.
This is not solvable by just saying "don't use filesystem X"....
> As-is, you seem to be forcing everyone who enables
> CONFIG_XFS_FS to build known buggy/unmaintained code into their
> kernel.
That's just hyperbole - software /by definition/ is known to be
buggy. And all Linux filesystems have unfixable, known bugs when it
comes to 3rd party manipulation of their on-disk format and that's the
reality we live in right now. How we chose to deal with that is
not a black/white decision (as I outlined above) - every distro has
users of the legacy XFS format, so even if we made it a config
option it will never be turned off in shipping distro kernels.
IOWs, mitigation decisions are difficult, but we have to draw a line
somewhere. In the case of XFS, we decided that the legacy format
needs to remain accepting of some bad input so users of those
formats don't get nasty surprises, but all new formats will strictly
validate all inputs and reject anything invalid. i.e. we get better
as time goes on, and that's why we want syzbot and other fuzzers to
focus on finding flaws in the new formats rather than the old.
Cheers,
-Dave.
--
Dave Chinner
[email protected]
On Tue, May 22, 2018 at 03:52:08PM -0700, Eric Biggers wrote:
> On Wed, May 23, 2018 at 08:26:20AM +1000, Dave Chinner wrote:
> > On Tue, May 22, 2018 at 08:31:08AM -0400, Brian Foster wrote:
> > > FWIW, the initial console output is actually:
> > >
> > > [ 448.028253] XFS (loop0): Mounting V4 Filesystem
> > > [ 448.033540] XFS (loop0): Log size 9371840 blocks too large, maximum size is 1048576 blocks
> > > [ 448.042287] XFS (loop0): Log size out of supported range.
> > > [ 448.047841] XFS (loop0): Continuing onwards, but if log hangs are experienced then please report this message in the bug report.
> > > [ 448.060712] XFS (loop0): totally zeroed log
> > >
> > > ... which warns about an oversized log and resulting log hangs. Not
> > > having dug into the details of why this occurs so quickly in this mount
> > > failure path,
> >
> > I suspect that it is a head and/or log tail pointer overflow, so when it
> > tries to do the first trans reserve of the mount - to write the
> > unmount record - it says "no log space available, please wait".
> >
> > > it does look like we'd never have got past this point on a
> > > v5 fs (i.e., the above warning would become an error and we'd not enter
> > > the xfs_log_mount_cancel() path).
> >
> > And this comes back to my repeated comments about fuzzers needing
> > to fuzz properly made V5 filesystems as we catch and error out on
> > things like this. Fuzzing random collections of v4 filesystem
> > fragments will continue to trip over problems we've avoided with v5
> > filesystems, and this is further evidence to point to that.
> >
> >
> > I'd suggest that at this point, syzbot XFS reports should be
> > redirected to /dev/null. It's not worth our time to triage
> > unreviewed bot generated bug reports until the syzbot developers
> > start listening and acting on what we have been telling them
> > about fuzzing filesystems and reproducing bugs that are meaningful
> > and useful to us.
>
> The whole point of fuzzing is to provide improper inputs. A kernel
> bug is a kernel bug, even if it's in deprecated/unmaintained code, or
> involves userspace doing something unexpected. If you have known
> buggy code in XFS that you refuse to fix,
Ok, that's it.
I disagree with Google's syzbot strategy, and I dissent most vehemently!
The whole point of constructing free software in public is that we
people communally build things that anyone can use for any purpose and
that anyone can modify. That privilege comes with a societal
expectation that the people using this commons will contribute to the
upkeep of that commons or it rots. For end users that means helping us
to find the gaps, but for software developers at large multinational
companies that means (to a first approximation) pitching in to write the
code, write the documentation, and to fix the problems.
Yes, there are many places where fs metadata validation is insufficient
to avoid misbehavior. Google's strategy of dumping vulnerability
disclosures on public mailing lists every week, demanding that other
people regularly reallocate their time to fix these problems, and not
helping to fix anything breaks our free software societal norms. Again,
the whole point of free software is to share the responsibility, share
the work, and share the gains. That is how collaboration works.
Help us to improve the software so that we all will be better off.
Figure out how to strengthen the validation, figure out how to balance
the risk of exposure against the risk of nonfunctionality, and figure
out how to discuss with this community. That is how the game works.
Google has enough money and smart people that you have (collectively)
learned how to spoof humans, so you can well afford to spend a small
fraction of that hiring some developers and writers and putting them to
work with us.
If you refuse to do this, you already /have/ a config option to turn off
the 'known buggy/unmaintained code in [your] kernel'; use it. I will
not repeat this message again[1].
--D
[1] https://marc.info/?l=linux-xfs&m=152303106427867&w=2
> then please provide a kernel config option so that users can disable
> the unmaintained XFS formats/features, leaving the maintained ones.
> As-is, you seem to be forcing everyone who enables CONFIG_XFS_FS to
> build known buggy/unmaintained code into their kernel.
>
> - Eric
Hi Darrick,
On Wed, May 23, 2018 at 12:44:25AM -0700, Darrick J. Wong wrote:
> On Tue, May 22, 2018 at 03:52:08PM -0700, Eric Biggers wrote:
> > On Wed, May 23, 2018 at 08:26:20AM +1000, Dave Chinner wrote:
> > > On Tue, May 22, 2018 at 08:31:08AM -0400, Brian Foster wrote:
> > > > On Mon, May 21, 2018 at 10:55:02AM -0700, syzbot wrote:
> > > > > Hello,
> > > > >
> > > > > syzbot found the following crash on:
> > > > >
> > > > > HEAD commit: 203ec2fed17a Merge tag 'armsoc-fixes' of git://git.kernel...
> > > > > git tree: upstream
> > > > > console output: https://syzkaller.appspot.com/x/log.txt?x=11c1ad77800000
> > > > > kernel config: https://syzkaller.appspot.com/x/.config?x=f3b4e30da84ec1ed
> > > > > dashboard link: https://syzkaller.appspot.com/bug?extid=568245b88fbaedcb1959
> > > > > compiler: gcc (GCC) 8.0.1 20180413 (experimental)
> > > > > syzkaller repro:https://syzkaller.appspot.com/x/repro.syz?x=122c7427800000
> > > > > C reproducer: https://syzkaller.appspot.com/x/repro.c?x=10387057800000
> > > > >
> > > > > IMPORTANT: if you fix the bug, please add the following tag to the commit:
> > > > > Reported-by: [email protected]
> > > > >
> > > > > (ptrval): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > > > > ................
> > > > > (ptrval): 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00 00
> > > > > ................
> > > > > XFS (loop0): metadata I/O error in "xfs_trans_read_buf_map" at daddr 0x2 len
> > > > > 1 error 117
> > > > > XFS (loop0): xfs_imap_lookup: xfs_ialloc_read_agi() returned error -117,
> > > > > agno 0
> > > > > XFS (loop0): failed to read root inode
> > > >
> > > > FWIW, the initial console output is actually:
> > > >
> > > > [ 448.028253] XFS (loop0): Mounting V4 Filesystem
> > > > [ 448.033540] XFS (loop0): Log size 9371840 blocks too large, maximum size is 1048576 blocks
> > > > [ 448.042287] XFS (loop0): Log size out of supported range.
> > > > [ 448.047841] XFS (loop0): Continuing onwards, but if log hangs are experienced then please report this message in the bug report.
> > > > [ 448.060712] XFS (loop0): totally zeroed log
> > > >
> > > > ... which warns about an oversized log and resulting log hangs. Not
> > > > having dug into the details of why this occurs so quickly in this mount
> > > > failure path,
> > >
> > > I suspect that it is a head and/or log tail pointer overflow, so when it
> > > tries to do the first trans reserve of the mount - to write the
> > > unmount record - it says "no log space available, please wait".
> > >
> > > > it does look like we'd never have got past this point on a
> > > > v5 fs (i.e., the above warning would become an error and we'd not enter
> > > > the xfs_log_mount_cancel() path).
> > >
> > > And this comes back to my repeated comments about fuzzers needing
> > > to fuzz properly made V5 filesystems as we catch and error out on
> > > things like this. Fuzzing random collections of v4 filesystem
> > > fragments will continue to trip over problems we've avoided with v5
> > > filesystems, and this is further evidence to point to that.
> > >
> > >
> > > I'd suggest that at this point, syzbot XFS reports should be
> > > redirected to /dev/null. It's not worth our time to triage
> > > unreviewed bot generated bug reports until the syzbot developers
> > > start listening and acting on what we have been telling them
> > > about fuzzing filesystems and reproducing bugs that are meaningful
> > > and useful to us.
> >
> > The whole point of fuzzing is to provide improper inputs. A kernel
> > bug is a kernel bug, even if it's in deprecated/unmaintained code, or
> > involves userspace doing something unexpected. If you have known
> > buggy code in XFS that you refuse to fix,
>
> Ok, that's it.
>
> I disagree with Google's syzbot strategy, and I dissent most vehemently!
>
> The whole point of constructing free software in public is that we
> people communally build things that anyone can use for any purpose and
> that anyone can modify. That privilege comes with a societal
> expectation that the people using this commons will contribute to the
> upkeep of that commons or it rots. For end users that means helping us
> to find the gaps, but for software developers at large multinational
> companies that means (to a first approximation) pitching in to write the
> code, write the documentation, and to fix the problems.
>
> Yes, there are many places where fs metadata validation is insufficient
> to avoid misbehavior. Google's strategy of dumping vulnerability
> disclosures on public mailing lists every week, demanding that other
> people regularly reallocate their time to fix these problems, and not
> helping to fix anything breaks our free software societal norms. Again,
> the whole point of free software is to share the responsibility, share
> the work, and share the gains. That is how collaboration works.
>
> Help us to improve the software so that we all will be better off.
>
> Figure out how to strengthen the validation, figure out how to balance
> the risk of exposure against the risk of nonfunctionality, and figure
> out how to discuss with this community. That is how the game works.
>
> Google has enough money and smart people that you have (collectively)
> learned how to spoof humans, so you can well afford to spend a small
> fraction of that hiring some developers and writers and putting them to
> work with us.
>
> If you refuse to do this, you already /have/ a config option to turn off
> the 'known buggy/unmaintained code in [your] kernel'; use it. I will
> not repeat this message again[1].
>
I actually agree that Google should be doing more, but I think you're also shooting
the messenger. The fact is that these bugs exist, a not-insignificant number of
which are exploitable security vulnerabilities, and *everyone* needs to be doing
more to address them. I'm not sure you're aware, but Google employees have
already fixed around 200 syzkaller/syzbot reported bugs in the last 6 months;
that's hardly "not helping to fix anything". Unfortunately it's still not
enough to even keep up with the rate of new bugs being reported, so we need to
be doing more still. (And for context, as at other companies it's unfortunately
difficult to get organizational support for kernel-wide work; I'm not actually
on the syzbot team and have so far ended up working quite a bit on this in my
"free time", fixing 35+ bugs, because I care, and I know that other
people care too, about the security and reliability of the Linux kernel.)
But even with more resources, the kernel is a huge project and there are always
going to be some subsystems without in-house experts. For example, AFAIK Google
doesn't use XFS for anything, and CONFIG_XFS_FS isn't even enabled in Google's
production, Android, or Chrome OS. Given that you are actually apparently paid
to work on XFS, I'd really hope you'd have a better attitude towards receiving
XFS bug reports with reproducers. Other maintainers have responded differently
and it's unclear why XFS is so different. For example the F2FS maintainer
immediately fixed all the syzbot bugs that were reported to them, and
maintainers of some networking subsystems have also been very responsive and
positive about receiving bug reports. Though, unfortunately there seems to be a
lot of de-facto unmaintained code in the kernel too (or alternatively: Andrew
and Linus are the "maintainers"), so I will at least give you more credit than
that :-)
Now, if you *really* don't want syzbot to report XFS bugs as you believe XFS
contains known unfixable bugs or for other reasons, you can formally ask Dmitry
to remove CONFIG_XFS_FS from the syzbot config. But of course that doesn't make
the bugs go away, it just makes the bug reports go away; you'll have to fix them
eventually anyway, one way or another. I do think you're drastically
underestimating how useful the syzbot bug reports can be too -- note e.g. that
the bug Dave fixed by "fs: don't scan the inode cache before SB_BORN is set"
took only 3 days to be reported by syzbot after it gained support for mounting
XFS filesystems. AFAICS that bug was in XFS for 7 years and was causing
production systems to mysteriously crash (very rarely), yet it took syzbot only
3 days to send you a C reproducer.
This is only at the early stage too --- syzkaller doesn't know how to fuzz the
XFS-specific ioctls yet, for example, but it could be taught. It's already been
finding ext4 bugs that allowed anyone with access to an ext4 directory to
corrupt the filesystem and crash the kernel. And note that syzkaller is
coverage-guided, using CONFIG_KCOV, so it *will* find bugs that you never
thought to test for in manually written (fuzz) tests. Non-coverage-guided
fuzzers are no longer state of the art. I've been really amazed at the bugs
syzkaller has been able to find in other kernel subsystems, e.g. obscure races that
no one would have ever thought to test for.
Anyone is welcome to contribute to syzkaller and syzbot too; all the code is on
GitHub and Apache licensed. AFAIK the only thing missing from the git repo is a
few configuration files that control how the specific syzbot installation that
Dmitry is running is set up (like how many VMs to use, the credentials, etc.)
Anyway, I'm going to keep helping with bugs either way. Getting into arguments
like this is just a waste of time, and distracts from the real work getting
done.
Thanks,
- Eric
On 5/23/18 11:20 AM, Eric Biggers wrote:
> Hi Darrick,
...
> Now, if you *really* don't want syzbot to report XFS bugs as you believe XFS
> contains known unfixable bugs or for other reasons, you can formally ask Dmitry
> to remove CONFIG_XFS_FS from the syzbot config. But of course that doesn't make
> the bugs go away, it just makes the bug reports go away; you'll have to fix them
> eventually anyway, one way or another.
I'd revise that to "have to fix /some/ of them anyway."
What I'm personally hung up on are the bugs where the "exploit" involves merely
mounting a crafted filesystem that in reality would never (until the heat death
of the universe) corrupt itself into that state on its own; it's the "malicious
image" case, which is quite different than exposing fundamental bugs like the
SB_BORN race or the user-exploitable ext4 flaw you mentioned in your reply.
Those are more insidious and/or things which can be hit by real users in real life.
I don't know if I can win the "malicious images aren't a critical security
threat" battle, but I do think they are at least a different class of flaws,
because as Dave said, mount is supposed to be a privileged operation.
In a perfect world we'd fix them anyway, but I don't know that our resource
pool can keep up with your google-scale bot and still make progress in other
critical areas.
Anyway, the upshot is that we're probably just not going to care much about V4
filesystem oops-or-hang-on-mount bugs. Those problems are solved (largely) with
V5 filesystem format. Maybe I /will/ propose a system-wide tunable to disallow
V4 for those who are worried about such things.
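To make the V4/V5 difference concrete, here is a toy userspace model of the log size check from the console output earlier in the thread. It is only a sketch: the function name and return strings are made up for illustration, and the only facts taken from the thread are the 1048576-block maximum and the observation that the V4 warning would be an error on V5.

```python
# Toy model of the mount-time log size check; not the real XFS code.
# check_log_size() and its return values are hypothetical names.

XFS_MAX_LOG_BLOCKS = 1024 * 1024  # 1048576, the maximum printed in the console log


def check_log_size(log_blocks, v5):
    """Return 'ok', 'warn' (V4: keep mounting) or 'error' (V5: fail the mount)."""
    if log_blocks <= XFS_MAX_LOG_BLOCKS:
        return "ok"
    # Oversized log: V4 only warns and continues, V5 refuses to mount.
    return "error" if v5 else "warn"
```

With the value from the report, check_log_size(9371840, v5=False) yields "warn", so the V4 mount carries on into the path that later hangs, while v5=True yields "error" and the mount stops there.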
To Darrick's points about more collaboration, I still wish that our requests
for more traditional fs fuzzer reporting (i.e. a filesystem image) weren't met
with such resistance. Tailoring your bug reports to the needs of the developer
community you're interacting with seems like a pretty reasonable thing to do.
As an aside, I wonder how much coverage of the V5 format code syzkaller /has/
achieved; that would be another useful datapoint google could provide - if
syzkaller is in fact traversing the V5 codepaths and isn't turning anything
up, that'd be pretty useful to know.
Thanks,
-Eric
On Wed, May 23, 2018 at 09:20:15AM -0700, Eric Biggers wrote:
> Now, if you *really* don't want syzbot to report XFS bugs as you believe XFS
> contains known unfixable bugs or for other reasons, you can formally ask Dmitry
> to remove CONFIG_XFS_FS from the syzbot config.
We haven't said "we don't want syzbot to run on XFS" - we've been
saying "we want syzbot to run on the new XFS format". i.e. you've
got completely the wrong end of the stick.
> But of course that doesn't make
> the bugs go away, it just makes the bug reports go away; you'll have to fix them
> eventually anyway, one way or another. I do think you're drastically
> underestimating how useful the syzbot bug reports can be too -- note e.g. that
> the bug Dave fixed by "fs: don't scan the inode cache before SB_BORN is set"
> took only 3 days to be reported by syzbot after it gained support for mounting
> XFS filesystems. AFAICS that bug was in XFS for 7 years and was causing
> production systems to mysteriously crash (very rarely), yet it took syzbot only
> 3 days to send you a C reproducer.
We got the first ever usable user bug report for this in late
February on a v5 filesystem. Just because a bug has been there for
a long time, it doesn't mean that users or test programs are
tripping over it. e.g. trinity has been fuzzing filesystems (as
have many other tools), but they never hit this because of the
unlikely combination of events needed to trigger the failure.
The first proposed fix was mid-March:
https://www.spinics.net/lists/linux-xfs/msg16601.html
IOWs, trying to associate this bug with the on-disk format issues we
want fixed, or even attributing the finding and fixing this bug to
syzbot is stretching the truth somewhat. Yes, syzbot tripped over it
fairly quickly and that is great, but let's not try to rewrite
history....
However, I think this silly desire to get everything syzbot reports
attributed to syzbot regardless of reality has clouded the important
observation that should have been made here. Everyone seems to have
missed the fact that syzbot uncovered a general class of filesystem
implementation error. i.e. Several filesystem implementations have
failed to handle ->fill_super errors correctly and syzbot tripped
over many of them - XFS is just one example.
The common mistake being made is failing to clear sb->s_fs_info when
it was freed on ->fill_super failure, and hence had subsequent
problems when ->kill_super was called and the code assumed
->s_fs_info was still valid. There have been other problems due to
sb->s_fs_info needing to be assigned before the filesystem is
fully set up, and some of them were fixed by the SB_BORN change the
above patch morphed into after initial review.
IOWs, there is a general class of implementation bug here, and maybe
there's something we can learn from that - is our documentation
lacking, the API too convoluted, etc? Understanding why this
happened and making sure we don't do it again is far more important
than fixing any individual bug. And what other general
programming/API error patterns has syzbot tripped over that nobody
has noticed because there's no-one actually paying attention to the
general scope of bugs that syzbot is discovering?
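The ->fill_super failure pattern described above can be sketched as a small userspace model. This is not kernel code: the classes and function signatures are hypothetical stand-ins, and "freed" is just a flag that models the private data having been released.

```python
# Userspace model of the ->fill_super / ->kill_super error pattern.
# All names below are hypothetical stand-ins, not kernel APIs.


class FsInfo:
    """Stands in for the per-mount private data hung off sb->s_fs_info."""

    def __init__(self):
        self.freed = False


class SuperBlock:
    def __init__(self):
        self.s_fs_info = None


def fill_super(sb, fail, clear_on_error):
    """Set up private data; on failure free it and optionally clear the pointer."""
    info = FsInfo()
    sb.s_fs_info = info          # assigned before setup has finished
    if fail:
        info.freed = True        # models freeing the private data
        if clear_on_error:
            sb.s_fs_info = None  # the step the buggy implementations skipped
        return -1
    return 0


def kill_super(sb):
    """Tear down the superblock and report what happened to the private data."""
    info = sb.s_fs_info
    if info is None:
        return "nothing to do"
    if info.freed:
        return "use-after-free"  # teardown trusted a stale pointer
    info.freed = True
    sb.s_fs_info = None
    return "freed"
```

A failed fill_super() that leaves the pointer set makes kill_super() report "use-after-free"; clearing it on the error path turns the same teardown into "nothing to do".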
> This is only at the early stage too --- syzkaller doesn't know how to fuzz the
> XFS-specific ioctls yet, for example, but it could be taught. It's already been
> finding ext4 bugs that allowed anyone with access to an ext4 directory to
> corrupt the filesystem and crash the kernel. And note that syzkaller is
> coverage-guided, using CONFIG_KCOV, so it *will* find bugs that you never
> thought to test for in manually written (fuzz) tests. Non-coverage-guided
> fuzzers are no longer state of the art. I've been really amazed at the bugs
> syzkaller has been able to find in other kernel subsystems, e.g. obscure races that
> no one would have ever thought to test for.
OTOH, knowing how many bugs lurk in our code base, I'm still amazed
that tools like syzbot find so few of them.
Cheers,
Dave.
--
Dave Chinner
[email protected]
On Wed, May 23, 2018 at 01:01:59PM -0500, Eric Sandeen wrote:
>
> What I'm personally hung up on are the bugs where the "exploit" involves merely
> mounting a crafted filesystem that in reality would never (until the heat death
> of the universe) corrupt itself into that state on its own; it's the "malicious
> image" case, which is quite different than exposing fundamental bugs like the
> SB_BORN race or the user-exploitable ext4 flaw you mentioned in your reply.
> Those are more insidious and/or things which can be hit by real users in real life.
Well, it *can* be hit in real life. If you have a system which auto
mounts USB sticks, then an attacker might be able to weaponize that
bug by creating a USB stick where, once it is mounted and the user opens a
particular file, the buffer overrun causes code to be executed that
grabs the user's credentials (e.g., ssh-agent keys, OATH creds, etc.)
and exfiltrates them to a collection server.
Fedora and Chrome OS might be two such platforms where someone could
very easily create a weaponized exploit tool where you could insert a
file system buffer overrun bug, and "hey presto!" it becomes a serious
zero day vulnerability.
(I recently suggested to a security researcher who was concerned that
file system developers weren't taking these sorts of things seriously
enough that they could do a service to the community by creating a
demonstration of how these sorts of bugs can be weaponized. And I suspect
it could be done about as easily on Chrome OS as on Fedora, and that can
be one way that an argument could be made to management that more
resources should be applied to this problem. :-)
Of course, not all bugs triggered by a maliciously crafted file system
are equally weaponizable. An errors=panic or a NULL dereference are
probably not easily exploitable at all. A buffer overrun (and I fixed
two in ext4 in the last two days while being stuck in a T13 standards
meeting, so I do feel your pain) might be a very different story.
Solutions
---------
One of the things I've wanted to get help with from the syzbot folks is
whether there is some kind of machine learning or expert system evaluation
that could be done so malicious image bugs could be binned into
different categories, based on how easily they can be weaponized.
That way, when there is a resource shortage situation, humans can be
more easily guided into determining which bugs should be prioritized
and given attention, and which we can defer until we have more time.
Or maybe it would be useful if there was a way for maintainers to
annotate bugs with priority and severity levels, and maybe
make comments that can be viewed from the Syzbot dashboard UI.
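As a rough illustration of the binning idea, here is a toy triage table. It is a sketch only: the bin names and ranking are invented for this example, and the three bug kinds and their relative ordering are simply the ones mentioned in this mail (buffer overruns treated as more worrying than errors=panic or NULL dereferences).

```python
# Toy triage of malicious-image bug reports by how easily they might be weaponized.
# The bins and the ranking are hypothetical; only the three example kinds and
# their relative ordering come from the discussion in this thread.

SEVERITY = {
    "buffer-overrun": "high",  # might allow code execution
    "null-deref": "low",       # probably not easily exploitable
    "errors=panic": "low",     # probably not easily exploitable
}

# Unknown kinds sort between high and low so that a human looks at them first.
_RANK = {"high": 0, "unknown": 1, "low": 2}


def triage(kind):
    """Map a bug kind to a severity bin, defaulting to 'unknown'."""
    return SEVERITY.get(kind, "unknown")


def prioritize(reports):
    """Order bug kinds so the most weaponizable ones come first (stable sort)."""
    return sorted(reports, key=lambda kind: _RANK[triage(kind)])
```

For example, prioritize(["null-deref", "buffer-overrun", "errors=panic"]) puts "buffer-overrun" first and keeps the other two in their original order.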
The other thing that perhaps could be done is to set up a system where
the USB stick is automounted in a guest VM (using libvirt in Fedora,
and perhaps Crostini for Chrome OS), and the contents of the file
system would then get exported from the guest OS to the host OS using
either NFS or 9P. (9P2000.u is the solution that was used in
gVisor[1].)
[1] https://github.com/google/gvisor
It could be that putting this kind of security layer in front of
automounted USB sticks is less work than playing whack-a-mole fixing a
lot of security bugs with maliciously crafted file systems.
- Ted
On Wed, May 23, 2018 at 07:41:15PM -0400, Theodore Y. Ts'o wrote:
> On Wed, May 23, 2018 at 01:01:59PM -0500, Eric Sandeen wrote:
> >
> > What I'm personally hung up on are the bugs where the "exploit" involves merely
> > mounting a crafted filesystem that in reality would never (until the heat death
> > of the universe) corrupt itself into that state on its own; it's the "malicious
> > image" case, which is quite different than exposing fundamental bugs like the
> > SB_BORN race or the user-exploitable ext4 flaw you mentioned in your reply.
> > Those are more insidious and/or things which can be hit by real users in real life.
>
> Well, it *can* be hit in real life. If you have a system which auto
> mounts USB sticks, then an attacker might be able to weaponize that
> bug by creating a USB stick where, once it is mounted and the user opens a
> particular file, the buffer overrun causes code to be executed that
> grabs the user's credentials (e.g., ssh-agent keys, OATH creds, etc.)
> and exfiltrates them to a collection server.
We've learnt this lesson the hard way over and over again: don't
parse untrusted input in privileged contexts. How many times do we
have to make the same mistakes before people start to learn from
them?
User automounting of removable storage should be done via a
privilege separation mechanism and hence avoid this whole class of
security problems. We can get this separation by using FUSE in these
situations, right?
> Fedora and Chrome OS might be two such platforms where someone could
> very easily create a weaponized exploit tool where you could insert a
> file system buffer overrun bug, and "hey presto!" it becomes a serious
> zero day vulnerability.
There's little we can do to prevent people from exploiting
flaws in the filesystem's on-disk format. No filesystem has robust,
exhaustive verification of all its metadata, nor is that
something we can really check at runtime due to the complexity
and overhead of runtime checking.
And then when you consider all the avenues to data exposure and
unprivileged runtime manipulation of on-disk metadata (e.g.
intentionally cross linking critical metadata blocks into user
data files), it's pretty obvious that untrusted filesystem images
are not something that should *ever* be parsed in a privileged
context.
> (I recently suggested to a security researcher who was concerned that
> file system developers weren't taking these sorts of things seriously
> enough that they could do a service to the community by creating a
> demonstration of how these sorts of bugs can be weaponized. And I suspect
> it could be done about as easily on Chrome OS as on Fedora, and that can
> be one way that an argument could be made to management that more
> resources should be applied to this problem. :-)
There's "taking it seriously" and then there's "understanding that
we can't stop new exploits from being developed because they
exploit a flaw in the trust model".
i.e. kernel filesystems are built on a directly connected trust
model where the storage "guarantees" it will return exactly what the
filesystem has stored in it. Hence our filesystems are not built
around tamper-evident/tamper-proof structures and algorithms that
are needed to robustly detect 3rd-party manipulation because their
trust model says they don't need to defend against such attacks.
As such, the most robust way I can see of defending *the kernel*
against malicious/untrusted filesystem images is to move parsing of
those images out of the kernel privilege context altogether. The
parser can then be sandboxed appropriately as you suggested and
we've avoided the problem of kernel level exploits from malicious
filesystem images....
> Of course, not all bugs triggered by a maliciously crafted file system
> are equally weaponizable. An errors=panic or a NULL dereference are
> probably not easily exploitable at all.
Bugs don't have to be exploitable to be a "security issue". Detected
filesystem corruptions on an errors=panic mount, or undetected
problems that cause a x/NULL deref are still a user-triggerable
kernel crash (i.e. a DOS) and therefore considered a security
problem.
> A buffer overrun (and I fixed
> two in ext4 in the last two days while being stuck in a T13 standards
> meeting, so I do feel your pain) might be a very different story.
The fact you are currently finding and fixing buffer overruns in ext4
solidly demonstrates my point about existing filesystems being
largely untrustable and unfixable. :/
Cheers,
Dave.
--
Dave Chinner
[email protected]
On Thu, May 24, 2018 at 10:49:31AM +1000, Dave Chinner wrote:
>
> We've learnt this lesson the hard way over and over again: don't
> parse untrusted input in privileged contexts. How many times do we
> have to make the same mistakes before people start to learn from
> them?
Good question. For how many years (or is it decades, now) has Fedora
auto-mounted USB sticks? :-) Let me know when you successfully get
Fedora to turn off a feature which appears to have great user appeal.
And I'll note that Eric Biederman just posted a patch series allowing
unprivileged processes to mount file systems in containers.
And remember the mantra which the container people keep chanting.
Containers are just as secure as VM's. Hahahaha.....
> User automounting of removable storage should be done via a
> privilege separation mechanism and hence avoid this whole class of
> security problems. We can get this separation by using FUSE in these
> situations, right?
FUSE is a pretty terrible security boundary. And not all file systems
have FUSE support. As I had suggested earlier, probably better to use
9P, and mount the file system in a VM.
> Bugs don't have to be exploitable to be a "security issue". Detected
> filesystem corruptions on an errors=panic mount, or undetected
> problems that cause a x/NULL deref are still a user-triggerable
> kernel crash (i.e. a DOS) and therefore considered a security
> problem.
I disagree here. I think it's worth it to disambiguate the two. If
you have physical access to the machine, you can also apply AC mains
voltage to the USB port, which will likely cause the system to crash.
And at least for Chrome OS, it reboots really quickly. :-)
If someone can gain control of the system so they can exfiltrate data,
or be able to modify files owned by root, that's a much bigger deal
than crashing the machine in my view.
- Ted
On Wed, May 23, 2018 at 08:59:06PM -0400, Theodore Y. Ts'o wrote:
> On Thu, May 24, 2018 at 10:49:31AM +1000, Dave Chinner wrote:
> >
> > We've learnt this lesson the hard way over and over again: don't
> > parse untrusted input in privileged contexts. How many times do we
> > have to make the same mistakes before people start to learn from
> > them?
>
> Good question. For how many years (or is it decades, now) has Fedora
> auto-mounted USB sticks? :-) Let me know when you successfully get
> Fedora to turn off a feature which appears to have great user appeal.
They'll do that when we provide them with a safe, easy to use
solution to the problem. This is our problem to solve, not
blame-shift it away.
> And I'll note that Eric Biederman just posted a patch series allowing
> unprivileged processes to mount file systems in containers.
Yup, that's to make it easy for virtual kernel filesystems to be
mounted inside containers, and to solve some of FUSE's security
issues caused by needing root permissions to mount FUSE filesystems.
Enabling unprivileged mounts requires an opt-in flag in the
filesystem fs-type definition, and we most certainly won't be
setting that flag on XFS. I also doubt it will ever get set on any
other existing block device based filesystem because of the trust
model problems it exposes.
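A simplified model of that opt-in, for readers who haven't looked at the mount permission check. Treat everything here as an assumption made for illustration: the flag name FS_USERNS_MOUNT and its value are given from memory of the kernel headers, and the two boolean arguments stand in for the real capability checks.

```python
# Simplified model of the unprivileged-mount opt-in; not the kernel's actual code.
# FS_USERNS_MOUNT (name and value) is an assumption from memory of the kernel
# headers; may_mount() and its arguments are hypothetical.

FS_USERNS_MOUNT = 8  # opt-in bit a filesystem type would set in its fs_flags


def may_mount(fs_flags, has_global_admin, has_userns_admin):
    """Decide whether a mount request is allowed.

    has_global_admin: caller holds the admin capability in the initial namespace.
    has_userns_admin: caller holds it only inside its own user namespace.
    """
    if has_global_admin:
        return True
    # Without global privilege the filesystem type must have opted in.
    return bool(fs_flags & FS_USERNS_MOUNT) and has_userns_admin
```

A block-device filesystem that never sets the bit, as described above for XFS, is modelled by fs_flags=0: a container root with only namespace-local privilege gets False.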
> And remember the mantra which the container people keep chanting.
> Containers are just as secure as VM's. Hahahaha.....
So your solution is to have VM guests and container users spin up
sandboxed VMs to access filesystem images safely? That's not really
a practical solution. :/
> > User automounting of removable storage should be done via a
> > privilege separation mechanism and hence avoid this whole class of
> > security problems. We can get this separation by using FUSE in these
> > situations, right?
>
> FUSE is a pretty terrible security boundary.
That may be true, but it's so much better than using the kernel to
parse untrusted filesystem metadata.
> And not all file systems
> have FUSE support.
Except there is now fusefs-lkl, so all kernel filesystems are fully
accessible through FUSE.
> > Bugs don't have to be exploitable to be a "security issue". Detected
> > filesystem corruptions on an errors=panic mount, or undetected
> > problems that cause a x/NULL deref are still a user-triggerable
> > kernel crash (i.e. a DOS) and therefore considered a security
> > problem.
>
> I disagree here. I think it's worth it to disambiguate the two.
Been trying to get security people to understand this for years.
I've given up because there's always some new security person who
follows The Process and simply does not understand that there is a
difference.
Cheers,
Dave.
--
Dave Chinner
[email protected]
On 5/23/18 7:59 PM, Theodore Y. Ts'o wrote:
> On Thu, May 24, 2018 at 10:49:31AM +1000, Dave Chinner wrote:
>> We've learnt this lesson the hard way over and over again: don't
>> parse untrusted input in privileged contexts. How many times do we
>> have to make the same mistakes before people start to learn from
>> them?
> Good question. For how many years (or is it decades, now) has Fedora
> auto-mounted USB sticks? :-) Let me know when you successfully get
> Fedora to turn off a feature which appears to have great user appeal.
So we have decades of filesystem design based on one threat model, and
some desktop environments decided to blow it all up 'cause it's more
convenient that way.
Super. Maybe the email client can start auto-running attachments, too,
For The Convenience.
What's the phrase... poor planning on your part doesn't constitute an
emergency on my part? :/ (not actually referring to /you/, Ted) ;)
Anyway, if desktops auto-mounting USB sticks is the primary threat,
maybe time would be better spent adding restrictions there - allow only a
subset of common USB formats which are simple and have been fuzzed to hell
and back, rather than mounting whatever you happened to find lying in the
parking lot at work and hoping that somebody, somewhere, has discovered and
fixed every attack vector now that we've blown up the trust model
For The Convenience.
(Digging through dconf-editor, there's just on/off, no gui method at
least, to include or exclude automountable fs types. It's all or
nothing. TBH I have no idea how many mechanisms are out there to do
this automounting - hal/udev/systemd/gnome/dbus/...?)
Anyway, fuzzers aside, it sure seems like if we can't un-ring the
automount bell, it'd be prudent to limit it to FAT by default and focus
efforts on making that as safe as possible.
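Something like the following allowlist is what I have in mind, sketched in userspace Python. The function name and the set are hypothetical; no existing automounter is claimed to expose this knob, and "vfat" is used here simply as the fstype string for FAT.

```python
# Hypothetical automount policy: only filesystem types on an allowlist are mounted.
# Neither the function nor the default set corresponds to an existing desktop API.

DEFAULT_ALLOWED = frozenset({"vfat"})  # FAT only by default


def may_automount(fstype, allowed=DEFAULT_ALLOWED):
    """Return True if a removable device with this filesystem type may be automounted."""
    return fstype.lower() in allowed
```

With the default set, may_automount("vfat") is True and may_automount("xfs") is False; an admin who wants more can pass a wider set explicitly.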
>> Bugs don't have to be exploitable to be a "security issue". Detected
>> filesystem corruptions on an errors=panic mount, or undetected
>> problems that cause a x/NULL deref are still a user-triggerable
>> kernel crash (i.e. a DOS) and therefore considered a security
>> problem.
>
> I disagree here. I think it's worth it to disambiguate the two. If
> you have physical access to the machine, you can also apply AC mains
> voltage to the USB port, which will likely cause the system to crash.
> And at least for Chrome OS, it reboots really quickly.
Even after you apply AC mains to the USB port? Cool, Chrome's pretty
resilient. ;)
I think Dave may have been just stating a reality there rather than agreeing
with it, not sure.
> If someone can gain control of the system so they can exfiltrate data,
> or be able to modify files owned as root, that's a much bigger deal
> than crashing the machine in my view.
For sure. I guess some subset of the crashes could be more carefully
crafted to be more dangerous, but fuzzers really don't tell us that today,
in fact the more insidious flaws that don't turn up as a crash or hang likely
go unnoticed.
-Eric
On Thu, May 24, 2018 at 1:41 AM, Theodore Y. Ts'o <[email protected]> wrote:
> On Wed, May 23, 2018 at 01:01:59PM -0500, Eric Sandeen wrote:
>>
>> What I'm personally hung up on are the bugs where the "exploit" involves merely
>> mounting a crafted filesystem that in reality would never (until the heat death
>> of the universe) corrupt itself into that state on its own; it's the "malicious
>> image" case, which is quite different than exposing fundamental bugs like the
>> SB_BORN race or the user-exploitable ext4 flaw you mentioned in your reply.
>> Those are more insidious and/or things which can be hit by real users in real life.
>
> Well, it *can* be hit in real life. If you have a system which auto
> mounts USB sticks, then an attacker might be able to weaponize that
> bug by creating a USB stick where, once it is mounted and the user opens a
> particular file, the buffer overrun causes code to be executed that
> grabs the user's credentials (e.g., ssh-agent keys, OATH creds, etc.)
> and exfiltrates them to a collection server.
>
> Fedora and Chrome OS might be two such platforms where someone could
> very easily create a weaponized exploit tool where you could insert a
> file system buffer overrun bug, and "hey presto!" it becomes a serious
> zero day vulnerability.
>
> (I recently suggested to a security researcher who was concerned that
> file system developers weren't taking these sorts of things seriously
> enough could do a service to the community by creating a demonstration
> about how these sorts of bugs can be weaponized. And I suspect it
> could be done just as easily on Chrome OS as Fedora, and that can be
> one way that an argument could be made to management that more
> resources should be applied to this problem. :-)
>
> Of course, not all bugs triggered by a maliciously crafted file system
> are equally weaponizable. An errors=panic or a NULL dereference are
> probably not easily exploitable at all. A buffer overrun (and I fixed
> two in ext4 in the last two days while being stuck in a T13 standards
> meeting, so I do feel your pain) might be a very different story.
>
> Solutions
> ---------
>
> One of the things I've wanted to get help from the syzbot folks is if
> there was some kind of machine learning or expert system evaluation
> that could be done so malicious image bugs could be binned into
> different categories, based on how easily they can be weaponized.
> That way, when there is a resource shortage situation, humans can be
> more easily guided into determining which bugs should be prioritized
> and given attention, and which we can defer to when we have more time.
Hi Ted,
I don't see that "some kind of machine learning or expert system
evaluation" is feasible. At least not in short/mid-term. There are
innocent-looking bugs that actually turn out to be very bad, and
there are bugs that look bad at first glance but are actually not that
bad for some complex reasons. Full security assessment is a complex
task and I think it stays a "human expert area" for now. One can get some
coarse estimation by searching for "use-after-free" and
"out-of-bounds" on the dashboard.
Also note that even the most innocent bugs can block the ability to
discover deeper and worse bugs during any runtime testing. So
ultimately all need to be fixed if we want a correct, stable and secure
kernel. To a significant degree it's like compiler warnings: you either
fix them all, or turn them off, there is no middle ground of having
thousands of unfixed warnings and still getting benefit from them.
> Or maybe it would be useful if there was a way where maintainers could
> be able to annotate bugs with priority and severity levels, and maybe
> make comments that can be viewed from the Syzbot dashboard UI.
This looks more realistic. +Tetsuo proposed something similar:
https://github.com/google/syzkaller/issues/608
I think to make it useful we need to settle on some small set of
well-defined tags for bugs that we can show on the dashboard.
Arbitrary detailed free-form comments can be left on the mailing list
threads that are always referenced from the dashboard.
What tags would you use today for existing bugs? One would be
"security-critical", right?
> The other thing that perhaps could be done is to set up a system where
> the USB stick is automounted in a guest VM (using libvirt in Fedora,
> and perhaps Crostini for Chrome OS), and the contents of the file
> system would then get exported from the guest OS to the host OS using
> either NFS or 9P. (9P2000.u is the solution that was used in
> gVisor[1].)
>
> [1] https://github.com/google/gvisor
>
> It could be that putting this kind of security layer in front to
> automounted USB sticks is less work than playing whack-a-mole fixing a
> lot of security bugs with maliciously crafted file systems.
I don't think that auto mounting or "requires root" is significantly
relevant in this context. If one needs to use a USB stick, or DVD or
just any filesystem that they did not create themselves, there is
pretty much no choice but to mount it, issuing sudo if necessary. If
you did not create it yourself with a trusted program, there is no way
you can be sure of the contents of the thing and there is no way you
can verify every byte of it before mounting. That's exactly the work
for software. Responsibility shifting like "you said sudo so now it's
all on you" is not useful for users. It's like web sites that give you
a hundred page license agreement before you can use it, but now you
clicked Agree so it's all on you, you read and understood every word
of it and if there would be any concern you would not click Agree,
right?
Fixing large legacy code bases is hard. But there is no other way than
persistent testing and fixing one bug at a time. We know that it's
doable because browsers did it over the past 10 years for a much larger
set of input formats.
> For sure. I guess some subset of the crashes could be more carefully
> crafted to be more dangerous, but fuzzers really don't tell us that today,
> in fact the more insidious flaws that don't turn up as a crash or hang likely
> go unnoticed.
Well, we have KASAN, almost have KMSAN and will have KTSAN in future.
They can detect a significant portion of bugs that go unnoticed
otherwise. At least this prevents "bad guys" from also using tooling
to cheaply harvest exploits. Systematic use of these tools on browsers
raised exploit costs to $1M+ for a reason.
On Sat, May 26, 2018 at 07:12:49PM +0200, Dmitry Vyukov wrote:
>
> I don't see that "some kind of machine learning or expert system
> evaluation" is feasible. At least not in short/mid-term. There are
> innocently-looking bugs that actually turn out to be very bad, and
> there are badly looking at first glance bugs that actually not that
> bad for some complex reasons. Full security assessment is a complex
> task and I think stays "human expert area" for now. One can get some
> coarse estimation by searching for "use-after-free" and
> "out-of-bounds" on the dashboard.
If the kernel intentionally triggers a BUG_ON or a panic (as in file
systems configured with 'tune2fs -e panic') it's pretty obvious that
those errors can't be weaponized to execute code chosen by the
attacker. Would you agree with that?
The same should be true for "blocked for more than 120 seconds";
again, I claim that those sorts of errors are by definition less
serious than buffer overruns.
So there is at least some kind of automated evaluation that can be
done, even if the general case problem is really hard.
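A minimal sketch of the kind of coarse, title-based binning described above might look like this. The bucket names, keyword lists, and the function `coarse_bucket` are assumptions made for illustration, not an existing syzbot feature. The sketch only separates deliberate stops and hangs from memory-corruption reports; it says nothing about how a given bug can be reached.

```python
# Hypothetical coarse triage by report title. The buckets and keyword lists
# are illustrative assumptions, not part of syzbot.
MEMORY_CORRUPTION = ("use-after-free", "out-of-bounds", "double-free")
DELIBERATE_STOP = ("kernel BUG", "kernel panic")
HANG = ("blocked for more than", "task hung")


def coarse_bucket(title):
    """Bin a report title into a rough category, checked in priority order."""
    lowered = title.lower()
    for bucket, keywords in (
        ("memory-corruption", MEMORY_CORRUPTION),
        ("deliberate-stop", DELIBERATE_STOP),
        ("hang", HANG),
    ):
        if any(keyword.lower() in lowered for keyword in keywords):
            return bucket
    # Anything unrecognized still needs a human to look at it.
    return "needs-human-review"
```

Memory-corruption keywords are checked first so that a title matching several lists lands in the most serious bucket.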
> > Or maybe it would be useful if there was a way where maintainers could
> > be able to annotate bugs with priority and severity levels, and maybe
> > make comments that can be viewed from the Syzbot dashboard UI.
>
> This looks more realistic. +Tetsuo proposed something similar:
> https://github.com/google/syzkaller/issues/608
>
> I think to make it useful we need to settle on some small set of
> well-defined tags for bugs that we can show on the dashboard.
> Arbitrary detailed free-form comments can be left on the mailing list
> threads that are always referenced from the dashboard.
>
> What tags would you use today for existing bugs? One would be
> "security-critical", right?
For me, it's not about tags. Things missing from the
https://syzkaller.appspot.com/ front page are:
* Whether or not a repro is available
* Which subsystems the bug has been tentatively assigned
* A maintainer assigned priority and severity level
I generally don't use the syzkaller.appspot.com front page because
it's too hard to find the sorts of thing that I'm looking for ---
namely the most important syzkaller bug that I as an ext4 expert can
work on.
If someone else sends me a well-formed bug report on
bugzilla.kernel.org, with free-standing image file, and a simple .c
reproducer, I can make forward progress much more quickly. So if I'm
time bound, guess which bug I'm going to pay attention to first?
Especially when Syzkaller makes it hard for me to find the bug again
once it ages out of my inbox?
> Well, we have KASAN, almost have KMSAN and will have KTSAN in future.
> They can detect a significant portion of bugs that go unnoticed
> otherwise. At least this prevents "bad guys" from also using tooling
> to cheaply harvest exploits. Systematic use of these tools on browsers
> raised exploit costs to $1M+ for a reason.
I'll note that browsers also use processes and capsicum to provide
security boundaries. This is why David and I have been suggesting
solutions like FUSE or running the mount in a guest VM, which can act
as an isolation layer. Even if there is a bug in the guest kernel,
the blast radius of the bug can be isolated, hopefully to the point
where it can be completely contained. It's not an either-or, but a
both-and.
But the much more important thing is that management has to be willing
to **fund** bug remediation. That was true for Chrome; it doesn't
seem to be as true for the Linux Kernel, for whatever reason.
People trying to fix Syzkaller and other fuzzer-found bugs on 20%
time, or on the weekends, or as a background activity during
low-bandwidth meetings, or as an unfunded mandate that doesn't show up
on anyone's quarterly objectives upon which they are graded, is just
not going to scale.
And if that's the reality, it may very well be that if you want
Syzkaller to make more of a difference, anything you can do to reduce the
human toil needed to investigate a bug is going to be hugely
important.
And if bug remediation is only going to be funded to a very limited
extent, then it's important that the bugs we work on are the highest
priority ones, since the lower priority ones *will* have to be let slide.
Regards,
- Ted
"Theodore Y. Ts'o" <[email protected]> writes:
> On Thu, May 24, 2018 at 10:49:31AM +1000, Dave Chinner wrote:
>> User automounting of removable storage should be done via a
>> privilege separation mechanism and hence avoid this whole class of
>> security problems. We can get this separation by using FUSE in these
>> situations, right?
>
> FUSE is a pretty terrible security boundary. And not all file systems
> have FUSE support. As I had suggested earlier, probably better to use
> 9P, and mount the file system in a VM.
I just have to ask. Why do you find FUSE to be a pretty terrible
security boundary?
My experience with the kernel's 9P implementation is that it is scarier to
deal with, and that 9P is starting to suffer the maladies of an
unmaintained filesystem (which it is).
FUSE was always written with the assumption that it would be attacked by
malicious users and generally appears robust against that kind of thing.
The whole internet accessibility of 9P, while making it usable in VMs,
generally looks like a downside, as it adds the whole issue of
malicious packets from a 3rd party that is neither client nor server to
deal with.
Eric
On Wed, May 30, 2018 at 1:42 PM Dave Chinner <[email protected]> wrote:
> We've learnt this lesson the hard way over and over again: don't
> parse untrusted input in privileged contexts. How many times do we
> have to make the same mistakes before people start to learn from
> them?
You're not wrong, but we haven't considered root to be fundamentally
trustworthy for years - there are multiple kernel features that can be
configured such that root is no longer able to do certain things (the
one-way trap for requiring module signatures is the most obvious, but
IMA in appraisal mode will also restrict root), and as a result it's
not reasonable to be worried only about users - it's also necessary to
prevent root from being able to deliberately mount a filesystem that
results in arbitrary code execution in the kernel.
On Sat, May 26, 2018 at 10:24 PM, Theodore Y. Ts'o <[email protected]> wrote:
> On Sat, May 26, 2018 at 07:12:49PM +0200, Dmitry Vyukov wrote:
>>
>> I don't see that "some kind of machine learning or expert system
>> evaluation" is feasible. At least not in short/mid-term. There are
>> innocently-looking bugs that actually turn out to be very bad, and
>> there are badly looking at first glance bugs that actually not that
>> bad for some complex reasons. Full security assessment is a complex
>> task and I think stays "human expert area" for now. One can get some
>> coarse estimation by searching for "use-after-free" and
>> "out-of-bounds" on the dashboard.
>
> If the kernel intentionally triggers a BUG_ON or a panic (as in file
> systems configured with 'tune2fs -e panic') it's pretty obvious that
> those errors can't be weaponized to execute code chosen by the
> attacker. Would you agree with that?
>
> The same should be true for "blocked for more than 120 seconds";
> again, I claim that those sorts of errors are by definition less
> serious than buffer overruns.
>
> So there is at least some kind of automated evaluation that can be
> done, even if the general case problem is really hard.
These can't be weaponized to execute code, but if a BUG_ON is
triggerable over a network, or from VM guest, then it's likely more
critical than a local code execution. That's why I am saying that
automated evaluation is infeasible.
Anyway, bug type (UAF, BUG, task hung) is available in the bug title
on dashboard and on mailing lists, so you can just search/sort bugs on
the dashboard. What other interface do you want on top of this?
>> > Or maybe it would be useful if there was a way where maintainers could
>> > be able to annotate bugs with priority and severity levels, and maybe
>> > make comments that can be viewed from the Syzbot dashboard UI.
>>
>> This looks more realistic. +Tetsuo proposed something similar:
>> https://github.com/google/syzkaller/issues/608
>>
>> I think to make it useful we need to settle on some small set of
>> well-defined tags for bugs that we can show on the dashboard.
>> Arbitrary detailed free-form comments can be left on the mailing list
>> threads that are always referenced from the dashboard.
>>
>> What tags would you use today for existing bugs? One would be
>> "security-critical", right?
>
> For me, it's not about tags. Things missing from the
> https://syzkaller.appspot.com/ front page are:
>
> * Whether or not a repro is available
This was always available in the Repro column.
> * Which subsystems the bug has been tentatively assigned
> * A maintainer assigned priority and severity level
Let's call these tags collectively (unless you have a better name). P0
or subsystem:ext4 can also be tags.
So you mean: (1) priority levels (P0, P1, P2), (2) severity levels
(S0, S1, S2) and subsystem, right?
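To make the tag idea concrete, here is a small sketch of how such tags could be parsed. The grammar (P0..P2, S0..S2, and "subsystem:<name>") is taken from the examples above; the function name `parse_tags` and the shape of the returned dictionary are assumptions for illustration only.

```python
import re

# Hypothetical tag grammar based on the examples in the text:
# priority P0..P2, severity S0..S2, and "subsystem:<name>".
_PRIORITY = re.compile(r"^P[0-2]$")
_SEVERITY = re.compile(r"^S[0-2]$")
_SUBSYSTEM_PREFIX = "subsystem:"


def parse_tags(tags):
    """Split a list of tag strings into priority, severity and subsystem."""
    result = {"priority": None, "severity": None, "subsystem": None, "other": []}
    for tag in tags:
        if _PRIORITY.match(tag):
            result["priority"] = tag
        elif _SEVERITY.match(tag):
            result["severity"] = tag
        elif tag.startswith(_SUBSYSTEM_PREFIX) and len(tag) > len(_SUBSYSTEM_PREFIX):
            result["subsystem"] = tag[len(_SUBSYSTEM_PREFIX):]
        else:
            # Free-form tags such as "security-critical" are kept as-is.
            result["other"].append(tag)
    return result
```

Keeping unrecognized tags in a separate list leaves room for a tag like "security-critical" without extending the grammar.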
On a related note, perhaps the kernel community needs to finally start
using bugzilla for real, like with priorities, assignees, up-to-date
statuses, no stale bugs, etc. All of this has been available in bug tracking
systems for decades...
On Wed, May 30, 2018 at 10:51 PM, 'Matthew Garrett' via syzkaller-bugs
<[email protected]> wrote:
> On Wed, May 30, 2018 at 1:42 PM Dave Chinner <[email protected]> wrote:
>> We've learnt this lesson the hard way over and over again: don't
>> parse untrusted input in privileged contexts. How many times do we
>> have to make the same mistakes before people start to learn from
>> them?
>
> You're not wrong, but we haven't considered root to be fundamentally
> trustworthy for years - there are multiple kernel features that can be
> configured such that root is no longer able to do certain things (the
> one-way trap for requiring module signatures is the most obvious, but
> IMA in appraisal mode will also restrict root), and as a result it's
> not reasonable to be worried only about users - it's also necessary to
> prevent root from being able to deliberately mount a filesystem that
> results in arbitrary code execution in the kernel.
FWIW, Android also does not consider root a trusted entity. It's
limited by SELinux and maybe something else. The kernel becomes the main
attack target on Android. Even if attackers get root, they still go
for kernel execution or kernel data corruption to do anything harmful.
And the kernel is exploited with use-after-frees, out-of-bounds,
double-frees, etc.
On Wed, May 23, 2018 at 8:01 PM, Eric Sandeen <[email protected]> wrote:
> On 5/23/18 11:20 AM, Eric Biggers wrote:
>
> ...
>
>
> I'd revise that to "have to fix /some/ of them anyway."
>
> What I'm personally hung up on are the bugs where the "exploit" involves merely
> mounting a crafted filesystem that in reality would never (until the heat death
> of the universe) corrupt itself into that state on its own; it's the "malicious
> image" case, which is quite different than exposing fundamental bugs like the
> SB_BORN race or the user-exploitable ext4 flaw you mentioned in your reply.
> Those are more insidious and/or things which can be hit by real users in real life.
>
> I don't know if I can win the "malicious images aren't a critical security
> threat" battle, but I do think they are at least a different class of flaws,
> because as Dave said, mount is supposed to be a privileged operation.
> In a perfect world we'd fix them anyway, but I don't know that our resource
> pool can keep up with your google-scale bot and still make progress in other
> critical areas.
>
> Anyway, the upshot is that we're probably just not going to care much about V4
> filesystem oops-or-hang-on-mount bugs. Those problems are solved (largely) with
> V5 filesystem format. Maybe I /will/ propose a system-wide tunable to disallow
> V4 for those who are worried about such things.
>
> To Darrick's points about more collaboration, I still wish that our requests
> for more traditional fs fuzzer reporting (i.e. a filesystem image) weren't met
> with such resistance. Tailoring your bug reports to the needs of the developer
> community you're interacting with seems like a pretty reasonable thing to do.
>
> As an aside, I wonder how much coverage of the V5 format code syzkaller /has/
> achieved; that would be another useful datapoint google could provide - if
> syzkaller is in fact traversing the V5 codepaths and isn't turning anything
> up, that'd be pretty useful to know.
Hi Eric,
The current syzbot kernel code coverage is available here:
https://storage.googleapis.com/syzkaller/cover/upstream.html#9c73bb525fc1def86e67f5039ab97d8f48062621
On Mon, Jun 11, 2018 at 03:07:24PM +0200, Dmitry Vyukov wrote:
>
> These can't be weaponized to execute code, but if a BUG_ON is
> triggerable over a network, or from VM guest, then it's likely more
> critical than a local code execution. That's why I am saying that
> automated evaluation is infeasible.
I can't imagine situations where a BUG_ON would be more critical than
local code execution. You can leverage local code execution to a
remote privilege escalation attack; and local code execution can (with
less effort) be translated to a system crash. Hence, local code
execution is always more critical than a BUG_ON.
> Anyway, bug type (UAF, BUG, task hung) is available in the bug title
> on dashboard and on mailing lists, so you can just search/sort bugs on
> the dashboard. What other interface you want on top of this?
I also want to be able to search and filter based on subsystem, and
whether or not there is a reproducer. Sometimes you can't even figure
out the subsystem from the limited string shown on the dashboard,
because the original string didn't include the subsystem to begin
with, or the subsystem name was truncated and not included on the
dashboard.
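One way to picture the filtering being asked for: when the title does not name the subsystem, the source paths in the call trace often do (for example, a frame in fs/xfs/xfs_log.c points at xfs). The sketch below is only an illustration under that assumption; the report dictionary shape (`frames`, `has_repro`) and both function names are hypothetical and do not describe syzbot's data model.

```python
# Hypothetical sketch: derive a filesystem subsystem from a source path in a
# stack frame, then filter reports by subsystem and reproducer availability.
def guess_fs_subsystem(source_path):
    """Return 'xfs' for 'fs/xfs/xfs_log.c'; None if not under fs/<name>/."""
    parts = source_path.split("/")
    if len(parts) >= 3 and parts[0] == "fs":
        return parts[1]
    return None


def filter_reports(reports, subsystem, need_repro=True):
    """Keep reports whose frames touch `subsystem` and, optionally, have a repro."""
    selected = []
    for report in reports:
        subsystems = {guess_fs_subsystem(path) for path in report["frames"]}
        if subsystem in subsystems and (report["has_repro"] or not need_repro):
            selected.append(report)
    return selected
```

Generic VFS files such as fs/super.c deliberately map to no subsystem here, since they appear in the trace of almost every mount-time bug.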
> On a related note, perhaps kernel community needs to finally start
> using bugzilla for real, like with priorities, assignees, up-to-date
> statuses, no stale bugs, etc. All of this is available in bug tracking
> systems for decades...
I do use bugzilla and in fact if syzbot would automatically file a
bugzilla.kernel.org report for things that are in the ext4 subsystem,
that would be really helpful.
As far as no stale bugs, etc., many companies (including Google)
aren't capable of doing that with their own internal bug tracking
systems, because management doesn't give them enough time to track and
fix all stale bugs. You seem to be assuming/demanding things of the
kernel community that are at least partially constrained by resource
availability --- and since you've used constrained resources as a
reason why Syzbot can't be extended as we've requested to reduce
developer toil and leverage our available resources, it would perhaps
be respectful if you also accepted that resource constraints also
exist in other areas, such as how much we can keep a fully groomed bug
tracking system.
Regards,
- Ted
On 6/11/18 8:20 AM, Dmitry Vyukov wrote:
> On Wed, May 23, 2018 at 8:01 PM, Eric Sandeen <[email protected]> wrote:
...
>> As an aside, I wonder how much coverage of the V5 format code syzkaller /has/
>> achieved; that would be another useful datapoint google could provide - if
>> syzkaller is in fact traversing the V5 codepaths and isn't turning anything
>> up, that'd be pretty useful to know.
>
> Hi Eric,
>
> The current syzbot kernel code coverage is available here:
> https://storage.googleapis.com/syzkaller/cover/upstream.html#9c73bb525fc1def86e67f5039ab97d8f48062621
Here is an example of an informative, useful, and efficient presentation of
code coverage:
http://ltp.sourceforge.net/coverage/lcov/output/index.html
Thanks,
-Eric
On Mon, Jun 11, 2018 at 3:33 PM, Theodore Y. Ts'o <[email protected]> wrote:
> On Mon, Jun 11, 2018 at 03:07:24PM +0200, Dmitry Vyukov wrote:
>>
>> These can't be weaponized to execute code, but if a BUG_ON is
>> triggerable over a network, or from VM guest, then it's likely more
>> critical than a local code execution. That's why I am saying that
>> automated evaluation is infeasible.
>
> I can't imagine situations where a BUG_ON would be more critical than
> local code execution. You can leverage local code execution to a
> remote privilege escalation attack; and local code execution can (with
> less effort) be translated to a system crash. Hence, local code
> execution is always more critical than a BUG_ON.
Well, if one could bring down all of Google's servers remotely, lots of
people would consider this more critical than _anything_ local.
>> Anyway, bug type (UAF, BUG, task hung) is available in the bug title
>> on dashboard and on mailing lists, so you can just search/sort bugs on
>> the dashboard. What other interface you want on top of this?
>
> I also want to be able to search and filter based on subsystem, and
> whether or not there is a reproducer. Sometimes you can't even figure
> out the subsystem from the limited string shown on the dashboard,
> because the original string didn't include the subsystem to begin
> with, or the subsystem name was truncated and not included on the
> dashboard.
How is this problem solved in kernel development for all other bug reports?
>> On a related note, perhaps kernel community needs to finally start
>> using bugzilla for real, like with priorities, assignees, up-to-date
>> statuses, no stale bugs, etc. All of this is available in bug tracking
>> systems for decades...
>
> I do use bugzilla and in fact if syzbot would automatically file a
> bugzilla.kernel.org report for things that are in the ext4 subsystem,
> that would be really helpful.
>
> As far as no stale bugs, etc., many companies (including Google)
> aren't capable of doing that with their own internal bug tracking
> systems, because management doesn't give them enough time to track and
> fix all stale bugs. You seem to be assuming/demanding things of the
> kernel community that are at least partially constrained by resource
> availability --- and since you've used constrained resources as a
> reason why Syzbot can't be extended as we've requested to reduce
> developer toil and leverage our available resources, it would perhaps
> be respectful if you also accepted that resource constraints also
> exist in other areas, such as how much we can keep a fully groomed bug
> tracking system.
I mentioned this only because you asked for this.
Whatever tracking system and style we go with, bug states need to be
maintained and bugs need to be nursed. If we extend syzbot dashboard
with more traditional bug tracking system capabilities, but then
nobody cares to maintain order, it also won't be useful and nobody
will be able to easily select the current tasks to work on.
So that's a prerequisite for what you are asking for.
Well, you use bugzilla, but somebody else uses something else. This
fragmentation in kernel development practices does not allow building
further automation on top. We can't do a personal solution for each
developer. For now the greatest common divisor seems to be freeform
emails on mailing lists...
A good example is "I submitted 7 kernel bugs to bugzilla, but nobody
answered me" email thread from today:
https://groups.google.com/forum/#!topic/syzkaller/OnbMQbbE4gQ