2014-08-03 13:52:31

by Toralf Förster

[permalink] [raw]
Subject: fuzz testing an ext4fs file system under a 32 bit Linux user mode linux guest let task jbd2/ubda hang

Hello,

fuzzying a 32 bit stable Gentoo x86 linux with trinity (and without excluding the munmap syscall but it might be independed from this) gives within a 32 bit user mode linux guest :


Aug 3 15:31:19 trinity su[1475]: Successful su for root by root
Aug 3 15:31:19 trinity su[1475]: + ??? root:root
Aug 3 15:31:19 trinity su[1475]: pam_unix(su:session): session opened for user root by (uid=0)
Aug 3 15:31:19 trinity su[1475]: pam_unix(su:session): session closed for user root
Aug 3 15:31:23 trinity kernel: VFS: Warning: trinity-c1 using old stat() call. Recompile your binary.
Aug 3 15:31:23 trinity kernel: VFS: Warning: trinity-c1 using old stat() call. Recompile your binary.
Aug 3 15:31:23 trinity kernel: VFS: Warning: trinity-c1 using old stat() call. Recompile your binary.
Aug 3 15:31:23 trinity kernel: trinity-c1 (1687) uses deprecated remap_file_pages() syscall. See Documentation/vm/remap_file_pages.txt.
Aug 3 15:31:23 trinity kernel: VFS: Warning: trinity-c0 using old stat() call. Recompile your binary.
Aug 3 15:31:23 trinity kernel: warning: process `trinity-c0' used the deprecated sysctl system call with
Aug 3 15:31:23 trinity kernel: VFS: Warning: trinity-c1 using old stat() call. Recompile your binary.
Aug 3 15:37:50 trinity kernel: INFO: task jbd2/ubda-8:397 blocked for more than 120 seconds.
Aug 3 15:37:50 trinity kernel: Not tainted 3.16.0-rc7-00111-g3f9c08f #92
Aug 3 15:37:50 trinity kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Aug 3 15:37:50 trinity kernel: jbd2/ubda-8 D 400011d2 0 397 2 0x00000000
Aug 3 15:37:50 trinity kernel: Stack:
Aug 3 15:37:50 trinity kernel: 086c8b7c 00000001 00000000 8486fd88 08060864 851e9f3c 086c8b7c 851e9a00
Aug 3 15:37:50 trinity kernel: 851e9a00 8486fdb0 084e7d14 851e9a00 086c8640 00000001 00000010 00001000
Aug 3 15:37:50 trinity kernel: 8486fe28 8486fe20 ffffffff 8486fdc4 084e7e05 080729be 00000000 8486fde0
Aug 3 15:37:50 trinity kernel: Call Trace:
Aug 3 15:37:50 trinity kernel: [<08060864>] __switch_to+0x44/0x70
Aug 3 15:37:50 trinity kernel: [<084e7d14>] __schedule+0x2c4/0x360
Aug 3 15:37:50 trinity kernel: [<084e7e05>] schedule+0x55/0x60
Aug 3 15:37:50 trinity kernel: [<080729be>] ? set_signals+0x1e/0x40
Aug 3 15:37:50 trinity kernel: [<084e8106>] io_schedule+0x46/0x60
Aug 3 15:37:50 trinity kernel: [<0812f628>] sleep_on_buffer+0x8/0x10
Aug 3 15:37:50 trinity kernel: [<084e81cc>] __wait_on_bit+0x3c/0x70
Aug 3 15:37:50 trinity kernel: [<084e82f9>] out_of_line_wait_on_bit+0x69/0x80
Aug 3 15:37:50 trinity kernel: [<0812f620>] ? sleep_on_buffer+0x0/0x10
Aug 3 15:37:50 trinity kernel: [<080a4b60>] ? wake_bit_function+0x0/0x50
Aug 3 15:37:50 trinity kernel: [<08130290>] __wait_on_buffer+0x30/0x40
Aug 3 15:37:50 trinity kernel: [<0812f620>] ? sleep_on_buffer+0x0/0x10
Aug 3 15:37:50 trinity kernel: [<081c841a>] jbd2_journal_commit_transaction+0xe1a/0x1390
Aug 3 15:37:50 trinity kernel: [<080729be>] ? set_signals+0x1e/0x40
Aug 3 15:37:50 trinity kernel: [<081cbc8f>] kjournald2+0xaf/0x1f0
Aug 3 15:37:50 trinity kernel: [<081cbc8f>] ? kjournald2+0xaf/0x1f0
Aug 3 15:37:50 trinity kernel: [<080729be>] ? set_signals+0x1e/0x40
Aug 3 15:37:50 trinity kernel: [<080a4b10>] ? autoremove_wake_function+0x0/0x50
Aug 3 15:37:50 trinity kernel: [<081cbbe0>] ? kjournald2+0x0/0x1f0
Aug 3 15:37:50 trinity kernel: [<08096806>] kthread+0xd6/0xe0
Aug 3 15:37:50 trinity kernel: [<0809dd7d>] ? finish_task_switch.isra.56+0x1d/0x70
Aug 3 15:37:50 trinity kernel: [<0806064b>] new_thread_handler+0x6b/0x90
Aug 3 15:37:50 trinity kernel:
Aug 3 15:39:50 trinity kernel: INFO: task jbd2/ubda-8:397 blocked for more than 120 seconds.



The trinity fuzzer now seems to be in an endless loop, the corresponding process at the host side gives always :


Thread 1 (process 21625):
#0 0xb7726aec in __kernel_vsyscall ()
#1 0x08496f6f in __nanosleep_nocancel () at ../sysdeps/unix/syscall-template.S:81
#2 0x08073124 in idle_sleep (nsecs=606859328233668608) at arch/um/os-Linux/time.c:183
#3 0x08060b3f in arch_cpu_idle () at arch/um/kernel/process.c:208
#4 0x080a5405 in cpuidle_idle_call () at kernel/sched/idle.c:120
#5 cpu_idle_loop () at kernel/sched/idle.c:224
#6 cpu_startup_entry (state=CPUHP_ONLINE) at kernel/sched/idle.c:272
#7 0x084e1692 in rest_init () at init/main.c:419
#8 0x0804892e in start_kernel () at init/main.c:679
#9 0x08049fc9 in start_kernel_proc (unused=0x0) at arch/um/kernel/skas/process.c:46
#10 0x0806064b in new_thread_handler () at arch/um/kernel/process.c:129
#11 0x00000000 in ?? ()


It might be that [1] has few more info/data, or ?
The diff to [1] is just that I'm still able to login into the UML guest.



[1] http://sourceforge.net/p/user-mode-linux/mailman/message/32673925/

--
Toralf



2014-08-03 18:53:10

by Theodore Ts'o

[permalink] [raw]
Subject: Re: fuzz testing an ext4fs file system under a 32 bit Linux user mode linux guest let task jbd2/ubda hang

On Sun, Aug 03, 2014 at 03:52:18PM +0200, Toralf F?rster wrote:
> Hello,
>
> fuzzying a 32 bit stable Gentoo x86 linux with trinity (and without excluding the munmap syscall but it might be independed from this) gives within a 32 bit user mode linux guest :

The problem with these sorts of trinity bug reports is that we have no
idea which syscall or set of syscalls might have corrupted kernel
state to the point where the kernel started malfunctioning.

Sometimes, a trinity induced bug is obvious, when it causes a system
call to immediately access an illegal memory location. But if it
causes some more subtle corruption, possibly in a completely unrelated
subsystem, figuring out what actually happened can be close to
impossible.

So there's not much I can do with this sort of bug report. If you can
easily repeat it, and you can dump out the system call stream, we
might be able to make a smaller reproduction case, at which point
trying to debug this sort of failure would be tractable.

Cheers,

- Ted

2014-08-09 18:45:48

by Toralf Förster

[permalink] [raw]
Subject: Re: fuzz testing an ext4fs file system under a 32 bit Linux user mode linux guest let task jbd2/ubda hang

On 08/03/2014 08:42 PM, Theodore Ts'o wrote:
> On Sun, Aug 03, 2014 at 03:52:18PM +0200, Toralf Förster wrote:
>> Hello,
>>
>> fuzzying a 32 bit stable Gentoo x86 linux with trinity (and without excluding the munmap syscall but it might be independed from this) gives within a 32 bit user mode linux guest :
>
> The problem with these sorts of trinity bug reports is that we have no
> idea which syscall or set of syscalls might have corrupted kernel
> state to the point where the kernel started malfunctioning.
>
> Sometimes, a trinity induced bug is obvious, when it causes a system
> call to immediately access an illegal memory location. But if it
> causes some more subtle corruption, possibly in a completely unrelated
> subsystem, figuring out what actually happened can be close to
> impossible.
>
> So there's not much I can do with this sort of bug report. If you can
> easily repeat it, and you can dump out the system call stream, we
> might be able to make a smaller reproduction case, at which point
> trying to debug this sort of failure would be tractable.
>
> Cheers,
>
> - Ted
>
ok, fair enough.


BTW may I just ask, if the following (same 32 bit system, host and UML guest are both at v3.16, 4 trinity childs hammering victims files in a ramdisk within the uML guest) matches the above too ? Or does the BUG output here more helpful data ? :


Kernel panic - not syncing: BUG!
CPU: 0 PID: 105 Comm: kworker/u2:2 Not tainted 3.16.0 #2
Workqueue: writeback bdi_writeback_workfn (flush-98:0)
Stack:
085be12e 085be12e 00000003 086e8547 0000000c 000002b3 85177cc4 85177c74
085015c3 00000000 85177c48 85177c9c 084fd6dd 085c9dc0 08727180 085bae4a
85177ca8 00000000 0000000c 000002b3 85177cc4 85177d10 0818ec0f 085bae4a
Call Trace:
[<085015c3>] dump_stack+0x26/0x28
[<084fd6dd>] panic+0x8f/0x1a9
[<0818ec0f>] mpage_release_unused_pages+0x11f/0x1b0
[<0819395a>] ext4_writepages+0x42a/0x640
[<080d87e5>] do_writepages+0x25/0x50
[<0812fd2c>] __writeback_single_inode+0x3c/0xf0
[<08130064>] writeback_sb_inodes+0x1d4/0x320
[<08130214>] __writeback_inodes_wb+0x64/0xa0
[<08130501>] wb_writeback+0xf1/0x190
[<081308a2>] bdi_writeback_workfn+0xa2/0x2d0
[<08504c56>] ? _raw_spin_unlock_irq+0x16/0x20
[<0809e30c>] ? finish_task_switch.constprop.52+0x3c/0x90
[<08091474>] process_one_work+0x184/0x2f0
[<080918da>] worker_thread+0x2fa/0x540
[<08504c3c>] ? _raw_spin_unlock_irqrestore+0x1c/0x20
[<08502d65>] ? schedule+0x55/0x60
[<080915e0>] ? worker_thread+0x0/0x540
[<080971b6>] kthread+0xd6/0xe0
[<0805f68b>] new_thread_handler+0x6b/0x90



--
Toralf

2014-08-09 20:09:52

by Theodore Ts'o

[permalink] [raw]
Subject: Re: fuzz testing an ext4fs file system under a 32 bit Linux user mode linux guest let task jbd2/ubda hang

On Sat, Aug 09, 2014 at 08:45:43PM +0200, Toralf F?rster wrote:
>
> BTW may I just ask, if the following (same 32 bit system, host and
> UML guest are both at v3.16, 4 trinity childs hammering victims
> files in a ramdisk within the uML guest) matches the above too ? Or
> does the BUG output here more helpful data ? :

Unfortunately, there's nothing useful here. There is more than one
BUG_ON() in this mpage_release_unused_pages function, and what's more,
we have no idea which syscall or combination of syscalls might have
left a page in a particular state.

- Ted