2009-10-18 15:47:23

by Fredrik Andersson

Subject: Fwd: Ext4 bug with fallocate

Hi, I'd like to report what I'm fairly certain is an ext4 bug. I hope
this is the right place to do so.

My program creates a big file (around 30 GB) with posix_fallocate (to
utilize extents), fills it with data and uses ftruncate to crop it to
its final size (usually somewhere between 20 and 25 GB).
The problem is that in around 5% of cases, the program locks up
completely inside a syscall. The process then cannot be killed even
with kill -9; a reboot is the only way to recover.
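
In case it helps, the pattern boils down to roughly the following
sketch (the path, sizes and the fill loop are placeholders, not my
actual code):

#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    /* Placeholder sizes: ~30 GB preallocated, cropped to ~22 GB. */
    const off_t alloc_size = 30LL * 1024 * 1024 * 1024;
    const off_t final_size = 22LL * 1024 * 1024 * 1024;

    int fd = open("/mnt/raid0/bigfile", O_RDWR | O_CREAT, 0644);
    if (fd < 0) { perror("open"); return 1; }

    /* Preallocate the whole file up front so ext4 lays it out in large extents. */
    int err = posix_fallocate(fd, 0, alloc_size);
    if (err) { fprintf(stderr, "posix_fallocate: %s\n", strerror(err)); return 1; }

    /* ... fill the file with data via write()/pwrite() ... */

    /* Crop to the final size; this truncate step is where the process
       gets stuck in roughly 1 in 20 runs. */
    if (ftruncate(fd, final_size) < 0) { perror("ftruncate"); return 1; }

    close(fd);
    return 0;
}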

Here is the contents of my /proc/PID/syscall:
76 0xee4a80 0x486d6aaf8 0x36390113f8 0x7fffc63cd350 0xecc050
0x7fffc63cd3f0 0x7fffc63cd5c8 0x36380e0cc7

Syscall 76 on x86_64 is, as far as I can tell, truncate, which matches
the truncate/ftruncate step at the end of my program.
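
A quick way to double-check the number is to print a few SYS_*
constants from <sys/syscall.h>, e.g. something like this (the
particular syscalls printed are just the ones relevant here):

#include <stdio.h>
#include <sys/syscall.h>

int main(void)
{
    /* On x86_64 these come out as 76, 77 and 97 respectively. */
    printf("truncate  = %d\n", SYS_truncate);
    printf("ftruncate = %d\n", SYS_ftruncate);
    printf("getrlimit = %d\n", SYS_getrlimit);
    return 0;
}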

The file is on a software RAID 0 volume with two disks, managed by
mdadm. I have reported the problem to the md people, but they insist
it's an ext4 problem.

I have also tried closing the file and opening it again prior to the
ftruncate. No change.

There are no other strange phenomena whatsoever with ext4. This
problem only arises in this particular situation.

I'm running Fedora on an x86_64 system.
I have tried several kernel versions, the most recent being 2.6.31.1.
The problem has persisted ever since the kernel that originally shipped
with Fedora 11.

Is this a known bug/problem? Grateful for any insights!

/Fredrik


2009-10-18 15:57:33

by Eric Sandeen

Subject: Re: Fwd: Ext4 bug with fallocate

Fredrik Andersson wrote:
> Hi, I'd like to report what I'm fairly certain is an ext4 bug. I hope
> this is the right place to do so.
>
> My program creates a big file (around 30 GB) with posix_fallocate (to
> utilize extents), fills it with data and uses ftruncate to crop it to
> its final size (usually somewhere between 20 and 25 GB).
> The problem is that in around 5% of the cases, the program locks up
> completely in a syscall. The process can thus not be killed even with
> kill -9, and a reboot is all that will do.

Does echo w > /proc/sysrq-trigger (this dumps blocked/sleeping
processes; or use echo t for all processes) show you where the stuck
threads are?

-Eric


2009-10-19 09:49:25

by Fredrik Andersson

Subject: Re: Fwd: Ext4 bug with fallocate

Hi, here is the data for this process:

[5958816.744013] drdbmake      D ffff88021e4c7800     0 27019  13796
[5958816.744013]  ffff8801d1bcda88 0000000000000082 ffff8801f4ce9bf0 ffff8801678b1380
[5958816.744013]  0000000000010e80 000000000000c748 ffff8800404963c0 ffffffff81526360
[5958816.744013]  ffff880040496730 00000000f4ce9bf0 000000025819cebe 0000000000000282
[5958816.744013] Call Trace:
[5958816.744013]  [<ffffffff813a9639>] schedule+0x9/0x20
[5958816.744013]  [<ffffffff81177ea5>] start_this_handle+0x365/0x5d0
[5958816.744013]  [<ffffffff8105b900>] ? autoremove_wake_function+0x0/0x40
[5958816.744013]  [<ffffffff811781ce>] jbd2_journal_restart+0xbe/0x150
[5958816.744013]  [<ffffffff8116243d>] ext4_ext_truncate+0x6dd/0xa20
[5958816.744013]  [<ffffffff81095b3b>] ? find_get_pages+0x3b/0xf0
[5958816.744013]  [<ffffffff81150a78>] ext4_truncate+0x198/0x680
[5958816.744013]  [<ffffffff810ac984>] ? unmap_mapping_range+0x74/0x280
[5958816.744013]  [<ffffffff811772c0>] ? jbd2_journal_stop+0x1e0/0x360
[5958816.744013]  [<ffffffff810acd25>] vmtruncate+0xa5/0x110
[5958816.744013]  [<ffffffff810dda10>] inode_setattr+0x30/0x180
[5958816.744013]  [<ffffffff8114d073>] ext4_setattr+0x173/0x310
[5958816.744013]  [<ffffffff810ddc79>] notify_change+0x119/0x330
[5958816.744013]  [<ffffffff810c6df3>] do_truncate+0x63/0x90
[5958816.744013]  [<ffffffff810d0cc3>] ? get_write_access+0x23/0x60
[5958816.744013]  [<ffffffff810c70cb>] sys_truncate+0x17b/0x180
[5958816.744013]  [<ffffffff8100bfab>] system_call_fastpath+0x16/0x1b

I don't know if this has anything to do with it, but I also noticed
that another process of mine, which is working just fine, is executing
a suspicious-looking function called raid0_unplug. It operates on the
same raid0/ext4 filesystem as the hung process, so I include its call
trace here too:

[5958816.744013] nodeserv      D ffff880167bd7ca8     0 17900  13796
[5958816.744013]  ffff880167bd7bf8 0000000000000082 ffff88002800a588 ffff88021e5b56e0
[5958816.744013]  0000000000010e80 000000000000c748 ffff880100664020 ffffffff81526360
[5958816.744013]  ffff880100664390 000000008119bd17 000000026327bfa9 0000000000000002
[5958816.744013] Call Trace:
[5958816.744013]  [<ffffffffa0039291>] ? raid0_unplug+0x51/0x70 [raid0]
[5958816.744013]  [<ffffffff813a9639>] schedule+0x9/0x20
[5958816.744013]  [<ffffffff813a9687>] io_schedule+0x37/0x50
[5958816.744013]  [<ffffffff81095e35>] sync_page+0x35/0x60
[5958816.744013]  [<ffffffff81095e69>] sync_page_killable+0x9/0x50
[5958816.744013]  [<ffffffff813a99d2>] __wait_on_bit_lock+0x52/0xb0
[5958816.744013]  [<ffffffff81095e60>] ? sync_page_killable+0x0/0x50
[5958816.744013]  [<ffffffff81095d74>] __lock_page_killable+0x64/0x70
[5958816.744013]  [<ffffffff8105b940>] ? wake_bit_function+0x0/0x40
[5958816.744013]  [<ffffffff81095c0b>] ? find_get_page+0x1b/0xb0
[5958816.744013]  [<ffffffff81097908>] generic_file_aio_read+0x3b8/0x6b0
[5958816.744013]  [<ffffffff810c7dc1>] do_sync_read+0xf1/0x140
[5958816.744013]  [<ffffffff8106a5e8>] ? do_futex+0xb8/0xb20
[5958816.744013]  [<ffffffff813ab78f>] ? _spin_unlock_irqrestore+0x2f/0x40
[5958816.744013]  [<ffffffff8105b900>] ? autoremove_wake_function+0x0/0x40
[5958816.744013]  [<ffffffff8105bc73>] ? add_wait_queue+0x43/0x60
[5958816.744013]  [<ffffffff81062a6c>] ? getnstimeofday+0x5c/0xf0
[5958816.744013]  [<ffffffff810c85b8>] vfs_read+0xc8/0x170
[5958816.744013]  [<ffffffff810c86fa>] sys_pread64+0x9a/0xa0
[5958816.744013]  [<ffffffff8100bfab>] system_call_fastpath+0x16/0x1b

Hope this makes sense to anyone, and please let me know if there is
more info I can provide.

/Fredrik

On Sun, Oct 18, 2009 at 5:57 PM, Eric Sandeen <[email protected]> wrote:
>
> Fredrik Andersson wrote:
>>
>> Hi, I'd like to report what I'm fairly certain is an ext4 bug. I hope
>> this is the right place to do so.
>>
>> My program creates a big file (around 30 GB) with posix_fallocate (to
>> utilize extents), fills it with data and uses ftruncate to crop it to
>> its final size (usually somewhere between 20 and 25 GB).
>> The problem is that in around 5% of the cases, the program locks up
>> completely in a syscall. The process can thus not be killed even with
>> kill -9, and a reboot is all that will do.
>
> does echo w > /proc/sysrq-trigger (this does sleeping processes; or use echo t for all processes) show you where the stuck threads are?
>
> -Eric
>

2009-10-20 16:49:26

by Fredrik Andersson

Subject: Re: Fwd: Ext4 bug with fallocate

I found the following post on the ext4 list. It seems to match the
problem I'm experiencing almost exactly.

http://osdir.com/ml/linux-ext4/2009-08/msg00184.html

Is it the same problem?

/Fredrik

On Mon, Oct 19, 2009 at 11:49 AM, Fredrik Andersson <[email protected]> wrote:
> Hi, here is the data for this process:
>
> [5958816.744013] drdbmake      D ffff88021e4c7800     0 27019  13796
> [5958816.744013]  ffff8801d1bcda88 0000000000000082 ffff8801f4ce9bf0 ffff8801678b1380
> [5958816.744013]  0000000000010e80 000000000000c748 ffff8800404963c0 ffffffff81526360
> [5958816.744013]  ffff880040496730 00000000f4ce9bf0 000000025819cebe 0000000000000282
> [5958816.744013] Call Trace:
> [5958816.744013]  [<ffffffff813a9639>] schedule+0x9/0x20
> [5958816.744013]  [<ffffffff81177ea5>] start_this_handle+0x365/0x5d0
> [5958816.744013]  [<ffffffff8105b900>] ? autoremove_wake_function+0x0/0x40
> [5958816.744013]  [<ffffffff811781ce>] jbd2_journal_restart+0xbe/0x150
> [5958816.744013]  [<ffffffff8116243d>] ext4_ext_truncate+0x6dd/0xa20
> [5958816.744013]  [<ffffffff81095b3b>] ? find_get_pages+0x3b/0xf0
> [5958816.744013]  [<ffffffff81150a78>] ext4_truncate+0x198/0x680
> [5958816.744013]  [<ffffffff810ac984>] ? unmap_mapping_range+0x74/0x280
> [5958816.744013]  [<ffffffff811772c0>] ? jbd2_journal_stop+0x1e0/0x360
> [5958816.744013]  [<ffffffff810acd25>] vmtruncate+0xa5/0x110
> [5958816.744013]  [<ffffffff810dda10>] inode_setattr+0x30/0x180
> [5958816.744013]  [<ffffffff8114d073>] ext4_setattr+0x173/0x310
> [5958816.744013]  [<ffffffff810ddc79>] notify_change+0x119/0x330
> [5958816.744013]  [<ffffffff810c6df3>] do_truncate+0x63/0x90
> [5958816.744013]  [<ffffffff810d0cc3>] ? get_write_access+0x23/0x60
> [5958816.744013]  [<ffffffff810c70cb>] sys_truncate+0x17b/0x180
> [5958816.744013]  [<ffffffff8100bfab>] system_call_fastpath+0x16/0x1b
>
> Don't know if this has anything to do with it, but I also noticed
> that another process of mine, which is working just fine, is executing
> a suspicious looking function called raid0_unplug.
> It operates on the same raid0/ext4 filesystem as the hung process. I
> include the calltrace for it here too:
>
> [5958816.744013] nodeserv      D ffff880167bd7ca8     0 17900  13796
> [5958816.744013]  ffff880167bd7bf8 0000000000000082 ffff88002800a588 ffff88021e5b56e0
> [5958816.744013]  0000000000010e80 000000000000c748 ffff880100664020 ffffffff81526360
> [5958816.744013]  ffff880100664390 000000008119bd17 000000026327bfa9 0000000000000002
> [5958816.744013] Call Trace:
> [5958816.744013]  [<ffffffffa0039291>] ? raid0_unplug+0x51/0x70 [raid0]
> [5958816.744013]  [<ffffffff813a9639>] schedule+0x9/0x20
> [5958816.744013]  [<ffffffff813a9687>] io_schedule+0x37/0x50
> [5958816.744013]  [<ffffffff81095e35>] sync_page+0x35/0x60
> [5958816.744013]  [<ffffffff81095e69>] sync_page_killable+0x9/0x50
> [5958816.744013]  [<ffffffff813a99d2>] __wait_on_bit_lock+0x52/0xb0
> [5958816.744013]  [<ffffffff81095e60>] ? sync_page_killable+0x0/0x50
> [5958816.744013]  [<ffffffff81095d74>] __lock_page_killable+0x64/0x70
> [5958816.744013]  [<ffffffff8105b940>] ? wake_bit_function+0x0/0x40
> [5958816.744013]  [<ffffffff81095c0b>] ? find_get_page+0x1b/0xb0
> [5958816.744013]  [<ffffffff81097908>] generic_file_aio_read+0x3b8/0x6b0
> [5958816.744013]  [<ffffffff810c7dc1>] do_sync_read+0xf1/0x140
> [5958816.744013]  [<ffffffff8106a5e8>] ? do_futex+0xb8/0xb20
> [5958816.744013]  [<ffffffff813ab78f>] ? _spin_unlock_irqrestore+0x2f/0x40
> [5958816.744013]  [<ffffffff8105b900>] ? autoremove_wake_function+0x0/0x40
> [5958816.744013]  [<ffffffff8105bc73>] ? add_wait_queue+0x43/0x60
> [5958816.744013]  [<ffffffff81062a6c>] ? getnstimeofday+0x5c/0xf0
> [5958816.744013]  [<ffffffff810c85b8>] vfs_read+0xc8/0x170
> [5958816.744013]  [<ffffffff810c86fa>] sys_pread64+0x9a/0xa0
> [5958816.744013]  [<ffffffff8100bfab>] system_call_fastpath+0x16/0x1b
>
> Hope this makes sense to anyone, and please let me know if there is
> more info I can provide.
>
> /Fredrik
>
> On Sun, Oct 18, 2009 at 5:57 PM, Eric Sandeen <[email protected]> wrote:
>>
>> Fredrik Andersson wrote:
>>>
>>> Hi, I'd like to report what I'm fairly certain is an ext4 bug. I hope
>>> this is the right place to do so.
>>>
>>> My program creates a big file (around 30 GB) with posix_fallocate (to
>>> utilize extents), fills it with data and uses ftruncate to crop it to
>>> its final size (usually somewhere between 20 and 25 GB).
>>> The problem is that in around 5% of the cases, the program locks up
>>> completely in a syscall. The process can thus not be killed even with
>>> kill -9, and a reboot is all that will do.
>>
>> does echo w > /proc/sysrq-trigger (this does sleeping processes; or use echo t for all processes) show you where the stuck threads are?
>>
>> -Eric
>>
>

2009-10-21 02:20:21

by Eric Sandeen

Subject: Re: Fwd: Ext4 bug with fallocate

Fredrik Andersson wrote:
> Hi, here is the data for this process:

Including all of the processes in D state (everything reported by
sysrq-w) would probably be most helpful.

Feel free to file an ext4 bug on bugzilla.kernel.org w/ this
information, too, so it doesn't get lost in busy schedules ...

Thanks,
-Eric

> [5958816.744013] drdbmake D ffff88021e4c7800 0 27019 13796
> [5958816.744013] ffff8801d1bcda88 0000000000000082 ffff8801f4ce9bf0
> ffff8801678b1380
> [5958816.744013] 0000000000010e80 000000000000c748 ffff8800404963c0
> ffffffff81526360
> [5958816.744013] ffff880040496730 00000000f4ce9bf0 000000025819cebe
> 0000000000000282
> [5958816.744013] Call Trace:
> [5958816.744013] [<ffffffff813a9639>] schedule+0x9/0x20
> [5958816.744013] [<ffffffff81177ea5>] start_this_handle+0x365/0x5d0
> [5958816.744013] [<ffffffff8105b900>] ? autoremove_wake_function+0x0/
> 0x40
> [5958816.744013] [<ffffffff811781ce>] jbd2_journal_restart+0xbe/0x150
> [5958816.744013] [<ffffffff8116243d>] ext4_ext_truncate+0x6dd/0xa20
> [5958816.744013] [<ffffffff81095b3b>] ? find_get_pages+0x3b/0xf0
> [5958816.744013] [<ffffffff81150a78>] ext4_truncate+0x198/0x680
> [5958816.744013] [<ffffffff810ac984>] ? unmap_mapping_range+0x74/0x280
> [5958816.744013] [<ffffffff811772c0>] ? jbd2_journal_stop+0x1e0/0x360
> [5958816.744013] [<ffffffff810acd25>] vmtruncate+0xa5/0x110
> [5958816.744013] [<ffffffff810dda10>] inode_setattr+0x30/0x180
> [5958816.744013] [<ffffffff8114d073>] ext4_setattr+0x173/0x310
> [5958816.744013] [<ffffffff810ddc79>] notify_change+0x119/0x330
> [5958816.744013] [<ffffffff810c6df3>] do_truncate+0x63/0x90
> [5958816.744013] [<ffffffff810d0cc3>] ? get_write_access+0x23/0x60
> [5958816.744013] [<ffffffff810c70cb>] sys_truncate+0x17b/0x180
> [5958816.744013] [<ffffffff8100bfab>] system_call_fastpath+0x16/0x1b
>
> Don't know if this has anything to do with it, but I also noticed
> that another process of mine,
> which is working just fine, is executing a suspicious looking function
> called raid0_unplug.
> It operates on the same raid0/ext4 filesystem as the hung process. I
> include the calltrace for it here too:
>
> [5958816.744013] nodeserv D ffff880167bd7ca8 0 17900 13796
> [5958816.744013] ffff880167bd7bf8 0000000000000082 ffff88002800a588
> ffff88021e5b56e0
> [5958816.744013] 0000000000010e80 000000000000c748 ffff880100664020
> ffffffff81526360
> [5958816.744013] ffff880100664390 000000008119bd17 000000026327bfa9
> 0000000000000002
> [5958816.744013] Call Trace:
> [5958816.744013] [<ffffffffa0039291>] ? raid0_unplug+0x51/0x70 [raid0]
> [5958816.744013] [<ffffffff813a9639>] schedule+0x9/0x20
> [5958816.744013] [<ffffffff813a9687>] io_schedule+0x37/0x50
> [5958816.744013] [<ffffffff81095e35>] sync_page+0x35/0x60
> [5958816.744013] [<ffffffff81095e69>] sync_page_killable+0x9/0x50
> [5958816.744013] [<ffffffff813a99d2>] __wait_on_bit_lock+0x52/0xb0
> [5958816.744013] [<ffffffff81095e60>] ? sync_page_killable+0x0/0x50
> [5958816.744013] [<ffffffff81095d74>] __lock_page_killable+0x64/0x70
> [5958816.744013] [<ffffffff8105b940>] ? wake_bit_function+0x0/0x40
> [5958816.744013] [<ffffffff81095c0b>] ? find_get_page+0x1b/0xb0
> [5958816.744013] [<ffffffff81097908>] generic_file_aio_read+0x3b8/0x6b0
> [5958816.744013] [<ffffffff810c7dc1>] do_sync_read+0xf1/0x140
> [5958816.744013] [<ffffffff8106a5e8>] ? do_futex+0xb8/0xb20
> [5958816.744013] [<ffffffff813ab78f>] ? _spin_unlock_irqrestore+0x2f/0x40
> [5958816.744013] [<ffffffff8105b900>] ? autoremove_wake_function+0x0/0x40
> [5958816.744013] [<ffffffff8105bc73>] ? add_wait_queue+0x43/0x60
> [5958816.744013] [<ffffffff81062a6c>] ? getnstimeofday+0x5c/0xf0
> [5958816.744013] [<ffffffff810c85b8>] vfs_read+0xc8/0x170
> [5958816.744013] [<ffffffff810c86fa>] sys_pread64+0x9a/0xa0
> [5958816.744013] [<ffffffff8100bfab>] system_call_fastpath+0x16/0x1b
>
> Hope this makes sense to anyone, and please let me know if there is
> more info I can provide.
>
> /Fredrik
>
> On Sun, Oct 18, 2009 at 5:57 PM, Eric Sandeen <[email protected]> wrote:
>> Fredrik Andersson wrote:
>>> Hi, I'd like to report what I'm fairly certain is an ext4 bug. I hope
>>> this is the right place to do so.
>>>
>>> My program creates a big file (around 30 GB) with posix_fallocate (to
>>> utilize extents), fills it with data and uses ftruncate to crop it to
>>> its final size (usually somewhere between 20 and 25 GB).
>>> The problem is that in around 5% of the cases, the program locks up
>>> completely in a syscall. The process can thus not be killed even with
>>> kill -9, and a reboot is all that will do.
>> does echo w > /proc/sysrq-trigger (this does sleeping processes; or use echo t for all processes) show you where the stuck threads are?
>>
>> -Eric
>>


2009-10-21 09:08:11

by Fredrik Andersson

Subject: Re: Fwd: Ext4 bug with fallocate

Alright, here is the complete sysrq w data. I will file a bug report too.

My programs are drdbmake, nodeserv, receptd.

[6130704.521614] SysRq : Show Blocked State
[6130704.521696] task PC stack pid father
[6130704.521702] pdflush D ffff8801f4ce9bdc 0 269 2
[6130704.521707] ffff88021e551960 0000000000000046 ffff88021e5b56e0
ffff8800d3506a80
[6130704.521712] 0000000000010e80 000000000000c748 ffff88021fa24740
ffff88021f843900
[6130704.521717] ffff88021fa24ab0 0000000100000008 000000025819ca1c
0080001000800013
[6130704.521721] Call Trace:
[6130704.521731] [<ffffffff813a9639>] schedule+0x9/0x20
[6130704.521735] [<ffffffff813ab2d5>] __down_read+0x85/0xb7
[6130704.521738] [<ffffffff813aa565>] down_read+0x25/0x30
[6130704.521743] [<ffffffff8114e6f2>] ext4_get_blocks_wrap+0x52/0x290
[6130704.521747] [<ffffffff8114f601>] mpage_da_map_blocks+0xd1/0x7a0
[6130704.521752] [<ffffffff8109d12c>] ? write_cache_pages+0x2fc/0x470
[6130704.521756] [<ffffffff81150330>] ? __mpage_da_writepage+0x0/0x180
[6130704.521760] [<ffffffff8114ffdb>] ext4_da_writepages+0x30b/0x590
[6130704.521764] [<ffffffff8103d91b>] ? finish_task_switch+0x5b/0xd0
[6130704.521768] [<ffffffff8109d2f8>] do_writepages+0x28/0x50
[6130704.521773] [<ffffffff810e60fb>] __writeback_single_inode+0x9b/0x3e0
[6130704.521777] [<ffffffff8104f830>] ? process_timeout+0x0/0x10
[6130704.521781] [<ffffffff813a8827>] ? io_schedule_timeout+0x37/0x50
[6130704.521785] [<ffffffff810e68ec>] generic_sync_sb_inodes+0x36c/0x440
[6130704.521789] [<ffffffff810e6b8e>] writeback_inodes+0x5e/0x100
[6130704.521792] [<ffffffff8109d4ac>] wb_kupdate+0xbc/0x140
[6130704.521796] [<ffffffff8109e6a8>] pdflush+0x118/0x220
[6130704.521799] [<ffffffff8109d3f0>] ? wb_kupdate+0x0/0x140
[6130704.521802] [<ffffffff8109e590>] ? pdflush+0x0/0x220
[6130704.521806] [<ffffffff8105b4c6>] kthread+0x56/0x90
[6130704.521811] [<ffffffff8100cf6a>] child_rip+0xa/0x20
[6130704.521814] [<ffffffff8100c969>] ? restore_args+0x0/0x30
[6130704.521818] [<ffffffff8105b470>] ? kthread+0x0/0x90
[6130704.521821] [<ffffffff8100cf60>] ? child_rip+0x0/0x20
[6130704.521826] kjournald2 D ffff88021f921e28 0 1334 2
[6130704.521830] ffff88021f921d10 0000000000000046 ffff88021f48c890
0000000000000000
[6130704.521835] 0000000000010e80 000000000000c748 ffff88021e5d8000
ffff88021f844740
[6130704.521839] ffff88021e5d8370 0000000100000282 0000000000000000
0000000000000282
[6130704.521844] Call Trace:
[6130704.521847] [<ffffffff813a9639>] schedule+0x9/0x20
[6130704.521853] [<ffffffff81178ebe>]
jbd2_journal_commit_transaction+0x1be/0x1810
[6130704.521856] [<ffffffff81040cd6>] ? dequeue_task_fair+0x256/0x260
[6130704.521860] [<ffffffff8103d91b>] ? finish_task_switch+0x5b/0xd0
[6130704.521864] [<ffffffff8105b900>] ? autoremove_wake_function+0x0/0x40
[6130704.521868] [<ffffffff8117ea8a>] kjournald2+0x11a/0x370
[6130704.521872] [<ffffffff8105b900>] ? autoremove_wake_function+0x0/0x40
[6130704.521875] [<ffffffff8117e970>] ? kjournald2+0x0/0x370
[6130704.521878] [<ffffffff8117e970>] ? kjournald2+0x0/0x370
[6130704.521881] [<ffffffff8105b4c6>] kthread+0x56/0x90
[6130704.521885] [<ffffffff8100cf6a>] child_rip+0xa/0x20
[6130704.521888] [<ffffffff8100c969>] ? restore_args+0x0/0x30
[6130704.521892] [<ffffffff8105b470>] ? kthread+0x0/0x90
[6130704.521895] [<ffffffff8100cf60>] ? child_rip+0x0/0x20
[6130704.521901] receptd D 000000000000002a 0 28746 2063
[6130704.521905] ffff8801d1b39c78 0000000000000082 ffff8801d1b39be8
ffffffff81152dd3
[6130704.521910] 0000000000010e80 000000000000c748 ffff88010014ce60
ffffffff81526360
[6130704.521914] ffff88010014d1d0 00000000810db183 000000025821c45c
0000000000000286
[6130704.521919] Call Trace:
[6130704.521922] [<ffffffff81152dd3>] ? dx_release+0x23/0x50
[6130704.521926] [<ffffffff813a9639>] schedule+0x9/0x20
[6130704.521930] [<ffffffff81177ea5>] start_this_handle+0x365/0x5d0
[6130704.521934] [<ffffffff8105b900>] ? autoremove_wake_function+0x0/0x40
[6130704.521937] [<ffffffff81178306>] jbd2_journal_start+0xa6/0xf0
[6130704.521942] [<ffffffff811593c5>] ext4_journal_start_sb+0x55/0x90
[6130704.521945] [<ffffffff81155441>] ext4_create+0x81/0x130
[6130704.521949] [<ffffffff810d23aa>] vfs_create+0x9a/0xb0
[6130704.521952] [<ffffffff810d522e>] do_filp_open+0x85e/0x990
[6130704.521956] [<ffffffff810d1d16>] ? getname+0x36/0x200
[6130704.521960] [<ffffffff810deada>] ? alloc_fd+0x4a/0x140
[6130704.521964] [<ffffffff810c5dfb>] do_sys_open+0x7b/0x110
[6130704.521967] [<ffffffff810c5ebb>] sys_open+0x1b/0x20
[6130704.521971] [<ffffffff8100bfab>] system_call_fastpath+0x16/0x1b
[6130704.521973] receptd D 000000000000002a 0 10643 2063
[6130704.521977] ffff8800155e9c78 0000000000000082 ffff8800155e9be8
ffffffff81152dd3
[6130704.521982] 0000000000010e80 000000000000c748 ffff880100149560
ffffffff81526360
[6130704.521986] ffff8801001498d0 00000000810db183 000000025adbe421
0000000000000286
[6130704.521991] Call Trace:
[6130704.521994] [<ffffffff81152dd3>] ? dx_release+0x23/0x50
[6130704.521998] [<ffffffff813a9639>] schedule+0x9/0x20
[6130704.522002] [<ffffffff81177ea5>] start_this_handle+0x365/0x5d0
[6130704.522005] [<ffffffff8105b900>] ? autoremove_wake_function+0x0/0x40
[6130704.522009] [<ffffffff81178306>] jbd2_journal_start+0xa6/0xf0
[6130704.522013] [<ffffffff811593c5>] ext4_journal_start_sb+0x55/0x90
[6130704.522016] [<ffffffff81155441>] ext4_create+0x81/0x130
[6130704.522020] [<ffffffff810d23aa>] vfs_create+0x9a/0xb0
[6130704.522023] [<ffffffff810d522e>] do_filp_open+0x85e/0x990
[6130704.522027] [<ffffffff810d1d16>] ? getname+0x36/0x200
[6130704.522030] [<ffffffff810deada>] ? alloc_fd+0x4a/0x140
[6130704.522033] [<ffffffff810c5dfb>] do_sys_open+0x7b/0x110
[6130704.522037] [<ffffffff810c5ebb>] sys_open+0x1b/0x20
[6130704.522040] [<ffffffff8100bfab>] system_call_fastpath+0x16/0x1b
[6130704.522043] receptd D 000000000000002a 0 3658 2063
[6130704.522047] ffff8801792adc78 0000000000000082 ffff8801792adbe8
ffffffff81152dd3
[6130704.522051] 0000000000010e80 000000000000c748 ffff88010014b900
ffffffff81526360
[6130704.522056] ffff88010014bc70 00000000810db183 000000025ddc053f
0000000000000286
[6130704.522060] Call Trace:
[6130704.522063] [<ffffffff81152dd3>] ? dx_release+0x23/0x50
[6130704.522067] [<ffffffff813a9639>] schedule+0x9/0x20
[6130704.522071] [<ffffffff81177ea5>] start_this_handle+0x365/0x5d0
[6130704.522075] [<ffffffff8105b900>] ? autoremove_wake_function+0x0/0x40
[6130704.522079] [<ffffffff81178306>] jbd2_journal_start+0xa6/0xf0
[6130704.522083] [<ffffffff811593c5>] ext4_journal_start_sb+0x55/0x90
[6130704.522086] [<ffffffff81155441>] ext4_create+0x81/0x130
[6130704.522089] [<ffffffff810d23aa>] vfs_create+0x9a/0xb0
[6130704.522093] [<ffffffff810d522e>] do_filp_open+0x85e/0x990
[6130704.522096] [<ffffffff810d1d16>] ? getname+0x36/0x200
[6130704.522099] [<ffffffff810deada>] ? alloc_fd+0x4a/0x140
[6130704.522103] [<ffffffff810c5dfb>] do_sys_open+0x7b/0x110
[6130704.522106] [<ffffffff810c5ebb>] sys_open+0x1b/0x20
[6130704.522109] [<ffffffff8100bfab>] system_call_fastpath+0x16/0x1b
[6130704.522112] receptd D ffff8801f4d04ba4 0 4716 2063
[6130704.522116] ffff880001f43e58 0000000000000082 ffffffff81095e00
ffff880001f43de8
[6130704.522120] 0000000000010e80 000000000000c748 ffff880100149c80
ffffffff81526360
[6130704.522125] ffff880100149ff0 0000000000060d51 000000025fd791a8
0000000000060d51
[6130704.522128] Call Trace:
[6130704.522128] [<ffffffff81095e00>] ? sync_page+0x0/0x60
[6130704.522128] [<ffffffff810b9c48>] ? delete_from_swap_cache+0x48/0x60
[6130704.522128] [<ffffffff810ba70b>] ? swap_info_get+0x6b/0xf0
[6130704.522128] [<ffffffff813a9d7b>] __mutex_lock_killable_slowpath+0xeb/0x1c0
[6130704.522128] [<ffffffff810d6c90>] ? filldir+0x0/0xe0
[6130704.522128] [<ffffffff813a9ea8>] mutex_lock_killable+0x58/0x70
[6130704.522128] [<ffffffff810d6ed2>] vfs_readdir+0x72/0xc0
[6130704.522128] [<ffffffff810d707b>] sys_getdents+0x8b/0xf0
[6130704.522128] [<ffffffff8100bfab>] system_call_fastpath+0x16/0x1b
[6130704.522128] receptd D ffff8801f4d04ba4 0 18596 2063
[6130704.522128] ffff88015a003e58 0000000000000082 ffff88021c9cb540
000000012905b025
[6130704.522128] 0000000000010e80 000000000000c748 ffff8801fb166ae0
ffff88021f843900
[6130704.522128] ffff8801fb166e50 0000000101c0ba80 000000026291bdc0
ffffe20006bcac40
[6130704.522128] Call Trace:
[6130704.522128] [<ffffffff813a9d7b>] __mutex_lock_killable_slowpath+0xeb/0x1c0
[6130704.522128] [<ffffffff810d6c90>] ? filldir+0x0/0xe0
[6130704.522128] [<ffffffff813a9ea8>] mutex_lock_killable+0x58/0x70
[6130704.522128] [<ffffffff810d6ed2>] vfs_readdir+0x72/0xc0
[6130704.522128] [<ffffffff810d707b>] sys_getdents+0x8b/0xf0
[6130704.522128] [<ffffffff8100bfab>] system_call_fastpath+0x16/0x1b
[6130704.522128] receptd D ffff8801f4d04ba4 0 569 2063
[6130704.522128] ffff8801c5705e58 0000000000000082 0000000000000000
ffff88021e7a1150
[6130704.522128] 0000000000010e80 000000000000c748 ffff8801fb161560
ffffffff81526360
[6130704.522128] ffff8801fb1618d0 0000000081580a80 00000002654bd5e6
0000000000000001
[6130704.522128] Call Trace:
[6130704.522128] [<ffffffff810b68b4>] ? anon_vma_prepare+0x34/0xf0
[6130704.522128] [<ffffffff813a9d7b>] __mutex_lock_killable_slowpath+0xeb/0x1c0
[6130704.522128] [<ffffffff810d6c90>] ? filldir+0x0/0xe0
[6130704.522128] [<ffffffff813a9ea8>] mutex_lock_killable+0x58/0x70
[6130704.522128] [<ffffffff810d6ed2>] vfs_readdir+0x72/0xc0
[6130704.522128] [<ffffffff810d707b>] sys_getdents+0x8b/0xf0
[6130704.522128] [<ffffffff8100bfab>] system_call_fastpath+0x16/0x1b
[6130704.522128] receptd D ffff8801f4d04ba4 0 15225 2063
[6130704.522128] ffff880013545e58 0000000000000082 0000000000000000
ffff88021e7a1150
[6130704.522128] 0000000000010e80 000000000000c748 ffff8801fb165ca0
ffffffff81526360
[6130704.522128] ffff8801fb166010 0000000081580a80 000000026805edc6
0000000000000001
[6130704.522128] Call Trace:
[6130704.522128] [<ffffffff810b68b4>] ? anon_vma_prepare+0x34/0xf0
[6130704.522128] [<ffffffff813a9d7b>] __mutex_lock_killable_slowpath+0xeb/0x1c0
[6130704.522128] [<ffffffff810d6c90>] ? filldir+0x0/0xe0
[6130704.522128] [<ffffffff813a9ea8>] mutex_lock_killable+0x58/0x70
[6130704.522128] [<ffffffff810d6ed2>] vfs_readdir+0x72/0xc0
[6130704.522128] [<ffffffff810d707b>] sys_getdents+0x8b/0xf0
[6130704.522128] [<ffffffff8100bfab>] system_call_fastpath+0x16/0x1b
[6130704.522128] receptd D ffff8801f4d04ba4 0 29613 2063
[6130704.522128] ffff88004071de58 0000000000000082 0000000000000000
ffff88021e7a1150
[6130704.522128] 0000000000010e80 000000000000c748 ffff8801fb1663c0
ffffffff81526360
[6130704.522128] ffff8801fb166730 0000000081580a80 000000026ac00606
0000000000000001
[6130704.522128] Call Trace:
[6130704.522128] [<ffffffff810b68b4>] ? anon_vma_prepare+0x34/0xf0
[6130704.522128] [<ffffffff813a9d7b>] __mutex_lock_killable_slowpath+0xeb/0x1c0
[6130704.522128] [<ffffffff810d6c90>] ? filldir+0x0/0xe0
[6130704.522128] [<ffffffff813a9ea8>] mutex_lock_killable+0x58/0x70
[6130704.522128] [<ffffffff810d6ed2>] vfs_readdir+0x72/0xc0
[6130704.522128] [<ffffffff810d707b>] sys_getdents+0x8b/0xf0
[6130704.522128] [<ffffffff8100bfab>] system_call_fastpath+0x16/0x1b
[6130704.522128] receptd D ffff8801f4d04ba4 0 4444 2063
[6130704.522128] ffff8800404e5e58 0000000000000082 0000000000000000
ffff88021e7a1150
[6130704.522128] 0000000000010e80 000000000000c748 ffff880040494020
ffffffff81526360
[6130704.522128] ffff880040494390 0000000081580a80 000000026b2c0f52
0000000000000001
[6130704.522128] Call Trace:
[6130704.522128] [<ffffffff810b68b4>] ? anon_vma_prepare+0x34/0xf0
[6130704.522128] [<ffffffff813a9d7b>] __mutex_lock_killable_slowpath+0xeb/0x1c0
[6130704.522128] [<ffffffff810d6c90>] ? filldir+0x0/0xe0
[6130704.522128] [<ffffffff813a9ea8>] mutex_lock_killable+0x58/0x70
[6130704.522128] [<ffffffff810d6ed2>] vfs_readdir+0x72/0xc0
[6130704.522128] [<ffffffff810d707b>] sys_getdents+0x8b/0xf0
[6130704.522128] [<ffffffff8100bfab>] system_call_fastpath+0x16/0x1b
[6130704.522128] drdbmake D ffff88021e4c7800 0 27019 13796
[6130704.522128] ffff8801d1bcda88 0000000000000082 ffff8801f4ce9bf0
ffff8801678b1380
[6130704.522128] 0000000000010e80 000000000000c748 ffff8800404963c0
ffffffff81526360
[6130704.522128] ffff880040496730 00000000f4ce9bf0 000000025819cebe
0000000000000282
[6130704.522128] Call Trace:
[6130704.522128] [<ffffffff813a9639>] schedule+0x9/0x20
[6130704.522128] [<ffffffff81177ea5>] start_this_handle+0x365/0x5d0
[6130704.522128] [<ffffffff8105b900>] ? autoremove_wake_function+0x0/0x40
[6130704.522128] [<ffffffff811781ce>] jbd2_journal_restart+0xbe/0x150
[6130704.522128] [<ffffffff8116243d>] ext4_ext_truncate+0x6dd/0xa20
[6130704.522128] [<ffffffff81095b3b>] ? find_get_pages+0x3b/0xf0
[6130704.522128] [<ffffffff81150a78>] ext4_truncate+0x198/0x680
[6130704.522128] [<ffffffff810ac984>] ? unmap_mapping_range+0x74/0x280
[6130704.522128] [<ffffffff811772c0>] ? jbd2_journal_stop+0x1e0/0x360
[6130704.522128] [<ffffffff810acd25>] vmtruncate+0xa5/0x110
[6130704.522128] [<ffffffff810dda10>] inode_setattr+0x30/0x180
[6130704.522128] [<ffffffff8114d073>] ext4_setattr+0x173/0x310
[6130704.522128] [<ffffffff810ddc79>] notify_change+0x119/0x330
[6130704.522128] [<ffffffff810c6df3>] do_truncate+0x63/0x90
[6130704.522128] [<ffffffff810d0cc3>] ? get_write_access+0x23/0x60
[6130704.522128] [<ffffffff810c70cb>] sys_truncate+0x17b/0x180
[6130704.522128] [<ffffffff8100bfab>] system_call_fastpath+0x16/0x1b
[6130704.522128] nodeserv D ffff88021eaedca8 0 17888 13796
[6130704.522128] ffff88021eaedbf8 0000000000000082 ffff8800280261e8
ffff88021e5b56e0
[6130704.522128] 0000000000010e80 000000000000c748 ffff880100148e40
ffffffff81526360
[6130704.522128] ffff8801001491b0 000000008119bd17 000000026d668c1b
0000000000000002
[6130704.522128] Call Trace:
[6130704.522128] [<ffffffffa0039291>] ? raid0_unplug+0x51/0x70 [raid0]
[6130704.522128] [<ffffffff813a9639>] schedule+0x9/0x20
[6130704.522128] [<ffffffff813a9687>] io_schedule+0x37/0x50
[6130704.522128] [<ffffffff81095e35>] sync_page+0x35/0x60
[6130704.522128] [<ffffffff81095e69>] sync_page_killable+0x9/0x50
[6130704.522128] [<ffffffff813a99d2>] __wait_on_bit_lock+0x52/0xb0
[6130704.522128] [<ffffffff81095e60>] ? sync_page_killable+0x0/0x50
[6130704.522128] [<ffffffff81095d74>] __lock_page_killable+0x64/0x70
[6130704.522128] [<ffffffff8105b940>] ? wake_bit_function+0x0/0x40
[6130704.522128] [<ffffffff81095c0b>] ? find_get_page+0x1b/0xb0
[6130704.522128] [<ffffffff81097908>] generic_file_aio_read+0x3b8/0x6b0
[6130704.522128] [<ffffffff810c7dc1>] do_sync_read+0xf1/0x140
[6130704.522128] [<ffffffff8106a5e8>] ? do_futex+0xb8/0xb20
[6130704.522128] [<ffffffff813ab78f>] ? _spin_unlock_irqrestore+0x2f/0x40
[6130704.522128] [<ffffffff8105b900>] ? autoremove_wake_function+0x0/0x40
[6130704.522128] [<ffffffff8105bc73>] ? add_wait_queue+0x43/0x60
[6130704.522128] [<ffffffff81062a6c>] ? getnstimeofday+0x5c/0xf0
[6130704.522128] [<ffffffff8105f049>] ? ktime_get_ts+0x59/0x60
[6130704.522128] [<ffffffff810c85b8>] vfs_read+0xc8/0x170
[6130704.522128] [<ffffffff810c86fa>] sys_pread64+0x9a/0xa0
[6130704.522128] [<ffffffff8100bfab>] system_call_fastpath+0x16/0x1b
[6130704.522128] Sched Debug Version: v0.09, 2.6.30.4 #1
[6130704.522128] now at 6130704522.576956 msecs
[6130704.522128] .jiffies : 10425371817
[6130704.522128] .sysctl_sched_latency : 40.000000
[6130704.522128] .sysctl_sched_min_granularity : 8.000000
[6130704.522128] .sysctl_sched_wakeup_granularity : 10.000000
[6130704.522128] .sysctl_sched_child_runs_first : 0.000001
[6130704.522128] .sysctl_sched_features : 113917
[6130704.522128]
[6130704.522128] cpu#0, 1866.760 MHz
[6130704.522128] .nr_running : 0
[6130704.522128] .load : 0
[6130704.522128] .nr_switches : 1088197559
[6130704.522128] .nr_load_updates : 1494088224
[6130704.522128] .nr_uninterruptible : -10143689
[6130704.522128] .next_balance : 10425.371927
[6130704.522128] .curr->pid : 0
[6130704.522128] .clock : 6130704522.001795
[6130704.522128] .cpu_load[0] : 0
[6130704.522128] .cpu_load[1] : 732
[6130704.522128] .cpu_load[2] : 1307
[6130704.522128] .cpu_load[3] : 1628
[6130704.522128] .cpu_load[4] : 2055
[6130704.522128] .yld_count : 62066
[6130704.522128] .sched_switch : 0
[6130704.522128] .sched_count : 1162815210
[6130704.522128] .sched_goidle : 486300434
[6130704.522128] .ttwu_count : 639575934
[6130704.522128] .ttwu_local : 567060963
[6130704.522128] .bkl_count : 643
[6130704.522128]
[6130704.522128] cfs_rq[0]:
[6130704.522128] .exec_clock : 1105652808.029192
[6130704.522128] .MIN_vruntime : 0.000001
[6130704.522128] .min_vruntime : 7063664391.526226
[6130704.522128] .max_vruntime : 0.000001
[6130704.522128] .spread : 0.000000
[6130704.522128] .spread0 : 0.000000
[6130704.522128] .nr_running : 0
[6130704.522128] .load : 0
[6130704.522128] .nr_spread_over : 8978
[6130704.522128]
[6130704.522128] rt_rq[0]:
[6130704.522128] .rt_nr_running : 0
[6130704.522128] .rt_throttled : 0
[6130704.522128] .rt_time : 0.000000
[6130704.522128] .rt_runtime : 950.000000
[6130704.522128]
[6130704.522128] runnable tasks:
[6130704.522128] task PID tree-key switches
prio exec-runtime sum-exec sum-sleep
[6130704.522128]
----------------------------------------------------------------------------------------------------------
[6130704.522128]
[6130704.522128] cpu#1, 1866.760 MHz
[6130704.522128] .nr_running : 1
[6130704.522128] .load : 1024
[6130704.522128] .nr_switches : 818082742
[6130704.522128] .nr_load_updates : 1586872071
[6130704.522128] .nr_uninterruptible : 10143702
[6130704.522128] .next_balance : 10425.371857
[6130704.522128] .curr->pid : 11137
[6130704.522128] .clock : 6130704521.459918
[6130704.522128] .cpu_load[0] : 1024
[6130704.522128] .cpu_load[1] : 514
[6130704.522128] .cpu_load[2] : 318
[6130704.522128] .cpu_load[3] : 299
[6130704.522128] .cpu_load[4] : 261
[6130704.522128] .yld_count : 283011
[6130704.522128] .sched_switch : 0
[6130704.522128] .sched_count : 947252346
[6130704.522128] .sched_goidle : 479113257
[6130704.522128] .ttwu_count : 422473135
[6130704.522128] .ttwu_local : 316765436
[6130704.522128] .bkl_count : 10
[6130704.522128]
[6130704.522128] cfs_rq[1]:
[6130704.522128] .exec_clock : 478641403.088793
[6130704.522128] .MIN_vruntime : 0.000001
[6130704.522128] .min_vruntime : 8051549362.514115
[6130704.522128] .max_vruntime : 0.000001
[6130704.522128] .spread : 0.000000
[6130704.522128] .spread0 : 987884970.987889
[6130704.522128] .nr_running : 1
[6130704.522128] .load : 1024
[6130704.522128] .nr_spread_over : 11703
[6130704.522128]
[6130704.522128] rt_rq[1]:
[6130704.522128] .rt_nr_running : 0
[6130704.522128] .rt_throttled : 0
[6130704.522128] .rt_time : 0.000000
[6130704.522128] .rt_runtime : 950.000000
[6130704.522128]
[6130704.522128] runnable tasks:
[6130704.522128] task PID tree-key switches
prio exec-runtime sum-exec sum-sleep
[6130704.522128]
----------------------------------------------------------------------------------------------------------
[6130704.522128] R bash 11137 8051549322.514115 272
120 8051549322.514115 35.937180 689280132.211960
[6130704.522128]


Thanks,
Fredrik

On Wed, Oct 21, 2009 at 4:20 AM, Eric Sandeen <[email protected]> wrote:
> Fredrik Andersson wrote:
>>
>> Hi, here is the data for this process:
>
> Including all of the processes in D state (everything reported by sysrq-w)
> would probably be most helpful.
>
> Feel free to file an ext4 bug on bugzilla.kernel.org w/ this information,
> too, so it doesn't get lost in busy schedules ...
>
> Thanks,
> -Eric
>
>> [5958816.744013] drdbmake      D ffff88021e4c7800     0 27019  13796
>> [5958816.744013]  ffff8801d1bcda88 0000000000000082 ffff8801f4ce9bf0 ffff8801678b1380
>> [5958816.744013]  0000000000010e80 000000000000c748 ffff8800404963c0 ffffffff81526360
>> [5958816.744013]  ffff880040496730 00000000f4ce9bf0 000000025819cebe 0000000000000282
>> [5958816.744013] Call Trace:
>> [5958816.744013]  [<ffffffff813a9639>] schedule+0x9/0x20
>> [5958816.744013]  [<ffffffff81177ea5>] start_this_handle+0x365/0x5d0
>> [5958816.744013]  [<ffffffff8105b900>] ? autoremove_wake_function+0x0/0x40
>> [5958816.744013]  [<ffffffff811781ce>] jbd2_journal_restart+0xbe/0x150
>> [5958816.744013]  [<ffffffff8116243d>] ext4_ext_truncate+0x6dd/0xa20
>> [5958816.744013]  [<ffffffff81095b3b>] ? find_get_pages+0x3b/0xf0
>> [5958816.744013]  [<ffffffff81150a78>] ext4_truncate+0x198/0x680
>> [5958816.744013]  [<ffffffff810ac984>] ? unmap_mapping_range+0x74/0x280
>> [5958816.744013]  [<ffffffff811772c0>] ? jbd2_journal_stop+0x1e0/0x360
>> [5958816.744013]  [<ffffffff810acd25>] vmtruncate+0xa5/0x110
>> [5958816.744013]  [<ffffffff810dda10>] inode_setattr+0x30/0x180
>> [5958816.744013]  [<ffffffff8114d073>] ext4_setattr+0x173/0x310
>> [5958816.744013]  [<ffffffff810ddc79>] notify_change+0x119/0x330
>> [5958816.744013]  [<ffffffff810c6df3>] do_truncate+0x63/0x90
>> [5958816.744013]  [<ffffffff810d0cc3>] ? get_write_access+0x23/0x60
>> [5958816.744013]  [<ffffffff810c70cb>] sys_truncate+0x17b/0x180
>> [5958816.744013]  [<ffffffff8100bfab>] system_call_fastpath+0x16/0x1b
>>
>> Don't know if this has anything to do with it, but I also noticed
>> that another process of mine, which is working just fine, is executing
>> a suspicious looking function called raid0_unplug.
>> It operates on the same raid0/ext4 filesystem as the hung process. I
>> include the calltrace for it here too:
>>
>> [5958816.744013] nodeserv      D ffff880167bd7ca8     0 17900  13796
>> [5958816.744013]  ffff880167bd7bf8 0000000000000082 ffff88002800a588 ffff88021e5b56e0
>> [5958816.744013]  0000000000010e80 000000000000c748 ffff880100664020 ffffffff81526360
>> [5958816.744013]  ffff880100664390 000000008119bd17 000000026327bfa9 0000000000000002
>> [5958816.744013] Call Trace:
>> [5958816.744013]  [<ffffffffa0039291>] ? raid0_unplug+0x51/0x70 [raid0]
>> [5958816.744013]  [<ffffffff813a9639>] schedule+0x9/0x20
>> [5958816.744013]  [<ffffffff813a9687>] io_schedule+0x37/0x50
>> [5958816.744013]  [<ffffffff81095e35>] sync_page+0x35/0x60
>> [5958816.744013]  [<ffffffff81095e69>] sync_page_killable+0x9/0x50
>> [5958816.744013]  [<ffffffff813a99d2>] __wait_on_bit_lock+0x52/0xb0
>> [5958816.744013]  [<ffffffff81095e60>] ? sync_page_killable+0x0/0x50
>> [5958816.744013]  [<ffffffff81095d74>] __lock_page_killable+0x64/0x70
>> [5958816.744013]  [<ffffffff8105b940>] ? wake_bit_function+0x0/0x40
>> [5958816.744013]  [<ffffffff81095c0b>] ? find_get_page+0x1b/0xb0
>> [5958816.744013]  [<ffffffff81097908>] generic_file_aio_read+0x3b8/0x6b0
>> [5958816.744013]  [<ffffffff810c7dc1>] do_sync_read+0xf1/0x140
>> [5958816.744013]  [<ffffffff8106a5e8>] ? do_futex+0xb8/0xb20
>> [5958816.744013]  [<ffffffff813ab78f>] ? _spin_unlock_irqrestore+0x2f/0x40
>> [5958816.744013]  [<ffffffff8105b900>] ? autoremove_wake_function+0x0/0x40
>> [5958816.744013]  [<ffffffff8105bc73>] ? add_wait_queue+0x43/0x60
>> [5958816.744013]  [<ffffffff81062a6c>] ? getnstimeofday+0x5c/0xf0
>> [5958816.744013]  [<ffffffff810c85b8>] vfs_read+0xc8/0x170
>> [5958816.744013]  [<ffffffff810c86fa>] sys_pread64+0x9a/0xa0
>> [5958816.744013]  [<ffffffff8100bfab>] system_call_fastpath+0x16/0x1b
>>
>> Hope this makes sense to anyone, and please let me know if there is
>> more info I can provide.
>>
>> /Fredrik
>>
>> On Sun, Oct 18, 2009 at 5:57 PM, Eric Sandeen <[email protected]> wrote:
>>>
>>> Fredrik Andersson wrote:
>>>>
>>>> Hi, I'd like to report what I'm fairly certain is an ext4 bug. I hope
>>>> this is the right place to do so.
>>>>
>>>> My program creates a big file (around 30 GB) with posix_fallocate (to
>>>> utilize extents), fills it with data and uses ftruncate to crop it to
>>>> its final size (usually somewhere between 20 and 25 GB).
>>>> The problem is that in around 5% of the cases, the program locks up
>>>> completely in a syscall. The process can thus not be killed even with
>>>> kill -9, and a reboot is all that will do.
>>>
>>> does echo w > /proc/sysrq-trigger (this does sleeping processes; or use
>>> echo t for all processes) show you where the stuck threads are?
>>>
>>> -Eric
>>>
>
>

2009-10-21 22:48:09

by Mingming Cao

Subject: Re: Fwd: Ext4 bug with fallocate

On Tue, 2009-10-20 at 18:49 +0200, Fredrik Andersson wrote:
> I found the following post to the ext4 list. This seems to fit my
> experienced problems pretty exactly.
>
> http://osdir.com/ml/linux-ext4/2009-08/msg00184.html
>
> Is it the same problem?
>

The link you provided is about a race between restarting a transaction
from truncate while another process is doing something like block
allocation to the same file. Do you have other threads allocating
blocks while you are truncating?
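
Just to illustrate what I mean, the race needs a pattern roughly like
the sketch below, where one thread keeps allocating blocks by writing
into the file while another thread truncates the same inode (file name
and sizes are made up, and it is only a sketch, not your code):

#include <fcntl.h>
#include <pthread.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

static int fd;

/* Keeps allocating new blocks by writing further and further into the file. */
static void *writer(void *arg)
{
    (void)arg;
    char buf[4096];
    memset(buf, 0xab, sizeof(buf));
    for (off_t off = 0; off < (off_t)1 << 30; off += (off_t)sizeof(buf))
        pwrite(fd, buf, sizeof(buf), off);
    return NULL;
}

int main(void)
{
    fd = open("testfile", O_RDWR | O_CREAT, 0644);
    if (fd < 0) { perror("open"); return 1; }

    pthread_t t;
    pthread_create(&t, NULL, writer, NULL);

    /* Truncate the same inode while the writer thread is still allocating. */
    ftruncate(fd, 100L * 1024 * 1024);

    pthread_join(t, NULL);
    close(fd);
    return 0;
}

If nothing in your program does anything like the writer thread above
while the truncate runs, then it is probably not that exact race.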

> /Fredrik
>
> On Mon, Oct 19, 2009 at 11:49 AM, Fredrik Andersson <[email protected]> wrote:
> > Hi, here is the data for this process:
> >
> > [5958816.744013] drdbmake D ffff88021e4c7800 0 27019 13796
> > [5958816.744013] ffff8801d1bcda88 0000000000000082 ffff8801f4ce9bf0
> > ffff8801678b1380
> > [5958816.744013] 0000000000010e80 000000000000c748 ffff8800404963c0
> > ffffffff81526360
> > [5958816.744013] ffff880040496730 00000000f4ce9bf0 000000025819cebe
> > 0000000000000282
> > [5958816.744013] Call Trace:
> > [5958816.744013] [<ffffffff813a9639>] schedule+0x9/0x20
> > [5958816.744013] [<ffffffff81177ea5>] start_this_handle+0x365/0x5d0
> > [5958816.744013] [<ffffffff8105b900>] ? autoremove_wake_function+0x0/
> > 0x40
> > [5958816.744013] [<ffffffff811781ce>] jbd2_journal_restart+0xbe/0x150
> > [5958816.744013] [<ffffffff8116243d>] ext4_ext_truncate+0x6dd/0xa20
> > [5958816.744013] [<ffffffff81095b3b>] ? find_get_pages+0x3b/0xf0
> > [5958816.744013] [<ffffffff81150a78>] ext4_truncate+0x198/0x680
> > [5958816.744013] [<ffffffff810ac984>] ? unmap_mapping_range+0x74/0x280
> > [5958816.744013] [<ffffffff811772c0>] ? jbd2_journal_stop+0x1e0/0x360
> > [5958816.744013] [<ffffffff810acd25>] vmtruncate+0xa5/0x110
> > [5958816.744013] [<ffffffff810dda10>] inode_setattr+0x30/0x180
> > [5958816.744013] [<ffffffff8114d073>] ext4_setattr+0x173/0x310
> > [5958816.744013] [<ffffffff810ddc79>] notify_change+0x119/0x330
> > [5958816.744013] [<ffffffff810c6df3>] do_truncate+0x63/0x90
> > [5958816.744013] [<ffffffff810d0cc3>] ? get_write_access+0x23/0x60
> > [5958816.744013] [<ffffffff810c70cb>] sys_truncate+0x17b/0x180
> > [5958816.744013] [<ffffffff8100bfab>] system_call_fastpath+0x16/0x1b
> >
> > Don't know if this has anything to do with it, but I also noticed
> > that another process of mine,
> > which is working just fine, is executing a suspicious looking function
> > called raid0_unplug.
> > It operates on the same raid0/ext4 filesystem as the hung process. I
> > include the calltrace for it here too:
> >
> > [5958816.744013] nodeserv D ffff880167bd7ca8 0 17900 13796
> > [5958816.744013] ffff880167bd7bf8 0000000000000082 ffff88002800a588
> > ffff88021e5b56e0
> > [5958816.744013] 0000000000010e80 000000000000c748 ffff880100664020
> > ffffffff81526360
> > [5958816.744013] ffff880100664390 000000008119bd17 000000026327bfa9
> > 0000000000000002
> > [5958816.744013] Call Trace:
> > [5958816.744013] [<ffffffffa0039291>] ? raid0_unplug+0x51/0x70 [raid0]
> > [5958816.744013] [<ffffffff813a9639>] schedule+0x9/0x20
> > [5958816.744013] [<ffffffff813a9687>] io_schedule+0x37/0x50
> > [5958816.744013] [<ffffffff81095e35>] sync_page+0x35/0x60
> > [5958816.744013] [<ffffffff81095e69>] sync_page_killable+0x9/0x50
> > [5958816.744013] [<ffffffff813a99d2>] __wait_on_bit_lock+0x52/0xb0
> > [5958816.744013] [<ffffffff81095e60>] ? sync_page_killable+0x0/0x50
> > [5958816.744013] [<ffffffff81095d74>] __lock_page_killable+0x64/0x70
> > [5958816.744013] [<ffffffff8105b940>] ? wake_bit_function+0x0/0x40
> > [5958816.744013] [<ffffffff81095c0b>] ? find_get_page+0x1b/0xb0
> > [5958816.744013] [<ffffffff81097908>] generic_file_aio_read+0x3b8/0x6b0
> > [5958816.744013] [<ffffffff810c7dc1>] do_sync_read+0xf1/0x140
> > [5958816.744013] [<ffffffff8106a5e8>] ? do_futex+0xb8/0xb20
> > [5958816.744013] [<ffffffff813ab78f>] ? _spin_unlock_irqrestore+0x2f/0x40
> > [5958816.744013] [<ffffffff8105b900>] ? autoremove_wake_function+0x0/0x40
> > [5958816.744013] [<ffffffff8105bc73>] ? add_wait_queue+0x43/0x60
> > [5958816.744013] [<ffffffff81062a6c>] ? getnstimeofday+0x5c/0xf0
> > [5958816.744013] [<ffffffff810c85b8>] vfs_read+0xc8/0x170
> > [5958816.744013] [<ffffffff810c86fa>] sys_pread64+0x9a/0xa0
> > [5958816.744013] [<ffffffff8100bfab>] system_call_fastpath+0x16/0x1b
> >

This stack looks to me like the thread is doing I/O that never comes back.

> > Hope this makes sense to anyone, and please let me know if there is
> > more info I can provide.
> >
> > /Fredrik
> >
> > On Sun, Oct 18, 2009 at 5:57 PM, Eric Sandeen <[email protected]> wrote:
> >>
> >> Fredrik Andersson wrote:
> >>>
> >>> Hi, I'd like to report what I'm fairly certain is an ext4 bug. I hope
> >>> this is the right place to do so.
> >>>
> >>> My program creates a big file (around 30 GB) with posix_fallocate (to
> >>> utilize extents), fills it with data and uses ftruncate to crop it to
> >>> its final size (usually somewhere between 20 and 25 GB).
> >>> The problem is that in around 5% of the cases, the program locks up
> >>> completely in a syscall. The process can thus not be killed even with
> >>> kill -9, and a reboot is all that will do.
> >>
> >> does echo w > /proc/sysrq-trigger (this does sleeping processes; or use echo t for all processes) show you where the stuck threads are?
> >>
> >> -Eric
> >>
> >



2009-10-22 07:37:06

by Fredrik Andersson

Subject: Re: Fwd: Ext4 bug with fallocate

> The link you provided about is related to a race between restart a
> transaction from truncate, and the other process is doing something like
> block allocation to the same file. Do you have another threads
> allocating blocks while you are truncating?

No, there is only a single thread in this program. No other thread or
process is accessing the same file or file descriptor.

/Fredrik

2009-10-26 10:43:03

by Fredrik Andersson

Subject: Re: Fwd: Ext4 bug with fallocate

I had the same error again, on another machine. I'm posting the call
stacks of the hung processes here too.
Hope someone can figure this out.

[6515054.873144] INFO: task pdflush:269 blocked for more than 120 seconds.
[6515054.873148] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[6515054.873151] pdflush D ffff8801f4cf0ad4 0 269 2
[6515054.873156] ffff88021e551960 0000000000000046 ffff88021e63d6e0
ffff88007ddd1540
[6515054.873161] 0000000000010e80 000000000000c748 ffff88021fa2c020
ffff88021f843900
[6515054.873166] ffff88021fa2c390 0000000100000008 00000002844bfcb2
0080001000800013
[6515054.873170] Call Trace:
[6515054.873180] [<ffffffff813a9639>] schedule+0x9/0x20
[6515054.873184] [<ffffffff813ab2d5>] __down_read+0x85/0xb7
[6515054.873187] [<ffffffff813aa565>] down_read+0x25/0x30
[6515054.873192] [<ffffffff8114e6f2>] ext4_get_blocks_wrap+0x52/0x290
[6515054.873196] [<ffffffff8114f601>] mpage_da_map_blocks+0xd1/0x7a0
[6515054.873202] [<ffffffff8109d12c>] ? write_cache_pages+0x2fc/0x470
[6515054.873205] [<ffffffff81150330>] ? __mpage_da_writepage+0x0/0x180
[6515054.873209] [<ffffffff8114ffdb>] ext4_da_writepages+0x30b/0x590
[6515054.873213] [<ffffffff8103d91b>] ? finish_task_switch+0x5b/0xd0
[6515054.873217] [<ffffffff8109d2f8>] do_writepages+0x28/0x50
[6515054.873222] [<ffffffff810e60fb>] __writeback_single_inode+0x9b/0x3e0
[6515054.873226] [<ffffffff8104f830>] ? process_timeout+0x0/0x10
[6515054.873230] [<ffffffff813a8827>] ? io_schedule_timeout+0x37/0x50
[6515054.873234] [<ffffffff810e68ec>] generic_sync_sb_inodes+0x36c/0x440
[6515054.873237] [<ffffffff810e6b8e>] writeback_inodes+0x5e/0x100
[6515054.873241] [<ffffffff8109d4ac>] wb_kupdate+0xbc/0x140
[6515054.873244] [<ffffffff8109e6a8>] pdflush+0x118/0x220
[6515054.873248] [<ffffffff8109d3f0>] ? wb_kupdate+0x0/0x140
[6515054.873251] [<ffffffff8109e590>] ? pdflush+0x0/0x220
[6515054.873255] [<ffffffff8105b4c6>] kthread+0x56/0x90
[6515054.873259] [<ffffffff8100cf6a>] child_rip+0xa/0x20
[6515054.873263] [<ffffffff8100c969>] ? restore_args+0x0/0x30
[6515054.873266] [<ffffffff8105b470>] ? kthread+0x0/0x90
[6515054.873270] [<ffffffff8100cf60>] ? child_rip+0x0/0x20
[6515054.873275] INFO: task kjournald2:1334 blocked for more than 120 seconds.
[6515054.873277] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[6515054.873279] kjournald2 D ffff88021e731e28 0 1334 2
[6515054.873283] ffff88021e731d10 0000000000000046 ffff88021e731ca0
ffffffff813a9b3b
[6515054.873288] 0000000000010e80 000000000000c748 ffff88021f9c5580
ffff88021f843900
[6515054.873292] ffff88021f9c58f0 0000000100000282 00000002844c0480
0000000000000282
[6515054.873296] Call Trace:
[6515054.873300] [<ffffffff813a9b3b>] ? __wait_on_bit+0x7b/0x90
[6515054.873303] [<ffffffff813a9639>] schedule+0x9/0x20
[6515054.873308] [<ffffffff81178ebe>]
jbd2_journal_commit_transaction+0x1be/0x1810
[6515054.873312] [<ffffffff81040cd6>] ? dequeue_task_fair+0x256/0x260
[6515054.873315] [<ffffffff8103d91b>] ? finish_task_switch+0x5b/0xd0
[6515054.873320] [<ffffffff8105b900>] ? autoremove_wake_function+0x0/0x40
[6515054.873324] [<ffffffff8117ea8a>] kjournald2+0x11a/0x370
[6515054.873327] [<ffffffff8105b900>] ? autoremove_wake_function+0x0/0x40
[6515054.873330] [<ffffffff8117e970>] ? kjournald2+0x0/0x370
[6515054.873333] [<ffffffff8117e970>] ? kjournald2+0x0/0x370
[6515054.873337] [<ffffffff8105b4c6>] kthread+0x56/0x90
[6515054.873340] [<ffffffff8100cf6a>] child_rip+0xa/0x20
[6515054.873344] [<ffffffff8100c969>] ? restore_args+0x0/0x30
[6515054.873347] [<ffffffff8105b470>] ? kthread+0x0/0x90
[6515054.873351] [<ffffffff8100cf60>] ? child_rip+0x0/0x20
[6515054.873361] INFO: task drdbmake:3919 blocked for more than 120 seconds.
[6515054.873363] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[6515054.873365] drdbmake D ffff88021e56f000 0 3919 27107
[6515054.873369] ffff88021eb4da88 0000000000000082 ffff8801f4cf0ae8
ffff880008ba03f0
[6515054.873374] 0000000000010e80 000000000000c748 ffff8801049063c0
ffff88021f843900
[6515054.873378] ffff880104906730 00000001f4cf0ae8 00000002844c0480
0000000000000282
[6515054.873382] Call Trace:
[6515054.873386] [<ffffffff813a9639>] schedule+0x9/0x20
[6515054.873390] [<ffffffff81177ea5>] start_this_handle+0x365/0x5d0
[6515054.873394] [<ffffffff8105b900>] ? autoremove_wake_function+0x0/0x40
[6515054.873397] [<ffffffff811781ce>] jbd2_journal_restart+0xbe/0x150
[6515054.873402] [<ffffffff8116243d>] ext4_ext_truncate+0x6dd/0xa20
[6515054.873406] [<ffffffff81095b3b>] ? find_get_pages+0x3b/0xf0
[6515054.873409] [<ffffffff81150a78>] ext4_truncate+0x198/0x680
[6515054.873414] [<ffffffff810ac984>] ? unmap_mapping_range+0x74/0x280
[6515054.873418] [<ffffffff811772c0>] ? jbd2_journal_stop+0x1e0/0x360
[6515054.873422] [<ffffffff810acd25>] vmtruncate+0xa5/0x110
[6515054.873426] [<ffffffff810dda10>] inode_setattr+0x30/0x180
[6515054.873430] [<ffffffff8114d073>] ext4_setattr+0x173/0x310
[6515054.873433] [<ffffffff810ddc79>] notify_change+0x119/0x330
[6515054.873438] [<ffffffff810c6df3>] do_truncate+0x63/0x90
[6515054.873441] [<ffffffff810d0cc3>] ? get_write_access+0x23/0x60
[6515054.873445] [<ffffffff810c70cb>] sys_truncate+0x17b/0x180
[6515054.873448] [<ffffffff8100bfab>] system_call_fastpath+0x16/0x1b
[6515174.873066] INFO: task pdflush:269 blocked for more than 120 seconds.
[6515174.873070] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[6515174.873074] pdflush D ffff8801f4cf0ad4 0 269 2
[6515174.873079] ffff88021e551960 0000000000000046 ffff88021e63d6e0
ffff88007ddd1540
[6515174.873085] 0000000000010e80 000000000000c748 ffff88021fa2c020
ffff88021f843900
[6515174.873090] ffff88021fa2c390 0000000100000008 00000002844bfcb2
0080001000800013
[6515174.873096] Call Trace:
[6515174.873106] [<ffffffff813a9639>] schedule+0x9/0x20
[6515174.873110] [<ffffffff813ab2d5>] __down_read+0x85/0xb7
[6515174.873115] [<ffffffff813aa565>] down_read+0x25/0x30
[6515174.873120] [<ffffffff8114e6f2>] ext4_get_blocks_wrap+0x52/0x290
[6515174.873124] [<ffffffff8114f601>] mpage_da_map_blocks+0xd1/0x7a0
[6515174.873138] [<ffffffff8109d12c>] ? write_cache_pages+0x2fc/0x470
[6515174.873146] [<ffffffff81150330>] ? __mpage_da_writepage+0x0/0x180
[6515174.873155] [<ffffffff8114ffdb>] ext4_da_writepages+0x30b/0x590
[6515174.873164] [<ffffffff8103d91b>] ? finish_task_switch+0x5b/0xd0
[6515174.873172] [<ffffffff8109d2f8>] do_writepages+0x28/0x50
[6515174.873181] [<ffffffff810e60fb>] __writeback_single_inode+0x9b/0x3e0
[6515174.873190] [<ffffffff8104f830>] ? process_timeout+0x0/0x10
[6515174.873197] [<ffffffff813a8827>] ? io_schedule_timeout+0x37/0x50
[6515174.873206] [<ffffffff810e68ec>] generic_sync_sb_inodes+0x36c/0x440
[6515174.873214] [<ffffffff810e6b8e>] writeback_inodes+0x5e/0x100
[6515174.873222] [<ffffffff8109d4ac>] wb_kupdate+0xbc/0x140
[6515174.873230] [<ffffffff8109e6a8>] pdflush+0x118/0x220
[6515174.873238] [<ffffffff8109d3f0>] ? wb_kupdate+0x0/0x140
[6515174.873245] [<ffffffff8109e590>] ? pdflush+0x0/0x220
[6515174.873253] [<ffffffff8105b4c6>] kthread+0x56/0x90
[6515174.873261] [<ffffffff8100cf6a>] child_rip+0xa/0x20
[6515174.873269] [<ffffffff8100c969>] ? restore_args+0x0/0x30
[6515174.873277] [<ffffffff8105b470>] ? kthread+0x0/0x90
[6515174.873284] [<ffffffff8100cf60>] ? child_rip+0x0/0x20
[6515174.873303] INFO: task kjournald2:1334 blocked for more than 120 seconds.
[6515174.873308] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[6515174.873314] kjournald2 D ffff88021e731e28 0 1334 2
[6515174.873322] ffff88021e731d10 0000000000000046 ffff88021e731ca0
ffffffff813a9b3b
[6515174.873333] 0000000000010e80 000000000000c748 ffff88021f9c5580
ffff88021f843900
[6515174.873344] ffff88021f9c58f0 0000000100000282 00000002844c0480
0000000000000282
[6515174.873355] Call Trace:
[6515174.873360] [<ffffffff813a9b3b>] ? __wait_on_bit+0x7b/0x90
[6515174.873367] [<ffffffff813a9639>] schedule+0x9/0x20
[6515174.873375] [<ffffffff81178ebe>]
jbd2_journal_commit_transaction+0x1be/0x1810
[6515174.873382] [<ffffffff81040cd6>] ? dequeue_task_fair+0x256/0x260
[6515174.873389] [<ffffffff8103d91b>] ? finish_task_switch+0x5b/0xd0
[6515174.873396] [<ffffffff8105b900>] ? autoremove_wake_function+0x0/0x40
[6515174.873404] [<ffffffff8117ea8a>] kjournald2+0x11a/0x370
[6515174.873411] [<ffffffff8105b900>] ? autoremove_wake_function+0x0/0x40
[6515174.873418] [<ffffffff8117e970>] ? kjournald2+0x0/0x370
[6515174.873424] [<ffffffff8117e970>] ? kjournald2+0x0/0x370
[6515174.873431] [<ffffffff8105b4c6>] kthread+0x56/0x90
[6515174.873437] [<ffffffff8100cf6a>] child_rip+0xa/0x20
[6515174.873444] [<ffffffff8100c969>] ? restore_args+0x0/0x30
[6515174.873450] [<ffffffff8105b470>] ? kthread+0x0/0x90
[6515174.873457] [<ffffffff8100cf60>] ? child_rip+0x0/0x20
[6515174.873470] INFO: task drdbmake:3919 blocked for more than 120 seconds.
[6515174.873475] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[6515174.873480] drdbmake D ffff88021e56f000 0 3919 27107
[6515174.873489] ffff88021eb4da88 0000000000000082 ffff8801f4cf0ae8
ffff880008ba03f0
[6515174.873499] 0000000000010e80 000000000000c748 ffff8801049063c0
ffff88021f843900
[6515174.873510] ffff880104906730 00000001f4cf0ae8 00000002844c0480
0000000000000282
[6515174.873521] Call Trace:
[6515174.873526] [<ffffffff813a9639>] schedule+0x9/0x20
[6515174.873534] [<ffffffff81177ea5>] start_this_handle+0x365/0x5d0
[6515174.873541] [<ffffffff8105b900>] ? autoremove_wake_function+0x0/0x40
[6515174.873548] [<ffffffff811781ce>] jbd2_journal_restart+0xbe/0x150
[6515174.873556] [<ffffffff8116243d>] ext4_ext_truncate+0x6dd/0xa20
[6515174.873563] [<ffffffff81095b3b>] ? find_get_pages+0x3b/0xf0
[6515174.873569] [<ffffffff81150a78>] ext4_truncate+0x198/0x680
[6515174.873578] [<ffffffff810ac984>] ? unmap_mapping_range+0x74/0x280
[6515174.873585] [<ffffffff811772c0>] ? jbd2_journal_stop+0x1e0/0x360
[6515174.873592] [<ffffffff810acd25>] vmtruncate+0xa5/0x110
[6515174.873600] [<ffffffff810dda10>] inode_setattr+0x30/0x180
[6515174.873607] [<ffffffff8114d073>] ext4_setattr+0x173/0x310
[6515174.873614] [<ffffffff810ddc79>] notify_change+0x119/0x330
[6515174.873621] [<ffffffff810c6df3>] do_truncate+0x63/0x90
[6515174.873628] [<ffffffff810d0cc3>] ? get_write_access+0x23/0x60
[6515174.873635] [<ffffffff810c70cb>] sys_truncate+0x17b/0x180
[6515174.873642] [<ffffffff8100bfab>] system_call_fastpath+0x16/0x1b
[6515294.873034] INFO: task pdflush:269 blocked for more than 120 seconds.
[6515294.873038] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[6515294.873041] pdflush D ffff8801f4cf0ad4 0 269 2
[6515294.873046] ffff88021e551960 0000000000000046 ffff88021e63d6e0
ffff88007ddd1540
[6515294.873053] 0000000000010e80 000000000000c748 ffff88021fa2c020
ffff88021f843900
[6515294.873058] ffff88021fa2c390 0000000100000008 00000002844bfcb2
0080001000800013
[6515294.873063] Call Trace:
[6515294.873073] [<ffffffff813a9639>] schedule+0x9/0x20
[6515294.873078] [<ffffffff813ab2d5>] __down_read+0x85/0xb7
[6515294.873082] [<ffffffff813aa565>] down_read+0x25/0x30
[6515294.873088] [<ffffffff8114e6f2>] ext4_get_blocks_wrap+0x52/0x290
[6515294.873092] [<ffffffff8114f601>] mpage_da_map_blocks+0xd1/0x7a0
[6515294.873099] [<ffffffff8109d12c>] ? write_cache_pages+0x2fc/0x470
[6515294.873103] [<ffffffff81150330>] ? __mpage_da_writepage+0x0/0x180
[6515294.873107] [<ffffffff8114ffdb>] ext4_da_writepages+0x30b/0x590
[6515294.873112] [<ffffffff8103d91b>] ? finish_task_switch+0x5b/0xd0
[6515294.873117] [<ffffffff8109d2f8>] do_writepages+0x28/0x50
[6515294.873123] [<ffffffff810e60fb>] __writeback_single_inode+0x9b/0x3e0
[6515294.873136] [<ffffffff8104f830>] ? process_timeout+0x0/0x10
[6515294.873144] [<ffffffff813a8827>] ? io_schedule_timeout+0x37/0x50
[6515294.873153] [<ffffffff810e68ec>] generic_sync_sb_inodes+0x36c/0x440
[6515294.873161] [<ffffffff810e6b8e>] writeback_inodes+0x5e/0x100
[6515294.873169] [<ffffffff8109d4ac>] wb_kupdate+0xbc/0x140
[6515294.873177] [<ffffffff8109e6a8>] pdflush+0x118/0x220
[6515294.873185] [<ffffffff8109d3f0>] ? wb_kupdate+0x0/0x140
[6515294.873192] [<ffffffff8109e590>] ? pdflush+0x0/0x220
[6515294.873200] [<ffffffff8105b4c6>] kthread+0x56/0x90
[6515294.873209] [<ffffffff8100cf6a>] child_rip+0xa/0x20
[6515294.873217] [<ffffffff8100c969>] ? restore_args+0x0/0x30
[6515294.873224] [<ffffffff8105b470>] ? kthread+0x0/0x90
[6515294.873232] [<ffffffff8100cf60>] ? child_rip+0x0/0x20
[6515294.873241] INFO: task kjournald2:1334 blocked for more than 120 seconds.
[6515294.873247] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[6515294.873253] kjournald2 D ffff88021e731e28 0 1334 2
[6515294.873273] ffff88021e731d10 0000000000000046 ffff88021e731ca0
ffffffff813a9b3b
[6515294.873284] 0000000000010e80 000000000000c748 ffff88021f9c5580
ffff88021f843900
[6515294.873294] ffff88021f9c58f0 0000000100000282 00000002844c0480
0000000000000282
[6515294.873305] Call Trace:
[6515294.873311] [<ffffffff813a9b3b>] ? __wait_on_bit+0x7b/0x90
[6515294.873318] [<ffffffff813a9639>] schedule+0x9/0x20
[6515294.873326] [<ffffffff81178ebe>]
jbd2_journal_commit_transaction+0x1be/0x1810
[6515294.873334] [<ffffffff81040cd6>] ? dequeue_task_fair+0x256/0x260
[6515294.873341] [<ffffffff8103d91b>] ? finish_task_switch+0x5b/0xd0
[6515294.873349] [<ffffffff8105b900>] ? autoremove_wake_function+0x0/0x40
[6515294.873356] [<ffffffff8117ea8a>] kjournald2+0x11a/0x370
[6515294.873363] [<ffffffff8105b900>] ? autoremove_wake_function+0x0/0x40
[6515294.873369] [<ffffffff8117e970>] ? kjournald2+0x0/0x370
[6515294.873375] [<ffffffff8117e970>] ? kjournald2+0x0/0x370
[6515294.873382] [<ffffffff8105b4c6>] kthread+0x56/0x90
[6515294.873389] [<ffffffff8100cf6a>] child_rip+0xa/0x20
[6515294.873395] [<ffffffff8100c969>] ? restore_args+0x0/0x30
[6515294.873402] [<ffffffff8105b470>] ? kthread+0x0/0x90
[6515294.873408] [<ffffffff8100cf60>] ? child_rip+0x0/0x20
[6515294.873418] INFO: task receptd:5709 blocked for more than 120 seconds.
[6515294.873423] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[6515294.873428] receptd D 000000000000002a 0 5709 2046
[6515294.873436] ffff88009cce3c78 0000000000000086 ffff88009cce3be8
ffffffff81152dd3
[6515294.873447] 0000000000010e80 000000000000c748 ffff880104901c80
ffffffff81526360
[6515294.873457] ffff880104901ff0 00000000810db183 00000002844fcb53
0000000000000286
[6515294.873468] Call Trace:
[6515294.873473] [<ffffffff81152dd3>] ? dx_release+0x23/0x50
[6515294.873480] [<ffffffff813a9639>] schedule+0x9/0x20
[6515294.873488] [<ffffffff81177ea5>] start_this_handle+0x365/0x5d0
[6515294.873495] [<ffffffff8105b900>] ? autoremove_wake_function+0x0/0x40
[6515294.873502] [<ffffffff81178306>] jbd2_journal_start+0xa6/0xf0
[6515294.873510] [<ffffffff811593c5>] ext4_journal_start_sb+0x55/0x90
[6515294.873517] [<ffffffff81155441>] ext4_create+0x81/0x130
[6515294.873524] [<ffffffff810d23aa>] vfs_create+0x9a/0xb0
[6515294.873531] [<ffffffff810d522e>] do_filp_open+0x85e/0x990
[6515294.873538] [<ffffffff810d1d16>] ? getname+0x36/0x200
[6515294.873545] [<ffffffff810deada>] ? alloc_fd+0x4a/0x140
[6515294.873552] [<ffffffff810c5dfb>] do_sys_open+0x7b/0x110
[6515294.873558] [<ffffffff810c5ebb>] sys_open+0x1b/0x20
[6515294.873565] [<ffffffff8100bfab>] system_call_fastpath+0x16/0x1b
[6515294.873575] INFO: task drdbmake:3919 blocked for more than 120 seconds.
[6515294.873580] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs"
disables this message.
[6515294.873586] drdbmake D ffff88021e56f000 0 3919 27107
[6515294.873594] ffff88021eb4da88 0000000000000082 ffff8801f4cf0ae8
ffff880008ba03f0
[6515294.873605] 0000000000010e80 000000000000c748 ffff8801049063c0
ffff88021f843900
[6515294.873616] ffff880104906730 00000001f4cf0ae8 00000002844c0480
0000000000000282
[6515294.873627] Call Trace:
[6515294.873633] [<ffffffff813a9639>] schedule+0x9/0x20
[6515294.873640] [<ffffffff81177ea5>] start_this_handle+0x365/0x5d0
[6515294.873647] [<ffffffff8105b900>] ? autoremove_wake_function+0x0/0x40
[6515294.873654] [<ffffffff811781ce>] jbd2_journal_restart+0xbe/0x150
[6515294.873662] [<ffffffff8116243d>] ext4_ext_truncate+0x6dd/0xa20
[6515294.873669] [<ffffffff81095b3b>] ? find_get_pages+0x3b/0xf0
[6515294.873676] [<ffffffff81150a78>] ext4_truncate+0x198/0x680
[6515294.873685] [<ffffffff810ac984>] ? unmap_mapping_range+0x74/0x280
[6515294.873692] [<ffffffff811772c0>] ? jbd2_journal_stop+0x1e0/0x360
[6515294.873700] [<ffffffff810acd25>] vmtruncate+0xa5/0x110
[6515294.873707] [<ffffffff810dda10>] inode_setattr+0x30/0x180
[6515294.873714] [<ffffffff8114d073>] ext4_setattr+0x173/0x310
[6515294.873721] [<ffffffff810ddc79>] notify_change+0x119/0x330
[6515294.873728] [<ffffffff810c6df3>] do_truncate+0x63/0x90
[6515294.873734] [<ffffffff810d0cc3>] ? get_write_access+0x23/0x60
[6515294.873741] [<ffffffff810c70cb>] sys_truncate+0x17b/0x180
[6515294.873747] [<ffffffff8100bfab>] system_call_fastpath+0x16/0x1b

2009-10-27 04:42:30

by Eric Sandeen

[permalink] [raw]
Subject: Re: Fwd: Ext4 bug with fallocate

Fredrik Andersson wrote:
> Hi, I'd like to report what I'm fairly certain is an ext4 bug. I hope
> this is the right place to do so.
>
> My program creates a big file (around 30 GB) with posix_fallocate (to
> utilize extents), fills it with data and uses ftruncate to crop it to
> its final size (usually somewhere between 20 and 25 GB).
> The problem is that in around 5% of the cases, the program locks up
> completely in a syscall. The process can thus not be killed even with
> kill -9, and a reboot is all that will do.


Not trying to ignore you, but trying to sort out some corruption issues
that I'm afraid I have to rank higher than a hang for now...

From the traces, things are hung up on the i_data_sem; several
pdflushes want it for read in ext4_get_blocks_wrap, and drdbmakes
presumably have it downed for write in ext4_ext_truncate. I'm not
immediately seeing the deadlock tonight, but then it's getting late...
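
If it helps to see the shape of that, here is a purely illustrative
user-space analogy (pthreads, not kernel code; every name in it is made
up for the sketch): a writer parks on the lock and then waits forever
for something that never arrives, so every would-be reader queues up
behind it.

/*
 * Purely illustrative analogy in user space (pthreads), NOT ext4 code: a
 * "truncater" thread takes a rw-lock for write and then stalls waiting for
 * something that never arrives (a condition variable nobody signals,
 * standing in for the journal space waited on in jbd2_journal_restart).
 * A "flusher" thread that only wants the lock for read then hangs behind
 * it -- the shape of the pdflush/drdbmake traces above.  Names are
 * invented; the program hangs by design.  Build: cc -pthread hang.c
 */
#include <pthread.h>
#include <stdio.h>
#include <unistd.h>

static pthread_rwlock_t i_data_sem = PTHREAD_RWLOCK_INITIALIZER;
static pthread_mutex_t  mtx  = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t   room = PTHREAD_COND_INITIALIZER;
static int have_room = 0;                   /* never becomes true */

static void *truncater(void *arg)           /* role of drdbmake / ext4_ext_truncate */
{
    (void)arg;
    pthread_rwlock_wrlock(&i_data_sem);     /* like down_write(&i_data_sem)         */
    printf("truncater: write lock held, waiting for journal room...\n");
    pthread_mutex_lock(&mtx);
    while (!have_room)                      /* like jbd2_journal_restart stalling   */
        pthread_cond_wait(&room, &mtx);
    pthread_mutex_unlock(&mtx);
    pthread_rwlock_unlock(&i_data_sem);
    return NULL;
}

static void *flusher(void *arg)             /* role of pdflush / ext4_get_blocks_wrap */
{
    (void)arg;
    sleep(1);                               /* let the truncater get there first      */
    printf("flusher: waiting for read lock...\n");
    pthread_rwlock_rdlock(&i_data_sem);     /* like down_read() -- never returns      */
    pthread_rwlock_unlock(&i_data_sem);
    return NULL;
}

int main(void)
{
    pthread_t a, b;
    pthread_create(&a, NULL, truncater, NULL);
    pthread_create(&b, NULL, flusher, NULL);
    pthread_join(a, NULL);                  /* never completes: both threads stuck    */
    pthread_join(b, NULL);
    return 0;
}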

Is the application that's triggering this something you can share?

Thanks,
-Eric

2009-10-27 08:16:59

by Fredrik Andersson

[permalink] [raw]
Subject: Re: Fwd: Ext4 bug with fallocate

> Not trying to ignore you, but trying to sort out some corruption issues that
> I'm afraid I have to rank higher than a hang for now...

I understand. Good to know someone is thinking about it anyway :-)

>
> From the traces, things are hung up on the i_data_sem; several pdflushes
> want it for read in ext4_get_blocks_wrap, and drdbmakes presumably have it
> downed for write in ext4_ext_truncate.  I'm not immediately seeing the
> deadlock tonight, but then it's getting late...
>
> Is the application that's triggering this something you can share?
>

Sorry, I can't provide source code for these applications.

But drdbmake does what is described above: Preallocates a very big
chunk on disk and then
truncates it to its final length once it's done.

Please let me know if you need more details.

/Fredrik

2009-10-27 13:56:54

by Eric Sandeen

[permalink] [raw]
Subject: Re: Fwd: Ext4 bug with fallocate

Fredrik Andersson wrote:
>> Not trying to ignore you, but trying to sort out some corruption issues that
>> I'm afraid I have to rank higher than a hang for now...
>
> I understand. Good to know someone is thinking about it anyway :-)
>
>> From the traces, things are hung up on the i_data_sem; several pdflushes
>> want it for read in ext4_get_blocks_wrap, and drbdmakes presumably have it
>> downed for write in ext4_ext_truncate. I'm not immediately seeing the
>> deadlock tonight, but then it's getting late...
>>
>> Is the application that's triggering this something you can share?
>>
>
> Sorry, I can't provide source code for these applications.

Ok, I understand.

> But drdbmake does what is described above: Preallocates a very big
> chunk on disk and then
> truncates it to its final length once it's done.

To try to emulate, how does it write into the preallocated space; large
or small IOs? Sequential streaming? mmap writes? It may not be
relevant but would be nice to try to match it as closely as possible.

Thanks,
-Eric

> Please let me know if you need more details.
>
> /Fredrik


2009-10-27 15:29:09

by Fredrik Andersson

[permalink] [raw]
Subject: Re: Fwd: Ext4 bug with fallocate

> To try to emulate, how does it write into the preallocated space; large or
> small IOs?  Sequential streaming?  mmap writes?  It may not be relevant but
> would be nice to try to match it as closely as possible.

This is a big file that is written sequentially using stdio buffered
I/O (with a setvbuf of about 4K) in the drdbmake process. No mmap.
It is regenerated from an earlier version of the same file, and we
preallocate a file that is 25% bigger than the previous version, to
allow for more data than was in the previous file and to utilize the
extent concept in ext4.
We then read the previous file sequentially, update some entries here
and there and rewrite it sequentially into the new, fallocated file.
There is one single instance of random I/O: Once the whole new file
has been written, we seek back to the start to write a fixed-size
header. We then ftruncate the file to the proper size.
No process is concurrently reading from the file that is being
written. There is however another process, nodeserv, that does random
reads from the "previous" file (the one we're sequentially reading in
drdbmake).
The deadlock is always in the final ftruncate. It does not help to
close the file and reopen it again before the ftruncate call.
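
Roughly, the pattern looks like the sketch below (the file name, sizes
and record layout here are made up, this is not the real drdbmake; on
x86_64 it builds as-is, on 32-bit add -D_FILE_OFFSET_BITS=64):

/*
 * Minimal sketch of the write pattern: preallocate ~25% more than the
 * expected final size, stream the body in with a ~4K stdio buffer, seek
 * back to rewrite the header, then ftruncate to the final size -- the
 * call that hangs.
 */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

#define FINAL_SIZE  (20ULL * 1024 * 1024 * 1024)    /* ~20 GB final size */
#define PREALLOC    (FINAL_SIZE + FINAL_SIZE / 4)   /* 25% headroom      */
#define HDR_SIZE    4096

int main(void)
{
    int fd = open("new.db", O_CREAT | O_RDWR | O_TRUNC, 0644);
    if (fd < 0 || posix_fallocate(fd, 0, PREALLOC) != 0) {
        perror("open/posix_fallocate");
        return 1;
    }

    FILE *fp = fdopen(fd, "r+");
    static char vbuf[4096];
    setvbuf(fp, vbuf, _IOFBF, sizeof vbuf);         /* setvbuf of about 4K */

    char record[4096];
    memset(record, 'x', sizeof record);

    unsigned long long written = HDR_SIZE;
    fseeko(fp, HDR_SIZE, SEEK_SET);
    while (written < FINAL_SIZE) {                  /* sequential rewrite  */
        fwrite(record, 1, sizeof record, fp);
        written += sizeof record;
    }

    fseeko(fp, 0, SEEK_SET);                        /* the one random I/O: */
    fwrite(record, 1, HDR_SIZE, fp);                /* fixed-size header   */
    fflush(fp);

    if (ftruncate(fd, written) != 0)                /* hangs ~5% of runs   */
        perror("ftruncate");
    fclose(fp);
    return 0;
}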

/Fredrik

2009-10-27 15:37:13

by Eric Sandeen

[permalink] [raw]
Subject: Re: Fwd: Ext4 bug with fallocate

Fredrik Andersson wrote:
>> To try to emulate, how does it write into the preallocated space; large or
>> small IOs? Sequential streaming? mmap writes? It may not be relevant but
>> would be nice to try to match it as closely as possible.
>
> This is a big file that is written sequentially using stdio buffered
> I/O (with a setvbuf of about 4K) in the drdbmake process. No mmap.
> It is regenerated from an earlier version of the same file, and we
> preallocate a file that is 25% bigger than the
> previous version, to allow for more data than was in the previous file
> and to utilize the extent concept in ext4.

FWIW, you do not need to preallocate to get extents. Preallocation
fundamentally only guarantees that the space is available (somewhere),
though in practice it can lead to more contiguous allocation of that
space since it's all done up front ...
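
If you want to see how the allocation actually came out, the FIEMAP
ioctl (the same information "filefrag -v" prints) will list a file's
extents; a rough sketch, not from this thread, Linux-only:

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <sys/ioctl.h>
#include <linux/fs.h>
#include <linux/fiemap.h>

int main(int argc, char **argv)
{
    if (argc != 2) {
        fprintf(stderr, "usage: %s <file>\n", argv[0]);
        return 1;
    }
    int fd = open(argv[1], O_RDONLY);
    if (fd < 0) { perror("open"); return 1; }

    unsigned int max = 512;                         /* arbitrary extent cap */
    struct fiemap *fm = calloc(1, sizeof *fm + max * sizeof(struct fiemap_extent));
    fm->fm_start = 0;
    fm->fm_length = ~0ULL;                          /* map the whole file   */
    fm->fm_extent_count = max;

    if (ioctl(fd, FS_IOC_FIEMAP, fm) < 0) { perror("FS_IOC_FIEMAP"); return 1; }

    printf("%u extent(s)\n", fm->fm_mapped_extents);
    for (unsigned int i = 0; i < fm->fm_mapped_extents; i++)
        printf("  logical %llu physical %llu length %llu\n",
               (unsigned long long)fm->fm_extents[i].fe_logical,
               (unsigned long long)fm->fm_extents[i].fe_physical,
               (unsigned long long)fm->fm_extents[i].fe_length);
    free(fm);
    return 0;
}

Fewer, larger extents means the allocator kept things contiguous.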


> We then read the previous file sequentially, update some entries here
> and there and
> rewrite it sequentially into the new, fallocated file. There is one
> single instance of random I/O: Once the whole new
> file has been written, we seek back to the start to write a fixed-size
> header. We then ftruncate the file to the proper size.
> No process is concurrently reading from the file that is being
> written. There is however another process, nodeserv,
> that does random reads from the "previous" file (the one we're
> sequentially reading in drdbmake).
> The deadlock is always in the final ftruncate. It does not help to
> close the file and reopen it again before the ftruncate call.

Thanks. If I find time to think enough about the backtraces you sent,
it'll probably be obvious, but the complete description of your workload
is helpful.

Just out of curiosity, have you verified that the deadlock doesn't exist
if you skip the preallocation? I wonder about a fake test where you
simply write a bit extra, and truncate that back.
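
Something as dumb as the sketch below would do (file name and sizes are
made up, scale them to match the real workload); if that never hangs,
preallocation is implicated:

/* Rough cut at the "fake test": no posix_fallocate at all, just write a
 * bit more data than needed and then truncate back. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
    int fd = open("noprealloc.db", O_CREAT | O_RDWR | O_TRUNC, 0644);
    if (fd < 0) { perror("open"); return 1; }

    char buf[4096];
    memset(buf, 'x', sizeof buf);

    off_t final_size = 256LL * 1024 * 1024;         /* pretend final size  */
    off_t extra      =  64LL * 1024 * 1024;         /* "a bit extra"       */
    for (off_t off = 0; off < final_size + extra; off += sizeof buf)
        if (write(fd, buf, sizeof buf) != (ssize_t)sizeof buf) {
            perror("write");
            return 1;
        }

    if (ftruncate(fd, final_size) != 0)             /* truncate that back  */
        perror("ftruncate");
    close(fd);
    return 0;
}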

-Eric

> /Fredrik