2007-11-06 13:21:17

by Petar Bogdanovic

Subject: dd/mke2fs on loopback hangs

Hi,

I am experiencing some strange problems with my loopback-on-ext3 setup.
After creating a plain zero-filled dummy file and attaching /dev/loop/0
to it, every dd or mke2fs hangs while doing certain write()s. Here are
the steps, just to show how simple the setup is:

1) create a dummy file (size: 1GB or less)
dd if=/dev/zero of=dummy bs=1M count=1000

2) create a loopback device on the dummy file
losetup /dev/loop/0 dummy

3) write on the loopback device or create a filesystem
dd if=/dev/zero of=/dev/loop/0 bs=1M
mke2fs -m 0 /dev/loop/0
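
The three steps above could be collected into one script, roughly as sketched here. The function name repro_loop_stall and the IMG and LOOPDEV variables are assumptions made for illustration, and count=500 is taken from the dd run shown further down; the commands themselves are the ones listed above. It needs root and overwrites whatever is attached to the loop device, so the function is only defined here, not called.

```shell
#!/bin/sh
# Sketch of the reproduction steps; IMG and LOOPDEV are illustrative defaults.
IMG=${IMG:-dummy}
LOOPDEV=${LOOPDEV:-/dev/loop/0}

repro_loop_stall() {
    # 1) create a zero-filled dummy file (1000 blocks of 1M)
    dd if=/dev/zero of="$IMG" bs=1M count=1000 || return 1
    # 2) attach the loop device to the dummy file
    losetup "$LOOPDEV" "$IMG" || return 1
    # 3) write to the loop device (this is where the stall shows up)
    dd if=/dev/zero of="$LOOPDEV" bs=1M count=500
    status=$?
    # detach again so the loop device is left free
    losetup -d "$LOOPDEV"
    return $status
}
```

To try it, call repro_loop_stall as root, or replace the second dd with the mke2fs line from step 3.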


mke2fs takes ages to complete, but not _every_ time: it happened once
that it completed very quickly on the first run. The same goes for
dd, which takes nearly forever:

# dd if=/dev/zero of=/dev/loop/0 bs=1M count=500
500+0 records in
500+0 records out
524288000 bytes (524 MB) copied, 547.585 s, 957 kB/s
                                            ^^^^^^^^
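
As a quick check of the figure dd reports, the rate is simply bytes divided by seconds. A small sketch using awk; the function name rate_kbs is made up for illustration:

```shell
# Print the average rate in kB/s (1 kB = 1000 bytes, as dd reports it),
# truncated to a whole number.
rate_kbs() {
    awk -v bytes="$1" -v secs="$2" 'BEGIN { printf "%d\n", bytes / secs / 1000 }'
}

rate_kbs 524288000 547.585    # prints 957
```

That matches the 957 kB/s in the dd output above, i.e. less than 1 MB/s for a write to a loop device backed by a local file.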


It's worth mentioning that while dd/mke2fs hangs, firefox hangs too.
Apart from this, I do not have any other problems with IO.

A short test on an external USB vfat drive did not show the same
behaviour, so maybe it has something to do with my filesystem options:

# dumpe2fs -h /dev/sda2
dumpe2fs 1.40.2 (12-Jul-2007)
Filesystem volume name: <none>
Last mounted on: <not available>
Filesystem UUID: 794a82ae-b09b-48d0-901d-234dfdf28116
Filesystem magic number: 0xEF53
Filesystem revision #: 1 (dynamic)
Filesystem features: has_journal resize_inode dir_index filetype needs_recovery sparse_super large_file
Filesystem flags: signed directory hash
Default mount options: (none)
Filesystem state: clean
Errors behavior: Continue
Filesystem OS type: Linux
Inode count: 9043968
Block count: 18067100
Reserved block count: 903355
Free blocks: 17000463
Free inodes: 8954830
First block: 0
Block size: 4096
Fragment size: 4096
Reserved GDT blocks: 1019
Blocks per group: 32768
Fragments per group: 32768
Inodes per group: 16384
Inode blocks per group: 512
Filesystem created: Wed Aug 29 13:27:22 2007
Last mount time: Tue Nov 6 12:47:25 2007
Last write time: Tue Nov 6 12:47:25 2007
Mount count: 21
Maximum mount count: 21
Last checked: Tue Oct 30 17:21:00 2007
Check interval: 15552000 (6 months)
Next check after: Sun Apr 27 18:21:00 2008
Reserved blocks uid: 0 (user root)
Reserved blocks gid: 0 (group root)
First inode: 11
Inode size: 128
Journal inode: 8
Default directory hash: tea
Directory Hash Seed: a35cc1dc-1c7d-4fce-9c37-5092b932be09
Journal backup: inode blocks
Journal size: 128M


Thanks for any kind of hint,

Petar


P.S: $ uname -a
Linux pintail 2.6.23-ARCH #1 SMP PREEMPT Sat Oct 27 09:04:14 UTC
2007 i686 Intel(R) Pentium(R) M processor 1.70GHz GenuineIntel
GNU/Linux


2007-11-06 21:05:14

by Milan Broz

Subject: Re: dd/mke2fs on loopback hangs

Petar Bogdanovic wrote:
> I experience some strange problems with my loopback-on-ext3-setup.
> After creating a plain `zeroed' dummy-file and doing a /dev/loop/0 on
> it, every dd or mke2fs hangs while doing certain write()s. Here are the
> steps just in order to show, how simple the setup is:
...

> P.S: $ uname -a
> Linux pintail 2.6.23-ARCH #1 SMP PREEMPT Sat Oct 27 09:04:14 UTC
> 2007 i686 Intel(R) Pentium(R) M processor 1.70GHz GenuineIntel
> GNU/Linux

Hi Petar,

I saw a similar bug report for dm-crypt over loop, which is reproducible even
on stand-alone loop devices; see http://bugzilla.kernel.org/show_bug.cgi?id=8020

The problem was caused by loop IO stalling in balance_dirty_pages.

The per-BDI dirty limit patchset (included in 2.6.24-rc) fixed it.
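
One way to watch this while a write is stalled is to look at the Dirty and Writeback counters in /proc/meminfo. A minimal sketch; the function name dirty_kb is illustrative, and it reads meminfo-style text from stdin so that it also works on a saved copy:

```shell
# Print the Dirty and Writeback counters (in kB) from meminfo-style input.
dirty_kb() {
    awk '$1 == "Dirty:" || $1 == "Writeback:" { print $1, $2 }'
}

# Usage, if /proc/meminfo is available on this system:
if [ -r /proc/meminfo ]; then
    dirty_kb < /proc/meminfo
fi
```

Running it repeatedly during the dd should show whether the amount of dirty memory stays pinned near a limit while the writer sleeps.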

Anyway, you should attach the process states from when the system stops
responding (the output of "echo t >/proc/sysrq-trigger") here to allow
some analysis.
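
A sketch of how that output could be captured: the task dump goes to the kernel log, so it has to be read back with dmesg afterwards. The function name dump_task_states and the default file name are assumptions for illustration; it has to be run as root on a kernel with SysRq support, and the function is only defined here, not called.

```shell
# Trigger a dump of all task states and save the kernel log to a file.
# The default output file name is illustrative. Must be run as root.
dump_task_states() {
    OUT=${1:-task-states.txt}
    if [ ! -w /proc/sysrq-trigger ]; then
        echo "cannot write /proc/sysrq-trigger (not root, or no SysRq support)" >&2
        return 1
    fi
    # 't' asks the kernel to print the state of every task to its log
    echo t > /proc/sysrq-trigger
    # read the kernel log back, including the dump just written
    dmesg > "$OUT"
}
```

If the kernel log buffer is small, the start of the dump may already have been overwritten by the time dmesg reads it.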

Milan
--
[email protected]


2007-11-07 13:28:22

by Petar Bogdanovic

Subject: Re: dd/mke2fs on loopback hangs

On Tue, Nov 06, 2007 at 10:04:51PM +0100, Milan Broz wrote:
> Anyway, you should attach output of process states when system stops
> responding (output of "echo t >/proc/sysrq-trigger" ) here to allow
> some analysis.

Thanks Milane! :)

Here are the process states:

dd `hanging' on a 1GB zeroed loopback:

=======================
dd D eb52fcc0 0 10152 9657
eb52fcd4 00200086 00000002 eb52fcc0 eb52fcb8 00000000 c047aee0 c047de80
eb52fcc4 ed192550 c180ce80 00000000 00152450 00000000 0000000f 00000000
00000000 00000000 eb52fce4 0015246d 000055bd eb52fd0c c035fb0a 00000001
Call Trace:
[<c035fb0a>] schedule_timeout+0x4a/0xc0
[<c0135240>] process_timeout+0x0/0x10
[<c035f4be>] io_schedule_timeout+0x1e/0x30
[<c0166976>] congestion_wait+0x56/0x80
[<c0140200>] autoremove_wake_function+0x0/0x40
[<c0161701>] balance_dirty_pages_ratelimited_nr+0x141/0x220
[<c015cc1a>] generic_file_buffered_write+0x37a/0x6b0
[<c01489f8>] tick_program_event+0x38/0x60
[<c015d204>] __generic_file_aio_write_nolock+0x2b4/0x530
[<c013165b>] irq_exit+0x5b/0x90
[<c015d5a7>] generic_file_aio_write_nolock+0x47/0xb0
[<c0168b07>] unmap_vmas+0x537/0x610
[<c017df15>] do_sync_write+0xd5/0x120
[<c0140200>] autoremove_wake_function+0x0/0x40
[<c017de40>] do_sync_write+0x0/0x120
[<c017e7cf>] vfs_write+0xbf/0x140
[<c017ee61>] sys_write+0x41/0x70
[<c0104482>] sysenter_past_esp+0x6b/0xa1
=======================


and mke2fs `hanging' on a 30GB sparse file loopback:

=======================
mke2fs D e6bcdcc0 0 10767 9657
e6bcdcd4 00200086 00000002 e6bcdcc0 e6bcdcb8 00000000 c047aee0 c047de80
e6bcdcc4 dfe1f540 c180ce80 00000000 0017c1f9 00000000 0000000f 00000000
00000000 00000000 e6bcdce4 0017c259 000055c4 e6bcdd0c c035fb0a 00000001
Call Trace:
[<c035fb0a>] schedule_timeout+0x4a/0xc0
[<c0135240>] process_timeout+0x0/0x10
[<c035f4be>] io_schedule_timeout+0x1e/0x30
[<c0166976>] congestion_wait+0x56/0x80
[<c0140200>] autoremove_wake_function+0x0/0x40
[<c0161701>] balance_dirty_pages_ratelimited_nr+0x141/0x220
[<c015cc1a>] generic_file_buffered_write+0x37a/0x6b0
[<c015d204>] __generic_file_aio_write_nolock+0x2b4/0x530
[<c0124233>] __check_preempt_curr_fair+0x53/0xa0
[<c0128dd7>] check_preempt_curr_fair+0x57/0x90
[<c015d5a7>] generic_file_aio_write_nolock+0x47/0xb0
[<c01030b1>] __switch_to+0xa1/0x150
[<c017df15>] do_sync_write+0xd5/0x120
[<c0140200>] autoremove_wake_function+0x0/0x40
[<c017de40>] do_sync_write+0x0/0x120
[<c017e7cf>] vfs_write+0xbf/0x140
[<c017db1c>] vfs_llseek+0x3c/0x50
[<c017ee61>] sys_write+0x41/0x70
[<c0104482>] sysenter_past_esp+0x6b/0xa1
[<c0360000>] __mutex_lock_interruptible_slowpath+0x120/0x340
=======================
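
Both traces show the task sleeping in congestion_wait, called from balance_dirty_pages_ratelimited_nr. To pull just the function names out of such a trace, something like the following sketch could be used; the function name trace_funcs is illustrative, and it expects lines in the "[<address>] symbol+0xoffset/0xlength" form shown above:

```shell
# Print the symbol name from each "[<addr>] symbol+0xOFF/0xLEN" line on stdin.
# Lines that do not match (headers, register dumps) are skipped.
trace_funcs() {
    sed -n 's/^ *\[<[0-9a-f]*>\] *\([A-Za-z0-9_.]*\)+0x.*$/\1/p'
}
```

Feeding a saved trace through trace_funcs and grepping for balance_dirty_pages is then a quick way to check whether a hung task is stuck in the same place.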


If you need more output, just tell me.


Thanks & with kind regards,

Petar

2007-11-13 09:36:18

by Petar Bogdanovic

Subject: Re: dd/mke2fs on loopback hangs

On Tue, Nov 06, 2007 at 10:04:51PM +0100, Milan Broz wrote:
> The problem was caused by loop io stalling in balance_dirty_pages.
>
> Per BDI dirty limit patchset (included in 2.6.24-rc) fixed it.

Indeed, the problem disappeared on 2.6.24-rc2. Thanks again for the
hint!


Petar