Hi Ted,
first of all, I want to remind you about the sync_supers changes, which
you said you'd make sure get properly reviewed and tested for v3.5. I
am getting worried that they will miss 3.5 as well.
I wanted to re-send the patch-set, but I cannot test it because ext4
blocks when I test it with xfstests.
The HEAD of xfstests I use is:
7d14795 xfstests: introduce 286 for SEEK_DATA/SEEK_HOLE copy test
I run xfstests with the "check -g auto" command. My environment is:
export TEST_DIR=/mnt/xfstests-disk-1
export TEST_DEV=/dev/vdb
export SCRATCH_MNT=/mnt/xfstests-disk-2
export SCRATCH_DEV=/dev/vdc
export FSTYP=ext4
The disks are not very big, df says:
/dev/vdb 4181968 192780 3779476 5% /mnt/xfstests-disk-1
/dev/vdc 12539816 486160 11424512 5% /mnt/xfstests-disk-2
Is this a known issue? Please let me know if you need more information.
I can also try some patches for you.
I've just tested 3.5-rc5 in my kvm machine. Below is what dmesg gives
me:
[ 2.541716] EXT4-fs (vda1): re-mounted. Opts: (null)
[ 3.230435] systemd-fsck[508]: home: clean, 316114/2097152 files, 2959603/8388608 blocks
[ 3.296292] hrtimer: interrupt took 2148140 ns
[ 3.369675] EXT4-fs (vda2): mounted filesystem with ordered data mode. Opts: (null)
[ 3.604614] /usr/sbin/crond[559]: (CRON) INFO (Syslog will be used instead of sendmail.): No such file or directory
[ 3.638686] /usr/sbin/crond[559]: (CRON) INFO (running with inotify support)
[ 50.659283] EXT4-fs (vdb): mounted filesystem with ordered data mode. Opts: (null)
[ 50.687492] EXT4-fs (vdc): mounted filesystem with ordered data mode. Opts: (null)
[ 52.098169] EXT4-fs (vdc): mounted filesystem with ordered data mode. Opts: acl,user_xattr
[ 52.272963] EXT4-fs (vdb): mounted filesystem with ordered data mode. Opts: acl,user_xattr
[ 56.323779] EXT4-fs (vdb): mounted filesystem with ordered data mode. Opts: acl,user_xattr
[ 56.771414] EXT4-fs (vdb): mounted filesystem with ordered data mode. Opts: acl,user_xattr
[ 57.459560] EXT4-fs (vdb): mounted filesystem with ordered data mode. Opts: acl,user_xattr
[ 58.176195] EXT4-fs (vdb): mounted filesystem with ordered data mode. Opts: acl,user_xattr
[ 59.117647] EXT4-fs (vdb): mounted filesystem with ordered data mode. Opts: acl,user_xattr
[ 60.686415] EXT4-fs (vdb): mounted filesystem with ordered data mode. Opts: acl,user_xattr
[ 61.116057] EXT4-fs (vdb): mounted filesystem with ordered data mode. Opts: acl,user_xattr
[ 62.321826] EXT4-fs (vdb): mounted filesystem with ordered data mode. Opts: acl,user_xattr
[ 69.970617] EXT4-fs (vdb): mounted filesystem with ordered data mode. Opts: acl,user_xattr
[ 70.315529] EXT4-fs (vdb): mounted filesystem with ordered data mode. Opts: acl,user_xattr
[ 70.344068] EXT4-fs (vdb): re-mounted. Opts: data=ordered
[ 71.086640] EXT4-fs (vdb): mounted filesystem with ordered data mode. Opts: acl,user_xattr
[ 71.632241] EXT4-fs (vdb): mounted filesystem with ordered data mode. Opts: acl,user_xattr
[ 71.840146] EXT4-fs (vdc): mounted filesystem with ordered data mode. Opts: acl,user_xattr
[ 72.229582] EXT4-fs (vdb): mounted filesystem with ordered data mode. Opts: acl,user_xattr
[ 73.526206] EXT4-fs (vdb): mounted filesystem with ordered data mode. Opts: acl,user_xattr
[ 78.180546] EXT4-fs (vdc): mounted filesystem with ordered data mode. Opts: acl,user_xattr
[ 78.518143] EXT4-fs (vdc): mounted filesystem with ordered data mode. Opts: acl,user_xattr
[ 78.693992] EXT4-fs (vdb): mounted filesystem with ordered data mode. Opts: acl,user_xattr
[ 80.758908] EXT4-fs (vdc): mounted filesystem with ordered data mode. Opts: acl,user_xattr
[ 81.629718] EXT4-fs (vdb): mounted filesystem with ordered data mode. Opts: acl,user_xattr
[ 83.278788] EXT4-fs (vdc): mounted filesystem with ordered data mode. Opts: acl,user_xattr
[ 240.379170] INFO: task flush-253:32:1336 blocked for more than 120 seconds.
[ 240.379402] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 240.379605] flush-253:32 D ffff8803118d8508 0 1336 2 0x00000000
[ 240.379608] ffff88030f93dae0 0000000000000046 ffff88030f93da80 ffffffff810bbb5d
[ 240.379612] ffff880312b13f80 ffff88030f93dfd8 ffff88030f93dfd8 ffff88030f93dfd8
[ 240.379615] ffff88031458df40 ffff880312b13f80 ffff88030f93db70 ffff88030c198800
[ 240.379618] Call Trace:
[ 240.379632] [<ffffffff810bbb5d>] ? trace_hardirqs_on+0xd/0x10
[ 240.379641] [<ffffffff81628029>] schedule+0x29/0x70
[ 240.379650] [<ffffffff81236e85>] ext4_force_commit+0x85/0xc0
[ 240.379656] [<ffffffff8107fcc0>] ? __init_waitqueue_head+0x60/0x60
[ 240.379660] [<ffffffff81214805>] ext4_write_inode+0x75/0x110
[ 240.379667] [<ffffffff811bcba1>] __writeback_single_inode+0x161/0x1a0
[ 240.379670] [<ffffffff811becdf>] writeback_sb_inodes+0x24f/0x440
[ 240.379672] [<ffffffff811bf094>] wb_writeback+0xf4/0x370
[ 240.379677] [<ffffffff81062647>] ? local_bh_enable_ip+0x97/0x100
[ 240.379680] [<ffffffff810bbb5d>] ? trace_hardirqs_on+0xd/0x10
[ 240.379683] [<ffffffff811c0ae2>] wb_do_writeback+0xc2/0x200
[ 240.379685] [<ffffffff8106b02e>] ? del_timer+0x8e/0x140
[ 240.379688] [<ffffffff811c0ca4>] bdi_writeback_thread+0x84/0x2d0
[ 240.379692] [<ffffffff811c0c20>] ? wb_do_writeback+0x200/0x200
[ 240.379695] [<ffffffff8107f19e>] kthread+0xae/0xc0
[ 240.379699] [<ffffffff81633374>] kernel_thread_helper+0x4/0x10
[ 240.379702] [<ffffffff81629c70>] ? retint_restore_args+0x13/0x13
[ 240.379704] [<ffffffff8107f0f0>] ? __init_kthread_worker+0x70/0x70
[ 240.379706] [<ffffffff81633370>] ? gs_change+0x13/0x13
[ 240.379708] no locks held by flush-253:32/1336.
[ 240.379710] INFO: task ext4lazyinit:14878 blocked for more than 120 seconds.
[ 240.379899] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 240.380129] ext4lazyinit D 0000000000000000 0 14878 2 0x00000000
[ 240.380132] ffff8803140dfd10 0000000000000046 ffff8803140dfcb0 ffffffff810bbb5d
[ 240.380135] ffff880312523f80 ffff8803140dffd8 ffff8803140dffd8 ffff8803140dffd8
[ 240.380138] ffffffff81c13420 ffff880312523f80 ffff8803140dfd00 ffff88030c19a000
[ 240.380141] Call Trace:
[ 240.380144] [<ffffffff810bbb5d>] ? trace_hardirqs_on+0xd/0x10
[ 240.380147] [<ffffffff81628029>] schedule+0x29/0x70
[ 240.380149] [<ffffffff812360c5>] ext4_journal_start_sb+0x1c5/0x200
[ 240.380152] [<ffffffff81210ddb>] ? ext4_init_inode_table+0xab/0x350
[ 240.380154] [<ffffffff8107fcc0>] ? __init_waitqueue_head+0x60/0x60
[ 240.380156] [<ffffffff81210ddb>] ext4_init_inode_table+0xab/0x350
[ 240.380159] [<ffffffff816260f0>] ? mutex_lock_nested+0x290/0x360
[ 240.380162] [<ffffffff8122558b>] ? ext4_lazyinit_thread+0x5b/0x2f0
[ 240.380165] [<ffffffff81225640>] ext4_lazyinit_thread+0x110/0x2f0
[ 240.380167] [<ffffffff81225530>] ? ext4_unregister_li_request+0x70/0x70
[ 240.380169] [<ffffffff8107f19e>] kthread+0xae/0xc0
[ 240.380172] [<ffffffff81633374>] kernel_thread_helper+0x4/0x10
[ 240.380174] [<ffffffff81629c70>] ? retint_restore_args+0x13/0x13
[ 240.380176] [<ffffffff8107f0f0>] ? __init_kthread_worker+0x70/0x70
[ 240.380178] [<ffffffff81633370>] ? gs_change+0x13/0x13
[ 240.380180] 1 lock held by ext4lazyinit/14878:
[ 240.380181] #0: (&eli->li_list_mtx){+.+...}, at: [<ffffffff8122558b>] ext4_lazyinit_thread+0x5b/0x2f0
[ 240.380191] INFO: task fsstress:15250 blocked for more than 120 seconds.
[ 240.380429] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 240.380718] fsstress D ffff880308e05880 0 15250 15249 0x00000000
[ 240.380721] ffff8803107ddc18 0000000000000046 ffff8803107ddbb8 ffffffff810bbb5d
[ 240.380724] ffff8803125adf40 ffff8803107ddfd8 ffff8803107ddfd8 ffff8803107ddfd8
[ 240.380727] ffff88031458df40 ffff8803125adf40 ffff880300000000 ffff880312be6140
[ 240.380730] Call Trace:
[ 240.380734] [<ffffffff810bbb5d>] ? trace_hardirqs_on+0xd/0x10
[ 240.380737] [<ffffffff81628029>] schedule+0x29/0x70
[ 240.380744] [<ffffffff81133b55>] __generic_file_aio_write+0xb5/0x440
[ 240.380747] [<ffffffff8107fcc0>] ? __init_waitqueue_head+0x60/0x60
[ 240.380749] [<ffffffff81133f54>] generic_file_aio_write+0x74/0xe0
[ 240.380753] [<ffffffff811a0da5>] ? putname+0x35/0x50
[ 240.380755] [<ffffffff8120df27>] ext4_file_write+0xc7/0x270
[ 240.380759] [<ffffffff81195072>] do_sync_write+0xd2/0x110
[ 240.380762] [<ffffffff81084bb3>] ? up_read+0x23/0x40
[ 240.380767] [<ffffffff81284a5c>] ? security_file_permission+0x2c/0xb0
[ 240.380770] [<ffffffff81195611>] ? rw_verify_area+0x61/0xf0
[ 240.380772] [<ffffffff81195973>] vfs_write+0xb3/0x180
[ 240.380774] [<ffffffff81195c9a>] sys_write+0x4a/0x90
[ 240.380777] [<ffffffff81631f29>] system_call_fastpath+0x16/0x1b
[ 240.380778] 1 lock held by fsstress/15250:
[ 240.380779] #0: (&sb->s_type->i_mutex_key#12){+.+...}, at: [<ffffffff81133f38>] generic_file_aio_write+0x58/0xe0
[ 240.380785] INFO: task fsstress:15251 blocked for more than 120 seconds.
[ 240.380966] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 240.381194] fsstress D ffff8803125a8000 0 15251 15249 0x00000000
[ 240.381197] ffff880310651c38 0000000000000046 ffff8803125a86d8 0000000000000046
[ 240.381202] ffff8803125a8000 ffff880310651fd8 ffff880310651fd8 ffff880310651fd8
[ 240.381205] ffffffff81c13420 ffff8803125a8000 ffff880310651c98 7fffffffffffffff
[ 240.381208] Call Trace:
[ 240.381212] [<ffffffff81628029>] schedule+0x29/0x70
[ 240.381214] [<ffffffff81625975>] schedule_timeout+0x305/0x380
[ 240.381217] [<ffffffff810bb906>] ? mark_held_locks+0x86/0x140
[ 240.381219] [<ffffffff81629430>] ? _raw_spin_unlock_irq+0x30/0x50
[ 240.381222] [<ffffffff810bbac5>] ? trace_hardirqs_on_caller+0x105/0x190
[ 240.381224] [<ffffffff81627ed2>] wait_for_common+0x122/0x170
[ 240.381228] [<ffffffff81092d90>] ? try_to_wake_up+0x2f0/0x2f0
[ 240.381231] [<ffffffff811c4630>] ? __sync_filesystem+0x90/0x90
[ 240.381234] [<ffffffff811c4630>] ? __sync_filesystem+0x90/0x90
[ 240.381236] [<ffffffff81627ffd>] wait_for_completion+0x1d/0x20
[ 240.381238] [<ffffffff811bd1ae>] sync_inodes_sb+0x12e/0x280
[ 240.381241] [<ffffffff811c4630>] ? __sync_filesystem+0x90/0x90
[ 240.381244] [<ffffffff811c4628>] __sync_filesystem+0x88/0x90
[ 240.381246] [<ffffffff811c464f>] sync_one_sb+0x1f/0x30
[ 240.381252] [<ffffffff81198cf1>] iterate_supers+0xf1/0x100
[ 240.381256] [<ffffffff811c4717>] sys_sync+0x47/0x70
[ 240.381260] [<ffffffff81631f29>] system_call_fastpath+0x16/0x1b
[ 240.381263] 1 lock held by fsstress/15251:
[ 240.381264] #0: (&type->s_umount_key#19){+++++.}, at: [<ffffffff81198c88>] iterate_supers+0x88/0x100
[ 240.381276] INFO: task xfs_io:15270 blocked for more than 120 seconds.
[ 240.381527] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 240.381724] xfs_io D 0000000000000002 0 15270 15266 0x00000000
[ 240.381727] ffff8803123b3d68 0000000000000046 0000000000000000 ffff880312ab9fc0
[ 240.381730] ffff880312ab9fc0 ffff8803123b3fd8 ffff8803123b3fd8 ffff8803123b3fd8
[ 240.381733] ffff88031458df40 ffff880312ab9fc0 ffff88030c198870 ffff880312ab9fc0
[ 240.381736] Call Trace:
[ 240.381740] [<ffffffff81628029>] schedule+0x29/0x70
[ 240.381742] [<ffffffff81628d75>] rwsem_down_failed_common+0xb5/0x150
[ 240.381744] [<ffffffff81628e23>] rwsem_down_write_failed+0x13/0x20
[ 240.381751] [<ffffffff812ed303>] call_rwsem_down_write_failed+0x13/0x20
[ 240.381754] [<ffffffff816271f5>] ? down_write+0x65/0x70
[ 240.381757] [<ffffffff81197ab8>] ? thaw_super+0x28/0xd0
[ 240.381759] [<ffffffff81197ab8>] thaw_super+0x28/0xd0
[ 240.381761] [<ffffffff811a0da5>] ? putname+0x35/0x50
[ 240.381765] [<ffffffff811a7f1c>] do_vfs_ioctl+0x36c/0x560
[ 240.381767] [<ffffffff810bbb5d>] ? trace_hardirqs_on+0xd/0x10
[ 240.381769] [<ffffffff81629c55>] ? retint_swapgs+0x13/0x1b
[ 240.381772] [<ffffffff811a81a1>] sys_ioctl+0x91/0xa0
[ 240.381774] [<ffffffff81631f29>] system_call_fastpath+0x16/0x1b
[ 240.381776] 1 lock held by xfs_io/15270:
[ 240.381776] #0: (&type->s_umount_key#19){+++++.}, at: [<ffffffff81197ab8>] thaw_super+0x28/0xd0
--
Best Regards,
Artem Bityutskiy
On Mon, Jul 02, 2012 at 03:15:16PM +0300, Artem Bityutskiy wrote:
> Hi Ted,
>
> first of all, I want to remind you about the sync_supers changes, which
> you said you'd make sure get properly reviewed and tested for v3.5. I
> am getting worried that they will miss 3.5 as well.
>
> I wanted to re-send the patch-set, but I cannot test it because ext4
> blocks when I test it with xfstests.
Sorry, I've just been totally swamped these past couple of months:
work deadlines, plus a lot of other things that have been keeping me
totally overloaded.
Can you tell me *which* xfstest this was blocking on? I generally run
xfstests under KVM, on the console, which is nice because the xfstests
output gets interleaved with any kernel oops/softlock up messages.
Also, for the sort of changes you are doing, "check -g quick" is
probably good enough; that will get you unblocked.
I don't recall anything going wrong with a -g auto run with a standard
4k block size, but I'll give it a try on my end and see what I get.
- Ted
On Mon, 2012-07-02 at 10:44 -0400, Theodore Ts'o wrote:
> Can you tell me *which* xfstest this was blocking on?
Test 068 which runs fsstress.
$ ps axjf
828 830 828 828 ? -1 S 500 0:00 | \_ sshd: dedekind@pts/0
830 831 831 831 pts/0 972 Ss 500 0:00 | \_ -bash
831 972 972 831 pts/0 972 S+ 0 0:00 | \_ sudo env PATH=/home/dedekind/bin:/usr/local/bin:/bin:/usr/bin:/usr/local/sbin:/usr/sbin:/sbin ./test-ext4.sh elog
972 973 972 831 pts/0 972 S+ 0 0:00 | \_ /bin/sh -euf ./test-ext4.sh elog
973 997 972 831 pts/0 972 S+ 0 0:00 | \_ /bin/bash ./check -g auto
997 14830 972 831 pts/0 972 S+ 0 0:00 | \_ /bin/bash ./068
14830 15000 972 831 pts/0 972 S+ 0 0:00 | \_ /bin/bash ./068
15000 15153 15153 831 pts/0 972 S 0 0:00 | | \_ /home/dedekind/git/xfstests/ltp/fsstress -d /mnt/xfstests-disk-2/fsstress_test_dir -p 2 -n 200
15153 15154 15153 831 pts/0 972 D 0 0:00 | | \_ /home/dedekind/git/xfstests/ltp/fsstress -d /mnt/xfstests-disk-2/fsstress_test_dir -p 2 -n 200
15153 15155 15153 831 pts/0 972 D 0 0:00 | | \_ /home/dedekind/git/xfstests/ltp/fsstress -d /mnt/xfstests-disk-2/fsstress_test_dir -p 2 -n 200
14830 15170 972 831 pts/0 972 S+ 0 0:00 | \_ /bin/sh -f /usr/sbin/xfs_freeze -u /mnt/xfstests-disk-2
15170 15174 972 831 pts/0 972 D+ 0 0:00 | | \_ /usr/sbin/xfs_io -F -r -p xfs_freeze -x -c thaw /mnt/xfstests-disk-2
14830 15171 972 831 pts/0 972 S+ 0 0:00 | \_ tee -a 068.full
--
Best Regards,
Artem Bityutskiy
On Mon, Jul 02, 2012 at 05:54:22PM +0300, Artem Bityutskiy wrote:
> On Mon, 2012-07-02 at 10:44 -0400, Theodore Ts'o wrote:
> > Can you tell me *which* xfstest this was blocking on?
>
> Test 068 which runs fsstress.
Hmmm, I had just started an xfstests -g auto run using standard ext4
defaults with 3.5-rc2, and I'm not noting any problems. And I've
gotten past test 68 w/o any problems.
Can you tell me more about your KVM config?
I'm using 512 megs of memory and 2 CPUs under KVM, with a 5 gig
test partition running on a 5400 RPM laptop drive.
- Ted
I've removed linux-fsdevel and linux-kernel to avoid spamming folks
who don't need to see the blow-by-blow debugging....
On Mon, Jul 02, 2012 at 11:14:42AM -0400, Theodore Ts'o wrote:
> On Mon, Jul 02, 2012 at 05:54:22PM +0300, Artem Bityutskiy wrote:
> > On Mon, 2012-07-02 at 10:44 -0400, Theodore Ts'o wrote:
> > > Can you tell me *which* xfstest this was blocking on?
> >
> > Test 068 which runs fsstress.
How reliably can you reproduce this hang-up? I've just run the
equivalent of "check 68,68,68,68,68" and I haven't gotten hung up
once.
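[Editorial note: the repeated run described above can be scripted; a minimal, hedged sketch follows. The real invocation would be "./check 068" inside an xfstests checkout; the stub function below stands in for it so the loop is runnable anywhere.]

```shell
# Rerun one xfstest several times, stopping at the first failure (a hang
# would simply never return).  check() is a stand-in for "./check 068".
check() { true; }        # stub; in a real xfstests tree, call ./check "$@"
runs=0
for i in 1 2 3 4 5; do
    check 068 || break
    runs=$((runs + 1))
done
echo "completed $runs runs"   # prints: completed 5 runs
```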
And this is a stock 4k ext4 filesystem, right? Created with "mke2fs
-t ext4" w/o any other options?
- Ted
On Mon, Jul 02, 2012 at 11:14:42AM -0400, Theodore Ts'o wrote:
> On Mon, Jul 02, 2012 at 05:54:22PM +0300, Artem Bityutskiy wrote:
> > On Mon, 2012-07-02 at 10:44 -0400, Theodore Ts'o wrote:
> > > Can you tell me *which* xfstest this was blocking on?
> >
> > Test 068 which runs fsstress.
>
> Hmmm, I had just started an xfstests -g auto run using standard ext4
> defaults with 3.5-rc2, and I'm not noting any problems. And I've
> gotten past test 68 w/o any problems.
I just tried again with 3.5-rc5, and I was able to reproduce your
failure by using "check 68 68 68 68 68" (it failed after the 2nd time
it ran the fsstress test). What's interesting is that there have been
no changes in fs/ext4 or fs/jbd2 since 3.5-rc2, so this may have been
caused by a writeback-related change; I'm starting a bisect now.
- Ted
On Mon, 2012-07-02 at 11:14 -0400, Theodore Ts'o wrote:
> On Mon, Jul 02, 2012 at 05:54:22PM +0300, Artem Bityutskiy wrote:
> > On Mon, 2012-07-02 at 10:44 -0400, Theodore Ts'o wrote:
> > > Can you tell me *which* xfstest this was blocking on?
> >
> > Test 068 which runs fsstress.
>
> Hmmm, I had just started an xfstests -g auto run using standard ext4
> defaults with 3.5-rc2, and I'm not noting any problems. And I've
> gotten past test 68 w/o any problems.
>
> Can you tell me more about your KVM config?
>
> I'm using 512 megs of memory and 2 CPUs under KVM, with a 5 gig
> test partition running on a 5400 RPM laptop drive.
I have to run home now, but here are a few quick words about my setup.
Yes, it is 100% reproducible; I can probably even try to bisect it. It
used to work; I am not sure in which kernel, but I did have xfstests
finish successfully. I reproduce this on a stock v3.5-rc5 kernel.
This is how I run kvm:
IMAGE="np.img"
DEVEL_DISK1="devel-disk-1.img"
DEVEL_DISK2="devel-disk-2.img"
mv np.output np.output.prev
kvm -m 12384 -spice port=7837,disable-ticketing \
-cpu core2duo -smp 12,sockets=1,cores=6,threads=2 \
-enable-kvm -name np \
-drive file=$IMAGE,if=virtio,cache=none,boot=on \
-drive file=$DEVEL_DISK1,if=virtio,cache=none \
-drive file=$DEVEL_DISK2,if=virtio,cache=none \
-rtc base=utc \
-net nic,model=virtio -net user \
-device virtio-balloon-pci \
-nographic \
-redir tcp:7838::22 \
| tee np.output
The cpuinfo in kvm is attached.
My host system is a SandyBridge with 6 cores; hyperthreading gives 12.
The disk where the images sit is an Intel SSD.
My hacky scripts to run the tests are also attached. I just run
"./test-ext4". My .config is also attached. Yes, I just create the
filesystem with mkfs.ext4 without any additional options.
--
Best Regards,
Artem Bityutskiy
On Mon, 2012-07-02 at 12:45 -0400, Theodore Ts'o wrote:
> On Mon, Jul 02, 2012 at 11:14:42AM -0400, Theodore Ts'o wrote:
> > On Mon, Jul 02, 2012 at 05:54:22PM +0300, Artem Bityutskiy wrote:
> > > On Mon, 2012-07-02 at 10:44 -0400, Theodore Ts'o wrote:
> > > > Can you tell me *which* xfstest this was blocking on?
> > >
> > > Test 068 which runs fsstress.
> >
> > Hmmm, I had just started an xfstests -g auto run using standard ext4
> > defaults with 3.5-rc2, and I'm not noting any problems. And I've
> > gotten past test 68 w/o any problems.
>
> I just tried again with 3.5-rc5, and I was able to reproduce your
> failure by using "check 68 68 68 68 68" (it failed after the 2nd time
> it ran the fsstress test). What's interesting is that there have been
> no changes in fs/ext4 or fs/jbd2 since 3.5-rc2, so this may have been
> caused by a writeback-related change; I'm starting a bisect now.
OK, cool. Let me know if you need anything; I'll be able to help tomorrow.
--
Best Regards,
Artem Bityutskiy
On 07/02/2012 11:45 AM, Theodore Ts'o wrote:
> On Mon, Jul 02, 2012 at 11:14:42AM -0400, Theodore Ts'o wrote:
>> On Mon, Jul 02, 2012 at 05:54:22PM +0300, Artem Bityutskiy wrote:
>>> On Mon, 2012-07-02 at 10:44 -0400, Theodore Ts'o wrote:
>>>> Can you tell me *which* xfstest this was blocking on?
>>>
>>> Test 068 which runs fsstress.
>>
>> Hmmm, I had just started an xfstests -g auto run using standard ext4
>> defaults with 3.5-rc2, and I'm not noting any problems. And I've
>> gotten past test 68 w/o any problems.
>
> I just tried again with 3.5-rc5, and I was able to reproduce your
> failure by using "check 68 68 68 68 68" (it failed after the 2nd time
> it ran the fsstress test). What's interesting is that there have been
> no changes in fs/ext4 or fs/jbd2 since 3.5-rc2, so this may have been
> caused by a writeback-related change; I'm starting a bisect now.
Isn't this just part of the whole class of filesystem freeze races Jan
has been trying to work out?
-Eric
On Mon 02-07-12 11:54:05, Eric Sandeen wrote:
> On 07/02/2012 11:45 AM, Theodore Ts'o wrote:
> > On Mon, Jul 02, 2012 at 11:14:42AM -0400, Theodore Ts'o wrote:
> >> On Mon, Jul 02, 2012 at 05:54:22PM +0300, Artem Bityutskiy wrote:
> >>> On Mon, 2012-07-02 at 10:44 -0400, Theodore Ts'o wrote:
> >>>> Can you tell me *which* xfstest this was blocking on?
> >>>
> >>> Test 068 which runs fsstress.
> >>
> >> Hmmm, I had just started an xfstests -g auto run using standard ext4
> >> defaults with 3.5-rc2, and I'm not noting any problems. And I've
> >> gotten past test 68 w/o any problems.
> >
> > I just tried again with 3.5-rc5, and I was able to reproduce your
> > failure by using "check 68 68 68 68 68" (it failed after the 2nd time
> > it ran the fsstress test). What's interesting is that there have been
> > no changes in fs/ext4 or fs/jbd2 since 3.5-rc2, so this may have been
> > caused by a writeback-related change; I'm starting a bisect now.
>
> Isn't this just part of the whole class of filesystem freeze races Jan
> has been trying to work out?
Yes, I can hang the kernel on this test with basically any filesystem
if it runs long enough. My freezing patches should fix that; I am
currently waiting for Al to merge them...
Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR
On Mon, Jul 02, 2012 at 07:31:04PM +0200, Jan Kara wrote:
> Yes, I can hang the kernel on this test with basically any filesystem
> if it runs long enough. My freezing patches should fix that; I am
> currently waiting for Al to merge them...
So this is different from your suspend/resume patches, then?
Can you send me a pointer to your patches? Is this something that Al
will hopefully merge before 3.5 is released?
- Ted
On Mon 02-07-12 13:43:42, Ted Tso wrote:
> On Mon, Jul 02, 2012 at 07:31:04PM +0200, Jan Kara wrote:
> > Yes, I can hang the kernel on this test with basically any filesystem
> > when running long enough. My freezing patches should fix that - currently
> > waiting for Al to merge them...
>
> So this is different from your suspend/resume patches, then?
I'm not sure which suspend/resume patches you mean, so I suspect it
might be the same series :)
> Can you send me a pointer to your patches? Is this something that Al
http://www.spinics.net/lists/kernel/msg1355763.html
> will hopefully merge before 3.5 is released?
Hopefully. He is certainly aware of the series and wants to merge it.
Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR
On Mon, Jul 02, 2012 at 07:53:41PM +0200, Jan Kara wrote:
>
> > Can you send me a pointer to your patches? Is this something that Al
> http://www.spinics.net/lists/kernel/msg1355763.html
Ah, right. OK, I didn't realize until I took a closer look that
xfstests #68 tests filesystem freezes while running fsstress.
So Artem, what I'd suggest doing for now is to simply comment out test
#68 in your xfstests group file and avoid running that test until this
gets sorted out. I simply hadn't noticed, since for most of my quick
checks I tend to use -g quick and run -g auto more rarely (i.e., before
I push changes to Linus, and every so often during the dev cycle);
test #68 isn't run for -g quick, and it only fails about 5-10% of the
time in my test configuration.
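[Editorial note: a hedged sketch of that group-file edit. The sample group lines below are illustrative, not copied from a real xfstests tree, and the sed operates on a local example file; in an actual checkout you would edit the top-level "group" file in place.]

```shell
# Comment out test 068 in an xfstests-style "group" file so that
# "check -g auto" would skip it.  An example copy of the file is used here.
printf '067 auto quick\n068 other auto\n069 auto quick\n' > group.example
sed -i 's/^068 /#068 /' group.example
grep -c '^#068' group.example   # prints: 1
```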
- Ted