2016-11-24 20:00:00

by Tuomas Tynkkynen

[permalink] [raw]
Subject: 9pfs hangs since 4.7

Hi fsdevel,

I have been observing hangs when running xfstests generic/224. Curiously
enough, the test is *not* causing problems on the FS under test (I've
tried both ext4 and f2fs) but instead it's causing the 9pfs that I'm
using as the root filesystem to crap out.

The way it shows up is that the test doesn't finish in its usual time
(~50 sec); instead, the hung task detector triggers for some task in
d_alloc_parallel():

[ 660.701646] INFO: task 224:7800 blocked for more than 300 seconds.
[ 660.702756] Not tainted 4.9.0-rc5 #1-NixOS
[ 660.703232] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 660.703927] 224 D 0 7800 549 0x00000000
[ 660.704501] ffff8a82ec022800 0000000000000000 ffff8a82fc03c800 ffff8a82ff217dc0
[ 660.705302] ffff8a82d0f88c00 ffffa94a41a27b88 ffffffffaeb4ad1d ffffa94a41a27b78
[ 660.706125] ffffffffae800fc6 ffff8a82fbd90f08 ffff8a82d0f88c00 ffff8a82fbfd5418
[ 660.706924] Call Trace:
[ 660.707185] [<ffffffffaeb4ad1d>] ? __schedule+0x18d/0x640
[ 660.707751] [<ffffffffae800fc6>] ? __d_alloc+0x126/0x1e0
[ 660.708304] [<ffffffffaeb4b206>] schedule+0x36/0x80
[ 660.708841] [<ffffffffae801937>] d_alloc_parallel+0x3a7/0x480
[ 660.709454] [<ffffffffae697800>] ? wake_up_q+0x70/0x70
[ 660.710007] [<ffffffffae7f37e3>] lookup_slow+0x73/0x140
[ 660.710572] [<ffffffffae7f3eea>] walk_component+0x1ca/0x2f0
[ 660.711167] [<ffffffffae7f2409>] ? path_init+0x1d9/0x330
[ 660.711747] [<ffffffffae808ca4>] ? mntput+0x24/0x40
[ 660.716962] [<ffffffffae7f4fbd>] path_lookupat+0x5d/0x110
[ 660.717581] [<ffffffffae7f784e>] filename_lookup+0x9e/0x150
[ 660.718194] [<ffffffffae7ca036>] ? kmem_cache_alloc+0x156/0x1b0
[ 660.719037] [<ffffffffae7f7496>] ? getname_flags+0x56/0x1f0
[ 660.719801] [<ffffffffae7f74b2>] ? getname_flags+0x72/0x1f0
[ 660.720492] [<ffffffffae7f79b6>] user_path_at_empty+0x36/0x40
[ 660.721206] [<ffffffffae7ec943>] vfs_fstatat+0x53/0xa0
[ 660.721980] [<ffffffffae7ecdbf>] SYSC_newstat+0x1f/0x40
[ 660.722732] [<ffffffffae7ecebe>] SyS_newstat+0xe/0x10
[ 660.723702] [<ffffffffaeb4fb77>] entry_SYSCALL_64_fastpath+0x1a/0xa9

SysRq-T is full of things stuck inside p9_client_rpc like:

[ 271.703598] bash S 0 100 96 0x00000000
[ 271.703968] ffff8a82ff824800 0000000000000000 ffff8a82faee4800 ffff8a82ff217dc0
[ 271.704486] ffff8a82fb946c00 ffffa94a404ebae8 ffffffffaeb4ad1d ffff8a82fb9fc058
[ 271.705024] ffffa94a404ebb10 ffffffffae8f21f9 ffff8a82fb946c00 ffff8a82fbbba000
[ 271.705542] Call Trace:
[ 271.705715] [<ffffffffaeb4ad1d>] ? __schedule+0x18d/0x640
[ 271.706079] [<ffffffffae8f21f9>] ? idr_get_empty_slot+0x199/0x3b0
[ 271.706489] [<ffffffffaeb4b206>] schedule+0x36/0x80
[ 271.706825] [<ffffffffc01b02ba>] p9_client_rpc+0x12a/0x460 [9pnet]
[ 271.707239] [<ffffffffae8f2497>] ? idr_alloc+0x87/0x100
[ 271.707596] [<ffffffffae6abf90>] ? wake_atomic_t_function+0x60/0x60
[ 271.708043] [<ffffffffc01b2247>] p9_client_walk+0x77/0x200 [9pnet]
[ 271.708459] [<ffffffffc01ccae9>] v9fs_vfs_lookup.part.16+0x59/0x120 [9p]
[ 271.708912] [<ffffffffc01ccbcf>] v9fs_vfs_lookup+0x1f/0x30 [9p]
[ 271.709308] [<ffffffffae7f3806>] lookup_slow+0x96/0x140
[ 271.709664] [<ffffffffae7f3eea>] walk_component+0x1ca/0x2f0
[ 271.710036] [<ffffffffae7f2409>] ? path_init+0x1d9/0x330
[ 271.710390] [<ffffffffae7f4fbd>] path_lookupat+0x5d/0x110
[ 271.710763] [<ffffffffae7f784e>] filename_lookup+0x9e/0x150
[ 271.711136] [<ffffffffae7e091e>] ? mem_cgroup_commit_charge+0x7e/0x4a0
[ 271.711581] [<ffffffffae7ca036>] ? kmem_cache_alloc+0x156/0x1b0
[ 271.711977] [<ffffffffae7f7496>] ? getname_flags+0x56/0x1f0
[ 271.712349] [<ffffffffae7f74b2>] ? getname_flags+0x72/0x1f0
[ 271.712726] [<ffffffffae7f79b6>] user_path_at_empty+0x36/0x40
[ 271.713110] [<ffffffffae7ec943>] vfs_fstatat+0x53/0xa0
[ 271.713454] [<ffffffffae7ecdbf>] SYSC_newstat+0x1f/0x40
[ 271.713810] [<ffffffffae7ecebe>] SyS_newstat+0xe/0x10
[ 271.714150] [<ffffffffaeb4fb77>] entry_SYSCALL_64_fastpath+0x1a/0xa9

[ 271.729022] sleep S 0 218 216 0x00000002
[ 271.729391] ffff8a82fb990800 0000000000000000 ffff8a82fc0d8000 ffff8a82ff317dc0
[ 271.729915] ffff8a82fbbec800 ffffa94a404f3cf8 ffffffffaeb4ad1d ffff8a82fb9fc058
[ 271.730426] ffffec95c1ee08c0 0000000000000001 ffff8a82fbbec800 ffff8a82fbbba000
[ 271.730950] Call Trace:
[ 271.731115] [<ffffffffaeb4ad1d>] ? __schedule+0x18d/0x640
[ 271.731479] [<ffffffffaeb4b206>] schedule+0x36/0x80
[ 271.731814] [<ffffffffc01b02ba>] p9_client_rpc+0x12a/0x460 [9pnet]
[ 271.732226] [<ffffffffae6abf90>] ? wake_atomic_t_function+0x60/0x60
[ 271.732649] [<ffffffffc01b2158>] p9_client_clunk+0x38/0xb0 [9pnet]
[ 271.733061] [<ffffffffc01cf38a>] v9fs_dir_release+0x1a/0x30 [9p]
[ 271.733494] [<ffffffffae7e964f>] __fput+0xdf/0x1f0
[ 271.733844] [<ffffffffae7e979e>] ____fput+0xe/0x10
[ 271.734176] [<ffffffffae68b5be>] task_work_run+0x7e/0xa0
[ 271.734532] [<ffffffffae672559>] do_exit+0x2b9/0xad0
[ 271.734888] [<ffffffffae65cc67>] ? __do_page_fault+0x287/0x4b0
[ 271.735276] [<ffffffffae672df3>] do_group_exit+0x43/0xb0
[ 271.735639] [<ffffffffae672e74>] SyS_exit_group+0x14/0x20
[ 271.736002] [<ffffffffaeb4fb77>] entry_SYSCALL_64_fastpath+0x1a/0xa9

Full dmesg logs are available from:
https://gist.githubusercontent.com/dezgeg/31c2a50a1ce82e4284f6c9e617e7eba8/raw/e5ed6e62c7a1a5234d9316563154e530a2e95586/dmesg (shorter)
https://gist.githubusercontent.com/dezgeg/6989a0746ba8c000324455f473dc58e9/raw/4d7c6c58de88ef9d0367147c4ef1d990cfb267ce/dmesg (longer)

This typically reproduces quite fast, maybe half an hour or so. 4.7,
4.8.10 and 4.9-rc5 all are affected.

This happens in a 2-core QEMU+KVM vm with 2GB RAM using its
internal 9p server
(-virtfs local,path=MOUNTPOINT,security_model=none,mount_tag=store).
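
For reference, the whole setup is roughly the sketch below; apart from the
-virtfs line the QEMU flags and the guest-side mount options are assumptions
rather than the exact command lines:

# Approximate reproduction setup: only the -virtfs option is exact, the
# remaining QEMU flags (and the omitted disk/kernel options) are assumptions.
qemu-system-x86_64 -enable-kvm -smp 2 -m 2048 \
    -virtfs local,path=MOUNTPOINT,security_model=none,mount_tag=store

# Guest side, if the export were mounted by hand instead of being the root fs:
mount -t 9p -o trans=virtio,version=9p2000.L store /mnt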

Any ideas for further debugging?


2016-11-29 16:40:14

by Eric Van Hensbergen

[permalink] [raw]
Subject: Re: 9pfs hangs since 4.7

Any idea of what xfstests is doing at this point in time? I'd be a
bit worried about some sort of loop in the namespace since it seems to
be in path traversal. Could also be some sort of resource leak or
fragmentation; I'll admit that many of the regression tests we do are
fairly short in duration. Another approach would be to look at doing
this with a different server (over a network link instead of virtio)
to isolate it as a client versus server side problem (although from
the looks of things this does seem to be a client issue).
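
Something along these lines would take virtio out of the picture entirely;
it's only a sketch, assuming a standalone TCP-capable 9p server such as diod
listening on the standard port (check diod(8) for the actual flags):

# Host side: export the same directory over TCP with a standalone 9p server
# (diod shown here; the flags are from memory, so double-check diod(8)):
diod -f -n -e /path/to/MOUNTPOINT -l 0.0.0.0:564

# Guest side: mount over the TCP transport instead of virtio:
mount -t 9p -o trans=tcp,port=564,version=9p2000.L <host-ip> /mnt/9p-tcp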

On Thu, Nov 24, 2016 at 1:50 PM, Tuomas Tynkkynen <[email protected]> wrote:
> Hi fsdevel,
>
> I have been observing hangs when running xfstests generic/224. Curiously
> enough, the test is *not* causing problems on the FS under test (I've
> tried both ext4 and f2fs) but instead it's causing the 9pfs that I'm
> using as the root filesystem to crap out.
>
> The way it shows up is that the test doesn't finish in its usual time
> (~50 sec); instead, the hung task detector triggers for some task in
> d_alloc_parallel():
>
> [...]
>
> This typically reproduces quite fast, maybe half an hour or so. 4.7,
> 4.8.10 and 4.9-rc5 all are affected.
>
> This happens in a 2-core QEMU+KVM vm with 2GB RAM using its
> internal 9p server
> (-virtfs local,path=MOUNTPOINT,security_model=none,mount_tag=store).
>
> Any ideas for further debugging?

2016-12-02 20:11:26

by Tuomas Tynkkynen

[permalink] [raw]
Subject: Re: 9pfs hangs since 4.7

On Tue, 29 Nov 2016 10:39:39 -0600
Eric Van Hensbergen <[email protected]> wrote:

> Any idea of what xfstests is doing at this point in time? I'd be a
> bit worried about some sort of loop in the namespace since it seems to
> be in path traversal. Could also be some sort of resource leak or
> fragmentation; I'll admit that many of the regression tests we do are
> fairly short in duration. Another approach would be to look at doing
> this with a different server (over a network link instead of virtio)
> to isolate it as a client versus server side problem (although from
> the looks of things this does seem to be a client issue).

The xfstests part where it hangs is either of these loops:

FILES=1000
for i in `seq 0 1 $FILES`; do
    (
        sleep 5
        xfs_io -f -c "truncate 10485760" $SCRATCH_MNT/testfile.$i
        dd if=/dev/zero of=$SCRATCH_MNT/testfile.$i bs=4k conv=notrunc
    ) > /dev/null 2>&1 &
done
wait

for i in `seq 0 1 $FILES`; do
    dd of=/dev/null if=$SCRATCH_MNT/testfile.$i bs=512k iflag=direct > /dev/null 2>&1 &
done
wait

So all that's happening on the 9p mount is a bunch of reads and opens
of the binaries (sleep, xfs_io, dd) and their .so dependencies
(which apparently includes some readlinks as well).
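
A rough way to see that traffic from the guest side is something like the
strace below (the syscall filter is only a guess at what matters and the dd
invocation is arbitrary):

# Hypothetical check: list the paths one of the loop's binaries touches on
# the root fs (the binary itself plus its .so dependencies):
strace -f -e trace=execve,open,openat,readlink,stat \
    dd if=/dev/zero of=/dev/null bs=4k count=1 2>&1 | grep -vE '/dev/(zero|null)'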

I also tried building QEMU with tracing support enabled, and
according to its own 9p event log the server did end up replying
to each client request (i.e. each v9fs_foo with a given tag was
matched up with a v9fs_foo_return or a v9fs_rerror)...
so yes, this is looking more like a client problem.
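
For reference, such a check can be done roughly like this; the -trace syntax
and the log field layout vary between QEMU versions, so treat this as a
sketch rather than the exact commands I used:

# Enable the v9fs trace events (requires a QEMU built with a trace backend,
# e.g. --enable-trace-backends=log; where the output ends up depends on the
# backend, here it simply goes to stderr; "..." stands for the usual VM options):
qemu-system-x86_64 ... -trace 'v9fs_*' 2> 9p-trace.log

# Pair up request/reply tags: anything printed at the end is a request whose
# tag never got a v9fs_*_return or v9fs_rerror (the field layout is assumed):
awk '/v9fs_/ {
        for (i = 1; i <= NF; i++) if ($i == "tag") tag = $(i + 1)
        if ($0 ~ /_return|rerror/) delete pending[tag]; else pending[tag] = $0
     }
     END { for (t in pending) print pending[t] }' 9p-trace.log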