2013-10-15 18:29:52

by Ben Greear

Subject: 'umount -f /mnt/foo' fails if server IP is gone.

Is 'umount -f' supposed to always work, even if the file server
goes away?

I have a user's system that just hangs forever in this case.

Could be local changes we have made, but I'm curious about
the expected behaviour before I go digging too deep...

Thanks,
Ben

--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com



2013-10-17 18:42:28

by Myklebust, Trond

Subject: Re: 'umount -f /mnt/foo' fails if server IP is gone.

On Thu, 2013-10-17 at 11:35 -0700, Ben Greear wrote:
> On 10/17/2013 11:32 AM, Myklebust, Trond wrote:
> > On Thu, 2013-10-17 at 11:11 -0700, Ben Greear wrote:
> >> On 10/17/2013 11:05 AM, Myklebust, Trond wrote:
> >>> On Thu, 2013-10-17 at 10:35 -0700, Ben Greear wrote:
> >>>> On 10/15/2013 11:29 AM, Ben Greear wrote:
> >>>>> Is 'umount -f' supposed to always work, even if the file server
> >>>>> goes away?
> >>>>>
> >>>>> I have a user's system that just hangs forever in this case.
> >>>>>
> >>>>> Could be local changes we have made, but I'm curious about
> >>>>> the expected behaviour before I go digging too deep...
> >>>>
> >>>> Any input on this? I don't mind trying to fix it, but I
> >>>> would like to know how it is supposed to work.
> >>>
> >>> 'umount -f' has always been iffy. It just kills any pending RPC calls
> >>> _before_ trying to unmount. Since the unmount itself can trigger
> >>> writeback flushes (and hence more RPC calls), the trace you are seeing
> >>> is indeed possible.
> >>
> >> I tried 'umount -f -l', and that also does not work.
> >>
> >> Any ideas on how to fix this properly?
> >
> > 'umount -f -l' should normally work to at least hide the gruesome
> > details of your hanging superblock.
> >
> > I'm guessing that you're falling afoul of the path revalidation that
> > Chuck alluded to. There should already be a fix for that problem with
> > the path_umountat() patches that went into Linux 3.12-rc1. Are those
> > failing to help?
>
> I have not tried past 3.9.11+ kernel yet. I will go look for those patches
> you mention as well. Did any of this go to -stable by chance?

Not as far as I know.

The commit identifier is 8033426e6bdb2690d302872ac1e1fadaec1a5581 (vfs:
allow umount to handle mountpoints without revalidating them) in case
you are interested.
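
As a rough aid for checking a client against the above, here is a minimal Python sketch that tests whether a kernel release string is at least 3.12, the series named above as carrying the fix. The helper name `kernel_at_least` and its parsing rule are illustrative assumptions, not part of any existing tool. A version comparison cannot see vendor backports or local patches, so treat the result as a hint only.

```python
import re


def kernel_at_least(release, major, minor):
    """Return True if a `uname -r` style string is >= major.minor.

    Hypothetical helper: only the leading "X.Y" is compared, so vendor
    backports and local patches (e.g. a "+" suffix) are invisible to it.
    """
    match = re.match(r"(\d+)\.(\d+)", release)
    if match is None:
        raise ValueError("unrecognised kernel release: %r" % release)
    return (int(match.group(1)), int(match.group(2))) >= (major, minor)


if __name__ == "__main__":
    import platform

    running = platform.release()
    print(running, "is 3.12 or later:", kernel_at_least(running, 3, 12))
```

With the 3.9.11+ kernel mentioned in this thread the helper returns False, which only says the fix is not expected there unless it was applied by hand.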

--
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
http://www.netapp.com

2013-10-17 18:35:04

by Ben Greear

Subject: Re: 'umount -f /mnt/foo' fails if server IP is gone.

On 10/17/2013 11:32 AM, Myklebust, Trond wrote:
> On Thu, 2013-10-17 at 11:11 -0700, Ben Greear wrote:
>> On 10/17/2013 11:05 AM, Myklebust, Trond wrote:
>>> On Thu, 2013-10-17 at 10:35 -0700, Ben Greear wrote:
>>>> On 10/15/2013 11:29 AM, Ben Greear wrote:
>>>>> Is 'umount -f' supposed to always work, even if the file server
>>>>> goes away?
>>>>>
>>>>> I have a user's system that just hangs forever in this case.
>>>>>
>>>>> Could be local changes we have made, but I'm curious about
>>>>> the expected behaviour before I go digging too deep...
>>>>
>>>> Any input on this? I don't mind trying to fix it, but I
>>>> would like to know how it is supposed to work.
>>>
>>> 'umount -f' has always been iffy. It just kills any pending RPC calls
>>> _before_ trying to unmount. Since the unmount itself can trigger
>>> writeback flushes (and hence more RPC calls), the trace you are seeing
>>> is indeed possible.
>>
>> I tried 'umount -f -l', and that also does not work.
>>
>> Any ideas on how to fix this properly?
>
> 'umount -f -l' should normally work to at least hide the gruesome
> details of your hanging superblock.
>
> I'm guessing that you're falling afoul of the path revalidation that
> Chuck alluded to. There should already be a fix for that problem with
> the path_umountat() patches that went into Linux 3.12-rc1. Are those
> failing to help?

I have not tried past 3.9.11+ kernel yet. I will go look for those patches
you mention as well. Did any of this go to -stable by chance?

Thanks,
Ben


--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com


2013-10-17 18:23:54

by Christopher T Vogan

Subject: Re: 'umount -f /mnt/foo' fails if server IP is gone.

I have reported 2 scenarios related to this issue, the second topic being
more relevant to your problem.
vfs: allow umount to handle mountpoints without revalidating them
and
NFSERR_STALE on umount with 3.10.0.RC5 kernel


Christopher Vogan
NFS Development & Test



From: Ben Greear <[email protected]>
To: "Myklebust, Trond" <[email protected]>,
Cc: "[email protected]" <[email protected]>
Date: 10/17/2013 01:14 PM
Subject: Re: 'umount -f /mnt/foo' fails if server IP is gone.
Sent by: [email protected]



On 10/17/2013 11:05 AM, Myklebust, Trond wrote:
> On Thu, 2013-10-17 at 10:35 -0700, Ben Greear wrote:
>> On 10/15/2013 11:29 AM, Ben Greear wrote:
>>> Is 'umount -f' supposed to always work, even if the file server
>>> goes away?
>>>
>>> I have a user's system that just hangs forever in this case.
>>>
>>> Could be local changes we have made, but I'm curious about
>>> the expected behaviour before I go digging too deep...
>>
>> Any input on this? I don't mind trying to fix it, but I
>> would like to know how it is supposed to work.
>
> 'umount -f' has always been iffy. It just kills any pending RPC calls
> _before_ trying to unmount. Since the unmount itself can trigger
> writeback flushes (and hence more RPC calls), the trace you are seeing
> is indeed possible.

I tried 'umount -f -l', and that also does not work.

Any ideas on how to fix this properly?

Thanks,
Ben

--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com

2013-10-17 18:05:40

by Myklebust, Trond

Subject: Re: 'umount -f /mnt/foo' fails if server IP is gone.

On Thu, 2013-10-17 at 10:35 -0700, Ben Greear wrote:
> On 10/15/2013 11:29 AM, Ben Greear wrote:
> > Is 'umount -f' supposed to always work, even if the file server
> > goes away?
> >
> > I have a user's system that just hangs forever in this case.
> >
> > Could be local changes we have made, but I'm curious about
> > the expected behaviour before I go digging too deep...
>
> Any input on this? I don't mind trying to fix it, but I
> would like to know how it is supposed to work.

'umount -f' has always been iffy. It just kills any pending RPC calls
_before_ trying to unmount. Since the unmount itself can trigger
writeback flushes (and hence more RPC calls), the trace you are seeing
is indeed possible.
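
Because a forced unmount may still block for the reason just given, a script that calls umount(8) may want a deadline around it. The following is a minimal Python sketch under that assumption. The function names `build_umount_command` and `run_with_deadline` are hypothetical, and the only umount flags used are the `-f` and `-l` discussed in this thread.

```python
import subprocess


def build_umount_command(mountpoint, force=True, lazy=False):
    """Build the argv for umount(8) with the -f / -l flags discussed above."""
    argv = ["umount"]
    if force:
        argv.append("-f")
    if lazy:
        argv.append("-l")
    argv.append(mountpoint)
    return argv


def run_with_deadline(argv, seconds):
    """Run argv; return its exit status, or None if it exceeded the deadline.

    subprocess.run() sends SIGKILL to the child on timeout. That only helps
    if the child sleeps in a killable state; a task in uninterruptible sleep
    would leave this call waiting as well.
    """
    try:
        return subprocess.run(argv, timeout=seconds).returncode
    except subprocess.TimeoutExpired:
        return None
```

A call such as `run_with_deadline(build_umount_command("/mnt/foo", lazy=True), 30)` needs root privileges. A `None` result means the deadline passed, not that the filesystem was unmounted.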

--
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
http://www.netapp.com

2013-10-17 17:35:42

by Ben Greear

Subject: Re: 'umount -f /mnt/foo' fails if server IP is gone.

On 10/15/2013 11:29 AM, Ben Greear wrote:
> Is 'umount -f' supposed to always work, even if the file server
> goes away?
>
> I have a user's system that just hangs forever in this case.
>
> Could be local changes we have made, but I'm curious about
> the expected behaviour before I go digging too deep...

Any input on this? I don't mind trying to fix it, but I
would like to know how it is supposed to work.

Older kernels do not hang (we tried 3.0.x), but I'm not sure
exactly where the problem started.

Test case was to set up NFSv3 mount, then pull the Ethernet cable
on the nfs client machine. This system is running 3.9.11+ kernel.

From /proc/mounts:

10.2.46.90:/nfs_export on /mnt/lf/nfs3-001 type nfs
(rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.2.46.90,mountvers=3,mountport=19408,mountproto=udp,srcaddr=10.2.46.91,local_lock=none,addr=10.2.46.90)

# umount /mnt/lf/nfs3-001
^C
# umount -f /mnt/lf/nfs3-001
[hangs forever it seems, certainly for a long time]
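
The mount options above matter for this test: per nfs(5), `hard` makes the client retry requests indefinitely, and `timeo` is given in tenths of a second. A small Python sketch for reading those values out of an option string follows. The helper name `parse_mount_options` is an assumption made for illustration, and the option string is a shortened copy of the one shown above.

```python
def parse_mount_options(option_string):
    """Split a comma-separated mount option string into a dict.

    Flags without a value (such as "hard") map to True.
    """
    options = {}
    for item in option_string.strip("()").split(","):
        key, sep, value = item.partition("=")
        options[key] = value if sep else True
    return options


# Shortened copy of the option string shown above.
opts = parse_mount_options("(rw,relatime,vers=3,hard,proto=tcp,timeo=600,retrans=2)")

# nfs(5) documents timeo in tenths of a second.
timeout_seconds = int(opts["timeo"]) / 10
print(opts["hard"], opts["proto"], timeout_seconds)
```

For this mount the sketch reports a hard mount over TCP with a 60 second timeout per attempt.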


Here is a stack trace of hung processes, for instance:

Oct 17 10:24:18 localhost kernel: [688601.930366] SysRq : Show Blocked State
Oct 17 10:24:18 localhost kernel: [688601.931016] task PC stack pid father
Oct 17 10:24:18 localhost kernel: [688601.931016] mkdir D f1bf6700 0 16898 16831 0x00000082
Oct 17 10:24:18 localhost kernel: [688601.931016] f070bd8c 00000046 00000282 f1bf6700 f5b55a20 c0d7e400 f5b55a20 c0d7e400
Oct 17 10:24:18 localhost kernel: [688601.931016] c0d7e400 c0d7e400 c0d7e400 f79e9400 f5b55a20 f79e9400 f5b55a20 f58b19c0
Oct 17 10:24:18 localhost kernel: [688601.931016] f8dc4fd0 f070bd50 f0ce9924 f070bd50 f8ec6bff f070bd94 f8dbf9f7 ee91a138
Oct 17 10:24:18 localhost kernel: [688601.931016] Call Trace:
Oct 17 10:24:18 localhost kernel: [688601.931016] [<f8ec6bff>] ? rpc_put_task+0xf/0x20 [sunrpc]
Oct 17 10:24:18 localhost kernel: [688601.931016] [<f8dbf9f7>] ? nfs_initiate_write+0xb7/0xe0 [nfs]
Oct 17 10:24:18 localhost kernel: [688601.931016] [<c04a9f0e>] ? ktime_get_ts+0x3e/0x110
Oct 17 10:24:18 localhost kernel: [688601.931016] [<c09cb133>] schedule+0x23/0x60
Oct 17 10:24:18 localhost kernel: [688601.931016] [<c09cb1e6>] io_schedule+0x76/0xc0
Oct 17 10:24:18 localhost kernel: [688601.931016] [<c051607d>] sleep_on_page+0xd/0x20
Oct 17 10:24:18 localhost kernel: [688601.931016] [<c09c8d4d>] __wait_on_bit+0x4d/0x70
Oct 17 10:24:18 localhost kernel: [688601.931016] [<c0516070>] ? __lock_page+0x90/0x90
Oct 17 10:24:18 localhost kernel: [688601.931016] [<c0516301>] wait_on_page_bit+0x91/0xa0
Oct 17 10:24:18 localhost kernel: [688601.931016] [<c0478710>] ? wake_atomic_t_function+0x50/0x50
Oct 17 10:24:18 localhost kernel: [688601.931016] [<c05164cb>] filemap_fdatawait_range+0xcb/0x150
Oct 17 10:24:18 localhost kernel: [688601.931016] [<c05166c7>] filemap_write_and_wait_range+0x97/0xb0
Oct 17 10:24:18 localhost kernel: [688601.931016] [<f8db4074>] nfs_file_fsync+0x44/0xa0 [nfs]
Oct 17 10:24:18 localhost kernel: [688601.931016] [<f8db4030>] ? nfs_file_fsync_commit+0xb0/0xb0 [nfs]
Oct 17 10:24:18 localhost kernel: [688601.931016] [<c058e1f9>] vfs_fsync_range+0x59/0x70
Oct 17 10:24:18 localhost kernel: [688601.931016] [<c058e237>] vfs_fsync+0x27/0x30
Oct 17 10:24:18 localhost kernel: [688601.931016] [<f8db4b0b>] nfs_file_flush+0x6b/0x90 [nfs]
Oct 17 10:24:18 localhost kernel: [688601.931016] [<c05631a1>] filp_close+0x31/0x80
Oct 17 10:24:18 localhost kernel: [688601.931016] [<c057ea55>] put_files_struct+0x85/0xe0
Oct 17 10:24:18 localhost kernel: [688601.931016] [<c057eaf7>] exit_files+0x47/0x60
Oct 17 10:24:18 localhost kernel: [688601.931016] [<c045b83c>] do_exit+0x25c/0x980
Oct 17 10:24:18 localhost kernel: [688601.931016] [<c056a0be>] ? SyS_stat64+0x2e/0x40
Oct 17 10:24:18 localhost kernel: [688601.931016] [<c045bf9e>] do_group_exit+0x3e/0xa0
Oct 17 10:24:18 localhost kernel: [688601.931016] [<c045c018>] SyS_exit_group+0x18/0x20
Oct 17 10:24:18 localhost kernel: [688601.931016] [<c09d370d>] sysenter_do_call+0x12/0x28
Oct 17 10:24:18 localhost kernel: [688601.931016] umount.nfs D f11c4900 0 17150 17149 0x00000080
Oct 17 10:24:18 localhost kernel: [688602.225057] f3955d00 00000082 efea0d8c f11c4900 f3955c8c c08d9f96 f104e700 c0d7e400
Oct 17 10:24:18 localhost kernel: [688602.225057] c0d7e400 c0d7e400 c0d7e400 efea0d8c efea0c80 f79db400 f104e700 c0c3e980
Oct 17 10:24:18 localhost kernel: [688602.225057] f3955cd0 f3955cb4 f3955e90 0000002c 0000005c 132df575 efea0d80 0000005c
Oct 17 10:24:18 localhost kernel: [688602.225057] Call Trace:
Oct 17 10:24:18 localhost kernel: [688602.225057] [<c08d9f96>] ? __kfree_skb+0x36/0x90
Oct 17 10:24:18 localhost kernel: [688602.225057] [<c09cb133>] schedule+0x23/0x60
Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec6edd>] rpc_wait_bit_killable+0x2d/0x70 [sunrpc]
Oct 17 10:24:18 localhost kernel: [688602.225057] [<c09c8d4d>] __wait_on_bit+0x4d/0x70
Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec6eb0>] ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec6eb0>] ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
Oct 17 10:24:18 localhost kernel: [688602.225057] [<c09c8e1b>] out_of_line_wait_on_bit+0xab/0xc0
Oct 17 10:24:18 localhost kernel: [688602.225057] [<c0478710>] ? wake_atomic_t_function+0x50/0x50
Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec7f9e>] __rpc_execute+0x11e/0x290 [sunrpc]
Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ebf130>] ? rpcproc_decode_null+0x10/0x10 [sunrpc]
Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ebf130>] ? rpcproc_decode_null+0x10/0x10 [sunrpc]
Oct 17 10:24:18 localhost kernel: [688602.225057] [<c047865f>] ? wake_up_bit+0x5f/0x70
Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec814c>] rpc_execute+0x3c/0xa0 [sunrpc]
Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec0f09>] rpc_run_task+0x59/0x70 [sunrpc]
Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec1022>] rpc_call_sync+0x42/0xa0 [sunrpc]
Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8e0b46c>] nfs3_rpc_wrapper.clone.0+0x5c/0xa0 [nfsv3]
Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8e0c0d4>] nfs3_proc_getattr+0x34/0x40 [nfsv3]
Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8db7397>] __nfs_revalidate_inode+0xc7/0x140 [nfs]
Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8db743f>] nfs_revalidate_inode+0x2f/0x60 [nfs]
Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8db14a8>] nfs_weak_revalidate+0x38/0x50 [nfs]
Oct 17 10:24:18 localhost kernel: [688602.225057] [<c056fba8>] complete_walk+0xa8/0xf0
Oct 17 10:24:18 localhost kernel: [688602.225057] [<c0571e53>] path_lookupat+0x63/0x690
Oct 17 10:24:18 localhost kernel: [688602.225057] [<c05724ae>] filename_lookup+0x2e/0xc0
Oct 17 10:24:18 localhost kernel: [688602.225057] [<c05733a3>] user_path_at_empty+0x43/0x80
Oct 17 10:24:18 localhost kernel: [688602.225057] [<c0578b9e>] ? __d_free+0x2e/0x50
Oct 17 10:24:18 localhost kernel: [688602.225057] [<c064450c>] ? security_capable+0x1c/0x30
Oct 17 10:24:18 localhost kernel: [688602.225057] [<c05733ff>] user_path_at+0x1f/0x30
Oct 17 10:24:18 localhost kernel: [688602.225057] [<c05807c3>] SyS_umount+0x83/0x380
Oct 17 10:24:18 localhost kernel: [688602.225057] [<c04d2606>] ? __audit_syscall_exit+0x1f6/0x290
Oct 17 10:24:18 localhost kernel: [688602.225057] [<c09d370d>] sysenter_do_call+0x12/0x28

....

Oct 17 10:24:42 localhost kernel: [688631.186190] INFO: task mkdir:16898 blocked for more than 180 seconds.
Oct 17 10:24:42 localhost kernel: [688631.195666] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Oct 17 10:24:42 localhost kernel: [688631.206304] mkdir D f1bf6700 0 16898 16831 0x00000082
Oct 17 10:24:42 localhost kernel: [688631.215220] f070bd8c 00000046 00000282 f1bf6700 f5b55a20 c0d7e400 f5b55a20 c0d7e400
Oct 17 10:24:42 localhost kernel: [688631.225933] c0d7e400 c0d7e400 c0d7e400 f79e9400 f5b55a20 f79e9400 f5b55a20 f58b19c0
Oct 17 10:24:42 localhost kernel: [688631.236712] f8dc4fd0 f070bd50 f0ce9924 f070bd50 f8ec6bff f070bd94 f8dbf9f7 ee91a138
Oct 17 10:24:42 localhost kernel: [688631.247550] Call Trace:
Oct 17 10:24:42 localhost kernel: [688631.252746] [<f8ec6bff>] ? rpc_put_task+0xf/0x20 [sunrpc]
Oct 17 10:24:42 localhost kernel: [688631.261369] [<f8dbf9f7>] ? nfs_initiate_write+0xb7/0xe0 [nfs]
Oct 17 10:24:42 localhost kernel: [688631.270065] [<c04a9f0e>] ? ktime_get_ts+0x3e/0x110
Oct 17 10:24:42 localhost kernel: [688631.277724] [<c09cb133>] schedule+0x23/0x60
Oct 17 10:24:42 localhost kernel: [688631.285298] [<c09cb1e6>] io_schedule+0x76/0xc0
Oct 17 10:24:42 localhost kernel: [688631.292738] [<c051607d>] sleep_on_page+0xd/0x20
Oct 17 10:24:42 localhost kernel: [688631.300316] [<c09c8d4d>] __wait_on_bit+0x4d/0x70
Oct 17 10:24:42 localhost kernel: [688631.308117] [<c0516070>] ? __lock_page+0x90/0x90
Oct 17 10:24:42 localhost kernel: [688631.315731] [<c0516301>] wait_on_page_bit+0x91/0xa0
Oct 17 10:24:42 localhost kernel: [688631.323630] [<c0478710>] ? wake_atomic_t_function+0x50/0x50
Oct 17 10:24:42 localhost kernel: [688631.332536] [<c05164cb>] filemap_fdatawait_range+0xcb/0x150
Oct 17 10:24:42 localhost kernel: [688631.341221] [<c05166c7>] filemap_write_and_wait_range+0x97/0xb0
Oct 17 10:24:42 localhost kernel: [688631.350224] [<f8db4074>] nfs_file_fsync+0x44/0xa0 [nfs]
Oct 17 10:24:42 localhost kernel: [688631.358569] [<f8db4030>] ? nfs_file_fsync_commit+0xb0/0xb0 [nfs]
Oct 17 10:24:42 localhost kernel: [688631.367764] [<c058e1f9>] vfs_fsync_range+0x59/0x70
Oct 17 10:24:42 localhost kernel: [688631.375818] [<c058e237>] vfs_fsync+0x27/0x30
Oct 17 10:24:42 localhost kernel: [688631.383346] [<f8db4b0b>] nfs_file_flush+0x6b/0x90 [nfs]
Oct 17 10:24:42 localhost kernel: [688631.392117] [<c05631a1>] filp_close+0x31/0x80
Oct 17 10:24:42 localhost kernel: [688631.399741] [<c057ea55>] put_files_struct+0x85/0xe0
Oct 17 10:24:42 localhost kernel: [688631.407871] [<c057eaf7>] exit_files+0x47/0x60
Oct 17 10:24:42 localhost kernel: [688631.415535] [<c045b83c>] do_exit+0x25c/0x980
Oct 17 10:24:42 localhost kernel: [688631.423133] [<c056a0be>] ? SyS_stat64+0x2e/0x40
Oct 17 10:24:42 localhost kernel: [688631.431078] [<c045bf9e>] do_group_exit+0x3e/0xa0
Oct 17 10:24:42 localhost kernel: [688631.439103] [<c045c018>] SyS_exit_group+0x18/0x20
Oct 17 10:24:42 localhost kernel: [688631.447169] [<c09d370d>] sysenter_do_call+0x12/0x28
Oct 17 10:24:54 localhost kernel: [688643.517069] RPC: AUTH_GSS upcall timed out.


Thanks,
Ben


--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com


2013-10-17 18:32:28

by Myklebust, Trond

Subject: Re: 'umount -f /mnt/foo' fails if server IP is gone.

On Thu, 2013-10-17 at 11:11 -0700, Ben Greear wrote:
> On 10/17/2013 11:05 AM, Myklebust, Trond wrote:
> > On Thu, 2013-10-17 at 10:35 -0700, Ben Greear wrote:
> >> On 10/15/2013 11:29 AM, Ben Greear wrote:
> >>> Is 'umount -f' supposed to always work, even if the file server
> >>> goes away?
> >>>
> >>> I have a user's system that just hangs forever in this case.
> >>>
> >>> Could be local changes we have made, but I'm curious about
> >>> the expected behaviour before I go digging too deep...
> >>
> >> Any input on this? I don't mind trying to fix it, but I
> >> would like to know how it is supposed to work.
> >
> > 'umount -f' has always been iffy. It just kills any pending RPC calls
> > _before_ trying to unmount. Since the unmount itself can trigger
> > writeback flushes (and hence more RPC calls), the trace you are seeing
> > is indeed possible.
>
> I tried 'umount -f -l', and that also does not work.
>
> Any ideas on how to fix this properly?

'umount -f -l' should normally work to at least hide the gruesome
details of your hanging superblock.

I'm guessing that you're falling afoul of the path revalidation that
Chuck alluded to. There should already be a fix for that problem with
the path_umountat() patches that went into Linux 3.12-rc1. Are those
failing to help?

--
Trond Myklebust
Linux NFS client maintainer

NetApp
Trond.Myklebust@netapp.com
http://www.netapp.com

2013-10-17 18:03:23

by Chuck Lever

Subject: Re: 'umount -f /mnt/foo' fails if server IP is gone.


On Oct 17, 2013, at 1:35 PM, Ben Greear <[email protected]> wrote:

> On 10/15/2013 11:29 AM, Ben Greear wrote:
>> Is 'umount -f' supposed to always work, even if the file server
>> goes away?
>>
>> I have a user's system that just hangs forever in this case.
>>
>> Could be local changes we have made, but I'm curious about
>> the expected behaviour before I go digging too deep...
>
> Any input on this? I don't mind trying to fix it, but I
> would like to know how it is supposed to work.

Recent kernels emit a GETATTR at umount time. It is probably this operation that is stuck.
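
One way to check this against a blocked-task dump is to pull the function names out of the call trace and look for the GETATTR path. The Python sketch below does that for three frames copied from the umount.nfs trace in this thread. The regular expression and the helper name `frames` are assumptions written for this log format, not an existing tool. Frames the kernel prints with a leading "?" are treated as unreliable.

```python
import re

# Matches frames such as "[<f8e0c0d4>] nfs3_proc_getattr+0x34/0x40 [nfsv3]".
FRAME = re.compile(r"\[<[0-9a-f]+>\]\s+(\?\s+)?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def frames(trace_lines):
    """Return (function, reliable) pairs; "?" frames are marked unreliable."""
    found = []
    for line in trace_lines:
        match = FRAME.search(line)
        if match:
            found.append((match.group(2), match.group(1) is None))
    return found


# Three frames copied from the umount.nfs trace in this thread.
sample = [
    "[<c08d9f96>] ? __kfree_skb+0x36/0x90",
    "[<f8e0c0d4>] nfs3_proc_getattr+0x34/0x40 [nfsv3]",
    "[<c05807c3>] SyS_umount+0x83/0x380",
]
reliable = [name for name, ok in frames(sample) if ok]
print(reliable)
```

Seeing `nfs3_proc_getattr` beneath `SyS_umount` among the reliable frames is consistent with a GETATTR being issued during the unmount path lookup.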


> Older kernels do not hang (we tried 3.0.x), but I'm not sure
> exactly where the problem started.
>
> Test case was to set up NFSv3 mount, then pull the Ethernet cable
> on the nfs client machine. This system is running 3.9.11+ kernel.
>
> From /proc/mounts:
>
> 10.2.46.90:/nfs_export on /mnt/lf/nfs3-001 type nfs (rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.2.46.90,mountvers=3,mountport=19408,mountproto=udp,srcaddr=10.2.46.91,local_lock=none,addr=10.2.46.90)
>
> # umount /mnt/lf/nfs3-001
> ^C
> # umount -f /mnt/lf/nfs3-001
> [hangs forever it seems, certainly for a long time]
>
>
> Here is a stack trace of hung processes, for instance:
>
> Oct 17 10:24:18 localhost kernel: [688601.930366] SysRq : Show Blocked State
> Oct 17 10:24:18 localhost kernel: [688601.931016] task PC stack pid father
> Oct 17 10:24:18 localhost kernel: [688601.931016] mkdir D f1bf6700 0 16898 16831 0x00000082
> Oct 17 10:24:18 localhost kernel: [688601.931016] f070bd8c 00000046 00000282 f1bf6700 f5b55a20 c0d7e400 f5b55a20 c0d7e400
> Oct 17 10:24:18 localhost kernel: [688601.931016] c0d7e400 c0d7e400 c0d7e400 f79e9400 f5b55a20 f79e9400 f5b55a20 f58b19c0
> Oct 17 10:24:18 localhost kernel: [688601.931016] f8dc4fd0 f070bd50 f0ce9924 f070bd50 f8ec6bff f070bd94 f8dbf9f7 ee91a138
> Oct 17 10:24:18 localhost kernel: [688601.931016] Call Trace:
> Oct 17 10:24:18 localhost kernel: [688601.931016] [<f8ec6bff>] ? rpc_put_task+0xf/0x20 [sunrpc]
> Oct 17 10:24:18 localhost kernel: [688601.931016] [<f8dbf9f7>] ? nfs_initiate_write+0xb7/0xe0 [nfs]
> Oct 17 10:24:18 localhost kernel: [688601.931016] [<c04a9f0e>] ? ktime_get_ts+0x3e/0x110
> Oct 17 10:24:18 localhost kernel: [688601.931016] [<c09cb133>] schedule+0x23/0x60
> Oct 17 10:24:18 localhost kernel: [688601.931016] [<c09cb1e6>] io_schedule+0x76/0xc0
> Oct 17 10:24:18 localhost kernel: [688601.931016] [<c051607d>] sleep_on_page+0xd/0x20
> Oct 17 10:24:18 localhost kernel: [688601.931016] [<c09c8d4d>] __wait_on_bit+0x4d/0x70
> Oct 17 10:24:18 localhost kernel: [688601.931016] [<c0516070>] ? __lock_page+0x90/0x90
> Oct 17 10:24:18 localhost kernel: [688601.931016] [<c0516301>] wait_on_page_bit+0x91/0xa0
> Oct 17 10:24:18 localhost kernel: [688601.931016] [<c0478710>] ? wake_atomic_t_function+0x50/0x50
> Oct 17 10:24:18 localhost kernel: [688601.931016] [<c05164cb>] filemap_fdatawait_range+0xcb/0x150
> Oct 17 10:24:18 localhost kernel: [688601.931016] [<c05166c7>] filemap_write_and_wait_range+0x97/0xb0
> Oct 17 10:24:18 localhost kernel: [688601.931016] [<f8db4074>] nfs_file_fsync+0x44/0xa0 [nfs]
> Oct 17 10:24:18 localhost kernel: [688601.931016] [<f8db4030>] ? nfs_file_fsync_commit+0xb0/0xb0 [nfs]
> Oct 17 10:24:18 localhost kernel: [688601.931016] [<c058e1f9>] vfs_fsync_range+0x59/0x70
> Oct 17 10:24:18 localhost kernel: [688601.931016] [<c058e237>] vfs_fsync+0x27/0x30
> Oct 17 10:24:18 localhost kernel: [688601.931016] [<f8db4b0b>] nfs_file_flush+0x6b/0x90 [nfs]
> Oct 17 10:24:18 localhost kernel: [688601.931016] [<c05631a1>] filp_close+0x31/0x80
> Oct 17 10:24:18 localhost kernel: [688601.931016] [<c057ea55>] put_files_struct+0x85/0xe0
> Oct 17 10:24:18 localhost kernel: [688601.931016] [<c057eaf7>] exit_files+0x47/0x60
> Oct 17 10:24:18 localhost kernel: [688601.931016] [<c045b83c>] do_exit+0x25c/0x980
> Oct 17 10:24:18 localhost kernel: [688601.931016] [<c056a0be>] ? SyS_stat64+0x2e/0x40
> Oct 17 10:24:18 localhost kernel: [688601.931016] [<c045bf9e>] do_group_exit+0x3e/0xa0
> Oct 17 10:24:18 localhost kernel: [688601.931016] [<c045c018>] SyS_exit_group+0x18/0x20
> Oct 17 10:24:18 localhost kernel: [688601.931016] [<c09d370d>] sysenter_do_call+0x12/0x28
> Oct 17 10:24:18 localhost kernel: [688601.931016] umount.nfs D f11c4900 0 17150 17149 0x00000080
> Oct 17 10:24:18 localhost kernel: [688602.225057] f3955d00 00000082 efea0d8c f11c4900 f3955c8c c08d9f96 f104e700 c0d7e400
> Oct 17 10:24:18 localhost kernel: [688602.225057] c0d7e400 c0d7e400 c0d7e400 efea0d8c efea0c80 f79db400 f104e700 c0c3e980
> Oct 17 10:24:18 localhost kernel: [688602.225057] f3955cd0 f3955cb4 f3955e90 0000002c 0000005c 132df575 efea0d80 0000005c
> Oct 17 10:24:18 localhost kernel: [688602.225057] Call Trace:
> Oct 17 10:24:18 localhost kernel: [688602.225057] [<c08d9f96>] ? __kfree_skb+0x36/0x90
> Oct 17 10:24:18 localhost kernel: [688602.225057] [<c09cb133>] schedule+0x23/0x60
> Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec6edd>] rpc_wait_bit_killable+0x2d/0x70 [sunrpc]
> Oct 17 10:24:18 localhost kernel: [688602.225057] [<c09c8d4d>] __wait_on_bit+0x4d/0x70
> Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec6eb0>] ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
> Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec6eb0>] ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
> Oct 17 10:24:18 localhost kernel: [688602.225057] [<c09c8e1b>] out_of_line_wait_on_bit+0xab/0xc0
> Oct 17 10:24:18 localhost kernel: [688602.225057] [<c0478710>] ? wake_atomic_t_function+0x50/0x50
> Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec7f9e>] __rpc_execute+0x11e/0x290 [sunrpc]
> Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ebf130>] ? rpcproc_decode_null+0x10/0x10 [sunrpc]
> Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ebf130>] ? rpcproc_decode_null+0x10/0x10 [sunrpc]
> Oct 17 10:24:18 localhost kernel: [688602.225057] [<c047865f>] ? wake_up_bit+0x5f/0x70
> Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec814c>] rpc_execute+0x3c/0xa0 [sunrpc]
> Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec0f09>] rpc_run_task+0x59/0x70 [sunrpc]
> Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec1022>] rpc_call_sync+0x42/0xa0 [sunrpc]
> Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8e0b46c>] nfs3_rpc_wrapper.clone.0+0x5c/0xa0 [nfsv3]
> Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8e0c0d4>] nfs3_proc_getattr+0x34/0x40 [nfsv3]
> Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8db7397>] __nfs_revalidate_inode+0xc7/0x140 [nfs]
> Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8db743f>] nfs_revalidate_inode+0x2f/0x60 [nfs]
> Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8db14a8>] nfs_weak_revalidate+0x38/0x50 [nfs]
> Oct 17 10:24:18 localhost kernel: [688602.225057] [<c056fba8>] complete_walk+0xa8/0xf0
> Oct 17 10:24:18 localhost kernel: [688602.225057] [<c0571e53>] path_lookupat+0x63/0x690
> Oct 17 10:24:18 localhost kernel: [688602.225057] [<c05724ae>] filename_lookup+0x2e/0xc0
> Oct 17 10:24:18 localhost kernel: [688602.225057] [<c05733a3>] user_path_at_empty+0x43/0x80
> Oct 17 10:24:18 localhost kernel: [688602.225057] [<c0578b9e>] ? __d_free+0x2e/0x50
> Oct 17 10:24:18 localhost kernel: [688602.225057] [<c064450c>] ? security_capable+0x1c/0x30
> Oct 17 10:24:18 localhost kernel: [688602.225057] [<c05733ff>] user_path_at+0x1f/0x30
> Oct 17 10:24:18 localhost kernel: [688602.225057] [<c05807c3>] SyS_umount+0x83/0x380
> Oct 17 10:24:18 localhost kernel: [688602.225057] [<c04d2606>] ? __audit_syscall_exit+0x1f6/0x290
> Oct 17 10:24:18 localhost kernel: [688602.225057] [<c09d370d>] sysenter_do_call+0x12/0x28
>
> ....
>
> Oct 17 10:24:42 localhost kernel: [688631.186190] INFO: task mkdir:16898 blocked for more than 180 seconds.
> Oct 17 10:24:42 localhost kernel: [688631.195666] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> Oct 17 10:24:42 localhost kernel: [688631.206304] mkdir D f1bf6700 0 16898 16831 0x00000082
> Oct 17 10:24:42 localhost kernel: [688631.215220] f070bd8c 00000046 00000282 f1bf6700 f5b55a20 c0d7e400 f5b55a20 c0d7e400
> Oct 17 10:24:42 localhost kernel: [688631.225933] c0d7e400 c0d7e400 c0d7e400 f79e9400 f5b55a20 f79e9400 f5b55a20 f58b19c0
> Oct 17 10:24:42 localhost kernel: [688631.236712] f8dc4fd0 f070bd50 f0ce9924 f070bd50 f8ec6bff f070bd94 f8dbf9f7 ee91a138
> Oct 17 10:24:42 localhost kernel: [688631.247550] Call Trace:
> Oct 17 10:24:42 localhost kernel: [688631.252746] [<f8ec6bff>] ? rpc_put_task+0xf/0x20 [sunrpc]
> Oct 17 10:24:42 localhost kernel: [688631.261369] [<f8dbf9f7>] ? nfs_initiate_write+0xb7/0xe0 [nfs]
> Oct 17 10:24:42 localhost kernel: [688631.270065] [<c04a9f0e>] ? ktime_get_ts+0x3e/0x110
> Oct 17 10:24:42 localhost kernel: [688631.277724] [<c09cb133>] schedule+0x23/0x60
> Oct 17 10:24:42 localhost kernel: [688631.285298] [<c09cb1e6>] io_schedule+0x76/0xc0
> Oct 17 10:24:42 localhost kernel: [688631.292738] [<c051607d>] sleep_on_page+0xd/0x20
> Oct 17 10:24:42 localhost kernel: [688631.300316] [<c09c8d4d>] __wait_on_bit+0x4d/0x70
> Oct 17 10:24:42 localhost kernel: [688631.308117] [<c0516070>] ? __lock_page+0x90/0x90
> Oct 17 10:24:42 localhost kernel: [688631.315731] [<c0516301>] wait_on_page_bit+0x91/0xa0
> Oct 17 10:24:42 localhost kernel: [688631.323630] [<c0478710>] ? wake_atomic_t_function+0x50/0x50
> Oct 17 10:24:42 localhost kernel: [688631.332536] [<c05164cb>] filemap_fdatawait_range+0xcb/0x150
> Oct 17 10:24:42 localhost kernel: [688631.341221] [<c05166c7>] filemap_write_and_wait_range+0x97/0xb0
> Oct 17 10:24:42 localhost kernel: [688631.350224] [<f8db4074>] nfs_file_fsync+0x44/0xa0 [nfs]
> Oct 17 10:24:42 localhost kernel: [688631.358569] [<f8db4030>] ? nfs_file_fsync_commit+0xb0/0xb0 [nfs]
> Oct 17 10:24:42 localhost kernel: [688631.367764] [<c058e1f9>] vfs_fsync_range+0x59/0x70
> Oct 17 10:24:42 localhost kernel: [688631.375818] [<c058e237>] vfs_fsync+0x27/0x30
> Oct 17 10:24:42 localhost kernel: [688631.383346] [<f8db4b0b>] nfs_file_flush+0x6b/0x90 [nfs]
> Oct 17 10:24:42 localhost kernel: [688631.392117] [<c05631a1>] filp_close+0x31/0x80
> Oct 17 10:24:42 localhost kernel: [688631.399741] [<c057ea55>] put_files_struct+0x85/0xe0
> Oct 17 10:24:42 localhost kernel: [688631.407871] [<c057eaf7>] exit_files+0x47/0x60
> Oct 17 10:24:42 localhost kernel: [688631.415535] [<c045b83c>] do_exit+0x25c/0x980
> Oct 17 10:24:42 localhost kernel: [688631.423133] [<c056a0be>] ? SyS_stat64+0x2e/0x40
> Oct 17 10:24:42 localhost kernel: [688631.431078] [<c045bf9e>] do_group_exit+0x3e/0xa0
> Oct 17 10:24:42 localhost kernel: [688631.439103] [<c045c018>] SyS_exit_group+0x18/0x20
> Oct 17 10:24:42 localhost kernel: [688631.447169] [<c09d370d>] sysenter_do_call+0x12/0x28
> Oct 17 10:24:54 localhost kernel: [688643.517069] RPC: AUTH_GSS upcall timed out.
>
>
> Thanks,
> Ben
>
>
> --
> Ben Greear <[email protected]>
> Candela Technologies Inc http://www.candelatech.com

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com





2013-10-17 18:16:08

by Jeff Layton

Subject: Re: 'umount -f /mnt/foo' fails if server IP is gone.

On Thu, 17 Oct 2013 14:03:05 -0400
Chuck Lever <[email protected]> wrote:

>
> On Oct 17, 2013, at 1:35 PM, Ben Greear <[email protected]> wrote:
>
> > On 10/15/2013 11:29 AM, Ben Greear wrote:
> >> Is 'umount -f' supposed to always work, even if the file server
> >> goes away?
> >>
> >> I have a user's system that just hangs forever in this case.
> >>
> >> Could be local changes we have made, but I'm curious about
> >> the expected behaviour before I go digging too deep...
> >
> > Any input on this? I don't mind trying to fix it, but I
> > would like to know how it is supposed to work.
>
> Recent kernels emit a GETATTR at umount time. It is probably this operation that is stuck.
>

Yep.

>
> > Older kernels do not hang (we tried 3.0.x), but I'm not sure
> > exactly where the problem started.
> >
> > Test case was to set up NFSv3 mount, then pull the Ethernet cable
> > on the nfs client machine. This system is running 3.9.11+ kernel.
> >
> > From /proc/mounts:
> >
> > 10.2.46.90:/nfs_export on /mnt/lf/nfs3-001 type nfs (rw,relatime,vers=3,rsize=131072,wsize=131072,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=10.2.46.90,mountvers=3,mountport=19408,mountproto=udp,srcaddr=10.2.46.91,local_lock=none,addr=10.2.46.90)
> >
> > # umount /mnt/lf/nfs3-001
> > ^C
> > # umount -f /mnt/lf/nfs3-001
> > [hangs forever it seems, certainly for a long time]
> >
> >
> > Here is a stack trace of hung processes, for instance:
> >
> > Oct 17 10:24:18 localhost kernel: [688601.930366] SysRq : Show Blocked State
> > Oct 17 10:24:18 localhost kernel: [688601.931016] task PC stack pid father
> > Oct 17 10:24:18 localhost kernel: [688601.931016] mkdir D f1bf6700 0 16898 16831 0x00000082
> > Oct 17 10:24:18 localhost kernel: [688601.931016] f070bd8c 00000046 00000282 f1bf6700 f5b55a20 c0d7e400 f5b55a20 c0d7e400
> > Oct 17 10:24:18 localhost kernel: [688601.931016] c0d7e400 c0d7e400 c0d7e400 f79e9400 f5b55a20 f79e9400 f5b55a20 f58b19c0
> > Oct 17 10:24:18 localhost kernel: [688601.931016] f8dc4fd0 f070bd50 f0ce9924 f070bd50 f8ec6bff f070bd94 f8dbf9f7 ee91a138
> > Oct 17 10:24:18 localhost kernel: [688601.931016] Call Trace:
> > Oct 17 10:24:18 localhost kernel: [688601.931016] [<f8ec6bff>] ? rpc_put_task+0xf/0x20 [sunrpc]
> > Oct 17 10:24:18 localhost kernel: [688601.931016] [<f8dbf9f7>] ? nfs_initiate_write+0xb7/0xe0 [nfs]
> > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c04a9f0e>] ? ktime_get_ts+0x3e/0x110
> > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c09cb133>] schedule+0x23/0x60
> > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c09cb1e6>] io_schedule+0x76/0xc0
> > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c051607d>] sleep_on_page+0xd/0x20
> > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c09c8d4d>] __wait_on_bit+0x4d/0x70
> > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c0516070>] ? __lock_page+0x90/0x90
> > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c0516301>] wait_on_page_bit+0x91/0xa0
> > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c0478710>] ? wake_atomic_t_function+0x50/0x50
> > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c05164cb>] filemap_fdatawait_range+0xcb/0x150
> > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c05166c7>] filemap_write_and_wait_range+0x97/0xb0
> > Oct 17 10:24:18 localhost kernel: [688601.931016] [<f8db4074>] nfs_file_fsync+0x44/0xa0 [nfs]
> > Oct 17 10:24:18 localhost kernel: [688601.931016] [<f8db4030>] ? nfs_file_fsync_commit+0xb0/0xb0 [nfs]
> > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c058e1f9>] vfs_fsync_range+0x59/0x70
> > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c058e237>] vfs_fsync+0x27/0x30
> > Oct 17 10:24:18 localhost kernel: [688601.931016] [<f8db4b0b>] nfs_file_flush+0x6b/0x90 [nfs]
> > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c05631a1>] filp_close+0x31/0x80
> > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c057ea55>] put_files_struct+0x85/0xe0
> > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c057eaf7>] exit_files+0x47/0x60
> > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c045b83c>] do_exit+0x25c/0x980
> > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c056a0be>] ? SyS_stat64+0x2e/0x40
> > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c045bf9e>] do_group_exit+0x3e/0xa0
> > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c045c018>] SyS_exit_group+0x18/0x20
> > Oct 17 10:24:18 localhost kernel: [688601.931016] [<c09d370d>] sysenter_do_call+0x12/0x28
> > Oct 17 10:24:18 localhost kernel: [688601.931016] umount.nfs D f11c4900 0 17150 17149 0x00000080
> > Oct 17 10:24:18 localhost kernel: [688602.225057] f3955d00 00000082 efea0d8c f11c4900 f3955c8c c08d9f96 f104e700 c0d7e400
> > Oct 17 10:24:18 localhost kernel: [688602.225057] c0d7e400 c0d7e400 c0d7e400 efea0d8c efea0c80 f79db400 f104e700 c0c3e980
> > Oct 17 10:24:18 localhost kernel: [688602.225057] f3955cd0 f3955cb4 f3955e90 0000002c 0000005c 132df575 efea0d80 0000005c
> > Oct 17 10:24:18 localhost kernel: [688602.225057] Call Trace:
> > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c08d9f96>] ? __kfree_skb+0x36/0x90
> > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c09cb133>] schedule+0x23/0x60
> > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec6edd>] rpc_wait_bit_killable+0x2d/0x70 [sunrpc]
> > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c09c8d4d>] __wait_on_bit+0x4d/0x70
> > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec6eb0>] ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
> > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec6eb0>] ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
> > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c09c8e1b>] out_of_line_wait_on_bit+0xab/0xc0
> > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c0478710>] ? wake_atomic_t_function+0x50/0x50
> > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec7f9e>] __rpc_execute+0x11e/0x290 [sunrpc]
> > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ebf130>] ? rpcproc_decode_null+0x10/0x10 [sunrpc]
> > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ebf130>] ? rpcproc_decode_null+0x10/0x10 [sunrpc]
> > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c047865f>] ? wake_up_bit+0x5f/0x70
> > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec814c>] rpc_execute+0x3c/0xa0 [sunrpc]
> > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec0f09>] rpc_run_task+0x59/0x70 [sunrpc]
> > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8ec1022>] rpc_call_sync+0x42/0xa0 [sunrpc]
> > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8e0b46c>] nfs3_rpc_wrapper.clone.0+0x5c/0xa0 [nfsv3]
> > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8e0c0d4>] nfs3_proc_getattr+0x34/0x40 [nfsv3]
> > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8db7397>] __nfs_revalidate_inode+0xc7/0x140 [nfs]
> > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8db743f>] nfs_revalidate_inode+0x2f/0x60 [nfs]
> > Oct 17 10:24:18 localhost kernel: [688602.225057] [<f8db14a8>] nfs_weak_revalidate+0x38/0x50 [nfs]
> > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c056fba8>] complete_walk+0xa8/0xf0
> > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c0571e53>] path_lookupat+0x63/0x690
> > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c05724ae>] filename_lookup+0x2e/0xc0
> > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c05733a3>] user_path_at_empty+0x43/0x80
> > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c0578b9e>] ? __d_free+0x2e/0x50
> > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c064450c>] ? security_capable+0x1c/0x30
> > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c05733ff>] user_path_at+0x1f/0x30
> > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c05807c3>] SyS_umount+0x83/0x380
> > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c04d2606>] ? __audit_syscall_exit+0x1f6/0x290
> > Oct 17 10:24:18 localhost kernel: [688602.225057] [<c09d370d>] sysenter_do_call+0x12/0x28
> >

The umount here is stuck trying to revalidate the dentry at the root of
the mount. This situation should be improved by commit 8033426e6b,
which skips revalidating the last component of the lookup.

> >
> > Oct 17 10:24:42 localhost kernel: [688631.186190] INFO: task mkdir:16898 blocked for more than 180 seconds.
> > Oct 17 10:24:42 localhost kernel: [688631.195666] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
> > Oct 17 10:24:42 localhost kernel: [688631.206304] mkdir D f1bf6700 0 16898 16831 0x00000082
> > Oct 17 10:24:42 localhost kernel: [688631.215220] f070bd8c 00000046 00000282 f1bf6700 f5b55a20 c0d7e400 f5b55a20 c0d7e400
> > Oct 17 10:24:42 localhost kernel: [688631.225933] c0d7e400 c0d7e400 c0d7e400 f79e9400 f5b55a20 f79e9400 f5b55a20 f58b19c0
> > Oct 17 10:24:42 localhost kernel: [688631.236712] f8dc4fd0 f070bd50 f0ce9924 f070bd50 f8ec6bff f070bd94 f8dbf9f7 ee91a138
> > Oct 17 10:24:42 localhost kernel: [688631.247550] Call Trace:
> > Oct 17 10:24:42 localhost kernel: [688631.252746] [<f8ec6bff>] ? rpc_put_task+0xf/0x20 [sunrpc]
> > Oct 17 10:24:42 localhost kernel: [688631.261369] [<f8dbf9f7>] ? nfs_initiate_write+0xb7/0xe0 [nfs]
> > Oct 17 10:24:42 localhost kernel: [688631.270065] [<c04a9f0e>] ? ktime_get_ts+0x3e/0x110
> > Oct 17 10:24:42 localhost kernel: [688631.277724] [<c09cb133>] schedule+0x23/0x60
> > Oct 17 10:24:42 localhost kernel: [688631.285298] [<c09cb1e6>] io_schedule+0x76/0xc0
> > Oct 17 10:24:42 localhost kernel: [688631.292738] [<c051607d>] sleep_on_page+0xd/0x20
> > Oct 17 10:24:42 localhost kernel: [688631.300316] [<c09c8d4d>] __wait_on_bit+0x4d/0x70
> > Oct 17 10:24:42 localhost kernel: [688631.308117] [<c0516070>] ? __lock_page+0x90/0x90
> > Oct 17 10:24:42 localhost kernel: [688631.315731] [<c0516301>] wait_on_page_bit+0x91/0xa0
> > Oct 17 10:24:42 localhost kernel: [688631.323630] [<c0478710>] ? wake_atomic_t_function+0x50/0x50
> > Oct 17 10:24:42 localhost kernel: [688631.332536] [<c05164cb>] filemap_fdatawait_range+0xcb/0x150
> > Oct 17 10:24:42 localhost kernel: [688631.341221] [<c05166c7>] filemap_write_and_wait_range+0x97/0xb0
> > Oct 17 10:24:42 localhost kernel: [688631.350224] [<f8db4074>] nfs_file_fsync+0x44/0xa0 [nfs]
> > Oct 17 10:24:42 localhost kernel: [688631.358569] [<f8db4030>] ? nfs_file_fsync_commit+0xb0/0xb0 [nfs]
> > Oct 17 10:24:42 localhost kernel: [688631.367764] [<c058e1f9>] vfs_fsync_range+0x59/0x70
> > Oct 17 10:24:42 localhost kernel: [688631.375818] [<c058e237>] vfs_fsync+0x27/0x30
> > Oct 17 10:24:42 localhost kernel: [688631.383346] [<f8db4b0b>] nfs_file_flush+0x6b/0x90 [nfs]
> > Oct 17 10:24:42 localhost kernel: [688631.392117] [<c05631a1>] filp_close+0x31/0x80
> > Oct 17 10:24:42 localhost kernel: [688631.399741] [<c057ea55>] put_files_struct+0x85/0xe0
> > Oct 17 10:24:42 localhost kernel: [688631.407871] [<c057eaf7>] exit_files+0x47/0x60
> > Oct 17 10:24:42 localhost kernel: [688631.415535] [<c045b83c>] do_exit+0x25c/0x980
> > Oct 17 10:24:42 localhost kernel: [688631.423133] [<c056a0be>] ? SyS_stat64+0x2e/0x40
> > Oct 17 10:24:42 localhost kernel: [688631.431078] [<c045bf9e>] do_group_exit+0x3e/0xa0
> > Oct 17 10:24:42 localhost kernel: [688631.439103] [<c045c018>] SyS_exit_group+0x18/0x20
> > Oct 17 10:24:42 localhost kernel: [688631.447169] [<c09d370d>] sysenter_do_call+0x12/0x28
> > Oct 17 10:24:54 localhost kernel: [688643.517069] RPC: AUTH_GSS upcall timed out.
> >

Of course, the mkdir process here might be holding references that will
prevent you from unmounting, but that commit should at least keep the
lookup from getting stuck.

--
Jeff Layton <[email protected]>

2013-10-17 19:36:04

by Ben Greear

Subject: Re: 'umount -f /mnt/foo' fails if server IP is gone.

On 10/17/2013 12:34 PM, Ben Greear wrote:

> After the cable was reconnected (and with the btserver process still hung),
> I tried to re-mount the same partition. Those mount calls are hanging
> as well.
>
> So, maybe some progress, but I think there are still some fixes needed.

About the time I finished composing this email and sent it, it appears
everything cleaned up. So, maybe not quite as bad as it first looked,
but still room for improvement in my opinion.

Thanks,
Ben

--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com


2013-10-17 18:08:53

by Ben Greear

Subject: Re: 'umount -f /mnt/foo' fails if server IP is gone.

On 10/17/2013 11:03 AM, Chuck Lever wrote:
>
> On Oct 17, 2013, at 1:35 PM, Ben Greear <[email protected]> wrote:
>
>> On 10/15/2013 11:29 AM, Ben Greear wrote:
>>> Is 'umount -f' supposed to always work, even if the file server
>>> goes away?
>>>
>>> I have a user's system that just hangs forever in this case.
>>>
>>> Could be local changes we have made, but I'm curious about
>>> the expected behaviour before I go digging too deep...
>>
>> Any input on this? I don't mind trying to fix it, but I
>> would like to know how it is supposed to work.
>
> Recent kernels emit a GETATTR at umount time. It is probably this operation that is stuck.

It seems a 'mkdir' process is trying to complete at the same time;
I'm not sure if that is cause or effect.

How can I go about cleaning up these stuck operations?

Thanks,
Ben

--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com


2013-10-17 18:11:04

by Ben Greear

Subject: Re: 'umount -f /mnt/foo' fails if server IP is gone.

On 10/17/2013 11:05 AM, Myklebust, Trond wrote:
> On Thu, 2013-10-17 at 10:35 -0700, Ben Greear wrote:
>> On 10/15/2013 11:29 AM, Ben Greear wrote:
>>> Is 'umount -f' supposed to always work, even if the file server
>>> goes away?
>>>
>>> I have a user's system that just hangs forever in this case.
>>>
>>> Could be local changes we have made, but I'm curious about
>>> the expected behaviour before I go digging too deep...
>>
>> Any input on this? I don't mind trying to fix it, but I
>> would like to know how it is supposed to work.
>
> 'umount -f' has always been iffy. It just kills any pending RPC calls
> _before_ trying to unmount. Since the unmount itself can trigger
> writeback flushes (and hence more RPC calls), the trace you are seeing
> is indeed possible.

I tried 'umount -f -l', and that also does not work.

Any ideas on how to fix this properly?
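
For reference, a minimal sketch of what the two options ask the kernel
for: 'umount -f' corresponds to the MNT_FORCE flag of umount2(2) and
'umount -l' to MNT_DETACH. The helper names umount_flags() and umount2()
below are made up for illustration, and calling umount2(2) directly like
this is only an assumption about how one might experiment. It does not
avoid the hang described in this thread, since the syscall itself can
still block.

```python
import ctypes
import os
import sys

# Flag values from <sys/mount.h> on Linux.
MNT_FORCE = 1   # requested by 'umount -f'
MNT_DETACH = 2  # requested by 'umount -l' (lazy unmount)


def umount_flags(force=False, lazy=False):
    """Combine the umount2(2) flags for the given options (hypothetical helper)."""
    flags = 0
    if force:
        flags |= MNT_FORCE
    if lazy:
        flags |= MNT_DETACH
    return flags


def umount2(target, flags):
    """Call umount2(2) through libc; needs root, raises OSError on failure."""
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.umount2(os.fsencode(target), flags) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err), target)


if __name__ == "__main__":
    # Example: python3 this_script.py /mnt/lf/nfs3-001
    umount2(sys.argv[1], umount_flags(force=True, lazy=True))
```

Nothing is unmounted unless the script is run directly with a mount
point as its argument.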

Thanks,
Ben

--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com


2013-10-17 19:34:09

by Ben Greear

Subject: Re: 'umount -f /mnt/foo' fails if server IP is gone.

On 10/17/2013 11:42 AM, Myklebust, Trond wrote:
> On Thu, 2013-10-17 at 11:35 -0700, Ben Greear wrote:

>>> 'umount -f -l' should normally work to at least hide the gruesome
>>> details of your hanging superblock.
>>>
>>> I'm guessing that you're falling afoul of the path revalidation that
>>> Chuck alluded to. There should already be a fix for that problem with
>>> the path_umountat() patches that went into Linux 3.12-rc1. Are those
>>> failing to help?
>>
>> I have not tried past 3.9.11 kernel yet. I will go look for those patches
>> you mention as well. Did any of this go to -stable by chance?
>
> Not as far as I know.
>
> The commit identifier is 8033426e6bdb2690d302872ac1e1fadaec1a5581 (vfs:
> allow umount to handle mountpoints without revalidating them) in case
> you are interested.

Ok, that is the one that Jeff pointed me to a bit ago.

I re-ran the test with this patch (which applies cleanly into 3.9.11+).

In this case, I see a hang in my file-io process, but 'umount -l foo'
returns immediately and the mount is gone from /proc/mounts.

I tried 'kill -9', but the btserver process won't die. I plugged the cable
back in so that the mount could recover, but the process is still hung.
Maybe because I did the 'umount -l'?

After the cable was reconnected (and with the btserver process still hung),
I tried to re-mount the same partition. Those mount calls are hanging
as well.

So, maybe some progress, but I think there are still some fixes needed.
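
The "gone from /proc/mounts" check above can be scripted when repeating
the test. This is only a sketch: is_mounted() and read_proc_mounts() are
hypothetical helper names, and it assumes the usual /proc/mounts layout
of whitespace-separated fields with the mount point in the second column.

```python
def is_mounted(mount_point, mounts_text):
    """Return True if mount_point is listed as a mount target in mounts_text."""
    for line in mounts_text.splitlines():
        fields = line.split()
        # /proc/mounts fields: device, mount point, fs type, options, dump, pass.
        # Spaces inside a path are escaped as \040, so a plain split() is safe.
        if len(fields) >= 2 and fields[1] == mount_point:
            return True
    return False


def read_proc_mounts(path="/proc/mounts"):
    """Read the current mount table as text."""
    with open(path) as f:
        return f.read()


if __name__ == "__main__":
    # Mount point taken from the test case earlier in the thread.
    print(is_mounted("/mnt/lf/nfs3-001", read_proc_mounts()))
```

The comparison is an exact string match, so a mount point containing
spaces would need to be given in its escaped form.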


[ 167.229748] r8169 0000:02:00.0 eth1: link down
[ 379.288195] INFO: task btserver:6895 blocked for more than 180 seconds.
[ 379.300366] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[ 379.313502] btserver D f3a3a2a4 0 6895 1431 0x00000080
[ 379.325191] f0615e08 00000086 00000282 f3a3a2a4 f0615dd8 f3a3a2a4 f1ed99a0 c0d41240
[ 379.338396] c0d41240 c0d41240 c0d41240 7913580e 00000027 f79db240 f1ed99a0 f5936680
[ 379.351591] f8e4ffd0 f0615dcc f3a3a2a4 f0615dcc f8e120df f0615e10 f8e4a3c7 f0f2a138
[ 379.365431] Call Trace:
[ 379.373114] [<f8e120df>] ? rpc_put_task+0xf/0x20 [sunrpc]
[ 379.384078] [<f8e4a3c7>] ? nfs_initiate_write+0xb7/0xe0 [nfs]
[ 379.395078] [<c04a076e>] ? ktime_get_ts+0x3e/0x110
[ 379.405192] [<c09baf43>] schedule+0x23/0x60
[ 379.414219] [<c09baff6>] io_schedule+0x76/0xc0
[ 379.423540] [<c05080bd>] sleep_on_page+0xd/0x20
[ 379.432895] [<c09b8fcd>] __wait_on_bit+0x4d/0x70
[ 379.442306] [<c05080b0>] ? __lock_page+0x90/0x90
[ 379.451693] [<c0508381>] wait_on_page_bit+0x91/0xa0
[ 379.461264] [<c0472690>] ? autoremove_wake_function+0x50/0x50
[ 379.472217] [<c050855b>] filemap_fdatawait_range+0xdb/0x150
[ 379.482471] [<c0508727>] filemap_write_and_wait_range+0x77/0x90
[ 379.493219] [<f8e3f074>] nfs_file_fsync+0x44/0xa0 [nfs]
[ 379.502922] [<f8e3f030>] ? nfs_file_fsync_commit+0xb0/0xb0 [nfs]
[ 379.513423] [<c0581179>] vfs_fsync_range+0x59/0x70
[ 379.522692] [<c05811b7>] vfs_fsync+0x27/0x30
[ 379.531426] [<f8e3fabb>] nfs_file_flush+0x6b/0x90 [nfs]
[ 379.541135] [<c05546b1>] filp_close+0x31/0x80
[ 379.549817] [<c056fb9a>] __close_fd+0x6a/0x90
[ 379.558490] [<c055465c>] sys_close+0x1c/0x40
[ 379.567062] [<c09c26cd>] sysenter_do_call+0x12/0x28
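
When comparing runs, it can help to pull just the hung-task reports out
of the kernel log rather than reading whole traces. A rough sketch,
assuming the message format shown above ("INFO: task NAME:PID blocked
for more than N seconds"); hung_tasks() is a hypothetical helper name.

```python
import re
import sys

# Matches the hung-task detector message format seen in the traces above.
HUNG_RE = re.compile(
    r"INFO: task (?P<comm>.+):(?P<pid>\d+) "
    r"blocked for more than (?P<secs>\d+) seconds"
)


def hung_tasks(log_text):
    """Return a list of (command, pid, seconds) for each hung-task report."""
    results = []
    for line in log_text.splitlines():
        m = HUNG_RE.search(line)
        if m:
            results.append(
                (m.group("comm"), int(m.group("pid")), int(m.group("secs")))
            )
    return results


if __name__ == "__main__":
    # Example: dmesg | python3 this_script.py
    for comm, pid, secs in hung_tasks(sys.stdin.read()):
        print(f"{comm} (pid {pid}) blocked > {secs}s")
```

It only lists which tasks were reported; the call traces still have to
be read by hand to see where each one is stuck.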


....


Oct 17 12:25:09 localhost kernel: [ 1240.992796] SysRq : Show Blocked State
Oct 17 12:25:09 localhost kernel: [ 1240.993012] task PC stack pid father
Oct 17 12:25:09 localhost kernel: [ 1240.993012] btserver D f0f2a204 0 8701 1431 0x00000086
Oct 17 12:25:09 localhost kernel: [ 1240.993012] f5bc3c64 00000046 00000000 f0f2a204 00000000 f5aec010 f153e680 c0d41240
Oct 17 12:25:09 localhost kernel: [ 1240.993012] c0d41240 c0d41240 c0d41240 cbf49405 00000103 f79e9240 f153e680 f11a8000
Oct 17 12:25:09 localhost kernel: [ 1240.993012] f5bc3c28 c04a076e f582a148 00000246 00000246 f5bc3c5c c04d6ff6 00014993
Oct 17 12:25:09 localhost kernel: [ 1240.993012] Call Trace:
Oct 17 12:25:09 localhost kernel: [ 1240.993012] [<c04a076e>] ? ktime_get_ts+0x3e/0x110
Oct 17 12:25:09 localhost kernel: [ 1240.993012] [<c04d6ff6>] ? delayacct_end+0x96/0xb0
Oct 17 12:25:09 localhost kernel: [ 1240.993012] [<c04a076e>] ? ktime_get_ts+0x3e/0x110
Oct 17 12:25:09 localhost kernel: [ 1240.993012] [<c09baf43>] schedule+0x23/0x60
Oct 17 12:25:09 localhost kernel: [ 1240.993012] [<c09baff6>] io_schedule+0x76/0xc0
Oct 17 12:25:09 localhost kernel: [ 1240.993012] [<c05080bd>] sleep_on_page+0xd/0x20
Oct 17 12:25:09 localhost kernel: [ 1240.993012] [<c09b8fcd>] __wait_on_bit+0x4d/0x70
Oct 17 12:25:09 localhost kernel: [ 1240.993012] [<c05080b0>] ? __lock_page+0x90/0x90
Oct 17 12:25:09 localhost kernel: [ 1240.993012] [<c0508381>] wait_on_page_bit+0x91/0xa0
Oct 17 12:25:09 localhost kernel: [ 1240.993012] [<c0472690>] ? autoremove_wake_function+0x50/0x50
Oct 17 12:25:09 localhost kernel: [ 1240.993012] [<c050855b>] filemap_fdatawait_range+0xdb/0x150
Oct 17 12:25:09 localhost kernel: [ 1240.993012] [<c0508727>] filemap_write_and_wait_range+0x77/0x90
Oct 17 12:25:09 localhost kernel: [ 1240.993012] [<f8e3f074>] nfs_file_fsync+0x44/0xa0 [nfs]
Oct 17 12:25:09 localhost kernel: [ 1240.993012] [<f8e3f030>] ? nfs_file_fsync_commit+0xb0/0xb0 [nfs]
Oct 17 12:25:09 localhost kernel: [ 1240.993012] [<c0581179>] vfs_fsync_range+0x59/0x70
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c05811b7>] vfs_fsync+0x27/0x30
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e3fabb>] nfs_file_flush+0x6b/0x90 [nfs]
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c05546b1>] filp_close+0x31/0x80
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0570085>] put_files_struct+0x85/0xe0
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0570127>] exit_files+0x47/0x60
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c045653c>] do_exit+0x25c/0x980
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0456c9e>] do_group_exit+0x3e/0xa0
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c046630b>] get_signal_to_deliver+0x1db/0x5f0
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c09ba9f3>] ? __schedule+0x3e3/0x7e0
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c04135aa>] do_signal+0x3a/0x920
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c047eedb>] ? update_rq_clock+0x3b/0x2b0
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0456eee>] ? do_wait+0xfe/0x210
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c045707d>] ? sys_wait4+0x7d/0xb0
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c04c8126>] ? __audit_syscall_exit+0x1f6/0x280
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0454f70>] ? wait_noreap_copyout+0xd0/0xd0
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0413eff>] do_notify_resume+0x6f/0xa0
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c09bc505>] work_notifysig+0x30/0x37
Oct 17 12:25:09 localhost kernel: [ 1241.175689] mkdir D f5aec010 0 8741 8701 0x00000082
Oct 17 12:25:09 localhost kernel: [ 1241.175689] f3abfd8c 00000046 00000282 f5aec010 f11a8000 f153e680 f11a8000 c0d41240
Oct 17 12:25:09 localhost kernel: [ 1241.175689] c0d41240 c0d41240 c0d41240 cbf72225 00000103 f79e9240 f11a8000 f3188cd0
Oct 17 12:25:09 localhost kernel: [ 1241.175689] f3abfd50 c04a076e f15526e8 00000246 00000246 f3abfd84 c04d6ff6 00019454
Oct 17 12:25:09 localhost kernel: [ 1241.175689] Call Trace:
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c04a076e>] ? ktime_get_ts+0x3e/0x110
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c04d6ff6>] ? delayacct_end+0x96/0xb0
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c04a076e>] ? ktime_get_ts+0x3e/0x110
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c09baf43>] schedule+0x23/0x60
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c09baff6>] io_schedule+0x76/0xc0
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c05080bd>] sleep_on_page+0xd/0x20
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c09b8fcd>] __wait_on_bit+0x4d/0x70
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c05080b0>] ? __lock_page+0x90/0x90
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0508381>] wait_on_page_bit+0x91/0xa0
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0472690>] ? autoremove_wake_function+0x50/0x50
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c050855b>] filemap_fdatawait_range+0xdb/0x150
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0508727>] filemap_write_and_wait_range+0x77/0x90
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e3f074>] nfs_file_fsync+0x44/0xa0 [nfs]
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e3f030>] ? nfs_file_fsync_commit+0xb0/0xb0 [nfs]
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0581179>] vfs_fsync_range+0x59/0x70
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c05811b7>] vfs_fsync+0x27/0x30
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e3fabb>] nfs_file_flush+0x6b/0x90 [nfs]
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c05546b1>] filp_close+0x31/0x80
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0570085>] put_files_struct+0x85/0xe0
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0570127>] exit_files+0x47/0x60
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c045653c>] do_exit+0x25c/0x980
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0456c9e>] do_group_exit+0x3e/0xa0
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0456d18>] sys_exit_group+0x18/0x20
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c09c26cd>] sysenter_do_call+0x12/0x28
Oct 17 12:25:09 localhost kernel: [ 1241.175689] mount.nfs D 00000000 0 9474 9473 0x00000080
Oct 17 12:25:09 localhost kernel: [ 1241.175689] f04d1be0 00000082 d07942dc 00000000 00000082 0000b800 f1fec010 c0d41240
Oct 17 12:25:09 localhost kernel: [ 1241.175689] c0d41240 c0d41240 c0d41240 f58bc570 00000000 f79db240 f1fec010 c0c19180
Oct 17 12:25:09 localhost kernel: [ 1241.175689] 00000000 00000000 00000020 00000000 f582b400 f79db240 00000000 f04d1c10
Oct 17 12:25:09 localhost kernel: [ 1241.175689] Call Trace:
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c048b2a0>] ? idle_balance+0x100/0x420
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c09baf43>] schedule+0x23/0x60
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e123fd>] rpc_wait_bit_killable+0x2d/0x70 [sunrpc]
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c09b8fcd>] __wait_on_bit+0x4d/0x70
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e123d0>] ? rpc_queue_empty+0x40/0x40 [sunrpc]
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e123d0>] ? rpc_queue_empty+0x40/0x40 [sunrpc]
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c09b909b>] out_of_line_wait_on_bit+0xab/0xc0
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0472690>] ? autoremove_wake_function+0x50/0x50
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e134fe>] __rpc_execute+0x11e/0x2a0 [sunrpc]
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e0a130>] ? rpcproc_decode_null+0x10/0x10 [sunrpc]
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e0a130>] ? rpcproc_decode_null+0x10/0x10 [sunrpc]
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c047262f>] ? wake_up_bit+0x5f/0x70
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e136b4>] rpc_execute+0x34/0x90 [sunrpc]
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e0bc79>] rpc_run_task+0x59/0x70 [sunrpc]
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e0bd92>] rpc_call_sync+0x42/0xa0 [sunrpc]
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8c0547c>] nfs3_rpc_wrapper.clone.0+0x5c/0xa0 [nfsv3]
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8c06153>] do_proc_fsinfo+0x33/0x40 [nfsv3]
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8c06183>] nfs3_proc_fsinfo+0x23/0x50 [nfsv3]
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e3a97f>] nfs_probe_fsinfo+0x4f/0x500 [nfs]
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e3bef1>] nfs_create_server+0x201/0x440 [nfs]
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8c050ae>] nfs3_create_server+0xe/0x30 [nfsv3]
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e43fc1>] nfs_try_mount+0x151/0x280 [nfs]
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e42e1d>] ? nfs_get_option_ul+0x3d/0x50 [nfs]
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e45d1b>] ? nfs_fs_mount+0x6db/0x9c0 [nfs]
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e3a7d8>] ? get_nfs_version+0x28/0x80 [nfs]
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e3a7d8>] ? get_nfs_version+0x28/0x80 [nfs]
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0520453>] ? kstrndup+0x43/0x60
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e457cd>] nfs_fs_mount+0x18d/0x9c0 [nfs]
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e45450>] ? nfs_clone_super+0x150/0x150 [nfs]
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<f8e43d50>] ? nfs_clone_sb_security+0x50/0x50 [nfs]
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0559036>] mount_fs+0x36/0x180
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0524b3f>] ? __alloc_percpu+0xf/0x20
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0572180>] vfs_kern_mount+0x50/0xc0
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c05737d8>] do_mount+0x2b8/0x810
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c050f68b>] ? __get_free_pages+0x2b/0x30
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c05714e1>] ? copy_mount_options+0x41/0x120
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c0573d9b>] sys_mount+0x6b/0xa0
Oct 17 12:25:09 localhost kernel: [ 1241.175689] [<c09c26cd>] sysenter_do_call+0x12/0x28

Thanks,
Ben

--
Ben Greear <[email protected]>
Candela Technologies Inc http://www.candelatech.com