Hi.
I had here a case with a Linux 3.7.9 system, a virtual machine in a
RFC1918 range, that did not want to mount NFS.
linux-3lzm:~ # strace -fe mount mount -t nfs 134.76.12.5:/X /mnt
Process 1477 attached
[pid 1515] mount("134.76.12.5:/X", "/mnt", "nfs", 0, "vers=4,addr=134.76.12.5,clientaddr=0.0.0.0") = -1 EIO (Input/output error)
mount.nfs: mount system call failed
[pid 1477] +++ exited with 32 +++
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=1477, si_status=32, si_utime=0, si_stime=0} ---
+++ exited with 32 +++
Nothing in dmesg...
[ 84.202243] RPC: Registered named UNIX socket transport module.
[ 84.202246] RPC: Registered udp transport module.
[ 84.202248] RPC: Registered tcp transport module.
[ 84.202249] RPC: Registered tcp NFSv4.1 backchannel transport module.
[ 84.205909] FS-Cache: Loaded
[ 84.208782] FS-Cache: Netfs 'nfs' registered for caching
[ 84.215733] NFS: Registering the id_resolver key type
[ 84.215762] Key type id_resolver registered
[ 84.215763] Key type id_legacy registered
When mounting with NFSv3, the error became clear:
# strace -fe mount -s 65536 mount -t nfs 134.76.12.5:/X /mnt -o nfsvers=3,nolock
Process 1550 attached
mount.nfs: Network is unreachable
[pid 1550] +++ exited with 32 +++
--- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=1550, si_status=32, si_utime=0, si_stime=0} ---
+++ exited with 32 +++
Can NFSv4 be made to return -ENETUNREACH as well?
On Monday 2013-03-04 18:47, Chuck Lever wrote:
>>
>> linux-3lzm:~ # strace -fe mount mount -t nfs 134.76.12.5:/X /mnt
>> Process 1477 attached
>> [pid 1515] mount("134.76.12.5:/X", "/mnt", "nfs", 0, "vers=4,addr=134.76.12.5,clientaddr=0.0.0.0") = -1 EIO (Input/output error)
>
>"clientaddr=0.0.0.0" is interesting: perhaps the mount.nfs command
>should have failed right there.
Maybe (since there is no route). Maybe not: I would interpret 0.0.0.0
as "kernel will pick something".
This kernel 3.7.9 system has rpcbind-0.2.0_git201103171419-7.1.1.x86_64
and nfs-client-1.2.7-2.1.1.x86_64.
>But, let's find out why the kernel is returning EIO. Enter:
>
> # rpcdebug -m nfs -s all
> # rpcdebug -m rpc -s call xprt
>
>Try your NFSv4 mount command again, then post relevant excerpts from the kernel log.
[205196.949572] NFS: nfs mount opts='vers=4,addr=134.76.12.5,clientaddr=0.0.0.0'
[205196.949580] NFS: parsing nfs mount option 'vers=4'
[205196.949586] NFS: parsing nfs mount option 'addr=134.76.12.5'
[205196.949592] NFS: parsing nfs mount option 'clientaddr=0.0.0.0'
[205196.949597] NFS: MNTPATH: '/X'
[205196.949600] --> nfs4_try_mount()
[205196.949607] --> nfs4_create_server()
[205196.949639] --> nfs4_init_server()
[205196.949641] --> nfs4_set_client()
[205196.949644] --> nfs_get_client(134.76.12.5,v4)
[205196.949651] NFS: get client cookie (0xffff88007abe3800/0xffff88007ba29420)
[205196.949666] RPC: created transport ffff88007ba50800 with 65536 slots
[205196.949670] RPC: creating nfs client for 134.76.12.5 (xprt ffff88007ba50800)
[205196.949693] RPC: 2 call_start nfs4 proc NULL (sync)
[205196.949705] RPC: 2 call_reserve (status 0)
[205196.949709] RPC: 2 reserved req ffff88007b342800 xid 11fca56c
[205196.949712] RPC: 2 call_reserveresult (status 0)
[205196.949714] RPC: 2 call_refresh (status 0)
[205196.949716] RPC: 2 call_refreshresult (status 0)
[205196.949718] RPC: 2 call_allocate (status 0)
[205196.949722] RPC: 2 call_bind (status 0)
[205196.949724] RPC: 2 call_connect xprt ffff88007ba50800 is not connected
[205196.949727] RPC: 2 xprt_connect xprt ffff88007ba50800 is not connected
[205196.949765] RPC: 2 xprt_connect_status: error 101 connecting to server 134.76.12.5
[205196.949779] RPC: 2 call_connect_status (status -5)
[205196.949791] RPC: 2 release request ffff88007b342800
[205196.949794] RPC: rpc_release_client(ffff88007b342e00)
[205196.949797] RPC: shutting down nfs client for 134.76.12.5
[205196.949798] RPC: rpc_release_client(ffff88007b342e00)
[205196.949801] RPC: destroying nfs client for 134.76.12.5
[205196.949803] RPC: destroying transport ffff88007ba50800
[205196.949812] RPC: disconnected transport ffff88007ba50800
[205196.949816] nfs_create_rpc_client: cannot create RPC client. Error = -5
[205196.949818] --> nfs_put_client({1})
[205196.949820] --> nfs_free_client(4)
[205196.949822] NFS: releasing client cookie (0xffff88007abe3800/0xffff88007ba29420)
[205196.949824] <-- nfs_free_client()
[205196.949826] <-- nfs4_init_client() = xerror -5
[205196.949828] <-- nfs4_set_client() = xerror -5
[205196.949829] <-- nfs4_init_server() = -5
[205196.949831] --> nfs_free_server()
[205196.949850] <-- nfs_free_server()
[205196.949852] <-- nfs4_create_server() = error -5
[205196.949856] <-- nfs4_try_mount() = -5 [error]
Just nuke your default route, and it should be easily reproducible.
>By the way, does this happen on older kernels?
On kernel 3.0.51 with rpcbind-0.1.6+git20080930-6.18.1 and
nfs-utils-1.2.3, I observe that
rtsnode1:~ # strace -fe mount -s 65535 mount -t nfs 134.76.12.5:/X /mnt -o
nfsvers=4
Process 24358 attached
Process 24357 suspended
[pid 24358] mount("134.76.12.5:/X", "/mnt", "nfs", 0,
"nfsvers=4,addr=134.76.12.5,clientaddr=0.0.0.0"
<hang, but interruptible>
rpcdebug/dmesg:
[790206.304549] NFS: nfs mount opts='nfsvers=4,addr=134.76.12.5,clientaddr=0.0.0.0'
[790206.304549] NFS: parsing nfs mount option 'nfsvers=4'
[790206.304549] NFS: parsing nfs mount option 'addr=134.76.12.5'
[790206.304549] NFS: parsing nfs mount option 'clientaddr=0.0.0.0'
[790206.304549] NFS: MNTPATH: '/X'
[790206.304549] --> nfs4_try_mount()
[790206.304549] --> nfs4_create_server()
[790206.304549] --> nfs4_init_server()
[790206.304549] --> nfs4_set_client()
[790206.304549] --> nfs_get_client(134.76.12.5,v4)
[790206.304549] NFS: get client cookie (0xffff880079a16400/0xffff880079cd41e0)
[790206.304549] RPC: created transport ffff880078f26800 with 16 slots
[790206.304549] RPC: creating nfs client for 134.76.12.5 (xprt ffff880078f26800)
[790206.304550] RPC: 45 call_start nfs4 proc NULL (sync)
[790206.304550] RPC: 45 call_reserve (status 0)
[790206.304550] RPC: 45 reserved req ffff880037a16000 xid 68604dfd
[790206.304550] RPC: 45 call_reserveresult (status 0)
[790206.304550] RPC: 45 call_refresh (status 0)
[790206.304550] RPC: 45 call_refreshresult (status 0)
[790206.304550] RPC: 45 call_allocate (status 0)
[790206.304550] RPC: 45 call_bind (status 0)
[790206.304550] RPC: 45 call_connect xprt ffff880078f26800 is not connected
[790206.304550] RPC: 45 xprt_connect xprt ffff880078f26800 is not connected
<no further output until Ctrl-C issued>
[790320.020951] RPC: 45 xprt_connect_status: error 512 connecting to server 134.76.12.5
[790320.020960] RPC: 45 release request ffff880037a16000
[790320.020963] RPC: rpc_release_client(ffff880079bc0200)
[790320.020968] RPC: shutting down nfs client for 134.76.12.5
[790320.020970] RPC: rpc_release_client(ffff880079bc0200)
[790320.020973] RPC: destroying nfs client for 134.76.12.5
[790320.024752] RPC: destroying transport ffff880078f26800
[790320.024767] RPC: disconnected transport ffff880078f26800
[790320.024772] nfs_create_rpc_client: cannot create RPC client. Error = -5
[790320.024775] <-- nfs4_init_client() = xerror -5
[790320.024777] --> nfs_put_client({1})
[790320.024781] --> nfs_free_client(4)
[790320.024783] NFS: releasing client cookie (0xffff880079a16400/0xffff880079cd41e0)
[790320.024788] <-- nfs_free_client()
[790320.024789] <-- nfs4_set_client() = xerror -5
[790320.024791] <-- nfs4_init_server() = -5
[790320.024810] --> nfs_free_server()
[790320.024816] <-- nfs_free_server()
[790320.024818] <-- nfs4_create_server() = error -5
[790320.024822] <-- nfs4_try_mount() = -5 [error]
And, if network connectivity is present, this occurs:
# strace -fe mount -s 65535 mount -t nfs 134.76.12.5:/X /mnt -o nfsvers=4
Process 24383 attached
Process 24382 suspended
[pid 24383] mount("134.76.12.5:/X", "/mnt", "nfs", 0,
"nfsvers=4,addr=134.76.12.5,clientaddr=10.10.7.142" <short wait> ) = 0
Process 24382 resumed
Process 24383 detached
--- SIGCHLD (Child exited) @ 0 (0) ---
And that then probably answers your clientaddr= concern.
So what 3.7.x does better is that it immediately identifies
EHOSTUNREACH/ENETUNREACH while 3.0.x waits for Godot.
On Monday 2013-03-04 23:37, Myklebust, Trond wrote:
>On Mon, 2013-03-04 at 20:43 +0000, Myklebust, Trond wrote:
>> On Mon, 2013-03-04 at 20:10 +0100, Jan Engelhardt wrote:
>> > Just nuke your default route, and it should be easily reproducible.
>> >
>>
>> The problem is that call_connect_status() is converting that ENETUNREACH
>> into a EIO. We shouldn't be doing that, but should leave it up to the
>> caller (i.e. the NFS layer) to perform that kind of mapping.
>
>Could you please check if the attached patch helps.
With the patch, I still get EIO from the mount syscall.
On Mon, 2013-03-04 at 20:43 +0000, Myklebust, Trond wrote:
> On Mon, 2013-03-04 at 20:10 +0100, Jan Engelhardt wrote:
> > Just nuke your default route, and it should be easily reproducible.
> >
>
> The problem is that call_connect_status() is converting that ENETUNREACH
> into a EIO. We shouldn't be doing that, but should leave it up to the
> caller (i.e. the NFS layer) to perform that kind of mapping.
Could you please check if the attached patch helps.
--
Trond Myklebust
Linux NFS client maintainer
NetApp
[email protected]
http://www.netapp.com
On Mon, 2013-03-04 at 20:10 +0100, Jan Engelhardt wrote:
> On Monday 2013-03-04 18:47, Chuck Lever wrote:
> >>
> >> linux-3lzm:~ # strace -fe mount mount -t nfs 134.76.12.5:/X /mnt
> >> Process 1477 attached
> >> [pid 1515] mount("134.76.12.5:/X", "/mnt", "nfs", 0, "vers=4,addr=134.76.12.5,clientaddr=0.0.0.0") = -1 EIO (Input/output error)
> >
> >"clientaddr=0.0.0.0" is interesting: perhaps the mount.nfs command
> >should have failed right there.
>
> Maybe (since there is no route). Maybe not: I would interpret 0.0.0.0
> as "kernel will pick something".
>
> This kernel 3.7.9 system has rpcbind-0.2.0_git201103171419-7.1.1.x86_64
> and nfs-client-1.2.7-2.1.1.x86_64.
>
>
> >But, let's find out why the kernel is returning EIO. Enter:
> >
> > # rpcdebug -m nfs -s all
> > # rpcdebug -m rpc -s call xprt
> >
> >Try your NFSv4 mount command again, then post relevant excerpts from the kernel log.
>
> [205196.949572] NFS: nfs mount opts='vers=4,addr=134.76.12.5,clientaddr=0.0.0.0'
> [205196.949580] NFS: parsing nfs mount option 'vers=4'
> [205196.949586] NFS: parsing nfs mount option 'addr=134.76.12.5'
> [205196.949592] NFS: parsing nfs mount option 'clientaddr=0.0.0.0'
> [205196.949597] NFS: MNTPATH: '/X'
> [205196.949600] --> nfs4_try_mount()
> [205196.949607] --> nfs4_create_server()
> [205196.949639] --> nfs4_init_server()
> [205196.949641] --> nfs4_set_client()
> [205196.949644] --> nfs_get_client(134.76.12.5,v4)
> [205196.949651] NFS: get client cookie (0xffff88007abe3800/0xffff88007ba29420)
> [205196.949666] RPC: created transport ffff88007ba50800 with 65536 slots
> [205196.949670] RPC: creating nfs client for 134.76.12.5 (xprt ffff88007ba50800)
> [205196.949693] RPC: 2 call_start nfs4 proc NULL (sync)
> [205196.949705] RPC: 2 call_reserve (status 0)
> [205196.949709] RPC: 2 reserved req ffff88007b342800 xid 11fca56c
> [205196.949712] RPC: 2 call_reserveresult (status 0)
> [205196.949714] RPC: 2 call_refresh (status 0)
> [205196.949716] RPC: 2 call_refreshresult (status 0)
> [205196.949718] RPC: 2 call_allocate (status 0)
> [205196.949722] RPC: 2 call_bind (status 0)
> [205196.949724] RPC: 2 call_connect xprt ffff88007ba50800 is not connected
> [205196.949727] RPC: 2 xprt_connect xprt ffff88007ba50800 is not connected
> [205196.949765] RPC: 2 xprt_connect_status: error 101 connecting to server 134.76.12.5
> [205196.949779] RPC: 2 call_connect_status (status -5)
> [205196.949791] RPC: 2 release request ffff88007b342800
> [205196.949794] RPC: rpc_release_client(ffff88007b342e00)
> [205196.949797] RPC: shutting down nfs client for 134.76.12.5
> [205196.949798] RPC: rpc_release_client(ffff88007b342e00)
> [205196.949801] RPC: destroying nfs client for 134.76.12.5
> [205196.949803] RPC: destroying transport ffff88007ba50800
> [205196.949812] RPC: disconnected transport ffff88007ba50800
> [205196.949816] nfs_create_rpc_client: cannot create RPC client. Error = -5
> [205196.949818] --> nfs_put_client({1})
> [205196.949820] --> nfs_free_client(4)
> [205196.949822] NFS: releasing client cookie (0xffff88007abe3800/0xffff88007ba29420)
> [205196.949824] <-- nfs_free_client()
> [205196.949826] <-- nfs4_init_client() = xerror -5
> [205196.949828] <-- nfs4_set_client() = xerror -5
> [205196.949829] <-- nfs4_init_server() = -5
> [205196.949831] --> nfs_free_server()
> [205196.949850] <-- nfs_free_server()
> [205196.949852] <-- nfs4_create_server() = error -5
> [205196.949856] <-- nfs4_try_mount() = -5 [error]
>
> Just nuke your default route, and it should be easily reproducible.
>
The problem is that call_connect_status() is converting that ENETUNREACH
into a EIO. We shouldn't be doing that, but should leave it up to the
caller (i.e. the NFS layer) to perform that kind of mapping.
Cheers
Trond
On Mar 1, 2013, at 6:43 AM, Jan Engelhardt <[email protected]> wrote:
> Hi.
>
>
> I had here a case with a Linux 3.7.9 system, a virtual machine in a
> RFC1918 range, that did not want to mount NFS.
>
> linux-3lzm:~ # strace -fe mount mount -t nfs 134.76.12.5:/X /mnt
> Process 1477 attached
> [pid 1515] mount("134.76.12.5:/X", "/mnt", "nfs", 0, "vers=4,addr=134.76.12.5,clientaddr=0.0.0.0") = -1 EIO (Input/output error)
"clientaddr=0.0.0.0" is interesting: perhaps the mount.nfs command should have failed right there.
But, let's find out why the kernel is returning EIO. Enter:
# rpcdebug -m nfs -s all
# rpcdebug -m rpc -s call xprt
Try your NFSv4 mount command again, then post relevant excerpts from the kernel log.
By the way, does this happen on older kernels?
> mount.nfs: mount system call failed
> [pid 1477] +++ exited with 32 +++
> --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=1477, si_status=32, si_utime=0, si_stime=0} ---
> +++ exited with 32 +++
>
> Nothing in dmesg...
>
> [ 84.202243] RPC: Registered named UNIX socket transport module.
> [ 84.202246] RPC: Registered udp transport module.
> [ 84.202248] RPC: Registered tcp transport module.
> [ 84.202249] RPC: Registered tcp NFSv4.1 backchannel transport module.
> [ 84.205909] FS-Cache: Loaded
> [ 84.208782] FS-Cache: Netfs 'nfs' registered for caching
> [ 84.215733] NFS: Registering the id_resolver key type
> [ 84.215762] Key type id_resolver registered
> [ 84.215763] Key type id_legacy registered
--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com
Adding Chuck to cc:, as he's probably done the most mucking with the
mount code in recent years?
--b.
On Fri, Mar 01, 2013 at 12:43:58PM +0100, Jan Engelhardt wrote:
> Hi.
>
>
> I had here a case with a Linux 3.7.9 system, a virtual machine in a
> RFC1918 range, that did not want to mount NFS.
>
> linux-3lzm:~ # strace -fe mount mount -t nfs 134.76.12.5:/X /mnt
> Process 1477 attached
> [pid 1515] mount("134.76.12.5:/X", "/mnt", "nfs", 0, "vers=4,addr=134.76.12.5,clientaddr=0.0.0.0") = -1 EIO (Input/output error)
> mount.nfs: mount system call failed
> [pid 1477] +++ exited with 32 +++
> --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=1477, si_status=32, si_utime=0, si_stime=0} ---
> +++ exited with 32 +++
>
> Nothing in dmesg...
>
> [ 84.202243] RPC: Registered named UNIX socket transport module.
> [ 84.202246] RPC: Registered udp transport module.
> [ 84.202248] RPC: Registered tcp transport module.
> [ 84.202249] RPC: Registered tcp NFSv4.1 backchannel transport module.
> [ 84.205909] FS-Cache: Loaded
> [ 84.208782] FS-Cache: Netfs 'nfs' registered for caching
> [ 84.215733] NFS: Registering the id_resolver key type
> [ 84.215762] Key type id_resolver registered
> [ 84.215763] Key type id_legacy registered
>
> When mounting with NFSv3, the error became clear:
>
> # strace -fe mount -s 65536 mount -t nfs 134.76.12.5:/X /mnt -o nfsvers=3,nolock
> Process 1550 attached
> mount.nfs: Network is unreachable
> [pid 1550] +++ exited with 32 +++
> --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=1550, si_status=32, si_utime=0, si_stime=0} ---
> +++ exited with 32 +++
>
> Can NFSv4 be made to return -ENETUNREACH as well?
Um, for real this time.
On Mon, Mar 04, 2013 at 09:16:00AM -0500, J. Bruce Fields wrote:
> Adding Chuck to cc:, as he's probably done the most mucking with the
> mount code in recent years?
>
> --b.
>
> On Fri, Mar 01, 2013 at 12:43:58PM +0100, Jan Engelhardt wrote:
> > Hi.
> >
> >
> > I had here a case with a Linux 3.7.9 system, a virtual machine in a
> > RFC1918 range, that did not want to mount NFS.
> >
> > linux-3lzm:~ # strace -fe mount mount -t nfs 134.76.12.5:/X /mnt
> > Process 1477 attached
> > [pid 1515] mount("134.76.12.5:/X", "/mnt", "nfs", 0, "vers=4,addr=134.76.12.5,clientaddr=0.0.0.0") = -1 EIO (Input/output error)
> > mount.nfs: mount system call failed
> > [pid 1477] +++ exited with 32 +++
> > --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=1477, si_status=32, si_utime=0, si_stime=0} ---
> > +++ exited with 32 +++
> >
> > Nothing in dmesg...
> >
> > [ 84.202243] RPC: Registered named UNIX socket transport module.
> > [ 84.202246] RPC: Registered udp transport module.
> > [ 84.202248] RPC: Registered tcp transport module.
> > [ 84.202249] RPC: Registered tcp NFSv4.1 backchannel transport module.
> > [ 84.205909] FS-Cache: Loaded
> > [ 84.208782] FS-Cache: Netfs 'nfs' registered for caching
> > [ 84.215733] NFS: Registering the id_resolver key type
> > [ 84.215762] Key type id_resolver registered
> > [ 84.215763] Key type id_legacy registered
> >
> > When mounting with NFSv3, the error became clear:
> >
> > # strace -fe mount -s 65536 mount -t nfs 134.76.12.5:/X /mnt -o nfsvers=3,nolock
> > Process 1550 attached
> > mount.nfs: Network is unreachable
> > [pid 1550] +++ exited with 32 +++
> > --- SIGCHLD {si_signo=SIGCHLD, si_code=CLD_EXITED, si_pid=1550, si_status=32, si_utime=0, si_stime=0} ---
> > +++ exited with 32 +++
> >
> > Can NFSv4 be made to return -ENETUNREACH as well?