On 03/31/2011 11:48 PM, [email protected] wrote:
> The mm-of-the-moment snapshot 2011-03-31-14-48 has been uploaded to
Hi, the NFS client is broken in this kernel. tcpdump says:
10:51:55.489717 IP 10.20.11.33.759945860 > 10.20.3.2.2049: 132 getattr
fh 0,0/24
10:51:55.515927 IP 10.20.3.2.2049 > 10.20.11.33.759945860: reply ok 44
getattr ERROR: Operation not permitted
10:51:55.515949 IP 10.20.11.33.921 > 10.20.3.2.2049: Flags [.], ack
3569361440, win 115, options [nop,nop,TS val 599750 ecr 255058541], length 0
10:52:04.130310 IP 10.20.11.33.793500292 > 10.20.3.2.2049: 76 getattr fh
0,0/24
10:52:04.152178 IP 10.20.3.2.2049 > 10.20.11.33.793500292: reply ok 44
getattr ERROR: Operation not permitted
If I run the same mount command (mount -oro,intr host:dir mountpoint)
from within a virtual machine with 2.6.38.2 there, everything mounts OK.
thanks,
--
js
suse labs
On 04/13/2011 10:42 PM, Bryan Schumaker wrote:
> On 04/12/2011 02:52 PM, Jiri Slaby wrote:
>> On 04/12/2011 08:43 PM, Bryan Schumaker wrote:
>>> On 04/12/2011 02:34 PM, Jiri Slaby wrote:
>>>> On 04/12/2011 08:31 PM, Trond Myklebust wrote:
>>>>>> Yes, it fixes the problem. But it waits 15s before it times out. This is
>>>>>> inacceptable for automounted NFS dirs.
>>>>>
>>>>> I'm still confused as to why you are hitting it at all. In the normal
>>>>> autonegotiation case, the client should be trying to use AUTH_SYS first
>>>>> and then trying rpcsec_gss if and only if that fails.
>>>>>
>>>>> Are you really exporting a filesystem using AUTH_NULL as the only
>>>>> supported flavour?
>>>>
>>>> I don't know, I connect to a nfs server which is not maintained by me.
>>>> It looks like that. How can I find out?
>>>
>>> If you're not using gss for anything, you could try rmmod-ing rpcsec_gss_krb5 (and other rpcsec_gss_* modules).
>>
>> I don't have NFS in modules. It's all built-in. And this one is
>> unconditionally selected because of CONFIG_NFS_V4.
>
> Does this patch help?
Nope, it makes things even worse:
# mount -oro,intr XXX:/yyy /mnt/c/
<15s delay here>
mount.nfs: access denied by server while mounting XXX:/yyy
So in nfs4_proc_get_root I do:
printk("%s: %d %u\n", __func__, i, flav_array[i]);
status = nfs4_lookup_root_sec(server, fhandle, info, flav_array[i]);
printk("%s: res=%d\n", __func__, status);
and get:
[ 18.159818] nfs4_proc_get_root: 0 1
[ 18.214872] nfs4_proc_get_root: res=-1
[ 18.214875] nfs4_proc_get_root: 1 0
[ 18.254636] nfs4_proc_get_root: res=-1
[ 18.254639] nfs4_proc_get_root: 2 390003
[ 33.252174] RPC: AUTH_GSS upcall timed out.
[ 33.252177] Please check user daemon is running.
[ 33.252192] nfs4_proc_get_root: res=-13
If I revert that back and do the same:
[ 28.275569] nfs4_proc_get_root: 0 1
[ 28.296545] nfs4_proc_get_root: res=-1
[ 28.296548] nfs4_proc_get_root: 1 390003
[ 43.296107] RPC: AUTH_GSS upcall timed out.
[ 43.296108] Please check user daemon is running.
[ 43.296121] nfs4_proc_get_root: res=-13
[ 43.296122] nfs4_proc_get_root: 2 0
[ 43.318201] nfs4_proc_get_root: res=-1
I.e. all methods fail, and what matters is the last return value. From
AUTH_NULL it is EPERM, from GSS it is EACCES. For EPERM, mount(8) falls
back to NFSv3; for EACCES it dies a terrible death.
linux-b984:~ # strace -fe mount -s 1000 mount -oro,intr XXX:/yyy /mnt/c/
Process 2396 attached
Process 2395 suspended
[pid 2396] mount("XXX:/yyy", "/mnt/c", "nfs", MS_RDONLY,
"intr,vers=4,addr=10.20.3.2,clientaddr=10.0.2.15") = -1 EPERM (Operation
not permitted)
[pid 2396] mount("XXX:/yyy", "/mnt/c", "nfs", MS_RDONLY,
"intr,addr=10.20.3.2,vers=3,proto=tcp,mountvers=3,mountproto=udp,mountport=709")
= 0
Process 2395 resumed
Process 2396 detached
--- SIGCHLD (Child exited) @ 0 (0) ---
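To make the ordering effect concrete, here is a minimal user-space sketch of the point above. It is not the kernel code: probe_flavor() is a hypothetical stand-in for nfs4_lookup_root_sec() that simply replays the statuses from the debug output (-EPERM for flavors 1 and 0, -EACCES for 390003), and last_status() is a simplified model of the loop in nfs4_proc_get_root().

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Flavor numbers as printed in the debug output above. */
enum { FLAVOR_AUTH_NULL = 0, FLAVOR_AUTH_UNIX = 1, FLAVOR_GSS_KRB5 = 390003 };

/*
 * Hypothetical stand-in for nfs4_lookup_root_sec(): it only replays the
 * statuses seen in the logs (GSS gives -EACCES, the others give -EPERM).
 */
static int probe_flavor(unsigned int flavor)
{
	return flavor == FLAVOR_GSS_KRB5 ? -EACCES : -EPERM;
}

/*
 * Simplified model of the flavor loop: stop at the first success,
 * otherwise hand back whatever the last flavor returned.
 */
int last_status(const unsigned int *flavors, size_t len)
{
	int status = -EPERM;
	size_t i;

	assert(flavors != NULL);
	for (i = 0; i < len; i++) {
		status = probe_flavor(flavors[i]);
		if (status == 0)
			break;
	}
	return status;
}
```

With the order { 1, 390003, 0 } the caller sees -EPERM; with { 1, 0, 390003 } it sees -EACCES, which is the case where mount.nfs reported "access denied" instead of retrying with NFSv3.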
thanks,
--
js
suse labs
On Tue, 2011-04-12 at 20:05 +0200, Jiri Slaby wrote:
> On 04/12/2011 07:41 PM, Bryan Schumaker wrote:
> > On 04/11/2011 05:08 PM, Jiri Slaby wrote:
> >>
> >> Sorry for an extra message. I've just found out that there appears
> >> messages in dmesg:
> >> [ 58.656048] RPC: AUTH_GSS upcall timed out.
> >> [ 58.656050] Please check user daemon is running.
> >> [ 88.656065] RPC: AUTH_GSS upcall timed out.
> >> [ 88.656068] Please check user daemon is running.
> >> [ 118.656077] RPC: AUTH_GSS upcall timed out.
> >> [ 118.656080] Please check user daemon is running.
> >> [ 148.656049] RPC: AUTH_GSS upcall timed out.
> >> [ 148.656052] Please check user daemon is running.
> >> [ 178.656046] RPC: AUTH_GSS upcall timed out.
> >> [ 178.656049] Please check user daemon is running.
> >>
> >>
> >> I instrumented the code and it's stuck with trying RPC_AUTH_GSS_KRB5.
> >>
> >> I don't use GSS at all.
> >>
> >> regards,
> >
> > Does this patch help?
> >
> > - Bryan
> >
> >
> >
> > There can be an infinite loop if gss_create_upcall() is called without
> > the userspace program running. To prevent this, we return -EACCES if
> > we notice that pipe_version hasn't changed (indicating that the pipe
> > has not been opened).
>
> Yes, it fixes the problem. But it waits 15s before it times out. This is
> inacceptable for automounted NFS dirs.
I'm still confused as to why you are hitting it at all. In the normal
autonegotiation case, the client should be trying to use AUTH_SYS first
and then trying rpcsec_gss if and only if that fails.
Are you really exporting a filesystem using AUTH_NULL as the only
supported flavour?
--
Trond Myklebust
Linux NFS client maintainer
NetApp
[email protected]
http://www.netapp.com
On 04/12/2011 07:41 PM, Bryan Schumaker wrote:
> On 04/11/2011 05:08 PM, Jiri Slaby wrote:
>>
>> Sorry for an extra message. I've just found out that there appears
>> messages in dmesg:
>> [ 58.656048] RPC: AUTH_GSS upcall timed out.
>> [ 58.656050] Please check user daemon is running.
>> [ 88.656065] RPC: AUTH_GSS upcall timed out.
>> [ 88.656068] Please check user daemon is running.
>> [ 118.656077] RPC: AUTH_GSS upcall timed out.
>> [ 118.656080] Please check user daemon is running.
>> [ 148.656049] RPC: AUTH_GSS upcall timed out.
>> [ 148.656052] Please check user daemon is running.
>> [ 178.656046] RPC: AUTH_GSS upcall timed out.
>> [ 178.656049] Please check user daemon is running.
>>
>>
>> I instrumented the code and it's stuck with trying RPC_AUTH_GSS_KRB5.
>>
>> I don't use GSS at all.
>>
>> regards,
>
> Does this patch help?
>
> - Bryan
>
>
>
> There can be an infinite loop if gss_create_upcall() is called without
> the userspace program running. To prevent this, we return -EACCES if
> we notice that pipe_version hasn't changed (indicating that the pipe
> has not been opened).
Yes, it fixes the problem. But it waits 15s before it times out. This is
unacceptable for automounted NFS dirs.
thanks,
--
js
suse labs
On 04/11/2011 04:40 PM, Jiri Slaby wrote:
> On 04/07/2011 08:42 AM, Jiri Slaby wrote:
>> On 04/06/2011 10:44 PM, Myklebust, Trond wrote:
>>> On Sat, 2011-04-02 at 10:56 +0200, Jiri Slaby wrote:
>>>> On 03/31/2011 11:48 PM, [email protected] wrote:
>>>>> The mm-of-the-moment snapshot 2011-03-31-14-48 has been uploaded to
>>>>
>>>> Hi, nfs client is defunct in this kernel. Tcpdump says:
>>>> 10:51:55.489717 IP 10.20.11.33.759945860 > 10.20.3.2.2049: 132 getattr
>>>> fh 0,0/24
>>>> 10:51:55.515927 IP 10.20.3.2.2049 > 10.20.11.33.759945860: reply ok 44
>>>> getattr ERROR: Operation not permitted
>>>> 10:51:55.515949 IP 10.20.11.33.921 > 10.20.3.2.2049: Flags [.], ack
>>>> 3569361440, win 115, options [nop,nop,TS val 599750 ecr 255058541],
>>> length 0
>>>> 10:52:04.130310 IP 10.20.11.33.793500292 > 10.20.3.2.2049: 76 getattr fh
>>>> 0,0/24
>>>> 10:52:04.152178 IP 10.20.3.2.2049 > 10.20.11.33.793500292: reply ok 44
>>>> getattr ERROR: Operation not permitted
>>>>
>>>> If I run the same mount command (mount -oro,intr host:dir mountpoint)
>>>> from within a virtual machine with 2.6.38.2 there, everything mounts OK.
>>>
>>> Does the attached patch help?
>>
>> No, still the operation not permitted in the tcpdump output and no mount.
Does this patch help?
- Bryan
When attempting an initial mount, we should only attempt other
authflavors if AUTH_UNIX receives a NFS4ERR_WRONGSEC error.
This allows other errors to be passed back to userspace programs.
Signed-off-by: Bryan Schumaker <[email protected]>
---
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index dfd1e6d..9bf41ea 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2204,8 +2204,6 @@ static int nfs4_lookup_root_sec(struct nfs_server *server, struct nfs_fh *fhandl
 		goto out;
 	}
 	ret = nfs4_lookup_root(server, fhandle, info);
-	if (ret < 0)
-		ret = -EAGAIN;
 out:
 	return ret;
 }
@@ -2226,7 +2224,7 @@ static int nfs4_proc_get_root(struct nfs_server *server, struct nfs_fh *fhandle,
 	for (i = 0; i < len; i++) {
 		status = nfs4_lookup_root_sec(server, fhandle, info, flav_array[i]);
-		if (status == 0)
+		if (status != -EPERM)
 			break;
 	}
 	if (status == 0)
>
> The next tree from 20110411 still doesn't work. The topmost commit in
> fs/nfs/namespace.c is:
> commit 418875900e3de4831c84f86ae4756690dac5be77
> Author: Bryan Schumaker <[email protected]>
> Date: Wed Apr 6 14:33:28 2011 -0400
>
> NFS: Fix a signed vs. unsigned secinfo bug
>
>
> I bisected it to (in vanilla already):
>
> 8f70e95f9f4159184f557a1db60c909d7c1bd2e3 is the first bad commit
> commit 8f70e95f9f4159184f557a1db60c909d7c1bd2e3
> Author: Bryan Schumaker <[email protected]>
> Date: Thu Mar 24 17:12:31 2011 +0000
>
> NFS: Determine initial mount security
>
> When sec=<something> is not presented as a mount option,
> we should attempt to determine what security flavor the
> server is using.
>
> Signed-off-by: Bryan Schumaker <[email protected]>
> Signed-off-by: Trond Myklebust <[email protected]>
>
> :040000 040000 8e5a640b37e00f0df21e1d9cd9aff160df2d5938
> 0152daa67bc8d12e32cda5f4a036807d2e380392 M fs
> :040000 040000 f74aa33f8597cb82cd0fd7d90d84e0660b7f5804
> 527bc0ca6975cedc7e684b45dc9961f8aaf1207a M include
> :040000 040000 87559d2f211ea905343a86c8551b6610dd239891
> 7e4ee0e5eddf12474b6de9e7fdb6218b6165bdb2 M net
>
> thanks,
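A rough user-space sketch of what the patch in this message changes in the flavor loop: probing continues only while a flavor returns -EPERM (which the patch description associates with NFS4ERR_WRONGSEC), and any other result, success or error, ends the loop and is passed up. The function name negotiate() and the statuses array are made up for illustration; the real loop calls nfs4_lookup_root_sec() for each entry of flav_array.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/*
 * Model of the patched loop: statuses[i] is a made-up result for flavor i.
 * Returns the final status and stores the number of flavors tried.
 */
int negotiate(const int *statuses, size_t len, size_t *tried)
{
	int status = -EPERM;
	size_t i;

	assert(statuses != NULL && tried != NULL);
	for (i = 0; i < len; i++) {
		status = statuses[i];
		if (status != -EPERM)
			break;	/* success or a different error: stop probing */
	}
	/* i == len means every flavor was tried and all returned -EPERM */
	*tried = (i < len) ? i + 1 : len;
	return status;
}
```

In this model a first result of -ENOENT is returned immediately after one attempt, whereas the pre-patch loop (break only on 0) would have gone on to try the remaining flavors.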
On 04/14/2011 11:21 PM, Trond Myklebust wrote:
> On Thu, 2011-04-14 at 22:37 +0200, Jiri Slaby wrote:
>> On 04/13/2011 10:42 PM, Bryan Schumaker wrote:
>>> On 04/12/2011 02:52 PM, Jiri Slaby wrote:
>>>> On 04/12/2011 08:43 PM, Bryan Schumaker wrote:
>>>>> On 04/12/2011 02:34 PM, Jiri Slaby wrote:
>>>>>> On 04/12/2011 08:31 PM, Trond Myklebust wrote:
>>>>>>>> Yes, it fixes the problem. But it waits 15s before it times out. This is
>>>>>>>> inacceptable for automounted NFS dirs.
>>>>>>>
>>>>>>> I'm still confused as to why you are hitting it at all. In the normal
>>>>>>> autonegotiation case, the client should be trying to use AUTH_SYS first
>>>>>>> and then trying rpcsec_gss if and only if that fails.
>>>>>>>
>>>>>>> Are you really exporting a filesystem using AUTH_NULL as the only
>>>>>>> supported flavour?
>>>>>>
>>>>>> I don't know, I connect to a nfs server which is not maintained by me.
>>>>>> It looks like that. How can I find out?
>>>>>
>>>>> If you're not using gss for anything, you could try rmmod-ing rpcsec_gss_krb5 (and other rpcsec_gss_* modules).
>>>>
>>>> I don't have NFS in modules. It's all built-in. And this one is
>>>> unconditionally selected because of CONFIG_NFS_V4.
>>>
>>> Does this patch help?
>>
>> Nope, it makes things even worse:
>> # mount -oro,intr XXX:/yyy /mnt/c/
>> <15s delay here>
>> mount.nfs: access denied by server while mounting XXX:/yyy
>>
>> So in nfs4_proc_get_root I do:
>> printk("%s: %d %u\n", __func__, i, flav_array[i]);
>> status = nfs4_lookup_root_sec(server, fhandle, info, flav_array[i]);
>> printk("%s: res=%d\n", __func__, status);
>> and get:
>> [ 18.159818] nfs4_proc_get_root: 0 1
>> [ 18.214872] nfs4_proc_get_root: res=-1
>> [ 18.214875] nfs4_proc_get_root: 1 0
>> [ 18.254636] nfs4_proc_get_root: res=-1
>> [ 18.254639] nfs4_proc_get_root: 2 390003
>> [ 33.252174] RPC: AUTH_GSS upcall timed out.
>> [ 33.252177] Please check user daemon is running.
>> [ 33.252192] nfs4_proc_get_root: res=-13
>>
>> If I revert that back and do the same:
>> [ 28.275569] nfs4_proc_get_root: 0 1
>> [ 28.296545] nfs4_proc_get_root: res=-1
>> [ 28.296548] nfs4_proc_get_root: 1 390003
>> [ 43.296107] RPC: AUTH_GSS upcall timed out.
>> [ 43.296108] Please check user daemon is running.
>> [ 43.296121] nfs4_proc_get_root: res=-13
>> [ 43.296122] nfs4_proc_get_root: 2 0
>> [ 43.318201] nfs4_proc_get_root: res=-1
>>
>> I.e. all methods fail. And what matters is the last retval. From NULL it
>> is EPERM, from GSS it is EACCESS. For EPERM, mount(8) falls back to
>> nfs3, for EACCESS it dies terrible death.
>
> OK. That's good information. Thanks for testing!
>
> I'm still curious as to why that NFS server is refusing all NFSv4 mounts
> with NFS4ERR_WRONGSEC. Unless NFSv4 really is configured only to export
> the root filesystem with RPCSEC_GSS, then that definitely sounds like a
> bug...
With gssd running, if that helps:
[ 229.806528] nfs4_proc_get_root: 0 1
[ 229.828491] nfs4_proc_get_root: res=-1
[ 229.828494] nfs4_proc_get_root: 1 390003
[ 229.896994] nfs4_proc_get_root: res=-13
[ 229.896997] nfs4_proc_get_root: 2 0
[ 229.920344] nfs4_proc_get_root: res=-1
thanks,
--
js
suse labs
On 04/12/2011 08:31 PM, Trond Myklebust wrote:
>> Yes, it fixes the problem. But it waits 15s before it times out. This is
>> inacceptable for automounted NFS dirs.
>
> I'm still confused as to why you are hitting it at all. In the normal
> autonegotiation case, the client should be trying to use AUTH_SYS first
> and then trying rpcsec_gss if and only if that fails.
>
> Are you really exporting a filesystem using AUTH_NULL as the only
> supported flavour?
I don't know; I connect to an NFS server that I don't maintain.
It looks that way. How can I find out?
thanks,
--
js
suse labs
On 04/11/2011 05:08 PM, Jiri Slaby wrote:
>
> Sorry for an extra message. I've just found out that there appears
> messages in dmesg:
> [ 58.656048] RPC: AUTH_GSS upcall timed out.
> [ 58.656050] Please check user daemon is running.
> [ 88.656065] RPC: AUTH_GSS upcall timed out.
> [ 88.656068] Please check user daemon is running.
> [ 118.656077] RPC: AUTH_GSS upcall timed out.
> [ 118.656080] Please check user daemon is running.
> [ 148.656049] RPC: AUTH_GSS upcall timed out.
> [ 148.656052] Please check user daemon is running.
> [ 178.656046] RPC: AUTH_GSS upcall timed out.
> [ 178.656049] Please check user daemon is running.
>
>
> I instrumented the code and it's stuck with trying RPC_AUTH_GSS_KRB5.
>
> I don't use GSS at all.
>
> regards,
Does this patch help?
- Bryan
There can be an infinite loop if gss_create_upcall() is called without
the userspace program running. To prevent this, we return -EACCES if
we notice that pipe_version hasn't changed (indicating that the pipe
has not been opened).
Signed-off-by: Bryan Schumaker <[email protected]>
---
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 9bf41ea..8a03ee0 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2224,8 +2224,9 @@ static int nfs4_proc_get_root(struct nfs_server *server, struct nfs_fh *fhandle,
 	for (i = 0; i < len; i++) {
 		status = nfs4_lookup_root_sec(server, fhandle, info, flav_array[i]);
-		if (status != -EPERM)
-			break;
+		if (status == -EPERM || status == -EACCES)
+			continue;
+		break;
 	}
 	if (status == 0)
 		status = nfs4_server_capabilities(server, fhandle);
diff --git a/net/sunrpc/auth_gss/auth_gss.c b/net/sunrpc/auth_gss/auth_gss.c
index f3914d0..339ba64 100644
--- a/net/sunrpc/auth_gss/auth_gss.c
+++ b/net/sunrpc/auth_gss/auth_gss.c
@@ -520,7 +520,7 @@ gss_refresh_upcall(struct rpc_task *task)
 		warn_gssd();
 		task->tk_timeout = 15*HZ;
 		rpc_sleep_on(&pipe_version_rpc_waitqueue, task, NULL);
-		return 0;
+		return -EAGAIN;
 	}
 	if (IS_ERR(gss_msg)) {
 		err = PTR_ERR(gss_msg);
@@ -563,10 +563,12 @@ retry:
 	if (PTR_ERR(gss_msg) == -EAGAIN) {
 		err = wait_event_interruptible_timeout(pipe_version_waitqueue,
 				pipe_version >= 0, 15*HZ);
+		if (pipe_version < 0) {
+			warn_gssd();
+			err = -EACCES;
+		}
 		if (err)
 			goto out;
-		if (pipe_version < 0)
-			warn_gssd();
 		goto retry;
 	}
 	if (IS_ERR(gss_msg)) {
On 04/12/2011 02:34 PM, Jiri Slaby wrote:
> On 04/12/2011 08:31 PM, Trond Myklebust wrote:
>>> Yes, it fixes the problem. But it waits 15s before it times out. This is
>>> inacceptable for automounted NFS dirs.
>>
>> I'm still confused as to why you are hitting it at all. In the normal
>> autonegotiation case, the client should be trying to use AUTH_SYS first
>> and then trying rpcsec_gss if and only if that fails.
>>
>> Are you really exporting a filesystem using AUTH_NULL as the only
>> supported flavour?
>
> I don't know, I connect to a nfs server which is not maintained by me.
> It looks like that. How can I find out?
If you're not using gss for anything, you could try rmmod-ing rpcsec_gss_krb5 (and other rpcsec_gss_* modules).
- Bryan
>
> thanks,
On 04/12/2011 08:43 PM, Bryan Schumaker wrote:
> On 04/12/2011 02:34 PM, Jiri Slaby wrote:
>> On 04/12/2011 08:31 PM, Trond Myklebust wrote:
>>>> Yes, it fixes the problem. But it waits 15s before it times out. This is
>>>> inacceptable for automounted NFS dirs.
>>>
>>> I'm still confused as to why you are hitting it at all. In the normal
>>> autonegotiation case, the client should be trying to use AUTH_SYS first
>>> and then trying rpcsec_gss if and only if that fails.
>>>
>>> Are you really exporting a filesystem using AUTH_NULL as the only
>>> supported flavour?
>>
>> I don't know, I connect to a nfs server which is not maintained by me.
>> It looks like that. How can I find out?
>
> If you're not using gss for anything, you could try rmmod-ing rpcsec_gss_krb5 (and other rpcsec_gss_* modules).
NFS isn't built as modules here; it's all built-in. And this one is
unconditionally selected by CONFIG_NFS_V4.
regards,
--
js
suse labs
On Thu, 2011-04-14 at 22:37 +0200, Jiri Slaby wrote:
> On 04/13/2011 10:42 PM, Bryan Schumaker wrote:
> > On 04/12/2011 02:52 PM, Jiri Slaby wrote:
> >> On 04/12/2011 08:43 PM, Bryan Schumaker wrote:
> >>> On 04/12/2011 02:34 PM, Jiri Slaby wrote:
> >>>> On 04/12/2011 08:31 PM, Trond Myklebust wrote:
> >>>>>> Yes, it fixes the problem. But it waits 15s before it times out. This is
> >>>>>> inacceptable for automounted NFS dirs.
> >>>>>
> >>>>> I'm still confused as to why you are hitting it at all. In the normal
> >>>>> autonegotiation case, the client should be trying to use AUTH_SYS first
> >>>>> and then trying rpcsec_gss if and only if that fails.
> >>>>>
> >>>>> Are you really exporting a filesystem using AUTH_NULL as the only
> >>>>> supported flavour?
> >>>>
> >>>> I don't know, I connect to a nfs server which is not maintained by me.
> >>>> It looks like that. How can I find out?
> >>>
> >>> If you're not using gss for anything, you could try rmmod-ing rpcsec_gss_krb5 (and other rpcsec_gss_* modules).
> >>
> >> I don't have NFS in modules. It's all built-in. And this one is
> >> unconditionally selected because of CONFIG_NFS_V4.
> >
> > Does this patch help?
>
> Nope, it makes things even worse:
> # mount -oro,intr XXX:/yyy /mnt/c/
> <15s delay here>
> mount.nfs: access denied by server while mounting XXX:/yyy
>
> So in nfs4_proc_get_root I do:
> printk("%s: %d %u\n", __func__, i, flav_array[i]);
> status = nfs4_lookup_root_sec(server, fhandle, info, flav_array[i]);
> printk("%s: res=%d\n", __func__, status);
> and get:
> [ 18.159818] nfs4_proc_get_root: 0 1
> [ 18.214872] nfs4_proc_get_root: res=-1
> [ 18.214875] nfs4_proc_get_root: 1 0
> [ 18.254636] nfs4_proc_get_root: res=-1
> [ 18.254639] nfs4_proc_get_root: 2 390003
> [ 33.252174] RPC: AUTH_GSS upcall timed out.
> [ 33.252177] Please check user daemon is running.
> [ 33.252192] nfs4_proc_get_root: res=-13
>
> If I revert that back and do the same:
> [ 28.275569] nfs4_proc_get_root: 0 1
> [ 28.296545] nfs4_proc_get_root: res=-1
> [ 28.296548] nfs4_proc_get_root: 1 390003
> [ 43.296107] RPC: AUTH_GSS upcall timed out.
> [ 43.296108] Please check user daemon is running.
> [ 43.296121] nfs4_proc_get_root: res=-13
> [ 43.296122] nfs4_proc_get_root: 2 0
> [ 43.318201] nfs4_proc_get_root: res=-1
>
> I.e. all methods fail. And what matters is the last retval. From NULL it
> is EPERM, from GSS it is EACCESS. For EPERM, mount(8) falls back to
> nfs3, for EACCESS it dies terrible death.
OK. That's good information. Thanks for testing!
I'm still curious as to why that NFS server is refusing all NFSv4 mounts
with NFS4ERR_WRONGSEC. Unless NFSv4 really is configured only to export
the root filesystem with RPCSEC_GSS, then that definitely sounds like a
bug...
Cheers
Trond
--
Trond Myklebust
Linux NFS client maintainer
NetApp
[email protected]
http://www.netapp.com
On 04/07/2011 08:42 AM, Jiri Slaby wrote:
> On 04/06/2011 10:44 PM, Myklebust, Trond wrote:
>> On Sat, 2011-04-02 at 10:56 +0200, Jiri Slaby wrote:
>>> On 03/31/2011 11:48 PM, [email protected] wrote:
>>>> The mm-of-the-moment snapshot 2011-03-31-14-48 has been uploaded to
>>>
>>> Hi, nfs client is defunct in this kernel. Tcpdump says:
>>> 10:51:55.489717 IP 10.20.11.33.759945860 > 10.20.3.2.2049: 132 getattr
>>> fh 0,0/24
>>> 10:51:55.515927 IP 10.20.3.2.2049 > 10.20.11.33.759945860: reply ok 44
>>> getattr ERROR: Operation not permitted
>>> 10:51:55.515949 IP 10.20.11.33.921 > 10.20.3.2.2049: Flags [.], ack
>>> 3569361440, win 115, options [nop,nop,TS val 599750 ecr 255058541],
>> length 0
>>> 10:52:04.130310 IP 10.20.11.33.793500292 > 10.20.3.2.2049: 76 getattr fh
>>> 0,0/24
>>> 10:52:04.152178 IP 10.20.3.2.2049 > 10.20.11.33.793500292: reply ok 44
>>> getattr ERROR: Operation not permitted
>>>
>>> If I run the same mount command (mount -oro,intr host:dir mountpoint)
>>> from within a virtual machine with 2.6.38.2 there, everything mounts OK.
>>
>> Does the attached patch help?
>
> No, still the operation not permitted in the tcpdump output and no mount.
The linux-next tree from 20110411 still doesn't work. The topmost commit in
fs/nfs/namespace.c is:
commit 418875900e3de4831c84f86ae4756690dac5be77
Author: Bryan Schumaker <[email protected]>
Date: Wed Apr 6 14:33:28 2011 -0400
NFS: Fix a signed vs. unsigned secinfo bug
I bisected it to the following commit (already in vanilla):
8f70e95f9f4159184f557a1db60c909d7c1bd2e3 is the first bad commit
commit 8f70e95f9f4159184f557a1db60c909d7c1bd2e3
Author: Bryan Schumaker <[email protected]>
Date: Thu Mar 24 17:12:31 2011 +0000
NFS: Determine initial mount security
When sec=<something> is not presented as a mount option,
we should attempt to determine what security flavor the
server is using.
Signed-off-by: Bryan Schumaker <[email protected]>
Signed-off-by: Trond Myklebust <[email protected]>
:040000 040000 8e5a640b37e00f0df21e1d9cd9aff160df2d5938
0152daa67bc8d12e32cda5f4a036807d2e380392 M fs
:040000 040000 f74aa33f8597cb82cd0fd7d90d84e0660b7f5804
527bc0ca6975cedc7e684b45dc9961f8aaf1207a M include
:040000 040000 87559d2f211ea905343a86c8551b6610dd239891
7e4ee0e5eddf12474b6de9e7fdb6218b6165bdb2 M net
thanks,
--
js
suse labs
On 04/12/2011 02:52 PM, Jiri Slaby wrote:
> On 04/12/2011 08:43 PM, Bryan Schumaker wrote:
>> On 04/12/2011 02:34 PM, Jiri Slaby wrote:
>>> On 04/12/2011 08:31 PM, Trond Myklebust wrote:
>>>>> Yes, it fixes the problem. But it waits 15s before it times out. This is
>>>>> inacceptable for automounted NFS dirs.
>>>>
>>>> I'm still confused as to why you are hitting it at all. In the normal
>>>> autonegotiation case, the client should be trying to use AUTH_SYS first
>>>> and then trying rpcsec_gss if and only if that fails.
>>>>
>>>> Are you really exporting a filesystem using AUTH_NULL as the only
>>>> supported flavour?
>>>
>>> I don't know, I connect to a nfs server which is not maintained by me.
>>> It looks like that. How can I find out?
>>
>> If you're not using gss for anything, you could try rmmod-ing rpcsec_gss_krb5 (and other rpcsec_gss_* modules).
>
> I don't have NFS in modules. It's all built-in. And this one is
> unconditionally selected because of CONFIG_NFS_V4.
Does this patch help?
- Bryan
We should attempt an AUTH_NULL style mount before
trying gss flavors. This should prevent a hang if
gss modules are loaded but the userspace program
isn't running.
diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
index 9bf41ea..4e3c16b 100644
--- a/fs/nfs/nfs4proc.c
+++ b/fs/nfs/nfs4proc.c
@@ -2218,8 +2218,8 @@ static int nfs4_proc_get_root(struct nfs_server *server, struct nfs_fh *fhandle,
 	rpc_authflavor_t flav_array[NFS_MAX_SECFLAVORS + 2];
 	flav_array[0] = RPC_AUTH_UNIX;
-	len = gss_mech_list_pseudoflavors(&flav_array[1]);
-	flav_array[1+len] = RPC_AUTH_NULL;
+	flav_array[1] = RPC_AUTH_NULL;
+	len = gss_mech_list_pseudoflavors(&flav_array[2]);
 	len += 2;
 	for (i = 0; i < len; i++) {
>
> regards,
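For reference, the reordering in the patch above can be sketched as two small user-space helpers. The names order_before() and order_after() are invented for this illustration, and the caller-supplied gss array stands in for whatever gss_mech_list_pseudoflavors() would fill in. The out buffer must have room for ngss + 2 entries.

```c
#include <assert.h>
#include <stddef.h>

enum { FLAVOR_AUTH_NULL = 0, FLAVOR_AUTH_UNIX = 1 };

/* Order before the patch: AUTH_UNIX, GSS pseudoflavors, AUTH_NULL. */
size_t order_before(unsigned int *out, const unsigned int *gss, size_t ngss)
{
	size_t i;

	assert(out != NULL && (gss != NULL || ngss == 0));
	out[0] = FLAVOR_AUTH_UNIX;
	for (i = 0; i < ngss; i++)
		out[1 + i] = gss[i];
	out[1 + ngss] = FLAVOR_AUTH_NULL;
	return ngss + 2;
}

/* Order after the patch: AUTH_UNIX, AUTH_NULL, GSS pseudoflavors. */
size_t order_after(unsigned int *out, const unsigned int *gss, size_t ngss)
{
	size_t i;

	assert(out != NULL && (gss != NULL || ngss == 0));
	out[0] = FLAVOR_AUTH_UNIX;
	out[1] = FLAVOR_AUTH_NULL;
	for (i = 0; i < ngss; i++)
		out[2 + i] = gss[i];
	return ngss + 2;
}
```

With a single GSS pseudoflavor 390003, order_before() yields 1, 390003, 0 and order_after() yields 1, 0, 390003, so the cheap AUTH_NULL probe happens before any GSS upcall is attempted.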
On 04/06/2011 10:44 PM, Myklebust, Trond wrote:
> On Sat, 2011-04-02 at 10:56 +0200, Jiri Slaby wrote:
>> On 03/31/2011 11:48 PM, [email protected] wrote:
>> > The mm-of-the-moment snapshot 2011-03-31-14-48 has been uploaded to
>>
>> Hi, nfs client is defunct in this kernel. Tcpdump says:
>> 10:51:55.489717 IP 10.20.11.33.759945860 > 10.20.3.2.2049: 132 getattr
>> fh 0,0/24
>> 10:51:55.515927 IP 10.20.3.2.2049 > 10.20.11.33.759945860: reply ok 44
>> getattr ERROR: Operation not permitted
>> 10:51:55.515949 IP 10.20.11.33.921 > 10.20.3.2.2049: Flags [.], ack
>> 3569361440, win 115, options [nop,nop,TS val 599750 ecr 255058541],
> length 0
>> 10:52:04.130310 IP 10.20.11.33.793500292 > 10.20.3.2.2049: 76 getattr fh
>> 0,0/24
>> 10:52:04.152178 IP 10.20.3.2.2049 > 10.20.11.33.793500292: reply ok 44
>> getattr ERROR: Operation not permitted
>>
>> If I run the same mount command (mount -oro,intr host:dir mountpoint)
>> from within a virtual machine with 2.6.38.2 there, everything mounts OK.
>
> Does the attached patch help?
No, tcpdump still shows "Operation not permitted" and the mount still fails.
thanks,
--
js
suse labs
On Tue, 2011-04-12 at 20:34 +0200, Jiri Slaby wrote:
> On 04/12/2011 08:31 PM, Trond Myklebust wrote:
> >> Yes, it fixes the problem. But it waits 15s before it times out. This is
> >> inacceptable for automounted NFS dirs.
> >
> > I'm still confused as to why you are hitting it at all. In the normal
> > autonegotiation case, the client should be trying to use AUTH_SYS first
> > and then trying rpcsec_gss if and only if that fails.
> >
> > Are you really exporting a filesystem using AUTH_NULL as the only
> > supported flavour?
>
> I don't know, I connect to a nfs server which is not maintained by me.
> It looks like that. How can I find out?
A wireshark trace of a successful mount would help.
--
Trond Myklebust
Linux NFS client maintainer
NetApp
[email protected]
http://www.netapp.com
On 04/11/2011 10:40 PM, Jiri Slaby wrote:
> Ccing Bryan
>
> On 04/11/2011 10:40 PM, Jiri Slaby wrote:
>> On 04/07/2011 08:42 AM, Jiri Slaby wrote:
>>> On 04/06/2011 10:44 PM, Myklebust, Trond wrote:
>>>> On Sat, 2011-04-02 at 10:56 +0200, Jiri Slaby wrote:
>>>>> On 03/31/2011 11:48 PM, [email protected] wrote:
>>>>>> The mm-of-the-moment snapshot 2011-03-31-14-48 has been uploaded to
>>>>>
>>>>> Hi, nfs client is defunct in this kernel. Tcpdump says:
>>>>> 10:51:55.489717 IP 10.20.11.33.759945860 > 10.20.3.2.2049: 132 getattr
>>>>> fh 0,0/24
>>>>> 10:51:55.515927 IP 10.20.3.2.2049 > 10.20.11.33.759945860: reply ok 44
>>>>> getattr ERROR: Operation not permitted
>>>>> 10:51:55.515949 IP 10.20.11.33.921 > 10.20.3.2.2049: Flags [.], ack
>>>>> 3569361440, win 115, options [nop,nop,TS val 599750 ecr 255058541],
>>>> length 0
>>>>> 10:52:04.130310 IP 10.20.11.33.793500292 > 10.20.3.2.2049: 76 getattr fh
>>>>> 0,0/24
>>>>> 10:52:04.152178 IP 10.20.3.2.2049 > 10.20.11.33.793500292: reply ok 44
>>>>> getattr ERROR: Operation not permitted
>>>>>
>>>>> If I run the same mount command (mount -oro,intr host:dir mountpoint)
>>>>> from within a virtual machine with 2.6.38.2 there, everything mounts OK.
>>>>
>>>> Does the attached patch help?
>>>
>>> No, still the operation not permitted in the tcpdump output and no mount.
>>
>> The next tree from 20110411 still doesn't work. The topmost commit in
>> fs/nfs/namespace.c is:
>> commit 418875900e3de4831c84f86ae4756690dac5be77
>> Author: Bryan Schumaker <[email protected]>
>> Date: Wed Apr 6 14:33:28 2011 -0400
>>
>> NFS: Fix a signed vs. unsigned secinfo bug
>>
>>
>> I bisected it to (in vanilla already):
>>
>> 8f70e95f9f4159184f557a1db60c909d7c1bd2e3 is the first bad commit
>> commit 8f70e95f9f4159184f557a1db60c909d7c1bd2e3
>> Author: Bryan Schumaker <[email protected]>
>> Date: Thu Mar 24 17:12:31 2011 +0000
>>
>> NFS: Determine initial mount security
>>
>> When sec=<something> is not presented as a mount option,
>> we should attempt to determine what security flavor the
>> server is using.
>>
>> Signed-off-by: Bryan Schumaker <[email protected]>
>> Signed-off-by: Trond Myklebust <[email protected]>
>>
>> :040000 040000 8e5a640b37e00f0df21e1d9cd9aff160df2d5938
>> 0152daa67bc8d12e32cda5f4a036807d2e380392 M fs
>> :040000 040000 f74aa33f8597cb82cd0fd7d90d84e0660b7f5804
>> 527bc0ca6975cedc7e684b45dc9961f8aaf1207a M include
>> :040000 040000 87559d2f211ea905343a86c8551b6610dd239891
>> 7e4ee0e5eddf12474b6de9e7fdb6218b6165bdb2 M net
Sorry for the extra message. I've just noticed that these messages
appear in dmesg:
[ 58.656048] RPC: AUTH_GSS upcall timed out.
[ 58.656050] Please check user daemon is running.
[ 88.656065] RPC: AUTH_GSS upcall timed out.
[ 88.656068] Please check user daemon is running.
[ 118.656077] RPC: AUTH_GSS upcall timed out.
[ 118.656080] Please check user daemon is running.
[ 148.656049] RPC: AUTH_GSS upcall timed out.
[ 148.656052] Please check user daemon is running.
[ 178.656046] RPC: AUTH_GSS upcall timed out.
[ 178.656049] Please check user daemon is running.
I instrumented the code and it's stuck trying RPC_AUTH_GSS_KRB5.
I don't use GSS at all.
regards,
--
js
suse labs
On 04/11/2011 10:56 PM, Bryan Schumaker wrote:
> On 04/11/2011 04:40 PM, Jiri Slaby wrote:
>> On 04/07/2011 08:42 AM, Jiri Slaby wrote:
>>> On 04/06/2011 10:44 PM, Myklebust, Trond wrote:
>>>> On Sat, 2011-04-02 at 10:56 +0200, Jiri Slaby wrote:
>>>>> On 03/31/2011 11:48 PM, [email protected] wrote:
>>>>>> The mm-of-the-moment snapshot 2011-03-31-14-48 has been uploaded to
>>>>>
>>>>> Hi, nfs client is defunct in this kernel. Tcpdump says:
>>>>> 10:51:55.489717 IP 10.20.11.33.759945860 > 10.20.3.2.2049: 132 getattr
>>>>> fh 0,0/24
>>>>> 10:51:55.515927 IP 10.20.3.2.2049 > 10.20.11.33.759945860: reply ok 44
>>>>> getattr ERROR: Operation not permitted
>>>>> 10:51:55.515949 IP 10.20.11.33.921 > 10.20.3.2.2049: Flags [.], ack
>>>>> 3569361440, win 115, options [nop,nop,TS val 599750 ecr 255058541],
>>>> length 0
>>>>> 10:52:04.130310 IP 10.20.11.33.793500292 > 10.20.3.2.2049: 76 getattr fh
>>>>> 0,0/24
>>>>> 10:52:04.152178 IP 10.20.3.2.2049 > 10.20.11.33.793500292: reply ok 44
>>>>> getattr ERROR: Operation not permitted
>>>>>
>>>>> If I run the same mount command (mount -oro,intr host:dir mountpoint)
>>>>> from within a virtual machine with 2.6.38.2 there, everything mounts OK.
>>>>
>>>> Does the attached patch help?
>>>
>>> No, still the operation not permitted in the tcpdump output and no mount.
>
> Does this patch help?
>
> - Bryan
>
> When attempting an initial mount, we should only attempt other
> authflavors if AUTH_UNIX receives a NFS4ERR_WRONGSEC error.
> This allows other errors to be passed back to userspace programs.
>
> Signed-off-by: Bryan Schumaker <[email protected]>
> ---
> diff --git a/fs/nfs/nfs4proc.c b/fs/nfs/nfs4proc.c
> index dfd1e6d..9bf41ea 100644
> --- a/fs/nfs/nfs4proc.c
> +++ b/fs/nfs/nfs4proc.c
> @@ -2204,8 +2204,6 @@ static int nfs4_lookup_root_sec(struct nfs_server *server, struct nfs_fh *fhandl
> goto out;
> }
> ret = nfs4_lookup_root(server, fhandle, info);
> - if (ret < 0)
> - ret = -EAGAIN;
> out:
> return ret;
> }
> @@ -2226,7 +2224,7 @@ static int nfs4_proc_get_root(struct nfs_server *server, struct nfs_fh *fhandle,
>
> for (i = 0; i < len; i++) {
No. The patch only fixes my problem after I add the following test here:
if (flav_array[i] > 100)
continue;
Without this test it still loops inside the GSS auth create function, printing:
RPC: AUTH_GSS upcall timed out.
> status = nfs4_lookup_root_sec(server, fhandle, info, flav_array[i]);
> - if (status == 0)
> + if (status != -EPERM)
> break;
> }
> if (status == 0)
thanks,
--
js
suse labs
Ccing Bryan
On 04/11/2011 10:40 PM, Jiri Slaby wrote:
> On 04/07/2011 08:42 AM, Jiri Slaby wrote:
>> On 04/06/2011 10:44 PM, Myklebust, Trond wrote:
>>> On Sat, 2011-04-02 at 10:56 +0200, Jiri Slaby wrote:
>>>> On 03/31/2011 11:48 PM, [email protected] wrote:
>>>>> The mm-of-the-moment snapshot 2011-03-31-14-48 has been uploaded to
>>>>
>>>> Hi, nfs client is defunct in this kernel. Tcpdump says:
>>>> 10:51:55.489717 IP 10.20.11.33.759945860 > 10.20.3.2.2049: 132 getattr
>>>> fh 0,0/24
>>>> 10:51:55.515927 IP 10.20.3.2.2049 > 10.20.11.33.759945860: reply ok 44
>>>> getattr ERROR: Operation not permitted
>>>> 10:51:55.515949 IP 10.20.11.33.921 > 10.20.3.2.2049: Flags [.], ack
>>>> 3569361440, win 115, options [nop,nop,TS val 599750 ecr 255058541],
>>> length 0
>>>> 10:52:04.130310 IP 10.20.11.33.793500292 > 10.20.3.2.2049: 76 getattr fh
>>>> 0,0/24
>>>> 10:52:04.152178 IP 10.20.3.2.2049 > 10.20.11.33.793500292: reply ok 44
>>>> getattr ERROR: Operation not permitted
>>>>
>>>> If I run the same mount command (mount -oro,intr host:dir mountpoint)
>>>> from within a virtual machine with 2.6.38.2 there, everything mounts OK.
>>>
>>> Does the attached patch help?
>>
>> No, still the operation not permitted in the tcpdump output and no mount.
>
> The next tree from 20110411 still doesn't work. The topmost commit in
> fs/nfs/namespace.c is:
> commit 418875900e3de4831c84f86ae4756690dac5be77
> Author: Bryan Schumaker <[email protected]>
> Date: Wed Apr 6 14:33:28 2011 -0400
>
> NFS: Fix a signed vs. unsigned secinfo bug
>
>
> I bisected it to (in vanilla already):
>
> 8f70e95f9f4159184f557a1db60c909d7c1bd2e3 is the first bad commit
> commit 8f70e95f9f4159184f557a1db60c909d7c1bd2e3
> Author: Bryan Schumaker <[email protected]>
> Date: Thu Mar 24 17:12:31 2011 +0000
>
> NFS: Determine initial mount security
>
> When sec=<something> is not presented as a mount option,
> we should attempt to determine what security flavor the
> server is using.
>
> Signed-off-by: Bryan Schumaker <[email protected]>
> Signed-off-by: Trond Myklebust <[email protected]>
>
> :040000 040000 8e5a640b37e00f0df21e1d9cd9aff160df2d5938
> 0152daa67bc8d12e32cda5f4a036807d2e380392 M fs
> :040000 040000 f74aa33f8597cb82cd0fd7d90d84e0660b7f5804
> 527bc0ca6975cedc7e684b45dc9961f8aaf1207a M include
> :040000 040000 87559d2f211ea905343a86c8551b6610dd239891
> 7e4ee0e5eddf12474b6de9e7fdb6218b6165bdb2 M net
>
> thanks,
--
js
suse labs