2019-03-14 19:32:31

by Marc Dionne

Subject: Hanging nfs mount - bisected to merge commit 06b5fc3ad94e

I have an odd NFS issue with the current mainline kernel: a mount
hangs and eventually times out when it requests NFS version 4 and the
server does not offer that version. With prior kernels it fails
quickly with "mount.nfs: Protocol not supported".

To reproduce, I have rpc.nfsd start with "-N 4", then try to mount
something local with -o nfsvers=4, for instance:
$ mount -t nfs -o nfsvers=4 localhost:/s /mnt

I bisected the behaviour down to commit 06b5fc3ad94e ("Merge tag
'nfs-rdma-for-5.1-1' of
git://git.linux-nfs.org/projects/anna/linux-nfs"). It's a merge
commit, but one that has quite a bit of conflict resolution. I also
verified that the 2 parent commits of the merge (5085607d2091 and
2c94b8eca1a2) both behave as old kernels did.

Thanks
Marc


2019-03-15 17:54:55

by Trond Myklebust

Subject: Re: Hanging nfs mount - bisected to merge commit 06b5fc3ad94e

Hi Marc

On Thu, 2019-03-14 at 16:32 -0300, Marc Dionne wrote:
> I have an odd NFS issue with the current mainline kernel: a mount
> hangs and eventually times out when it requests NFS version 4 and the
> server does not offer that version. With prior kernels it fails
> quickly with "mount.nfs: Protocol not supported".
>
> To reproduce, I have rpc.nfsd start with "-N 4", then try to mount
> something local with -o nfsvers=4, for instance:
> $ mount -t nfs -o nfsvers=4 localhost:/s /mnt
>
> I bisected the behaviour down to commit 06b5fc3ad94e ("Merge tag
> 'nfs-rdma-for-5.1-1' of
> git://git.linux-nfs.org/projects/anna/linux-nfs"). It's a merge
> commit, but one that has quite a bit of conflict resolution. I also
> verified that the 2 parent commits of the merge (5085607d2091 and
> 2c94b8eca1a2) both behave as old kernels did.
>
> Thanks
> Marc

Can you please check if the following patch and/or the 'testing' branch
in
http://git.linux-nfs.org/?p=trondmy/linux-nfs.git;a=shortlog;h=refs/heads/testing
fixes the issue?

Thanks
Trond
8<----------------------------------------------
From 513149607d19bc3821386fb5ac75f8b99fd4b115 Mon Sep 17 00:00:00 2001
From: Trond Myklebust <[email protected]>
Date: Fri, 15 Mar 2019 12:55:59 -0400
Subject: [PATCH 2/6] SUNRPC: Fix the minimal size for reply buffer allocation

We must at minimum allocate enough memory to be able to see any auth
errors in the reply from the server.

Fixes: 2c94b8eca1a26 ("SUNRPC: Use au_rslack when computing reply...")
Signed-off-by: Trond Myklebust <[email protected]>
---
net/sunrpc/clnt.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index 4216fe33204a..310873895578 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -1730,7 +1730,12 @@ call_allocate(struct rpc_task *task)
 	req->rq_callsize = RPC_CALLHDRSIZE + (auth->au_cslack << 1) +
 			   proc->p_arglen;
 	req->rq_callsize <<= 2;
-	req->rq_rcvsize = RPC_REPHDRSIZE + auth->au_rslack + proc->p_replen;
+	/*
+	 * Note: the reply buffer must at minimum allocate enough space
+	 * for the 'struct accepted_reply' from RFC5531.
+	 */
+	req->rq_rcvsize = RPC_REPHDRSIZE + auth->au_rslack + \
+			max_t(size_t, proc->p_replen, 2);
 	req->rq_rcvsize <<= 2;

 	status = xprt->ops->buf_alloc(task);
--
2.20.1

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]


2019-03-15 19:19:07

by Marc Dionne

Subject: Re: Hanging nfs mount - bisected to merge commit 06b5fc3ad94e

On Fri, Mar 15, 2019 at 2:54 PM Trond Myklebust <[email protected]> wrote:
>
> Hi Marc
>
> On Thu, 2019-03-14 at 16:32 -0300, Marc Dionne wrote:
> > I have an odd NFS issue with the current mainline kernel: a mount
> > hangs and eventually times out when it requests NFS version 4 and the
> > server does not offer that version. With prior kernels it fails
> > quickly with "mount.nfs: Protocol not supported".
> >
> > To reproduce, I have rpc.nfsd start with "-N 4", then try to mount
> > something local with -o nfsvers=4, for instance:
> > $ mount -t nfs -o nfsvers=4 localhost:/s /mnt
> >
> > I bisected the behaviour down to commit 06b5fc3ad94e ("Merge tag
> > 'nfs-rdma-for-5.1-1' of
> > git://git.linux-nfs.org/projects/anna/linux-nfs"). It's a merge
> > commit, but one that has quite a bit of conflict resolution. I also
> > verified that the 2 parent commits of the merge (5085607d2091 and
> > 2c94b8eca1a2) both behave as old kernels did.
> >
> > Thanks
> > Marc
>
> Can you please check if the following patch and/or the 'testing' branch
> in
> http://git.linux-nfs.org/?p=trondmy/linux-nfs.git;a=shortlog;h=refs/heads/testing
> fixes the issue?
>
> Thanks
> Trond
> 8<----------------------------------------------
> From 513149607d19bc3821386fb5ac75f8b99fd4b115 Mon Sep 17 00:00:00 2001
> From: Trond Myklebust <[email protected]>
> Date: Fri, 15 Mar 2019 12:55:59 -0400
> Subject: [PATCH 2/6] SUNRPC: Fix the minimal size for reply buffer allocation
>
> We must at minimum allocate enough memory to be able to see any auth
> errors in the reply from the server.
>
> Fixes: 2c94b8eca1a26 ("SUNRPC: Use au_rslack when computing reply...")
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> net/sunrpc/clnt.c | 7 ++++++-
> 1 file changed, 6 insertions(+), 1 deletion(-)
>
> diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
> index 4216fe33204a..310873895578 100644
> --- a/net/sunrpc/clnt.c
> +++ b/net/sunrpc/clnt.c
> @@ -1730,7 +1730,12 @@ call_allocate(struct rpc_task *task)
>  	req->rq_callsize = RPC_CALLHDRSIZE + (auth->au_cslack << 1) +
>  			   proc->p_arglen;
>  	req->rq_callsize <<= 2;
> -	req->rq_rcvsize = RPC_REPHDRSIZE + auth->au_rslack + proc->p_replen;
> +	/*
> +	 * Note: the reply buffer must at minimum allocate enough space
> +	 * for the 'struct accepted_reply' from RFC5531.
> +	 */
> +	req->rq_rcvsize = RPC_REPHDRSIZE + auth->au_rslack + \
> +			max_t(size_t, proc->p_replen, 2);
>  	req->rq_rcvsize <<= 2;
>
>  	status = xprt->ops->buf_alloc(task);
> --
> 2.20.1

Hi Trond,

I can confirm that this patch does fix my issue.

Thanks,
Marc