2013-12-20 17:10:46

by Gareth Williams

Subject: Question ref Running NFS at V4 Only

Hi,

I'm trying to run NFS with protocol version 4 only (that is, with v2 &
v3 disabled) on a CentOS 6.5 install running as a KVM guest.

The RedHat documentation (amongst others) states that rpcbind isn't
needed with v4, but if I start nfs without rpcbind I get errors.

I've spent a couple of days (on and off) on Google trying to get an
answer and have posted on the CentOS forum, but the nearest I can find
is an archive on this mailing list from over two years ago (and it's not
identical):-

http://www.spinics.net/lists/linux-nfs/msg16907.html

In /etc/sysconfig/nfs I have:-

MOUNTD_NFS_V2="no"
MOUNTD_NFS_V3="no"

RPCNFSDARGS="-N 2 -N 3"

When I attempt to start NFS either using the init scripts or manually
with:-

rpc.nfsd -N 2 -N 3

I get:-

rpc.nfsd: writing fd to kernel failed: errno 111 (Connection refused)
rpc.nfsd: unable to set any sockets for nfsd

The mailing list archive's answer was that the kernel was too old, so I
installed 3.10.24 from elrepo but the symptom doesn't change.

If I start rpcbind, then everything works, but as far as I can see, I
shouldn't have to do that unless I'm supporting v2 & v3.

Finally, please accept my apologies for wasting your time if I've missed
something obvious.

Kind regards,

Gareth
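
For readers decoding the report above: errno 111 is ECONNREFUSED on Linux, the error a loopback connect returns when nothing is listening on the target port. The following is a minimal userspace sketch, not taken from the thread, for checking whether anything answers on rpcbind's well-known port 111. The helper name `probe_tcp_port` is hypothetical, and the sketch assumes IPv4 loopback.

```c
#include <arpa/inet.h>
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper: try a TCP connect to 127.0.0.1:port.
 * Returns 0 if something accepted the connection, otherwise the errno
 * from socket() or connect() (ECONNREFUSED when nothing is listening). */
int probe_tcp_port(unsigned short port)
{
    struct sockaddr_in addr = {0};
    int fd, err = 0;

    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return errno;

    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        err = errno;

    close(fd);
    return err;
}
```

Calling `probe_tcp_port(111)` on the affected guest would be expected to return ECONNREFUSED while rpcbind is stopped and 0 once it is running.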


2013-12-27 19:40:37

by Chuck Lever III

Subject: Re: Question ref Running NFS at V4 Only


On Dec 27, 2013, at 1:43 PM, J. Bruce Fields <[email protected]> wrote:

> On Fri, Dec 27, 2013 at 11:05:05AM -0500, Chuck Lever wrote:
>> Hi-
>>
>> On Dec 27, 2013, at 5:17 AM, Kinglong Mee <[email protected]> wrote:
>>
>>> On 12/24/2013 01:39 AM, J. Bruce Fields wrote:
>>>> On Fri, Dec 20, 2013 at 05:10:42PM +0000, Gareth Williams wrote:
>>>>> Hi,
>>>>>
>>>>> I'm trying to run NFS with protocol version 4 only (that is, with v2
>>>>> & v3 disabled) on a CentOS 6.5 install running as a KVM guest.
>>>>>
>>>>> The RedHat documentation (amongst others) states that rpcbind isn't
>>>>> needed with v4, but if I start nfs without rpcbind I get errors.
>>>>
>>>> I suspect the kernel code needs to be fixed to not attempt to register
>>>> with rpcbind in the v4-only case. (Or to attempt to register but ignore
>>>> any error, I'm not sure which is best.)
>>>>
>>>> And this may not be the only issue in the v4-only case. This isn't
>>>> really a priority for me right now, but I'd happily look at patches.
>>>
>>> Hi all,
>>>
>>> I made a patch for this problem; please take a look, thanks.
>>>
>>> From 64c1f96348213f39b9411ab25699a292edbef4ef Mon Sep 17 00:00:00 2001
>>> From: Kinglong Mee <[email protected]>
>>> Date: Fri, 27 Dec 2013 18:06:25 +0800
>>> Subject: [PATCH] NFSD: supports nfsv4 service without rpcbind
>>>
>>> 1. set vs_hidden in nfsd_version4 to avoid registering NFSv4 with rpcbind
>>
>> IMO we do want the NFS port registered if rpcbind is running. NFSv4 is not a hidden service, like the client's callback server which can only be discovered by a forward advertisement (SETCLIENTID).
>>
>> I think I prefer ignoring the rpcb_set error for NFSv4.
>
> Agreed. My only concern would be that there be no unnecessary delays or
> errors logged in the v4-only case if rpcbind isn't running.

I believe the rpcb_set upcall now uses the AF_LOCAL transport, which should be able to detect immediately that rpcbind is not listening.

The OP did not report a delay or hang, thankfully.

>
> --b.
>
>>
>>
>>> 2. don't start lockd when only NFSv4 is supported.
>>>
>>> Reported-by: Gareth Williams <[email protected]>
>>> Signed-off-by: Kinglong Mee <[email protected]>
>>> ---
>>> fs/nfsd/netns.h | 3 +++
>>> fs/nfsd/nfs4proc.c | 1 +
>>> fs/nfsd/nfsctl.c | 3 +++
>>> fs/nfsd/nfssvc.c | 21 ++++++++++++++++-----
>>> 4 files changed, 23 insertions(+), 5 deletions(-)
>>>
>>> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
>>> index 849a7c3..ae2c179 100644
>>> --- a/fs/nfsd/netns.h
>>> +++ b/fs/nfsd/netns.h
>>> @@ -96,6 +96,9 @@ struct nfsd_net {
>>>
>>> bool nfsd_net_up;
>>>
>>> + bool lockd_up;
>>> + u32 nfsd_needs_lockd;
>>> +
>>> /*
>>> * Time of server startup
>>> */
>>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
>>> index 419572f..1496376 100644
>>> --- a/fs/nfsd/nfs4proc.c
>>> +++ b/fs/nfsd/nfs4proc.c
>>> @@ -1881,6 +1881,7 @@ struct svc_version nfsd_version4 = {
>>> .vs_proc = nfsd_procedures4,
>>> .vs_dispatch = nfsd_dispatch,
>>> .vs_xdrsize = NFS4_SVC_XDRSIZE,
>>> + .vs_hidden = 1,
>>> };
>>>
>>> /*
>>> diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
>>> index 7f55517..8c7b0f0 100644
>>> --- a/fs/nfsd/nfsctl.c
>>> +++ b/fs/nfsd/nfsctl.c
>>> @@ -575,6 +575,9 @@ static ssize_t __write_versions(struct file *file, char *buf, size_t size)
>>> switch(num) {
>>> case 2:
>>> case 3:
>>> + nfsd_vers(num, sign == '-' ? NFSD_CLEAR : NFSD_SET);
>>> + nn->nfsd_needs_lockd = nfsd_vers(num, NFSD_TEST);
>>> + break;
>>> case 4:
>>> nfsd_vers(num, sign == '-' ? NFSD_CLEAR : NFSD_SET);
>>> break;
>>> diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
>>> index 760c85a..2b841d8 100644
>>> --- a/fs/nfsd/nfssvc.c
>>> +++ b/fs/nfsd/nfssvc.c
>>> @@ -255,9 +255,14 @@ static int nfsd_startup_net(int nrservs, struct net *net)
>>> ret = nfsd_init_socks(net);
>>> if (ret)
>>> goto out_socks;
>>> - ret = lockd_up(net);
>>> - if (ret)
>>> - goto out_socks;
>>> +
>>> + if (nn->nfsd_needs_lockd && !nn->lockd_up) {
>>> + ret = lockd_up(net);
>>> + if (ret)
>>> + goto out_socks;
>>> + nn->lockd_up = 1;
>>> + }
>>> +
>>> ret = nfs4_state_start_net(net);
>>> if (ret)
>>> goto out_lockd;
>>> @@ -266,7 +271,10 @@ static int nfsd_startup_net(int nrservs, struct net *net)
>>> return 0;
>>>
>>> out_lockd:
>>> - lockd_down(net);
>>> + if (nn->lockd_up) {
>>> + lockd_down(net);
>>> + nn->lockd_up = 0;
>>> + }
>>> out_socks:
>>> nfsd_shutdown_generic();
>>> return ret;
>>> @@ -277,7 +285,10 @@ static void nfsd_shutdown_net(struct net *net)
>>> struct nfsd_net *nn = net_generic(net, nfsd_net_id);
>>>
>>> nfs4_state_shutdown_net(net);
>>> - lockd_down(net);
>>> + if (nn->lockd_up) {
>>> + lockd_down(net);
>>> + nn->lockd_up = 0;
>>> + }
>>> nn->nfsd_net_up = false;
>>> nfsd_shutdown_generic();
>>> }
>>> --
>>> 1.8.4.2
>>>
>>>
>>> --
>>> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
>>> the body of a message to [email protected]
>>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>> --
>> Chuck Lever
>> chuck[dot]lever[at]oracle[dot]com
>>
>>
>>
>>

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com
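
The fast-failure behaviour Chuck describes for AF_LOCAL can be seen from userspace: connecting to a UNIX-domain socket path that does not exist, or that has no listener, fails at once instead of waiting for a timeout. Below is a minimal sketch with a hypothetical helper name, `probe_local_socket`. The path `/var/run/rpcbind.sock` mentioned afterwards is an assumption about where rpcbind places its socket, not something stated in this thread.

```c
#include <errno.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

/* Hypothetical helper: connect to an AF_LOCAL stream socket at `path`.
 * Returns 0 on success, or the errno from socket()/connect().
 * A missing socket file yields ENOENT; a socket file with no listener
 * yields ECONNREFUSED. Neither case waits on a network timeout. */
int probe_local_socket(const char *path)
{
    struct sockaddr_un addr;
    int fd, err = 0;

    /* sun_path is a fixed-size array; refuse paths that do not fit. */
    if (strlen(path) >= sizeof(addr.sun_path))
        return ENAMETOOLONG;

    fd = socket(AF_LOCAL, SOCK_STREAM, 0);
    if (fd < 0)
        return errno;

    memset(&addr, 0, sizeof(addr));
    addr.sun_family = AF_LOCAL;
    strcpy(addr.sun_path, path);

    if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0)
        err = errno;

    close(fd);
    return err;
}
```

With that assumed path, `probe_local_socket("/var/run/rpcbind.sock")` should return a nonzero errno immediately when rpcbind is not running.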





2013-12-30 11:24:32

by Kinglong Mee

Subject: [PATCH 1/2] SUNRPC: supports ignoring error from svc_register

NFSv4 needs to ignore the error returned by svc_register,
so add a flag to svc_version for that.

Signed-off-by: Kinglong Mee <[email protected]>
---
include/linux/sunrpc/svc.h | 1 +
net/sunrpc/svc.c | 25 +++++++++++++++++--------
2 files changed, 18 insertions(+), 8 deletions(-)

diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index 6eecfc2..9ca8280 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -388,6 +388,7 @@ struct svc_version {

 	unsigned int vs_hidden : 1; /* Don't register with portmapper.
 				     * Only used for nfsacl so far. */
+	unsigned int vs_ignore_err : 1;

 	/* Override dispatch function (e.g. when caching replies).
 	 * A return value of 0 means drop the request.
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index e7fbe36..7dba3c3 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -916,9 +916,6 @@ static int __svc_register(struct net *net, const char *progname,
 #endif
 	}

-	if (error < 0)
-		printk(KERN_WARNING "svc: failed to register %sv%u RPC "
-			"service (errno %d).\n", progname, version, -error);
 	return error;
 }

@@ -937,6 +934,7 @@ int svc_register(const struct svc_serv *serv, struct net *net,
 		 const unsigned short port)
 {
 	struct svc_program *progp;
+	struct svc_version *vers;
 	unsigned int i;
 	int error = 0;

@@ -946,7 +944,8 @@ int svc_register(const struct svc_serv *serv, struct net *net,

 	for (progp = serv->sv_program; progp; progp = progp->pg_next) {
 		for (i = 0; i < progp->pg_nvers; i++) {
-			if (progp->pg_vers[i] == NULL)
+			vers = progp->pg_vers[i];
+			if (vers == NULL)
 				continue;

 			dprintk("svc: svc_register(%sv%d, %s, %u, %u)%s\n",
@@ -955,16 +954,26 @@ int svc_register(const struct svc_serv *serv, struct net *net,
 					proto == IPPROTO_UDP? "udp" : "tcp",
 					port,
 					family,
-					progp->pg_vers[i]->vs_hidden?
-						" (but not telling portmap)" : "");
+					vers->vs_hidden ?
+						" (but not telling portmap)" : "");

-			if (progp->pg_vers[i]->vs_hidden)
+			if (vers->vs_hidden)
 				continue;

 			error = __svc_register(net, progp->pg_name, progp->pg_prog,
 						i, family, proto, port);
-			if (error < 0)
+
+			if (vers->vs_ignore_err) {
+				error = 0;
+				continue;
+			}
+
+			if (error < 0) {
+				printk(KERN_WARNING "svc: failed to register "
+					"%sv%u RPC service (errno %d).\n",
+					progp->pg_name, i, -error);
 				break;
+			}
 		}
 	}

--
1.8.4.2
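
The control flow this patch introduces can be modelled in userspace. What follows is a simplified sketch, not kernel code: `struct version`, the `register_fn` callback, `refuse_all`, and `register_all` are hypothetical stand-ins for `svc_version`, `__svc_register`, a missing rpcbind, and the inner loop of `svc_register`.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Simplified stand-in for struct svc_version (illustrative only). */
struct version {
    unsigned int number;
    unsigned int hidden : 1;      /* never registered */
    unsigned int ignore_err : 1;  /* registration failure is not fatal */
};

/* Stands in for __svc_register(); returns 0 or a negative errno. */
typedef int (*register_fn)(unsigned int version);

/* Stub that fails the way registration does when rpcbind is absent. */
int refuse_all(unsigned int version)
{
    (void)version;
    return -ECONNREFUSED;
}

/* Walk the version table the way the patched inner loop does. */
int register_all(struct version *const *vers, size_t n, register_fn reg)
{
    int error = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        if (vers[i] == NULL || vers[i]->hidden)
            continue;

        error = reg(vers[i]->number);

        /* Versions flagged ignore_err swallow the failure silently. */
        if (vers[i]->ignore_err) {
            error = 0;
            continue;
        }

        if (error < 0) {
            fprintf(stderr, "failed to register v%u (errno %d)\n",
                    vers[i]->number, -error);
            break;
        }
    }
    return error;
}
```

In this model a table holding only a version with `ignore_err` set registers "successfully" even though every call fails, while a table that also holds an unflagged version still aborts with the error.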


2013-12-30 11:22:47

by Kinglong Mee

Subject: Re: Question ref Running NFS at V4 Only

On 12/29/2013 05:11 PM, Kinglong Mee wrote:
> Hi all,
>
> I found that commit 561ec1603171cd9b38dcf6cac53e8710f437a48d
> "SUNRPC: call_connect_status should recheck bind and connect status on error"
> causes the loop. Without that commit, I get the error immediately, as
> Gareth Williams reported.
>
> I will prepare a patch for this problem on a kernel without that commit.
> Before that, we need to fix the loop.

After reverting that commit, two new patches for this problem will be sent out.

thanks,
Kinglong Mee



2013-12-29 09:12:00

by Kinglong Mee

Subject: Re: Question ref Running NFS at V4 Only

Hi all,

I found that commit 561ec1603171cd9b38dcf6cac53e8710f437a48d
"SUNRPC: call_connect_status should recheck bind and connect status on error"
causes the loop. Without that commit, I get the error immediately, as
Gareth Williams reported.

I will prepare a patch for this problem on a kernel without that commit.
Before that, we need to fix the loop.

PS: cc Trond

thanks,
Kinglong Mee

On Sun, Dec 29, 2013 at 4:17 PM, Kinglong Mee <[email protected]> wrote:
> After enabling the debug log, I found rpc.nfsd hanging in a loop in __rpc_execute.
>
> [ 6179.978202] RPC: 1 sync task resuming
> [ 6179.981254] RPC: 1 xprt_connect_status: retrying
> [ 6179.984289] RPC: 1 call_connect_status (status -11)
> [ 6179.987292] RPC: 1 call_bind (status 0)
> [ 6179.990273] RPC: 1 call_connect xprt da4d5000 is not connected
> [ 6179.993271] RPC: 1 xprt_connect xprt da4d5000 is not connected
> [ 6179.996196] RPC: 1 sleep_on(queue "xprt_pending" time 5876962)
> [ 6179.999043] RPC: 1 added to queue da4d518c "xprt_pending"
> [ 6180.001885] RPC: 1 setting alarm for 60000 ms
> [ 6180.004725] RPC: xs_connect scheduled xprt da4d5000
> [ 6180.007549] RPC: 1 sync task going to sleep
> [ 6180.049927] RPC: disconnecting xprt da4d5000 to reuse port
> [ 6180.054460] RPC: AF_UNSPEC connect return code 0
> [ 6180.059560] RPC: worker connecting xprt da4d5000 via tcp to 127.0.0.1 (port 111)
> [ 6180.062384] RPC: xs_tcp_state_change client da4d5000...
> [ 6180.065013] RPC: state 7 conn 0 dead 0 zapped 1 sk_shutdown 3
> [ 6180.067891] RPC: disconnected transport da4d5000
> [ 6180.070465] RPC: 1 __rpc_wake_up_task (now 5877036)
> [ 6180.073014] RPC: 1 disabling timer
> [ 6180.075553] RPC: 1 removed from queue da4d518c "xprt_pending"
> [ 6180.078036] RPC: __rpc_wake_up_task done
> [ 6180.080545] RPC: da4d5000 connect status 115 connected 0 sock state 7
> [ 6180.085953] RPC: 1 sync task resuming
> [ 6180.088376] RPC: 1 xprt_connect_status: retrying
> [ 6180.090699] RPC: 1 call_connect_status (status -11)
>
> thanks,
> Kinglong Mee
>
> On Dec 29, 2013, at 2:39 PM, Kinglong Mee <[email protected]> wrote:
>
> I got this trace when rpc.nfsd hung.
>
> Dec 29 14:25:12 localhost kernel: [ 1224.449293] rpc.nfsd D c0d94300 0 1199 991 0x00000080
> Dec 29 14:25:12 localhost kernel: [ 1224.451347] d9a9bc98 00000086 c046aa48 c0d94300 ddb66540 c0d79300 58625871 0000011c
> Dec 29 14:25:12 localhost kernel: [ 1224.453426] c0d79300 dfff4300 dcd93a80 c0461738 00000000 c0d94300 00000000 00000020
> Dec 29 14:25:12 localhost kernel: [ 1224.455701] da4d53a0 00000020 ddb66540 da4d53b0 d9a9bc84 c046add9 dcd93a80 00000292
> Dec 29 14:25:12 localhost kernel: [ 1224.457853] Call Trace:
> Dec 29 14:25:12 localhost kernel: [ 1224.459919] [<c046aa48>] ? insert_work+0x38/0x80
> Dec 29 14:25:12 localhost kernel: [ 1224.462055] [<c0461738>] ? mod_timer+0xe8/0x1c0
> Dec 29 14:25:12 localhost kernel: [ 1224.464230] [<c046add9>] ? __queue_delayed_work+0x89/0x140
> Dec 29 14:25:12 localhost kernel: [ 1224.466304] [<e092d970>] ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
> Dec 29 14:25:12 localhost kernel: [ 1224.468561] [<c09bd543>] schedule+0x23/0x60
> Dec 29 14:25:12 localhost kernel: [ 1224.470671] [<e092d99d>] rpc_wait_bit_killable+0x2d/0x80 [sunrpc]
> Dec 29 14:25:12 localhost kernel: [ 1224.472785] [<c09bdae1>] __wait_on_bit+0x51/0x70
> Dec 29 14:25:12 localhost kernel: [ 1224.474974] [<e092d970>] ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
> Dec 29 14:25:12 localhost kernel: [ 1224.477170] [<e092d970>] ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
> Dec 29 14:25:12 localhost kernel: [ 1224.479389] [<c09bdb5b>] out_of_line_wait_on_bit+0x5b/0x70
> Dec 29 14:25:12 localhost kernel: [ 1224.481739] [<c048e440>] ? autoremove_wake_function+0x40/0x40
> Dec 29 14:25:12 localhost kernel: [ 1224.483881] [<e092e703>] __rpc_execute+0x1f3/0x3a0 [sunrpc]
> Dec 29 14:25:12 localhost kernel: [ 1224.486030] [<c0516283>] ? mempool_alloc_slab+0x13/0x20
> Dec 29 14:25:12 localhost kernel: [ 1224.488194] [<c051638e>] ? mempool_alloc+0x3e/0x100
> Dec 29 14:25:12 localhost kernel: [ 1224.490307] [<e0925d80>] ? call_bind_status+0x260/0x260 [sunrpc]
> Dec 29 14:25:12 localhost kernel: [ 1224.492559] [<c048e09c>] ? wake_up_bit+0x1c/0x20
> Dec 29 14:25:12 localhost kernel: [ 1224.494844] [<e092f866>] rpc_execute+0x56/0x90 [sunrpc]
> Dec 29 14:25:12 localhost kernel: [ 1224.496935] [<e0926cf9>] rpc_run_task+0x59/0x70 [sunrpc]
> Dec 29 14:25:12 localhost kernel: [ 1224.499122] [<e0926d4c>] rpc_call_sync+0x3c/0x90 [sunrpc]
> Dec 29 14:25:12 localhost kernel: [ 1224.501260] [<e0926de8>] rpc_ping+0x48/0x60 [sunrpc]
> Dec 29 14:25:12 localhost kernel: [ 1224.503367] [<e092703b>] rpc_bind_new_program+0x4b/0x70 [sunrpc]
> Dec 29 14:25:12 localhost kernel: [ 1224.505608] [<e0938333>] rpcb_create_local+0x163/0x1f0 [sunrpc]
> Dec 29 14:25:12 localhost kernel: [ 1224.507720] [<e0932199>] ? __svc_create+0x119/0x1f0 [sunrpc]
> Dec 29 14:25:12 localhost kernel: [ 1224.509830] [<e0932016>] svc_rpcb_setup+0x16/0x30 [sunrpc]
> Dec 29 14:25:12 localhost kernel: [ 1224.511963] [<e0932052>] svc_bind+0x22/0x30 [sunrpc]
> Dec 29 14:25:12 localhost kernel: [ 1224.514056] [<e09b73a4>] nfsd_create_serv+0xc4/0x1d0 [nfsd]
> Dec 29 14:25:12 localhost kernel: [ 1224.516844] [<e09b7600>] ? nfsd_destroy+0x70/0x70 [nfsd]
> Dec 29 14:25:12 localhost kernel: [ 1224.518870] [<e09b8d1f>] write_ports+0x21f/0x2b0 [nfsd]
> Dec 29 14:25:12 localhost kernel: [ 1224.521359] [<c06a182c>] ? _copy_from_user+0x2c/0x40
> Dec 29 14:25:12 localhost kernel: [ 1224.523358] [<c05854fe>] ? simple_transaction_get+0x8e/0xa0
> Dec 29 14:25:12 localhost kernel: [ 1224.525383] [<e09b8b00>] ? write_recoverydir+0xf0/0xf0 [nfsd]
> Dec 29 14:25:12 localhost kernel: [ 1224.527386] [<e09b7f3b>] nfsctl_transaction_write+0x3b/0x60 [nfsd]
> Dec 29 14:25:12 localhost kernel: [ 1224.529302] [<e09b7f00>] ? export_features_show+0x30/0x30 [nfsd]
> Dec 29 14:25:12 localhost kernel: [ 1224.531366] [<c0564695>] vfs_write+0x95/0x1c0
> Dec 29 14:25:12 localhost kernel: [ 1224.533401] [<c0564d49>] SyS_write+0x49/0x90
> Dec 29 14:25:12 localhost kernel: [ 1224.535380] [<c09c730d>] sysenter_do_call+0x12/0x28
>
> thanks,
> Kinglong Mee
>
> 2013/12/28 Kinglong Mee <[email protected]>:
>
>
> On Dec 28, 2013, at 3:40 AM, Chuck Lever <[email protected]> wrote:
>
>
> On Dec 27, 2013, at 1:43 PM, J. Bruce Fields <[email protected]> wrote:
>
> On Fri, Dec 27, 2013 at 11:05:05AM -0500, Chuck Lever wrote:
>
> Hi-
>
> On Dec 27, 2013, at 5:17 AM, Kinglong Mee <[email protected]> wrote:
>
> On 12/24/2013 01:39 AM, J. Bruce Fields wrote:
>
> On Fri, Dec 20, 2013 at 05:10:42PM +0000, Gareth Williams wrote:
>
> Hi,
>
> I'm trying to run NFS with protocol version 4 only (that is, with v2
> & v3 disabled) on a CentOS 6.5 install running as a KVM guest.
>
> The RedHat documentation (amongst others) states that rpcbind isn't
> needed with v4, but if I start nfs without rpcbind I get errors.
>
>
> I suspect the kernel code needs to be fixed to not attempt to register
> with rpcbind in the v4-only case. (Or to attempt to register but ignore
> any error, I'm not sure which is best.)
>
> And this may not be the only issue in the v4-only case. This isn't
> really a priority for me right now, but I'd happily look at patches.
>
>
> Hi all,
>
> I made a patch for this problem; please take a look, thanks.
>
> From 64c1f96348213f39b9411ab25699a292edbef4ef Mon Sep 17 00:00:00 2001
> From: Kinglong Mee <[email protected]>
> Date: Fri, 27 Dec 2013 18:06:25 +0800
> Subject: [PATCH] NFSD: supports nfsv4 service without rpcbind
>
> 1. set vs_hidden in nfsd_version4 to avoid registering NFSv4 with rpcbind
>
>
> IMO we do want the NFS port registered if rpcbind is running. NFSv4 is not
> a hidden service, like the client's callback server which can only be
> discovered by a forward advertisement (SETCLIENTID).
>
> I think I prefer ignoring the rpcb_set error for NFSv4.
>
>
> Agreed. My only concern would be that there be no unnecessary delays or
> errors logged in the v4-only case if rpcbind isn't running.
>
>
> I believe the rpcb_set upcall now uses the AF_LOCAL transport, which should
> be able to detect immediately that rpcbind is not listening.
>
> The OP did not report a delay or hang, thankfully.
>
>
> I hit a problem when testing on Fedora 20 with the latest kernel:
> svc_register for NFSv4 does not fail immediately; instead it returns
> EIO after a delay.
>
> After that, rpc.nfsd also hangs and does not return until rpcbind starts.
> I will look into that.
>
> thanks.
> Kinglong Mee
>
>
>
> --b.
>
>
>
> 2. don't start lockd when only supports nfsv4.
>
> Reported-by: Gareth Williams <[email protected]>
> Signed-off-by: Kinglong Mee <[email protected]>
> ---
> fs/nfsd/netns.h | 3 +++
> fs/nfsd/nfs4proc.c | 1 +
> fs/nfsd/nfsctl.c | 3 +++
> fs/nfsd/nfssvc.c | 21 ++++++++++++++++-----
> 4 files changed, 23 insertions(+), 5 deletions(-)
>
> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> index 849a7c3..ae2c179 100644
> --- a/fs/nfsd/netns.h
> +++ b/fs/nfsd/netns.h
> @@ -96,6 +96,9 @@ struct nfsd_net {
>
> bool nfsd_net_up;
>
> + bool lockd_up;
> + u32 nfsd_needs_lockd;
> +
> /*
> * Time of server startup
> */
> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> index 419572f..1496376 100644
> --- a/fs/nfsd/nfs4proc.c
> +++ b/fs/nfsd/nfs4proc.c
> @@ -1881,6 +1881,7 @@ struct svc_version nfsd_version4 = {
> .vs_proc = nfsd_procedures4,
> .vs_dispatch = nfsd_dispatch,
> .vs_xdrsize = NFS4_SVC_XDRSIZE,
> + .vs_hidden = 1,
> };
>
> /*
> diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> index 7f55517..8c7b0f0 100644
> --- a/fs/nfsd/nfsctl.c
> +++ b/fs/nfsd/nfsctl.c
> @@ -575,6 +575,9 @@ static ssize_t __write_versions(struct file *file, char
> *buf, size_t size)
> switch(num) {
> case 2:
> case 3:
> + nfsd_vers(num, sign == '-' ? NFSD_CLEAR : NFSD_SET);
> + nn->nfsd_needs_lockd = nfsd_vers(num, NFSD_TEST);
> + break;
> case 4:
> nfsd_vers(num, sign == '-' ? NFSD_CLEAR : NFSD_SET);
> break;
> diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
> index 760c85a..2b841d8 100644
> --- a/fs/nfsd/nfssvc.c
> +++ b/fs/nfsd/nfssvc.c
> @@ -255,9 +255,14 @@ static int nfsd_startup_net(int nrservs, struct net
> *net)
> ret = nfsd_init_socks(net);
> if (ret)
> goto out_socks;
> - ret = lockd_up(net);
> - if (ret)
> - goto out_socks;
> +
> + if (nn->nfsd_needs_lockd && !nn->lockd_up) {
> + ret = lockd_up(net);
> + if (ret)
> + goto out_socks;
> + nn->lockd_up = 1;
> + }
> +
> ret = nfs4_state_start_net(net);
> if (ret)
> goto out_lockd;
> @@ -266,7 +271,10 @@ static int nfsd_startup_net(int nrservs, struct net
> *net)
> return 0;
>
> out_lockd:
> - lockd_down(net);
> + if (nn->lockd_up) {
> + lockd_down(net);
> + nn->lockd_up = 0;
> + }
> out_socks:
> nfsd_shutdown_generic();
> return ret;
> @@ -277,7 +285,10 @@ static void nfsd_shutdown_net(struct net *net)
> struct nfsd_net *nn = net_generic(net, nfsd_net_id);
>
> nfs4_state_shutdown_net(net);
> - lockd_down(net);
> + if (nn->lockd_up) {
> + lockd_down(net);
> + nn->lockd_up = 0;
> + }
> nn->nfsd_net_up = false;
> nfsd_shutdown_generic();
> }
> --
> 1.8.4.2
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
> --
> Chuck Lever
> chuck[dot]lever[at]oracle[dot]com
>
>
>
>
>
>
>

2013-12-23 17:40:02

by J. Bruce Fields

[permalink] [raw]
Subject: Re: Question ref Running NFS at V4 Only

On Fri, Dec 20, 2013 at 05:10:42PM +0000, Gareth Williams wrote:
> Hi,
>
> I'm trying to run NFS with protocol version 4 only (that is, with v2
> & v3 disabled) on a CentOS 6.5 install running as a KVM guest.
>
> The RedHat documentation (amongst others) states that rpcbind isn't
> needed with v4, but if I start nfs without rpcbind I get errors.

I suspect the kernel code needs to be fixed to not attempt to register
with rpcbind in the v4-only case. (Or to attempt to register but ignore
any error, I'm not sure which is best.)

And this may not be the only issue in the v4-only case. This isn't
really a priority for me right now, but I'd happily look at patches.

--b.

>
> I've spent a couple of days (on and off) on Google trying to get an
> answer and have posted on the CentOS forum, but the nearest I can
> find is an archive on this mailing list from over two years ago
> (and it's not identical):-
>
> http://www.spinics.net/lists/linux-nfs/msg16907.html
>
> In /etc/sysconfig/nfs I have:-
>
> MOUNTD_NFS_V2="no"
> MOUNTD_NFS_V3="no"
>
> RPCNFSDARGS="-N 2 -N 3"
>
> When I attempt to start NFS either using the init scripts or
> manually with:-
>
> rpc.nfsd -N 2 -N 3
>
> I get:-
>
> rpc.nfsd: writing fd to kernel failed: errno 111 (Connection refused)
> rpc.nfsd: unable to set any sockets for nfsd
>
> The mailing list archive's answer was that the kernel was too old,
> so I installed 3.10.24 from elrepo but the symptom doesn't change.
>
> If I start rpcbind, then everything works, but as far as I can see,
> I shouldn't have to do that unless I'm supporting v2 & v3.
>
> Finally, please accept my apologies for wasting your time if I've
> missed something obvious.
>
> Kind regards,
>
> Gareth

2013-12-29 06:39:06

by Kinglong Mee

[permalink] [raw]
Subject: Re: Question ref Running NFS at V4 Only

I got the following trace when rpc.nfsd hung.

Dec 29 14:25:12 localhost kernel: [ 1224.449293] rpc.nfsd D c0d94300 0 1199 991 0x00000080
Dec 29 14:25:12 localhost kernel: [ 1224.451347] d9a9bc98 00000086 c046aa48 c0d94300 ddb66540 c0d79300 58625871 0000011c
Dec 29 14:25:12 localhost kernel: [ 1224.453426] c0d79300 dfff4300 dcd93a80 c0461738 00000000 c0d94300 00000000 00000020
Dec 29 14:25:12 localhost kernel: [ 1224.455701] da4d53a0 00000020 ddb66540 da4d53b0 d9a9bc84 c046add9 dcd93a80 00000292
Dec 29 14:25:12 localhost kernel: [ 1224.457853] Call Trace:
Dec 29 14:25:12 localhost kernel: [ 1224.459919] [<c046aa48>] ? insert_work+0x38/0x80
Dec 29 14:25:12 localhost kernel: [ 1224.462055] [<c0461738>] ? mod_timer+0xe8/0x1c0
Dec 29 14:25:12 localhost kernel: [ 1224.464230] [<c046add9>] ? __queue_delayed_work+0x89/0x140
Dec 29 14:25:12 localhost kernel: [ 1224.466304] [<e092d970>] ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
Dec 29 14:25:12 localhost kernel: [ 1224.468561] [<c09bd543>] schedule+0x23/0x60
Dec 29 14:25:12 localhost kernel: [ 1224.470671] [<e092d99d>] rpc_wait_bit_killable+0x2d/0x80 [sunrpc]
Dec 29 14:25:12 localhost kernel: [ 1224.472785] [<c09bdae1>] __wait_on_bit+0x51/0x70
Dec 29 14:25:12 localhost kernel: [ 1224.474974] [<e092d970>] ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
Dec 29 14:25:12 localhost kernel: [ 1224.477170] [<e092d970>] ? __rpc_wait_for_completion_task+0x30/0x30 [sunrpc]
Dec 29 14:25:12 localhost kernel: [ 1224.479389] [<c09bdb5b>] out_of_line_wait_on_bit+0x5b/0x70
Dec 29 14:25:12 localhost kernel: [ 1224.481739] [<c048e440>] ? autoremove_wake_function+0x40/0x40
Dec 29 14:25:12 localhost kernel: [ 1224.483881] [<e092e703>] __rpc_execute+0x1f3/0x3a0 [sunrpc]
Dec 29 14:25:12 localhost kernel: [ 1224.486030] [<c0516283>] ? mempool_alloc_slab+0x13/0x20
Dec 29 14:25:12 localhost kernel: [ 1224.488194] [<c051638e>] ? mempool_alloc+0x3e/0x100
Dec 29 14:25:12 localhost kernel: [ 1224.490307] [<e0925d80>] ? call_bind_status+0x260/0x260 [sunrpc]
Dec 29 14:25:12 localhost kernel: [ 1224.492559] [<c048e09c>] ? wake_up_bit+0x1c/0x20
Dec 29 14:25:12 localhost kernel: [ 1224.494844] [<e092f866>] rpc_execute+0x56/0x90 [sunrpc]
Dec 29 14:25:12 localhost kernel: [ 1224.496935] [<e0926cf9>] rpc_run_task+0x59/0x70 [sunrpc]
Dec 29 14:25:12 localhost kernel: [ 1224.499122] [<e0926d4c>] rpc_call_sync+0x3c/0x90 [sunrpc]
Dec 29 14:25:12 localhost kernel: [ 1224.501260] [<e0926de8>] rpc_ping+0x48/0x60 [sunrpc]
Dec 29 14:25:12 localhost kernel: [ 1224.503367] [<e092703b>] rpc_bind_new_program+0x4b/0x70 [sunrpc]
Dec 29 14:25:12 localhost kernel: [ 1224.505608] [<e0938333>] rpcb_create_local+0x163/0x1f0 [sunrpc]
Dec 29 14:25:12 localhost kernel: [ 1224.507720] [<e0932199>] ? __svc_create+0x119/0x1f0 [sunrpc]
Dec 29 14:25:12 localhost kernel: [ 1224.509830] [<e0932016>] svc_rpcb_setup+0x16/0x30 [sunrpc]
Dec 29 14:25:12 localhost kernel: [ 1224.511963] [<e0932052>] svc_bind+0x22/0x30 [sunrpc]
Dec 29 14:25:12 localhost kernel: [ 1224.514056] [<e09b73a4>] nfsd_create_serv+0xc4/0x1d0 [nfsd]
Dec 29 14:25:12 localhost kernel: [ 1224.516844] [<e09b7600>] ? nfsd_destroy+0x70/0x70 [nfsd]
Dec 29 14:25:12 localhost kernel: [ 1224.518870] [<e09b8d1f>] write_ports+0x21f/0x2b0 [nfsd]
Dec 29 14:25:12 localhost kernel: [ 1224.521359] [<c06a182c>] ? _copy_from_user+0x2c/0x40
Dec 29 14:25:12 localhost kernel: [ 1224.523358] [<c05854fe>] ? simple_transaction_get+0x8e/0xa0
Dec 29 14:25:12 localhost kernel: [ 1224.525383] [<e09b8b00>] ? write_recoverydir+0xf0/0xf0 [nfsd]
Dec 29 14:25:12 localhost kernel: [ 1224.527386] [<e09b7f3b>] nfsctl_transaction_write+0x3b/0x60 [nfsd]
Dec 29 14:25:12 localhost kernel: [ 1224.529302] [<e09b7f00>] ? export_features_show+0x30/0x30 [nfsd]
Dec 29 14:25:12 localhost kernel: [ 1224.531366] [<c0564695>] vfs_write+0x95/0x1c0
Dec 29 14:25:12 localhost kernel: [ 1224.533401] [<c0564d49>] SyS_write+0x49/0x90
Dec 29 14:25:12 localhost kernel: [ 1224.535380] [<c09c730d>] sysenter_do_call+0x12/0x28

thanks,
Kinglong Mee

2013/12/28 Kinglong Mee <[email protected]>:
>
> On Dec 28, 2013, at 3:40 AM, Chuck Lever <[email protected]> wrote:
>
>
> On Dec 27, 2013, at 1:43 PM, J. Bruce Fields <[email protected]> wrote:
>
> On Fri, Dec 27, 2013 at 11:05:05AM -0500, Chuck Lever wrote:
>
> Hi-
>
> On Dec 27, 2013, at 5:17 AM, Kinglong Mee <[email protected]> wrote:
>
> On 12/24/2013 01:39 AM, J. Bruce Fields wrote:
>
> On Fri, Dec 20, 2013 at 05:10:42PM +0000, Gareth Williams wrote:
>
> Hi,
>
> I'm trying to run NFS with protocol version 4 only (that is, with v2
> & v3 disabled) on a CentOS 6.5 install running as a KVM guest.
>
> The RedHat documentation (amongst others) states that rpcbind isn't
> needed with v4, but if I start nfs without rpcbind I get errors.
>
>
> I suspect the kernel code needs to be fixed to not attempt to register
> with rpcbind in the v4-only case. (Or to attempt to register but ignore
> any error, I'm not sure which is best.)
>
> And this may not be the only issue in the v4-only case. This isn't
> really a priority for me right now, but I'd happily look at patches.
>
>
> Hi all,
>
> I make a patch for this problem, please have a check, thanks.
>
> From 64c1f96348213f39b9411ab25699a292edbef4ef Mon Sep 17 00:00:00 2001
> From: Kinglong Mee <[email protected]>
> Date: Fri, 27 Dec 2013 18:06:25 +0800
> Subject: [PATCH] NFSD: supports nfsv4 service without rpcbind
>
> 1. set vs_hidden in nfsd_version4 to avoid register nfsv4 to rpcbind
>
>
> IMO we do want the NFS port registered if rpcbind is running. NFSv4 is not
> a hidden service, like the client's callback server which can only be
> discovered by a forward advertisement (SETCLIENTID).
>
> I think I prefer ignoring the rpcb_set error for NFSv4.
>
>
> Agreed. My only concern would be that there be no unnecessary delays or
> errors logged in the v4-only case if rpcbind isn't running.
>
>
> I believe the rpcb_set upcall now uses the AF_LOCAL transport, which should
> be able to detect immediately that rpcbind is not listening.
>
> The OP did not report a delay or hang, thankfully.
>
>
> I hit a problem when testing on Fedora 20 with the latest kernel:
> svc_register for NFSv4 does not fail immediately; instead there is a
> delay and it returns EIO.
>
> After that, rpc.nfsd also hangs and does not return until rpcbind is
> started. I will look into that.
>
> thanks.
> Kinglong Mee
>
>
>
> --b.
>
>
>
> 2. don't start lockd when only supports nfsv4.
>
> Reported-by: Gareth Williams <[email protected]>
> Signed-off-by: Kinglong Mee <[email protected]>
> ---
> fs/nfsd/netns.h | 3 +++
> fs/nfsd/nfs4proc.c | 1 +
> fs/nfsd/nfsctl.c | 3 +++
> fs/nfsd/nfssvc.c | 21 ++++++++++++++++-----
> 4 files changed, 23 insertions(+), 5 deletions(-)
>
> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> index 849a7c3..ae2c179 100644
> --- a/fs/nfsd/netns.h
> +++ b/fs/nfsd/netns.h
> @@ -96,6 +96,9 @@ struct nfsd_net {
>
> bool nfsd_net_up;
>
> + bool lockd_up;
> + u32 nfsd_needs_lockd;
> +
> /*
> * Time of server startup
> */
> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> index 419572f..1496376 100644
> --- a/fs/nfsd/nfs4proc.c
> +++ b/fs/nfsd/nfs4proc.c
> @@ -1881,6 +1881,7 @@ struct svc_version nfsd_version4 = {
> .vs_proc = nfsd_procedures4,
> .vs_dispatch = nfsd_dispatch,
> .vs_xdrsize = NFS4_SVC_XDRSIZE,
> + .vs_hidden = 1,
> };
>
> /*
> diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> index 7f55517..8c7b0f0 100644
> --- a/fs/nfsd/nfsctl.c
> +++ b/fs/nfsd/nfsctl.c
> @@ -575,6 +575,9 @@ static ssize_t __write_versions(struct file *file, char
> *buf, size_t size)
> switch(num) {
> case 2:
> case 3:
> + nfsd_vers(num, sign == '-' ? NFSD_CLEAR : NFSD_SET);
> + nn->nfsd_needs_lockd = nfsd_vers(num, NFSD_TEST);
> + break;
> case 4:
> nfsd_vers(num, sign == '-' ? NFSD_CLEAR : NFSD_SET);
> break;
> diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
> index 760c85a..2b841d8 100644
> --- a/fs/nfsd/nfssvc.c
> +++ b/fs/nfsd/nfssvc.c
> @@ -255,9 +255,14 @@ static int nfsd_startup_net(int nrservs, struct net
> *net)
> ret = nfsd_init_socks(net);
> if (ret)
> goto out_socks;
> - ret = lockd_up(net);
> - if (ret)
> - goto out_socks;
> +
> + if (nn->nfsd_needs_lockd && !nn->lockd_up) {
> + ret = lockd_up(net);
> + if (ret)
> + goto out_socks;
> + nn->lockd_up = 1;
> + }
> +
> ret = nfs4_state_start_net(net);
> if (ret)
> goto out_lockd;
> @@ -266,7 +271,10 @@ static int nfsd_startup_net(int nrservs, struct net
> *net)
> return 0;
>
> out_lockd:
> - lockd_down(net);
> + if (nn->lockd_up) {
> + lockd_down(net);
> + nn->lockd_up = 0;
> + }
> out_socks:
> nfsd_shutdown_generic();
> return ret;
> @@ -277,7 +285,10 @@ static void nfsd_shutdown_net(struct net *net)
> struct nfsd_net *nn = net_generic(net, nfsd_net_id);
>
> nfs4_state_shutdown_net(net);
> - lockd_down(net);
> + if (nn->lockd_up) {
> + lockd_down(net);
> + nn->lockd_up = 0;
> + }
> nn->nfsd_net_up = false;
> nfsd_shutdown_generic();
> }
> --
> 1.8.4.2
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
>
> --
> Chuck Lever
> chuck[dot]lever[at]oracle[dot]com
>
>
>
>
>
>

2013-12-31 03:18:29

by Kinglong Mee

[permalink] [raw]
Subject: Re: [PATCH 2/2] NFSD: supports nfsv4 service without rpcbind

On 12/31/2013 01:54 AM, Chuck Lever wrote:
>
> On Dec 30, 2013, at 6:25 AM, Kinglong Mee <[email protected]> wrote:
>
>> 1. set vs_ignore_err for nfsd_version4
>> 2. don't start lockd when only supports nfsv4
>>
>> Signed-off-by: Kinglong Mee <[email protected]>
>> ---
>> fs/nfsd/netns.h | 1 +
>> fs/nfsd/nfs4proc.c | 1 +
>> fs/nfsd/nfssvc.c | 26 +++++++++++++++++++++-----
>> 3 files changed, 23 insertions(+), 5 deletions(-)
>>
>> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
>> index 849a7c3..d32b3aa 100644
>> --- a/fs/nfsd/netns.h
>> +++ b/fs/nfsd/netns.h
>> @@ -95,6 +95,7 @@ struct nfsd_net {
>> time_t nfsd4_grace;
>>
>> bool nfsd_net_up;
>> + bool lockd_up;
>>
>> /*
>> * Time of server startup
>> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
>> index 419572f..9320986 100644
>> --- a/fs/nfsd/nfs4proc.c
>> +++ b/fs/nfsd/nfs4proc.c
>> @@ -1881,6 +1881,7 @@ struct svc_version nfsd_version4 = {
>> .vs_proc = nfsd_procedures4,
>> .vs_dispatch = nfsd_dispatch,
>> .vs_xdrsize = NFS4_SVC_XDRSIZE,
>> + .vs_ignore_err = 1,
>
> It's better, I think, to include this hunk in 1/1. That way, that one patch can be merged into stable, or cherry-picked by a distribution.
>
> Also, just a nit: ".vs_ignore_err" is a rather generic name. Something more specific like ".vs_rpcb_optnl" would be nicer, or reverse the logic and call it ".vs_rpcb_needed".

Makes sense.
Thanks for your suggestion.

>
>> };
>>
>> /*
>> diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
>> index 760c85a..55b5b57 100644
>> --- a/fs/nfsd/nfssvc.c
>> +++ b/fs/nfsd/nfssvc.c
>> @@ -241,6 +241,11 @@ static void nfsd_shutdown_generic(void)
>> nfsd_racache_shutdown();
>> }
>>
>> +static bool nfsd_needs_lockd(void)
>> +{
>> + return (nfsd_versions[2] != NULL) || (nfsd_versions[3] != NULL);
>> +}
>
> How does this logic know which version of nfsd is being started?

In the latest kernel (3.13), nfsd_versions records the versions of nfsd that
are currently enabled, while nfsd_version records the versions that the code
can support.

In the function nfsd_vers, setting a version assigns nfsd_version[vers] to
nfsd_versions[vers]; clearing a version assigns NULL to nfsd_versions[vers].

So we can check nfsd_versions[vers] directly for the corresponding version.
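
For readers following along, here is a minimal userspace sketch of the
relationship described above. It is not the kernel code: the names
svc_version_model, supported[], enabled[], model_vers and
model_needs_lockd are hypothetical stand-ins for struct svc_version,
nfsd_version[], nfsd_versions[], nfsd_vers and nfsd_needs_lockd, and the
array size is an assumption.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's struct svc_version. */
struct svc_version_model {
	unsigned int vs_vers;
};

enum vers_op { VERS_SET, VERS_CLEAR, VERS_TEST };

#define NR_VERS 5

static struct svc_version_model v2 = { 2 }, v3 = { 3 }, v4 = { 4 };

/* Versions the code can support; never modified at run time. */
static struct svc_version_model *supported[NR_VERS] = {
	NULL, NULL, &v2, &v3, &v4
};

/* Versions currently enabled; starts out equal to supported[]. */
static struct svc_version_model *enabled[NR_VERS] = {
	NULL, NULL, &v2, &v3, &v4
};

/* Set, clear, or test one version, in the spirit of nfsd_vers(). */
static int model_vers(unsigned int vers, enum vers_op op)
{
	if (vers >= NR_VERS)
		return 0;
	switch (op) {
	case VERS_SET:
		enabled[vers] = supported[vers];
		break;
	case VERS_CLEAR:
		enabled[vers] = NULL;
		break;
	case VERS_TEST:
		return enabled[vers] != NULL;
	}
	return 0;
}

/* lockd is only needed while v2 or v3 is enabled. */
static bool model_needs_lockd(void)
{
	return enabled[2] != NULL || enabled[3] != NULL;
}
```

With both v2 and v3 cleared (the "-N 2 -N 3" case), model_needs_lockd()
returns false even though supported[] still lists them.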

thanks,
Kinglong Mee

>
>> +
>> static int nfsd_startup_net(int nrservs, struct net *net)
>> {
>> struct nfsd_net *nn = net_generic(net, nfsd_net_id);
>> @@ -255,9 +260,14 @@ static int nfsd_startup_net(int nrservs, struct net *net)
>> ret = nfsd_init_socks(net);
>> if (ret)
>> goto out_socks;
>> - ret = lockd_up(net);
>> - if (ret)
>> - goto out_socks;
>> +
>> + if (nfsd_needs_lockd() && !nn->lockd_up) {
>> + ret = lockd_up(net);
>> + if (ret)
>> + goto out_socks;
>> + nn->lockd_up = 1;
>> + }
>> +
>> ret = nfs4_state_start_net(net);
>> if (ret)
>> goto out_lockd;
>> @@ -266,7 +276,10 @@ static int nfsd_startup_net(int nrservs, struct net *net)
>> return 0;
>>
>> out_lockd:
>> - lockd_down(net);
>> + if (nn->lockd_up) {
>> + lockd_down(net);
>> + nn->lockd_up = 0;
>> + }
>> out_socks:
>> nfsd_shutdown_generic();
>> return ret;
>> @@ -277,7 +290,10 @@ static void nfsd_shutdown_net(struct net *net)
>> struct nfsd_net *nn = net_generic(net, nfsd_net_id);
>>
>> nfs4_state_shutdown_net(net);
>> - lockd_down(net);
>> + if (nn->lockd_up) {
>> + lockd_down(net);
>> + nn->lockd_up = 0;
>> + }
>> nn->nfsd_net_up = false;
>> nfsd_shutdown_generic();
>> }
>> --
>> 1.8.4.2
>
> --
> Chuck Lever
> chuck[dot]lever[at]oracle[dot]com
>
>
>
>


2013-12-31 19:40:04

by Trond Myklebust

[permalink] [raw]
Subject: [PATCH 3/4] SUNRPC: Report connection error values to rpc_tasks on the pending queue

Currently we only report EAGAIN, which is not descriptive enough for
softconn tasks.

Signed-off-by: Trond Myklebust <[email protected]>
---
net/sunrpc/xprtsock.c | 41 ++++++++++++++++++++++++++++++++++++-----
1 file changed, 36 insertions(+), 5 deletions(-)

diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index dd9d295813cf..ab006b7b7ab8 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -257,6 +257,7 @@ struct sock_xprt {
void (*old_data_ready)(struct sock *, int);
void (*old_state_change)(struct sock *);
void (*old_write_space)(struct sock *);
+ void (*old_error_report)(struct sock *);
};

/*
@@ -274,6 +275,11 @@ struct sock_xprt {
*/
#define TCP_RPC_REPLY (1UL << 6)

+static inline struct rpc_xprt *xprt_from_sock(struct sock *sk)
+{
+ return (struct rpc_xprt *) sk->sk_user_data;
+}
+
static inline struct sockaddr *xs_addr(struct rpc_xprt *xprt)
{
return (struct sockaddr *) &xprt->addr;
@@ -799,6 +805,7 @@ static void xs_save_old_callbacks(struct sock_xprt *transport, struct sock *sk)
transport->old_data_ready = sk->sk_data_ready;
transport->old_state_change = sk->sk_state_change;
transport->old_write_space = sk->sk_write_space;
+ transport->old_error_report = sk->sk_error_report;
}

static void xs_restore_old_callbacks(struct sock_xprt *transport, struct sock *sk)
@@ -806,6 +813,33 @@ static void xs_restore_old_callbacks(struct sock_xprt *transport, struct sock *s
sk->sk_data_ready = transport->old_data_ready;
sk->sk_state_change = transport->old_state_change;
sk->sk_write_space = transport->old_write_space;
+ sk->sk_error_report = transport->old_error_report;
+}
+
+/**
+ * xs_error_report - callback to handle TCP socket state errors
+ * @sk: socket
+ *
+ * Note: we don't call sock_error() since there may be a rpc_task
+ * using the socket, and so we don't want to clear sk->sk_err.
+ */
+static void xs_error_report(struct sock *sk)
+{
+ struct rpc_xprt *xprt;
+ int err;
+
+ read_lock_bh(&sk->sk_callback_lock);
+ if (!(xprt = xprt_from_sock(sk)))
+ goto out;
+
+ err = -sk->sk_err;
+ if (err == 0)
+ goto out;
+ dprintk("RPC: xs_error_report client %p, error=%d...\n",
+ xprt, -err);
+ xprt_wake_pending_tasks(xprt, err);
+ out:
+ read_unlock_bh(&sk->sk_callback_lock);
}

static void xs_reset_transport(struct sock_xprt *transport)
@@ -885,11 +919,6 @@ static void xs_destroy(struct rpc_xprt *xprt)
module_put(THIS_MODULE);
}

-static inline struct rpc_xprt *xprt_from_sock(struct sock *sk)
-{
- return (struct rpc_xprt *) sk->sk_user_data;
-}
-
static int xs_local_copy_to_xdr(struct xdr_buf *xdr, struct sk_buff *skb)
{
struct xdr_skb_reader desc = {
@@ -1869,6 +1898,7 @@ static int xs_local_finish_connecting(struct rpc_xprt *xprt,
sk->sk_user_data = xprt;
sk->sk_data_ready = xs_local_data_ready;
sk->sk_write_space = xs_udp_write_space;
+ sk->sk_error_report = xs_error_report;
sk->sk_allocation = GFP_ATOMIC;

xprt_clear_connected(xprt);
@@ -2146,6 +2176,7 @@ static int xs_tcp_finish_connecting(struct rpc_xprt *xprt, struct socket *sock)
sk->sk_data_ready = xs_tcp_data_ready;
sk->sk_state_change = xs_tcp_state_change;
sk->sk_write_space = xs_tcp_write_space;
+ sk->sk_error_report = xs_error_report;
sk->sk_allocation = GFP_ATOMIC;

/* socket options */
--
1.8.4.2
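
The idea in this patch can be illustrated outside the kernel. The sketch
below is a simplified userspace model, not the SUNRPC code: task_model,
xprt_model, sock_model, wake_pending_tasks and error_report are
hypothetical names, and locking is omitted. It shows the two points the
patch description and comment make: the socket error is read without
being cleared, and every waiting task receives the real error value.

```c
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of an RPC task waiting on a transport. */
struct task_model {
	int status;              /* error delivered to the task */
	int woken;               /* set once the task has been woken */
	struct task_model *next;
};

/* Hypothetical model of a transport with a pending queue. */
struct xprt_model {
	struct task_model *pending;
};

/* Hypothetical model of a socket that points back at its transport. */
struct sock_model {
	int sk_err;                    /* positive errno, 0 if no error */
	struct xprt_model *user_data;
};

/* Wake every pending task and hand it the given (negative) status. */
static void wake_pending_tasks(struct xprt_model *xprt, int status)
{
	struct task_model *t;

	for (t = xprt->pending; t != NULL; t = t->next) {
		t->status = status;
		t->woken = 1;
	}
	xprt->pending = NULL;
}

/* Error callback: report the socket error but leave sk_err untouched. */
static void error_report(struct sock_model *sk)
{
	struct xprt_model *xprt = sk->user_data;
	int err;

	if (xprt == NULL)
		return;
	err = -sk->sk_err;
	if (err == 0)
		return;
	wake_pending_tasks(xprt, err);
}
```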


2013-12-27 18:43:12

by J. Bruce Fields

[permalink] [raw]
Subject: Re: Question ref Running NFS at V4 Only

On Fri, Dec 27, 2013 at 11:05:05AM -0500, Chuck Lever wrote:
> Hi-
>
> On Dec 27, 2013, at 5:17 AM, Kinglong Mee <[email protected]> wrote:
>
> > On 12/24/2013 01:39 AM, J. Bruce Fields wrote:
> >> On Fri, Dec 20, 2013 at 05:10:42PM +0000, Gareth Williams wrote:
> >>> Hi,
> >>>
> >>> I'm trying to run NFS with protocol version 4 only (that is, with v2
> >>> & v3 disabled) on a CentOS 6.5 install running as a KVM guest.
> >>>
> >>> The RedHat documentation (amongst others) states that rpcbind isn't
> >>> needed with v4, but if I start nfs without rpcbind I get errors.
> >>
> >> I suspect the kernel code needs to be fixed to not attempt to register
> >> with rpcbind in the v4-only case. (Or to attempt to register but ignore
> >> any error, I'm not sure which is best.)
> >>
> >> And this may not be the only issue in the v4-only case. This isn't
> >> really a priority for me right now, but I'd happily look at patches.
> >
> > Hi all,
> >
> > I make a patch for this problem, please have a check, thanks.
> >
> > From 64c1f96348213f39b9411ab25699a292edbef4ef Mon Sep 17 00:00:00 2001
> > From: Kinglong Mee <[email protected]>
> > Date: Fri, 27 Dec 2013 18:06:25 +0800
> > Subject: [PATCH] NFSD: supports nfsv4 service without rpcbind
> >
> > 1. set vs_hidden in nfsd_version4 to avoid register nfsv4 to rpcbind
>
> IMO we do want the NFS port registered if rpcbind is running. NFSv4 is not a hidden service, like the client's callback server which can only be discovered by a forward advertisement (SETCLIENTID).
>
> I think I prefer ignoring the rpcb_set error for NFSv4.

Agreed. My only concern would be that there be no unnecessary delays or
errors logged in the v4-only case if rpcbind isn't running.

--b.

>
>
> > 2. don't start lockd when only supports nfsv4.
> >
> > Reported-by: Gareth Williams <[email protected]>
> > Signed-off-by: Kinglong Mee <[email protected]>
> > ---
> > fs/nfsd/netns.h | 3 +++
> > fs/nfsd/nfs4proc.c | 1 +
> > fs/nfsd/nfsctl.c | 3 +++
> > fs/nfsd/nfssvc.c | 21 ++++++++++++++++-----
> > 4 files changed, 23 insertions(+), 5 deletions(-)
> >
> > diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> > index 849a7c3..ae2c179 100644
> > --- a/fs/nfsd/netns.h
> > +++ b/fs/nfsd/netns.h
> > @@ -96,6 +96,9 @@ struct nfsd_net {
> >
> > bool nfsd_net_up;
> >
> > + bool lockd_up;
> > + u32 nfsd_needs_lockd;
> > +
> > /*
> > * Time of server startup
> > */
> > diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> > index 419572f..1496376 100644
> > --- a/fs/nfsd/nfs4proc.c
> > +++ b/fs/nfsd/nfs4proc.c
> > @@ -1881,6 +1881,7 @@ struct svc_version nfsd_version4 = {
> > .vs_proc = nfsd_procedures4,
> > .vs_dispatch = nfsd_dispatch,
> > .vs_xdrsize = NFS4_SVC_XDRSIZE,
> > + .vs_hidden = 1,
> > };
> >
> > /*
> > diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> > index 7f55517..8c7b0f0 100644
> > --- a/fs/nfsd/nfsctl.c
> > +++ b/fs/nfsd/nfsctl.c
> > @@ -575,6 +575,9 @@ static ssize_t __write_versions(struct file *file, char *buf, size_t size)
> > switch(num) {
> > case 2:
> > case 3:
> > + nfsd_vers(num, sign == '-' ? NFSD_CLEAR : NFSD_SET);
> > + nn->nfsd_needs_lockd = nfsd_vers(num, NFSD_TEST);
> > + break;
> > case 4:
> > nfsd_vers(num, sign == '-' ? NFSD_CLEAR : NFSD_SET);
> > break;
> > diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
> > index 760c85a..2b841d8 100644
> > --- a/fs/nfsd/nfssvc.c
> > +++ b/fs/nfsd/nfssvc.c
> > @@ -255,9 +255,14 @@ static int nfsd_startup_net(int nrservs, struct net *net)
> > ret = nfsd_init_socks(net);
> > if (ret)
> > goto out_socks;
> > - ret = lockd_up(net);
> > - if (ret)
> > - goto out_socks;
> > +
> > + if (nn->nfsd_needs_lockd && !nn->lockd_up) {
> > + ret = lockd_up(net);
> > + if (ret)
> > + goto out_socks;
> > + nn->lockd_up = 1;
> > + }
> > +
> > ret = nfs4_state_start_net(net);
> > if (ret)
> > goto out_lockd;
> > @@ -266,7 +271,10 @@ static int nfsd_startup_net(int nrservs, struct net *net)
> > return 0;
> >
> > out_lockd:
> > - lockd_down(net);
> > + if (nn->lockd_up) {
> > + lockd_down(net);
> > + nn->lockd_up = 0;
> > + }
> > out_socks:
> > nfsd_shutdown_generic();
> > return ret;
> > @@ -277,7 +285,10 @@ static void nfsd_shutdown_net(struct net *net)
> > struct nfsd_net *nn = net_generic(net, nfsd_net_id);
> >
> > nfs4_state_shutdown_net(net);
> > - lockd_down(net);
> > + if (nn->lockd_up) {
> > + lockd_down(net);
> > + nn->lockd_up = 0;
> > + }
> > nn->nfsd_net_up = false;
> > nfsd_shutdown_generic();
> > }
> > --
> > 1.8.4.2
> >
> >
>
> --
> Chuck Lever
> chuck[dot]lever[at]oracle[dot]com
>
>
>
>

2013-12-31 19:40:03

by Trond Myklebust

[permalink] [raw]
Subject: [PATCH 1/4] SUNRPC: Ensure xprt_connect_status handles all potential connection errors

Currently, xprt_connect_status will convert connection error values such
as ECONNREFUSED, ECONNRESET, ... into EIO, which means that they never
get handled.

Signed-off-by: Trond Myklebust <[email protected]>
---
net/sunrpc/xprt.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
index 04199bc8416f..ddd198e90292 100644
--- a/net/sunrpc/xprt.c
+++ b/net/sunrpc/xprt.c
@@ -749,6 +749,11 @@ static void xprt_connect_status(struct rpc_task *task)
}

switch (task->tk_status) {
+ case -ECONNREFUSED:
+ case -ECONNRESET:
+ case -ECONNABORTED:
+ case -ENETUNREACH:
+ case -EHOSTUNREACH:
case -EAGAIN:
dprintk("RPC: %5u xprt_connect_status: retrying\n", task->tk_pid);
break;
--
1.8.4.2
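
As a rough illustration of the behaviour change, the following
standalone function models the switch after the patch. It is only a
sketch: connect_status_model is a hypothetical name, the early return
for a zero status is an assumption about the code above the hunk, and
the default branch reflects the conversion to EIO that the patch
description mentions.

```c
#include <errno.h>

/*
 * Return the status a caller would see after the connect-status check.
 * Connection errors listed in the patch are passed through unchanged so
 * that the caller can handle them; anything else collapses to -EIO.
 */
static int connect_status_model(int status)
{
	if (status == 0)
		return 0;	/* assumed: success is handled before the switch */

	switch (status) {
	case -ECONNREFUSED:
	case -ECONNRESET:
	case -ECONNABORTED:
	case -ENETUNREACH:
	case -EHOSTUNREACH:
	case -EAGAIN:
		return status;	/* preserved: the caller decides what to do */
	default:
		return -EIO;	/* everything else is reported as an I/O error */
	}
}
```

Before the patch, only -EAGAIN would have taken the first branch in this
model, so a refused connection to rpcbind surfaced as -EIO.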


2013-12-31 05:17:28

by Kinglong Mee

[permalink] [raw]
Subject: [PATCH 1/2 v2] SUNRPC/NFSD: Supports new option for ignoring the result of svc_register

Gareth Williams reports that starting an NFSv4-only server fails when
rpcbind is not running, with these errors:

#rpc.nfsd -N 2 -N 3
rpc.nfsd: writing fd to kernel failed: errno 111 (Connection refused)
rpc.nfsd: unable to set any sockets for nfsd

To fix this, NFSv4 needs to ignore the result of svc_register,
so add a flag to svc_version for that purpose.

v2:
use the more meaningful name vs_rpcb_optnl instead of vs_ignore_err
move the setting of nfsd_version4's vs_rpcb_optnl option here from old patch [2/2]

Reported-by: Gareth Williams <[email protected]>
Reviewed-by: Chuck Lever <[email protected]>
Signed-off-by: Kinglong Mee <[email protected]>
---
fs/nfsd/nfs4proc.c | 1 +
include/linux/sunrpc/svc.h | 4 +++-
net/sunrpc/svc.c | 25 +++++++++++++++++--------
3 files changed, 21 insertions(+), 9 deletions(-)

diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 419572f..db3d5b9 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1881,6 +1881,7 @@ struct svc_version nfsd_version4 = {
.vs_proc = nfsd_procedures4,
.vs_dispatch = nfsd_dispatch,
.vs_xdrsize = NFS4_SVC_XDRSIZE,
+ .vs_rpcb_optnl = 1,
};

/*
diff --git a/include/linux/sunrpc/svc.h b/include/linux/sunrpc/svc.h
index 6eecfc2..10e28d1 100644
--- a/include/linux/sunrpc/svc.h
+++ b/include/linux/sunrpc/svc.h
@@ -386,8 +386,10 @@ struct svc_version {
struct svc_procedure * vs_proc; /* per-procedure info */
u32 vs_xdrsize; /* xdrsize needed for this version */

- unsigned int vs_hidden : 1; /* Don't register with portmapper.
+ unsigned int vs_hidden : 1, /* Don't register with portmapper.
* Only used for nfsacl so far. */
+ vs_rpcb_optnl:1;/* Don't care the result of register.
+ * Only used for nfsv4. */

/* Override dispatch function (e.g. when caching replies).
* A return value of 0 means drop the request.
diff --git a/net/sunrpc/svc.c b/net/sunrpc/svc.c
index e7fbe36..5de6801 100644
--- a/net/sunrpc/svc.c
+++ b/net/sunrpc/svc.c
@@ -916,9 +916,6 @@ static int __svc_register(struct net *net, const char *progname,
#endif
}

- if (error < 0)
- printk(KERN_WARNING "svc: failed to register %sv%u RPC "
- "service (errno %d).\n", progname, version, -error);
return error;
}

@@ -937,6 +934,7 @@ int svc_register(const struct svc_serv *serv, struct net *net,
const unsigned short port)
{
struct svc_program *progp;
+ struct svc_version *vers;
unsigned int i;
int error = 0;

@@ -946,7 +944,8 @@ int svc_register(const struct svc_serv *serv, struct net *net,

for (progp = serv->sv_program; progp; progp = progp->pg_next) {
for (i = 0; i < progp->pg_nvers; i++) {
- if (progp->pg_vers[i] == NULL)
+ vers = progp->pg_vers[i];
+ if (vers == NULL)
continue;

dprintk("svc: svc_register(%sv%d, %s, %u, %u)%s\n",
@@ -955,16 +954,26 @@ int svc_register(const struct svc_serv *serv, struct net *net,
proto == IPPROTO_UDP? "udp" : "tcp",
port,
family,
- progp->pg_vers[i]->vs_hidden?
- " (but not telling portmap)" : "");
+ vers->vs_hidden ?
+ " (but not telling portmap)" : "");

- if (progp->pg_vers[i]->vs_hidden)
+ if (vers->vs_hidden)
continue;

error = __svc_register(net, progp->pg_name, progp->pg_prog,
i, family, proto, port);
- if (error < 0)
+
+ if (vers->vs_rpcb_optnl) {
+ error = 0;
+ continue;
+ }
+
+ if (error < 0) {
+ printk(KERN_WARNING "svc: failed to register "
+ "%sv%u RPC service (errno %d).\n",
+ progp->pg_name, i, -error);
break;
+ }
}
}

--
1.8.4.2
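
The registration loop in this patch can be tried out in userspace with a
small model. The sketch below is not the kernel's svc_register:
vers_model, register_fn, register_all and reg_refused are hypothetical
names, and the real rpcbind call is replaced by a callback so that a
missing rpcbind can be simulated.

```c
#include <errno.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical model of the two per-version flags used by the patch. */
struct vers_model {
	unsigned int hidden : 1;      /* never register this version */
	unsigned int rpcb_optnl : 1;  /* register, but ignore the result */
};

/* Stand-in for the call that talks to rpcbind. */
typedef int (*register_fn)(unsigned int version);

/* Register every enabled version; stop at the first mandatory failure. */
static int register_all(struct vers_model *const vers[],
			unsigned int nvers, register_fn reg)
{
	unsigned int i;
	int error = 0;

	for (i = 0; i < nvers; i++) {
		if (vers[i] == NULL || vers[i]->hidden)
			continue;

		error = reg(i);

		if (vers[i]->rpcb_optnl) {
			error = 0;	/* optional: failure is not fatal */
			continue;
		}

		if (error < 0) {
			fprintf(stderr,
				"failed to register v%u (errno %d)\n",
				i, -error);
			break;
		}
	}
	return error;
}

/* Simulates rpcbind not running: every registration is refused. */
static int reg_refused(unsigned int version)
{
	(void)version;
	return -ECONNREFUSED;
}
```

In this model a v4-only table registers "successfully" with no warning
even when every call is refused, while a table that also enables v3
still fails with -ECONNREFUSED.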

2013-12-30 11:26:05

by Kinglong Mee

[permalink] [raw]
Subject: [PATCH 2/2] NFSD: supports nfsv4 service without rpcbind

1. set vs_ignore_err for nfsd_version4
2. don't start lockd when only nfsv4 is supported

Signed-off-by: Kinglong Mee <[email protected]>
---
fs/nfsd/netns.h | 1 +
fs/nfsd/nfs4proc.c | 1 +
fs/nfsd/nfssvc.c | 26 +++++++++++++++++++++-----
3 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
index 849a7c3..d32b3aa 100644
--- a/fs/nfsd/netns.h
+++ b/fs/nfsd/netns.h
@@ -95,6 +95,7 @@ struct nfsd_net {
time_t nfsd4_grace;

bool nfsd_net_up;
+ bool lockd_up;

/*
* Time of server startup
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 419572f..9320986 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1881,6 +1881,7 @@ struct svc_version nfsd_version4 = {
.vs_proc = nfsd_procedures4,
.vs_dispatch = nfsd_dispatch,
.vs_xdrsize = NFS4_SVC_XDRSIZE,
+ .vs_ignore_err = 1,
};

/*
diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
index 760c85a..55b5b57 100644
--- a/fs/nfsd/nfssvc.c
+++ b/fs/nfsd/nfssvc.c
@@ -241,6 +241,11 @@ static void nfsd_shutdown_generic(void)
nfsd_racache_shutdown();
}

+static bool nfsd_needs_lockd(void)
+{
+ return (nfsd_versions[2] != NULL) || (nfsd_versions[3] != NULL);
+}
+
static int nfsd_startup_net(int nrservs, struct net *net)
{
struct nfsd_net *nn = net_generic(net, nfsd_net_id);
@@ -255,9 +260,14 @@ static int nfsd_startup_net(int nrservs, struct net *net)
ret = nfsd_init_socks(net);
if (ret)
goto out_socks;
- ret = lockd_up(net);
- if (ret)
- goto out_socks;
+
+ if (nfsd_needs_lockd() && !nn->lockd_up) {
+ ret = lockd_up(net);
+ if (ret)
+ goto out_socks;
+ nn->lockd_up = 1;
+ }
+
ret = nfs4_state_start_net(net);
if (ret)
goto out_lockd;
@@ -266,7 +276,10 @@ static int nfsd_startup_net(int nrservs, struct net *net)
return 0;

out_lockd:
- lockd_down(net);
+ if (nn->lockd_up) {
+ lockd_down(net);
+ nn->lockd_up = 0;
+ }
out_socks:
nfsd_shutdown_generic();
return ret;
@@ -277,7 +290,10 @@ static void nfsd_shutdown_net(struct net *net)
struct nfsd_net *nn = net_generic(net, nfsd_net_id);

nfs4_state_shutdown_net(net);
- lockd_down(net);
+ if (nn->lockd_up) {
+ lockd_down(net);
+ nn->lockd_up = 0;
+ }
nn->nfsd_net_up = false;
nfsd_shutdown_generic();
}
--
1.8.4.2

2013-12-31 19:40:05

by Trond Myklebust

[permalink] [raw]
Subject: [PATCH 4/4] SUNRPC: Add tracepoint for socket errors

Signed-off-by: Trond Myklebust <[email protected]>
---
include/trace/events/sunrpc.h | 1 +
net/sunrpc/xprtsock.c | 1 +
2 files changed, 2 insertions(+)

diff --git a/include/trace/events/sunrpc.h b/include/trace/events/sunrpc.h
index d51d16c7afd8..ddc179b7a105 100644
--- a/include/trace/events/sunrpc.h
+++ b/include/trace/events/sunrpc.h
@@ -301,6 +301,7 @@ DECLARE_EVENT_CLASS(xs_socket_event_done,

DEFINE_RPC_SOCKET_EVENT(rpc_socket_state_change);
DEFINE_RPC_SOCKET_EVENT_DONE(rpc_socket_connect);
+DEFINE_RPC_SOCKET_EVENT_DONE(rpc_socket_error);
DEFINE_RPC_SOCKET_EVENT_DONE(rpc_socket_reset_connection);
DEFINE_RPC_SOCKET_EVENT(rpc_socket_close);
DEFINE_RPC_SOCKET_EVENT(rpc_socket_shutdown);
diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
index ab006b7b7ab8..25dbfa971948 100644
--- a/net/sunrpc/xprtsock.c
+++ b/net/sunrpc/xprtsock.c
@@ -837,6 +837,7 @@ static void xs_error_report(struct sock *sk)
goto out;
dprintk("RPC: xs_error_report client %p, error=%d...\n",
xprt, -err);
+ trace_rpc_socket_error(xprt, sk->sk_socket, err);
xprt_wake_pending_tasks(xprt, err);
out:
read_unlock_bh(&sk->sk_callback_lock);
--
1.8.4.2


2013-12-27 16:05:20

by Chuck Lever III

[permalink] [raw]
Subject: Re: Question ref Running NFS at V4 Only

Hi-

On Dec 27, 2013, at 5:17 AM, Kinglong Mee <[email protected]> wrote:

> On 12/24/2013 01:39 AM, J. Bruce Fields wrote:
>> On Fri, Dec 20, 2013 at 05:10:42PM +0000, Gareth Williams wrote:
>>> Hi,
>>>
>>> I'm trying to run NFS with protocol version 4 only (that is, with v2
>>> & v3 disabled) on a CentOS 6.5 install running as a KVM guest.
>>>
>>> The RedHat documentation (amongst others) states that rpcbind isn't
>>> needed with v4, but if I start nfs without rpcbind I get errors.
>>
>> I suspect the kernel code needs to be fixed to not attempt to register
>> with rpcbind in the v4-only case. (Or to attempt to register but ignore
>> any error, I'm not sure which is best.)
>>
>> And this may not be the only issue in the v4-only case. This isn't
>> really a priority for me right now, but I'd happily look at patches.
>
> Hi all,
>
> I've made a patch for this problem; please take a look. Thanks.
>
> From 64c1f96348213f39b9411ab25699a292edbef4ef Mon Sep 17 00:00:00 2001
> From: Kinglong Mee <[email protected]>
> Date: Fri, 27 Dec 2013 18:06:25 +0800
> Subject: [PATCH] NFSD: supports nfsv4 service without rpcbind
>
> 1. set vs_hidden in nfsd_version4 to avoid registering nfsv4 with rpcbind

IMO we do want the NFS port registered if rpcbind is running. NFSv4 is not a hidden service, unlike the client's callback server, which can only be discovered by a forward advertisement (SETCLIENTID).

I think I prefer ignoring the rpcb_set error for NFSv4.
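
To illustrate the "register but ignore the error" option, here is a minimal userspace sketch. It is not the kernel's registration code: the struct, the rpcb_optional field, and every demo_ function are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a per-version service descriptor. */
struct demo_version {
    unsigned int vers;    /* protocol version number */
    bool rpcb_optional;   /* true: a failed rpcbind registration is not fatal */
};

/* Hypothetical registration callback: returns 0 or a negative errno. */
typedef int (*demo_rpcb_set_fn)(unsigned int vers);

/*
 * Try to register every version. A failure aborts startup only when the
 * version requires rpcbind; optional versions carry on unregistered.
 */
int demo_register_all(const struct demo_version *v, size_t n,
                      demo_rpcb_set_fn rpcb_set)
{
    assert(v != NULL && rpcb_set != NULL);
    for (size_t i = 0; i < n; i++) {
        int err = rpcb_set(v[i].vers);
        if (err < 0 && !v[i].rpcb_optional)
            return err;
    }
    return 0;
}

/* Simulates rpcbind not running: every registration is refused. */
int demo_rpcb_down(unsigned int vers)
{
    (void)vers;
    return -ECONNREFUSED;
}
```

With rpcbind down, a table holding only an optional v4 entry starts successfully, while a table that also holds a mandatory v3 entry fails with -ECONNREFUSED. If rpcbind is running, v4 is still registered, which is the behaviour preferred above.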


> 2. don't start lockd when only nfsv4 is supported.
>
> Reported-by: Gareth Williams <[email protected]>
> Signed-off-by: Kinglong Mee <[email protected]>
> ---
> fs/nfsd/netns.h | 3 +++
> fs/nfsd/nfs4proc.c | 1 +
> fs/nfsd/nfsctl.c | 3 +++
> fs/nfsd/nfssvc.c | 21 ++++++++++++++++-----
> 4 files changed, 23 insertions(+), 5 deletions(-)
>
> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> index 849a7c3..ae2c179 100644
> --- a/fs/nfsd/netns.h
> +++ b/fs/nfsd/netns.h
> @@ -96,6 +96,9 @@ struct nfsd_net {
>
> bool nfsd_net_up;
>
> + bool lockd_up;
> + u32 nfsd_needs_lockd;
> +
> /*
> * Time of server startup
> */
> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> index 419572f..1496376 100644
> --- a/fs/nfsd/nfs4proc.c
> +++ b/fs/nfsd/nfs4proc.c
> @@ -1881,6 +1881,7 @@ struct svc_version nfsd_version4 = {
> .vs_proc = nfsd_procedures4,
> .vs_dispatch = nfsd_dispatch,
> .vs_xdrsize = NFS4_SVC_XDRSIZE,
> + .vs_hidden = 1,
> };
>
> /*
> diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> index 7f55517..8c7b0f0 100644
> --- a/fs/nfsd/nfsctl.c
> +++ b/fs/nfsd/nfsctl.c
> @@ -575,6 +575,9 @@ static ssize_t __write_versions(struct file *file, char *buf, size_t size)
> switch(num) {
> case 2:
> case 3:
> + nfsd_vers(num, sign == '-' ? NFSD_CLEAR : NFSD_SET);
> + nn->nfsd_needs_lockd = nfsd_vers(num, NFSD_TEST);
> + break;
> case 4:
> nfsd_vers(num, sign == '-' ? NFSD_CLEAR : NFSD_SET);
> break;
> diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
> index 760c85a..2b841d8 100644
> --- a/fs/nfsd/nfssvc.c
> +++ b/fs/nfsd/nfssvc.c
> @@ -255,9 +255,14 @@ static int nfsd_startup_net(int nrservs, struct net *net)
> ret = nfsd_init_socks(net);
> if (ret)
> goto out_socks;
> - ret = lockd_up(net);
> - if (ret)
> - goto out_socks;
> +
> + if (nn->nfsd_needs_lockd && !nn->lockd_up) {
> + ret = lockd_up(net);
> + if (ret)
> + goto out_socks;
> + nn->lockd_up = 1;
> + }
> +
> ret = nfs4_state_start_net(net);
> if (ret)
> goto out_lockd;
> @@ -266,7 +271,10 @@ static int nfsd_startup_net(int nrservs, struct net *net)
> return 0;
>
> out_lockd:
> - lockd_down(net);
> + if (nn->lockd_up) {
> + lockd_down(net);
> + nn->lockd_up = 0;
> + }
> out_socks:
> nfsd_shutdown_generic();
> return ret;
> @@ -277,7 +285,10 @@ static void nfsd_shutdown_net(struct net *net)
> struct nfsd_net *nn = net_generic(net, nfsd_net_id);
>
> nfs4_state_shutdown_net(net);
> - lockd_down(net);
> + if (nn->lockd_up) {
> + lockd_down(net);
> + nn->lockd_up = 0;
> + }
> nn->nfsd_net_up = false;
> nfsd_shutdown_generic();
> }
> --
> 1.8.4.2
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-nfs" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com





2013-12-27 10:17:52

by Kinglong Mee

[permalink] [raw]
Subject: Re: Question ref Running NFS at V4 Only

On 12/24/2013 01:39 AM, J. Bruce Fields wrote:
> On Fri, Dec 20, 2013 at 05:10:42PM +0000, Gareth Williams wrote:
>> Hi,
>>
>> I'm trying to run NFS with protocol version 4 only (that is, with v2
>> & v3 disabled) on a CentOS 6.5 install running as a KVM guest.
>>
>> The RedHat documentation (amongst others) states that rpcbind isn't
>> needed with v4, but if I start nfs without rpcbind I get errors.
>
> I suspect the kernel code needs to be fixed to not attempt to register
> with rpcbind in the v4-only case. (Or to attempt to register but ignore
> any error, I'm not sure which is best.)
>
> And this may not be the only issue in the v4-only case. This isn't
> really a priority for me right now, but I'd happily look at patches.

Hi all,

I've made a patch for this problem; please take a look. Thanks.

From 64c1f96348213f39b9411ab25699a292edbef4ef Mon Sep 17 00:00:00 2001
From: Kinglong Mee <[email protected]>
Date: Fri, 27 Dec 2013 18:06:25 +0800
Subject: [PATCH] NFSD: supports nfsv4 service without rpcbind

1. set vs_hidden in nfsd_version4 to avoid registering nfsv4 with rpcbind
2. don't start lockd when only nfsv4 is supported.

Reported-by: Gareth Williams <[email protected]>
Signed-off-by: Kinglong Mee <[email protected]>
---
fs/nfsd/netns.h | 3 +++
fs/nfsd/nfs4proc.c | 1 +
fs/nfsd/nfsctl.c | 3 +++
fs/nfsd/nfssvc.c | 21 ++++++++++++++++-----
4 files changed, 23 insertions(+), 5 deletions(-)

diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
index 849a7c3..ae2c179 100644
--- a/fs/nfsd/netns.h
+++ b/fs/nfsd/netns.h
@@ -96,6 +96,9 @@ struct nfsd_net {

bool nfsd_net_up;

+ bool lockd_up;
+ u32 nfsd_needs_lockd;
+
/*
* Time of server startup
*/
diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
index 419572f..1496376 100644
--- a/fs/nfsd/nfs4proc.c
+++ b/fs/nfsd/nfs4proc.c
@@ -1881,6 +1881,7 @@ struct svc_version nfsd_version4 = {
.vs_proc = nfsd_procedures4,
.vs_dispatch = nfsd_dispatch,
.vs_xdrsize = NFS4_SVC_XDRSIZE,
+ .vs_hidden = 1,
};

/*
diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index 7f55517..8c7b0f0 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -575,6 +575,9 @@ static ssize_t __write_versions(struct file *file, char *buf, size_t size)
switch(num) {
case 2:
case 3:
+ nfsd_vers(num, sign == '-' ? NFSD_CLEAR : NFSD_SET);
+ nn->nfsd_needs_lockd = nfsd_vers(num, NFSD_TEST);
+ break;
case 4:
nfsd_vers(num, sign == '-' ? NFSD_CLEAR : NFSD_SET);
break;
diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
index 760c85a..2b841d8 100644
--- a/fs/nfsd/nfssvc.c
+++ b/fs/nfsd/nfssvc.c
@@ -255,9 +255,14 @@ static int nfsd_startup_net(int nrservs, struct net *net)
ret = nfsd_init_socks(net);
if (ret)
goto out_socks;
- ret = lockd_up(net);
- if (ret)
- goto out_socks;
+
+ if (nn->nfsd_needs_lockd && !nn->lockd_up) {
+ ret = lockd_up(net);
+ if (ret)
+ goto out_socks;
+ nn->lockd_up = 1;
+ }
+
ret = nfs4_state_start_net(net);
if (ret)
goto out_lockd;
@@ -266,7 +271,10 @@ static int nfsd_startup_net(int nrservs, struct net *net)
return 0;

out_lockd:
- lockd_down(net);
+ if (nn->lockd_up) {
+ lockd_down(net);
+ nn->lockd_up = 0;
+ }
out_socks:
nfsd_shutdown_generic();
return ret;
@@ -277,7 +285,10 @@ static void nfsd_shutdown_net(struct net *net)
struct nfsd_net *nn = net_generic(net, nfsd_net_id);

nfs4_state_shutdown_net(net);
- lockd_down(net);
+ if (nn->lockd_up) {
+ lockd_down(net);
+ nn->lockd_up = 0;
+ }
nn->nfsd_net_up = false;
nfsd_shutdown_generic();
}
--
1.8.4.2



2013-12-30 17:54:12

by Chuck Lever III

[permalink] [raw]
Subject: Re: [PATCH 2/2] NFSD: supports nfsv4 service without rpcbind


On Dec 30, 2013, at 6:25 AM, Kinglong Mee <[email protected]> wrote:

> 1. set vs_ignore_err for nfsd_version4
> 2. don't start lockd when only nfsv4 is supported
>
> Signed-off-by: Kinglong Mee <[email protected]>
> ---
> fs/nfsd/netns.h | 1 +
> fs/nfsd/nfs4proc.c | 1 +
> fs/nfsd/nfssvc.c | 26 +++++++++++++++++++++-----
> 3 files changed, 23 insertions(+), 5 deletions(-)
>
> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> index 849a7c3..d32b3aa 100644
> --- a/fs/nfsd/netns.h
> +++ b/fs/nfsd/netns.h
> @@ -95,6 +95,7 @@ struct nfsd_net {
> time_t nfsd4_grace;
>
> bool nfsd_net_up;
> + bool lockd_up;
>
> /*
> * Time of server startup
> diff --git a/fs/nfsd/nfs4proc.c b/fs/nfsd/nfs4proc.c
> index 419572f..9320986 100644
> --- a/fs/nfsd/nfs4proc.c
> +++ b/fs/nfsd/nfs4proc.c
> @@ -1881,6 +1881,7 @@ struct svc_version nfsd_version4 = {
> .vs_proc = nfsd_procedures4,
> .vs_dispatch = nfsd_dispatch,
> .vs_xdrsize = NFS4_SVC_XDRSIZE,
> + .vs_ignore_err = 1,

It's better, I think, to include this hunk in 1/2. That way, that one patch can be merged into stable, or cherry-picked by a distribution.

Also, just a nit: ".vs_ignore_err" is a rather generic name. Something more specific like ".vs_rpcb_optnl" would be nicer, or reverse the logic and call it ".vs_rpcb_needed".

> };
>
> /*
> diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
> index 760c85a..55b5b57 100644
> --- a/fs/nfsd/nfssvc.c
> +++ b/fs/nfsd/nfssvc.c
> @@ -241,6 +241,11 @@ static void nfsd_shutdown_generic(void)
> nfsd_racache_shutdown();
> }
>
> +static bool nfsd_needs_lockd(void)
> +{
> + return (nfsd_versions[2] != NULL) || (nfsd_versions[3] != NULL);
> +}

How does this logic know which version of nfsd is being started?

> +
> static int nfsd_startup_net(int nrservs, struct net *net)
> {
> struct nfsd_net *nn = net_generic(net, nfsd_net_id);
> @@ -255,9 +260,14 @@ static int nfsd_startup_net(int nrservs, struct net *net)
> ret = nfsd_init_socks(net);
> if (ret)
> goto out_socks;
> - ret = lockd_up(net);
> - if (ret)
> - goto out_socks;
> +
> + if (nfsd_needs_lockd() && !nn->lockd_up) {
> + ret = lockd_up(net);
> + if (ret)
> + goto out_socks;
> + nn->lockd_up = 1;
> + }
> +
> ret = nfs4_state_start_net(net);
> if (ret)
> goto out_lockd;
> @@ -266,7 +276,10 @@ static int nfsd_startup_net(int nrservs, struct net *net)
> return 0;
>
> out_lockd:
> - lockd_down(net);
> + if (nn->lockd_up) {
> + lockd_down(net);
> + nn->lockd_up = 0;
> + }
> out_socks:
> nfsd_shutdown_generic();
> return ret;
> @@ -277,7 +290,10 @@ static void nfsd_shutdown_net(struct net *net)
> struct nfsd_net *nn = net_generic(net, nfsd_net_id);
>
> nfs4_state_shutdown_net(net);
> - lockd_down(net);
> + if (nn->lockd_up) {
> + lockd_down(net);
> + nn->lockd_up = 0;
> + }
> nn->nfsd_net_up = false;
> nfsd_shutdown_generic();
> }
> --
> 1.8.4.2

--
Chuck Lever
chuck[dot]lever[at]oracle[dot]com




2013-12-31 19:40:03

by Trond Myklebust

[permalink] [raw]
Subject: [PATCH 2/4] SUNRPC: Handle connect errors ECONNABORTED and EHOSTUNREACH

Ensure that call_bind_status, call_connect_status, call_transmit_status and
call_status all are capable of handling ECONNABORTED and EHOSTUNREACH.

Signed-off-by: Trond Myklebust <[email protected]>
---
net/sunrpc/clnt.c | 7 ++++++-
1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/net/sunrpc/clnt.c b/net/sunrpc/clnt.c
index f09b7db2c492..b9276a63eaf1 100644
--- a/net/sunrpc/clnt.c
+++ b/net/sunrpc/clnt.c
@@ -1729,6 +1729,7 @@ call_bind_status(struct rpc_task *task)
return;
case -ECONNREFUSED: /* connection problems */
case -ECONNRESET:
+ case -ECONNABORTED:
case -ENOTCONN:
case -EHOSTDOWN:
case -EHOSTUNREACH:
@@ -1799,7 +1800,9 @@ call_connect_status(struct rpc_task *task)
return;
case -ECONNREFUSED:
case -ECONNRESET:
+ case -ECONNABORTED:
case -ENETUNREACH:
+ case -EHOSTUNREACH:
/* retry with existing socket, after a delay */
rpc_delay(task, 3*HZ);
if (RPC_IS_SOFTCONN(task))
@@ -1902,6 +1905,7 @@ call_transmit_status(struct rpc_task *task)
break;
}
case -ECONNRESET:
+ case -ECONNABORTED:
case -ENOTCONN:
case -EPIPE:
rpc_task_force_reencode(task);
@@ -2011,8 +2015,9 @@ call_status(struct rpc_task *task)
xprt_conditional_disconnect(req->rq_xprt,
req->rq_connect_cookie);
break;
- case -ECONNRESET:
case -ECONNREFUSED:
+ case -ECONNRESET:
+ case -ECONNABORTED:
rpc_force_rebind(clnt);
rpc_delay(task, 3*HZ);
case -EPIPE:
--
1.8.4.2
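
The patch above widens the set of errno values that each status handler treats as connection problems. As a minimal userspace sketch of that kind of classification (demo_is_connect_error is a hypothetical helper, not a function in net/sunrpc, and the list mirrors only the call_connect_status cases after the patch):

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/*
 * Hypothetical helper: returns true when a negative status code denotes a
 * connection-level failure that is worth retrying after a delay.
 */
bool demo_is_connect_error(int status)
{
    assert(status <= 0);   /* kernel-style status: 0 or a negative errno */
    switch (status) {
    case -ECONNREFUSED:
    case -ECONNRESET:
    case -ECONNABORTED:    /* newly handled by the patch */
    case -ENETUNREACH:
    case -EHOSTUNREACH:    /* newly handled by the patch */
        return true;
    default:
        return false;
    }
}
```

Any status outside this list, such as -EIO, falls through to the default branch and is not retried as a connection error.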


2013-12-31 05:17:38

by Kinglong Mee

[permalink] [raw]
Subject: [PATCH 2/2 v2] NFSD: Don't start lockd when only NFSv4 is running

When starting without nfsv2 and nfsv3, nfsd should not set up lockd,
especially when rpcbind is stopped.

v2:
move setting nfsd_version4's vs_rpcb_optnl option to new patch [1/2]

Reported-by: Gareth Williams <[email protected]>
Reviewed-by: Chuck Lever <[email protected]>
Signed-off-by: Kinglong Mee <[email protected]>
---
fs/nfsd/netns.h | 1 +
fs/nfsd/nfssvc.c | 26 +++++++++++++++++++++-----
2 files changed, 22 insertions(+), 5 deletions(-)

diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
index 849a7c3..d32b3aa 100644
--- a/fs/nfsd/netns.h
+++ b/fs/nfsd/netns.h
@@ -95,6 +95,7 @@ struct nfsd_net {
time_t nfsd4_grace;

bool nfsd_net_up;
+ bool lockd_up;

/*
* Time of server startup
diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
index 760c85a..55b5b57 100644
--- a/fs/nfsd/nfssvc.c
+++ b/fs/nfsd/nfssvc.c
@@ -241,6 +241,11 @@ static void nfsd_shutdown_generic(void)
nfsd_racache_shutdown();
}

+static bool nfsd_needs_lockd(void)
+{
+ return (nfsd_versions[2] != NULL) || (nfsd_versions[3] != NULL);
+}
+
static int nfsd_startup_net(int nrservs, struct net *net)
{
struct nfsd_net *nn = net_generic(net, nfsd_net_id);
@@ -255,9 +260,14 @@ static int nfsd_startup_net(int nrservs, struct net *net)
ret = nfsd_init_socks(net);
if (ret)
goto out_socks;
- ret = lockd_up(net);
- if (ret)
- goto out_socks;
+
+ if (nfsd_needs_lockd() && !nn->lockd_up) {
+ ret = lockd_up(net);
+ if (ret)
+ goto out_socks;
+ nn->lockd_up = 1;
+ }
+
ret = nfs4_state_start_net(net);
if (ret)
goto out_lockd;
@@ -266,7 +276,10 @@ static int nfsd_startup_net(int nrservs, struct net *net)
return 0;

out_lockd:
- lockd_down(net);
+ if (nn->lockd_up) {
+ lockd_down(net);
+ nn->lockd_up = 0;
+ }
out_socks:
nfsd_shutdown_generic();
return ret;
@@ -277,7 +290,10 @@ static void nfsd_shutdown_net(struct net *net)
struct nfsd_net *nn = net_generic(net, nfsd_net_id);

nfs4_state_shutdown_net(net);
- lockd_down(net);
+ if (nn->lockd_up) {
+ lockd_down(net);
+ nn->lockd_up = 0;
+ }
nn->nfsd_net_up = false;
nfsd_shutdown_generic();
}
--
1.8.4.2
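
The bookkeeping in the patch above (start lockd only when needed, and tear it down only if this namespace started it) can be sketched in userspace as follows. This is an illustration under assumptions, not the kernel code: struct demo_net, lockd_refs, and the demo_ functions are hypothetical stand-ins.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical per-namespace state. */
struct demo_net {
    bool lockd_up;    /* did this namespace start lockd? */
    int lockd_refs;   /* stand-in for lockd's own user count */
};

static int demo_lockd_up(struct demo_net *nn)
{
    nn->lockd_refs++;
    return 0;
}

static void demo_lockd_down(struct demo_net *nn)
{
    nn->lockd_refs--;
}

/* Start lockd only when v2/v3 is enabled and it is not already up. */
int demo_startup(struct demo_net *nn, bool needs_lockd)
{
    assert(nn != NULL);
    if (needs_lockd && !nn->lockd_up) {
        int ret = demo_lockd_up(nn);
        if (ret)
            return ret;
        nn->lockd_up = true;
    }
    return 0;
}

/* Drop the lockd reference only if startup actually took one. */
void demo_shutdown(struct demo_net *nn)
{
    assert(nn != NULL);
    if (nn->lockd_up) {
        demo_lockd_down(nn);
        nn->lockd_up = false;
    }
}
```

Because the flag guards both paths, repeated startup or shutdown calls leave the reference count balanced, and a v4-only configuration never touches lockd at all.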

2014-01-05 20:19:18

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH 4/4] SUNRPC: Add tracepoint for socket errors


On Jan 5, 2014, at 14:54, J. Bruce Fields <[email protected]> wrote:

> On Tue, Dec 31, 2013 at 02:39:41PM -0500, Trond Myklebust wrote:
>> Signed-off-by: Trond Myklebust <[email protected]>
>> ---
>> include/trace/events/sunrpc.h | 1 +
>> net/sunrpc/xprtsock.c | 1 +
>> 2 files changed, 2 insertions(+)
>
> ACK to all these from me. I'm assuming they'll go in through your tree
> for 3.14.

I'm queuing them up for that, yes.

Cheers
Trond

2014-01-06 03:28:48

by Kinglong Mee

[permalink] [raw]
Subject: NFSD: fix compile warning without CONFIG_NFSD_V3

Without CONFIG_NFSD_V3, the build emits the following warning:

fs/nfsd/nfssvc.c: In function 'nfsd_svc':
>> fs/nfsd/nfssvc.c:246:60: warning: array subscript is above array bounds [-Warray-bounds]
return (nfsd_versions[2] != NULL) || (nfsd_versions[3] != NULL);
^

Reported-by: kbuild test robot <[email protected]>
Signed-off-by: Kinglong Mee <[email protected]>
---
fs/nfsd/nfssvc.c | 4 ++++
1 file changed, 4 insertions(+)

diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
index 55b5b57..9a4a5f9 100644
--- a/fs/nfsd/nfssvc.c
+++ b/fs/nfsd/nfssvc.c
@@ -243,7 +243,11 @@ static void nfsd_shutdown_generic(void)

static bool nfsd_needs_lockd(void)
{
+#if defined(CONFIG_NFSD_V3)
return (nfsd_versions[2] != NULL) || (nfsd_versions[3] != NULL);
+#else
+ return (nfsd_versions[2] != NULL);
+#endif
}

static int nfsd_startup_net(int nrservs, struct net *net)
--
1.8.4.2

2014-01-06 18:44:49

by J. Bruce Fields

[permalink] [raw]
Subject: Re: NFSD: fix compile warning without CONFIG_NFSD_V3

On Mon, Jan 06, 2014 at 11:28:41AM +0800, Kinglong Mee wrote:
> Without CONFIG_NFSD_V3, the build emits the following warning:
>
> fs/nfsd/nfssvc.c: In function 'nfsd_svc':
> >> fs/nfsd/nfssvc.c:246:60: warning: array subscript is above array bounds [-Warray-bounds]
> return (nfsd_versions[2] != NULL) || (nfsd_versions[3] != NULL);

Thanks, applying.

Though it might be simpler to define the array to always be length 4.
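
One way to read that suggestion, as a minimal userspace sketch rather than the kernel source: give the version table a fixed length so that slot 3 always exists and simply stays NULL when v3 is not configured. All DEMO_ and demo_ names are hypothetical, and DEMO_HAVE_V3 stands in for CONFIG_NFSD_V3.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct demo_version {
    unsigned int vers;
};

static struct demo_version demo_v2 = { 2 };
#ifdef DEMO_HAVE_V3
static struct demo_version demo_v3 = { 3 };
#endif

/* Fixed length: slots 0..3 always exist; unconfigured slots stay NULL. */
#define DEMO_NRVERS 4
static struct demo_version *demo_versions[DEMO_NRVERS] = {
    [2] = &demo_v2,
#ifdef DEMO_HAVE_V3
    [3] = &demo_v3,
#endif
};

/* No #ifdef needed here: index 3 is in bounds in every configuration. */
bool demo_needs_lockd(void)
{
    return demo_versions[2] != NULL || demo_versions[3] != NULL;
}

/* Disable one version at run time, as "-N <vers>" would. */
void demo_disable_version(unsigned int vers)
{
    assert(vers < DEMO_NRVERS);
    demo_versions[vers] = NULL;
}
```

The trade-off is one unused pointer slot in builds without v3, in exchange for a check that needs no preprocessor conditionals.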

--b.

> ^
>
> Reported-by: kbuild test robot <[email protected]>
> Signed-off-by: Kinglong Mee <[email protected]>
> ---
> fs/nfsd/nfssvc.c | 4 ++++
> 1 file changed, 4 insertions(+)
>
> diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
> index 55b5b57..9a4a5f9 100644
> --- a/fs/nfsd/nfssvc.c
> +++ b/fs/nfsd/nfssvc.c
> @@ -243,7 +243,11 @@ static void nfsd_shutdown_generic(void)
>
> static bool nfsd_needs_lockd(void)
> {
> +#if defined(CONFIG_NFSD_V3)
> return (nfsd_versions[2] != NULL) || (nfsd_versions[3] != NULL);
> +#else
> + return (nfsd_versions[2] != NULL);
> +#endif
> }
>
> static int nfsd_startup_net(int nrservs, struct net *net)
> --
> 1.8.4.2

2014-01-03 23:18:03

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 2/2 v2] NFSD: Don't start lockd when only NFSv4 is running

On Tue, Dec 31, 2013 at 01:17:30PM +0800, Kinglong Mee wrote:
> When starting without nfsv2 and nfsv3, nfsd should not set up lockd,
> especially when rpcbind is stopped.
>
> v2:
> move setting nfsd_version4's vs_rpcb_optnl option to new patch [1/2]

Thanks, these look good. Applying both with minor rewrites of the
changelogs.

--b.

>
> Reported-by: Gareth Williams <[email protected]>
> Reviewed-by: Chuck Lever <[email protected]>
> Signed-off-by: Kinglong Mee <[email protected]>
> ---
> fs/nfsd/netns.h | 1 +
> fs/nfsd/nfssvc.c | 26 +++++++++++++++++++++-----
> 2 files changed, 22 insertions(+), 5 deletions(-)
>
> diff --git a/fs/nfsd/netns.h b/fs/nfsd/netns.h
> index 849a7c3..d32b3aa 100644
> --- a/fs/nfsd/netns.h
> +++ b/fs/nfsd/netns.h
> @@ -95,6 +95,7 @@ struct nfsd_net {
> time_t nfsd4_grace;
>
> bool nfsd_net_up;
> + bool lockd_up;
>
> /*
> * Time of server startup
> diff --git a/fs/nfsd/nfssvc.c b/fs/nfsd/nfssvc.c
> index 760c85a..55b5b57 100644
> --- a/fs/nfsd/nfssvc.c
> +++ b/fs/nfsd/nfssvc.c
> @@ -241,6 +241,11 @@ static void nfsd_shutdown_generic(void)
> nfsd_racache_shutdown();
> }
>
> +static bool nfsd_needs_lockd(void)
> +{
> + return (nfsd_versions[2] != NULL) || (nfsd_versions[3] != NULL);
> +}
> +
> static int nfsd_startup_net(int nrservs, struct net *net)
> {
> struct nfsd_net *nn = net_generic(net, nfsd_net_id);
> @@ -255,9 +260,14 @@ static int nfsd_startup_net(int nrservs, struct net *net)
> ret = nfsd_init_socks(net);
> if (ret)
> goto out_socks;
> - ret = lockd_up(net);
> - if (ret)
> - goto out_socks;
> +
> + if (nfsd_needs_lockd() && !nn->lockd_up) {
> + ret = lockd_up(net);
> + if (ret)
> + goto out_socks;
> + nn->lockd_up = 1;
> + }
> +
> ret = nfs4_state_start_net(net);
> if (ret)
> goto out_lockd;
> @@ -266,7 +276,10 @@ static int nfsd_startup_net(int nrservs, struct net *net)
> return 0;
>
> out_lockd:
> - lockd_down(net);
> + if (nn->lockd_up) {
> + lockd_down(net);
> + nn->lockd_up = 0;
> + }
> out_socks:
> nfsd_shutdown_generic();
> return ret;
> @@ -277,7 +290,10 @@ static void nfsd_shutdown_net(struct net *net)
> struct nfsd_net *nn = net_generic(net, nfsd_net_id);
>
> nfs4_state_shutdown_net(net);
> - lockd_down(net);
> + if (nn->lockd_up) {
> + lockd_down(net);
> + nn->lockd_up = 0;
> + }
> nn->nfsd_net_up = false;
> nfsd_shutdown_generic();
> }
> --
> 1.8.4.2

2014-01-02 04:52:20

by Kinglong Mee

[permalink] [raw]
Subject: Re: [PATCH 1/4] SUNRPC: Ensure xprt_connect_status handles all potential connection errors

Hi Trond,

With the whole patchset applied, rpc.nfsd returns immediately with:
[root@localhost linux-2.6]# rpc.nfsd -N 2 -N 3
rpc.nfsd: writing fd to kernel failed: errno 13 (Permission denied)
rpc.nfsd: writing fd to kernel failed: errno 13 (Permission denied)
rpc.nfsd: unable to set any sockets for nfsd
[root@localhost linux-2.6]# dmesg
[ 1263.249079] svc: failed to register nfsdv4 RPC service (errno 13).
[ 1263.257789] svc: failed to register nfsdv4 RPC service (errno 13).

But I think errno 13 does not convey the real cause to the user.
As before, errno 111 (Connection refused) may be better.
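
For reference, a minimal sketch of how the two codes map to message text. demo_errno_text is a hypothetical helper, and the numeric values 13 and 111 quoted above are assumed to be the usual x86 Linux numbering for EACCES and ECONNREFUSED.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>

/* Hypothetical helper: message text for a positive errno value. */
const char *demo_errno_text(int err)
{
    assert(err > 0);
    return strerror(err);
}
```

On Linux, demo_errno_text(EACCES) yields "Permission denied" and demo_errno_text(ECONNREFUSED) yields "Connection refused", matching the strings in the rpc.nfsd output above. Only the second one tells the administrator that nothing is listening on the rpcbind port.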

thanks,
Kinglong Mee

On 01/01/2014 03:39 AM, Trond Myklebust wrote:
> Currently, xprt_connect_status will convert connection error values such
> as ECONNREFUSED, ECONNRESET, ... into EIO, which means that they never
> get handled.
>
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> net/sunrpc/xprt.c | 5 +++++
> 1 file changed, 5 insertions(+)
>
> diff --git a/net/sunrpc/xprt.c b/net/sunrpc/xprt.c
> index 04199bc8416f..ddd198e90292 100644
> --- a/net/sunrpc/xprt.c
> +++ b/net/sunrpc/xprt.c
> @@ -749,6 +749,11 @@ static void xprt_connect_status(struct rpc_task *task)
> }
>
> switch (task->tk_status) {
> + case -ECONNREFUSED:
> + case -ECONNRESET:
> + case -ECONNABORTED:
> + case -ENETUNREACH:
> + case -EHOSTUNREACH:
> case -EAGAIN:
> dprintk("RPC: %5u xprt_connect_status: retrying\n", task->tk_pid);
> break;
>


2014-01-05 19:54:09

by J. Bruce Fields

[permalink] [raw]
Subject: Re: [PATCH 4/4] SUNRPC: Add tracepoint for socket errors

On Tue, Dec 31, 2013 at 02:39:41PM -0500, Trond Myklebust wrote:
> Signed-off-by: Trond Myklebust <[email protected]>
> ---
> include/trace/events/sunrpc.h | 1 +
> net/sunrpc/xprtsock.c | 1 +
> 2 files changed, 2 insertions(+)

ACK to all these from me. I'm assuming they'll go in through your tree
for 3.14.

--b.

>
> diff --git a/include/trace/events/sunrpc.h b/include/trace/events/sunrpc.h
> index d51d16c7afd8..ddc179b7a105 100644
> --- a/include/trace/events/sunrpc.h
> +++ b/include/trace/events/sunrpc.h
> @@ -301,6 +301,7 @@ DECLARE_EVENT_CLASS(xs_socket_event_done,
>
> DEFINE_RPC_SOCKET_EVENT(rpc_socket_state_change);
> DEFINE_RPC_SOCKET_EVENT_DONE(rpc_socket_connect);
> +DEFINE_RPC_SOCKET_EVENT_DONE(rpc_socket_error);
> DEFINE_RPC_SOCKET_EVENT_DONE(rpc_socket_reset_connection);
> DEFINE_RPC_SOCKET_EVENT(rpc_socket_close);
> DEFINE_RPC_SOCKET_EVENT(rpc_socket_shutdown);
> diff --git a/net/sunrpc/xprtsock.c b/net/sunrpc/xprtsock.c
> index ab006b7b7ab8..25dbfa971948 100644
> --- a/net/sunrpc/xprtsock.c
> +++ b/net/sunrpc/xprtsock.c
> @@ -837,6 +837,7 @@ static void xs_error_report(struct sock *sk)
> goto out;
> dprintk("RPC: xs_error_report client %p, error=%d...\n",
> xprt, -err);
> + trace_rpc_socket_error(xprt, sk->sk_socket, err);
> xprt_wake_pending_tasks(xprt, err);
> out:
> read_unlock_bh(&sk->sk_callback_lock);
> --
> 1.8.4.2
>