2020-01-16 19:10:08

by Olga Kornievskaia

[permalink] [raw]
Subject: [PATCH 1/1] NFSv4.0 allow nconnect for v4.0

From: Olga Kornievskaia <[email protected]>

Signed-off-by: Olga Kornievskaia <[email protected]>
---
fs/nfs/nfs4client.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
index 460d625..4df3fb0 100644
--- a/fs/nfs/nfs4client.c
+++ b/fs/nfs/nfs4client.c
@@ -881,7 +881,7 @@ static int nfs4_set_client(struct nfs_server *server,

if (minorversion == 0)
__set_bit(NFS_CS_REUSEPORT, &cl_init.init_flags);
- else if (proto == XPRT_TRANSPORT_TCP)
+ if (proto == XPRT_TRANSPORT_TCP)
cl_init.nconnect = nconnect;

if (server->flags & NFS_MOUNT_NORESVPORT)
--
1.8.3.1


2020-01-17 21:10:21

by Anna Schumaker

[permalink] [raw]
Subject: Re: [PATCH 1/1] NFSv4.0 allow nconnect for v4.0

Hi Olga,

On Thu, 2020-01-16 at 14:08 -0500, Olga Kornievskaia wrote:
> From: Olga Kornievskaia <[email protected]>

Have you done any testing with nconnect and the v4.0 replay caches? I did some
digging on the mailing list and found this in one of the cover letters from
Trond: "The feature is only enabled for NFSv4.1 and NFSv4.2 for now; I don't
feel comfortable subjecting NFSv3/v4 replay caches to this treatment yet."

Thanks,
Anna

>
> Signed-off-by: Olga Kornievskaia <[email protected]>
> ---
> fs/nfs/nfs4client.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
> index 460d625..4df3fb0 100644
> --- a/fs/nfs/nfs4client.c
> +++ b/fs/nfs/nfs4client.c
> @@ -881,7 +881,7 @@ static int nfs4_set_client(struct nfs_server *server,
>
> if (minorversion == 0)
> __set_bit(NFS_CS_REUSEPORT, &cl_init.init_flags);
> - else if (proto == XPRT_TRANSPORT_TCP)
> + if (proto == XPRT_TRANSPORT_TCP)
> cl_init.nconnect = nconnect;
>
> if (server->flags & NFS_MOUNT_NORESVPORT)

2020-01-17 21:14:36

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH 1/1] NFSv4.0 allow nconnect for v4.0

On Fri, 2020-01-17 at 21:09 +0000, Schumaker, Anna wrote:
> Hi Olga,
>
> On Thu, 2020-01-16 at 14:08 -0500, Olga Kornievskaia wrote:
> > From: Olga Kornievskaia <[email protected]>
>
> Have you done any testing with nconnect and the v4.0 replay caches? I
> did some
> digging on the mailing list and found this in one of the cover
> letters from
> Trond: "The feature is only enabled for NFSv4.1 and NFSv4.2 for now;
> I don't
> feel comfortable subjecting NFSv3/v4 replay caches to this treatment
> yet."
>

That comment should be considered obsolete. The current code works hard
to ensure that we replay using the same connection (or at least the
same source/dest IP+ports) so that NFSv3/v4.0 DRCs work as expected.
For that reason we've had NFSv3 support since the feature was merged.
The NFSv4.0 support was just forgotten.

> Thanks,
> Anna
>
> > Signed-off-by: Olga Kornievskaia <[email protected]>
> > ---
> > fs/nfs/nfs4client.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
> > index 460d625..4df3fb0 100644
> > --- a/fs/nfs/nfs4client.c
> > +++ b/fs/nfs/nfs4client.c
> > @@ -881,7 +881,7 @@ static int nfs4_set_client(struct nfs_server
> > *server,
> >
> > if (minorversion == 0)
> > __set_bit(NFS_CS_REUSEPORT, &cl_init.init_flags);
> > - else if (proto == XPRT_TRANSPORT_TCP)
> > + if (proto == XPRT_TRANSPORT_TCP)
> > cl_init.nconnect = nconnect;
> >
> > if (server->flags & NFS_MOUNT_NORESVPORT)
--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]


2020-01-17 21:17:12

by Anna Schumaker

[permalink] [raw]
Subject: Re: [PATCH 1/1] NFSv4.0 allow nconnect for v4.0

On Fri, 2020-01-17 at 21:14 +0000, Trond Myklebust wrote:
> On Fri, 2020-01-17 at 21:09 +0000, Schumaker, Anna wrote:
> > Hi Olga,
> >
> > On Thu, 2020-01-16 at 14:08 -0500, Olga Kornievskaia wrote:
> > > From: Olga Kornievskaia <[email protected]>
> >
> > Have you done any testing with nconnect and the v4.0 replay caches? I
> > did some
> > digging on the mailing list and found this in one of the cover
> > letters from
> > Trond: "The feature is only enabled for NFSv4.1 and NFSv4.2 for now;
> > I don't
> > feel comfortable subjecting NFSv3/v4 replay caches to this treatment
> > yet."
> >
>
> That comment should be considered obsolete. The current code works hard
> to ensure that we replay using the same connection (or at least the
> same source/dest IP+ports) so that NFSv3/v4.0 DRCs work as expected.
> For that reason we've had NFSv3 support since the feature was merged.
> The NFSv4.0 support was just forgotten.

Thanks for the explanation! I'll add the patch.

Anna

>
> > Thanks,
> > Anna
> >
> > > Signed-off-by: Olga Kornievskaia <[email protected]>
> > > ---
> > > fs/nfs/nfs4client.c | 2 +-
> > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > >
> > > diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
> > > index 460d625..4df3fb0 100644
> > > --- a/fs/nfs/nfs4client.c
> > > +++ b/fs/nfs/nfs4client.c
> > > @@ -881,7 +881,7 @@ static int nfs4_set_client(struct nfs_server
> > > *server,
> > >
> > > if (minorversion == 0)
> > > __set_bit(NFS_CS_REUSEPORT, &cl_init.init_flags);
> > > - else if (proto == XPRT_TRANSPORT_TCP)
> > > + if (proto == XPRT_TRANSPORT_TCP)
> > > cl_init.nconnect = nconnect;
> > >
> > > if (server->flags & NFS_MOUNT_NORESVPORT)
> --
> Trond Myklebust
> Linux NFS client maintainer, Hammerspace
> [email protected]
>
>

2020-01-20 15:36:04

by Steve Dickson

[permalink] [raw]
Subject: Re: [PATCH 1/1] NFSv4.0 allow nconnect for v4.0

Hello,

On 1/16/20 2:08 PM, Olga Kornievskaia wrote:
> From: Olga Kornievskaia <[email protected]>
>
> Signed-off-by: Olga Kornievskaia <[email protected]>
> ---
> fs/nfs/nfs4client.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
> index 460d625..4df3fb0 100644
> --- a/fs/nfs/nfs4client.c
> +++ b/fs/nfs/nfs4client.c
> @@ -881,7 +881,7 @@ static int nfs4_set_client(struct nfs_server *server,
>
> if (minorversion == 0)
> __set_bit(NFS_CS_REUSEPORT, &cl_init.init_flags);
> - else if (proto == XPRT_TRANSPORT_TCP)
> + if (proto == XPRT_TRANSPORT_TCP)
> cl_init.nconnect = nconnect;
>
> if (server->flags & NFS_MOUNT_NORESVPORT)
>
Tested-by: Steve Dickson <[email protected]>

With this patch v4.0 mounts act just like v4.1/v4.2 mounts
But is that a good thing. :-)

Here is what I've found in my testing...

mount -onconnect=12 172.31.1.54:/home/tmp /mnt/tmp

Will create 12 TCP connections and maintain those 12
connections until the umount happens. By maintain I mean
if the connection times out, it is reconnected
to maintain the 12 connections

# mount -onconnect=12 172.31.1.54:/home/tmp /mnt/tmp
# netstat -an | grep 172.31.1.54 | wc -l
12
# netstat -an | grep 172.31.1.54
tcp 0 0 172.31.1.24:901 172.31.1.54:2049 ESTABLISHED
tcp 0 0 172.31.1.24:667 172.31.1.54:2049 ESTABLISHED
tcp 0 0 172.31.1.24:746 172.31.1.54:2049 ESTABLISHED
tcp 0 0 172.31.1.24:672 172.31.1.54:2049 ESTABLISHED
tcp 0 0 172.31.1.24:832 172.31.1.54:2049 ESTABLISHED
tcp 0 0 172.31.1.24:895 172.31.1.54:2049 ESTABLISHED
tcp 0 0 172.31.1.24:673 172.31.1.54:2049 ESTABLISHED
tcp 0 0 172.31.1.24:732 172.31.1.54:2049 ESTABLISHED
tcp 0 0 172.31.1.24:795 172.31.1.54:2049 ESTABLISHED
tcp 0 0 172.31.1.24:918 172.31.1.54:2049 ESTABLISHED
tcp 0 0 172.31.1.24:674 172.31.1.54:2049 ESTABLISHED
tcp 0 0 172.31.1.24:953 172.31.1.54:2049 ESTABLISHED

# umount /mnt/tmp
# netstat -an | grep 172.31.1.54 | wc -l
12
# netstat -an | grep 172.31.1.54
tcp 0 0 172.31.1.24:901 172.31.1.54:2049 TIME_WAIT
tcp 0 0 172.31.1.24:667 172.31.1.54:2049 TIME_WAIT
tcp 0 0 172.31.1.24:746 172.31.1.54:2049 TIME_WAIT
tcp 0 0 172.31.1.24:672 172.31.1.54:2049 TIME_WAIT
tcp 0 0 172.31.1.24:832 172.31.1.54:2049 TIME_WAIT
tcp 0 0 172.31.1.24:895 172.31.1.54:2049 TIME_WAIT
tcp 0 0 172.31.1.24:673 172.31.1.54:2049 TIME_WAIT
tcp 0 0 172.31.1.24:732 172.31.1.54:2049 TIME_WAIT
tcp 0 0 172.31.1.24:795 172.31.1.54:2049 TIME_WAIT
tcp 0 0 172.31.1.24:918 172.31.1.54:2049 TIME_WAIT
tcp 0 0 172.31.1.24:674 172.31.1.54:2049 TIME_WAIT
tcp 0 0 172.31.1.24:953 172.31.1.54:2049 TIME_WAIT

Is this the expected behavior?

If so I have a few concerns...

* The connections walk all over the /etc/services namespace. Meaning
using ports that are reserved for registered services, something
we've tried to avoid in userland by not binding to privilege ports and
use of backlist ports via /etc/bindresvport.blacklist

* When the unmount happens, all those connections go into TIME_WAIT on
privilege ports and there are only so many of those. Not good during mount
storms (when a server reboots and thousand of home dirs are remounted).

* No man page describing the new feature.

I realize there is not much we can do about some of these
(aka umount==>TIME_WAIT) but I think we need to document
what we are doing to people's connection namespace when
they use this feature.

steved.

2020-01-23 00:25:25

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH 1/1] NFSv4.0 allow nconnect for v4.0

On Mon, 2020-01-20 at 10:35 -0500, Steve Dickson wrote:
> Hello,
>
> On 1/16/20 2:08 PM, Olga Kornievskaia wrote:
> > From: Olga Kornievskaia <[email protected]>
> >
> > Signed-off-by: Olga Kornievskaia <[email protected]>
> > ---
> > fs/nfs/nfs4client.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
> > index 460d625..4df3fb0 100644
> > --- a/fs/nfs/nfs4client.c
> > +++ b/fs/nfs/nfs4client.c
> > @@ -881,7 +881,7 @@ static int nfs4_set_client(struct nfs_server
> > *server,
> >
> > if (minorversion == 0)
> > __set_bit(NFS_CS_REUSEPORT, &cl_init.init_flags);
> > - else if (proto == XPRT_TRANSPORT_TCP)
> > + if (proto == XPRT_TRANSPORT_TCP)
> > cl_init.nconnect = nconnect;
> >
> > if (server->flags & NFS_MOUNT_NORESVPORT)
> >
> Tested-by: Steve Dickson <[email protected]>
>
> With this patch v4.0 mounts act just like v4.1/v4.2 mounts
> But is that a good thing. :-)
>
> Here is what I've found in my testing...
>
> mount -onconnect=12 172.31.1.54:/home/tmp /mnt/tmp
>
> Will create 12 TCP connections and maintain those 12
> connections until the umount happens. By maintain I mean
> if the connection times out, it is reconnected
> to maintain the 12 connections
>
> # mount -onconnect=12 172.31.1.54:/home/tmp /mnt/tmp
> # netstat -an | grep 172.31.1.54 | wc -l
> 12
> # netstat -an | grep 172.31.1.54
> tcp 0 0
> 172.31.1.24:901 172.31.1.54:2049 ESTABLISHED
> tcp 0 0
> 172.31.1.24:667 172.31.1.54:2049 ESTABLISHED
> tcp 0 0
> 172.31.1.24:746 172.31.1.54:2049 ESTABLISHED
> tcp 0 0
> 172.31.1.24:672 172.31.1.54:2049 ESTABLISHED
> tcp 0 0
> 172.31.1.24:832 172.31.1.54:2049 ESTABLISHED
> tcp 0 0
> 172.31.1.24:895 172.31.1.54:2049 ESTABLISHED
> tcp 0 0
> 172.31.1.24:673 172.31.1.54:2049 ESTABLISHED
> tcp 0 0
> 172.31.1.24:732 172.31.1.54:2049 ESTABLISHED
> tcp 0 0
> 172.31.1.24:795 172.31.1.54:2049 ESTABLISHED
> tcp 0 0
> 172.31.1.24:918 172.31.1.54:2049 ESTABLISHED
> tcp 0 0
> 172.31.1.24:674 172.31.1.54:2049 ESTABLISHED
> tcp 0 0
> 172.31.1.24:953 172.31.1.54:2049 ESTABLISHED
>
> # umount /mnt/tmp
> # netstat -an | grep 172.31.1.54 | wc -l
> 12
> # netstat -an | grep 172.31.1.54
> tcp 0 0
> 172.31.1.24:901 172.31.1.54:2049 TIME_WAIT
> tcp 0 0
> 172.31.1.24:667 172.31.1.54:2049 TIME_WAIT
> tcp 0 0
> 172.31.1.24:746 172.31.1.54:2049 TIME_WAIT
> tcp 0 0
> 172.31.1.24:672 172.31.1.54:2049 TIME_WAIT
> tcp 0 0
> 172.31.1.24:832 172.31.1.54:2049 TIME_WAIT
> tcp 0 0
> 172.31.1.24:895 172.31.1.54:2049 TIME_WAIT
> tcp 0 0
> 172.31.1.24:673 172.31.1.54:2049 TIME_WAIT
> tcp 0 0
> 172.31.1.24:732 172.31.1.54:2049 TIME_WAIT
> tcp 0 0
> 172.31.1.24:795 172.31.1.54:2049 TIME_WAIT
> tcp 0 0
> 172.31.1.24:918 172.31.1.54:2049 TIME_WAIT
> tcp 0 0
> 172.31.1.24:674 172.31.1.54:2049 TIME_WAIT
> tcp 0 0
> 172.31.1.24:953 172.31.1.54:2049 TIME_WAIT
>
> Is this the expected behavior?
>
> If so I have a few concerns...
>
> * The connections walk all over the /etc/services namespace. Meaning
> using ports that are reserved for registered services, something
> we've tried to avoid in userland by not binding to privilege ports
> and
> use of backlist ports via /etc/bindresvport.blacklist
>
> * When the unmount happens, all those connections go into TIME_WAIT
> on
> privilege ports and there are only so many of those. Not good during
> mount
> storms (when a server reboots and thousand of home dirs are
> remounted).
>
> * No man page describing the new feature.
>
> I realize there is not much we can do about some of these
> (aka umount==>TIME_WAIT) but I think we need to document
> what we are doing to people's connection namespace when
> they use this feature.

I'm not sure that I understand the concern. The connections are limited
to a specific window of ports by the min_resvport/max_resvport sunrpc
module parameters just as they were before we added 'nconnect'. Nothing
has changed in the way we choose ports...

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]


2020-01-27 18:39:45

by Steve Dickson

[permalink] [raw]
Subject: Re: [PATCH 1/1] NFSv4.0 allow nconnect for v4.0



On 1/22/20 7:23 PM, Trond Myklebust wrote:
> On Mon, 2020-01-20 at 10:35 -0500, Steve Dickson wrote:
>> Hello,
>>
>> On 1/16/20 2:08 PM, Olga Kornievskaia wrote:
>>> From: Olga Kornievskaia <[email protected]>
>>>
>>> Signed-off-by: Olga Kornievskaia <[email protected]>
>>> ---
>>> fs/nfs/nfs4client.c | 2 +-
>>> 1 file changed, 1 insertion(+), 1 deletion(-)
>>>
>>> diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
>>> index 460d625..4df3fb0 100644
>>> --- a/fs/nfs/nfs4client.c
>>> +++ b/fs/nfs/nfs4client.c
>>> @@ -881,7 +881,7 @@ static int nfs4_set_client(struct nfs_server
>>> *server,
>>>
>>> if (minorversion == 0)
>>> __set_bit(NFS_CS_REUSEPORT, &cl_init.init_flags);
>>> - else if (proto == XPRT_TRANSPORT_TCP)
>>> + if (proto == XPRT_TRANSPORT_TCP)
>>> cl_init.nconnect = nconnect;
>>>
>>> if (server->flags & NFS_MOUNT_NORESVPORT)
>>>
>> Tested-by: Steve Dickson <[email protected]>
>>
>> With this patch v4.0 mounts act just like v4.1/v4.2 mounts
>> But is that a good thing. :-)
>>
>> Here is what I've found in my testing...
>>
>> mount -onconnect=12 172.31.1.54:/home/tmp /mnt/tmp
>>
>> Will create 12 TCP connections and maintain those 12
>> connections until the umount happens. By maintain I mean
>> if the connection times out, it is reconnected
>> to maintain the 12 connections
>>
>> # mount -onconnect=12 172.31.1.54:/home/tmp /mnt/tmp
>> # netstat -an | grep 172.31.1.54 | wc -l
>> 12
>> # netstat -an | grep 172.31.1.54
>> tcp 0 0
>> 172.31.1.24:901 172.31.1.54:2049 ESTABLISHED
>> tcp 0 0
>> 172.31.1.24:667 172.31.1.54:2049 ESTABLISHED
>> tcp 0 0
>> 172.31.1.24:746 172.31.1.54:2049 ESTABLISHED
>> tcp 0 0
>> 172.31.1.24:672 172.31.1.54:2049 ESTABLISHED
>> tcp 0 0
>> 172.31.1.24:832 172.31.1.54:2049 ESTABLISHED
>> tcp 0 0
>> 172.31.1.24:895 172.31.1.54:2049 ESTABLISHED
>> tcp 0 0
>> 172.31.1.24:673 172.31.1.54:2049 ESTABLISHED
>> tcp 0 0
>> 172.31.1.24:732 172.31.1.54:2049 ESTABLISHED
>> tcp 0 0
>> 172.31.1.24:795 172.31.1.54:2049 ESTABLISHED
>> tcp 0 0
>> 172.31.1.24:918 172.31.1.54:2049 ESTABLISHED
>> tcp 0 0
>> 172.31.1.24:674 172.31.1.54:2049 ESTABLISHED
>> tcp 0 0
>> 172.31.1.24:953 172.31.1.54:2049 ESTABLISHED
>>
>> # umount /mnt/tmp
>> # netstat -an | grep 172.31.1.54 | wc -l
>> 12
>> # netstat -an | grep 172.31.1.54
>> tcp 0 0
>> 172.31.1.24:901 172.31.1.54:2049 TIME_WAIT
>> tcp 0 0
>> 172.31.1.24:667 172.31.1.54:2049 TIME_WAIT
>> tcp 0 0
>> 172.31.1.24:746 172.31.1.54:2049 TIME_WAIT
>> tcp 0 0
>> 172.31.1.24:672 172.31.1.54:2049 TIME_WAIT
>> tcp 0 0
>> 172.31.1.24:832 172.31.1.54:2049 TIME_WAIT
>> tcp 0 0
>> 172.31.1.24:895 172.31.1.54:2049 TIME_WAIT
>> tcp 0 0
>> 172.31.1.24:673 172.31.1.54:2049 TIME_WAIT
>> tcp 0 0
>> 172.31.1.24:732 172.31.1.54:2049 TIME_WAIT
>> tcp 0 0
>> 172.31.1.24:795 172.31.1.54:2049 TIME_WAIT
>> tcp 0 0
>> 172.31.1.24:918 172.31.1.54:2049 TIME_WAIT
>> tcp 0 0
>> 172.31.1.24:674 172.31.1.54:2049 TIME_WAIT
>> tcp 0 0
>> 172.31.1.24:953 172.31.1.54:2049 TIME_WAIT
>>
>> Is this the expected behavior?
>>
>> If so I have a few concerns...
>>
>> * The connections walk all over the /etc/services namespace. Meaning
>> using ports that are reserved for registered services, something
>> we've tried to avoid in userland by not binding to privilege ports
>> and
>> use of backlist ports via /etc/bindresvport.blacklist
>>
>> * When the unmount happens, all those connections go into TIME_WAIT
>> on
>> privilege ports and there are only so many of those. Not good during
>> mount
>> storms (when a server reboots and thousand of home dirs are
>> remounted).
>>
>> * No man page describing the new feature.
>>
>> I realize there is not much we can do about some of these
>> (aka umount==>TIME_WAIT) but I think we need to document
>> what we are doing to people's connection namespace when
>> they use this feature.
>
> I'm not sure that I understand the concern. The connections are limited
> to a specific window of ports by the min_resvport/max_resvport sunrpc
> module parameters just as they were before we added 'nconnect'. Nothing
> has changed in the way we choose ports...
>
Maybe this problem has existed for a while...

Here are the mins/max ports
RPC_DEF_MIN_RESVPORT (665U)
RPC_DEF_MAX_RESVPORT (1023U)

From /etc/services there are the services in that range
acp(674), ha-cluster(694), kerberos-adm(749), kerberos-iv(750)
webster(765), phonebook(767), rsync(873), rquotad(875),
telnets(992), imaps(993), pop3s(995)

Granted a lot of these are unused/legacy services, but some of
them, like imaps and rsync, are still used.

My point is since the nconnect connections are persistent, for
the life of the mount, we could end up squatting on ports other
services will needed.

Maybe there is not much we can do about this... But we should explain
somewhere, like the man page, that nconnect will create up to 16
persistent connection on register privilege ports.

steved.

2020-01-27 19:35:25

by Trond Myklebust

[permalink] [raw]
Subject: Re: [PATCH 1/1] NFSv4.0 allow nconnect for v4.0

On Mon, 2020-01-27 at 13:39 -0500, Steve Dickson wrote:
>
> On 1/22/20 7:23 PM, Trond Myklebust wrote:
> > On Mon, 2020-01-20 at 10:35 -0500, Steve Dickson wrote:
> > > Hello,
> > >
> > > On 1/16/20 2:08 PM, Olga Kornievskaia wrote:
> > > > From: Olga Kornievskaia <[email protected]>
> > > >
> > > > Signed-off-by: Olga Kornievskaia <[email protected]>
> > > > ---
> > > > fs/nfs/nfs4client.c | 2 +-
> > > > 1 file changed, 1 insertion(+), 1 deletion(-)
> > > >
> > > > diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
> > > > index 460d625..4df3fb0 100644
> > > > --- a/fs/nfs/nfs4client.c
> > > > +++ b/fs/nfs/nfs4client.c
> > > > @@ -881,7 +881,7 @@ static int nfs4_set_client(struct
> > > > nfs_server
> > > > *server,
> > > >
> > > > if (minorversion == 0)
> > > > __set_bit(NFS_CS_REUSEPORT,
> > > > &cl_init.init_flags);
> > > > - else if (proto == XPRT_TRANSPORT_TCP)
> > > > + if (proto == XPRT_TRANSPORT_TCP)
> > > > cl_init.nconnect = nconnect;
> > > >
> > > > if (server->flags & NFS_MOUNT_NORESVPORT)
> > > >
> > > Tested-by: Steve Dickson <[email protected]>
> > >
> > > With this patch v4.0 mounts act just like v4.1/v4.2 mounts
> > > But is that a good thing. :-)
> > >
> > > Here is what I've found in my testing...
> > >
> > > mount -onconnect=12 172.31.1.54:/home/tmp /mnt/tmp
> > >
> > > Will create 12 TCP connections and maintain those 12
> > > connections until the umount happens. By maintain I mean
> > > if the connection times out, it is reconnected
> > > to maintain the 12 connections
> > >
> > > # mount -onconnect=12 172.31.1.54:/home/tmp /mnt/tmp
> > > # netstat -an | grep 172.31.1.54 | wc -l
> > > 12
> > > # netstat -an | grep 172.31.1.54
> > > tcp 0 0
> > > 172.31.1.24:901 172.31.1.54:2049 ESTABLISHED
> > > tcp 0 0
> > > 172.31.1.24:667 172.31.1.54:2049 ESTABLISHED
> > > tcp 0 0
> > > 172.31.1.24:746 172.31.1.54:2049 ESTABLISHED
> > > tcp 0 0
> > > 172.31.1.24:672 172.31.1.54:2049 ESTABLISHED
> > > tcp 0 0
> > > 172.31.1.24:832 172.31.1.54:2049 ESTABLISHED
> > > tcp 0 0
> > > 172.31.1.24:895 172.31.1.54:2049 ESTABLISHED
> > > tcp 0 0
> > > 172.31.1.24:673 172.31.1.54:2049 ESTABLISHED
> > > tcp 0 0
> > > 172.31.1.24:732 172.31.1.54:2049 ESTABLISHED
> > > tcp 0 0
> > > 172.31.1.24:795 172.31.1.54:2049 ESTABLISHED
> > > tcp 0 0
> > > 172.31.1.24:918 172.31.1.54:2049 ESTABLISHED
> > > tcp 0 0
> > > 172.31.1.24:674 172.31.1.54:2049 ESTABLISHED
> > > tcp 0 0
> > > 172.31.1.24:953 172.31.1.54:2049 ESTABLISHED
> > >
> > > # umount /mnt/tmp
> > > # netstat -an | grep 172.31.1.54 | wc -l
> > > 12
> > > # netstat -an | grep 172.31.1.54
> > > tcp 0 0
> > > 172.31.1.24:901 172.31.1.54:2049 TIME_WAIT
> > > tcp 0 0
> > > 172.31.1.24:667 172.31.1.54:2049 TIME_WAIT
> > > tcp 0 0
> > > 172.31.1.24:746 172.31.1.54:2049 TIME_WAIT
> > > tcp 0 0
> > > 172.31.1.24:672 172.31.1.54:2049 TIME_WAIT
> > > tcp 0 0
> > > 172.31.1.24:832 172.31.1.54:2049 TIME_WAIT
> > > tcp 0 0
> > > 172.31.1.24:895 172.31.1.54:2049 TIME_WAIT
> > > tcp 0 0
> > > 172.31.1.24:673 172.31.1.54:2049 TIME_WAIT
> > > tcp 0 0
> > > 172.31.1.24:732 172.31.1.54:2049 TIME_WAIT
> > > tcp 0 0
> > > 172.31.1.24:795 172.31.1.54:2049 TIME_WAIT
> > > tcp 0 0
> > > 172.31.1.24:918 172.31.1.54:2049 TIME_WAIT
> > > tcp 0 0
> > > 172.31.1.24:674 172.31.1.54:2049 TIME_WAIT
> > > tcp 0 0
> > > 172.31.1.24:953 172.31.1.54:2049 TIME_WAIT
> > >
> > > Is this the expected behavior?
> > >
> > > If so I have a few concerns...
> > >
> > > * The connections walk all over the /etc/services namespace.
> > > Meaning
> > > using ports that are reserved for registered services, something
> > > we've tried to avoid in userland by not binding to privilege
> > > ports
> > > and
> > > use of backlist ports via /etc/bindresvport.blacklist
> > >
> > > * When the unmount happens, all those connections go into
> > > TIME_WAIT
> > > on
> > > privilege ports and there are only so many of those. Not good
> > > during
> > > mount
> > > storms (when a server reboots and thousand of home dirs are
> > > remounted).
> > >
> > > * No man page describing the new feature.
> > >
> > > I realize there is not much we can do about some of these
> > > (aka umount==>TIME_WAIT) but I think we need to document
> > > what we are doing to people's connection namespace when
> > > they use this feature.
> >
> > I'm not sure that I understand the concern. The connections are
> > limited
> > to a specific window of ports by the min_resvport/max_resvport
> > sunrpc
> > module parameters just as they were before we added 'nconnect'.
> > Nothing
> > has changed in the way we choose ports...
> >
> Maybe this problem has existed for a while...
>
> Here are the mins/max ports
> RPC_DEF_MIN_RESVPORT (665U)
> RPC_DEF_MAX_RESVPORT (1023U)
>
> From /etc/services there are the services in that range
> acp(674), ha-cluster(694), kerberos-adm(749), kerberos-iv(750)
> webster(765), phonebook(767), rsync(873), rquotad(875),
> telnets(992), imaps(993), pop3s(995)
>
> Granted a lot of these are unused/legacy services, but some of
> them, like imaps and rsync, are still used.
>
> My point is since the nconnect connections are persistent, for
> the life of the mount, we could end up squatting on ports other
> services will needed.
>
> Maybe there is not much we can do about this... But we should explain
> somewhere, like the man page, that nconnect will create up to 16
> persistent connection on register privilege ports.


If users have a need to run servers on a port that might chosen by a
kernel nfs, lockd or rpcbind client, then they can guarantee no
collisions by redefining the 'privileged ports' available to sunrpc to
any arbitrary range <portnr1> - <portnr2>:

Either

* Reserve the ports at module load time, by adding a line to a config
file /etc/modprobe.d/foo.conf of the form
options sunrpc min_resvport=<portnr1> max_resvport=<portnr2>

or

* Change the port reservation after module load (but before mounting
your first NFS filesystem) using
echo '<portnr1>' > /sys/module/sunrpc/parameters/min_resvport
echo '<portnr2>' > /sys/module/sunrpc/parameters/max_resvport

This is something users ought to be doing already if they need
guaranteed availability of specific ports in the range 665-1023 while
using the NFS client. That is entirely independent of whether or not
they are using nconnect.

--
Trond Myklebust
Linux NFS client maintainer, Hammerspace
[email protected]


2020-01-31 19:56:57

by Olga Kornievskaia

[permalink] [raw]
Subject: Re: [PATCH 1/1] NFSv4.0 allow nconnect for v4.0

Trond, Anna,

If it's not too late then I'd like to re-tract this patch. I
understand if somebody else would really like to have this
functionality in then please speak up. If it's too late to pull I also
understand but I need to at least try.

Thank you.

On Thu, Jan 16, 2020 at 2:09 PM Olga Kornievskaia
<[email protected]> wrote:
>
> From: Olga Kornievskaia <[email protected]>
>
> Signed-off-by: Olga Kornievskaia <[email protected]>
> ---
> fs/nfs/nfs4client.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/nfs/nfs4client.c b/fs/nfs/nfs4client.c
> index 460d625..4df3fb0 100644
> --- a/fs/nfs/nfs4client.c
> +++ b/fs/nfs/nfs4client.c
> @@ -881,7 +881,7 @@ static int nfs4_set_client(struct nfs_server *server,
>
> if (minorversion == 0)
> __set_bit(NFS_CS_REUSEPORT, &cl_init.init_flags);
> - else if (proto == XPRT_TRANSPORT_TCP)
> + if (proto == XPRT_TRANSPORT_TCP)
> cl_init.nconnect = nconnect;
>
> if (server->flags & NFS_MOUNT_NORESVPORT)
> --
> 1.8.3.1
>