2006-10-18 06:39:45

by Frank van Maarseveen

Subject: Re: NFS inconsistent behaviour

On Wed, Oct 18, 2006 at 10:22:44AM +0900, Mohit Katiyar wrote:
> I checked it today and when i issued the netstat -t ,I could see a lot
> of tcp connections in TIME_WAIT state.
> Is this a normal behaviour?

yes... but see below

> So we cannot mount and umount infinitely
> with tcp option? Why there are so many connections in waiting state?

I think it's called the 2MSL wait: there may be TCP segments on the
wire which (in theory) could disrupt new connections that reuse the same
local and remote ports, so the ports stay in use for a few minutes. This is
standard TCP behavior but only occurs when connections are improperly
shut down. Apparently this happens when umounting a tcp NFS mount but
also for a lot of other tcp based RPC (showmount, rpcinfo). I'm not
sure who's to blame but it might be the rpc functions inside glibc.
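
A rough way to see how many reserved ports are tied up like this is to
count the TIME_WAIT entries in the kernel's socket table. Note that TCP
puts whichever end closes the connection first into TIME_WAIT, so the
client side collects these entries when it is the one hanging up. The
following is only a minimal sketch, assuming the Linux /proc/net/tcp text
layout (state code 06 for TIME_WAIT, local port in hex after the colon);
the function name count_time_wait is made up for illustration.

```python
# Sketch: count TCP sockets in TIME_WAIT, split by whether the local port
# is privileged (< 1024). Assumes the Linux /proc/net/tcp text layout.

TIME_WAIT = "06"  # state code used for TIME_WAIT in /proc/net/tcp


def count_time_wait(proc_net_tcp_text):
    """Return (privileged, unprivileged) counts of TIME_WAIT sockets."""
    privileged = unprivileged = 0
    for line in proc_net_tcp_text.splitlines()[1:]:  # skip the header line
        fields = line.split()
        if len(fields) < 4 or fields[3] != TIME_WAIT:
            continue
        # local_address looks like "0100007F:0321" (hex IP, then hex port)
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        if local_port < 1024:
            privileged += 1
        else:
            unprivileged += 1
    return privileged, unprivileged
```

On a Linux client it could be fed with open("/proc/net/tcp").read();
IPv6 sockets are listed separately and are not covered here.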

I'd switch to NFS over udp if this is a problem.

(cc'ed to nfs mailing list)

--
Frank


2006-10-18 17:57:09

by Trond Myklebust

Subject: Re: NFS inconsistent behaviour

On Wed, 2006-10-18 at 08:39 +0200, Frank van Maarseveen wrote:
> On Wed, Oct 18, 2006 at 10:22:44AM +0900, Mohit Katiyar wrote:
> > I checked it today and when i issued the netstat -t ,I could see a lot
> > of tcp connections in TIME_WAIT state.
> > Is this a normal behaviour?
>
> yes... but see below
>
> > So we cannot mount and umount infinitely
> > with tcp option? Why there are so many connections in waiting state?
>
> I think it's called the 2MSL wait: there may be TCP segments on the
> wire which (in theory) could disrupt new connections which reuse local
> and remote port so the ports stay in use for a few minutes. This is
> standard TCP behavior but only occurs when connections are improperly
> shutdown. Apparently this happens when umounting a tcp NFS mount but
> also for a lot of other tcp based RPC (showmount, rpcinfo). I'm not
> sure who's to blame but it might be the rpc functions inside glibc.
>
> I'd switch to NFS over udp if this is problem.

Just out of interest: why does anyone actually _want_ to keep
mounting/umounting to the point where they run out of ports? That is going
to kill performance in all sorts of unhealthy ways, not least by
completely screwing over any caching.

Note also that you _can_ change the range of ports used by the NFS
client itself at least. Just edit /proc/sys/sunrpc/{min,max}_resvport.
On the server side, you can use the 'insecure' option in order to allow
mounts that originate from non-privileged ports (i.e. port >= 1024).
If you are using strong authentication (for instance RPCSEC_GSS/krb5)
then that actually makes a lot of sense, since the only reason for the
privileged port requirement was to disallow unprivileged NFS clients.
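
To check what range the client is currently limited to, the two files
mentioned above can simply be read. This is a minimal sketch, not a
tested tool: the function names read_resvport_range and ports_available
are invented for illustration, and the sunrpc_dir parameter exists only
so the code can be pointed at a different directory.

```python
import os


def read_resvport_range(sunrpc_dir="/proc/sys/sunrpc"):
    """Return (min_resvport, max_resvport) as integers."""
    values = []
    for name in ("min_resvport", "max_resvport"):
        # each file holds a single decimal port number
        with open(os.path.join(sunrpc_dir, name)) as f:
            values.append(int(f.read().strip()))
    return tuple(values)


def ports_available(port_range):
    """Number of source ports in an inclusive (low, high) range."""
    low, high = port_range
    return max(0, high - low + 1)
```

Changing the range means writing new numbers to the same files as root;
the sketch only reads them.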

Cheers,
Trond

2006-10-18 18:37:34

by Trond Myklebust

Subject: Re: NFS inconsistent behaviour

On Wed, 2006-10-18 at 13:57 -0400, Trond Myklebust wrote:
> Note also that you _can_ change the range of ports used by the NFS
> client itself at least. Just edit /proc/sys/sunrpc/{min,max}_resvport.
> On the server side, you can use the 'insecure' option in order to allow
> mounts that originate from non-privileged ports (i.e. port > 1024).
> If you are using strong authentication (for instance RPCSEC_GSS/krb5)
> then that actually makes a lot of sense, since the only reason for the
> privileged port requirement was to disallow unprivileged NFS clients.

Oops... Something got lost there. That last sentence should read

...since the only reason for the privileged port requirement was
to disallow unprivileged NFS clients that could be used to spoof
other user identities via the weak AUTH_SYS authentication.

Cheers,
Trond


-------------------------------------------------------------------------
Using Tomcat but need to do more? Need to support web services, security?
Get stuff done quickly with pre-integrated technology to make your job easier
Download IBM WebSphere Application Server v.1.0.1 based on Apache Geronimo
http://sel.as-us.falkag.net/sel?cmd=lnk&kid=120709&bid=263057&dat=121642
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs

2006-10-18 18:38:13

by Frank van Maarseveen

Subject: Re: NFS inconsistent behaviour

On Wed, Oct 18, 2006 at 01:57:09PM -0400, Trond Myklebust wrote:
> On Wed, 2006-10-18 at 08:39 +0200, Frank van Maarseveen wrote:
> > On Wed, Oct 18, 2006 at 10:22:44AM +0900, Mohit Katiyar wrote:
> > > I checked it today and when i issued the netstat -t ,I could see a lot
> > > of tcp connections in TIME_WAIT state.
> > > Is this a normal behaviour?
> >
> > yes... but see below
> >
> > > So we cannot mount and umount infinitely
> > > with tcp option? Why there are so many connections in waiting state?
> >
> > I think it's called the 2MSL wait: there may be TCP segments on the
> > wire which (in theory) could disrupt new connections which reuse local
> > and remote port so the ports stay in use for a few minutes. This is
> > standard TCP behavior but only occurs when connections are improperly
> > shutdown. Apparently this happens when umounting a tcp NFS mount but
> > also for a lot of other tcp based RPC (showmount, rpcinfo). I'm not
> > sure who's to blame but it might be the rpc functions inside glibc.
> >
> > I'd switch to NFS over udp if this is problem.
>
> Just out of interest. Why does anyone actually _want_ to keep
> mount/umounting to the point where they run out of ports? That is going
> to kill performance in all sorts of unhealthy ways, not least by
> completely screwing over any caching.

I ran out of privileged ports due to treemounting on /net from about 50
servers. The autofs program map for this uses the "showmount" command and
that one apparently uses privileged ports too (buried inside RPC client
libs part of glibc IIRC). The combination broke autofs and a number of
other services because there were no privileged ports left anymore.

So it can happen in practice.

> Note also that you _can_ change the range of ports used by the NFS
> client itself at least. Just edit /proc/sys/sunrpc/{min,max}_resvport.
> On the server side, you can use the 'insecure' option in order to allow
> mounts that originate from non-privileged ports (i.e. port > 1024).

Increasing the privileged port range in the kernel might be doable in
some cases. It might be useful to extend it to include port 2049 too.

--
Frank


2006-10-18 19:26:43

by Trond Myklebust

Subject: Re: NFS inconsistent behaviour

On Wed, 2006-10-18 at 20:38 +0200, Frank van Maarseveen wrote:
> I ran out of privileged ports due to treemounting on /net from about 50
> servers. The autofs program map for this uses the "showmount" command and
> that one apparently uses privileged ports too (buried inside RPC client
> libs part of glibc IIRC). The combination broke autofs and a number of
> other services because there were no privileged ports left anymore.

Yeah. The RPC library appears to always try to grab a privileged port if
it can. One solution would be to have the autofs scripts drop all
privileges before calling showmount.

I suppose we could also change the showmount program to create a socket
that is bound to an unprivileged port, then use
clnttcp_create()/clntudp_create().
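
The binding step can be illustrated outside of C. The sketch below shows
only the idea of asking the kernel for an ordinary (non-reserved) port
by binding to port 0, instead of hunting for a port below 1024 as
bindresvport() does. The helper name bind_unprivileged is made up, and
handing the resulting descriptor to the RPC client creation call is left
out.

```python
import socket


def bind_unprivileged(host="0.0.0.0"):
    """Create a TCP socket bound to a kernel-chosen ephemeral port."""
    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Port 0 asks the kernel to pick a free port from the ephemeral
    # range, so no reserved (< 1024) port is consumed.
    sock.bind((host, 0))
    return sock
```

A caller would inspect sock.getsockname()[1] to see which port was
chosen, and close the socket when done.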

We could probably do the same in the "mount" program when doing things
like interrogating the portmapper, probing for rpc ports etc. The only
case where mount might actually need to use a privileged port is when
talking to mountd. Even then, it could be trained to first try using an
unprivileged port.

Cheers,
Trond



2006-10-18 20:09:42

by Frank van Maarseveen

Subject: Re: NFS inconsistent behaviour

On Wed, Oct 18, 2006 at 03:26:20PM -0400, Trond Myklebust wrote:
> On Wed, 2006-10-18 at 20:38 +0200, Frank van Maarseveen wrote:
> > I ran out of privileged ports due to treemounting on /net from about 50
> > servers. The autofs program map for this uses the "showmount" command and
> > that one apparently uses privileged ports too (buried inside RPC client
> > libs part of glibc IIRC). The combination broke autofs and a number of
> > other services because there were no privileged ports left anymore.
>
> Yeah. The RPC library appears to always try to grab a privileged port if
> it can. One solution would be to have the autofs scripts drop all
> privileges before calling showmount.
>
> I suppose we could also change the showmount program to create a socket
> that is bound to an unprivileged port, then use
> clnttcp_create()/clntudp_create().
>
> We could probably do the same in the "mount" program when doing things
> like interrogating the portmapper, probing for rpc ports etc. The only
> case where mount might actually need to use a privileged port is when
> talking to mountd. Even then, it could be trained to first try using an
> unprivileged port.

If we could fix why there are that many connections in state TIME_WAIT
then using privileged ports would not be a problem either.

--
Frank


2006-10-18 20:17:47

by Chuck Lever

Subject: Re: NFS inconsistent behaviour

On 10/18/06, Frank van Maarseveen <[email protected]> wrote:
> On Wed, Oct 18, 2006 at 03:26:20PM -0400, Trond Myklebust wrote:
> > On Wed, 2006-10-18 at 20:38 +0200, Frank van Maarseveen wrote:
> > > I ran out of privileged ports due to treemounting on /net from about 50
> > > servers. The autofs program map for this uses the "showmount" command and
> > > that one apparently uses privileged ports too (buried inside RPC client
> > > libs part of glibc IIRC). The combination broke autofs and a number of
> > > other services because there were no privileged ports left anymore.
> >
> > Yeah. The RPC library appears to always try to grab a privileged port if
> > it can. One solution would be to have the autofs scripts drop all
> > privileges before calling showmount.
> >
> > I suppose we could also change the showmount program to create a socket
> > that is bound to an unprivileged port, then use
> > clnttcp_create()/clntudp_create().
> >
> > We could probably do the same in the "mount" program when doing things
> > like interrogating the portmapper, probing for rpc ports etc. The only
> > case where mount might actually need to use a privileged port is when
> > talking to mountd. Even then, it could be trained to first try using an
> > unprivileged port.
>
> If we could fix why there are that many connections in state TIME_WAIT
> then using privileged ports would not be a problem either.

Some discussion on both FreeBSD and Linux mailing lists suggests that
ignoring TIME_WAIT has some risk to it, so that may not be an
advisable path to take. However, there are probably some cases where
it is safe, such as idle timeouts, where the client is certain there
is no traffic in flight.

Both client implementations (kernel and glibc) should re-use port
numbers or connections aggressively. To that end, the kernel RPC
client is already doing this. I know Red Hat has suggested using a
connection manager for user-level RPC applications to share. In
addition the kernel NFS client is sharing connections to a server
between all mount points going to that server.
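
The sharing idea can be sketched as a small cache keyed by server, so
that repeated requests for the same server reuse one connection instead
of burning a new port each time. This is purely illustrative: the class
ConnectionCache and its connect callback are hypothetical and do not
reflect how the kernel client or any proposed connection manager is
actually written.

```python
class ConnectionCache:
    """Hand out one shared connection object per server."""

    def __init__(self, connect):
        # connect: callable taking a server name, returning a connection
        self._connect = connect
        self._conns = {}

    def get(self, server):
        """Return the cached connection, creating it on first use."""
        if server not in self._conns:
            self._conns[server] = self._connect(server)
        return self._conns[server]

    def drop(self, server):
        """Forget a connection, e.g. after an idle timeout closed it."""
        return self._conns.pop(server, None)
```

With such a cache, fifty mounts from one server would cost one source
port rather than fifty.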

--
"We who cut mere stones must always be envisioning cathedrals"
-- Quarry worker's creed


2006-10-18 20:44:52

by Trond Myklebust

Subject: Re: NFS inconsistent behaviour

On Wed, 2006-10-18 at 16:17 -0400, Chuck Lever wrote:
> Both client implementations (kernel and glibc) should re-use port
> numbers or connections aggressively. To that end, the kernel RPC
> client is already doing this. I know Red Hat has suggested using a
> connection manager for user-level RPC applications to share. In
> addition the kernel NFS client is sharing connections to a server
> between all mount points going to that server.

IIRC, Mike Waychison did some work a couple of years ago on a userspace
daemon that managed RPC connections.

Cheers,
Trond



2006-10-19 01:53:44

by Mohit Katiyar

Subject: Re: NFS inconsistent behaviour

Yes, I do not want to mount and unmount infinitely; I was just checking
out of curiosity. But mounting/unmounting infinitely works completely
fine on SLES 9, which uses the 2.6.5 kernel. I was just wondering what
has changed so that it does not work now.

On 10/19/06, Trond Myklebust <[email protected]> wrote:
> On Wed, 2006-10-18 at 08:39 +0200, Frank van Maarseveen wrote:
> > On Wed, Oct 18, 2006 at 10:22:44AM +0900, Mohit Katiyar wrote:
> > > I checked it today and when i issued the netstat -t ,I could see a lot
> > > of tcp connections in TIME_WAIT state.
> > > Is this a normal behaviour?
> >
> > yes... but see below
> >
> > > So we cannot mount and umount infinitely
> > > with tcp option? Why there are so many connections in waiting state?
> >
> > I think it's called the 2MSL wait: there may be TCP segments on the
> > wire which (in theory) could disrupt new connections which reuse local
> > and remote port so the ports stay in use for a few minutes. This is
> > standard TCP behavior but only occurs when connections are improperly
> > shutdown. Apparently this happens when umounting a tcp NFS mount but
> > also for a lot of other tcp based RPC (showmount, rpcinfo). I'm not
> > sure who's to blame but it might be the rpc functions inside glibc.
> >
> > I'd switch to NFS over udp if this is problem.
>
> Just out of interest. Why does anyone actually _want_ to keep
> mount/umounting to the point where they run out of ports? That is going
> to kill performance in all sorts of unhealthy ways, not least by
> completely screwing over any caching.
>
> Note also that you _can_ change the range of ports used by the NFS
> client itself at least. Just edit /proc/sys/sunrpc/{min,max}_resvport.
> On the server side, you can use the 'insecure' option in order to allow
> mounts that originate from non-privileged ports (i.e. port > 1024).
> If you are using strong authentication (for instance RPCSEC_GSS/krb5)
> then that actually makes a lot of sense, since the only reason for the
> privileged port requirement was to disallow unprivileged NFS clients.
>
> Cheers,
> Trond
>
>


2006-10-19 12:11:17

by Alan

Subject: Re: [NFS] NFS inconsistent behaviour

On Wed, 2006-10-18 at 16:17 -0400, Chuck Lever wrote:
> Some discussion on both FreeBSD and Linux mailing lists suggests that
> ignoring TIME_WAIT has some risk to it, so that may not be an

Ignoring time wait leads to corrupted sessions and can lead to tcp food
fights. It exists for a reason although the protocol itself actually
does still have flaws in this area (which are kept in the locked
cupboard full of skeletons at the IETF 8) )

Alan