2003-10-14 15:51:44

by Ogden, Aaron A.

Subject: ~800 mountpoint limitation


Date: Mon, 13 Oct 2003 11:23:51 +0800 (WST)
From: Ian Kent <[email protected]>
To: "Ogden, Aaron A." <[email protected]>
cc: Mike Waychison <[email protected]>,
autofs mailing list <[email protected]>,
<[email protected]>
Subject: Re: [NFS] RE: [autofs] multiple servers per automount

> On Fri, 10 Oct 2003, Ogden, Aaron A. wrote:
>
> > > So this would indicate that even if there is a device system that can
> > > increase the number of unnamed devices that subsystems like NFS cannot
> > > handle this many mounts.
> >
> > Maybe. I'm not 100% certain though. Currently I am holding steady at
> > 710 active mounts, I am going to write a little script to mount more in
> > small increments, ie. read a list of ~1000 mountpoints from /home, mount
> > a few of them, check the filesystems, and repeat... this way I will know
> > exactly where things break down.
>
> Interesting.
>
> If you can edge it up then it's probably not an available port
> restriction.
>
> There may be more than one issue at work here.
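
The incremental test described in the quote above (mount a few, check, repeat) could be sketched roughly as follows. This is only an illustration, not the script that was actually run: the function names, the batch size and the use of a directory listing to trigger the automount are all assumptions.

```python
import os
from typing import Callable, Iterable, List, Tuple


def is_mounted(path: str) -> bool:
    """Assumed check: touch the path so autofs mounts it, then test it."""
    try:
        os.listdir(path)              # accessing the path triggers the automount
    except OSError:
        return False
    return os.path.ismount(path)


def find_breaking_point(
    mountpoints: Iterable[str],
    batch_size: int = 10,
    check: Callable[[str], bool] = is_mounted,
) -> Tuple[int, List[str]]:
    """Mount in small batches and stop at the first batch with a failure.

    Returns (number of mounts that worked, paths that failed in the last batch).
    """
    ok = 0
    batch: List[str] = []
    for path in mountpoints:
        batch.append(path)
        if len(batch) == batch_size:
            failed = [p for p in batch if not check(p)]
            ok += len(batch) - len(failed)
            if failed:
                return ok, failed
            batch = []
    # Handle a final, partially filled batch.
    failed = [p for p in batch if not check(p)]
    return ok + len(batch) - len(failed), failed
```

Passing a different `check` function makes it possible to try the loop without touching any real mountpoints.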

Hello all,
This limitation has been confirmed to my satisfaction: I just had a test
machine crack at 799 mountpoints, so it seems that Chuck's RPC theory
may be correct. Mike Waychison and Eric Werme mentioned that Solaris
and HP-UX use a multiplexed scheme to reduce the number of ports required
for NFS clients; perhaps that is the direction Linux NFS/RPC should be
heading.

--aaron


-------------------------------------------------------
This SF.net email is sponsored by: SF.net Giveback Program.
SourceForge.net hosts over 70,000 Open Source Projects.
See the people who have HELPED US provide better services:
Click here: http://sourceforge.net/supporters.php
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2003-10-18 04:09:44

by Trond Myklebust

Subject: Re: ~800 mountpoint limitation

>>>>> " " == Charles Lever <Lever> writes:

> trond has heard my complaints about this before.... sharing an
> RPC transport socket across mounts is an interesting solution
> in some ways, but i'm concerned about the performance
> scalability of this solution, especially since the RPC slot
> table size is fixed at a relatively small 16 entries.

The answer to this problem is to make the slot table size
configurable. I believe you even have a patch for that, Chuck ;-)

> imagine sharing 16 RPC slots across all the mounts on a very
> busy multi-user system. if one mount backs up (say because one
> of the server's disks gets busy), that makes all the mounts
> sharing that slot table unusable.


I'm not really sure that I buy the argument about the server disks
getting busy. That will tend to hit you whether or not you are sharing
RPC slots, since it also ties up server resources.

The main point with putting all the transport to a given server on one
socket is that the UDP/TCP congestion control algorithms can then
function efficiently. Instead of competing against other packets
originating from itself, the client only has to deal with competition
from other clients.

Cheers,
Trond



2003-10-14 16:06:15

by Trond Myklebust

Subject: Re: ~800 mountpoint limitation

>>>>> " " == Aaron A Ogden <Ogden> writes:

> Hello all, This limitation has been confirmed to my
> satisfaction, I just had a test machine crack at 799
> mountpoints, so it seems that Chuck's RPC theory may be
> correct. Mike Waychison and Eric Werme mentioned that Solaris
> and HPUX use a multiplexed scheme to reduce the number of ports
> required for NFS clients, perhaps that is the direction linux
> NFS/RPC should be heading.

Yes, I already started work on that. In fact a patch already exists on

http://www.fys.uio.no/~trondmy/src/2.5.73/linux-2.5.73-09-xprt.dif

to multiplex all RPC traffic to a given server/port combination
through a single socket.
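
As a rough user-space illustration of the idea (this is not the kernel patch itself, and every name here is hypothetical), multiplexing amounts to keeping one reference-counted transport per server/port pair and handing that same transport to every mount that asks for it:

```python
import socket
from typing import Callable, Dict, Tuple

Key = Tuple[str, int]  # (server address, port)


class TransportCache:
    """One shared transport per (server, port), with reference counting."""

    def __init__(self, connect: Callable[[str, int], object]):
        self._connect = connect
        self._transports: Dict[Key, object] = {}
        self._refs: Dict[Key, int] = {}

    def get(self, server: str, port: int) -> object:
        """Return the shared transport, creating it on first use."""
        key = (server, port)
        if key not in self._transports:
            self._transports[key] = self._connect(server, port)
            self._refs[key] = 0
        self._refs[key] += 1
        return self._transports[key]

    def put(self, server: str, port: int) -> None:
        """Drop one reference; close the transport when nobody uses it."""
        key = (server, port)
        self._refs[key] -= 1
        if self._refs[key] == 0:
            transport = self._transports.pop(key)
            del self._refs[key]
            close = getattr(transport, "close", None)
            if close is not None:
                close()

    def __len__(self) -> int:
        return len(self._transports)


def tcp_connect(server: str, port: int) -> socket.socket:
    """Example connect function: one TCP socket to the server."""
    return socket.create_connection((server, port))
```

With a cache like this, a thousand mounts from one fileserver would hold one socket (and one local port) instead of a thousand.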

It's been put on hold though due to a combination of higher priority
interrupts, and now the 2.6.0 code-freeze...

Cheers,
Trond



2003-10-14 16:28:23

by Ogden, Aaron A.

Subject: RE: ~800 mountpoint limitation


Excellent! I was hoping that you were following the discussion. Is it
something you can fix, though? It seems that RPC may be involved as
well...

Is there any hope of a patch for 2.4.20+ or is this strictly 2.6 code?



2003-10-14 16:44:31

by Trond Myklebust

Subject: RE: ~800 mountpoint limitation

>>>>> " " == Aaron A Ogden <Ogden> writes:

> Excellent! I was hoping that you were following the
> discussion. It is something you can fix though? It seems that
> RPC may be involved as well...

> Is there any hope of a patch for 2.4.20+ or is this strictly
> 2.6 code?

I'm really too busy right now to do a backport for 2.4.x (hell, I'm
even too busy to forward-port to 2.6.0-test7 8-().

That said, it shouldn't be hard to do. I've made sure that the 2.4.x
and 2.6.x RPC code have been tracking one another, so there should be
few real differences to worry about. For all I know, the patch may
even apply without any changes...

Cheers,
Trond



2003-10-14 16:45:41

by Lever, Charles

Subject: RE: ~800 mountpoint limitation

trond has heard my complaints about this before....

sharing an RPC transport socket across mounts is an
interesting solution in some ways, but i'm concerned
about the performance scalability of this solution,
especially since the RPC slot table size is fixed at
a relatively small 16 entries. imagine sharing 16
RPC slots across all the mounts on a very busy
multi-user system. if one mount backs up (say
because one of the server's disks gets busy), that
makes all the mounts sharing that slot table unusable.
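
a toy model of that concern, not the kernel's actual slot table code (the class and mount names below are made up for illustration): with one fixed pool of 16 slots shared by every mount, requests from a single stalled mount can occupy the whole pool.

```python
from collections import Counter


class SlotTable:
    """Fixed pool of request slots shared by every mount on one transport."""

    def __init__(self, size: int = 16):
        self.size = size
        self.in_flight: Counter = Counter()   # mount name -> slots currently held

    def acquire(self, mount: str) -> bool:
        """Take a slot for a new request; False means the table is full."""
        if sum(self.in_flight.values()) >= self.size:
            return False
        self.in_flight[mount] += 1
        return True

    def release(self, mount: str) -> None:
        """Give a slot back once the server has answered."""
        if self.in_flight[mount] > 0:
            self.in_flight[mount] -= 1


if __name__ == "__main__":
    table = SlotTable()
    # /home/slow sits on a busy disk: requests go out but replies never arrive.
    stuck = sum(table.acquire("/home/slow") for _ in range(20))
    # /home/fast shares the same table and now has to wait.
    print(stuck, table.acquire("/home/fast"))
```

making the table size configurable raises the ceiling, but in this model the starvation pattern stays the same as long as one mount is allowed to hold every slot.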

this solution also won't allow you to mount more than
800 different servers, for similar reasons to today's
800 mounts per client limitation. you can still only
have about 800 transport sockets.

so there are some issues to be worked out.




2003-10-14 17:04:21

by Ogden, Aaron A.

Subject: RE: ~800 mountpoint limitation


Even so, mounting from 800 separate fileservers would provide much more
than we have now. I wonder if there is any real-world site that could
even approach 800 NFS servers? (and if so, why???) With the RPC
limitations of today I can shatter the limit very easily: all I have to
do is mount all of my home directories, and there are more than 800.

Re: scalability and performance, I think I'll let Trond and Neil comment
on that, but Trond's patch for 2.5.73 sounds like a big improvement over
the current situation.

I've also noticed that I get RPC errors when I try to mount too many
things at once, for example if I mount 200+ autofs-managed NFS
mountpoints via script by cd'ing into them. If I insert a short pause
(1/2 second to 1 second) between each mount attempt everything is fine.
Sounds like some kind of race condition to me...
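
For anyone wanting to reproduce the workaround, a paced version of such a script might look like the sketch below. It is an assumption-laden illustration rather than the script I used: the function names are invented, and a directory listing stands in for cd'ing into each mountpoint.

```python
import os
import time
from typing import Callable, Iterable, List


def trigger_mount(path: str) -> None:
    """Accessing an autofs-managed directory asks the automounter to mount it."""
    os.listdir(path)


def mount_paced(
    paths: Iterable[str],
    pause: float = 0.5,
    trigger: Callable[[str], None] = trigger_mount,
    sleep: Callable[[float], None] = time.sleep,
) -> List[str]:
    """Trigger each mount in turn, pausing between attempts.

    Returns the paths whose mount attempt raised an OSError.
    """
    failed: List[str] = []
    for i, path in enumerate(paths):
        if i:
            sleep(pause)              # no pause before the very first attempt
        try:
            trigger(path)
        except OSError:
            failed.append(path)
    return failed
```

Setting `pause` to 0 gives the back-to-back behaviour that produced the RPC errors, so the two cases can be compared with the same list of paths.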

-A




2003-10-15 12:56:15

by Ian Kent

Subject: RE: ~800 mountpoint limitation

On Tue, 14 Oct 2003, Ogden, Aaron A. wrote:

>
> Even so, mounting from 800 separate fileservers would provide much more
> than we have now. I wonder if there is any real-world site that could
> even approach 800 NFS servers? (and if so, why???) With the RPC
> limitations of today I can shatter the limit very easily, all I have to
> do is mount all of my home directories, there are more than 800.

I think Charles is concerned about performance under adverse conditions.
Certainly a big deal for a busy site like yours.

Even now I can't saturate a 100 Mbit/s network link (~70-80% is
all I can get) on a copy with Linux NFS, so performance is a big deal.

>
> Re: scalability and performance, I think I'll let Trond and Neil comment
> on that, but Trond's patch for 2.5.73 sounds like a big improvement over
> the current situation.
>
> I've also noticed that I get RPC errors when I try to mount too many
> things at once, for example if I mount 200+ autofs-managed NFS
> mountpoints via script by cd'ing into them. If I insert a short pause
> (1/2 second to 1 second) between each mount attempt everything is fine.
> Sounds like some kind of race condition to me...

As I mentioned before, it appears that multiple connections are made to the
portmapper as well, and they appear to be using up a bunch of ports. You
could reduce the amount of time that these ports hang around before
being released, to get 'em back quicker, much like people do on busy web
servers. Or perhaps this function is a good candidate for multiplexing as
well.

I haven't done this on Linux, so I can't help with the how, but it should
be possible.
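
One way to at least see whether lingering ports are the problem would be to count sockets on reserved local ports (below 1024) by state. The sketch below parses the text format of /proc/net/tcp; it assumes the portmapper connections are TCP, and the function name is made up for illustration.

```python
from collections import Counter
from typing import Iterable

# Hex state codes used in the "st" column of /proc/net/tcp (subset).
TCP_STATES = {"01": "ESTABLISHED", "06": "TIME_WAIT", "0A": "LISTEN"}


def reserved_port_states(lines: Iterable[str]) -> Counter:
    """Count sockets bound to local ports below 1024, grouped by TCP state."""
    counts: Counter = Counter()
    for line in lines:
        fields = line.split()
        if len(fields) < 4 or fields[0] == "sl":
            continue                  # skip the header and malformed lines
        # local_address looks like "0100007F:0323" (hex address, hex port)
        local_port = int(fields[1].rsplit(":", 1)[1], 16)
        if local_port < 1024:
            counts[TCP_STATES.get(fields[3], fields[3])] += 1
    return counts


if __name__ == "__main__":
    with open("/proc/net/tcp") as f:
        print(reserved_port_states(f))
```

A large TIME_WAIT count on low ports during a burst of mounts would point at the ports not being released quickly enough.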

--

,-._|\ Ian Kent
/ \ Perth, Western Australia
*_.--._/ E-mail: [email protected]
v Web: http://themaw.net/




2003-12-15 13:34:32

by Ian Kent

Subject: RE: ~800 mountpoint limitation

On Tue, 14 Oct 2003, Trond Myklebust wrote:

> >>>>> " " == Aaron A Ogden <Ogden> writes:
>
> > Excellent! I was hoping that you were following the
> > discussion. It is something you can fix though? It seems that
> > RPC may be involved as well...
>
> > Is there any hope of a patch for 2.4.20+ or is this strictly
> > 2.6 code?
>
> I'm really too busy right now to do a backport for 2.4.x (hell, I'm
> even too busy to forward-port to 2.6.0-test7 8-().
>
> That said, it shouldn't be hard to do. I've made sure that the 2.4.x
> and 2.6.x RPC code have been tracking one another, so there should be
> few real differences to worry about. For all I know, the patch may
> even apply without any changes...
>

Hi Trond,

Well, I had a go at updating the patch to apply against 2.6.0-test11
tonight. Hope this is something you wanted done.

There are a couple of things that I'm unsure about. Particularly a line in
call_connect_status that is assigned a different state than what the
original source in the patch had.

So, if you have a little time, could you check my work?

I have included a bunch of info in case you need it.

The patch lives at /pub/linux/kernel/people/raven/nfs. Included there are
two files with names ending in 'questionable'. They were the only changes
that I thought might need your attention. Additionally, I made a tar of
the 'orig' and 'rej' files produced from applying the original patch in
case you want to check them. I haven't tested this at all yet. I think it
best to wait for your feedback as to the accuracy of the work.

If the patch is OK with you then I will have a go at back porting it to
2.4. If Aaron is able to, we can have him hammer it to death and perhaps
collect some performance data.

Comments please?

Ian





2003-12-15 15:46:34

by Ogden, Aaron A.

Subject: RE: ~800 mountpoint limitation



> -----Original Message-----
> From: Ian Kent [mailto:[email protected]]
> Sent: Monday, December 15, 2003 7:34 AM
> To: Trond Myklebust
> Cc: Ogden, Aaron A.; [email protected]
> Subject: RE: [NFS] ~800 mountpoint limitation
>
> [---snip---]
>
> If the patch is OK with you then I will have a go at back porting it to
> 2.4. If Aaron is able to, we can have him hammer it to death and perhaps
> collect some performance data.
>
> Comments please?
> Ian

Sounds great. I am running autofs-4.1.0 on two test machines and have
not encountered any problems yet. The test machines are RH 7.3 with
kernel 2.4.20-24 + autofs4-20031201. The patch discussed in this thread
is intended to correct the RPC problems, yes? (limit of ~800 mountpoints,
etc.) If it works I feel that it would be a welcome addition to linux
even if it isn't strictly necessary for most people.

I have other machines I can run the new code on, so yes, I can test it
out. We have more than 800 users so I can break the old code just by
walking down the auto_home map. Hopefully the new code will be able to
handle more mounts and at greater speed without failing...

-A

