2007-03-07 23:39:45

by NeilBrown

Subject: Re: Delays on "first" access to a NFS mount

On Wednesday March 7, [email protected] wrote:
>
> So could you remind me what the use cases are here? Who is it that
> requires demand loading, and why?

Partly it is the principle that demand-based configuration is more
flexible. Witness the various efforts to replace rc.d scripts with
something event/demand based.

The IP->clientname table must be demand loaded because you obviously
cannot know all needed IP addresses in advance. (The rmtab experience
proves that.)

The clientname+path->export-options table must be demand loaded
because - depending a bit on how you choose client names and how
complicated /etc/exports is - you either don't know all client names
in advance, or computing them all is complex and wasteful.

The fsid->path table could possibly be made 'static', but I think
demand-loading is still best. There are multiple possible fsids for
some filesystems, and telling the kernel about all of them when only
one will be used seems wasteful. And the filesystems may not all be
available when you try to create the static table. You could update
the table at every mount, but with demand-loading, you don't have to.

Imagine having hundreds of filesystems on some sort of library (a CD
library?) where each can be identified by a UUID which gets stored in
the fsid in the filehandle.
Imagine a simple extension to mountd so that a call-out were made when
an unknown filehandle arrived. This call-out could mount the required
filesystem and export it. Maybe the library only allows 3 filesystems
to be mounted at a time, so it would unmount the least-recently-used
one.

How are you going to handle that system except with demand-loading of
the fsid->path table?
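
For illustration only, the library scenario above could be sketched
roughly as follows. This is a minimal sketch, not mountd code: the
class name, the mount/unmount call-outs and the /library/ paths are all
hypothetical, and the limit of 3 is taken from the example above.

```python
from collections import OrderedDict
from typing import Callable


class DemandLoadedFsidTable:
    """fsid -> path table filled on demand, with at most max_mounted entries."""

    def __init__(self,
                 mount: Callable[[str], str],
                 unmount: Callable[[str], None],
                 max_mounted: int = 3) -> None:
        self._mount = mount        # hypothetical call-out: fsid -> mounted path
        self._unmount = unmount    # hypothetical call-out: path -> None
        self._max = max_mounted
        self._table: "OrderedDict[str, str]" = OrderedDict()

    def lookup(self, fsid: str) -> str:
        """Return the path for fsid, mounting it first if it is unknown."""
        if fsid in self._table:
            self._table.move_to_end(fsid)  # mark as most recently used
            return self._table[fsid]
        if len(self._table) >= self._max:
            # Evict the least-recently-used filesystem to make room.
            _, old_path = self._table.popitem(last=False)
            self._unmount(old_path)
        path = self._mount(fsid)
        self._table[fsid] = path
        return path


if __name__ == "__main__":
    unmounted = []
    table = DemandLoadedFsidTable(
        mount=lambda fsid: f"/library/{fsid}",  # stand-in for a real mount
        unmount=unmounted.append,               # stand-in for a real umount
    )
    for fsid in ["a", "b", "c", "a", "d"]:
        print(fsid, "->", table.lookup(fsid))
    print("unmounted:", unmounted)
```

Nothing is mounted until a filehandle for it actually arrives, so a
static table never has to be built or kept in sync with mounts.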

>
> I'll promise to write it all down someplace and then hopefully we won't
> have to re-ask the same questions too many times....

Sounds like a fine idea.
I have often wanted to write a 'Linux commentary' that explains all
the hows and whys of things. I even started some bits once (to help
me understand the VFS layer). But Linux changes so fast that any
entry in such a commentary would be out-of-date before it was
written....

NeilBrown

-------------------------------------------------------------------------
Take Surveys. Earn Cash. Influence the Future of IT
Join SourceForge.net's Techsay panel and you'll get the chance to share your
opinions on IT & business topics through brief surveys-and earn cash
http://www.techsay.com/default.php?page=join.php&p=sourceforge&CID=DEVDEV
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2007-03-08 05:13:26

by J. Bruce Fields

Subject: Re: Delays on "first" access to a NFS mount

On Thu, Mar 08, 2007 at 10:39:23AM +1100, Neil Brown wrote:
> Imagine having hundreds of filesystems on some sort of library (a CD
> library?) where each can be identified by a UUID which gets stored in
> the fsid in the filehandle.
> Imagine a simple extension to mountd so that a call-out were made when
> an unknown filehandle arrived. This call-out could mount the required
> filesystem and export it. Maybe the library only allows 3 filesystems
> to be mounted at a time, so it would unmount the least-recently-used
> one.

Maybe. Is this practical? Do we know of any cases of users doing this?
Do you block forever if you try to access 4 filesystems at once? I
dunno....

> I have often wanted to write a 'Linux commentary' that explains all
> the hows and whys of things. I even started some bits once (to help
> me understand the VFS layer). But Linux changes so fast that any
> entry in such a commentary would be out-of-date before it was
> written....

And there's a lot to document. I mean, look:

http://www.oreilly.com/catalog/understandlni/

Just over a thousand pages, just covering the kernel networking code
(and only some of it at that). Maybe they just lost the forest for the
trees, but still, yipes.

--b.


2007-03-08 13:44:09

by Olaf Kirch

Subject: Re: Delays on "first" access to a NFS mount

On Thursday 08 March 2007 06:14, J. Bruce Fields wrote:
> Maybe. Is this practical? Do we know of any cases of users doing this?
> Do you block forever if you try to access 4 filesystems at once? I
> dunno....

IIRC SGI had a storage appliance a while back that included a tape robot,
but it was hiding the details somewhere deep inside XFS. I remember seeing
patches involving nfsd and dmapi (I can see you cringe, Christoph :-)

Note that in real-life scenarios, we're sometimes talking about literally
thousands of exported file systems. My previous employer has a customer with
such a setup, using NetApp filers. We had some trouble getting the Linux
client to survive in this environment, as it ran out of privileged ports
way too quickly. Absurd as it may sound, this kind of setup seems to be
the trend.

Now think about handling a system with several thousand exported
file systems on the server side - if you need to look at each file system
before nfsd is ready to service requests, we're talking of a considerable
delay in boot time. In the worst case we're talking about several thousand
*disks* that need to be spun up, and fuses going pop-pop-pop.

Short summary - if you want to scale beyond small work group servers,
you need something that scales well. Demand loading the exports
table does.

Olaf
--
Olaf Kirch | --- o --- Nous sommes du soleil we love when we play
[email protected] | / | \ sol.dhoop.naytheet.ah kin.ir.samse.qurax


2007-03-08 21:26:53

by J. Bruce Fields

Subject: Re: Delays on "first" access to a NFS mount

On Thu, Mar 08, 2007 at 02:43:16PM +0100, Olaf Kirch wrote:
> Now think about handling a system with several thousand exported
> file systems on the server side - if you need to look at each file system
> before nfsd is ready to service requests, we're talking of a considerable
> delay in boot time. In the worst case we're talking about several thousand
> *disks* that need to be spun up, and fuses going pop-pop-pop.
>
> Short summary - if you want to scale beyond small work group servers,
> you need something that scales well. Demand loading the exports
> table does.

There's some confusion here--the reason that this was happening was that
we're mapping filehandles to exports by stat()ing the root of every
exported filesystem. That may be an obstacle to handling large numbers
of exports, but it's not really related to the demand-loading question.

So why does demand-loading scale better? Is the worry just the kernel
memory required to store the export table for thousands of mostly
inactive exports?

So you need the mountpoints for the exported filesystem, the export
options, and the name of the client(s). If that adds up to a few K,
thousands would add up to a few Megs.
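
As a rough back-of-the-envelope check of that estimate (the 2 KiB
per-export figure and the export counts are assumptions picked only to
stand in for "a few K" and "thousands", not measurements):

```python
def export_table_bytes(num_exports: int, bytes_per_export: int) -> int:
    """Total memory if every export is loaded into the table up front."""
    return num_exports * bytes_per_export


if __name__ == "__main__":
    per_export = 2 * 1024  # assumption: "a few K" taken as 2 KiB
    for num_exports in (1000, 5000):  # assumption: "thousands" of exports
        total = export_table_bytes(num_exports, per_export)
        print(f"{num_exports} exports -> {total / (1024 * 1024):.1f} MiB")
```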

Are there other problems? (E.g. does the VFS handle thousands of mounts
well?)

--b.


2007-03-09 15:04:33

by Olaf Kirch

Subject: Re: Delays on "first" access to a NFS mount

On Thursday 08 March 2007 22:27, J. Bruce Fields wrote:
> There's some confusion here--the reason that this was happening was that
> we're mapping filehandles to exports by stat()ing the root of every
> exported filesystem. That may be an obstacle to handling large numbers
> of exports, but it's not really related to the demand-loading question.

Sorry if I was expressing myself poorly. What I was driving at was that
it makes sense to not mount all file systems prior to starting the
NFS server.

> So why does demand-loading scale better? Is the worry just the kernel
> memory required to store the export table for thousands of mostly
> inactive exports?

It means you can start serving files without having to wait for all
file systems to be mounted (and having their journals replayed,
etc.). All you need is a way for mountd to figure out whether a file system
is already there (so we can push the rootfh into the kernel) or
not (so nfsd can return EJUKEBOX or defer the request).
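
A minimal sketch of that decision, assuming the check is simply "is the
export path currently a mount point". The Answer enum and
classify_export() are hypothetical names for illustration; real mountd
logic may well use a different test.

```python
import os
from enum import Enum
from typing import Callable


class Answer(Enum):
    PUSH_ROOT_FH = "push the root filehandle into the kernel"
    JUKEBOX_OR_DEFER = "return EJUKEBOX or defer the request"


def classify_export(path: str,
                    is_mounted: Callable[[str], bool] = os.path.ismount) -> Answer:
    """Decide how to answer a request for an export rooted at path.

    is_mounted is injectable so other checks can be swapped in; the
    default, os.path.ismount, is only an assumption for this sketch.
    """
    if is_mounted(path):
        return Answer.PUSH_ROOT_FH
    return Answer.JUKEBOX_OR_DEFER


if __name__ == "__main__":
    for path in ("/", "/no/such/export"):
        print(path, "->", classify_export(path).value)
```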

Olaf
--
Olaf Kirch | --- o --- Nous sommes du soleil we love when we play
[email protected] | / | \ sol.dhoop.naytheet.ah kin.ir.samse.qurax
