2003-06-09 19:13:00

by Leonardo H. Machado

[permalink] [raw]
Subject: Re: cachefs on linux



Dear Sirs,

I have been using Linux and Solaris for a while and, after searching the
web, newsgroups, and mailing lists, I could not find the answer to a
very simple question. Before emailing Alan Cox or any other guru (who
might not answer me) I will try asking you. Here is my simple question:

Why does Solaris have a CacheFS file system while Linux doesn't? Is it
because cachefs is VERY difficult to implement (that should be no barrier
for our gurus), because there isn't much demand for this marvelous
FS, or because no one has thought of it?

There are certainly some cacheFS implementations around the web,
like Coda, but they are not free and not even as good as Solaris CacheFS.

Would you please help me with this question, or at least tell me
where the answers are?

Thank you very much.


//leoh
main(){int j=1234;char t[]=":@abcdefghijklmnopqrstuvwxyz.\n"
,*i = "iqgbgxmlvivuc\n:wwnfwsdoi"; char *strchr(char *,int);
while(*i){j+=strchr(t,*i++)-t;j%=sizeof t-1;putchar(t[j]);}}




2003-06-09 20:29:45

by Matthias Schniedermeyer

[permalink] [raw]
Subject: Re: cachefs on linux

On Mon, Jun 09, 2003 at 04:26:01PM -0300, Leonardo H. Machado wrote:
>
> Why does Solaris have a CacheFS file system while Linux doesn't?

Is this a "You don't know it, you don't need it" thing?



Bis denn

--
Real Programmers consider "what you see is what you get" to be just as
bad a concept in Text Editors as it is in women. No, the Real Programmer
wants a "you asked for it, you got it" text editor -- complicated,
cryptic, powerful, unforgiving, dangerous.

2003-06-09 20:36:07

by Shawn

[permalink] [raw]
Subject: Re: cachefs on linux

Well, it's a nice way to simulate writing on r/o filesystems, IIRC. Like
mounting a cdrom and then writing to it, except you're not.

Was that what this was? Anyway, Linux also does not have unionFS. If it
were that big of a deal, someone would write it. As it is, it's a
whizbang no one cares about enough.

On Mon, 2003-06-09 at 15:42, Matthias Schniedermeyer wrote:
> On Mon, Jun 09, 2003 at 04:26:01PM -0300, Leonardo H. Machado wrote:
> >
> > Why does Solaris have a CacheFS file system while Linux doesn't?
>
> Is this a "You don't know it, you don't need it" thing?
>
>
>
> Bis denn

2003-06-09 20:43:29

by Matthias Schniedermeyer

[permalink] [raw]
Subject: Re: cachefs on linux

On Mon, Jun 09, 2003 at 03:49:36PM -0500, Shawn wrote:
> On Mon, 2003-06-09 at 15:42, Matthias Schniedermeyer wrote:
> > On Mon, Jun 09, 2003 at 04:26:01PM -0300, Leonardo H. Machado wrote:
> > > Why does Solaris have a CacheFS file system while Linux doesn't?
> >
> > Is this a "You don't know it, you don't need it" thing?
> Well, it's a nice way to simulate writing on r/o filesystems, IIRC. Like
> mounting a cdrom and then writing to it, except you're not.
>
> Was that what this was? Anyway, Linux also does not have unionFS. If it
> were that big of a deal, someone would write it. As it is, it's a
> whizbang no one cares about enough.

I remember this as "translucent"<whatever>.

IIRC you can do this by "bind"-mounting the writable dir over the
read-only dir, but that's from rusty memory; maybe I'm wrong.
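For what it's worth, a minimal sketch of the bind-mount approach (the paths are hypothetical and this needs root). Note that a plain bind or over-mount only covers the lower tree; it does not merge the two the way a translucent/union filesystem would:

```shell
# Make the same directory visible at a second location:
mount --bind /var/writable /mnt/overlay-point

# Mounting a writable dir over a read-only one merely hides the
# read-only contents underneath; there is no merging of the trees.
umount /mnt/overlay-point   # undo
```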





Bis denn

--
Real Programmers consider "what you see is what you get" to be just as
bad a concept in Text Editors as it is in women. No, the Real Programmer
wants a "you asked for it, you got it" text editor -- complicated,
cryptic, powerful, unforgiving, dangerous.

2003-06-10 08:15:51

by Sean Hunter

[permalink] [raw]
Subject: Re: cachefs on linux

On Mon, Jun 09, 2003 at 03:49:36PM -0500, Shawn wrote:
> Well, it's a nice way to simulate writing on r/o filesystems, IIRC. Like
> mounting a cdrom and then writing to it, except you're not.
>
> Was that what this was? Anyway, Linux also does not have unionFS. If it
> were that big of a deal, someone would write it. As it is, it's a
> whizbang no one cares about enough.

It's particularly handy for fast read-only NFS stuff. We have thousands
of Linux hosts, and distributing software to all of them is a pain. With
cachefs using NFS as the "back" filesystem, you push to the masters, the
clients get the changes over NFS and then store them in their local
cache, and your software distribution nightmare becomes no problem at all.
Clients read off the local disk if they can, but fetch over NFS as
required. You can tune the cache size on all of the client machines so
they can cache more or less of the most recently used NFS junk on their
local disks.
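On the Solaris side, the setup Sean describes looks roughly like this (a hedged sketch; the server name, paths, and cache size are made up):

```shell
# Create the local on-disk cache, once per client:
cfsadmin -c -o maxblocks=80 /var/cache/fscache

# Mount the NFS export through cachefs, backed by that cache directory.
# Reads are served from /var/cache/fscache when possible, and fetched
# from master:/export/software over NFS otherwise.
mount -F cachefs -o backfstype=nfs,cachedir=/var/cache/fscache,ro \
    master:/export/software /opt/software
```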

Sean

2003-06-10 19:03:47

by Rob Landley

[permalink] [raw]
Subject: Re: cachefs on linux

On Tuesday 10 June 2003 04:29, Sean Hunter wrote:
> On Mon, Jun 09, 2003 at 03:49:36PM -0500, Shawn wrote:
> > Well, it's a nice way to simulate writing on r/o filesystems, IIRC. Like
> > mounting a cdrom and then writing to it, except you're not.
> >
> > Was that what this was? Anyway, Linux also does not have unionFS. If it
> > were that big of a deal, someone would write it. As it is, it's a
> > whizbang no one cares about enough.
>
> It's particularly handy for fast read-only NFS stuff. We have thousands
> of Linux hosts, and distributing software to all of them is a pain. With
> cachefs using NFS as the "back" filesystem, you push to the masters, the
> clients get the changes over NFS and then store them in their local
> cache, and your software distribution nightmare becomes no problem at all.
> Clients read off the local disk if they can, but fetch over NFS as
> required. You can tune the cache size on all of the client machines so
> they can cache more or less of the most recently used NFS junk on their
> local disks.
>
> Sean

Technically, cachefs is just a union mount with tmpfs or ramfs as the overlay
on the underlying filesystem. Doing a separate cachefs is kind of pointless
in Linux.

I believe the reason we're using the term "union mount" is that it was the
term Al Viro used when he mentioned the concept on his to-do list months and
months ago. That was before he dropped off the face of the planet for a while,
and the Second Coming of Al Viro has yet to express an opinion on the matter,
as far as I am aware.

I myself want union mounts to separate the current procfs into a "procfs" that
does a subdirectory for each PID, and a "crapfs" that does all the legacy
stuff that got shoehorned into procfs back when it was the main virtual
filesystem for exporting system state. (For legacy systems, union mount
them. For new systems, use just /proc and /sys, with no need for crapfs.
Easy migration path to something sane, that way...)

A lot of the VFS stuff has already been cleaned up in 2.5. You can detach a
filesystem a la unlink (it goes out of the namespace now, but actually gets
unmounted when the last open filehandle in it gets closed). I'm told there
is also now a umount equivalent of "kill -9". There was also talk of
per-process namespaces (so your mount doesn't have to show up in another
process's filesystem tree). As soon as I get brave enough to put 2.5 on my
main work laptop, I intend to start playing with this sort of thing...

In 2.4 you could already mount the same filesystem in two or three places,
and there's --bind to graft a subdirectory into the tree as if it were a
root directory. And mounting one filesystem over another doesn't cause
problems (although the underlying filesystem is completely hidden). What you
CAN'T do in 2.4 is move an existing mountpoint so you can free up what it was
mounted under without unmounting it.
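As a quick sketch of what 2.4 does and doesn't allow (device names and paths are hypothetical; all of this needs root):

```shell
# 2.4 already allows mounting the same filesystem in several places:
mount /dev/hda3 /mnt/a
mount /dev/hda3 /mnt/b

# ...and grafting a subdirectory into the tree as if it were a root:
mount --bind /mnt/a/subdir /srv

# What 2.4 can NOT do (2.5 adds it): relocate an existing mountpoint.
# mount --move /mnt/a /mnt/elsewhere
```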

Yes, I cared about this. Example: during the initrd phase: mount /var, losetup
/dev/loop0 /var/firmware.img, then exit so the root device becomes /dev/loop0,
and have the init-first script do "mount --move /initrd/var /var" (so init-first
doesn't have to actually know where /var is partition-wise, since it's _IN_
the read-only firmware we loopback-mounted earlier, and we can't umount
/initrd/var anyway since it's got an open file in it for the loopback mount).
I ended up doing mount --bind instead and just living with the duplicate
mount under /initrd/var (and never being able to umount /initrd and free up
the three megabytes of RAM). In 2.5, I can do a lot better...
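The 2.5 sequence Rob describes might look something like this (a sketch; the device names are hypothetical):

```shell
# In the initrd: mount the real /var and loop-mount the firmware image
# that will become the root device.
mount /dev/hda2 /var
losetup /dev/loop0 /var/firmware.img
# ...exit the initrd so the root device becomes /dev/loop0; the old
# tree is now visible under /initrd...

# In the init-first script: relocate the mount instead of needing to
# know which partition /var lives on.
mount --move /initrd/var /var
```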

As to the current status of union mounts, I have no idea. It was more or less
lumped in with the initramfs-type work back around the Halloween freeze,
which fell through the cracks for a while, and seems to have been resurrected
by other people (klibc, etc.) recently. Let's see...

The most recent 2.5 status thing says union mounts are post 2.6:

http://kernelnewbies.org/status/Status-27-May-2003.html

Whether that's accurate, or means 2.6.x or 2.7, I couldn't tell you...

Rob

2003-06-10 20:28:27

by Anton Altaparmakov

[permalink] [raw]
Subject: Re: cachefs on linux

On Tue, 10 Jun 2003, Rob Landley wrote:
> On Tuesday 10 June 2003 04:29, Sean Hunter wrote:
> > On Mon, Jun 09, 2003 at 03:49:36PM -0500, Shawn wrote:
> > > Well, it's a nice way to simulate writing on r/o filesystems, IIRC. Like
> > > mounting a cdrom and then writing to it, except you're not.
> > >
> > > Was that what this was? Anyway, Linux also does not have unionFS. If it
> > > were that big of a deal, someone would write it. As it is, it's a
> > > whizbang no one cares about enough.
> >
> > It's particularly handy for fast read-only NFS stuff. We have thousands
> > of Linux hosts, and distributing software to all of them is a pain. With
> > cachefs using NFS as the "back" filesystem, you push to the masters, the
> > clients get the changes over NFS and then store them in their local
> > cache, and your software distribution nightmare becomes no problem at all.
> > Clients read off the local disk if they can, but fetch over NFS as
> > required. You can tune the cache size on all of the client machines so
> > they can cache more or less of the most recently used NFS junk on their
> > local disks.
> >
> > Sean
>
> Technically, cachefs is just a union mount with tmpfs or ramfs as the overlay
> on the underlying filesystem. Doing a separate cachefs is kind of pointless
> in Linux.

That is not correct (unless there is something about tmpfs/ramfs that I
have missed).

cachefs is very powerful because it caches both to RAM AND to local disk
storage. Thus, for example, you can use cachefs to mount cdroms, and the
first time some blocks are read they will come from the cdrom, while
subsequent reads of the same blocks will come from the local hard drive
and/or local RAM, which is of course a lot faster. And you can do the
same for NFS or any other slow and/or non-local file system in order to
implement a faster cache.

Also, the cache is intelligent in that the LRU blocks are discarded when
the cache is full (or, to be precise, above a certain adjustable threshold)
and are replaced by data fetched from the slow/remote fs.

AFAIK union mounting with tmpfs/ramfs could never give you the caching
behaviour of cachefs on Solaris...

Best regards,

Anton
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/

2003-06-11 00:38:44

by Rob Landley

[permalink] [raw]
Subject: Re: cachefs on linux

On Tuesday 10 June 2003 16:39, Anton Altaparmakov wrote:
> On Tue, 10 Jun 2003, Rob Landley wrote:

> > Technically, cachefs is just a union mount with tmpfs or ramfs as the
> > overlay on the underlying filesystem. Doing a separate cachefs is kind
> > of pointless in Linux.
>
> That is not correct (unless there is something about tmpfs/ramfs that I
> have missed).
>
> cachefs is very powerful because it caches both to RAM AND to local disk
> storage. Thus, for example, you can use cachefs to mount cdroms, and the
> first time some blocks are read they will come from the cdrom, while
> subsequent reads of the same blocks will come from the local hard drive
> and/or local RAM, which is of course a lot faster. And you can do the
> same for NFS or any other slow and/or non-local file system in order to
> implement a faster cache.

Linux automatically caches files in RAM, although mount hints saying "this
underlying data isn't going to change, so don't worry about coherence" would
be nice. (Maybe there are some already; I don't know quite what the semantics
of read-only NFS mounts are...)

When cache is evicted due to memory pressure, the general assumption is that
there are no pathologically slow connections in the system, so flushing NFS
or CDROM data to swap would probably be a loss.

Maybe this is a bad assumption. I know OS/2 used to prefault in DLLs and
then swap them out immediately to avoid duplicating the linking overhead.
(You may barf now. But it bought them some interesting benchmark numbers at
the time...)

> Also the cache is intelligent in that the LRU blocks are discarded when
> the cache is full (or to be precise above a certain adjustable threshold)
> and is replaced by data that is fetched from the slow/remote fs.
>
> AFAIK union mounting with tmpfs/ramfs could never give you such caching
> behaviour as cachefs on Solaris...

We've never really needed it. What kind of setup creates a demand for it?
(800 machines mounting their root partition off a single NFS server, that
type of thing? Wouldn't booting all of them after a power failure bring the
setup to its knees anyway?) These days RAM is pretty cheap. I admit that's
a cop-out...

It doesn't so much sound like there's a need for another filesystem as a need
for mount hints to the existing caching behavior. (I.e. how expensive is a
read from this device vs. a read from that device, if they are indeed
seriously out of whack.) Then again, if it's only useful for a bogged-down
read-only NFS server on a machine with a fast local swap device (which, for
some reason, doesn't want root to live on that writable partition...)

How about extracting a tarball into tmpfs and using that to hold the data in
question? (Sounds like it'd work fine at boot, for example.) If the data
changes while it's mounted, your caching sounds dangerous. If the data
doesn't change while it's mounted, you effectively prefault the whole thing
across the wire in compressed form exactly once and fling as much as is needed
out to swap, with no CPU drain on the server (and CPU on the client is
generally pretty cheap) and no kernel modification.
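The tarball-into-tmpfs idea Rob suggests could look roughly like this (a hedged sketch; the server name, paths, and size are made up):

```shell
# Back the dataset with tmpfs, so pages not in active use can be
# flung out to local swap under memory pressure:
mount -t tmpfs -o size=2g tmpfs /opt/dataset

# Pull the data across the wire in compressed form exactly once:
ssh master 'cat /export/dataset.tar.gz' | tar -xz -C /opt/dataset
```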

This may not be what you want, but I don't really know what you're trying to
do. It seems you want a filesystem that:

A) Is designed for read-only remote mounts that don't change.
B) On a slow or heavily used server,
C) Contains a dataset that's too big to store locally.

I get it. You're doing 3D render farm clusters with Honking Big Datasets(tm),
aren't you?

If the tarball->tmpfs idea isn't helpful, then no, Linux doesn't have a
clusterfs I'm aware of. Try thumping the Filesystem in Userspace guys.
http://sourceforge.net/forum/forum.php?forum_id=254100

> Best regards,
>
> Anton

Rob

2003-06-11 09:49:56

by Bernd Eckenfels

[permalink] [raw]
Subject: Re: cachefs on linux

In article <[email protected]> you wrote:
> On Tuesday 10 June 2003 04:29, Sean Hunter wrote:
...
>> It's particularly handy for fast read-only NFS stuff. We have thousands
>> of Linux hosts, and distributing software to all of them is a pain. With
>> cachefs using NFS as the "back" filesystem, you push to the masters, the
>> clients get the changes over NFS and then store them in their local
>> cache, and your software distribution nightmare becomes no problem at all.

This is not a good idea unless you have transactional semantics for the
fetches from the backend. Otherwise you get a mixture of old and new files.

>> Clients read off the local disk if they can, but fetch over NFS as
>> required. You can tune the cache size on all of the client machines so
>> they can cache more or less of the most recently used NFS junk on their
>> local disks.

This is, btw, exactly what Coda and AFS do best.

> Technically cachefs is just a union mount with tmpfs or ramfs as the overlay
> on the underlying filesystem. Doing a seperate cachefs is kind of pointless
> in Linux.

I think it is a bit different, since the cache is on disk and can be larger.
If you want to put that in swap space, you may quickly exceed some VM
limits. So there is a difference.

Greetings
Bernd
--
eckes privat - http://www.eckes.org/
Project Freefire - http://www.freefire.org/

2003-06-11 11:06:38

by Hirokazu Takahashi

[permalink] [raw]
Subject: Re: cachefs on linux

Hello,

I think the main benefit of cachefs is for NFS servers. Cachefs on the
clients can help reduce the servers' loads. We know many clients
may share one huge NFS server.
(e.g. streaming systems, whose content may be extremely huge.)

> >> It's particularly handy for fast read-only NFS stuff. We have thousands
> >> of Linux hosts, and distributing software to all of them is a pain. With
> >> cachefs using NFS as the "back" filesystem, you push to the masters, the
> >> clients get the changes over NFS and then store them in their local
> >> cache, and your software distribution nightmare becomes no problem at all.
>
> This is not a good idea unless you have transactional semantics for the
> fetches from the backend. Otherwise you get a mixture of old and new files.
>
> >> Clients read off the local disk if they can, but fetch over NFS as
> >> required. You can tune the cache size on all of the client machines so
> >> they can cache more or less of the most recently used NFS junk on their
> >> local disks.
>
> This is, btw, exactly what Coda and AFS do best.
>
> > Technically, cachefs is just a union mount with tmpfs or ramfs as the overlay
> > on the underlying filesystem. Doing a separate cachefs is kind of pointless
> > in Linux.

Cachefs is different from a union mount, as it is required to keep its
contents synchronized with the contents on the NFS servers.
I guess that isn't easy, as any other client may modify the
contents on the servers at any time.

> I think it is a bit different, since the cache is on disk and can be larger.
> If you want to put that in swap space, you may quickly exceed some VM
> limits. So there is a difference.

And we should use a filesystem as the cache space so that we can make
cachefs a persistent cache. After rebooting, all cached data remains valid.

Thank you,
Hirokazu Takahashi.

2003-06-11 22:13:23

by J.A. Magallon

[permalink] [raw]
Subject: Re: cachefs on linux


On 06.11, Hirokazu Takahashi wrote:
> Hello,
>
> I think the main benefit of cachefs is for NFS servers. Cachefs on the
> clients can help reduce the servers' loads. We know many clients
> may share one huge NFS server.
> (e.g. streaming systems, whose content may be extremely huge.)
>

The main use of cachefs I've seen was Sun's network-booting workstations.
We had a bunch of old Suns (IPX, that was a 68k) with small disks
used to cache /, /usr and so on from an NFS server. You get the
benefits of a single centralized install plus the benefits of local
storage for the most-used files...

--
J.A. Magallon <[email protected]> \ Software is like sex:
werewolf.able.es \ It's better when it's free
Mandrake Linux release 9.2 (Cooker) for i586
Linux 2.4.21-rc7-jam1 (gcc 3.3 (Mandrake Linux 9.2 3.3-1mdk))