2003-03-28 12:20:08

by James Pearson

Subject: NFSv4 and client caching to local disk?

I'm trying to find out more about 'cachefs' type file systems that can
cache NFS data to a client's local disk - I've come across a couple of
references that seem to indicate that this may be possible with NFSv4.

Does (will?) the Linux NFSv4 client support this feature?

Thanks

James Pearson



-------------------------------------------------------
This SF.net email is sponsored by:
The Definitive IT and Networking Event. Be There!
NetWorld+Interop Las Vegas 2003 -- Register today!
http://ads.sourceforge.net/cgi-bin/redirect.pl?keyn0001en
_______________________________________________
NFS maillist - [email protected]
https://lists.sourceforge.net/lists/listinfo/nfs


2003-03-28 15:02:02

by J. Bruce Fields

Subject: Re: NFSv4 and client caching to local disk?

On Fri, Mar 28, 2003 at 12:22:30PM +0000, James Pearson wrote:
> I'm trying to find out more about 'cachefs' type file systems that can
> cache NFS data to a client's local disk - I've come across a couple of
> references that seem to indicate that this may be possible with NFSv4.
>
> Does (will?) the Linux NFSv4 client support this feature?

Nobody's working on it now as far as I know. You might do a search for
'cachefs' on the linux-kernel mailing list:

http://groups.google.com/groups?hl=en&lr=&ie=UTF-8&oe=UTF-8&q=group%3Afa.linux.kernel+cachefs&btnG=Google+Search

I seem to remember some relevant discussions there.

--Bruce F.



2003-03-28 15:16:26

by Lever, Charles

Subject: RE: NFSv4 and client caching to local disk?

hi james-

> I'm trying to find out more about 'cachefs' type file systems that can
> cache NFS data to a client's local disk - I've come across a couple of
> references that seem to indicate that this may be possible with NFSv4.
>
> Does (will?) the Linux NFSv4 client support this feature?

Sun implemented cachefs on Solaris for earlier versions of NFS.
it really has nothing to do with which version of NFS is
in use.

there is sporadic interest in a cachefs on Linux, and i know of
at least one generic prototype. a specific implementation of
client-side disk caching that is available today is contained
in the Linux OpenAFS client (known as the AFS cache manager).

cachefs becomes rather more interesting when used in conjunction
with NFSv4 file delegations -- that makes NFS behave in a fashion
similar to AFS client-side disk caching with callbacks.

there currently is no explicit plan to implement cachefs by the
team that is working on NFSv4 for Linux. a disk cache has
rather limited usefulness compared to a memory cache. what is
your application for it?




2003-03-31 12:34:31

by James Pearson

Subject: Re: NFSv4 and client caching to local disk?

"Lever, Charles" wrote:
>
> hi james-
>
> > I'm trying to find out more about 'cachefs' type file systems
> > that can
> > cache NFS data to a client's local disk - I've come across a
> > couple of
> > references that seem to indicate that this may be possible with NFSv4.
> >
> > Does (will?) the Linux NFSv4 client support this feature?
>
> Sun implemented cachefs on Solaris for earlier versions of NFS.
> it really has nothing to do with which version of NFS is
> in use.

I've briefly looked at this on IRIX clients some time ago - but we don't
have many IRIX boxes in use now.

> there is sporadic interest in a cachefs on Linux, and i know of
> at least one generic prototype. a specific implementation of
> client-side disk caching that is available today is contained
> in the Linux OpenAFS client (known as the AFS cache manager).

I have done some limited searching of the net for info and come across a
few attempts with earlier kernels - and I am aware of some of the
issues/problems associated with doing this.

> cachefs becomes rather more interesting when used in conjunction
> with NFSv4 file delegations -- that makes NFS behave in a fashion
> similar to AFS client-side disk caching with callbacks.

My hopes were raised as the NFSv4 specs suggest that low level support
for client disk caches is available - rather than being a 'bolt-on' for
earlier versions - hence my question.

> there currently is no explicit plan to implement cachefs by the
> team that is working on NFSv4 for Linux. a disk cache has
> rather limited usefulness compared to a memory cache. what is
> your application for it?

Mainly for reading large files that don't change (much) - i.e.
effectively in a read-only mode. The total size of the required files is
greater than the client's memory size. In fact, any disk caching of NFS
data doesn't necessarily need to survive a reboot of the client, so
caching NFS file system data to swap could be enough for my needs ...

James Pearson



2003-03-31 18:39:45

by Jake Gold

Subject: Re: NFSv4 and client caching to local disk?

I am also extremely interested in something like this.

I think the Zeus web server's model for in-memory caching would be great for NFS clients doing disk caching.

Something like this would be great:

cache_directory = /cache
    Directory where cache files are stored
cache_files
    Size of the web server file cache (number of files)
cache_small_file
    Maximum size of a 'small' file (bytes) (system page size)
cache_large_file
    Minimum size of a 'large' file (bytes)
cache_stat_expire
    Time for which the response of a stat() call is cached (seconds)
cache_max_bytes
    Maximum size to reserve for cached files (bytes) (0 = no limit)
cache_flush_interval
    Time after which unaccessed files are flushed from the cache (seconds)
cache_cooling_time (integer)
    Any file modified in the last 'n' seconds is not cached
cache_max_filename_length (integer)
    Filenames longer than this are not cached (zero means no limit)
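
As a rough illustration of how tunables like these could drive an admission decision for a client-side disk cache, here is a minimal Python sketch. The `CachePolicy` class and the `should_cache` function are hypothetical names invented for this example; only the tunable names come from the list above, and the default values are placeholders, not Zeus defaults.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class CachePolicy:
    """Hypothetical container for a few of the tunables listed above."""
    cache_max_bytes: int = 0             # 0 = no limit
    cache_cooling_time: int = 60         # seconds; placeholder value
    cache_max_filename_length: int = 0   # 0 = no limit


def should_cache(policy: CachePolicy, name: str, size: int, mtime: float,
                 used_bytes: int, now: Optional[float] = None) -> bool:
    """Return True if a file is eligible to be admitted to the disk cache."""
    now = time.time() if now is None else now
    # Skip files modified within the last cache_cooling_time seconds.
    if now - mtime < policy.cache_cooling_time:
        return False
    # Skip filenames longer than the limit (zero means no limit).
    limit = policy.cache_max_filename_length
    if limit and len(name) > limit:
        return False
    # Skip files that would push the cache past its size budget.
    if policy.cache_max_bytes and used_bytes + size > policy.cache_max_bytes:
        return False
    return True
```

A caller would check `should_cache(...)` before copying a file into the cache directory and keep `used_bytes` up to date itself; expiry (`cache_flush_interval`) is left out of this sketch.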


I have a FAS940 with a cluster of Linux machines mounting _read-only_ volumes over NFSv3.
My situation is the same as James Pearson's: those machines read large, rarely modified files, where memory caching doesn't help a whole lot. It would be very nice to have my under-worked local disks take some of the load off the filer.

Are there currently _any_ client-side NFS caching solutions (even something that requires some extra work)?

Thanks in advance,
Jake



2003-03-31 19:44:06

by Peter Astrand

Subject: Re: NFSv4 and client caching to local disk?


> I'm trying to find out more about 'cachefs' type file systems that can
> cache NFS data to a client's local disk - I've come across a couple of
> references that seem to indicate that this may be possible with NFSv4.

I recently found out about http://infradead.org/cgi-bin/cvsweb.cgi/afs/,
which includes a "cachefs" filesystem. It's written by David Howells
<[email protected]>, and was announced as:

"I'm writing a cache filesystem for primarily for caching AFS pages, but
that also can be used for caching other network FS pages (such as NFSv4,
which Jeff Garzik and Trond Myklebust are interested in, I think)."

"Look in the afs/fs/cachefs/ and afs/include/linux/ directories. It's not
complete yet, but does allow me to do some of the functions required, such
as fully journalled block allocation and index creation and entry
allocation."

I haven't tried it, though.

--
/Peter Åstrand <[email protected]>





2003-03-31 21:50:59

by Lever, Charles

Subject: RE: NFSv4 and client caching to local disk?

> I am also extremely interested in something like this.

caching large files on local disk is one of the few
reasonable uses for a disk-based NFS client cache.

> I think the Zeus web server's model for in-memory caching
> would be great for NFS clients doing disk caching.
>
> Something like this would be great:
>
> cache_directory = /cache
>     Directory where cache files are stored
> cache_files
>     Size of the web server file cache (number of files)
> cache_small_file
>     Maximum size of a 'small' file (bytes) (system page size)
> cache_large_file
>     Minimum size of a 'large' file (bytes)
> cache_stat_expire
>     Time for which the response of a stat() call is cached (seconds)

the NFS protocol specifies certain rules for when attribute
cache entries expire, so this kind of control over your client's
cache isn't possible without breaking the NFS protocol.

> cache_max_bytes
>     Maximum size to reserve for cached files (bytes) (0 = no limit)
> cache_flush_interval
>     Time after which unaccessed files are flushed from the cache (seconds)
> cache_cooling_time Integer
>     any file modified in the last 'n' seconds is not cached
> cache_max_filename_length Integer
>     filenames greater than this length aren't cached (Zero is no limit)


> I have a FAS940 with a cluster of Linux machines mounting
> _read-only_ volumes over NFSv3.
> My situation is the same as James Pearson's....those machines
> read large, rarely modified files, where memory caching
> doesn't help a whole lot. It would be very nice to be able to
> have my under-worked local disks take some of the load off the filer.

have you determined why the filer is "overworked?" which
version of NFS client do you use? which protocol (TCP/UDP)?
if UDP, have you looked for signs of IP fragmentation?

the best thing you can do for now is make sure the bandwidth
between your filer and clients is at its maximum.

> Are there currently _any_ client-side NFS caching solutions
> (even something that requires some extra work)?
>
> Thanks in advance,
> Jake



2003-04-01 00:04:32

by Jake Gold

Subject: Re: NFSv4 and client caching to local disk?

On Mon, 31 Mar 2003 13:50:52 -0800
"Lever, Charles" <[email protected]> wrote:

> > I am also extremely interested in something like this.
>
> caching large files on local disk is one of the few
> reasonable uses for a disk-based NFS client cache.

> the NFS protocol specifies certain rules for when attribute
> cache entries expire, so this kind of control over your client's
> cache isn't possible without breaking the NFS protocol.
>

You're right, of course, I was just copying feature-for-feature from the Zeus Web server tunables page.

>
> have you determined why the filer is "overworked?" which
> version of NFS client do you use? which protocol (TCP/UDP)?
> if UDP, have you looked for signs of IP fragmentation?
>
> the best thing you can do for now is make sure the bandwidth
> between your filer and clients is at its maximum.

Well, this is certainly something I'm looking at. In fact, I have a huge (length-wise) trouble ticket with NetApp on this issue that has
all of my configuration information in it (#481133, if you are able to check it :)

This is a DS14 on a FAS940, and I'm getting what I think is pretty poor performance.
The obvious problem is disk utilization. Even if I do have an actual configuration problem, I would really like to minimize filer
disk activity in the future (client disk caching).

I switched from UDP to TCP with a 32k (w|r)size. I've tried 8k, 16k and 32k buffer sizes (and modified options nfs.(tcp|udp).xfersize accordingly); nothing affected disk utilization significantly.

Mount options: rw,noatime,fg,nfsvers=3,rsize=32768,wsize=32768,hard,intr,nolock,nocto,timeo=600,actimeo=5,tcp,addr=10.8.1.21

I'm using jumbo frames (8998 MTU with the e1000 driver on a stock (trimmed) 2.4.20 kernel from kernel.org) on a Red Hat 7.3 base system.

AMS01NAS001> sysstat -u 1
 CPU  Total      Net kB/s      Disk kB/s   Tape kB/s  Cache  Cache    CP  CP   Disk
      ops/s     in    out    read  write  read write    age    hit  time  ty   util
 30%   2251    556  45735   42042      0     0     0    48s    94%    0%   -   100%
 30%   2008    557  45496   43080      0     0     0    48s    94%    0%   -   100%
 30%   2087    553  45542   42876      0     0     0    48s    94%    0%   -   100%
 28%   1879    524  45562   40680      0     0     0    48s    94%    0%   -   100%
 29%   2030    532  43986   41652      0     0     0    48s    94%    0%   -   100%
 30%   2093    530  45739   42320      0     0     0    48s    94%    0%   -   100%
 30%   2023    560  45544   43024      0     0     0    48s    94%   50%   T   100%
 30%   2114    545  45768   42940      8     0     0    48s    94%  100%   :   100%
 30%   1870    542  46671   42516      0     0     0    48s    94%  100%   :   100%
 31%   2131    536  45723   44172      0     0     0    48s    94%  100%   :   100%
 30%   1960    544  46833   43664      0     0     0    48s    94%  100%   :   100%
 31%   2140    559  47989   44004      0     0     0    48s    94%  100%   :   100%
 31%   2094    565  46966   43472      0     0     0    48s    94%  100%   :   100%
 31%   2070    555  47402   44596      0     0     0    48s    94%  100%   :   100%
 29%   2027    543  45051   41424      0     0     0    48s    94%  100%   :   100%
 27%   1817    523  44423   39748      0     0     0    48s    94%  100%   :   100%
 30%   2129    525  44410   42532    372     0     0    48s    94%   70%   :   100%
 30%   2114    547  45015   42136      0     0     0    48s    94%    0%   -   100%
 29%   1962    528  44107   41144      0     0     0    48s    94%    0%   -   100%

AMS01NAS001> nfsstat -l
10.8.2.1 <hostname unknown>
10.8.2.11 <hostname unknown> NFSOPS = 7298114 ( 7%)
10.8.2.12 <hostname unknown> NFSOPS = 7638814 ( 8%)
10.8.2.13 <hostname unknown> NFSOPS = 7743632 ( 8%)
10.8.2.15 <hostname unknown> NFSOPS = 7509558 ( 8%)
10.8.2.16 <hostname unknown> NFSOPS = 7473626 ( 7%)
10.8.2.17 <hostname unknown> NFSOPS = 7497070 ( 8%)
10.8.2.21 <hostname unknown> NFSOPS = 7586200 ( 8%)
10.8.2.22 <hostname unknown> NFSOPS = 7780060 ( 8%)
10.8.2.23 <hostname unknown> NFSOPS = 7920911 ( 8%)
10.8.2.24 <hostname unknown> NFSOPS = 7959184 ( 8%)
10.8.2.25 <hostname unknown> NFSOPS = 7643417 ( 8%)
10.8.2.26 <hostname unknown> NFSOPS = 7984915 ( 8%)
10.8.2.30 <hostname unknown> NFSOPS = 7479921 ( 7%)


NFSv2 information removed manually:

AMS01NAS001> nfsstat -t

Server rpc:
TCP:
calls badcalls nullrecv badlen xdrcall
100037183 0 0 0 0

UDP:
calls badcalls nullrecv badlen xdrcall
0 0 0 0 0

Server nfs:
calls badcalls
100037038 0

Server nfs V3: (100037038 calls)
null           getattr        setattr        lookup         access         readlink       read
0 0%           20691796 21%   0 0%           178967 0%      40195 0%       148 0%         79124715 79%
write          create         mkdir          symlink        mknod          remove         rmdir
0 0%           0 0%           0 0%           0 0%           0 0%           0 0%           0 0%
rename         link           readdir        readdir+       fsstat         fsinfo         pathconf
0 0%           0 0%           425 0%         0 0%           396 0%         396 0%         0 0%
commit
0 0%

Read request stats (version 3)
0-511      512-1023   1K-2047    2K-4095    4K-8191    8K-16383   16K-32767  32K-65535  64K-131071
0          0          0          0          1363016    749726     1023688    75988430   0
Write request stats (version 3)
0-511      512-1023   1K-2047    2K-4095    4K-8191    8K-16383   16K-32767  32K-65535  64K-131071
0          0          0          0          0          0          0          0          0
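
To make the operation mix in these counters easier to read, the shares can be worked out with a few lines of Python. This is only arithmetic on the numbers pasted above; the variable names are made up for the example, and nothing here talks to the filer.

```python
# Counters copied from the 'nfsstat -t' output above.
total_calls = 100037038
getattr_calls = 20691796
read_calls = 79124715

# Non-empty read-size histogram buckets (version 3) from the same output.
read_sizes = {
    "4K-8191": 1363016,
    "8K-16383": 749726,
    "16K-32767": 1023688,
    "32K-65535": 75988430,
}

# Share of each operation type among all NFS calls.
getattr_pct = 100.0 * getattr_calls / total_calls
read_pct = 100.0 * read_calls / total_calls
# Share of read requests that fall in the largest populated bucket.
large_read_pct = 100.0 * read_sizes["32K-65535"] / sum(read_sizes.values())

print(f"getattr: {getattr_pct:.1f}% of calls")
print(f"read:    {read_pct:.1f}% of calls")
print(f"reads of 32K or more: {large_read_pct:.1f}% of read requests")
```

In other words, the workload is almost entirely reads at the full 32k transfer size, with GETATTR making up most of the remainder.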



Any help you can provide would be very appreciated.

Thanks,
Jake



2003-04-01 19:05:55

by Bogdan Costescu

Subject: Re: NFSv4 and client caching to local disk?

On Mon, 31 Mar 2003, Jake Gold wrote:

> I have a FAS940 with a cluster of Linux machines mounting _read-only_
> volumes over NFSv3. My situation is the same as James Pearson's....those
> machines read large, rarely modified files, where memory caching doesn't
> help a whole lot. It would be very nice to be able to have my
> under-worked local disks take some of the load off the filer.

How about doing it in the application itself? Often the application has more
knowledge about what it needs than the OS. So in a case like the one that
you described I would copy the rarely modified file(s) to the local disk
and, just before accessing them in any way, check to see if they were
modified. If this is truly read-only you don't even need to check this, but
otherwise you could use some rsync-style transfer to get only the parts
that were modified (assuming that new data is not completely different)...
Modification can be detected from the file attributes, md5sum or something
else.
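
A minimal Python sketch of the copy-then-check idea follows. The function name `cached_copy` and the way cache file names are derived are hypothetical, and freshness is judged only from size and modification time, which is one of the options mentioned above; a checksum would be stricter but means reading the whole file over NFS. It copies whole files rather than doing an rsync-style partial transfer.

```python
import os
import shutil


def cached_copy(src: str, cache_dir: str) -> str:
    """Return a local path for src, refreshing the local copy if src changed.

    Freshness is judged only by size and mtime, which is an assumption of
    this sketch rather than a guarantee that the contents are identical.
    """
    os.makedirs(cache_dir, exist_ok=True)
    # Flatten the source path into one cache file name (hypothetical scheme;
    # two different source paths could in principle collide).
    local = os.path.join(cache_dir, src.strip(os.sep).replace(os.sep, "_"))
    st = os.stat(src)
    try:
        lst = os.stat(local)
        fresh = (lst.st_size == st.st_size
                 and int(lst.st_mtime) == int(st.st_mtime))
    except FileNotFoundError:
        fresh = False
    if not fresh:
        tmp = local + ".tmp"
        shutil.copy2(src, tmp)   # copy2 also copies the modification time
        os.replace(tmp, local)   # rename into place within the cache dir
    return local
```

The application would then open the path returned by `cached_copy("/mnt/nfs/some/file", "/var/tmp/nfscache")` instead of the NFS path; both paths here are placeholders.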

Application-level caching is more interesting when the cache size is
smaller than the data set. The OS would typically use an LRU strategy to
make some more room in the cache when needed, but this is not always the
best approach - the application is the only one that can have some more
information about what data will be needed in the near future.
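
To illustrate the difference, here is a small Python sketch of a cache index that evicts in LRU order but lets the application name files it expects to need soon. The class `FileCacheIndex` and the `needed_soon` hint are hypothetical; the sketch only does the bookkeeping and leaves deleting the evicted files to the caller.

```python
from collections import OrderedDict
from typing import Iterable, List


class FileCacheIndex:
    """Tracks cached files and picks eviction victims under a byte budget."""

    def __init__(self, max_bytes: int) -> None:
        self.max_bytes = max_bytes
        self.used = 0
        self.entries: "OrderedDict[str, int]" = OrderedDict()  # name -> size

    def touch(self, name: str, size: int) -> None:
        """Record an access; the most recently used entry moves to the end."""
        if name in self.entries:
            self.used -= self.entries.pop(name)
        self.entries[name] = size
        self.used += size

    def evict(self, needed_soon: Iterable[str] = ()) -> List[str]:
        """Evict least recently used files until the cache is under budget.

        Files named in needed_soon (the application's hint) are skipped,
        which is the knowledge a plain LRU in the OS does not have.
        """
        keep = set(needed_soon)
        victims = []
        for name in list(self.entries):   # oldest entries come first
            if self.used <= self.max_bytes:
                break
            if name in keep:
                continue
            self.used -= self.entries.pop(name)
            victims.append(name)
        return victims
```

With no hint this behaves like ordinary LRU; with a hint, an old but soon-needed file survives at the cost of evicting a more recently used one.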

--
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: [email protected]




2003-04-01 19:57:44

by Skottie Miller

Subject: Re: NFSv4 and client caching to local disk?

Bogdan Costescu wrote:
> On Mon, 31 Mar 2003, Jake Gold wrote:
>
>>I have a FAS940 with a cluster of Linux machines mounting _read-only_
>>volumes over NFSv3. My situation is the same as James Pearson's....those
>>machines read large, rarely modified files, where memory caching doesn't
>>help a whole lot. It would be very nice to be able to have my
>>under-worked local disks take some of the load off the filer.
>
> How about doing it in the application itself? Often the application has more
> knowledge about what it needs than the OS. So in a case like the one that
> you described I would copy the rarely modified file(s) to the local disk
> and, just before accessing them in any way, check to see if they were
> modified. If this is truly read-only you don't even need to check this, but
> otherwise you could use some rsync-style transfer to get only the parts
> that were modified (assuming that new data is not completely different)...
> Modification can be detected from the file attributes, md5sum or something
> else.

We do the caching to compute-node local disk, via the applications when we can.
But two things often make that difficult: (1) the working set of the cached data
is larger than the compute-node local disk, and (2) we don't own all the applications.

So, I complement the compute-node local disk caches (and sometimes replace them)
with a series of NetCache c2100 dNFS boxes. The origin data is on 960's,
which serve up their data to the user workstations, and to the caches.
The renderfarm compute nodes get almost all their high-traffic read-only
data off the caches. This maintains good interactive performance for the
user workstations and keeps the renderfarm fed.

-skottie


--

Scott Miller | Animation Technology
work: [email protected] | Dreamworks Feature Animation
life: [email protected]




2003-04-01 21:01:18

by Jake Gold

Subject: Re: NFSv4 and client caching to local disk?

Bogdan,

Thanks for the reply.

I am investigating that possibility...

We are using the Zeus Web Server to serve files off the NFS volume.
And there are some definite possibilities (using their Content Compression features)
or possibly a custom ISAPI solution we're considering...

I'm still very interested in a CacheFS-type solution, but I understand that the lack of demand
dictates its priority.


Jake




2003-04-01 20:56:38

by James Pearson

Subject: Re: NFSv4 and client caching to local disk?

We already do some distribution of large 'static' files to clients - but
we don't have full control of the 3rd party applications we run as to
where they look for their data - therefore it would be 'nice' to have
the option of allowing the OS to do this for us ...

James Pearson






2003-04-02 11:27:38

by Bogdan Costescu

Subject: Re: NFSv4 and client caching to local disk?

On Tue, 1 Apr 2003, Skottie Miller wrote:

> We do the caching to compute-node local disk, via the applications when
> we can. But two things often make that difficult: (1) the working set
> of the cached data is larger than the compute-node local disk,

Ah, so you are now defining a working set which is a subset of the whole data
set. In this case I would say that you can't win in any situation; it's
similar to a process that needs memory larger than RAM and has to use the
disk - either the OS does swapping or the process itself writes parts of
its memory to the disk, but it has to be done continuously as the process
jumps through memory - the disk is used in any case and the performance
is lowered... Buy larger disks :-)
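
To make the "working set larger than the cache" case concrete, here is a small simulation of an LRU cache. It is a toy model under stated assumptions: blocks are just integers, the access pattern is a repeated sequential pass (the worst case for LRU), and lru_hit_rate is a name invented for this example.

```python
from collections import OrderedDict


def lru_hit_rate(accesses, cache_size):
    """Replay a sequence of block IDs through an LRU cache; return the hit fraction."""
    cache = OrderedDict()
    hits = 0
    total = 0
    for block in accesses:
        total += 1
        if block in cache:
            hits += 1
            cache.move_to_end(block)  # mark as most recently used
        else:
            cache[block] = True
            if len(cache) > cache_size:
                cache.popitem(last=False)  # evict the least recently used block
    return hits / total if total else 0.0


# Ten passes over a working set of 100 blocks, always in the same order.
working_set = list(range(100)) * 10

print(lru_hit_rate(working_set, cache_size=100))  # set fits: only the first pass misses
print(lru_hit_rate(working_set, cache_size=90))   # set slightly too big: every access misses
```

With this particular access pattern, a cache that is only 10% too small gets no hits at all, because LRU always evicts the block that will be needed soonest. Other access patterns degrade more gracefully.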

> (2) we don't own all the applications.

Yes, obviously this is a problem and can only be solved if you are a big
enough customer for the software company :-)

But there might be another problem: if a mostly read-only data set is
needed on most or all clients, updating it on the server will very
quickly generate a storm of requests for transfers of the new
content. If you are using some blind OS level caching, this will create a
big load on the server or network congestion because most or all clients
will try to get the new data. With application level caching you might do
nice stuff like using some priority lists, using one client that already
got the data to send it to another client, etc.
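
One way the "priority list" idea could look in practice is to spread the clients' refreshes over a time window instead of letting them all hit the server at once. The sketch below is purely illustrative: refresh_delay, schedule, the client names and the window length are all assumptions made for the example, and it does not cover the client-to-client forwarding mentioned above.

```python
import hashlib


def refresh_delay(client_name, version, window_seconds):
    """Deterministic per-client delay in [0, window_seconds)."""
    digest = hashlib.sha256(f"{client_name}:{version}".encode()).digest()
    # Map the first 4 bytes of the hash to a fraction in [0, 1).
    fraction = int.from_bytes(digest[:4], "big") / 2**32
    return fraction * window_seconds


def schedule(clients, version, window_seconds, priority=()):
    """Return (delay, client) pairs sorted by start time.

    Clients named in `priority` refresh immediately; the rest are spread
    over the window according to a hash of their name and the new version.
    """
    plan = []
    for name in clients:
        if name in priority:
            delay = 0.0
        else:
            delay = refresh_delay(name, version, window_seconds)
        plan.append((delay, name))
    return sorted(plan)


if __name__ == "__main__":
    clients = [f"node{i:02d}" for i in range(8)]
    for delay, name in schedule(clients, "v2", 600, priority=("node00",)):
        print(f"{name}: start refresh after {delay:.0f}s")
```

Because the delay depends only on the client name and the version string, every client can compute its own slot without any coordination.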

--
Bogdan Costescu

IWR - Interdisziplinaeres Zentrum fuer Wissenschaftliches Rechnen
Universitaet Heidelberg, INF 368, D-69120 Heidelberg, GERMANY
Telephone: +49 6221 54 8869, Telefax: +49 6221 54 8868
E-mail: [email protected]




2003-04-02 11:57:15

by James Pearson

[permalink] [raw]
Subject: Re: NFSv4 and client caching to local disk?

Bogdan Costescu wrote:
>
> On Tue, 1 Apr 2003, Skottie Miller wrote:
>
> > We do the caching to compute-node local disk, via the applications when
> > we can. But two things often make that difficult: (1) the working set
> > of the cached data is larger than the compute-node local disk,
>
> Ah, so you define now a working set which is a subset of the whole data
> set. In this case I would say that you can't win in any situation; it's
> similar to a process that needs memory larger than RAM and has to use the
> disk - either the OS does swapping or the process itself writes parts of
> its memory to the disk, but it has to be done continuously as the process
> jumps through memory - the disk is used in any case and the performance
> is lowered... Buy larger disks :-)
>
> > (2) we don't own all the applications.
>
> Yes, obviously this is a problem and can only be solved if you are a big
> enough customer for the software company :-)
>
> But there might be some other problem: if the data set that is mostly read
> only is needed on most or all clients, updating it on the server will
> generate in a very short time a storm of requests for transfers of the new
> content. If you are using some blind OS level caching, this will create a
> big load on the server or network congestion because most or all clients
> will try to get the new data. With application level caching you might do
> nice stuff like using some priority lists, using one client that already
> got the data to send it to another client, etc.
>

I would like to have some blind OS level caching as a starting point ...
then use some higher level application to seed the cache. In this way,
the 3rd party application doesn't need to be altered and can just get
its data from the OS's cached copy ...
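
As a rough illustration of what "seeding" could mean: if the OS populated a local disk cache whenever a file was read through the NFS mount (an assumption; no such cache is described in this thread as existing), then a seeder would only need to read the files once. The function name seed_cache and the directory-walk approach are invented for this sketch.

```python
import os


def seed_cache(root, chunk_size=1 << 20):
    """Read every regular file under root once; return (files, bytes) read."""
    files = 0
    total = 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            path = os.path.join(dirpath, name)
            if not os.path.isfile(path):
                continue  # skip dangling symlinks, sockets, etc.
            with open(path, "rb") as f:
                # Read and discard the data; the point is only to pull it
                # through whatever cache sits underneath the mount.
                while True:
                    chunk = f.read(chunk_size)
                    if not chunk:
                        break
                    total += len(chunk)
            files += 1
    return files, total
```

The third-party application would then be started afterwards and read the same paths unchanged.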

James Pearson



2003-04-03 16:31:37

by Brashers_Per

[permalink] [raw]
Subject: Re: NFSv4 and client caching to local disk?

Since we are now outside of NFSv4, and into application level distribution,
I will throw in a few thoughts.

There are 3 major 'handles' that you can grab to move data: Files, Blocks,
and Tracks. Each of these has its limitations and idiosyncrasies, so yes,
integrating with apps is difficult. Running an agent and moving things
around via files is by far the most intuitive; I have had experience with
OnCourse doing this and it has proven stable. The big issue is how to find
the files in the first place, when to 'expire' them and such. Whereas
rules can be written to handle this, I am a bit too faint of heart to do so.

On the block level we all know SnapMirror or Celerra Replicator, and the
associated freeze/thaw methods used. The trouble with block level
replication is: who is allowed to apply the changes, and how do the apps get
to know about it? Since we are fortunate enough to not suffer a life of
windowze, we are in pretty good shape. But.... If we move to v4 and start
down the road of client side caching, how do we know when the server shall
mark the client's mem map dirty? What tolerances are there for
inconsistencies, and do we care if it is just a read, or do we check a
fingerprint every time....

Then comes the track level, think of raid 1 +1. That is a mirror of a
mirror that one can move all about. Now we get into even more trouble, as
we have a replica, but how does the server handle the swap off of the older
replica (assuming this is in a running state) to a newer one? Or worse yet,
we only have a single destination replica, so we either break rule #1 (never
allow 2 hosts to use the same partition at the same time) and make a
messaging interface, or we go through a freeze/thaw cycle again and flush
the mem map of the server mounting it. Even after all this is done, you're
still back in the world of notifying your clients.

My $.02 is not to put replication/caching in the protocol, but to have some
'intelligent' rules-based system that can act as a
push/pull/replicate/stale/redirect/refresh service, that can be tailored to
cope with the oddities that each application will be presenting. Failing
that, the next smoothest thing is to do the third mirror, as the development
of such an interface to do the meta-data passing could prove extremely
interesting: multiple r/w clients on a single data source.... (MPFS does
this if your i/o size is large enough)

Hope this has provoked some thoughts.

Per


Date: Tue, 01 Apr 2003 21:56:05 +0100
From: James Pearson <[email protected]>
To: Bogdan Costescu <[email protected]>
CC: [email protected]
Subject: Re: [NFS] NFSv4 and client caching to local disk?

We already do some distribution of large 'static' files to clients - but
we don't have full control of the 3rd party applications we run as to
where they look for their data - therefore it would be 'nice' to have
the option of allowing the OS to do this for us ...

James Pearson

Bogdan Costescu wrote:
> On Mon, 31 Mar 2003, Jake Gold wrote:
>
>
>>I have a FAS940 with a cluster of Linux machines mounting _read-only_
>>volumes over NFSv3. My situation is the same as James Pearson's....those
>>machines read large, rarely modified files, where memory caching doesn't
>>help a whole lot. It would be very nice to be able to have my
>>under-worked local disks take some of the load off the filer.
>
>
> How about doing it in the application itself? Often the application has more
> knowledge about what it needs than the OS. So in a case like the one that
> you described I would copy the rarely modified file(s) to the local disk
> and just before accessing them in any way check to see if they were
> modified. If this is true read-only you don't even need to check this, but
> otherwise you could use some rsync-style transfer to get only the parts
> that were modified (assuming that new data is not completely different)...
> Modification can be detected from the file attributes, md5sum or something
> else.
>
> The application level caching is more interesting when the cache size is
> smaller than the data set. The OS would typically use LRU strategy to
> make some more room in the cache when needed, but this is not always the
> best approach - the application is the only one that can have some more
> information about what data will be needed in the near future.
>




