From: "Lever, Charles" Subject: RE: Disabling the client cache - O_DIRECT? Date: Mon, 23 May 2005 17:22:21 -0700 Message-ID: <482A3FA0050D21419C269D13989C611308539B88@lavender-fe.eng.netapp.com> Mime-Version: 1.0 Content-Type: multipart/alternative; boundary="----_=_NextPart_001_01C55FF6.A6D23416" Cc: Return-path: Received: from sc8-sf-mx1-b.sourceforge.net ([10.3.1.11] helo=sc8-sf-mx1.sourceforge.net) by sc8-sf-list2.sourceforge.net with esmtp (Exim 4.30) id 1DaNBk-0002V1-KQ for nfs@lists.sourceforge.net; Mon, 23 May 2005 17:22:28 -0700 Received: from mx1.netapp.com ([216.240.18.38]) by sc8-sf-mx1.sourceforge.net with esmtp (Exim 4.41) id 1DaNBj-0006n7-VI for nfs@lists.sourceforge.net; Mon, 23 May 2005 17:22:28 -0700 To: "Edward Hibbert" Sender: nfs-admin@lists.sourceforge.net Errors-To: nfs-admin@lists.sourceforge.net List-Unsubscribe: , List-Id: Discussion of NFS under Linux development, interoperability, and testing. List-Post: List-Help: List-Subscribe: , List-Archive: This is a multi-part message in MIME format. ------_=_NextPart_001_01C55FF6.A6D23416 Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: quoted-printable offset and alignment of memory buffer and byte offset in the file shouldn't matter for NFS files. in 2.6.9, there is a 16MB per request size limit on systems with 4KB pages. =20 _____ =20 From: Edward Hibbert [mailto:Edward.Hibbert@dataconnection.com]=20 Sent: Monday, May 23, 2005 4:57 PM To: Lever, Charles Cc: nfs@lists.sourceforge.net Subject: RE: [NFS] Disabling the client cache - O_DIRECT? =09 =09 I'm most interested in a 2.6 kernel (2.6.9). =20 You mention the offset and alignment (of the memory buffer, I presume). Are there any restrictions on the length of data? =20 I have a simple test program, which seems to show that I need to use 512-byte alignment on a 2.6.9 kernel, but given what you say I'll need to look into it more carefully. =20 Edward. -----Original Message----- From: Lever, Charles [mailto:Charles.Lever@netapp.com] Sent: 23 May 2005 21:47 To: Edward Hibbert Cc: nfs@lists.sourceforge.net Subject: RE: [NFS] Disabling the client cache - O_DIRECT? =09 =09 the alignment issues are different depending on which kernel you use. =20 in 2.6 there should be no alignment issues, you can do O_DIRECT I/O on NFS files at any offset and alignment. =20 in 2.4, O_DIRECT is supported on NFS files after 2.4.22 ish. buffer and offset alignment is restricted. at some point we may have changed the behavior, but i'm not remembering when and which kernels, because we backported some of the 2.6 behavior into the Red Hat kernels. i believe it depends on what your server returns as its on-disk block size (NFSv3 PATHCONF). =20 -----Original Message----- From: Edward Hibbert [mailto:Edward.Hibbert@dataconnection.com]=20 Sent: Monday, May 23, 2005 4:13 PM To: nfs@lists.sourceforge.net Subject: RE: [NFS] Disabling the client cache - O_DIRECT? =09 =09 And then there's O_DIRECT, which might do just what I want. Is that working over NFS now? =20 =20 I'm trying to use it but getting EINVAL back - can someone explain whether O_DIRECT over NFS has the alignment restrictions that O_DIRECT does in general? 
-----Original Message-----
From: nfs-admin@lists.sourceforge.net [mailto:nfs-admin@lists.sourceforge.net] On Behalf Of Edward Hibbert
Sent: 23 May 2005 15:43
To: nfs@lists.sourceforge.net
Subject: [NFS] Disabling the client cache using file locks

The NFS FAQ says:

    For greater consistency among clients, applications can use file
    locking, where a client purges file data when a file is locked, and
    flushes changes back to the server before unlocking a file.

My experience with this is that to force a read to go to the NFS server, I:
- lock an area in the file (I think any area works; it doesn't need to overlap with what you want to read)
- read the data
- unlock it

When I want a write to go to the NFS server, I do the same. Obviously if I'm reading and writing the same area, I can save on the locks.

This produces a lot of lock and unlock operations, and I find that:
- NFS servers aren't as good at handling a high volume of these as they are at handling reads and writes.
- lockd/statd on the client may also be a serialisation point (I have suggestive but not conclusive evidence).

So I'm looking for ways of reducing the number of locks. Possibilities:
- Make a patch to the kernel (or nfs-utils?) to _entirely_ disable client-side caching of file contents over NFS. I have dedicated boxes which have no requirement to do client-side caching at all, so this is a serious option. Any code pointers?
- Use nolock to prevent the locks hitting the NFS server - but I think that would still hit lockd/statd on the client?
- Not bother with the locks around the writes, but use fsync instead. Do folk think that would work?

Any comments or other ideas?

Edward.
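For illustration, a minimal sketch of the lock/read/unlock pattern described above, using POSIX fcntl() locks (the kind the FAQ's purge-on-lock, flush-on-unlock behaviour applies to). The path is hypothetical, and the single-byte lock reflects the observation that the locked range need not overlap the data being read:

#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/* take or release a one-byte POSIX lock at the start of the file */
static int set_lock(int fd, short type)
{
    struct flock fl = {
        .l_type   = type,       /* F_RDLCK, F_WRLCK or F_UNLCK */
        .l_whence = SEEK_SET,
        .l_start  = 0,
        .l_len    = 1,
    };
    return fcntl(fd, F_SETLKW, &fl);
}

int main(void)
{
    char buf[4096];
    ssize_t n;

    int fd = open("/mnt/nfs/shared", O_RDWR);   /* hypothetical path */
    if (fd < 0) { perror("open"); return 1; }

    /* taking the lock purges cached file data, so the read below
       goes to the server instead of the local page cache */
    if (set_lock(fd, F_RDLCK) < 0) { perror("lock"); return 1; }

    n = pread(fd, buf, sizeof(buf), 0);
    if (n < 0) perror("pread");

    /* releasing the lock flushes any dirty data back to the server */
    if (set_lock(fd, F_UNLCK) < 0) perror("unlock");

    close(fd);
    return 0;
}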
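And a sketch of the fsync-instead-of-locks idea for the write side, under the same assumptions. One caveat worth noting: fsync() pushes this client's dirty pages to the server, but unlike the unlock above it does nothing to make other clients drop their cached copies:

#include <stdio.h>
#include <unistd.h>

/* write some data and force it out to the NFS server immediately,
   instead of relying on an unlock or close to flush it */
int write_through(int fd, const void *buf, size_t len, off_t off)
{
    if (pwrite(fd, buf, len, off) < 0) {
        perror("pwrite");
        return -1;
    }
    return fsync(fd);   /* flushes dirty pages to the server */
}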