From: "Alan Witz" <awitz@magstarinc.com>
To: "Lever, Charles"
Subject: Re: Corrupt Data when using NFS on Linux
Date: Mon, 28 Oct 2002 18:05:06 -0500
Message-ID: <007601c27ed6$74986240$2864a8c0@alanw>
List-Id: Discussion of NFS under Linux development, interoperability, and testing.

Thanks for your quick response.

The operating environment is Red Hat Linux 2.4.18-17.7.xsmp. We are running NFS version 3. We have implemented some rudimentary file locking of our own to try to work around the problem. It consists of creating a lock file that acts as a flag telling the other clients not to access the file: if the lock file exists, the other clients wait until it is cleared before writing to the database file. To make this reliable, the "lock" flag is created with the "ln" command, so that checking for a lock and setting a lock happen in essentially one step (this eliminates the window in which another client could set the lock after the current client has checked for the lock but before it can set the lock itself). We are also running NFS in synchronous mode to reduce the chance of data corruption from multiple clients.
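A minimal sketch of the ln-based test-and-set described above (file names are hypothetical placeholders, not the actual application's):

```shell
#!/bin/sh
# Sketch of ln-based locking. ln(1) fails if the target already exists,
# so checking for the lock and creating it happen as one atomic step on
# the server, closing the check-then-set race between clients.
DB=appgen.dat
LOCK=appgen.lock

touch "$DB"                         # stand-in for the real database file

until ln "$DB" "$LOCK" 2>/dev/null; do
    sleep 1                         # another client holds the lock; retry
done

# ... read/modify/write "$DB" here, under the lock ...

rm -f "$LOCK"                       # release the lock
```

The reason `ln` is used rather than a plain existence check plus `touch` is that hard-link creation is a single operation on the server: it either succeeds or fails with "file exists", with no window in between.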
The mount options are as follows:

    rsize=8192,wsize=8192,noac,hard,sync,nfsvers=3

Any thoughts would be greatly appreciated.

        Alan Witz

----- Original Message -----
From: Lever, Charles
To: 'Alan Witz'
Cc: nfs@lists.sourceforge.net
Sent: Monday, October 28, 2002 3:48 PM
Subject: RE: [NFS] Corrupt Data when using NFS on Linux

hi alan-

that information is crap, and should be removed from wherever you found it.

the problem is that typical file systems used on *Linux* NFS servers (like ext2) can't store time stamps with sub-second resolution. this is not a problem with typical commercial NFS servers like Solaris or NetApp filers. i'm not aware of any plan to address this specific problem in 2.5, but that doesn't mean it won't be.

can you tell us more about your environment, especially which kernel is running on your clients and what mount options you're using?

-----Original Message-----
From: Alan Witz [mailto:awitz@magstarinc.com]
Sent: Monday, October 28, 2002 3:07 PM
To: nfs@lists.sourceforge.net
Subject: [NFS] Corrupt Data when using NFS on Linux

I work for a small software company that recently began using NFS to implement a solution using a lesser-known database (Appgen). The problem is that we're getting lots of corrupt database files among the files modified via NFS. The on-line manual on linux.org makes the following reference, which I think may be relevant:

    7.10. File Corruption When Using Multiple Clients

    If a file has been modified within one second of its previous modification and left the same size, it will continue to generate the same inode number. Because of this, constant reads and writes to a file by multiple clients may cause file corruption. Fixing this bug requires changes deep within the filesystem layer, and therefore it is a 2.5 item.

I was wondering if someone could clarify what is meant by this. What is the relevance of the inode number?
And doesn't the inode of the file stay the same even if it is being modified? Any help would be greatly appreciated. Even some direction as to where else I might look would be helpful.

Thanks,

Alan Witz
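The whole-second timestamp granularity described in the reply above can be observed directly. A minimal sketch, assuming GNU date and a filesystem that stores one-second timestamps (e.g. ext2):

```shell
#!/bin/sh
# On a filesystem with one-second timestamps, two same-size writes inside
# the same second leave both size and mtime unchanged, so an NFS client
# comparing cached attributes cannot tell that the second write happened.
f=$(mktemp)

printf 'AAAA' > "$f"
t1=$(date -r "$f" +%s)              # mtime in whole seconds (GNU date -r)

printf 'BBBB' > "$f"                # same size, rewritten immediately
t2=$(date -r "$f" +%s)

echo "mtime delta: $((t2 - t1))s"   # usually 0 on such filesystems
rm -f "$f"
```

This is why attribute-based cache revalidation (even with `noac` forcing frequent attribute fetches) can miss an update made by another client within the same second.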