From: Ivan Kanis
Date: Wed, 15 Nov 2000 15:24:39 -0800 (PST)
To: linux-kernel@vger.kernel.org
Subject: [BUG] knfsd causes file system corruption when files are locked.
Message-ID: <14867.6967.292440.394045@jedi.wrq.com>

[1.] knfsd causes file system corruption when files are locked.

[2.] Lock down a file using the NLM_SHARE sharing mechanism. Remove the
file. Unlock the file using NLM_UNSHARE. The filesystem does not recover
the file space. I am running this on ext2fs. Fsck-ing the filesystem does
not help. The only way to recover the space is to reformat the partition.

[3.] knfsd, lock, NLM_SHARE, NLM_UNSHARE

[4.] Linux version 2.2.16 (root@jedi) (gcc version 2.7.2.3)

[5.] N/A

[6.] This test.c program will reproduce the problem. You need to compile
it on a Solaris machine because Linux fcntl does not support NLM_SHARE.

-----start here
/* Headers needed for printf/perror, memset, unlink, and open/fcntl. */
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>

int main (int argc, char *argv[])
{
  struct fshare lck;
  int fd, ret;

  if (argc != 2)
    {
      printf ("Usage: %s file to lock\n", argv[0]);
      return 1;
    }

  fd = open (argv[1], O_WRONLY);
  if (fd < 0)
    {
      perror ("open");
      return 1;
    }

  /* Clear the whole share structure, then ask for write access
     without denying anything to other openers. */
  memset (&lck, 0, sizeof (lck));
  lck.f_access = F_WRACC;
  lck.f_deny = F_NODNY;

  ret = fcntl (fd, F_SHARE, &lck);      /* sent as NLM SHARE */
  if (ret < 0)
    perror ("fcntl F_SHARE");
  unlink (argv[1]);                     /* remove the file while it is shared */
  ret = fcntl (fd, F_UNSHARE, &lck);    /* sent as NLM UNSHARE */
  if (ret < 0)
    perror ("fcntl F_UNSHARE");
  return 0;
}
-----end here

Steps to reproduce the problem

- Compile the program:
    gcc test.c -o test
- Mount a Linux nfs partition on Solaris. (Remember the partition will
  get corrupted, use a partition that you don't care about.)
    mount -o vers=2 jedi:/sandbox /mnt
- Create a chunk of data on /mnt:
    dd if=/dev/zero of=/mnt/chunk count=10000
- Do a df before running the program.
- Run the test program:
    ./test /mnt/chunk
- Run df again. The free space remains the same. The space is gone
  till you reformat the partition.

[7.] This bug was seen on a Debian 2.2 machine. We have seen the same
thing happen on systems running Red Hat 6.2 and TurboLinux 6.0
distributions.

[7.1] Environment:

Kernel modules         found
Gnu C                  2.95.2
Binutils               2.9.5.0.37
Linux C Library        ..
Dynamic Linker (ld.so) 1.9.11
ls: /usr/lib/libg++.so: No such file or directory
Procps                 2.0.6
Mount                  2.10f
Net-tools              (1999-04-20)
Kbd                    0.99
Sh-utils               2.0
Sh-utils               Parker.
Sh-utils
Sh-utils               Inc.
Sh-utils               NO
Sh-utils               PURPOSE.

[7.2] Processor information

processor       : 0
vendor_id       : GenuineIntel
cpu family      : 6
model           : 5
model name      : Pentium II (Deschutes)
stepping        : 2
cpu MHz         : 447.700
cache size      : 512 KB
fdiv_bug        : no
hlt_bug         : no
sep_bug         : no
f00f_bug        : no
coma_bug        : no
fpu             : yes
fpu_exception   : yes
cpuid level     : 2
wp              : yes
flags           : fpu vme de pse tsc msr pae mce cx8 apic sep mtrr pge mca cmov pat pse36 mmx fxsr
bogomips        : 891.29

[7.3] Module information

aic7xxx               124328   1
nfsd                  140436   8 (autoclean)
snd-pcm-oss            16840   1 (autoclean)
snd-pcm-plugin         13000   0 (autoclean) [snd-pcm-oss]
snd-mixer-oss           4308   1 (autoclean) [snd-pcm-oss]
snd-card-cs4236         5224   2
snd-mpu401-uart         2356   0 [snd-card-cs4236]
snd-rawmidi             9752   0 [snd-mpu401-uart]
snd-seq-device          3476   0 [snd-rawmidi]
isapnp                 27572   0 [snd-card-cs4236]
snd-cs4236             20580   0 [snd-card-cs4236]
snd-cs4231             19008   0 [snd-card-cs4236 snd-cs4236]
snd-mixer              23536   0 [snd-mixer-oss snd-cs4236 snd-cs4231]
snd-pcm                29784   0 [snd-pcm-oss snd-pcm-plugin snd-cs4231]
snd-opl3                4328   0 [snd-card-cs4236]
snd-timer               8224   0 [snd-cs4231 snd-pcm snd-opl3]
snd-hwdep               3052   0 [snd-opl3]
snd                    36300   1 [snd-pcm-oss snd-pcm-plugin snd-mixer-oss snd-card-cs4236 snd-mpu401-uart snd-rawmidi snd-seq-device snd-cs4236 snd-cs4231 snd-mixer snd-pcm snd-opl3 snd-timer snd-hwdep]
soundcore               2448   3 [snd]
3c59x                  18212   1

[7.4] SCSI Information

Attached devices:
Host: scsi0 Channel: 00 Id: 05 Lun: 00
  Vendor: NEC      Model: CD-ROM DRIVE:465 Rev: 1.03
  Type:   CD-ROM                           ANSI SCSI revision: 02

[7.5] N/A

[8.]
Here is a trace from the Solaris snoop program while the test program
mentioned above is running:

         sun -> jedi.wrq.com NFS C LOOKUP2 FH=2344 chunk
jedi.wrq.com -> sun          NFS R LOOKUP2 OK FH=D308
         sun -> jedi.wrq.com NLM C SHARE3 OH=0009 FH=D308 Mode=0 Access=2
jedi.wrq.com -> sun          NLM R SHARE3 OH=0009 granted 0
         sun -> jedi.wrq.com NLM C UNSHARE3 OH=000A FH=D308 Mode=0 Access=2
jedi.wrq.com -> sun          NLM R UNSHARE3 OH=000A granted 0
         sun -> jedi.wrq.com NFS C GETATTR2 FH=2344
jedi.wrq.com -> sun          NFS R GETATTR2 OK
         sun -> jedi.wrq.com NFS C LOOKUP2 FH=2344 chunk
jedi.wrq.com -> sun          NFS R LOOKUP2 OK FH=D308
         sun -> jedi.wrq.com NFS C LOOKUP2 FH=2344 .nfs5FC7
jedi.wrq.com -> sun          NFS R LOOKUP2 No such file or directory
         sun -> jedi.wrq.com NFS C RENAME2 FH=2344 chunk to FH=2344 .nfs5FC7
jedi.wrq.com -> sun          NFS R RENAME2 OK
         sun -> jedi.wrq.com NLM C UNSHARE3 OH=000B FH=D308 Mode=0 Access=1
jedi.wrq.com -> sun          NLM R UNSHARE3 OH=000B granted 0
         sun -> jedi.wrq.com NFS C REMOVE2 FH=2344 .nfs5FC7
jedi.wrq.com -> sun          NFS R REMOVE2 OK
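
For anyone reading the trace above, the Mode= and Access= numbers in the
NLM lines can be turned back into names. The following is only a rough
sketch: it assumes snoop prints the raw fsh_mode and fsh_access values
from the NLM protocol definition (0..3 each), which the message itself
does not state, and the function names lookup, parse_share_line, and
describe_share_line are made up for this illustration.

```c
#include <stdio.h>
#include <string.h>

/* Assumed mapping: fsh_mode (deny) and fsh_access values of the NLM protocol. */
const char *const deny_names[]   = { "deny none", "deny read",
                                     "deny write", "deny read/write" };
const char *const access_names[] = { "none", "read", "write", "read/write" };

/* Return the name for a value in 0..3, or "unknown" for anything else. */
const char *lookup(const char *const names[], int value)
{
    return (value >= 0 && value <= 3) ? names[value] : "unknown";
}

/*
 * Pull the numbers out of "Mode=<n>" and "Access=<n>" in one snoop
 * summary line.  Returns 0 on success, -1 if either field is missing.
 */
int parse_share_line(const char *line, int *mode, int *access)
{
    const char *m = strstr(line, "Mode=");
    const char *a = strstr(line, "Access=");

    if (m == NULL || a == NULL)
        return -1;
    if (sscanf(m, "Mode=%d", mode) != 1 ||
        sscanf(a, "Access=%d", access) != 1)
        return -1;
    return 0;
}

/* Write a short description of the share request on this line into buf. */
int describe_share_line(const char *line, char *buf, size_t size)
{
    int mode, access;

    if (parse_share_line(line, &mode, &access) != 0)
        return -1;
    snprintf(buf, size, "access: %s, %s",
             lookup(access_names, access), lookup(deny_names, mode));
    return 0;
}
```

Under that assumption, the SHARE3 call and the first UNSHARE3 call
(Mode=0 Access=2) decode to write access with nothing denied, which is
what F_WRACC and F_NODNY in test.c ask for, and the last UNSHARE3 call
(Mode=0 Access=1) decodes to read access with nothing denied.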