2003-09-21 15:47:41

by Walt H

Subject: 2.6.0-test5-mm3 & XFS FS Corruption

execve("/bin/cp", ["cp", "fstab.backup", "fstab"], [/* 41 vars */]) = 0
uname({sysname="Linux", nodename="waltsathlon.localhost.net", release="2.6.0-test5-mm3", version="#2 SMP Fri Sep 19 19:34:53 PDT 2003", machine="i686"}) = 0
brk(0) = 0x8056000
open("/etc/ld.so.preload", O_RDONLY) = 3
fstat64(3, {st_dev=makedev(9, 4), st_ino=1711, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=131072, st_blocks=0, st_size=0, st_atime=2003/09/19-20:20:20, st_mtime=2003/09/02-20:05:15, st_ctime=2003/09/02-20:05:15}) = 0
close(3) = 0
open("/etc/ld.so.cache", O_RDONLY) = 3
fstat64(3, {st_dev=makedev(9, 4), st_ino=606953, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=131072, st_blocks=216, st_size=107854, st_atime=2003/09/20-21:38:16, st_mtime=2003/09/20-21:06:32, st_ctime=2003/09/20-21:06:32}) = 0
mmap2(NULL, 107854, PROT_READ, MAP_PRIVATE, 3, 0) = 0x40000000
close(3) = 0
open("/lib/libc.so.6", O_RDONLY) = 3
read(3, "\177ELF\1\1\1\0\0\0\0\0\0\0\0\0\3\0\3\0\1\0\0\0`\0305A"..., 1024) = 1024
fstat64(3, {st_dev=makedev(9, 4), st_ino=686802, st_mode=S_IFREG|0755, st_nlink=1, st_uid=0, st_gid=0, st_blksize=131072, st_blocks=2840, st_size=1452573, st_atime=2003/09/20-21:38:16, st_mtime=2003/08/08-20:07:48, st_ctime=2003/09/12-07:10:22}) = 0
mmap2(NULL, 4096, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0x4001b000
mmap2(0x4133c000, 1215204, PROT_READ|PROT_EXEC, MAP_PRIVATE, 3, 0) = 0x4133c000
mprotect(0x4145f000, 23268, PROT_NONE) = 0
mmap2(0x4145f000, 16384, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED, 3, 0x122) = 0x4145f000
mmap2(0x41463000, 6884, PROT_READ|PROT_WRITE, MAP_PRIVATE|MAP_FIXED|MAP_ANONYMOUS, -1, 0) = 0x41463000
close(3) = 0
munmap(0x40000000, 107854) = 0
brk(0) = 0x8056000
brk(0x8057000) = 0x8057000
brk(0) = 0x8057000
geteuid32() = 0
umask(0) = 022
lstat64("fstab", {st_dev=makedev(254, 4), st_ino=12583479, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=838, st_atime=2003/09/20-21:33:54, st_mtime=2003/09/19-20:11:19, st_ctime=2003/09/20-21:33:54}) = 0
stat64("fstab", {st_dev=makedev(254, 4), st_ino=12583479, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=838, st_atime=2003/09/20-21:33:54, st_mtime=2003/09/19-20:11:19, st_ctime=2003/09/20-21:33:54}) = 0
stat64("fstab.backup", {st_dev=makedev(254, 4), st_ino=12583203, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=697, st_atime=2003/09/20-08:32:58, st_mtime=2003/07/14-16:29:28, st_ctime=2003/07/15-17:38:17}) = 0
stat64("fstab", {st_dev=makedev(254, 4), st_ino=12583479, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=838, st_atime=2003/09/20-21:33:54, st_mtime=2003/09/19-20:11:19, st_ctime=2003/09/20-21:33:54}) = 0
open("fstab.backup", O_RDONLY|O_LARGEFILE) = 3
fstat64(3, {st_dev=makedev(254, 4), st_ino=12583203, st_mode=S_IFREG|0644, st_nlink=1, st_uid=0, st_gid=0, st_blksize=4096, st_blocks=8, st_size=697, st_atime=2003/09/20-08:32:58, st_mtime=2003/07/14-16:29:28, st_ctime=2003/07/15-17:38:17}) = 0
open("fstab", O_WRONLY|O_TRUNC|O_LARGEFILE) = -1 EPERM (Operation not permitted)
write(2, "cp: ", 4) = 4
write(2, "cannot create regular file `fsta"..., 34) = 34
write(2, ": Operation not permitted", 25) = 25
write(2, "\n", 1) = 1
close(3) = 0
_exit(1) = ?
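
A note on reading the trace above: geteuid32() returns 0 and fstab is a regular file with mode 0644, so ordinary permission bits would not normally stop root from opening it for writing. An EPERM in that situation usually points at something other than the mode bits, such as per-inode flags. A minimal C sketch for checking the same thing outside of cp follows. The helper name try_open_for_write is hypothetical, and unlike cp it deliberately leaves out O_TRUNC so that a successful probe does not empty the file.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical helper, not from the thread: try to open `path` for
 * writing and report what happened. Returns 0 on success, otherwise
 * the errno value set by open(2). O_TRUNC is omitted on purpose so the
 * probe is non-destructive.
 */
static int try_open_for_write(const char *path)
{
    int fd = open(path, O_WRONLY);

    if (fd < 0)
        return errno;   /* e.g. EPERM, EACCES, ENOENT */
    close(fd);
    return 0;
}
```

Calling try_open_for_write("fstab") as root and comparing the result against EPERM would reproduce the failing step in isolation, without the rest of cp's logic.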


Attachments:
fstab-strace.txt (3.76 kB)

2003-09-21 18:08:44

by Walt H

Subject: Re: 2.6.0-test5-mm3 & XFS FS Corruption (or not?)

Just a follow-up to my earlier post:

I've put the xfs code from mm2 into the mm3 tree and all files get
copied and I can manually copy the fstab.backup file afterward. I
realized that the "rebuilding directory inode 256" was the lost+found
directory, which contained 4 old zero length files. That was the key.
XFS under -mm2 doesn't care about old lost+found directories, while -mm3
does. If I remove the source lost+found/ and retry the rsync with -mm3,
it finishes fine and I can copy fstab files. Adding a bogus lost+found
dir with any file in it at the source, and retrying the rsync will lead
to a state where I can't overwrite the existing /etc/fstab file at the
end. So it doesn't look like there's actually any filesystem corruption,
just a strange bug. Hope that helps,

-Walt

2003-09-21 19:48:32

by Steve Lord

Subject: Re: 2.6.0-test5-mm3 & XFS FS Corruption (or not?)

On Sun, 2003-09-21 at 13:08, Walt H wrote:
> Just a follow-up to my earlier post:
>
> I've put the xfs code from mm2 into the mm3 tree and all files get
> copied and I can manually copy the fstab.backup file afterward. I
> realized that the "rebuilding directory inode 256" was the lost+found
> directory, which contained 4 old zero length files. That was the key.
> XFS under -mm2 doesn't care about old lost+found directories, while -mm3
> does. If I remove the source lost+found/ and retry the rsync with -mm3,
> it finishes fine and I can copy fstab files. Adding a bogus lost+found
> dir with any file in it at the source, and retrying the rsync will lead
> to a state where I can't overwrite the existing /etc/fstab file at the
> end. So it doesn't look like there's actually any filesystem corruption,
> just a strange bug. Hope that helps,
>
> -Walt
>

If I am correct, test5-mm3 contains a bad version of the xfs code: there
was a bug where the i_flags field was set up from an uninitialized stack
variable. mm3 came out during the two days this was in Linus's tree.
I had some very odd behavior with this code base: rm -r -f would try to
cd into files, among other bizarre things, and files could appear to be
immutable, append-only, or other things they were not. This sounds
similar to the behavior you saw. It is fixed in the latest code Linus
has.

Steve
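
For anyone who wants to check whether a file merely appears immutable or append-only, here is a rough C sketch. It assumes the generic FS_IOC_GETFLAGS ioctl from <linux/fs.h>, which is available on current Linux systems. The thread does not mention this interface, and whether the 2003-era XFS code answered it is not something stated here, so treat this as an illustration only. The helper names are hypothetical.

```c
#include <fcntl.h>
#include <linux/fs.h>    /* FS_IOC_GETFLAGS, FS_IMMUTABLE_FL, FS_APPEND_FL */
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper: read the per-inode flags of `path` into *flags.
 * Returns 0 on success, -1 if the file cannot be opened or the
 * filesystem does not support the ioctl.
 */
static int get_inode_flags(const char *path, int *flags)
{
    int fd = open(path, O_RDONLY);
    int rc;

    if (fd < 0)
        return -1;
    rc = ioctl(fd, FS_IOC_GETFLAGS, flags);
    close(fd);
    return rc == 0 ? 0 : -1;
}

/* Nonzero if the flags word has the immutable or append-only bit set. */
static int looks_write_protected(int flags)
{
    return (flags & (FS_IMMUTABLE_FL | FS_APPEND_FL)) != 0;
}
```

If get_inode_flags() succeeds on a file that nobody marked with chattr and looks_write_protected() still returns nonzero, that would be consistent with the kind of bogus flag state described above.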




2003-09-22 01:01:09

by Walt H

Subject: Re: 2.6.0-test5-mm3 & XFS FS Corruption (or not?)

Steve Lord wrote:

>
> If I am correct, test5-mm3 contains a bad version of the xfs code: there
> was a bug where the i_flags field was set up from an uninitialized stack
> variable. mm3 came out during the two days this was in Linus's tree.
> I had some very odd behavior with this code base: rm -r -f would try to
> cd into files, among other bizarre things, and files could appear to be
> immutable, append-only, or other things they were not. This sounds
> similar to the behavior you saw. It is fixed in the latest code Linus
> has.
>
> Steve

Thanks for the reply, Steve. I'm guessing that this code hasn't hit CVS
yet, as I can still reproduce it with a current CVS @ 9/21/03 ~ 17:30
PST. Sounds like this is a known issue, so I'll just go back to the xfs
code from -mm2 for now.

-Walt



2003-09-22 01:15:12

by Nathan Scott

Subject: Re: 2.6.0-test5-mm3 & XFS FS Corruption (or not?)

On Sun, Sep 21, 2003 at 06:01:06PM -0700, Walt H wrote:
> Steve Lord wrote:
> >
> > If I am correct, test5-mm3 contains a bad version of the xfs code: there
> > was a bug where the i_flags field was set up from an uninitialized stack
> > variable. mm3 came out during the two days this was in Linus's tree.
> > I had some very odd behavior with this code base: rm -r -f would try to
> > cd into files, among other bizarre things, and files could appear to be
> > immutable, append-only, or other things they were not. This sounds
> > similar to the behavior you saw. It is fixed in the latest code Linus
> > has.
>
> Thanks for the reply, Steve. I'm guessing that this code hasn't hit CVS
> yet, as I can still reproduce it with a current CVS @ 9/21/03 ~ 17:30
> PST. Sounds like this is a known issue, so I'll just go back to the xfs
> code from -mm2 for now.
>

The fix is below; I'd be interested in whether or not you still have
problems after applying this.

thanks.

--
Nathan


--- /usr/tmp/TmpDir.2990917-0/linux/fs/xfs/linux/xfs_vnode.c_1.117 Mon Sep 22 11:10:21 2003
+++ linux/fs/xfs/linux/xfs_vnode.c Fri Sep 19 13:17:14 2003
@@ -200,7 +200,7 @@
 	vn_trace_entry(vp, "vn_revalidate", (inst_t *)__return_address);
 	ASSERT(vp->v_fbhv != NULL);
 
-	va.va_mask = XFS_AT_STAT;
+	va.va_mask = XFS_AT_STAT|XFS_AT_GENCOUNT;
 	VOP_GETATTR(vp, &va, 0, NULL, error);
 	if (!error) {
 		inode = LINVFS_GET_IP(vp);
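
One plausible reading of this one-line change, offered as an assumption rather than something the thread spells out: the attribute call fills in only the fields selected by va_mask, so a field outside the requested mask keeps whatever was already on the stack, and the caller then copies that garbage into the inode. The C sketch below models that pattern in user space. Every name in it (struct attrs, AT_STAT, AT_EXTRA, FLAG_IMMUTABLE, get_attrs, revalidate_flags) is hypothetical and unrelated to the real XFS definitions, and the memset with 0xAA stands in for leftover stack contents so the result is deterministic.

```c
#include <string.h>

#define AT_STAT         0x1u   /* hypothetical: basic stat fields   */
#define AT_EXTRA        0x2u   /* hypothetical: extended fields     */
#define FLAG_IMMUTABLE  0x08u  /* hypothetical "immutable" flag bit */

struct attrs {
    unsigned int  mask;   /* which fields the caller asks for */
    unsigned long size;   /* filled when AT_STAT is set       */
    unsigned int  flags;  /* filled when AT_EXTRA is set      */
};

/* Fill only the fields selected by a->mask, leaving the rest alone. */
static void get_attrs(struct attrs *a)
{
    if (a->mask & AT_STAT)
        a->size = 838;
    if (a->mask & AT_EXTRA)
        a->flags = 0;     /* the file has no special flags */
}

/* Model of a revalidate step that copies flags out of the struct. */
static unsigned int revalidate_flags(unsigned int mask)
{
    struct attrs a;

    memset(&a, 0xAA, sizeof(a));  /* stand-in for stale stack contents */
    a.mask = mask;
    get_attrs(&a);
    return a.flags;               /* garbage unless AT_EXTRA was requested */
}
```

With only AT_STAT in the mask, revalidate_flags() returns the poison pattern, which here happens to include the FLAG_IMMUTABLE bit. With AT_STAT|AT_EXTRA it returns 0. That mirrors the shape of the fix: widening the mask so the field is actually filled in before it is used.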

2003-09-22 01:28:35

by Walt H

Subject: Re: 2.6.0-test5-mm3 & XFS FS Corruption (or not?)

Nathan Scott wrote:

> The fix is below; I'd be interested in whether or not you still have
> problems after applying this.
>
> thanks.
>

That appears to have cleared it up. I tried the tests from my earlier
e-mail (creating a bogus lost+found, etc.) and couldn't get the
filesystem to fail. Mind you, I only ran an rsync over a 2GB filesystem,
but previously the problem was exhibited 100% of the time. I'll bang on
this for a while. Hopefully you don't hear back from me right away :)
Thanks,

-Walt