2000-11-23 06:22:27

by Mohammad A. Haque

Subject: ext2 filesystem corruptions back from dead? 2.4.0-test11

I just got these while doing many compiles on my box ....

Nov 23 00:40:06 viper kernel: EXT2-fs warning (device ide0(3,3)):
ext2_unlink: Deleting nonexistent file (622295), 0
Nov 23 00:40:06 viper kernel: = 1
Nov 23 00:40:06 viper kernel: EXT2-fs error (device ide0(3,3)):
ext2_free_blocks: Freeing blocks not in datazone - block = 540028982,
count = 1
Nov 23 00:40:06 viper kernel: EXT2-fs error (device ide0(3,3)):
ext2_free_blocks: Freeing blocks not in datazone - block = 540024880,
count = 1
Nov 23 00:40:06 viper kernel: EXT2-fs error (device ide0(3,3)):
ext2_free_blocks: Freeing blocks not in datazone - block = 170926128,
count = 1


What else should I provide?

[mhaque@viper mhaque]$ uname -a
Linux viper.haque.net 2.4.0-test11 #6 Sun Nov 19 22:17:33 EST 2000 i686
unknown

Nov 16 19:03:15 viper kernel: Uniform Multi-Platform E-IDE driver
Revision: 6.31
Nov 16 19:03:15 viper kernel: ide: Assuming 33MHz system bus speed for
PIO modes; override with idebus=xx
Nov 16 19:03:15 viper kernel: PIIX4: IDE controller on PCI bus 00 dev 39
Nov 16 19:03:15 viper kernel: PIIX4: chipset revision 1
Nov 16 19:03:15 viper kernel: PIIX4: not 100%% native mode: will probe
irqs later
Nov 16 19:03:15 viper kernel: ide0: BM-DMA at 0xf000-0xf007, BIOS
settings: hda:DMA, hdb:DMA
Nov 16 19:03:15 viper kernel: ide1: BM-DMA at 0xf008-0xf00f, BIOS
settings: hdc:DMA, hdd:DMA

Nov 16 19:03:15 viper kernel: hda: IBM-DJNA-371350, ATA DISK drive
Nov 16 19:03:15 viper kernel: hdb: CREATIVEDVD-ROM DVD2240E 12/24/97,
ATAPI CDROM drive
Nov 16 19:03:15 viper kernel: hdc: Maxtor 82160D2, ATA DISK drive
Nov 16 19:03:15 viper kernel: hdd: IBM-DTLA-307045, ATA DISK drive

--

=====================================================================
Mohammad A. Haque http://www.haque.net/

"Alcohol and calculus don't mix. Project Lead
Don't drink and derive." --Unknown http://wm.themes.org/
=====================================================================


2000-11-23 06:33:40

by NeilBrown

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

On Thursday November 23, [email protected] wrote:
> I just got these while doing many compiles on my box ....
>
> Nov 23 00:40:06 viper kernel: EXT2-fs warning (device ide0(3,3)):
> ext2_unlink: Deleting nonexistent file (622295), 0
> Nov 23 00:40:06 viper kernel: = 1
> Nov 23 00:40:06 viper kernel: EXT2-fs error (device ide0(3,3)):
> ext2_free_blocks: Freeing blocks not in datazone - block = 540028982,
> count = 1
> Nov 23 00:40:06 viper kernel: EXT2-fs error (device ide0(3,3)):
> ext2_free_blocks: Freeing blocks not in datazone - block = 540024880,
> count = 1
> Nov 23 00:40:06 viper kernel: EXT2-fs error (device ide0(3,3)):
> ext2_free_blocks: Freeing blocks not in datazone - block = 170926128,
> count = 1

Oh, good. It's not just me and Tigran then. I was at first blaming
my raid5 code for this, but if you get it and Tigran gets it (reported
http://boudicca.tux.org/hypermail/linux-kernel/2000week48/0257.html
) then it's probably not me.

An interesting data point:
my raid5 reports when it gets a read or write request on a block that
it currently has an outstanding read or write request. This report
gets triggered just after a spate of "Freeing blocks not in ...zone"
messages - there appear to be multiple write requests for the same
block.
This seems to suggest that something in the buffer cache is getting
corrupted.

Now if only we had a reliable way to reproduce it, we could start a
binary search for the offending patch... but I can only reproduce it
on a patched kernel after several hours of performance testing.

NeilBrown


>
>
> What else should I provide?
>
> [mhaque@viper mhaque]$ uname -a
> Linux viper.haque.net 2.4.0-test11 #6 Sun Nov 19 22:17:33 EST 2000 i686
> unknown
>
> Nov 16 19:03:15 viper kernel: Uniform Multi-Platform E-IDE driver
> Revision: 6.31
> Nov 16 19:03:15 viper kernel: ide: Assuming 33MHz system bus speed for
> PIO modes; override with idebus=xx
> Nov 16 19:03:15 viper kernel: PIIX4: IDE controller on PCI bus 00 dev 39
> Nov 16 19:03:15 viper kernel: PIIX4: chipset revision 1
> Nov 16 19:03:15 viper kernel: PIIX4: not 100%% native mode: will probe
> irqs later
> Nov 16 19:03:15 viper kernel: ide0: BM-DMA at 0xf000-0xf007, BIOS
> settings: hda:DMA, hdb:DMA
> Nov 16 19:03:15 viper kernel: ide1: BM-DMA at 0xf008-0xf00f, BIOS
> settings: hdc:DMA, hdd:DMA
>
> Nov 16 19:03:15 viper kernel: hda: IBM-DJNA-371350, ATA DISK drive
> Nov 16 19:03:15 viper kernel: hdb: CREATIVEDVD-ROM DVD2240E 12/24/97,
> ATAPI CDROM drive
> Nov 16 19:03:15 viper kernel: hdc: Maxtor 82160D2, ATA DISK drive
> Nov 16 19:03:15 viper kernel: hdd: IBM-DTLA-307045, ATA DISK drive

2000-11-23 07:11:02

by Andreas Dilger

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

Mohammad A. Haque writes:
> I just got these while doing many compiles on my box ....
>
> Nov 23 00:40:06 viper kernel: EXT2-fs warning (device ide0(3,3)):
> ext2_unlink: Deleting nonexistent file (622295), 0
> Nov 23 00:40:06 viper kernel: = 1
> Nov 23 00:40:06 viper kernel: EXT2-fs error (device ide0(3,3)):
> ext2_free_blocks: Freeing blocks not in datazone - block = 540028982,
> count = 1
> Nov 23 00:40:06 viper kernel: EXT2-fs error (device ide0(3,3)):
> ext2_free_blocks: Freeing blocks not in datazone - block = 540024880,
> count = 1
> Nov 23 00:40:06 viper kernel: EXT2-fs error (device ide0(3,3)):
> ext2_free_blocks: Freeing blocks not in datazone - block = 170926128,
> count = 1

I'm not sure where the nonexistent file comes from. According to the
printf statement, you're trying to unlink a file with no links, so it
would be interesting to see if 622295 is a valid inode number (it
should be, or there would have been more error messages). Doing

dumpe2fs -h /dev/hda3

may help to find out where this bogus inode came from.

These block numbers decode to ASCII data:

540028982 = 0x20303036 = " 336"
540024880 = 0x20302030 = " 3 3"
170926128 = 0x0a302030 = "\n3 3"


There were problems like this quite a while ago (block numbers that are
really ASCII data)... I can't recall what the problem turned out to be
at that time.

I would suggest a full fsck to start with (you have probably already
done so). If you haven't done a full fsck on this filesystem in a long
time, there is a chance the corruption was from the old kernel bug.

Cheers, Andreas
--
Andreas Dilger \ "If a man ate a pound of pasta and a pound of antipasto,
\ would they cancel out, leaving him still hungry?"
http://www-mddsp.enel.ucalgary.ca/People/adilger/ -- Dogbert

2000-11-23 07:53:02

by Pär-Ola Nilsson

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

> 540028982 = 0x20303036 = " 336"
> 540024880 = 0x20302030 = " 3 3"
> 170926128 = 0x0a302030 = "\n3 3"
>
These should be:
540028982 = 0x20303036 = " 006"
540024880 = 0x20302030 = " 0 0"
170926128 = 0x0a302030 = "\n0 0"
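
The corrected decoding can be checked mechanically. What follows is only a minimal sketch: the helper names `block_to_ascii` and `print_block` are made up for illustration, and the bytes are taken most significant first so that they line up with the hex digits as written in this thread (on a little-endian disk the same bytes sit in the reverse order).

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical helper: split a 32-bit block number into its four bytes,
 * most significant byte first, and store them as a NUL-terminated string.
 * A zero byte inside the number would end the string early. */
void block_to_ascii(uint32_t block, char out[5])
{
    for (int i = 0; i < 4; i++)
        out[i] = (char)((block >> (24 - 8 * i)) & 0xff);
    out[4] = '\0';
}

/* Hypothetical helper: print "decimal = hex = text" for one block number.
 * Bytes outside printable ASCII (such as the newline 0x0a) are shown
 * as '.'. */
void print_block(uint32_t block)
{
    char s[5];

    block_to_ascii(block, s);
    printf("%" PRIu32 " = 0x%08" PRIx32 " = \"", block, block);
    for (int i = 0; i < 4; i++) {
        unsigned char c = (unsigned char)s[i];
        putchar(c >= 0x20 && c < 0x7f ? c : '.');
    }
    printf("\"\n");
}
```

Calling `print_block()` on the three block numbers from the log shows a space, digits and a newline, which fits the idea that text data ended up where block pointers were expected.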

/Pär-Ola Nilsson


2000-11-23 10:07:53

by Alexander Viro

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11



On Thu, 23 Nov 2000, Neil Brown wrote:

> Oh, good. It's not just me and Tigran then. I was at first blaming
> my raid5 code for this, but if you get it and Tigran gets it (reported
> http://boudicca.tux.org/hypermail/linux-kernel/2000week48/0257.html
> ) then it's probably not me.
>
> An interesting data point:
> my raid5 reports when it gets a read or write request on a block that
> it currently has an outstanding read or write request. This report
> gets triggered just after a spate of "Freeing blocks not in ...zone"
> messages - there appear to be multiple write requests for the same
> block.
> This seems to suggest that something in the buffer cache is getting
> corrupted.
>
> Now if only we had a reliable way to reproduce it, we could start a
> binary search for the offending patch... but I can only reproduce it
> on a patched kernel after several hours of performance testing.

Guys, could you try to reproduce it with the following:
diff -urN rc11/fs/buffer.c rc11-ext2/fs/buffer.c
--- rc11/fs/buffer.c Mon Nov 20 01:18:59 2000
+++ rc11-ext2/fs/buffer.c Tue Nov 21 01:14:34 2000
@@ -1527,6 +1527,15 @@
}
return 0;
out:
+ bh = head;
+ do {
+ if (buffer_new(bh) && !buffer_uptodate(bh)) {
+ memset(bh->b_data, 0, bh->b_size);
+ set_bit(BH_Uptodate, &bh->b_state);
+ mark_buffer_dirty(bh);
+ }
+ bh = bh->b_this_page;
+ } while (bh != head);
return err;
}

diff -urN rc11/fs/ext2/file.c rc11-ext2/fs/ext2/file.c
--- rc11/fs/ext2/file.c Wed Oct 4 03:44:54 2000
+++ rc11-ext2/fs/ext2/file.c Tue Nov 21 01:14:34 2000
@@ -25,17 +25,6 @@
static loff_t ext2_file_lseek(struct file *, loff_t, int);
static int ext2_open_file (struct inode *, struct file *);

-#define EXT2_MAX_SIZE(bits) \
- (((EXT2_NDIR_BLOCKS + (1LL << (bits - 2)) + \
- (1LL << (bits - 2)) * (1LL << (bits - 2)) + \
- (1LL << (bits - 2)) * (1LL << (bits - 2)) * (1LL << (bits - 2))) * \
- (1LL << bits)) - 1)
-
-static long long ext2_max_sizes[] = {
-0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
-EXT2_MAX_SIZE(10), EXT2_MAX_SIZE(11), EXT2_MAX_SIZE(12), EXT2_MAX_SIZE(13)
-};
-
/*
* Make sure the offset never goes beyond the 32-bit mark..
*/
@@ -56,7 +45,7 @@
if (offset<0)
return -EINVAL;
if (((unsigned long long) offset >> 32) != 0) {
- if (offset > ext2_max_sizes[EXT2_BLOCK_SIZE_BITS(inode->i_sb)])
+ if (offset >= inode->i_sb->u.ext2_sb.s_max_size)
return -EINVAL;
}
if (offset != file->f_pos) {
@@ -110,4 +99,5 @@

struct inode_operations ext2_file_inode_operations = {
truncate: ext2_truncate,
+ setattr: ext2_notify_change,
};
diff -urN rc11/fs/ext2/inode.c rc11-ext2/fs/ext2/inode.c
--- rc11/fs/ext2/inode.c Wed Oct 4 03:44:54 2000
+++ rc11-ext2/fs/ext2/inode.c Tue Nov 21 01:14:34 2000
@@ -153,11 +153,13 @@
* This function translates the block number into path in that tree -
* return value is the path length and @offsets[n] is the offset of
* pointer to (n+1)th node in the nth one. If @block is out of range
- * (negative or too large) warning is printed and zero returned.
+ * (negative or too large) we return zero. Warning is printed if @block
+ * is negative - that should never happen. Too large value is OK, it
+ * just means that ext2_get_block() should return -%EFBIG.
*
* Note: function doesn't find node addresses, so no IO is needed. All
* we need to know is the capacity of indirect blocks (taken from the
- * inode->i_sb).
+ * @inode->i_sb).
*/

/*
@@ -196,7 +198,7 @@
offsets[n++] = (i_block >> ptrs_bits) & (ptrs - 1);
offsets[n++] = i_block & (ptrs - 1);
} else {
- ext2_warning (inode->i_sb, "ext2_block_to_path", "block > big");
+ /* Too large, nothing to do here */
}
return n;
}
@@ -216,7 +218,7 @@
* i.e. little-endian 32-bit), chain[i].p contains the address of that
* number (it points into struct inode for i==0 and into the bh->b_data
* for i>0) and chain[i].bh points to the buffer_head of i-th indirect
- * block for i>0 and NULL for i==0. In other words, it holds the block
+ * block for i>0 and %NULL for i==0. In other words, it holds the block
* numbers of the chain, addresses they were taken from (and where we can
* verify that chain did not change) and buffer_heads hosting these
* numbers.
@@ -230,11 +232,11 @@
* or when it reads all @depth-1 indirect blocks successfully and finds
* the whole chain, all way to the data (returns %NULL, *err == 0).
*/
-static inline Indirect *ext2_get_branch(struct inode *inode,
- int depth,
- int *offsets,
- Indirect chain[4],
- int *err)
+static Indirect *ext2_get_branch(struct inode *inode,
+ int depth,
+ int *offsets,
+ Indirect chain[4],
+ int *err)
{
kdev_t dev = inode->i_dev;
int size = inode->i_sb->s_blocksize;
@@ -505,7 +507,7 @@

static int ext2_get_block(struct inode *inode, long iblock, struct buffer_head *bh_result, int create)
{
- int err = -EIO;
+ int err = -EFBIG;
int offsets[4];
Indirect chain[4];
Indirect *partial;
@@ -880,8 +882,6 @@
if (!(S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode) ||
S_ISLNK(inode->i_mode)))
return;
- if (IS_APPEND(inode) || IS_IMMUTABLE(inode))
- return;

ext2_discard_prealloc(inode);

@@ -1255,6 +1255,13 @@
retval = inode_change_ok(inode, iattr);
if (retval != 0)
goto out;
+
+ if (iattr->ia_valid & ATTR_SIZE) {
+ if (iattr->ia_size > inode->i_sb->u.ext2_sb.s_max_size) {
+ retval = -EFBIG;
+ goto out;
+ }
+ }

inode_setattr(inode, iattr);

diff -urN rc11/fs/ext2/super.c rc11-ext2/fs/ext2/super.c
--- rc11/fs/ext2/super.c Wed Oct 4 03:44:54 2000
+++ rc11-ext2/fs/ext2/super.c Tue Nov 21 01:14:34 2000
@@ -356,6 +356,19 @@

#define log2(n) ffz(~(n))

+/*
+ * maximal file size.
+ */
+static loff_t ext2_max_size(int bits)
+{
+ loff_t res = EXT2_NDIR_BLOCKS;
+ res += 1LL << (bits-2);
+ res += 1LL << (2*(bits-2));
+ res += 1LL << (3*(bits-2));
+ return res << bits;
+}
+
+
struct super_block * ext2_read_super (struct super_block * sb, void * data,
int silent)
{
@@ -517,6 +530,7 @@
log2 (EXT2_ADDR_PER_BLOCK(sb));
sb->u.ext2_sb.s_desc_per_block_bits =
log2 (EXT2_DESC_PER_BLOCK(sb));
+ sb->u.ext2_sb.s_max_size = ext2_max_size(sb->s_blocksize_bits);
if (sb->s_magic != EXT2_SUPER_MAGIC) {
if (!silent)
printk ("VFS: Can't find an ext2 filesystem on dev "
diff -urN rc11/fs/nfsd/vfs.c rc11-ext2/fs/nfsd/vfs.c
--- rc11/fs/nfsd/vfs.c Mon Nov 20 01:19:03 2000
+++ rc11-ext2/fs/nfsd/vfs.c Tue Nov 21 01:14:34 2000
@@ -23,7 +23,6 @@
#include <linux/locks.h>
#include <linux/fs.h>
#include <linux/major.h>
-#include <linux/ext2_fs.h>
#include <linux/proc_fs.h>
#include <linux/stat.h>
#include <linux/fcntl.h>
diff -urN rc11/fs/open.c rc11-ext2/fs/open.c
--- rc11/fs/open.c Thu Nov 2 22:38:59 2000
+++ rc11-ext2/fs/open.c Tue Nov 21 01:14:34 2000
@@ -102,7 +102,12 @@
goto out;
inode = nd.dentry->d_inode;

- error = -EACCES;
+ /* For directories it's -EISDIR, for other non-regulars - -EINVAL */
+ error = -EISDIR;
+ if (S_ISDIR(inode->i_mode))
+ goto dput_and_out;
+
+ error = -EINVAL;
if (!S_ISREG(inode->i_mode))
goto dput_and_out;

@@ -163,7 +168,7 @@
goto out;
dentry = file->f_dentry;
inode = dentry->d_inode;
- error = -EACCES;
+ error = -EINVAL;
if (!S_ISREG(inode->i_mode) || !(file->f_mode & FMODE_WRITE))
goto out_putf;
error = -EPERM;
diff -urN rc11/include/linux/ext2_fs.h rc11-ext2/include/linux/ext2_fs.h
--- rc11/include/linux/ext2_fs.h Sat Jul 29 12:08:57 2000
+++ rc11-ext2/include/linux/ext2_fs.h Tue Nov 21 02:02:01 2000
@@ -568,6 +568,8 @@
extern int ext2_sync_inode (struct inode *);
extern void ext2_discard_prealloc (struct inode *);

+extern int ext2_notify_change (struct dentry *, struct iattr *);
+
/* ioctl.c */
extern int ext2_ioctl (struct inode *, struct file *, unsigned int,
unsigned long);
diff -urN rc11/include/linux/ext2_fs_sb.h rc11-ext2/include/linux/ext2_fs_sb.h
--- rc11/include/linux/ext2_fs_sb.h Wed Oct 4 03:45:06 2000
+++ rc11-ext2/include/linux/ext2_fs_sb.h Tue Nov 21 01:14:34 2000
@@ -59,6 +59,7 @@
int s_feature_compat;
int s_feature_incompat;
int s_feature_ro_compat;
+ loff_t s_max_size;
};

#endif /* _LINUX_EXT2_FS_SB */
diff -urN rc11/kernel/ksyms.c rc11-ext2/kernel/ksyms.c
--- rc11/kernel/ksyms.c Mon Nov 20 01:19:12 2000
+++ rc11-ext2/kernel/ksyms.c Tue Nov 21 01:14:35 2000
@@ -23,8 +23,6 @@
#include <linux/serial.h>
#include <linux/locks.h>
#include <linux/delay.h>
-#include <linux/minix_fs.h>
-#include <linux/ext2_fs.h>
#include <linux/random.h>
#include <linux/reboot.h>
#include <linux/pagemap.h>
diff -urN rc11/mm/filemap.c rc11-ext2/mm/filemap.c
--- rc11/mm/filemap.c Mon Nov 20 01:19:12 2000
+++ rc11-ext2/mm/filemap.c Tue Nov 21 01:15:04 2000
@@ -2422,6 +2422,7 @@
unsigned long written;
long status;
int err;
+ unsigned bytes;

cached_page = NULL;

@@ -2466,7 +2467,7 @@
}

while (count) {
- unsigned long bytes, index, offset;
+ unsigned long index, offset;
char *kaddr;

/*
@@ -2491,7 +2492,7 @@

status = mapping->a_ops->prepare_write(file, page, offset, offset+bytes);
if (status)
- goto unlock;
+ goto sync_failure;
kaddr = page_address(page);
status = copy_from_user(kaddr+offset, buf, bytes);
flush_dcache_page(page);
@@ -2516,6 +2517,7 @@
if (status < 0)
break;
}
+done:
*ppos = pos;

if (cached_page)
@@ -2530,6 +2532,13 @@
ClearPageUptodate(page);
kunmap(page);
goto unlock;
+sync_failure:
+ UnlockPage(page);
+ deactivate_page(page);
+ page_cache_release(page);
+ if (pos + bytes > inode->i_size)
+ vmtruncate(inode, inode->i_size);
+ goto done;
}

void __init page_cache_init(unsigned long mempages)
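
For anyone wanting to sanity-check the new per-superblock limit, the arithmetic of the ext2_max_size() helper in the fs/ext2/super.c hunk above can be tried in user space. This is only a sketch of that arithmetic: `max_size_for_bits` and `NDIR_BLOCKS` are illustrative names, and 12 is the value of EXT2_NDIR_BLOCKS.

```c
#include <stdint.h>

/* Number of direct block pointers in an ext2 inode (EXT2_NDIR_BLOCKS). */
#define NDIR_BLOCKS 12

/* Illustrative user-space copy of the calculation in the patch.
 * A block of (1 << bits) bytes holds (1 << (bits - 2)) four-byte block
 * pointers, so the addressable block count is the direct blocks plus
 * the single, double and triple indirect trees. The result is that
 * count converted to bytes. */
int64_t max_size_for_bits(int bits)
{
    int64_t blocks = NDIR_BLOCKS;

    blocks += 1LL << (bits - 2);        /* single indirect */
    blocks += 1LL << (2 * (bits - 2));  /* double indirect */
    blocks += 1LL << (3 * (bits - 2));  /* triple indirect */
    return blocks << bits;
}
```

With 1 KB blocks (bits = 10) this gives 17247252480 bytes, a little over 16 GB. Note that the removed EXT2_MAX_SIZE macro subtracted 1 and the lseek check used `>`, while the new code drops the `- 1` and compares with `>=`.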


2000-11-23 11:36:39

by NeilBrown

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

On Thursday November 23, [email protected] wrote:
>
> Guys, could you try to reproduce it with the following:

Well, I tried.... but it didn't go real well.

I built a 2-drive raid5 array (the script goes on to 3,4,5,6,7 drive
arrays), ran mkfs, mounted, ran "hdparm -t" on /dev/md0, ran bonnie.
So far so good.

Then I ran "dbench 20".

This produced lots of errors.

The first few were :

(3012) unlink CLIENTS/CLIENT9/~DMTMP/ACCESS/FASTENER.LDB failed (Operation not permitted)
.(3012) unlink CLIENTS/CLIENT11/~DMTMP/ACCESS/FASTENER.LDB failed (Operation not permitted)
.(3012) unlink CLIENTS/CLIENT2/~DMTMP/ACCESS/FASTENER.LDB failed (Operation not permitted)
.(3012) unlink CLIENTS/CLIENT0/~DMTMP/ACCESS/FASTENER.LDB failed (Operation not permitted)
.(3012) unlink CLIENTS/CLIENT8/~DMTMP/ACCESS/FASTENER.LDB failed (Operation not permitted)
.(3012) unlink CLIENTS/CLIENT5/~DMTMP/ACCESS/FASTENER.LDB failed (Operation not permitted)
...(3012) unlink CLIENTS/CLIENT18/~DMTMP/ACCESS/FASTENER.LDB failed (Operation not permitted)
.(3012) unlink CLIENTS/CLIENT4/~DMTMP/ACCESS/FASTENER.LDB failed (Operation not permitted)
.(3012) unlink CLIENTS/CLIENT19/~DMTMP/ACCESS/FASTENER.LDB failed (Operation not permitted)
....(11810) open CLIENTS/CLIENT0/~DMTMP/PARADOX/__S31.VAL failed for handle 4247 (Permission denied)
(3012) unlink CLIENTS/CLIENT10/~DMTMP/ACCESS/FASTENER.LDB failed (Operation not permitted)
..(11846) open CLIENTS/CLIENT0/~DMTMP/PARADOX/COURSES.VAL failed for handle 4251 (Permission denied)
(11855) nb_close: handle 4247 was not open
(11858) unlink CLIENTS/CLIENT0/~DMTMP/PARADOX/__S31.VAL failed (Operation not permitted)
.(11864) unlink CLIENTS/CLIENT0/~DMTMP/PARADOX/__S31.DB failed (Operation not permitted)
.(11810) open CLIENTS/CLIENT8/~DMTMP/PARADOX/__S31.VAL failed for handle 4247 (Permission denied)
..(3012) unlink CLIENTS/CLIENT3/~DMTMP/ACCESS/FASTENER.LDB failed (Operation not permitted)
(11928) open CLIENTS/CLIENT0/~DMTMP/PARADOX/__3F2C4.DB failed for handle 4259 (Permission denied)
(11935) nb_write: handle 4259 was not open size=2048 ofs=2048
(11940) nb_stat: CLIENTS/CLIENT0/~DMTMP/PARADOX/__3F2C4.DB wrong size 2048 4096

So it looks pretty sick.
After a reboot I tried some simple experiments. Watch what happened.

cage # echo hello > /afile
cage # > /afile
cage # echo there > /afile
bash: /afile: Permission denied
cage # echo hello > /bfile
cage # echo there > /bfile
cage # echo all >> /bfile
bash: /bfile: Permission denied
cage # echo > /cfile
cage # echo > /cfile
cage # echo xx > /cfile
bash: /cfile: Permission denied
cage # cat /vmlinuz > /dfile
cage # cat /vmlinuz > /dfile
cage # cat /vmlinuz >> /dfile
bash: /dfile: Permission denied
cage #

(all run as root).

It looks like extending a file is not allowed any more.

This is with 2.4.0-test11 plus assorted patches to knfsd (which should
be totally irrelevant) and raid5 (which should not affect ext2), plus
your patch, which applied cleanly.

I'll try with a clean test11 plus your patches in the morning, but it
doesn't look good.

NeilBrown

2000-11-23 12:43:11

by NeilBrown

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

On Thursday November 23, [email protected] wrote:
> On Thursday November 23, [email protected] wrote:
> >
> > Guys, could you try to reproduce it with the following:
>
> Well, I tried.... but it didn't go real well.
>
.....
> It looks like extending a file is not allowed any more.
>
> This is with 2.4.0-test11 plus assorted patches to knfsd (which should
> be totally irrelevant) and raid5 (Which should not affect ext2), plus
> your patch, which applied cleanly.
>
> I'll try with a clean test11 plus your patches in the morning, but it
> doesn't look good.
>
> NeilBrown

It appears that lots of files have been marked "immutable" :-(

That patch contained

@@ -110,4 +99,5 @@

struct inode_operations ext2_file_inode_operations = {
truncate: ext2_truncate,
+ setattr: ext2_notify_change,
};


which enabled ext2_notify_change, however ext2_notify_change has a
bug.
It sets attributes from iattr->ia_attr_flags even
if ATTR_ATTR_FLAG is NOT SET in iattr->ia_valid.
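
To make the failure mode concrete, here is a simplified user-space model of it. It is not the kernel code: `struct demo_iattr`, the `DEMO_*` constants and both functions are hypothetical stand-ins, reduced to a single flag.

```c
/* Hypothetical stand-ins for the kernel's ATTR_*, ATTR_FLAG_* and S_*
 * constants; the values are arbitrary. */
#define DEMO_ATTR_SIZE       0x1u
#define DEMO_ATTR_ATTR_FLAG  0x2u
#define DEMO_FLAG_IMMUTABLE  0x1u
#define DEMO_S_IMMUTABLE     0x1u

struct demo_iattr {
    unsigned valid;       /* which fields the caller actually filled in */
    unsigned attr_flags;  /* meaningful only if DEMO_ATTR_ATTR_FLAG is set */
};

/* Unguarded shape: attr_flags is read no matter what 'valid' says, so an
 * unrelated request (say, a size change) with leftover bits in attr_flags
 * sets or clears the immutable bit as a side effect. */
unsigned apply_unguarded(unsigned inode_flags, const struct demo_iattr *ia)
{
    if (ia->attr_flags & DEMO_FLAG_IMMUTABLE)
        inode_flags |= DEMO_S_IMMUTABLE;
    else
        inode_flags &= ~DEMO_S_IMMUTABLE;
    return inode_flags;
}

/* Guarded shape: attr_flags is consulted only when the caller marked it
 * valid; otherwise the inode flags are returned unchanged. */
unsigned apply_guarded(unsigned inode_flags, const struct demo_iattr *ia)
{
    if (ia->valid & DEMO_ATTR_ATTR_FLAG)
        return apply_unguarded(inode_flags, ia);
    return inode_flags;
}
```

In this model a request with only `DEMO_ATTR_SIZE` set in `valid` and garbage in `attr_flags` flips the immutable bit through `apply_unguarded()` but leaves it alone through `apply_guarded()`, which would match files turning immutable after ordinary truncates.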


NeilBrown

2000-11-23 13:23:30

by Guest section DW

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

On Thu, Nov 23, 2000 at 05:03:00PM +1100, Neil Brown wrote:

> Oh, good. It's not just me and Tigran then.

You have it all backwards. It would be good if it were
just you and Tigran. Unfortunately it also hits me.

(I am reorganizing my disks, copying large trees from
one place to the other. Always doing a diff -r between
old and new before removing the old version.
Yesterday I had a diff -r showing that the old version
was corrupted and the new was OK. Of course a second
look showed that the old version also was OK, the corruption
must have been in the buffer cache, not on disk.)

Andries

2000-11-23 17:06:31

by Alexander Viro

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11



On Thu, 23 Nov 2000, Neil Brown wrote:

> which enabled ext2_notify_change, however ext2_notify_change has a
> bug.
> It sets attributes from iattr->ia_attr_flags even
> if ATTR_ATTR_FLAG is NOT SET in iattr->ia_valid.

Arrrgh. Could you try that:

diff -urN rc11/fs/buffer.c rc11-ext2/fs/buffer.c
--- rc11/fs/buffer.c Mon Nov 20 01:18:59 2000
+++ rc11-ext2/fs/buffer.c Tue Nov 21 01:14:34 2000
@@ -1527,6 +1527,15 @@
}
return 0;
out:
+ bh = head;
+ do {
+ if (buffer_new(bh) && !buffer_uptodate(bh)) {
+ memset(bh->b_data, 0, bh->b_size);
+ set_bit(BH_Uptodate, &bh->b_state);
+ mark_buffer_dirty(bh);
+ }
+ bh = bh->b_this_page;
+ } while (bh != head);
return err;
}

diff -urN rc11/fs/ext2/file.c rc11-ext2/fs/ext2/file.c
--- rc11/fs/ext2/file.c Wed Oct 4 03:44:54 2000
+++ rc11-ext2/fs/ext2/file.c Tue Nov 21 01:14:34 2000
@@ -25,17 +25,6 @@
static loff_t ext2_file_lseek(struct file *, loff_t, int);
static int ext2_open_file (struct inode *, struct file *);

-#define EXT2_MAX_SIZE(bits) \
- (((EXT2_NDIR_BLOCKS + (1LL << (bits - 2)) + \
- (1LL << (bits - 2)) * (1LL << (bits - 2)) + \
- (1LL << (bits - 2)) * (1LL << (bits - 2)) * (1LL << (bits - 2))) * \
- (1LL << bits)) - 1)
-
-static long long ext2_max_sizes[] = {
-0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
-EXT2_MAX_SIZE(10), EXT2_MAX_SIZE(11), EXT2_MAX_SIZE(12), EXT2_MAX_SIZE(13)
-};
-
/*
* Make sure the offset never goes beyond the 32-bit mark..
*/
@@ -56,7 +45,7 @@
if (offset<0)
return -EINVAL;
if (((unsigned long long) offset >> 32) != 0) {
- if (offset > ext2_max_sizes[EXT2_BLOCK_SIZE_BITS(inode->i_sb)])
+ if (offset >= inode->i_sb->u.ext2_sb.s_max_size)
return -EINVAL;
}
if (offset != file->f_pos) {
@@ -110,4 +99,5 @@

struct inode_operations ext2_file_inode_operations = {
truncate: ext2_truncate,
+ setattr: ext2_notify_change,
};
diff -urN rc11/fs/ext2/inode.c rc11-ext2/fs/ext2/inode.c
--- rc11/fs/ext2/inode.c Wed Oct 4 03:44:54 2000
+++ rc11-ext2/fs/ext2/inode.c Thu Nov 23 14:49:14 2000
@@ -153,11 +153,13 @@
* This function translates the block number into path in that tree -
* return value is the path length and @offsets[n] is the offset of
* pointer to (n+1)th node in the nth one. If @block is out of range
- * (negative or too large) warning is printed and zero returned.
+ * (negative or too large) we return zero. Warning is printed if @block
+ * is negative - that should never happen. Too large value is OK, it
+ * just means that ext2_get_block() should return -%EFBIG.
*
* Note: function doesn't find node addresses, so no IO is needed. All
* we need to know is the capacity of indirect blocks (taken from the
- * inode->i_sb).
+ * @inode->i_sb).
*/

/*
@@ -196,7 +198,7 @@
offsets[n++] = (i_block >> ptrs_bits) & (ptrs - 1);
offsets[n++] = i_block & (ptrs - 1);
} else {
- ext2_warning (inode->i_sb, "ext2_block_to_path", "block > big");
+ /* Too large, nothing to do here */
}
return n;
}
@@ -216,7 +218,7 @@
* i.e. little-endian 32-bit), chain[i].p contains the address of that
* number (it points into struct inode for i==0 and into the bh->b_data
* for i>0) and chain[i].bh points to the buffer_head of i-th indirect
- * block for i>0 and NULL for i==0. In other words, it holds the block
+ * block for i>0 and %NULL for i==0. In other words, it holds the block
* numbers of the chain, addresses they were taken from (and where we can
* verify that chain did not change) and buffer_heads hosting these
* numbers.
@@ -230,11 +232,11 @@
* or when it reads all @depth-1 indirect blocks successfully and finds
* the whole chain, all way to the data (returns %NULL, *err == 0).
*/
-static inline Indirect *ext2_get_branch(struct inode *inode,
- int depth,
- int *offsets,
- Indirect chain[4],
- int *err)
+static Indirect *ext2_get_branch(struct inode *inode,
+ int depth,
+ int *offsets,
+ Indirect chain[4],
+ int *err)
{
kdev_t dev = inode->i_dev;
int size = inode->i_sb->s_blocksize;
@@ -505,7 +507,7 @@

static int ext2_get_block(struct inode *inode, long iblock, struct buffer_head *bh_result, int create)
{
- int err = -EIO;
+ int err = -EFBIG;
int offsets[4];
Indirect chain[4];
Indirect *partial;
@@ -880,8 +882,6 @@
if (!(S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode) ||
S_ISLNK(inode->i_mode)))
return;
- if (IS_APPEND(inode) || IS_IMMUTABLE(inode))
- return;

ext2_discard_prealloc(inode);

@@ -1188,7 +1188,7 @@
raw_inode->i_dir_acl = cpu_to_le32(inode->u.ext2_i.i_dir_acl);
else {
raw_inode->i_size_high = cpu_to_le32(inode->i_size >> 32);
- if (raw_inode->i_size_high) {
+ if (raw_inode->i_size_high || (inode->i_size & (1<<31))) {
struct super_block *sb = inode->i_sb;
struct ext2_super_block *es = sb->u.ext2_sb.s_es;
if (!(es->s_feature_ro_compat & cpu_to_le32(EXT2_FEATURE_RO_COMPAT_LARGE_FILE))) {
@@ -1235,11 +1235,17 @@
return ext2_update_inode (inode, 1);
}

+static struct {unsigned attr, flag, ext2} ext2_attr[] = {
+ {ATTR_FLAG_SYNCRONOUS, S_SYNC, EXT2_SYNC_FL},
+ {ATTR_FLAG_NOATIME, S_NOATIME, EXT2_NOATIME_FL},
+ {ATTR_FLAG_APPEND, S_APPEND, EXT2_APPEND_FL},
+ {ATTR_FLAG_IMMUTABLE, S_IMMUTABLE, EXT2_IMMUTABLE_FL}
+}
+
int ext2_notify_change(struct dentry *dentry, struct iattr *iattr)
{
struct inode *inode = dentry->d_inode;
int retval;
- unsigned int flags;

retval = -EPERM;
if (iattr->ia_valid & ATTR_ATTR_FLAG &&
@@ -1256,36 +1262,27 @@
if (retval != 0)
goto out;

+ if (iattr->ia_valid & ATTR_SIZE) {
+ if (iattr->ia_size > inode->i_sb->u.ext2_sb.s_max_size) {
+ retval = -EFBIG;
+ goto out;
+ }
+ }
+
inode_setattr(inode, iattr);

- flags = iattr->ia_attr_flags;
- if (flags & ATTR_FLAG_SYNCRONOUS) {
- inode->i_flags |= S_SYNC;
- inode->u.ext2_i.i_flags |= EXT2_SYNC_FL;
- } else {
- inode->i_flags &= ~S_SYNC;
- inode->u.ext2_i.i_flags &= ~EXT2_SYNC_FL;
- }
- if (flags & ATTR_FLAG_NOATIME) {
- inode->i_flags |= S_NOATIME;
- inode->u.ext2_i.i_flags |= EXT2_NOATIME_FL;
- } else {
- inode->i_flags &= ~S_NOATIME;
- inode->u.ext2_i.i_flags &= ~EXT2_NOATIME_FL;
- }
- if (flags & ATTR_FLAG_APPEND) {
- inode->i_flags |= S_APPEND;
- inode->u.ext2_i.i_flags |= EXT2_APPEND_FL;
- } else {
- inode->i_flags &= ~S_APPEND;
- inode->u.ext2_i.i_flags &= ~EXT2_APPEND_FL;
- }
- if (flags & ATTR_FLAG_IMMUTABLE) {
- inode->i_flags |= S_IMMUTABLE;
- inode->u.ext2_i.i_flags |= EXT2_IMMUTABLE_FL;
- } else {
- inode->i_flags &= ~S_IMMUTABLE;
- inode->u.ext2_i.i_flags &= ~EXT2_IMMUTABLE_FL;
+ if (iattr->ia_valid & ATTR_ATTR_FLAG) {
+ unsigned flags = iattr->ia_attr_flags;
+ int i;
+ for (i=0; i<sizeof(ext2_attr)/sizeof(ext2_attr[0]); i++) {
+ if (flags & ext2_attr[i].attr) {
+ inode->i_flags |= ext2_attr[i].flag;
+ inode->u.ext2_i.i_flags |= ext2_attr[i].ext2;
+ } else {
+ inode->i_flags &= ~ext2_attr[i].flag;
+ inode->u.ext2_i.i_flags &= ~ext2_attr[i].ext2;
+ }
+ }
}
mark_inode_dirty(inode);
out:
diff -urN rc11/fs/ext2/super.c rc11-ext2/fs/ext2/super.c
--- rc11/fs/ext2/super.c Wed Oct 4 03:44:54 2000
+++ rc11-ext2/fs/ext2/super.c Tue Nov 21 01:14:34 2000
@@ -356,6 +356,19 @@

#define log2(n) ffz(~(n))

+/*
+ * maximal file size.
+ */
+static loff_t ext2_max_size(int bits)
+{
+ loff_t res = EXT2_NDIR_BLOCKS;
+ res += 1LL << (bits-2);
+ res += 1LL << (2*(bits-2));
+ res += 1LL << (3*(bits-2));
+ return res << bits;
+}
+
+
struct super_block * ext2_read_super (struct super_block * sb, void * data,
int silent)
{
@@ -517,6 +530,7 @@
log2 (EXT2_ADDR_PER_BLOCK(sb));
sb->u.ext2_sb.s_desc_per_block_bits =
log2 (EXT2_DESC_PER_BLOCK(sb));
+ sb->u.ext2_sb.s_max_size = ext2_max_size(sb->s_blocksize_bits);
if (sb->s_magic != EXT2_SUPER_MAGIC) {
if (!silent)
printk ("VFS: Can't find an ext2 filesystem on dev "
diff -urN rc11/fs/nfsd/vfs.c rc11-ext2/fs/nfsd/vfs.c
--- rc11/fs/nfsd/vfs.c Mon Nov 20 01:19:03 2000
+++ rc11-ext2/fs/nfsd/vfs.c Tue Nov 21 01:14:34 2000
@@ -23,7 +23,6 @@
#include <linux/locks.h>
#include <linux/fs.h>
#include <linux/major.h>
-#include <linux/ext2_fs.h>
#include <linux/proc_fs.h>
#include <linux/stat.h>
#include <linux/fcntl.h>
diff -urN rc11/fs/open.c rc11-ext2/fs/open.c
--- rc11/fs/open.c Thu Nov 2 22:38:59 2000
+++ rc11-ext2/fs/open.c Tue Nov 21 01:14:34 2000
@@ -102,7 +102,12 @@
goto out;
inode = nd.dentry->d_inode;

- error = -EACCES;
+ /* For directories it's -EISDIR, for other non-regulars - -EINVAL */
+ error = -EISDIR;
+ if (S_ISDIR(inode->i_mode))
+ goto dput_and_out;
+
+ error = -EINVAL;
if (!S_ISREG(inode->i_mode))
goto dput_and_out;

@@ -163,7 +168,7 @@
goto out;
dentry = file->f_dentry;
inode = dentry->d_inode;
- error = -EACCES;
+ error = -EINVAL;
if (!S_ISREG(inode->i_mode) || !(file->f_mode & FMODE_WRITE))
goto out_putf;
error = -EPERM;
diff -urN rc11/include/linux/ext2_fs.h rc11-ext2/include/linux/ext2_fs.h
--- rc11/include/linux/ext2_fs.h Sat Jul 29 12:08:57 2000
+++ rc11-ext2/include/linux/ext2_fs.h Tue Nov 21 02:02:01 2000
@@ -568,6 +568,8 @@
extern int ext2_sync_inode (struct inode *);
extern void ext2_discard_prealloc (struct inode *);

+extern int ext2_notify_change (struct dentry *, struct iattr *);
+
/* ioctl.c */
extern int ext2_ioctl (struct inode *, struct file *, unsigned int,
unsigned long);
diff -urN rc11/include/linux/ext2_fs_sb.h rc11-ext2/include/linux/ext2_fs_sb.h
--- rc11/include/linux/ext2_fs_sb.h Wed Oct 4 03:45:06 2000
+++ rc11-ext2/include/linux/ext2_fs_sb.h Tue Nov 21 01:14:34 2000
@@ -59,6 +59,7 @@
int s_feature_compat;
int s_feature_incompat;
int s_feature_ro_compat;
+ loff_t s_max_size;
};

#endif /* _LINUX_EXT2_FS_SB */
diff -urN rc11/kernel/ksyms.c rc11-ext2/kernel/ksyms.c
--- rc11/kernel/ksyms.c Mon Nov 20 01:19:12 2000
+++ rc11-ext2/kernel/ksyms.c Tue Nov 21 01:14:35 2000
@@ -23,8 +23,6 @@
#include <linux/serial.h>
#include <linux/locks.h>
#include <linux/delay.h>
-#include <linux/minix_fs.h>
-#include <linux/ext2_fs.h>
#include <linux/random.h>
#include <linux/reboot.h>
#include <linux/pagemap.h>
diff -urN rc11/mm/filemap.c rc11-ext2/mm/filemap.c
--- rc11/mm/filemap.c Mon Nov 20 01:19:12 2000
+++ rc11-ext2/mm/filemap.c Tue Nov 21 01:15:04 2000
@@ -2422,6 +2422,7 @@
unsigned long written;
long status;
int err;
+ unsigned bytes;

cached_page = NULL;

@@ -2466,7 +2467,7 @@
}

while (count) {
- unsigned long bytes, index, offset;
+ unsigned long index, offset;
char *kaddr;

/*
@@ -2491,7 +2492,7 @@

status = mapping->a_ops->prepare_write(file, page, offset, offset+bytes);
if (status)
- goto unlock;
+ goto sync_failure;
kaddr = page_address(page);
status = copy_from_user(kaddr+offset, buf, bytes);
flush_dcache_page(page);
@@ -2516,6 +2517,7 @@
if (status < 0)
break;
}
+done:
*ppos = pos;

if (cached_page)
@@ -2530,6 +2532,13 @@
ClearPageUptodate(page);
kunmap(page);
goto unlock;
+sync_failure:
+ UnlockPage(page);
+ deactivate_page(page);
+ page_cache_release(page);
+ if (pos + bytes > inode->i_size)
+ vmtruncate(inode, inode->i_size);
+ goto done;
}

void __init page_cache_init(unsigned long mempages)

2000-11-23 17:42:37

by Alexander Viro

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11



On Thu, 23 Nov 2000, Alexander Viro wrote:

> On Thu, 23 Nov 2000, Neil Brown wrote:
>
> > which enabled ext2_notify_change, however ext2_notify_change has a
> > bug.
> > It sets attributes from iattr->ia_attr_flags even
> > if ATTR_ATTR_FLAG is NOT SET in iattr->ia_valid.
>
> Arrrgh. Could you try that:

OK, I really need more coffee - wrong patch. My apologies. Correct (OK,
intended) one follows:

diff -urN rc11/fs/buffer.c rc11-ext2/fs/buffer.c
--- rc11/fs/buffer.c Mon Nov 20 01:18:59 2000
+++ rc11-ext2/fs/buffer.c Tue Nov 21 01:14:34 2000
@@ -1527,6 +1527,15 @@
}
return 0;
out:
+ bh = head;
+ do {
+ if (buffer_new(bh) && !buffer_uptodate(bh)) {
+ memset(bh->b_data, 0, bh->b_size);
+ set_bit(BH_Uptodate, &bh->b_state);
+ mark_buffer_dirty(bh);
+ }
+ bh = bh->b_this_page;
+ } while (bh != head);
return err;
}

diff -urN rc11/fs/ext2/file.c rc11-ext2/fs/ext2/file.c
--- rc11/fs/ext2/file.c Wed Oct 4 03:44:54 2000
+++ rc11-ext2/fs/ext2/file.c Tue Nov 21 01:14:34 2000
@@ -25,17 +25,6 @@
static loff_t ext2_file_lseek(struct file *, loff_t, int);
static int ext2_open_file (struct inode *, struct file *);

-#define EXT2_MAX_SIZE(bits) \
- (((EXT2_NDIR_BLOCKS + (1LL << (bits - 2)) + \
- (1LL << (bits - 2)) * (1LL << (bits - 2)) + \
- (1LL << (bits - 2)) * (1LL << (bits - 2)) * (1LL << (bits - 2))) * \
- (1LL << bits)) - 1)
-
-static long long ext2_max_sizes[] = {
-0, 0, 0, 0, 0, 0, 0, 0, 0, 0,
-EXT2_MAX_SIZE(10), EXT2_MAX_SIZE(11), EXT2_MAX_SIZE(12), EXT2_MAX_SIZE(13)
-};
-
/*
* Make sure the offset never goes beyond the 32-bit mark..
*/
@@ -56,7 +45,7 @@
if (offset<0)
return -EINVAL;
if (((unsigned long long) offset >> 32) != 0) {
- if (offset > ext2_max_sizes[EXT2_BLOCK_SIZE_BITS(inode->i_sb)])
+ if (offset >= inode->i_sb->u.ext2_sb.s_max_size)
return -EINVAL;
}
if (offset != file->f_pos) {
@@ -110,4 +99,5 @@

struct inode_operations ext2_file_inode_operations = {
truncate: ext2_truncate,
+ setattr: ext2_notify_change,
};
diff -urN rc11/fs/ext2/inode.c rc11-ext2/fs/ext2/inode.c
--- rc11/fs/ext2/inode.c Wed Oct 4 03:44:54 2000
+++ rc11-ext2/fs/ext2/inode.c Thu Nov 23 14:52:17 2000
@@ -153,11 +153,13 @@
* This function translates the block number into path in that tree -
* return value is the path length and @offsets[n] is the offset of
* pointer to (n+1)th node in the nth one. If @block is out of range
- * (negative or too large) warning is printed and zero returned.
+ * (negative or too large) we return zero. Warning is printed if @block
+ * is negative - that should never happen. Too large value is OK, it
+ * just means that ext2_get_block() should return -%EFBIG.
*
* Note: function doesn't find node addresses, so no IO is needed. All
* we need to know is the capacity of indirect blocks (taken from the
- * inode->i_sb).
+ * @inode->i_sb).
*/

/*
@@ -196,7 +198,7 @@
offsets[n++] = (i_block >> ptrs_bits) & (ptrs - 1);
offsets[n++] = i_block & (ptrs - 1);
} else {
- ext2_warning (inode->i_sb, "ext2_block_to_path", "block > big");
+ /* Too large, nothing to do here */
}
return n;
}
@@ -216,7 +218,7 @@
* i.e. little-endian 32-bit), chain[i].p contains the address of that
* number (it points into struct inode for i==0 and into the bh->b_data
* for i>0) and chain[i].bh points to the buffer_head of i-th indirect
- * block for i>0 and NULL for i==0. In other words, it holds the block
+ * block for i>0 and %NULL for i==0. In other words, it holds the block
* numbers of the chain, addresses they were taken from (and where we can
* verify that chain did not change) and buffer_heads hosting these
* numbers.
@@ -230,11 +232,11 @@
* or when it reads all @depth-1 indirect blocks successfully and finds
* the whole chain, all way to the data (returns %NULL, *err == 0).
*/
-static inline Indirect *ext2_get_branch(struct inode *inode,
- int depth,
- int *offsets,
- Indirect chain[4],
- int *err)
+static Indirect *ext2_get_branch(struct inode *inode,
+ int depth,
+ int *offsets,
+ Indirect chain[4],
+ int *err)
{
kdev_t dev = inode->i_dev;
int size = inode->i_sb->s_blocksize;
@@ -505,7 +507,7 @@

static int ext2_get_block(struct inode *inode, long iblock, struct buffer_head *bh_result, int create)
{
- int err = -EIO;
+ int err = -EFBIG;
int offsets[4];
Indirect chain[4];
Indirect *partial;
@@ -880,8 +882,6 @@
if (!(S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode) ||
S_ISLNK(inode->i_mode)))
return;
- if (IS_APPEND(inode) || IS_IMMUTABLE(inode))
- return;

ext2_discard_prealloc(inode);

@@ -1188,7 +1188,7 @@
raw_inode->i_dir_acl = cpu_to_le32(inode->u.ext2_i.i_dir_acl);
else {
raw_inode->i_size_high = cpu_to_le32(inode->i_size >> 32);
- if (raw_inode->i_size_high) {
+ if (raw_inode->i_size_high || (inode->i_size & (1<<31))) {
struct super_block *sb = inode->i_sb;
struct ext2_super_block *es = sb->u.ext2_sb.s_es;
if (!(es->s_feature_ro_compat & cpu_to_le32(EXT2_FEATURE_RO_COMPAT_LARGE_FILE))) {
@@ -1235,11 +1235,17 @@
return ext2_update_inode (inode, 1);
}

+static struct {unsigned attr, flag, ext2;} ext2_attr[] = {
+ {ATTR_FLAG_SYNCRONOUS, S_SYNC, EXT2_SYNC_FL},
+ {ATTR_FLAG_NOATIME, S_NOATIME, EXT2_NOATIME_FL},
+ {ATTR_FLAG_APPEND, S_APPEND, EXT2_APPEND_FL},
+ {ATTR_FLAG_IMMUTABLE, S_IMMUTABLE, EXT2_IMMUTABLE_FL}
+};
+
int ext2_notify_change(struct dentry *dentry, struct iattr *iattr)
{
struct inode *inode = dentry->d_inode;
int retval;
- unsigned int flags;

retval = -EPERM;
if (iattr->ia_valid & ATTR_ATTR_FLAG &&
@@ -1256,36 +1262,27 @@
if (retval != 0)
goto out;

+ if (iattr->ia_valid & ATTR_SIZE) {
+ if (iattr->ia_size > inode->i_sb->u.ext2_sb.s_max_size) {
+ retval = -EFBIG;
+ goto out;
+ }
+ }
+
inode_setattr(inode, iattr);

- flags = iattr->ia_attr_flags;
- if (flags & ATTR_FLAG_SYNCRONOUS) {
- inode->i_flags |= S_SYNC;
- inode->u.ext2_i.i_flags |= EXT2_SYNC_FL;
- } else {
- inode->i_flags &= ~S_SYNC;
- inode->u.ext2_i.i_flags &= ~EXT2_SYNC_FL;
- }
- if (flags & ATTR_FLAG_NOATIME) {
- inode->i_flags |= S_NOATIME;
- inode->u.ext2_i.i_flags |= EXT2_NOATIME_FL;
- } else {
- inode->i_flags &= ~S_NOATIME;
- inode->u.ext2_i.i_flags &= ~EXT2_NOATIME_FL;
- }
- if (flags & ATTR_FLAG_APPEND) {
- inode->i_flags |= S_APPEND;
- inode->u.ext2_i.i_flags |= EXT2_APPEND_FL;
- } else {
- inode->i_flags &= ~S_APPEND;
- inode->u.ext2_i.i_flags &= ~EXT2_APPEND_FL;
- }
- if (flags & ATTR_FLAG_IMMUTABLE) {
- inode->i_flags |= S_IMMUTABLE;
- inode->u.ext2_i.i_flags |= EXT2_IMMUTABLE_FL;
- } else {
- inode->i_flags &= ~S_IMMUTABLE;
- inode->u.ext2_i.i_flags &= ~EXT2_IMMUTABLE_FL;
+ if (iattr->ia_valid & ATTR_ATTR_FLAG) {
+ unsigned flags = iattr->ia_attr_flags;
+ int i;
+ for (i=0; i<sizeof(ext2_attr)/sizeof(ext2_attr[0]); i++) {
+ if (flags & ext2_attr[i].attr) {
+ inode->i_flags |= ext2_attr[i].flag;
+ inode->u.ext2_i.i_flags |= ext2_attr[i].ext2;
+ } else {
+ inode->i_flags &= ~ext2_attr[i].flag;
+ inode->u.ext2_i.i_flags &= ~ext2_attr[i].ext2;
+ }
+ }
}
mark_inode_dirty(inode);
out:
diff -urN rc11/fs/ext2/super.c rc11-ext2/fs/ext2/super.c
--- rc11/fs/ext2/super.c Wed Oct 4 03:44:54 2000
+++ rc11-ext2/fs/ext2/super.c Tue Nov 21 01:14:34 2000
@@ -356,6 +356,19 @@

#define log2(n) ffz(~(n))

+/*
+ * maximal file size.
+ */
+static loff_t ext2_max_size(int bits)
+{
+ loff_t res = EXT2_NDIR_BLOCKS;
+ res += 1LL << (bits-2);
+ res += 1LL << (2*(bits-2));
+ res += 1LL << (3*(bits-2));
+ return res << bits;
+}
+
+
struct super_block * ext2_read_super (struct super_block * sb, void * data,
int silent)
{
@@ -517,6 +530,7 @@
log2 (EXT2_ADDR_PER_BLOCK(sb));
sb->u.ext2_sb.s_desc_per_block_bits =
log2 (EXT2_DESC_PER_BLOCK(sb));
+ sb->u.ext2_sb.s_max_size = ext2_max_size(sb->s_blocksize_bits);
if (sb->s_magic != EXT2_SUPER_MAGIC) {
if (!silent)
printk ("VFS: Can't find an ext2 filesystem on dev "
diff -urN rc11/fs/nfsd/vfs.c rc11-ext2/fs/nfsd/vfs.c
--- rc11/fs/nfsd/vfs.c Mon Nov 20 01:19:03 2000
+++ rc11-ext2/fs/nfsd/vfs.c Tue Nov 21 01:14:34 2000
@@ -23,7 +23,6 @@
#include <linux/locks.h>
#include <linux/fs.h>
#include <linux/major.h>
-#include <linux/ext2_fs.h>
#include <linux/proc_fs.h>
#include <linux/stat.h>
#include <linux/fcntl.h>
diff -urN rc11/fs/open.c rc11-ext2/fs/open.c
--- rc11/fs/open.c Thu Nov 2 22:38:59 2000
+++ rc11-ext2/fs/open.c Tue Nov 21 01:14:34 2000
@@ -102,7 +102,12 @@
goto out;
inode = nd.dentry->d_inode;

- error = -EACCES;
+ /* For directories it's -EISDIR, for other non-regulars - -EINVAL */
+ error = -EISDIR;
+ if (S_ISDIR(inode->i_mode))
+ goto dput_and_out;
+
+ error = -EINVAL;
if (!S_ISREG(inode->i_mode))
goto dput_and_out;

@@ -163,7 +168,7 @@
goto out;
dentry = file->f_dentry;
inode = dentry->d_inode;
- error = -EACCES;
+ error = -EINVAL;
if (!S_ISREG(inode->i_mode) || !(file->f_mode & FMODE_WRITE))
goto out_putf;
error = -EPERM;
diff -urN rc11/include/linux/ext2_fs.h rc11-ext2/include/linux/ext2_fs.h
--- rc11/include/linux/ext2_fs.h Sat Jul 29 12:08:57 2000
+++ rc11-ext2/include/linux/ext2_fs.h Tue Nov 21 02:02:01 2000
@@ -568,6 +568,8 @@
extern int ext2_sync_inode (struct inode *);
extern void ext2_discard_prealloc (struct inode *);

+extern int ext2_notify_change (struct dentry *, struct iattr *);
+
/* ioctl.c */
extern int ext2_ioctl (struct inode *, struct file *, unsigned int,
unsigned long);
diff -urN rc11/include/linux/ext2_fs_sb.h rc11-ext2/include/linux/ext2_fs_sb.h
--- rc11/include/linux/ext2_fs_sb.h Wed Oct 4 03:45:06 2000
+++ rc11-ext2/include/linux/ext2_fs_sb.h Tue Nov 21 01:14:34 2000
@@ -59,6 +59,7 @@
int s_feature_compat;
int s_feature_incompat;
int s_feature_ro_compat;
+ loff_t s_max_size;
};

#endif /* _LINUX_EXT2_FS_SB */
diff -urN rc11/kernel/ksyms.c rc11-ext2/kernel/ksyms.c
--- rc11/kernel/ksyms.c Mon Nov 20 01:19:12 2000
+++ rc11-ext2/kernel/ksyms.c Tue Nov 21 01:14:35 2000
@@ -23,8 +23,6 @@
#include <linux/serial.h>
#include <linux/locks.h>
#include <linux/delay.h>
-#include <linux/minix_fs.h>
-#include <linux/ext2_fs.h>
#include <linux/random.h>
#include <linux/reboot.h>
#include <linux/pagemap.h>
diff -urN rc11/mm/filemap.c rc11-ext2/mm/filemap.c
--- rc11/mm/filemap.c Mon Nov 20 01:19:12 2000
+++ rc11-ext2/mm/filemap.c Tue Nov 21 01:15:04 2000
@@ -2422,6 +2422,7 @@
unsigned long written;
long status;
int err;
+ unsigned bytes;

cached_page = NULL;

@@ -2466,7 +2467,7 @@
}

while (count) {
- unsigned long bytes, index, offset;
+ unsigned long index, offset;
char *kaddr;

/*
@@ -2491,7 +2492,7 @@

status = mapping->a_ops->prepare_write(file, page, offset, offset+bytes);
if (status)
- goto unlock;
+ goto sync_failure;
kaddr = page_address(page);
status = copy_from_user(kaddr+offset, buf, bytes);
flush_dcache_page(page);
@@ -2516,6 +2517,7 @@
if (status < 0)
break;
}
+done:
*ppos = pos;

if (cached_page)
@@ -2530,6 +2532,13 @@
ClearPageUptodate(page);
kunmap(page);
goto unlock;
+sync_failure:
+ UnlockPage(page);
+ deactivate_page(page);
+ page_cache_release(page);
+ if (pos + bytes > inode->i_size)
+ vmtruncate(inode, inode->i_size);
+ goto done;
}

void __init page_cache_init(unsigned long mempages)

2000-11-23 21:44:03

by Tigran Aivazian

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

Hi Alexander,

I am "hammering" an ext2 filesystem with all sorts (bonnies, make -j8
bzImage, cp -a dir1 dir2 + all these over localhost NFSv3) for a while and
so far it survives. The system is 2way SMP with 1G RAM.

However, I can't say that _without_ your patch the above did _not_
survive. The corruptions usually come from real useful work and not from
artificial tests (unfortunately)....

Regards,
Tigran

2000-11-23 21:44:04

by Mohammad A. Haque

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

I'm still trying to reproduce the darn thing w/o the patch. No luck so
far.

Maybe I'll put some mission-critical stuff on my machine. Then it'll pop
up like clockwork. That's the way everything is supposed to work, right?
=)

Tigran Aivazian wrote:
> However, I can't say that _without_ your patch the above did _not_
> survive. The corruptions usually come from real useful work and not from
> artificial tests (unfortunately)....

--

=====================================================================
Mohammad A. Haque http://www.haque.net/
[email protected]

"Alcohol and calculus don't mix. Project Lead
Don't drink and derive." --Unknown http://wm.themes.org/
[email protected]
=====================================================================

2000-11-24 02:59:51

by NeilBrown

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

On Thursday November 23, [email protected] wrote:
>
>
> On Thu, 23 Nov 2000, Alexander Viro wrote:
>
> > On Thu, 23 Nov 2000, Neil Brown wrote:
> >
> > > which enabled ext2_notify_change, however ext2_notify_change has a
> > > bug.
> > > It sets attributes from iattr->ia_attr_flags even
> > > if ATTR_ATTR_FLAG is NOT SET in iattr->ia_valid.
> >
> > Arrrgh. Could you try that:
>
> OK, I really need more coffee - wrong patch. My apologies. Correct (OK,
> intended) one follows:

Hmmm. Either you need more coffee, or I need a new compiler.
I'm using 2.95.2, and there seem to be some question marks over that.

Unfortunately debian/potato doesn't seem to offer anything else
(Except 2.7.2), so I'll try to download and compile egcs-1.1.2 and see
how that works.

I ran my test script, which builds a variety of raid5 arrays with
varying numbers of drives and chunk sizes, and runs mkfs/bonnie/dbench
on each array, and it got through about 8 file systems but choked on
the 9th by trying to allocate lots of blocks in the system zone (after
running for about an hour).

NeilBrown

2000-11-24 05:29:13

by Ion Badulescu

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

On Thu, 23 Nov 2000 13:52:52 +0100, Guest section DW <[email protected]> wrote:
> On Thu, Nov 23, 2000 at 05:03:00PM +1100, Neil Brown wrote:
>
>> Oh, good. It's not just me and Tigran then.
>
> You have it all backwards. It would be good if it were
> just you and Tigran. Unfortunately it also hits me.
>
> (I am reorganizing my disks, copying large trees from
> one place to the other. Always doing a diff -r between
> old and new before removing the old version.
> Yesterday I had a diff -r showing that the old version
> was corrupted and the new was OK. Of course a second
> look showed that the old version also was OK, the corruption
> must have been in the buffer cache, not on disk.)

Are these disks IDE disks by any chance?

Ion

--
It is better to keep your mouth shut and be thought a fool,
than to open it and remove all doubt.

2000-11-24 05:59:35

by Guest section DW

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

On Thu, Nov 23, 2000 at 08:58:39PM -0800, Ion Badulescu wrote:

> > (I am reorganizing my disks, copying large trees from
> > one place to the other. Always doing a diff -r between
> > old and new before removing the old version.
> > Yesterday I had a diff -r showing that the old version
> > was corrupted and the new was OK. Of course a second
> > look showed that the old version also was OK, the corruption
> > must have been in the buffer cache, not on disk.)
>
> Are these disks IDE disks by any chance?

Yes.

2000-11-24 06:03:26

by Alexander Viro

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11



On Fri, 24 Nov 2000, Neil Brown wrote:

> I ran my test script, which builds a variety of raid5 arrays with
> varying numbers of drives and chunk sizes, and runs mkfs/bonnie/dbench
> on each array, and it got through about 8 file systems but choked on
> the 9th by trying to allocate lots of blocks in the system zone (after
> running for about an hour).

Bloody interesting. I don't see anything recent that could affect the
areas in question. Interesting versions to check: 11-pre5 and 11-pre6.
It smells like buffer cache corruption, but I don't see anything
relevant. __generic_unplug_device() change looks pretty innocent,
ditto for bh_kmap() ones in raid5 and on ext2 side we had two obviously
equivalent replacements (pre5->pre6). No buffer.c changes, no VM ones.
Urgh.

2000-11-24 06:06:16

by Mohammad A. Haque

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

Yep. Unless of course they are SCSI with an identity crisis =P

Ion Badulescu wrote:
>
> Are these disks IDE disks by any chance?
>
> Ion
>

--

=====================================================================
Mohammad A. Haque http://www.haque.net/
[email protected]

"Alcohol and calculus don't mix. Project Lead
Don't drink and derive." --Unknown http://wm.themes.org/
[email protected]
=====================================================================

2000-11-24 06:14:21

by Mohammad A. Haque

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

I got the error while I was compiling XFree86 4 CVS and kernel. So
that's what I've been doing in multiples along with a couple other things
thrown in the mix to generate lots of disk i/o.

Nothing yet, but I'm pretty sure my machine hates me for putting it
through this.

Alexander Viro wrote:
> Bloody interesting. I don't see anything recent that could affect the
> areas in question. Interesting versions to check: 11-pre5 and 11-pre6.
> It smells like buffer cache corruption, but I don't see anything
> relevant. __generic_unplug_device() change looks pretty innocent,
> ditto for bh_kmap() ones in raid5 and on ext2 side we had two obviously
> equivalent replacements (pre5->pre6). No buffer.c changes, no VM ones.
> Urgh.

--

=====================================================================
Mohammad A. Haque http://www.haque.net/
[email protected]

"Alcohol and calculus don't mix. Project Lead
Don't drink and derive." --Unknown http://wm.themes.org/
[email protected]
=====================================================================

2000-11-24 06:27:48

by Alexander Viro

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11



On Fri, 24 Nov 2000, Mohammad A. Haque wrote:

> I got the error while I was compiling XFree86 4 CVS and kernel. So
> that's what I've been doing in multiples along with a couple other things
> thrown in the mix to generate lots of disk i/o.

Error messages would be interesting... So far we have _both_ 2.95 and 2.91
involved, raid and non-raid alike. Just fscking peachy... OK, let's try
to eliminate ext2 changes (if that helps we have a big problem somewhere,
but that's at least something):

patch -p1 -R <<EOF
--- rc11-pre5/fs/ext2/ialloc.c Wed Oct 4 03:44:54 2000
+++ rc11-pre6/fs/ext2/ialloc.c Fri Nov 17 02:23:19 2000
@@ -274,15 +274,13 @@
return NULL;
}

- inode = get_empty_inode ();
+ sb = dir->i_sb;
+ inode = new_inode(sb);
if (!inode) {
*err = -ENOMEM;
return NULL;
}

- sb = dir->i_sb;
- inode->i_sb = sb;
- inode->i_flags = 0;
lock_super (sb);
es = sb->u.ext2_sb.s_es;
repeat:
@@ -430,9 +428,6 @@
mark_buffer_dirty(sb->u.ext2_sb.s_sbh);
sb->s_dirt = 1;
inode->i_mode = mode;
- inode->i_sb = sb;
- inode->i_nlink = 1;
- inode->i_dev = sb->s_dev;
inode->i_uid = current->fsuid;
if (test_opt (sb, GRPID))
inode->i_gid = dir->i_gid;
EOF

Notice that if reverting that change stops the fs corruption we _still_
have a problem - the only case when it could help is if something touches
an inode allocated by get_empty_inode() before it gets included into the
hash.

BTW, folks, while we are looking at the configurations - how about highmem
and SMP vs. UP?

2000-11-24 06:30:18

by Andre Hedrick

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

On Thu, 23 Nov 2000, Ion Badulescu wrote:

> > Yesterday I had a diff -r showing that the old version
> > was corrupted and the new was OK. Of course a second
> > look showed that the old version also was OK, the corruption
> > must have been in the buffer cache, not on disk.)
>
> Are these disks IDE disks by any chance?

What the F*** does that have to do with the price of eggs in china, heh?
Just maybe if you could follow a thread, you would see that Alex Viro
has pointed out that changes in the FS layer have dorked things.

Since there have been no kernel changes to the driver that affect the
code since 2.4.0-test5 or test6 and it now randomly shows up after five or
six revisions out from the change, and the changes were chipset only.

Please make your point.

Andre Hedrick
Linux ATA Development

2000-11-24 06:42:16

by Alexander Viro

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11



On Thu, 23 Nov 2000, Andre Hedrick wrote:

> What the F*** does that have to do with the price of eggs in china, heh?
> Just maybe if you could follow a thread, you would see that Alex Viro
> has pointed out that changes in the FS layer have dorked things.

?
If you have a l-k feed from future - please share. I'm not saying that
fs/* is not the source of that stuff, but I sure as hell had not said
that it is. I simply don't know yet.

> Since there have been no kernel changes to the driver that affect the
> code since 2.4.0-test5 or test6 and it now randomly shows up after five or
> six revisions out from the change, and the changes were chipset only.

generic_unplug_device() was changed more or less recently. I doubt that
it is relevant, but...

2000-11-24 07:00:54

by Andre Hedrick

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

On Fri, 24 Nov 2000, Alexander Viro wrote:

>
>
> On Thu, 23 Nov 2000, Andre Hedrick wrote:
>
> > What the F*** does that have to do with the price of eggs in china, heh?
> > Just maybe if you could follow a thread, you would see that Alex Viro
> > has pointed out that changes in the FS layer have dorked things.
>
> ?
> If you have a l-k feed from future - please share. I'm not saying that

Date: Thu, 23 Nov 2000 04:37:21 -0500 (EST)

> fs/* is not the source of that stuff, but I sure as hell had not said
> that it is. I simply don't know yet.

You were pointing out changes to reproduce the effect.

> > Since there have been no kernel changes to the driver that affect the
> > code since 2.4.0-test5 or test6 and it now randomly shows up after five or
> > six revisions out from the change, and the changes were chipset only.
>
> generic_unplug_device() was changed more or less recently. I doubt that
> it is relevant, but...

Cool, the issue was that I get tired of people blaming the ATA subsystem
for things that it does not do or has control over. Basically, I kill
bogus threads that try to tag me with an old problem of the past that was
a hardware issue.

Given the latest stats that more than 90% of the linux install base is
hinged on me getting the low-level engine core correct, I go on benders
when cheap shots are taken across the bow.

Now the only issue that is even on the radar map is a potential 1GB cross
copy execution where I have a single report that md5sums do not match.
I have yet to reproduce it even with the identical hardware sent to me.

I questioned _A_ about this and there may be a case that is OS independent
which is more important to me than other. This is a non-fixable for
6->12 months, until I kick some tail in the standards committee meetings
over this point. If it is a reality, Linux and Microsoft will join as the
OS's represented there and force the change. Only because there are
potential side-bars that NT is affected also in these rare cases.

Cheers,

Andre Hedrick
Linux ATA Development


2000-11-24 07:33:44

by Alexander Viro

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11



On Thu, 23 Nov 2000, Andre Hedrick wrote:

[I wrote]
> > ?
> > If you have a l-k feed from future - please share. I'm not saying that
>
> Date: Thu, 23 Nov 2000 04:37:21 -0500 (EST)
>
> > fs/* is not the source of that stuff, but I sure as hell had not said
> > that it is. I simply don't know yet.
>
> You were pointing out changes to reproduce the effect.

Erm... Since then the problem had been reproduced on the patched tree, so
we apparently have something else. Behaviour on disk/quota overflow is
a separate story - even with fixes for that, the problem stays.

> > > Since there have been no kernel changes to the driver that affect the
> > > code since 2.4.0-test5 or test6 and it now randomly shows up after five or
> > > six revisions out from the change, and the changes were chipset only.
> >
> > generic_unplug_device() was changed more or less recently. I doubt that
> > it is relevant, but...
>
> Cool, the issue was that I get tired of people blaming the ATA subsystem
> for things that it does not do or has control over. Basically, I kill
> bogus threads that try to tag me with an old problem of the past that was
> a hardware issue.

<shrug> I don't see any attempts to tag you (or ATA subsystem, for that matter)
in that thread. And thread is hardly bogus... I agree that changes in
drivers/ide/* are very unlikely to be the source of that, but information
of that kind can help to weed out some of the changes in ll_rw_blk.c.

2000-11-24 08:28:42

by Andre Hedrick

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

On Fri, 24 Nov 2000, Alexander Viro wrote:

> <shrug> I don't see any attempts to tag you (or ATA subsystem, for that matter)
> in that thread. And thread is hardly bogus... I agree that changes in

We agree that the "thread" is valid, trust that point.
There was a quick, pointed question present, "Is it an IDE disk?", to
paraphrase the statement.

> drivers/ide/* are very unlikely to be the source of that, but information
> of that kind can help to weed out some of the changes in ll_rw_blk.c.

What may be even more helpful is when I get around to making an option,
for some outstanding patches for 2.5, that would allow for user-space
pattern pushing through the driver that gets properly inserted in to the
list/buffer-head to make it pass through the block layer. This kind of
testing will allow for nibble level tracing through everything, I hope.

Cheers,

Andre Hedrick
CTO Timpanogas Research Group
EVP Linux Development, TRG
Linux ATA Development

2000-11-24 09:13:45

by Ion Badulescu

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

On Thu, 23 Nov 2000, Andre Hedrick wrote:

> Since there have been no kernel changes to the driver that affect the
> code since 2.4.0-test5 or test6 and it now randomly shows up after five or
> six revisions out from the change, and the changes were chipset only.
>
> Please make your point.

My point is simple: I'm trying to see if there is a pattern. I've had
filesystems corrupted with 2.2.18 + the backported IDE driver. Other
people have had filesystems corrupted with 2.4.0 + the same IDE driver.
If *all* people seeing f/s corruption have IDE disks and *none* of them
have SCSI, there might be something worth looking into. It might as well
be pure coincidence.

What's especially bothering me is the fact that I've seen the IDE driver
choke on DMA or something, and then continue on with life, while serving
*bad* *data* to the upper layers. Even if there were real problems with
the DMA transfers (which is not the case, 2.2.18pre without the IDE patch
runs flawlessly), a driver should never ever serve bad blocks to the f/s
layer. Locking up the machine completely, like some SCSI low-level drivers
do, is much better.

************************************************************************
So I'm asking the same question, to all those who have seen unexplained
filesystem corruption with 2.4.0: are you using IDE drives? If the answer
is yes, can you check the logs and see if, at *any* point before the
corruption occurred, the IDE driver choked and disabled DMA for *any* of
your disks?
************************************************************************

Even if 90% of the installed base is IDE and 10% is SCSI, in terms of how
heavily the hardware is being stressed the advantage of IDE over SCSI is
definitely not 9:1.

And Andre, don't take this personally. We're just trying to save our
precious data here, nothing more. :-) If something comes out of this
inquiry, it might just give you a lead.

Thanks,
Ion

--
It is better to keep your mouth shut and be thought a fool,
than to open it and remove all doubt.

2000-11-24 09:21:37

by Ion Badulescu

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

On Fri, 24 Nov 2000, Mohammad A. Haque wrote:

> Yep. Unless of course they are SCSI with an identity crisis =P

Ok. Are there any IDE-related errors in your logs prior to getting the f/s
corruption? They could be relevant no matter how much time passed between
them and the first signs of corruption.

Are your drives running with UDMA transfers enabled?

Thanks,
Ion

--
It is better to keep your mouth shut and be thought a fool,
than to open it and remove all doubt.

2000-11-24 11:37:04

by Mike Ricketts

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

On Fri, 24 Nov 2000, Ion Badulescu wrote:

> So I'm asking the same question, to all those who have seen unexplained
> filesystem corruption with 2.4.0: are you using IDE drives? If the answer
> is yes, can you check the logs and see if, at *any* point before the
> corruption occurred, the IDE driver choked and disabled DMA for *any* of
> your disks?

I have both IDE and SCSI drives in my machine, but have only seen
corruption on the SCSI drives. That doesn't mean that the problem only
exists on the SCSI drives - the IDE ones are not frequently written to.
I have disabled DMA myself on all my IDE drives because if I enable it,
the IDE driver always chokes the first time they are anything like
hammered (well, it always used to - I haven't actually tried it recently).

--
Mike Ricketts <[email protected]> Phone: +44 7968 381810

Humility is the first of the virtues -- for other people.
-- Oliver Wendell Holmes

2000-11-24 11:39:45

by Tigran Aivazian

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

I have seen ext2 filesystem corruption both on SCSI and IDE drives.

Tigran

On Fri, 24 Nov 2000, Mike Ricketts wrote:

> On Fri, 24 Nov 2000, Ion Badulescu wrote:
>
> > So I'm asking the same question, to all those who have seen unexplained
> > filesystem corruption with 2.4.0: are you using IDE drives? If the answer
> > is yes, can you check the logs and see if, at *any* point before the
> > corruption occurred, the IDE driver choked and disabled DMA for *any* of
> > your disks?
>
> I have both IDE and SCSI drives in my machine, but have only seen
> corruption on the SCSI drives. That doesn't mean that the problem only
> exists on the SCSI drives - the IDE ones are not frequently written to.
> I have disabled DMA myself on all my IDE drives because if I enable it,
> the IDE driver always chokes the first time they are anything like
> hammered (well, it always used to - I haven't actually tried it recently).
>
>

2000-11-24 14:19:29

by Guest section DW

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

On Fri, Nov 24, 2000 at 12:51:05AM -0800, Ion Badulescu wrote:

> Ok. Are there any IDE-related errors in your logs

Once, after a reboot:

Nov 22 17:25:50 mette kernel: hdf: status error: status=0x58 { DriveReady SeekComplete DataRequest }
Nov 22 17:25:50 mette kernel: hdf: drive not ready for command
Nov 22 17:25:50 mette kernel: hdf: status timeout: status=0xd0 { Busy }
Nov 22 17:25:50 mette kernel: hdf: drive not ready for command
Nov 22 17:25:52 mette kernel: ide2: reset: success

(But I described the situation where the data on disk was correct
and the data in core was not - almost certainly this is not an IDE problem.)
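
For anyone reading the status values in the log lines above, here is a minimal Python sketch that splits an IDE status byte into named bits. The bit positions are the standard ATA status register layout; the names DeviceFault, CorrectedError and Index do not appear in this thread and are labels assumed here for illustration. Reporting only "Busy" when that bit is set is also an assumption, made to match the "{ Busy }" line for 0xd0.

```python
# Status bit masks, most significant first. Names for bits that do not
# appear in the quoted logs (DeviceFault, CorrectedError, Index) are
# assumed labels, not taken from this thread.
STATUS_BITS = [
    (0x80, "Busy"),
    (0x40, "DriveReady"),
    (0x20, "DeviceFault"),
    (0x10, "SeekComplete"),
    (0x08, "DataRequest"),
    (0x04, "CorrectedError"),
    (0x02, "Index"),
    (0x01, "Error"),
]


def decode_status(status):
    """Return the names of the bits set in an IDE status byte."""
    if status & 0x80:
        # Assumption: while Busy is set the remaining bits are not
        # meaningful, so only Busy is reported (as in the 0xd0 log line).
        return ["Busy"]
    return [name for mask, name in STATUS_BITS if status & mask]


for value in (0x58, 0xd0):
    names = " ".join(decode_status(value))
    print(f"status=0x{value:02x} {{ {names} }}")
```

With the two values from the log, this prints the same bit names the kernel reported: DriveReady SeekComplete DataRequest for 0x58, and Busy for 0xd0.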

Andries

2000-11-24 15:37:23

by Mohammad A. Haque

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

I get the following once on every reboot.

Nov 23 01:14:37 viper kernel: hdb: drive_cmd: status=0x51 { DriveReady
SeekComplete Error }
Nov 23 01:14:37 viper kernel: hdb: drive_cmd: error=0x04
Nov 23 01:14:37 viper kernel: hdb: drive_cmd: status=0x51 { DriveReady
SeekComplete Error }
Nov 23 01:14:37 viper kernel: hdb: drive_cmd: error=0x04

hdb is my DVD drive. But other than that I haven't seen any other
IDE-related errors.
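
Checking a log for IDE driver messages, as asked earlier in the thread, could be sketched roughly like this in Python. The pattern only covers the "hdX:" and "ideN:" prefixes seen in the lines quoted here; it is an assumption, not a complete list of what the driver can print, and `ide_messages` is a hypothetical helper name.

```python
import re
from collections import defaultdict

# Matches syslog lines such as
#   "Nov 23 01:14:37 viper kernel: hdb: drive_cmd: error=0x04"
# The device pattern is an assumption based on the lines in this thread.
IDE_LINE = re.compile(r"kernel: (hd[a-z]|ide\d+): (.+)$")


def ide_messages(lines):
    """Group IDE driver messages by the device that reported them."""
    found = defaultdict(list)
    for line in lines:
        match = IDE_LINE.search(line.rstrip("\n"))
        if match:
            device, message = match.groups()
            found[device].append(message)
    return dict(found)


sample = [
    "Nov 23 01:14:37 viper kernel: hdb: drive_cmd: error=0x04",
    "Nov 22 17:25:52 mette kernel: ide2: reset: success",
    "Nov 23 00:35:11 viper kernel: EXT2-fs error (device ide0(3,3)):",
]
for device, messages in sorted(ide_messages(sample).items()):
    print(device, messages)
```

The EXT2-fs line is skipped because the device name there is not at the start of the message. Boot-time probe lines (drive model, BM-DMA settings) would match too, so the output still needs reading by eye.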


I found these two lines nested in between a lot of other messages that I
missed before.

Nov 23 00:35:11 viper kernel: EXT2-fs error (device ide0(3,3)):
ext2_free_blocks: bit already cleared for block 147021
Nov 23 00:35:11 viper kernel: EXT2-fs error (device ide0(3,3)):
ext2_free_blocks: bit already cleared for block 147021

Then I get these ....

Nov 23 00:40:06 viper kernel: EXT2-fs warning (device ide0(3,3)):
ext2_unlink: Deleting nonexistent file (622295), 0
Nov 23 00:40:06 viper kernel: = 1
Nov 23 00:40:06 viper kernel: EXT2-fs error (device ide0(3,3)):
ext2_free_blocks: Freeing blocks not in datazone - block = 540028982,
count = 1
Nov 23 00:40:06 viper kernel: EXT2-fs error (device ide0(3,3)):
ext2_free_blocks: Freeing blocks not in datazone - block = 540024880,
count = 1
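
One way to look at the out-of-range block numbers above is to view them as the raw bytes they would occupy on disk. This is a minimal Python sketch, not a diagnosis. It assumes the numbers are 32-bit little-endian values, as ext2 stores block pointers; `TOTAL_BLOCKS` is a placeholder rather than a value read from the real superblock, and `describe_block` is a hypothetical helper.

```python
import struct


def describe_block(block, total_blocks):
    """Show whether a block number is plausible and what its bytes look like."""
    raw = struct.pack("<I", block)  # 32-bit little-endian, as stored by ext2
    printable = all(0x20 <= byte < 0x7f for byte in raw)
    return {
        "in_range": 0 < block < total_blocks,
        "hex": raw.hex(),
        "ascii": raw.decode("ascii") if printable else None,
    }


# Placeholder filesystem size in blocks; NOT taken from the real superblock.
TOTAL_BLOCKS = 3_000_000

for block in (540028982, 540024880):
    print(block, describe_block(block, TOTAL_BLOCKS))
```

Both values come out as printable ASCII ("600 " and "0 0 "). That may hint that text data ended up where block pointers were expected, but this is only an observation from the numbers and does not say how it happened.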

[mhaque@viper mhaque]$ sudo hdparm -iv /dev/hda

/dev/hda:
multcount = 16 (on)
I/O support = 3 (32-bit w/sync)
unmaskirq = 0 (off)
using_dma = 1 (on)
keepsettings = 1 (on)
nowerr = 0 (off)
readonly = 0 (off)
readahead = 128 (on)
geometry = 1650/255/63, sectors = 26520480, start = 0

Model=IBM-DJNA-371350, FwRev=J76OA30K, SerialNo=GM0GMFE4929
Config={ HardSect NotMFM HdSw>15uSec Fixed DTR>10Mbs }
RawCHS=16383/16/63, TrkSize=0, SectSize=0, ECCbytes=34
BuffType=DualPortCache, BuffSize=1966kB, MaxMultSect=16, MultSect=16
CurCHS=16383/16/63, CurSects=-66060037, LBA=yes, LBAsects=26520480
IORDY=on/off, tPIO={min:240,w/IORDY:120}, tDMA={min:120,rec:120}
PIO modes: pio0 pio1 pio2 pio3 pio4
DMA modes: mdma0 mdma1 mdma2 udma0 udma1 *udma2 udma3 udma4

[mhaque@viper mhaque]$ sudo hdparm -iv /dev/hdb

/dev/hdb:
HDIO_GET_MULTCOUNT failed: Invalid argument
I/O support = 3 (32-bit w/sync)
unmaskirq = 0 (off)
using_dma = 1 (on)
keepsettings = 1 (on)
HDIO_GET_NOWERR failed: Invalid argument
readonly = 1 (on)
readahead = 128 (on)
HDIO_GETGEO failed: Invalid argument

Model=CREATIVEDVD-ROM DVD2240E 12/24/97, FwRev=1.7A, SerialNo=
Config={ Fixed Removeable DTR<=5Mbs DTR>10Mbs nonMagnetic }
RawCHS=0/0/0, TrkSize=0, SectSize=0, ECCbytes=0
BuffType=unknown, BuffSize=0kB, MaxMultSect=0
(maybe): CurCHS=0/0/0, CurSects=0, LBA=yes, LBAsects=0
IORDY=on/off, tPIO={min:120,w/IORDY:120}, tDMA={min:120,rec:150}
PIO modes: pio0 pio1 pio2 pio4
DMA modes: sdma0 sdma1 sdma2 sdma? mdma0 mdma1 *mdma2

Ion Badulescu wrote:
>
> Ok. Are there any IDE-related errors in your logs prior to getting the f/s
> corruption? They could be relevant no matter how much time passed between
> them and the first signs of corruption.
>
> Are your drives running with UDMA transfers enabled?
>
> Thanks,
> Ion
>
> --
> It is better to keep your mouth shut and be thought a fool,
> than to open it and remove all doubt.

--

=====================================================================
Mohammad A. Haque http://www.haque.net/
[email protected]

"Alcohol and calculus don't mix. Project Lead
Don't drink and derive." --Unknown http://wm.themes.org/
[email protected]
=====================================================================

2000-11-24 18:12:27

by Linus Torvalds

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

In article <[email protected]>,
Guest section DW <[email protected]> wrote:
>
>(But I described the situation where the data on disk was correct
>and the data in core was not - almost certainly this is not an IDE problem.)

Ehh.. It only means that it would have been a read failure instead of a
write failure.

Linus

2000-11-25 02:52:13

by Andre Hedrick

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

On Fri, 24 Nov 2000, Mike Ricketts wrote:

> On Fri, 24 Nov 2000, Ion Badulescu wrote:
>
> > So I'm asking the same question, to all those who have seen unexplained
> > filesystem corruption with 2.4.0: are you using IDE drives? If the answer
> > is yes, can you check the logs and see if, at *any* point before the
> > corruption occurred, the IDE driver choked and disabled DMA for *any* of
> > your disks?
>
> I have both IDE and SCSI drives in my machine, but have only seen
> corruption on the SCSI drives. That doesn't mean that the problem only
> exists on the SCSI drives - the IDE ones are not frequently written to.
> I have disabled DMA myself on all my IDE drives because if I enable it,
> the IDE driver always chokes the first time they are anything like
> hammered (well, it always used to - I haven't actually tried it recently).

This is the kind of data point that is needed: a possible
storage-class-independent problem.
More importantly, what was the first kernel in which you began to notice
this problem? Next, I need you to enable the DMA engine in ATA to verify
that it is happening on both classes.

Cheers,

Andre Hedrick
CTO Timpanogas Research Group
EVP Linux Development, TRG
Linux ATA Development

2000-11-25 23:12:56

by Rick Bunke

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

I read up on this thread in the archives (the last message in thread was
posted on the 24th) so I'm sorry if this has already been said.

I'm having the same problem with 2.4.0-test10, but I don't have the
problem with 2.4.0-test9. So I think the bug might have been introduced
in 10.

When I try to compile xfree86 (the DRI version) under test10 my system
locks up and then has to clean up a bunch of errors in the filesystem on
reboot. I went back, removed my build tree, recreated it, recompiled
again and it would lock up again. I did this a few times under kernel
2.4.0-test10 and it would consistently lock up, then I went to the kernel
archive and found this thread which says the problem is in test11. I then
decided I better try out test9 and see if it was just me. I rebooted into
kernel 2.4.0-test9, removed my build tree, recreated it, then recompiled
again and it worked just fine without locking up. That is what leads me
to believe the problem was introduced in test10 and probably carried over
to test11. Hope this helps.

If you want more info, just let me know;
my email address is [email protected].

Have Fun
Rick

Information is the currency of democracy.
- Thomas Jefferson

2000-11-27 06:20:36

by NeilBrown

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

On Friday November 24, [email protected] wrote:
>
>
> On Fri, 24 Nov 2000, Neil Brown wrote:
>
> > I ran my test script, which builds a variety of raid5 arrays with
> > varying numbers of drives and chunk sizes, and runs mkfs/bonnie/dbench
> > on each array, and it got through about 8 file systems but choked on
> > the 9th by trying to allocate lots of blocks in the system zone (after
> > running for about an hour).
>
> Bloody interesting. I don't see anything recent that could affect the
> areas in question. Interesting versions to check: 11-pre5 and 11-pre6.
> It smells like buffer cache corruption, but I don't see anything
> relevant. __generic_unplug_device() change looks pretty innocent,
> ditto for bh_kmap() ones in raid5 and on ext2 side we had two obviously
> equivalent replacements (pre5->pre6). No buffer.c changes, no VM ones.
> Urgh.

Turns out my data is a false alarm. It was a bug in my raid5 code -
and not a recent bug either - that was causing my filesystem
corruption.

So if your earlier patches work for everybody else then they look like
a good way to go. I have fixed my fatal flaw and I cannot reproduce
the problems any more. Patch has gone to Alan.

NeilBrown

2000-11-28 23:26:17

by Mohammad A. Haque

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

Ok, I'm not sure what else to try. I've even tried throwing around 1.6
GB of data, and copying and deleting at the same time. Nothing. Again,
this is _without_ the patches sent by Alexander.

I think I'm just gonna go on to test12-pre2.

Neil Brown wrote:
>
> Turns out my data is a false alarm. It was a bug in my raid5 code -
> and not a recent bug either - that was causing my filesystem
> corruption.
>
> So if your earlier patches work for everybody else then they look like
> a good way to go. I have fixed my fatal flaw and I cannot reproduce
> the problems any more. Patch has gone to Alan.


2000-11-29 06:06:11

by Mohammad A. Haque

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

Ok, I just found a file with about the first 4k of it filled with nulls
(^@^@). No telling if this was a result of what originally started this
thread or not. I hadn't accessed that file since Nov 9th.



2000-11-29 07:15:31

by Ion Badulescu

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

On Tue, 28 Nov 2000 23:37:49 -0500, Mohammad A. Haque <[email protected]> wrote:
> Ok, I just found a file with about the first 4k of it filled with nulls
> (^@^@). No telling if this was a result of what originally started this
> thread or not. I hadn't accessed that file since Nov 9th.

1k- or 4k-block filesystem? Also, can you count the nulls to see if
there are exactly 4096 of them?
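
Counting the nulls by hand is error-prone, so here is a minimal Python sketch of one way to do it. The helper name `leading_nulls` and the throwaway demo file are illustrative assumptions; on a real system the path would be the affected file.

```python
import os
import tempfile


def leading_nulls(path, chunk_size=65536):
    """Count the NUL bytes at the very start of a file."""
    count = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                return count  # end of file: all NULs so far (or empty file)
            rest = chunk.lstrip(b"\x00")
            count += len(chunk) - len(rest)
            if rest:
                return count  # reached the first non-NUL byte


# Demo on a throwaway file: 4096 NULs followed by ordinary data.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"\x00" * 4096 + b"rest of the file\n")
    demo_path = tmp.name

nulls = leading_nulls(demo_path)
aligned = nulls > 0 and nulls % 4096 == 0
print(nulls, "leading NULs;", "multiple of 4096" if aligned else "not a multiple of 4096")
os.remove(demo_path)
```

A count that is an exact multiple of the filesystem block size would suggest that whole blocks were zeroed rather than part of one.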

Thanks,
Ion

--
It is better to keep your mouth shut and be thought a fool,
than to open it and remove all doubt.

2000-11-29 07:39:22

by Mohammad A. Haque

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

[mhaque@viper mhaque]$ df
Filesystem 1k-blocks Used Available Use% Mounted on
/dev/hda3 12737128 9988400 2101712 83% /
/dev/hda2 46668 15106 29153 35% /boot
/dev/hdd1 44327416 26319188 15756484 63% /home2
none 8388608 11944 8376664 1% /dev/shm

Yes, exactly 4096 nulls.

Ion Badulescu wrote:
> 1k- or 4k-block filesystem? Also, can you count the nulls to see if
> there are exactly 4096 of them?


2000-11-29 07:50:53

by Ion Badulescu

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

On Wed, 29 Nov 2000, Mohammad A. Haque wrote:

> [mhaque@viper mhaque]$ df
> Filesystem 1k-blocks Used Available Use% Mounted on
> /dev/hda3 12737128 9988400 2101712 83% /
> /dev/hda2 46668 15106 29153 35% /boot
> /dev/hdd1 44327416 26319188 15756484 63% /home2
> none 8388608 11944 8376664 1% /dev/shm

No, you misunderstood me. df is always going to say 1k-blocks, but that
doesn't mean that the filesystem's allocation unit is actually 1k.

Try doing a tune2fs -l on the device holding the filesystem and grep for
"Block size". Although... looking at the numbers above, it's almost
certainly 4k.
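
To show where that "Block size" value comes from, here is a minimal Python sketch that reads it straight from an ext2 superblock. It relies on the on-disk layout: the superblock starts 1024 bytes into the device, the log2 block-size field sits at byte 24 of it, and the 0xEF53 magic at byte 56. The helper name `ext2_block_size` and the in-memory fake image are assumptions for the demo; tune2fs -l remains the simpler route on a real system.

```python
import io
import struct

SUPERBLOCK_OFFSET = 1024  # the superblock starts 1 KiB into the device
EXT2_MAGIC = 0xEF53


def ext2_block_size(image):
    """Return the block size recorded in an ext2 superblock.

    `image` is any seekable binary file object, e.g. a filesystem image
    or a block device opened read-only with sufficient privileges.
    """
    image.seek(SUPERBLOCK_OFFSET)
    superblock = image.read(1024)
    if len(superblock) < 58:
        raise ValueError("image too short to hold a superblock")
    (log_block_size,) = struct.unpack_from("<I", superblock, 24)
    (magic,) = struct.unpack_from("<H", superblock, 56)
    if magic != EXT2_MAGIC:
        raise ValueError("ext2 magic number not found")
    return 1024 << log_block_size


# Demo with a fake in-memory image instead of a real device.
fake = bytearray(2048)
struct.pack_into("<I", fake, SUPERBLOCK_OFFSET + 24, 2)  # 1024 << 2 = 4096
struct.pack_into("<H", fake, SUPERBLOCK_OFFSET + 56, EXT2_MAGIC)
print(ext2_block_size(io.BytesIO(bytes(fake))))
```

The stored field is a shift count rather than a byte count, which is why a value of 2 corresponds to 4k blocks. The 1k unit df prints is only its reporting unit.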

> Yes, exactly 4096 nulls.

That's what I thought... thanks.

Ion

--
It is better to keep your mouth shut and be thought a fool,
than to open it and remove all doubt.

2000-11-29 07:55:04

by Mohammad A. Haque

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

Whoops, my bad. Yes, 4k blocks.

Block size: 4096


Ion Badulescu wrote:
>
> No, you misunderstood me. df is always going to say 1k-blocks, but that
> doesn't mean that the filesystem's allocation unit is actually 1k.
>
> Try doing a tune2fs -l on the device holding the filesystem and grep for
> "Block size". Although... looking at the numbers above, it's almost
> certainly 4k.
>
> That's what I thought... thanks.
>


2000-11-29 09:23:29

by Tigran Aivazian

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

Mohammad,

can you please tell me if that 4K corrupted block in a file was on a UP
machine or SMP? So far I have not seen a corruption on UP machines, only
SMP.

Regards,
Tigran

2000-11-29 10:32:49

by Mohammad A. Haque

Subject: Re: ext2 filesystem corruptions back from dead? 2.4.0-test11

UP

Tigran Aivazian wrote:
>
> Mohammad,
>
> can you please tell me if that 4K corrupted block in a file was on a UP
> machine or SMP? So far I have not seen a corruption on UP machines, only
> SMP.
