2001-04-25 12:20:40

by Pavel Machek

[permalink] [raw]
Subject: [patch] linux likes to kill bad inodes

Hi!

I had a temporary disk failure (played with acpi too much). What
happened was that disk was not able to do anything for five minutes
or so. When disk recovered, linux happily overwrote all inodes it
could not read while disk was down with zeros -> massive disk
corruption.

Solution is not to write bad inodes back to disk.

[Thanx to Jan Kara]
Pavel

--- clean/fs/inode.c Wed Apr 4 23:58:04 2001
+++ linux/fs/inode.c Sun Apr 22 14:04:46 2001
@@ -179,7 +179,7 @@

static inline void write_inode(struct inode *inode, int sync)
{
- if (inode->i_sb && inode->i_sb->s_op && inode->i_sb->s_op->write_inode)
+ if (inode->i_sb && inode->i_sb->s_op && inode->i_sb->s_op->write_inode && !is_bad_inode(inode))
inode->i_sb->s_op->write_inode(inode, sync);
}

--
I'm [email protected]. "In my country we have almost anarchy and I don't care."
Panos Katsaloulis describing me w.r.t. patents at [email protected]


2001-04-25 13:30:51

by Chris Mason

[permalink] [raw]
Subject: Re: [patch] linux likes to kill bad inodes



On Sunday, April 22, 2001 02:10:42 PM +0200 Pavel Machek <[email protected]>
wrote:

> Hi!
>
> I had a temporary disk failure (played with acpi too much). What
> happened was that disk was not able to do anything for five minutes
> or so. When disk recovered, linux happily overwrote all inodes it
> could not read while disk was down with zeros -> massive disk
> corruption.
>
> Solution is not to write bad inodes back to disk.
>

Wouldn't we rather make it so bad inodes don't get marked dirty at all?

-chris

2001-04-25 20:05:07

by Pavel Machek

[permalink] [raw]
Subject: Re: [patch] linux likes to kill bad inodes

Hi!

> > Hi!
> >
> > I had a temporary disk failure (played with acpi too much). What
> > happened was that disk was not able to do anything for five minutes
> > or so. When disk recovered, linux happily overwrote all inodes it
> > could not read while disk was down with zeros -> massive disk
> > corruption.
> >
> > Solution is not to write bad inodes back to disk.
> >
>
> Wouldn't we rather make it so bad inodes don't get marked dirty at all?

I guess this is cheaper: we can mark inode dirty at 1000 points, but
you only write it at one point.

But I'm no FS expert.
Pavel
--
I'm [email protected]. "In my country we have almost anarchy and I don't care."
Panos Katsaloulis describing me w.r.t. patents at [email protected]

2001-04-25 20:32:08

by Chris Mason

[permalink] [raw]
Subject: Re: [patch] linux likes to kill bad inodes



On Wednesday, April 25, 2001 10:01:20 PM +0200 Pavel Machek <[email protected]>
wrote:

> Hi!
>
>> > Hi!
>> >
>> > I had a temporary disk failure (played with acpi too much). What
>> > happened was that disk was not able to do anything for five minutes
>> > or so. When disk recovered, linux happily overwrote all inodes it
>> > could not read while disk was down with zeros -> massive disk
>> > corruption.
>> >
>> > Solution is not to write bad inodes back to disk.
>> >
>>
>> Wouldn't we rather make it so bad inodes don't get marked dirty at all?
>
> I guess this is cheaper: we can mark inode dirty at 1000 points, but
> you only write it at one point.

Whoops, I worded that poorly. To me, it seems like a bug to dirty a bad
inode. If this patch works, it is because somewhere, somebody did
something with a bad inode, and thought the operation worked (otherwise,
why dirty it?).

So yes, even if we dirty them in a 1000 different places, we need to find
the one place that believes it can do something worthwhile to a bad inode.

-chris


2001-04-26 09:29:32

by Pavel Machek

[permalink] [raw]
Subject: Re: [patch] linux likes to kill bad inodes

Hi!

> >> > Hi!
> >> >
> >> > I had a temporary disk failure (played with acpi too much). What
> >> > happened was that disk was not able to do anything for five minutes
> >> > or so. When disk recovered, linux happily overwrote all inodes it
> >> > could not read while disk was down with zeros -> massive disk
> >> > corruption.
> >> >
> >> > Solution is not to write bad inodes back to disk.
> >> >
> >>
> >> Wouldn't we rather make it so bad inodes don't get marked dirty at all?
> >
> > I guess this is cheaper: we can mark inode dirty at 1000 points, but
> > you only write it at one point.
>
> Whoops, I worded that poorly. To me, it seems like a bug to dirty a bad
> inode. If this patch works, it is because somewhere, somebody did
> something with a bad inode, and thought the operation worked (otherwise,
> why dirty it?).

Would it make sense to put the check into write_inode_ with BUG() and
return, then?

> So yes, even if we dirty them in a 1000 different places, we need to find
> the one place that believes it can do something worthwhile to a bad
> inode.

Could not it be something as simple as atime update?

Pavel
--
The best software in life is free (not shareware)! Pavel
GCM d? s-: !g p?:+ au- a--@ w+ v- C++@ UL+++ L++ N++ E++ W--- M- Y- R+

2001-04-26 09:33:42

by Jan Kara

[permalink] [raw]
Subject: Re: [patch] linux likes to kill bad inodes

>
>
> On Wednesday, April 25, 2001 10:01:20 PM +0200 Pavel Machek <[email protected]>
> wrote:
>
> > Hi!
> >
> >> > Hi!
> >> >
> >> > I had a temporary disk failure (played with acpi too much). What
> >> > happened was that disk was not able to do anything for five minutes
> >> > or so. When disk recovered, linux happily overwrote all inodes it
> >> > could not read while disk was down with zeros -> massive disk
> >> > corruption.
> >> >
> >> > Solution is not to write bad inodes back to disk.
> >> >
> >>
> >> Wouldn't we rather make it so bad inodes don't get marked dirty at all?
> >
> > I guess this is cheaper: we can mark inode dirty at 1000 points, but
> > you only write it at one point.
>
> Whoops, I worded that poorly. To me, it seems like a bug to dirty a bad
> inode. If this patch works, it is because somewhere, somebody did
> something with a bad inode, and thought the operation worked (otherwise,
> why dirty it?).
>
> So yes, even if we dirty them in a 1000 different places, we need to find
> the one place that believes it can do something worthwhile to a bad inode.
Yes checking those places where bad inode gets dirty is probably good
idea. But I'd add test to write_inode anyway together with warning that
this shouldn't happen.

Honza

--
Jan Kara <[email protected]>
SuSE Labs

2001-04-27 14:16:12

by Pavel Machek

[permalink] [raw]
Subject: Re: [patch] linux likes to kill bad inodes

Hi!

> >> > I had a temporary disk failure (played with acpi too much). What
> >> > happened was that disk was not able to do anything for five minutes
> >> > or so. When disk recovered, linux happily overwrote all inodes it
> >> > could not read while disk was down with zeros -> massive disk
> >> > corruption.
> >> >
> >> > Solution is not to write bad inodes back to disk.
> >> >
> >>
> >> Wouldn't we rather make it so bad inodes don't get marked dirty at all?
> >
> > I guess this is cheaper: we can mark inode dirty at 1000 points, but
> > you only write it at one point.
>
> Whoops, I worded that poorly. To me, it seems like a bug to dirty a bad
> inode. If this patch works, it is because somewhere, somebody did
> something with a bad inode, and thought the operation worked (otherwise,
> why dirty it?).
>
> So yes, even if we dirty them in a 1000 different places, we need to find
> the one place that believes it can do something worthwhile to a bad inode.

Okay, so what about following patch, followed by attempt to debug it?
[I'd really like to get patch it; killing user's data without good
reason seems evil to me, and this did quite a lot of damage to my
$HOME.]

Pavel
PS: Only filesystem at use at time of problem was ext2, and it was
ext2 iinode that got killed.


--- clean/fs/inode.c Wed Apr 4 23:58:04 2001
+++ linux/fs/inode.c Fri Apr 27 00:25:46 2001
@@ -179,6 +179,10 @@

static inline void write_inode(struct inode *inode, int sync)
{
+ if (is_bad_inode(inode)) {
+ printk(KERN_CRIT "Cowardly refusing to kill your inode\n");
+ return;
+ }
if (inode->i_sb && inode->i_sb->s_op && inode->i_sb->s_op->write_inode)
inode->i_sb->s_op->write_inode(inode, sync);
}


--
I'm [email protected]. "In my country we have almost anarchy and I don't care."
Panos Katsaloulis describing me w.r.t. patents at [email protected]

2001-04-27 14:37:46

by Chris Mason

[permalink] [raw]
Subject: Re: [patch] linux likes to kill bad inodes



On Friday, April 27, 2001 12:28:54 AM +0200 Pavel Machek <[email protected]>
wrote:

> Okay, so what about following patch, followed by attempt to debug it?
> [I'd really like to get patch it; killing user's data without good
> reason seems evil to me, and this did quite a lot of damage to my
> $HOME.]

2.4.4-pre8 does have the patch to keep write_inode from syncing a
bad_inode. In the short term this is the best way to go.

For debugging further, it is probably best to put the warning in when
marking the inode dirty, and randomly returning bad_inodes from read_inode.
I'll give this a try next week.

My guess is that UPDATE_ATIME is the offending caller, the follow_link path
in open_namei is at least one place that should trigger it.

-chris



2001-04-27 22:11:15

by Andreas Dilger

[permalink] [raw]
Subject: Re: [patch] linux likes to kill bad inodes

Pavel writes:
> [I'd really like to get patch it; killing user's data without good
> reason seems evil to me, and this did quite a lot of damage to my
> $HOME.]
>
> Pavel
> PS: Only filesystem at use at time of problem was ext2, and it was
> ext2 iinode that got killed.

However, since make_bad_inode() only changes the file methods and not
the superblock, any of the put_inode(), delete_inode(), or write_inode()
methods can still be called. While the write_inode() call is safe to
block in the VFS, I don't think it is safe to block *_put_inode() and
*_delete_inode() in the VFS because the filesystem may have allocated
memory or other things that need to be undone even for bad inodes.

The following patch fixes the VFS write_inode() (per Pavel's patch),
and ext2_put_inode(), bfs_delete_inode(), and udf_put_inode() to not
do anything with bad inodes. I have bailed (with a FIXME) on
hpfs_delete_inode() and smb_delete_inode(), because I don't know what
(if anything) needs to be done to correctly clean up a bad inode.

I will post a patch separately which handles a couple of cases where
*_delete_inode() does not call clear_inode() in all cases.

Cheers, Andreas
=======================================================================
diff -ru linux-2.4.4p1.orig/fs/inode.c linux/fs/inode.c
--- linux-2.4.4p1.orig/fs/inode.c Tue Apr 10 16:44:49 2001
+++ linux/fs/inode.c Fri Apr 27 13:05:04 2001
@@ -179,6 +181,12 @@

static inline void write_inode(struct inode *inode, int sync)
{
+ if (is_bad_inode(inode)) {
+ printk(KERN_CRIT "Cowardly refusing to write bad inode %s:%d\n",+ kdevname(inode->i_dev), inode->i_ino);
+ return;
+ }
+
if (inode->i_sb && inode->i_sb->s_op && inode->i_sb->s_op->write_inode)
inode->i_sb->s_op->write_inode(inode, sync);
}
diff -ru linux-2.4.4p1.orig/fs/ext2/inode.c linux/fs/ext2/inode.c
--- linux-2.4.4p1.orig/fs/ext2/inode.c Tue Apr 10 16:44:49 2001
+++ linux/fs/ext2/inode.c Fri Apr 27 13:51:15 2001
@@ -36,6 +36,9 @@
*/
void ext2_put_inode (struct inode * inode)
{
+ if (is_bad_inode(inode))
+ return;
+
ext2_discard_prealloc (inode);
}

diff -ru linux-2.4.4p1.orig/fs/bfs/inode.c linux/fs/bfs/inode.c
--- linux-2.4.4p1.orig/fs/bfs/inode.c Tue Apr 10 16:44:49 2001
+++ linux/fs/bfs/inode.c Fri Apr 27 15:45:31 2001
@@ -142,7 +142,8 @@

dprintf("ino=%08lx\n", inode->i_ino);

- if (inode->i_ino < BFS_ROOT_INO || inode->i_ino > inode->i_sb->su_lasti) {
+ if (is_bad_inode(inode) || inode->i_ino < BFS_ROOT_INO ||
+ inode->i_ino > inode->i_sb->su_lasti) {
printf("invalid ino=%08lx\n", inode->i_ino);
return;
}
diff -ru linux-2.4.4p1.orig/fs/udf/inode.c linux/fs/udf/inode.c
--- linux-2.4.4p1.orig/fs/udf/inode.c Tue Apr 10 16:41:51 2001
+++ linux/fs/udf/inode.c Fri Apr 27 14:03:49 2001
@@ -74,7 +74,7 @@
*/
void udf_put_inode(struct inode * inode)
{
- if (!(inode->i_sb->s_flags & MS_RDONLY))
+ if (!(inode->i_sb->s_flags & MS_RDONLY) && !is_bad_inode(inode))
{
lock_kernel();
udf_discard_prealloc(inode);
diff -ru linux-2.4.4p1.orig/fs/hpfs/inode.c linux/fs/hpfs/inode.c
--- linux-2.4.4p1.orig/fs/hpfs/inode.c Tue Apr 10 16:41:50 2001
+++ linux/fs/hpfs/inode.c Fri Apr 27 13:57:12 2001
@@ -316,6 +304,7 @@

void hpfs_delete_inode(struct inode *inode)
{
+ /* FIXME: handle is_bad_inode??? */
lock_kernel();
hpfs_remove_fnode(inode->i_sb, inode->i_ino);
unlock_kernel();
diff -ru linux-2.4.4p1.orig/fs/smbfs/inode.c linux/fs/smbfs/inode.c
--- linux-2.4.4p1.orig/fs/smbfs/inode.c Tue Apr 10 16:44:54 2001
+++ linux/fs/smbfs/inode.c Fri Apr 27 14:01:33 2001
@@ -254,6 +254,7 @@
smb_delete_inode(struct inode *ino)
{
DEBUG1("ino=%ld\n", ino->i_ino);
+ /* FIXME: handle is_bad_inode() case??? */
lock_kernel();
if (smb_close(ino))
PARANOIA("could not close inode %ld\n", ino->i_ino);
--
Andreas Dilger TurboLabs filesystem development
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

2001-04-27 22:15:45

by Linus Torvalds

[permalink] [raw]
Subject: Re: [patch] linux likes to kill bad inodes



On Fri, 27 Apr 2001, Andreas Dilger wrote:
>
> However, since make_bad_inode() only changes the file methods and not
> the superblock

Please just make "make_bad_inode()" just do

inode->i_sb = bad_super_block;

and do everything else too.

It's not acceptable to make low-level filesystems care about these things.

Linus

2001-04-27 23:34:01

by Andreas Dilger

[permalink] [raw]
Subject: Re: [patch] linux likes to kill bad inodes

I previously wrote:
> I will post a patch separately which handles a couple of cases where
> *_delete_inode() does not call clear_inode() in all cases.

OK, here it is. The ext2_delete_inode() change isn't exactly a bug fix,
but rather a "performance" change. No need to hold BLK to check status
or call clear_inode() (we call clear_inode() outside BLK in VFS if
delete_inode() method does not exist).

Cheers, Andreas
=======================================================================
diff -ru linux-2.4.4p1.orig/fs/ext2/inode.c linux/fs/ext2/inode.c
--- linux-2.4.4p1.orig/fs/ext2/inode.c Tue Apr 10 16:44:49 2001
+++ linux/fs/ext2/inode.c Fri Apr 27 13:51:15 2001
@@ -44,12 +47,12 @@
*/
void ext2_delete_inode (struct inode * inode)
{
- lock_kernel();
-
if (is_bad_inode(inode) ||
inode->i_ino == EXT2_ACL_IDX_INO ||
inode->i_ino == EXT2_ACL_DATA_INO)
goto no_delete;
+
+ lock_kernel();
inode->u.ext2_i.i_dtime = CURRENT_TIME;
mark_inode_dirty(inode);
ext2_update_inode(inode, IS_SYNC(inode));
@@ -59,9 +62,7 @@
ext2_free_inode (inode);

unlock_kernel();
return;
no_delete:
- unlock_kernel();
clear_inode(inode); /* We must guarantee clearing of inode... */
}

diff -ru linux-2.4.4p1.orig/fs/bfs/inode.c linux/fs/bfs/inode.c
--- linux-2.4.4p1.orig/fs/bfs/inode.c Tue Apr 10 16:44:49 2001
+++ linux/fs/bfs/inode.c Fri Apr 27 15:45:31 2001
@@ -145,7 +145,7 @@
if (is_bad_inode(inode) || inode->i_ino < BFS_ROOT_INO ||
inode->i_ino > inode->i_sb->su_lasti) {
printf("invalid ino=%08lx\n", inode->i_ino);
- return;
+ goto bad_inode;
}

inode->i_size = 0;
@@ -155,8 +156,7 @@
bh = bread(dev, block, BFS_BSIZE);
if (!bh) {
printf("Unable to read inode %s:%08lx\n", bdevname(dev), ino);
- unlock_kernel();
- return;
+ goto bad_unlock;
}
off = (ino - BFS_ROOT_INO)%BFS_INODES_PER_BLOCK;
di = (struct bfs_inode *)bh->b_data + off;
@@ -178,7 +178,9 @@
s->su_lf_eblk = inode->iu_sblock - 1;
mark_buffer_dirty(s->su_sbh);
}
+bad_unlock:
unlock_kernel();
+bad_inode:
clear_inode(inode);
}

diff -ru linux-2.4.4p1.orig/fs/ufs/ialloc.c linux/fs/ufs/ialloc.c
--- linux-2.4.4p1.orig/fs/ufs/ialloc.c Thu Nov 16 14:18:26 2000
+++ linux/fs/ufs/ialloc.c Fri Apr 27 15:53:26 2001
@@ -82,6 +82,7 @@
if (!((ino > 1) && (ino < (uspi->s_ncg * uspi->s_ipg )))) {
ufs_warning(sb, "ufs_free_inode", "reserved inode or nonexistent inode %u\n", ino);
unlock_super (sb);
+ clear_inode (inode);
return;
}

@@ -90,6 +91,7 @@
ucpi = ufs_load_cylinder (sb, cg);
if (!ucpi) {
unlock_super (sb);
+ clear_inode (inode);
return;
}
ucg = ubh_get_ucg(UCPI_UBH);
--
Andreas Dilger TurboLabs filesystem development
http://sourceforge.net/projects/ext2resize/
http://www-mddsp.enel.ucalgary.ca/People/adilger/

2001-04-30 22:25:41

by Alan

[permalink] [raw]
Subject: Re: [patch] linux likes to kill bad inodes

> {
> - if (inode->i_sb && inode->i_sb->s_op && inode->i_sb->s_op->write_inode)
> + if (inode->i_sb && inode->i_sb->s_op && inode->i_sb->s_op->write_inode && !is_bad_inode(inode))
> inode->i_sb->s_op->write_inode(inode, sync);
> }

Any reason a bad inode can't have its i_sb changed to a bad_inode_fs ?

2001-04-30 22:34:20

by David Miller

[permalink] [raw]
Subject: Re: [patch] linux likes to kill bad inodes


Alan Cox writes:
> Any reason a bad inode can't have its i_sb changed to a bad_inode_fs ?

I believe this is what Linus suggested.

Later,
David S. Miller
[email protected]

2001-04-30 23:23:14

by Linus Torvalds

[permalink] [raw]
Subject: Re: [patch] linux likes to kill bad inodes



On Mon, 30 Apr 2001, Alan Cox wrote:
>
> Any reason a bad inode can't have its i_sb changed to a bad_inode_fs ?

That would be my personal preference too, this was just the quick hack
version.

Changing superblocks might have other consequences (like getting the
superblock inode lists right etc).

Linus