2008-06-23 21:56:53

by Duane Griffin

[permalink] [raw]
Subject: [PATCH] ext3: handle corrupted orphan list at mount

If the orphan node list includes valid, untruncatable nodes with nlink > 0
the ext3_orphan_cleanup loop which attempts to delete them will not do so,
causing it to loop forever. Fix by checking for such nodes in the
ext3_orphan_get function.

This patch fixes the second case (image hdb.20000009.softlockup.gz)
reported in http://bugzilla.kernel.org/show_bug.cgi?id=10882.

Signed-off-by: Duane Griffin <[email protected]>
--

Note that we can still end up in an infinite loop if the ext3_truncate
errors out early (and keeps doing so). I'm not sure if we should worry
about that. If we did want to handle it then we could change ext3_truncate
to return success/failure and exit the ext3_orphan_cleanup.

---

diff --git a/fs/ext3/ialloc.c b/fs/ext3/ialloc.c
index 7712682..bc030f4 100644
--- a/fs/ext3/ialloc.c
+++ b/fs/ext3/ialloc.c
@@ -669,6 +669,14 @@ struct inode *ext3_orphan_get(struct super_block *sb, unsigned long ino)
if (IS_ERR(inode))
goto iget_failed;

+ /*
+ * If the orphans has i_nlinks > 0 then it should be able to be
+ * truncated, otherwise it won't be removed from the orphan list
+ * during processing and an infinite loop will result.
+ */
+ if (inode->i_nlink && !ext3_can_truncate(inode))
+ goto bad_orphan;
+
if (NEXT_ORPHAN(inode) > max_ino)
goto bad_orphan;
brelse(bitmap_bh);
@@ -690,6 +698,7 @@ bad_orphan:
printk(KERN_NOTICE "NEXT_ORPHAN(inode)=%u\n",
NEXT_ORPHAN(inode));
printk(KERN_NOTICE "max_ino=%lu\n", max_ino);
+ printk(KERN_NOTICE "i_nlink=%lu\n", inode->i_nlink);
/* Avoid freeing blocks if we got a bad deleted inode */
if (inode->i_nlink == 0)
inode->i_blocks = 0;
diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c
index 6ae4ecf..7b7e234 100644
--- a/fs/ext3/inode.c
+++ b/fs/ext3/inode.c
@@ -2253,6 +2253,14 @@ static void ext3_free_branches(handle_t *handle, struct inode *inode,
}
}

+int ext3_can_truncate(struct inode *inode)
+{
+ return (S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode) ||
+ S_ISLNK(inode->i_mode)) &&
+ !ext3_inode_is_fast_symlink(inode) &&
+ !(IS_APPEND(inode) || IS_IMMUTABLE(inode));
+}
+
/*
* ext3_truncate()
*
@@ -2297,12 +2305,7 @@ void ext3_truncate(struct inode *inode)
unsigned blocksize = inode->i_sb->s_blocksize;
struct page *page;

- if (!(S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode) ||
- S_ISLNK(inode->i_mode)))
- return;
- if (ext3_inode_is_fast_symlink(inode))
- return;
- if (IS_APPEND(inode) || IS_IMMUTABLE(inode))
+ if (!ext3_can_truncate(inode))
return;

/*
diff --git a/include/linux/ext3_fs.h b/include/linux/ext3_fs.h
index 36c5403..80171ee 100644
--- a/include/linux/ext3_fs.h
+++ b/include/linux/ext3_fs.h
@@ -832,6 +832,7 @@ extern void ext3_discard_reservation (struct inode *);
extern void ext3_dirty_inode(struct inode *);
extern int ext3_change_inode_journal_flag(struct inode *, int);
extern int ext3_get_inode_loc(struct inode *, struct ext3_iloc *);
+extern int ext3_can_truncate(struct inode *inode);
extern void ext3_truncate (struct inode *);
extern void ext3_set_inode_flags(struct inode *);
extern void ext3_get_inode_flags(struct ext3_inode_info *);


2008-06-24 00:10:53

by Duane Griffin

[permalink] [raw]
Subject: Re: [PATCH] ext3: handle corrupted orphan list at mount

2008/6/23 Sami Liedes <[email protected]>:
> You people working hard to fix bugs and implement great filesystems
> almost makes me feel bad for trying to break your code :)

Not at all, it gives me something productive to do with my time! ;)

> Here I get (on x86 gcc 4.3.1):
>
> fs/ext3/ialloc.c: In function 'ext3_orphan_get':
> fs/ext3/ialloc.c:701: warning: format '%lu' expects type 'long unsigned int', but argument 2 has type 'unsigned int'
>
> So it probably should be %u or something.

Thanks, fixed for the next version.

Cheers,
Duane.

--
"I never could learn to drink that blood and call it wine" - Bob Dylan

2008-06-24 00:14:07

by Sami Liedes

[permalink] [raw]
Subject: Re: [PATCH] ext3: handle corrupted orphan list at mount

On Mon, Jun 23, 2008 at 10:56:40PM +0100, Duane Griffin wrote:
> If the orphan node list includes valid, untruncatable nodes with nlink > 0
> the ext3_orphan_cleanup loop which attempts to delete them will not do so,
> causing it to loop forever. Fix by checking for such nodes in the
> ext3_orphan_get function.

You people working hard to fix bugs and implement great filesystems
almost makes me feel bad for trying to break your code :)

> diff --git a/fs/ext3/ialloc.c b/fs/ext3/ialloc.c
> index 7712682..bc030f4 100644
> --- a/fs/ext3/ialloc.c
> +++ b/fs/ext3/ialloc.c
[...]
> @@ -690,6 +698,7 @@ bad_orphan:
> printk(KERN_NOTICE "NEXT_ORPHAN(inode)=%u\n",
> NEXT_ORPHAN(inode));
> printk(KERN_NOTICE "max_ino=%lu\n", max_ino);
> + printk(KERN_NOTICE "i_nlink=%lu\n", inode->i_nlink);
^^^

Here I get (on x86 gcc 4.3.1):

fs/ext3/ialloc.c: In function 'ext3_orphan_get':
fs/ext3/ialloc.c:701: warning: format '%lu' expects type 'long unsigned int', but argument 2 has type 'unsigned int'

So it probably should be %u or something.

Sami

2008-06-24 16:08:22

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH] ext3: handle corrupted orphan list at mount

> If the orphan node list includes valid, untruncatable nodes with nlink > 0
> the ext3_orphan_cleanup loop which attempts to delete them will not do so,
> causing it to loop forever. Fix by checking for such nodes in the
> ext3_orphan_get function.
>
> This patch fixes the second case (image hdb.20000009.softlockup.gz)
> reported in http://bugzilla.kernel.org/show_bug.cgi?id=10882.
>
> Signed-off-by: Duane Griffin <[email protected]>
> --
>
> Note that we can still end up in an infinite loop if the ext3_truncate
> errors out early (and keeps doing so). I'm not sure if we should worry
> about that. If we did want to handle it then we could change ext3_truncate
> to return success/failure and exit the ext3_orphan_cleanup.
>
> ---
>
> diff --git a/fs/ext3/ialloc.c b/fs/ext3/ialloc.c
> index 7712682..bc030f4 100644
> --- a/fs/ext3/ialloc.c
> +++ b/fs/ext3/ialloc.c
> @@ -669,6 +669,14 @@ struct inode *ext3_orphan_get(struct super_block *sb, unsigned long ino)
> if (IS_ERR(inode))
> goto iget_failed;
>
> + /*
> + * If the orphans has i_nlinks > 0 then it should be able to be
> + * truncated, otherwise it won't be removed from the orphan list
> + * during processing and an infinite loop will result.
> + */
> + if (inode->i_nlink && !ext3_can_truncate(inode))
> + goto bad_orphan;
> +
Maybe I miss something but shouldn't above rather be ||?

Honza

> if (NEXT_ORPHAN(inode) > max_ino)
> goto bad_orphan;
> brelse(bitmap_bh);
> @@ -690,6 +698,7 @@ bad_orphan:
> printk(KERN_NOTICE "NEXT_ORPHAN(inode)=%u\n",
> NEXT_ORPHAN(inode));
> printk(KERN_NOTICE "max_ino=%lu\n", max_ino);
> + printk(KERN_NOTICE "i_nlink=%lu\n", inode->i_nlink);
> /* Avoid freeing blocks if we got a bad deleted inode */
> if (inode->i_nlink == 0)
> inode->i_blocks = 0;
> diff --git a/fs/ext3/inode.c b/fs/ext3/inode.c
> index 6ae4ecf..7b7e234 100644
> --- a/fs/ext3/inode.c
> +++ b/fs/ext3/inode.c
> @@ -2253,6 +2253,14 @@ static void ext3_free_branches(handle_t *handle, struct inode *inode,
> }
> }
>
> +int ext3_can_truncate(struct inode *inode)
> +{
> + return (S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode) ||
> + S_ISLNK(inode->i_mode)) &&
> + !ext3_inode_is_fast_symlink(inode) &&
> + !(IS_APPEND(inode) || IS_IMMUTABLE(inode));
> +}
> +
> /*
> * ext3_truncate()
> *
> @@ -2297,12 +2305,7 @@ void ext3_truncate(struct inode *inode)
> unsigned blocksize = inode->i_sb->s_blocksize;
> struct page *page;
>
> - if (!(S_ISREG(inode->i_mode) || S_ISDIR(inode->i_mode) ||
> - S_ISLNK(inode->i_mode)))
> - return;
> - if (ext3_inode_is_fast_symlink(inode))
> - return;
> - if (IS_APPEND(inode) || IS_IMMUTABLE(inode))
> + if (!ext3_can_truncate(inode))
> return;
>
> /*
> diff --git a/include/linux/ext3_fs.h b/include/linux/ext3_fs.h
> index 36c5403..80171ee 100644
> --- a/include/linux/ext3_fs.h
> +++ b/include/linux/ext3_fs.h
> @@ -832,6 +832,7 @@ extern void ext3_discard_reservation (struct inode *);
> extern void ext3_dirty_inode(struct inode *);
> extern int ext3_change_inode_journal_flag(struct inode *, int);
> extern int ext3_get_inode_loc(struct inode *, struct ext3_iloc *);
> +extern int ext3_can_truncate(struct inode *inode);
> extern void ext3_truncate (struct inode *);
> extern void ext3_set_inode_flags(struct inode *);
> extern void ext3_get_inode_flags(struct ext3_inode_info *);
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
--
Jan Kara <[email protected]>
SuSE CR Labs

2008-06-24 17:16:53

by Duane Griffin

[permalink] [raw]
Subject: Re: [PATCH] ext3: handle corrupted orphan list at mount

2008/6/24 Jan Kara <[email protected]>:
>> + /*
>> + * If the orphans has i_nlinks > 0 then it should be able to be
>> + * truncated, otherwise it won't be removed from the orphan list
>> + * during processing and an infinite loop will result.
>> + */
>> + if (inode->i_nlink && !ext3_can_truncate(inode))
>> + goto bad_orphan;
>> +
> Maybe I miss something but shouldn't above rather be ||?

No, it is correct. If i_nlink == 0 the orphan will be deleted in the
cleanup loop by the iput. If i_nlink > 0 then ext3_truncate is called,
which usually calls ext3_orphan_del on the way out, thereby removing
the node from the orphan list. However, if it exits too early
(basically if the ext3_can_truncate check fails, although there are
other failure conditions such as OOM that can also cause it to exit
early) then it doesn't, hence we end up in the infinite loop. So the
check here says, if this node is not going to be deleted or truncated
then it is invalid.

Cheers,
Duane.

--
"I never could learn to drink that blood and call it wine" - Bob Dylan

2008-06-24 17:19:16

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH] ext3: handle corrupted orphan list at mount

On Tue 24-06-08 18:16:21, Duane Griffin wrote:
> 2008/6/24 Jan Kara <[email protected]>:
> >> + /*
> >> + * If the orphans has i_nlinks > 0 then it should be able to be
> >> + * truncated, otherwise it won't be removed from the orphan list
> >> + * during processing and an infinite loop will result.
> >> + */
> >> + if (inode->i_nlink && !ext3_can_truncate(inode))
> >> + goto bad_orphan;
> >> +
> > Maybe I miss something but shouldn't above rather be ||?
>
> No, it is correct. If i_nlink == 0 the orphan will be deleted in the
> cleanup loop by the iput. If i_nlink > 0 then ext3_truncate is called,
> which usually calls ext3_orphan_del on the way out, thereby removing
> the node from the orphan list. However, if it exits too early
> (basically if the ext3_can_truncate check fails, although there are
> other failure conditions such as OOM that can also cause it to exit
> early) then it doesn't, hence we end up in the infinite loop. So the
> check here says, if this node is not going to be deleted or truncated
> then it is invalid.
Ah, OK, I forgot that you want to handle also truncation without deletion
of the inode itself. Thanks for explanation.

Honza

--
Jan Kara <[email protected]>
SUSE Labs, CR