2010-02-25 21:20:16

by Dmitry Monakhov

[permalink] [raw]
Subject: [PATCH 1/3] ext4: fix error handling in migrate

Set i_nlink to zero for temporary inode from very beginning.
otherwise we may fail to start new journal handle and this
inode will be unreferenced but with i_nlink == 1
Since we hold inode reference it can not be pruned.

Also add missed journal_start retval check.

Signed-off-by: Dmitry Monakhov <[email protected]>
---
fs/ext4/migrate.c | 29 ++++++++++++++---------------
1 files changed, 14 insertions(+), 15 deletions(-)

diff --git a/fs/ext4/migrate.c b/fs/ext4/migrate.c
index 46a4101..8b87bd0 100644
--- a/fs/ext4/migrate.c
+++ b/fs/ext4/migrate.c
@@ -503,14 +503,10 @@ int ext4_ext_migrate(struct inode *inode)
}
i_size_write(tmp_inode, i_size_read(inode));
/*
- * We don't want the inode to be reclaimed
- * if we got interrupted in between. We have
- * this tmp inode carrying reference to the
- * data blocks of the original file. We set
- * the i_nlink to zero at the last stage after
- * switching the original file to extent format
+ * Set the i_nlink to zero so it will be deleted later
+ * when we drop inode reference.
*/
- tmp_inode->i_nlink = 1;
+ tmp_inode->i_nlink = 0;

ext4_ext_tree_init(handle, tmp_inode);
ext4_orphan_add(handle, tmp_inode);
@@ -537,6 +533,16 @@ int ext4_ext_migrate(struct inode *inode)
up_read((&EXT4_I(inode)->i_data_sem));

handle = ext4_journal_start(inode, 1);
+ if (IS_ERR(handle)) {
+ /*
+ * It is impossible to update on-disk structures without
+ * a handle, so just rollback in-core changes and live other
+ * work to orphan_list_cleanup()
+ */
+ ext4_orphan_del(NULL, tmp_inode);
+ retval = PTR_ERR(handle);
+ goto out;
+ }

ei = EXT4_I(inode);
i_data = ei->i_data;
@@ -618,15 +624,8 @@ err_out:

/* Reset the extent details */
ext4_ext_tree_init(handle, tmp_inode);
-
- /*
- * Set the i_nlink to zero so that
- * generic_drop_inode really deletes the
- * inode
- */
- tmp_inode->i_nlink = 0;


2010-02-25 21:20:17

by Dmitry Monakhov

[permalink] [raw]
Subject: [PATCH 2/3] ext4: explicitly remove inode from orphan list after failed direct io

Otherwise non empty orphan list will be triggered on umount.

Signed-off-by: Dmitry Monakhov <[email protected]>
---
fs/ext4/inode.c | 3 +++
1 files changed, 3 insertions(+), 0 deletions(-)

diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 8c00127..0383b18 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -3467,6 +3467,9 @@ retry:
* but cannot extend i_size. Bail out and pretend
* the write failed... */
ret = PTR_ERR(handle);
+ if (inode->i_nlink)
+ ext4_orphan_del(NULL, inode);
+
goto out;
}
if (inode->i_nlink)
--
1.6.6


2010-02-25 21:20:19

by Dmitry Monakhov

[permalink] [raw]
Subject: [PATCH 3/3] ext4: Handle non empty on-disk orphan link

In case of truncate errors we explicitly remove inode from in-core
orphan list via orphan_del(NULL, inode) without on-disk list
modification.
But later same inode may be inserted in the orphan list again which
result in on-disk link corruption. If inode i_dtime contains valid
value let skip on-disk list modification.

Signed-off-by: Dmitry Monakhov <[email protected]>
---
fs/ext4/namei.c | 8 ++++++++
1 files changed, 8 insertions(+), 0 deletions(-)

diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 17a17e1..19ca9bf 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -2020,6 +2020,13 @@ int ext4_orphan_add(handle_t *handle, struct inode *inode)
err = ext4_reserve_inode_write(handle, inode, &iloc);
if (err)
goto out_unlock;
+ /*
+ * Due to previous errors inode may be already a part of on-disk
+ * orphan list. If so skipp on-disk list modification.
+ */
+ if (NEXT_ORPHAN(inode) && NEXT_ORPHAN(inode) <=
+ (le32_to_cpu(EXT4_SB(sb)->s_es->s_inodes_count)))
+ goto mem_insert;

/* Insert this inode at the head of the on-disk orphan list... */
NEXT_ORPHAN(inode) = le32_to_cpu(EXT4_SB(sb)->s_es->s_last_orphan);
@@ -2037,6 +2044,7 @@ int ext4_orphan_add(handle_t *handle, struct inode *inode)
*
* This is safe: on error we're going to ignore the orphan list
* anyway on the next recovery. */
+mem_insert:
if (!err)
list_add(&EXT4_I(inode)->i_orphan, &EXT4_SB(sb)->s_orphan);

--
1.6.6


2010-02-25 22:55:52

by Dmitry Monakhov

[permalink] [raw]
Subject: Re: [PATCH 3/3] ext4: Handle non empty on-disk orphan link

Dmitry Monakhov <[email protected]> writes:

> In case of truncate errors we explicitly remove inode from in-core
> orphan list via orphan_del(NULL, inode) without on-disk list
> modification.
> But later same inode may be inserted in the orphan list again which
> result in on-disk link corruption.
There is another "100% reliable" way to solve the issue.
In case of truncate error instead of cleaning in-core inode's list
we may just reinsert it in to another sb->s_orphan_error list.
In this case orphan_add() will works without changes because
!list_empty() check will works as expected. And if later it is
also possible to call orphan_del().
Later we even may try to replay this s_orphan_error list for
example before umount/remount
But this solution has major disadvantage. We can have to pin
inode in to memory to prevent inode pruning.
This is not best choice because usually truncate failed
because of ENOMEM.

That's why i use this not absolutely reliable but simple approach.
> If inode i_dtime contains valid
> value let skip on-disk list modification.

>
> Signed-off-by: Dmitry Monakhov <[email protected]>
> ---
> fs/ext4/namei.c | 8 ++++++++
> 1 files changed, 8 insertions(+), 0 deletions(-)
>
> diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
> index 17a17e1..19ca9bf 100644
> --- a/fs/ext4/namei.c
> +++ b/fs/ext4/namei.c
> @@ -2020,6 +2020,13 @@ int ext4_orphan_add(handle_t *handle, struct inode *inode)
> err = ext4_reserve_inode_write(handle, inode, &iloc);
> if (err)
> goto out_unlock;
> + /*
> + * Due to previous errors inode may be already a part of on-disk
> + * orphan list. If so skipp on-disk list modification.
> + */
> + if (NEXT_ORPHAN(inode) && NEXT_ORPHAN(inode) <=
> + (le32_to_cpu(EXT4_SB(sb)->s_es->s_inodes_count)))
> + goto mem_insert;
>
> /* Insert this inode at the head of the on-disk orphan list... */
> NEXT_ORPHAN(inode) = le32_to_cpu(EXT4_SB(sb)->s_es->s_last_orphan);
> @@ -2037,6 +2044,7 @@ int ext4_orphan_add(handle_t *handle, struct inode *inode)
> *
> * This is safe: on error we're going to ignore the orphan list
> * anyway on the next recovery. */
> +mem_insert:
> if (!err)
> list_add(&EXT4_I(inode)->i_orphan, &EXT4_SB(sb)->s_orphan);

2010-03-02 04:33:15

by Theodore Ts'o

[permalink] [raw]
Subject: Re: [PATCH 3/3] ext4: Handle non empty on-disk orphan link

Thanks, I've added these three patches to the ext4 patch queue.

- Ted