ext4: Add checks whether two inodes are different to move_extent.c
From: Akira Fujita <[email protected]>
If same inode is set to orig and donor in ext4_move_extensts(),
this leads to the deadlock in mext_double_down_{read, write}.
This patch adds checks to mext_double_{down, up}_{read, write}
and if inodes are same, we take/release one semaphore from them.
Signed-off-by: Akira Fujita <[email protected]>
---
fs/ext4/move_extent.c | 14 +++++++++-----
1 files changed, 9 insertions(+), 5 deletions(-)
diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c
index c07a291..e1346cb 100644
--- a/fs/ext4/move_extent.c
+++ b/fs/ext4/move_extent.c
@@ -177,7 +177,8 @@ mext_double_down_read(struct inode *orig_inode, struct inode *donor_inode)
}
down_read(&EXT4_I(first)->i_data_sem);
- down_read(&EXT4_I(second)->i_data_sem);
+ if (first != second)
+ down_read(&EXT4_I(second)->i_data_sem);
}
/**
@@ -203,7 +204,8 @@ mext_double_down_write(struct inode *orig_inode, struct inode *donor_inode)
}
down_write(&EXT4_I(first)->i_data_sem);
- down_write(&EXT4_I(second)->i_data_sem);
+ if (first != second)
+ down_write(&EXT4_I(second)->i_data_sem);
}
/**
@@ -217,7 +219,8 @@ static void
mext_double_up_read(struct inode *orig_inode, struct inode *donor_inode)
{
up_read(&EXT4_I(orig_inode)->i_data_sem);
- up_read(&EXT4_I(donor_inode)->i_data_sem);
+ if (orig_inode != donor_inode)
+ up_read(&EXT4_I(donor_inode)->i_data_sem);
}
/**
@@ -231,7 +234,8 @@ static void
mext_double_up_write(struct inode *orig_inode, struct inode *donor_inode)
{
up_write(&EXT4_I(orig_inode)->i_data_sem);
- up_write(&EXT4_I(donor_inode)->i_data_sem);
+ if (orig_inode != donor_inode)
+ up_write(&EXT4_I(donor_inode)->i_data_sem);
}
/**
@@ -1002,7 +1006,7 @@ mext_check_arguments(struct inode *orig_inode,
}
/* orig and donor should be different file */
- if (orig_inode->i_ino == donor_inode->i_ino) {
+ if (orig_inode == donor_inode) {
ext4_debug("ext4 move extent: The argument files should not "
"be same file [ino:orig %lu, donor %lu]\n",
orig_inode->i_ino, donor_inode->i_ino);
On Sep 25, 2009 17:31 +0900, Akira Fujita wrote:
> ext4: Add checks whether two inodes are different to move_extent.c
>
> From: Akira Fujita <[email protected]>
>
> If same inode is set to orig and donor in ext4_move_extensts(),
> this leads to the deadlock in mext_double_down_{read, write}.
> This patch adds checks to mext_double_{down, up}_{read, write}
> and if inodes are same, we take/release one semaphore from them.
Wouldn't it make more sense to just return an error (or no error
but do nothing) in the case of source/target inode being the same?
I don't see how that would be good to "defragment" the inode onto
itself, just like "cp f f" and "rename f f" don't make sense either.
> Signed-off-by: Akira Fujita <[email protected]>
> ---
> fs/ext4/move_extent.c | 14 +++++++++-----
> 1 files changed, 9 insertions(+), 5 deletions(-)
>
> diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c
> index c07a291..e1346cb 100644
> --- a/fs/ext4/move_extent.c
> +++ b/fs/ext4/move_extent.c
> @@ -177,7 +177,8 @@ mext_double_down_read(struct inode *orig_inode, struct inode *donor_inode)
> }
>
> down_read(&EXT4_I(first)->i_data_sem);
> - down_read(&EXT4_I(second)->i_data_sem);
> + if (first != second)
> + down_read(&EXT4_I(second)->i_data_sem);
> }
>
> /**
> @@ -203,7 +204,8 @@ mext_double_down_write(struct inode *orig_inode, struct inode *donor_inode)
> }
>
> down_write(&EXT4_I(first)->i_data_sem);
> - down_write(&EXT4_I(second)->i_data_sem);
> + if (first != second)
> + down_write(&EXT4_I(second)->i_data_sem);
> }
>
> /**
> @@ -217,7 +219,8 @@ static void
> mext_double_up_read(struct inode *orig_inode, struct inode *donor_inode)
> {
> up_read(&EXT4_I(orig_inode)->i_data_sem);
> - up_read(&EXT4_I(donor_inode)->i_data_sem);
> + if (orig_inode != donor_inode)
> + up_read(&EXT4_I(donor_inode)->i_data_sem);
> }
>
> /**
> @@ -231,7 +234,8 @@ static void
> mext_double_up_write(struct inode *orig_inode, struct inode *donor_inode)
> {
> up_write(&EXT4_I(orig_inode)->i_data_sem);
> - up_write(&EXT4_I(donor_inode)->i_data_sem);
> + if (orig_inode != donor_inode)
> + up_write(&EXT4_I(donor_inode)->i_data_sem);
> }
>
> /**
> @@ -1002,7 +1006,7 @@ mext_check_arguments(struct inode *orig_inode,
> }
>
> /* orig and donor should be different file */
> - if (orig_inode->i_ino == donor_inode->i_ino) {
> + if (orig_inode == donor_inode) {
> ext4_debug("ext4 move extent: The argument files should not "
> "be same file [ino:orig %lu, donor %lu]\n",
> orig_inode->i_ino, donor_inode->i_ino);
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
Cheers, Andreas
--
Andreas Dilger
Sr. Staff Engineer, Lustre Group
Sun Microsystems of Canada, Inc.
On Fri, Sep 25, 2009 at 04:34:53AM -0600, Andreas Dilger wrote:
>
> Wouldn't it make more sense to just return an error (or no error
> but do nothing) in the case of source/target inode being the same?
> I don't see how that would be good to "defragment" the inode onto
> itself, just like "cp f f" and "rename f f" don't make sense either.
The code actually checks to make sure the source and target inodes are
different, but it does so too late. So the following patch should fix
the problem which Akira-san was attempting to solve, without
introducing any extra code complexity (we just move some lines of code
around instead.)
- Ted
commit 7cdf27b7241ef616b4158a26d7d85bd36f499462
Author: Theodore Ts'o <[email protected]>
Date: Mon Sep 28 15:58:29 2009 -0400
ext4: EXT4_IOC_MOVE_EXT: Check for different original and donor inodes first
Move the check to make sure the original and donor inodes are
different earlier, to avoid a potential deadlock by trying to lock the
same inode twice.
Signed-off-by: "Theodore Ts'o" <[email protected]>
diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c
index 5332fd4..25b6b14 100644
--- a/fs/ext4/move_extent.c
+++ b/fs/ext4/move_extent.c
@@ -1001,14 +1001,6 @@ mext_check_arguments(struct inode *orig_inode,
return -EINVAL;
}
- /* orig and donor should be different file */
- if (orig_inode->i_ino == donor_inode->i_ino) {
- ext4_debug("ext4 move extent: The argument files should not "
- "be same file [ino:orig %lu, donor %lu]\n",
- orig_inode->i_ino, donor_inode->i_ino);
- return -EINVAL;
- }
Hi Ted / Andreas,
Theodore Tso wrote:
> On Fri, Sep 25, 2009 at 04:34:53AM -0600, Andreas Dilger wrote:
>> Wouldn't it make more sense to just return an error (or no error
>> but do nothing) in the case of source/target inode being the same?
>> I don't see how that would be good to "defragment" the inode onto
>> itself, just like "cp f f" and "rename f f" don't make sense either.
>
> The code actually checks to make sure the source and target inodes are
> different, but it does so too late. So the following patch should fix
> the problem which Akira-san was attempting to solve, without
> introducing any extra code complexity (we just move some lines of code
> around instead.)
I thought the argument check (orig and donor inodes are different)
should be done in mext_check_arguments()
because this function checks the arguments whether they are fine
(according to its name).
So mext_double_{up, down}_{read, write} which may be called earlier than
mext_check_arguments() should take/release one inode of them,
if orig and donor inodes are same.
And in the point of view of each function (mext_double_{up, down}_{read, write}),
I thought that we have had better to care about the situation
that same inode is passed to it, they are "static" function though.
Anyway, I tested Ted's patch fixes the problem.
Thanks for your work. :-)
Tested-by: Akira Fujita <[email protected]>
Regards,
Akira Fujita
>
> commit 7cdf27b7241ef616b4158a26d7d85bd36f499462
> Author: Theodore Ts'o <[email protected]>
> Date: Mon Sep 28 15:58:29 2009 -0400
>
> ext4: EXT4_IOC_MOVE_EXT: Check for different original and donor inodes first
>
> Move the check to make sure the original and donor inodes are
> different earlier, to avoid a potential deadlock by trying to lock the
> same inode twice.
>
> Signed-off-by: "Theodore Ts'o" <[email protected]>
>
> diff --git a/fs/ext4/move_extent.c b/fs/ext4/move_extent.c
> index 5332fd4..25b6b14 100644
> --- a/fs/ext4/move_extent.c
> +++ b/fs/ext4/move_extent.c
> @@ -1001,14 +1001,6 @@ mext_check_arguments(struct inode *orig_inode,
> return -EINVAL;
> }
>
> - /* orig and donor should be different file */
> - if (orig_inode->i_ino == donor_inode->i_ino) {
> - ext4_debug("ext4 move extent: The argument files should not "
> - "be same file [ino:orig %lu, donor %lu]\n",
> - orig_inode->i_ino, donor_inode->i_ino);
> - return -EINVAL;
> - }
> -
> /* Ext4 move extent supports only extent based file */
> if (!(EXT4_I(orig_inode)->i_flags & EXT4_EXTENTS_FL)) {
> ext4_debug("ext4 move extent: orig file is not extents "
> @@ -1232,6 +1224,14 @@ ext4_move_extents(struct file *o_filp, struct file *d_filp,
> int block_len_in_page;
> int uninit;
>
> + /* orig and donor should be different file */
> + if (orig_inode->i_ino == donor_inode->i_ino) {
> + ext4_debug("ext4 move extent: The argument files should not "
> + "be same file [ino:orig %lu, donor %lu]\n",
> + orig_inode->i_ino, donor_inode->i_ino);
> + return -EINVAL;
> + }
> +
> /* protect orig and donor against a truncate */
> ret1 = mext_inode_double_lock(orig_inode, donor_inode);
> if (ret1 < 0)
>
On Tue, Sep 29, 2009 at 02:18:39PM +0900, Akira Fujita wrote:
> I thought the argument check (orig and donor inodes are different)
> should be done in mext_check_arguments()
Sometimes separating things into multiple functions can actually make
things harder to understand, since it means you have to jump all over
the file to follow what a particular function is doing. Sometimes
having a function which is nested deeply in a loop can be made easier
to understand by separating it out into a separate function which is
called once.
Part of the reason why mext_check_arguments() is so huge is because of
all of the ext4_debug() statements. It's not clear to me they really
all need to be there. I'd also use EBUSY if the file is a swap file,
so it's easier for the userspace code to understand what is going on
in that case. For the other many cases where EINVAL is returned,
hopefully that's obvious enough for userspace to figure out without
having to resort to looking to the system debug logs.
- Ted