2022-05-26 22:11:46

by Javier Martinez Canillas

[permalink] [raw]
Subject: [PATCH v3 2/3] fat: add renameat2 RENAME_EXCHANGE flag support

The renameat2 RENAME_EXCHANGE flag allows to atomically exchange two paths
but is currently not supported by the Linux vfat filesystem driver.

Add a vfat_rename_exchange() helper function that implements this support.

The super block lock is acquired during the operation to ensure atomicity,
and in the error path actions made are reversed also with the mutex held.

It makes the operation as transactional as possible, within the limitation
impossed by vfat due not having a journal with logs to replay.

Signed-off-by: Javier Martinez Canillas <[email protected]>
---

(no changes since v2)

Changes in v2:
- Only update the new_dir inode version and timestamps if != old_dir
(Alex Larsson).
- Add some helper functions to avoid duplicating code (OGAWA Hirofumi).
- Use braces for multi-lines blocks even if are one statement (OGAWA Hirofumi).
- Mention in commit message that the operation is as transactional as possible
but within the vfat limitations of not having a journal (Colin Walters).

fs/fat/namei_vfat.c | 174 +++++++++++++++++++++++++++++++++++++++++++-
1 file changed, 173 insertions(+), 1 deletion(-)

diff --git a/fs/fat/namei_vfat.c b/fs/fat/namei_vfat.c
index 88ccb2ee3537..97caec8c5207 100644
--- a/fs/fat/namei_vfat.c
+++ b/fs/fat/namei_vfat.c
@@ -1017,13 +1017,185 @@ static int vfat_rename(struct inode *old_dir, struct dentry *old_dentry,
goto out;
}

+/* Helpers for vfat_rename_exchange() */
+
+static int vfat_get_dotdot_info(struct inode *inode, struct buffer_head **dotdot_bh,
+ struct msdos_dir_entry **dotdot_de)
+{
+ if (!S_ISDIR(inode->i_mode))
+ return 0;
+
+ return fat_get_dotdot_entry(inode, dotdot_bh, dotdot_de);
+}
+
+static void vfat_exchange_dentries(struct inode *old_inode, struct inode *new_inode,
+ loff_t old_i_pos, loff_t new_i_pos)
+{
+ fat_detach(old_inode);
+ fat_detach(new_inode);
+
+ fat_attach(old_inode, new_i_pos);
+ fat_attach(new_inode, old_i_pos);
+}
+
+static int vfat_sync_after_exchange(struct inode *dir, struct inode *inode)
+{
+ int err = 0;
+
+ if (IS_DIRSYNC(dir))
+ err = fat_sync_inode(inode);
+ else
+ mark_inode_dirty(inode);
+
+ return err;
+}
+
+static int vfat_update_dotdot_info(struct buffer_head *dotdot_bh, struct msdos_dir_entry *dotdot_de,
+ struct inode *dir, struct inode *inode)
+{
+ int err = 0;
+
+ fat_set_start(dotdot_de, MSDOS_I(dir)->i_logstart);
+ mark_buffer_dirty_inode(dotdot_bh, inode);
+
+ if (IS_DIRSYNC(dir))
+ err = sync_dirty_buffer(dotdot_bh);
+
+ return err;
+}
+
+static void vfat_update_dir_metadata(struct inode *dir, struct timespec64 *ts)
+{
+ inode_inc_iversion(dir);
+ fat_truncate_time(dir, ts, S_CTIME | S_MTIME);
+
+ if (IS_DIRSYNC(dir))
+ (void)fat_sync_inode(dir);
+ else
+ mark_inode_dirty(dir);
+}
+
+static int vfat_rename_exchange(struct inode *old_dir, struct dentry *old_dentry,
+ struct inode *new_dir, struct dentry *new_dentry)
+{
+ struct buffer_head *old_dotdot_bh = NULL, *new_dotdot_bh = NULL;
+ struct msdos_dir_entry *old_dotdot_de = NULL, *new_dotdot_de = NULL;
+ struct inode *old_inode, *new_inode;
+ struct timespec64 ts = current_time(old_dir);
+ loff_t old_i_pos, new_i_pos;
+ int err, corrupt = 0;
+ struct super_block *sb = old_dir->i_sb;
+
+ old_inode = d_inode(old_dentry);
+ new_inode = d_inode(new_dentry);
+
+ /* Acquire super block lock for the operation to be atomic */
+ mutex_lock(&MSDOS_SB(sb)->s_lock);
+
+ /* if directories are not the same, get ".." info to update */
+ if (old_dir != new_dir) {
+ err = vfat_get_dotdot_info(old_inode, &old_dotdot_bh, &old_dotdot_de);
+ if (err)
+ goto out;
+
+ err = vfat_get_dotdot_info(new_inode, &new_dotdot_bh, &new_dotdot_de);
+ if (err)
+ goto out;
+ }
+
+ old_i_pos = MSDOS_I(old_inode)->i_pos;
+ new_i_pos = MSDOS_I(new_inode)->i_pos;
+
+ /* exchange the two dentries */
+ vfat_exchange_dentries(old_inode, new_inode, old_i_pos, new_i_pos);
+
+ err = vfat_sync_after_exchange(old_dir, new_inode);
+ if (err)
+ goto error_exchange;
+
+ err = vfat_sync_after_exchange(new_dir, old_inode);
+ if (err)
+ goto error_exchange;
+
+ /* update ".." directory entry info */
+ if (old_dotdot_de) {
+ err = vfat_update_dotdot_info(old_dotdot_bh, old_dotdot_de, new_dir, old_inode);
+ if (err)
+ goto error_old_dotdot;
+
+ drop_nlink(old_dir);
+ inc_nlink(new_dir);
+ }
+
+ if (new_dotdot_de) {
+ err = vfat_update_dotdot_info(new_dotdot_bh, new_dotdot_de, old_dir, new_inode);
+ if (err)
+ goto error_new_dotdot;
+
+ drop_nlink(new_dir);
+ inc_nlink(old_dir);
+ }
+
+ /* update inode version and timestamps */
+ inode_inc_iversion(old_inode);
+ inode_inc_iversion(new_inode);
+
+ vfat_update_dir_metadata(old_dir, &ts);
+
+ /* if directories are not the same, update new_dir as well */
+ if (old_dir != new_dir)
+ vfat_update_dir_metadata(new_dir, &ts);
+out:
+ brelse(old_dotdot_bh);
+ brelse(new_dotdot_bh);
+ mutex_unlock(&MSDOS_SB(sb)->s_lock);
+
+ return err;
+
+error_new_dotdot:
+ /* data cluster is shared, serious corruption */
+ corrupt = 1;
+
+ if (new_dotdot_de) {
+ corrupt |= vfat_update_dotdot_info(new_dotdot_bh, new_dotdot_de,
+ new_dir, new_inode);
+ }
+
+error_old_dotdot:
+ /* data cluster is shared, serious corruption */
+ corrupt = 1;
+
+ if (old_dotdot_de) {
+ corrupt |= vfat_update_dotdot_info(old_dotdot_bh, old_dotdot_de,
+ old_dir, old_inode);
+ }
+
+error_exchange:
+ vfat_exchange_dentries(old_inode, new_inode, new_i_pos, old_i_pos);
+
+ if (corrupt) {
+ corrupt |= fat_sync_inode(old_inode);
+ corrupt |= fat_sync_inode(new_inode);
+ }
+
+ if (corrupt < 0) {
+ fat_fs_error(new_dir->i_sb,
+ "%s: Filesystem corrupted (i_pos %lld, %lld)",
+ __func__, old_i_pos, new_i_pos);
+ }
+ goto out;
+}
+
static int vfat_rename2(struct user_namespace *mnt_userns, struct inode *old_dir,
struct dentry *old_dentry, struct inode *new_dir,
struct dentry *new_dentry, unsigned int flags)
{
- if (flags & ~RENAME_NOREPLACE)
+ if (flags & ~(RENAME_NOREPLACE | RENAME_EXCHANGE))
return -EINVAL;

+ if (flags & RENAME_EXCHANGE)
+ return vfat_rename_exchange(old_dir, old_dentry, new_dir, new_dentry);
+
/* VFS already handled RENAME_NOREPLACE, handle it as a normal rename */
return vfat_rename(old_dir, old_dentry, new_dir, new_dentry);
}
--
2.36.1



2022-05-31 12:43:09

by OGAWA Hirofumi

[permalink] [raw]
Subject: Re: [PATCH v3 2/3] fat: add renameat2 RENAME_EXCHANGE flag support

Javier Martinez Canillas <[email protected]> writes:

> The renameat2 RENAME_EXCHANGE flag allows to atomically exchange two paths
> but is currently not supported by the Linux vfat filesystem driver.
>
> Add a vfat_rename_exchange() helper function that implements this support.
>
> The super block lock is acquired during the operation to ensure atomicity,
> and in the error path actions made are reversed also with the mutex held.
>
> It makes the operation as transactional as possible, within the limitation
> impossed by vfat due not having a journal with logs to replay.

I tweaked your patch (tested only slightly), can you review and merge to
this patch if ok?

Main purpose of me is to consolidate helpers with vfat_rename(), and
tweak coding style to use existent fat codes.

> + if (old_dir != new_dir) {
> + err = vfat_get_dotdot_info(old_inode, &old_dotdot_bh, &old_dotdot_de);
> + if (err)
> + goto out;
> +
> + err = vfat_get_dotdot_info(new_inode, &new_dotdot_bh, &new_dotdot_de);
> + if (err)
> + goto out;

This should not return -ENOENT here. I tweaked to return -EIO in my patch.

> + /* update inode version and timestamps */
> + inode_inc_iversion(old_inode);
> + inode_inc_iversion(new_inode);

Why do we need to update iversion of those inodes? I couldn't get intent
of this.

> +error_new_dotdot:
> + /* data cluster is shared, serious corruption */

Sharing data cluster would happen here only if one inode success to sync
and another one failed. Then so I/O error, we would not be able to do
much for it.

Thanks.
--
OGAWA Hirofumi <[email protected]>


Signed-off-by: OGAWA Hirofumi <[email protected]>
---
fs/fat/namei_vfat.c | 205 +++++++++++++++++++++------------------------------
1 file changed, 87 insertions(+), 118 deletions(-)

diff --git a/fs/fat/namei_vfat.c b/fs/fat/namei_vfat.c
index 3e3cd4f..84f95eb 100644
--- a/fs/fat/namei_vfat.c 2022-05-31 16:53:56.987009983 +0900
+++ b/fs/fat/namei_vfat.c 2022-05-31 18:37:30.893473188 +0900
@@ -893,16 +893,55 @@ out:
return err;
}

+static int vfat_get_dotdot_de(struct inode *inode, struct buffer_head **bh,
+ struct msdos_dir_entry **de)
+{
+ if (S_ISDIR(inode->i_mode)) {
+ if (fat_get_dotdot_entry(inode, bh, de))
+ return -EIO;
+ }
+ return 0;
+}
+
+static int vfat_sync_ipos(struct inode *dir, struct inode *inode)
+{
+ if (IS_DIRSYNC(dir))
+ return fat_sync_inode(inode);
+ mark_inode_dirty(inode);
+ return 0;
+}
+
+static int vfat_update_dotdot_de(struct inode *dir, struct inode *inode,
+ struct buffer_head *dotdot_bh,
+ struct msdos_dir_entry *dotdot_de)
+{
+ fat_set_start(dotdot_de, MSDOS_I(dir)->i_logstart);
+ mark_buffer_dirty_inode(dotdot_bh, inode);
+ if (IS_DIRSYNC(dir))
+ return sync_dirty_buffer(dotdot_bh);
+ return 0;
+}
+
+static void vfat_update_dir_metadata(struct inode *dir, struct timespec64 *ts)
+{
+ inode_inc_iversion(dir);
+ fat_truncate_time(dir, ts, S_CTIME | S_MTIME);
+ if (IS_DIRSYNC(dir))
+ (void)fat_sync_inode(dir);
+ else
+ mark_inode_dirty(dir);
+}
+
static int vfat_rename(struct inode *old_dir, struct dentry *old_dentry,
struct inode *new_dir, struct dentry *new_dentry)
{
struct buffer_head *dotdot_bh;
- struct msdos_dir_entry *dotdot_de;
+ struct msdos_dir_entry *dotdot_de = NULL;
struct inode *old_inode, *new_inode;
struct fat_slot_info old_sinfo, sinfo;
struct timespec64 ts;
loff_t new_i_pos;
- int err, is_dir, update_dotdot, corrupt = 0;
+ int err, is_dir, corrupt = 0;
struct super_block *sb = old_dir->i_sb;

old_sinfo.bh = sinfo.bh = dotdot_bh = NULL;
@@ -913,15 +952,13 @@ static int vfat_rename(struct inode *old
if (err)
goto out;

- is_dir = S_ISDIR(old_inode->i_mode);
- update_dotdot = (is_dir && old_dir != new_dir);
- if (update_dotdot) {
- if (fat_get_dotdot_entry(old_inode, &dotdot_bh, &dotdot_de)) {
- err = -EIO;
+ if (old_dir != new_dir) {
+ err = vfat_get_dotdot_de(old_inode, &dotdot_bh, &dotdot_de);
+ if (err)
goto out;
- }
}

+ is_dir = S_ISDIR(old_inode->i_mode);
ts = current_time(old_dir);
if (new_inode) {
if (is_dir) {
@@ -942,21 +979,15 @@ static int vfat_rename(struct inode *old

fat_detach(old_inode);
fat_attach(old_inode, new_i_pos);
- if (IS_DIRSYNC(new_dir)) {
- err = fat_sync_inode(old_inode);
+ err = vfat_sync_ipos(new_dir, old_inode);
+ if (err)
+ goto error_inode;
+
+ if (dotdot_de) {
+ err = vfat_update_dotdot_de(new_dir, old_inode, dotdot_bh,
+ dotdot_de);
if (err)
- goto error_inode;
- } else
- mark_inode_dirty(old_inode);
-
- if (update_dotdot) {
- fat_set_start(dotdot_de, MSDOS_I(new_dir)->i_logstart);
- mark_buffer_dirty_inode(dotdot_bh, old_inode);
- if (IS_DIRSYNC(new_dir)) {
- err = sync_dirty_buffer(dotdot_bh);
- if (err)
- goto error_dotdot;
- }
+ goto error_dotdot;
drop_nlink(old_dir);
if (!new_inode)
inc_nlink(new_dir);
@@ -966,12 +997,7 @@ static int vfat_rename(struct inode *old
old_sinfo.bh = NULL;
if (err)
goto error_dotdot;
- inode_inc_iversion(old_dir);
- fat_truncate_time(old_dir, &ts, S_CTIME|S_MTIME);
- if (IS_DIRSYNC(old_dir))
- (void)fat_sync_inode(old_dir);
- else
- mark_inode_dirty(old_dir);
+ vfat_update_dir_metadata(old_dir, &ts);

if (new_inode) {
drop_nlink(new_inode);
@@ -991,10 +1017,9 @@ error_dotdot:
/* data cluster is shared, serious corruption */
corrupt = 1;

- if (update_dotdot) {
- fat_set_start(dotdot_de, MSDOS_I(old_dir)->i_logstart);
- mark_buffer_dirty_inode(dotdot_bh, old_inode);
- corrupt |= sync_dirty_buffer(dotdot_bh);
+ if (dotdot_de) {
+ corrupt |= vfat_update_dotdot_de(old_dir, old_inode, dotdot_bh,
+ dotdot_de);
}
error_inode:
fat_detach(old_inode);
@@ -1021,66 +1046,18 @@ error_inode:
goto out;
}

-/* Helpers for vfat_rename_exchange() */
-
-static int vfat_get_dotdot_info(struct inode *inode, struct buffer_head **dotdot_bh,
- struct msdos_dir_entry **dotdot_de)
-{
- if (!S_ISDIR(inode->i_mode))
- return 0;
-
- return fat_get_dotdot_entry(inode, dotdot_bh, dotdot_de);
-}
-
-static void vfat_exchange_dentries(struct inode *old_inode, struct inode *new_inode,
- loff_t old_i_pos, loff_t new_i_pos)
+static void vfat_exchange_ipos(struct inode *old_inode, struct inode *new_inode,
+ loff_t old_i_pos, loff_t new_i_pos)
{
fat_detach(old_inode);
fat_detach(new_inode);
-
fat_attach(old_inode, new_i_pos);
fat_attach(new_inode, old_i_pos);
}

-static int vfat_sync_after_exchange(struct inode *dir, struct inode *inode)
-{
- int err = 0;
-
- if (IS_DIRSYNC(dir))
- err = fat_sync_inode(inode);
- else
- mark_inode_dirty(inode);
-
- return err;
-}
-
-static int vfat_update_dotdot_info(struct buffer_head *dotdot_bh, struct msdos_dir_entry *dotdot_de,
- struct inode *dir, struct inode *inode)
-{
- int err = 0;
-
- fat_set_start(dotdot_de, MSDOS_I(dir)->i_logstart);
- mark_buffer_dirty_inode(dotdot_bh, inode);
-
- if (IS_DIRSYNC(dir))
- err = sync_dirty_buffer(dotdot_bh);
-
- return err;
-}
-
-static void vfat_update_dir_metadata(struct inode *dir, struct timespec64 *ts)
-{
- inode_inc_iversion(dir);
- fat_truncate_time(dir, ts, S_CTIME | S_MTIME);
-
- if (IS_DIRSYNC(dir))
- (void)fat_sync_inode(dir);
- else
- mark_inode_dirty(dir);
-}
-
-static int vfat_rename_exchange(struct inode *old_dir, struct dentry *old_dentry,
- struct inode *new_dir, struct dentry *new_dentry)
+static int
+vfat_rename_exchange(struct inode *old_dir, struct dentry *old_dentry,
+ struct inode *new_dir, struct dentry *new_dentry)
{
struct buffer_head *old_dotdot_bh = NULL, *new_dotdot_bh = NULL;
struct msdos_dir_entry *old_dotdot_de = NULL, *new_dotdot_de = NULL;
@@ -1098,11 +1075,13 @@ static int vfat_rename_exchange(struct i

/* if directories are not the same, get ".." info to update */
if (old_dir != new_dir) {
- err = vfat_get_dotdot_info(old_inode, &old_dotdot_bh, &old_dotdot_de);
+ err = vfat_get_dotdot_de(old_inode, &old_dotdot_bh,
+ &old_dotdot_de);
if (err)
goto out;

- err = vfat_get_dotdot_info(new_inode, &new_dotdot_bh, &new_dotdot_de);
+ err = vfat_get_dotdot_de(new_inode, &new_dotdot_bh,
+ &new_dotdot_de);
if (err)
goto out;
}
@@ -1110,45 +1089,41 @@ static int vfat_rename_exchange(struct i
old_i_pos = MSDOS_I(old_inode)->i_pos;
new_i_pos = MSDOS_I(new_inode)->i_pos;

- /* exchange the two dentries */
- vfat_exchange_dentries(old_inode, new_inode, old_i_pos, new_i_pos);
+ vfat_exchange_ipos(old_inode, new_inode, old_i_pos, new_i_pos);

- err = vfat_sync_after_exchange(old_dir, new_inode);
+ err = vfat_sync_ipos(old_dir, new_inode);
if (err)
goto error_exchange;
-
- err = vfat_sync_after_exchange(new_dir, old_inode);
+ err = vfat_sync_ipos(new_dir, old_inode);
if (err)
goto error_exchange;

/* update ".." directory entry info */
if (old_dotdot_de) {
- err = vfat_update_dotdot_info(old_dotdot_bh, old_dotdot_de, new_dir, old_inode);
+ err = vfat_update_dotdot_de(new_dir, old_inode, old_dotdot_bh,
+ old_dotdot_de);
if (err)
goto error_old_dotdot;
-
drop_nlink(old_dir);
inc_nlink(new_dir);
}

if (new_dotdot_de) {
- err = vfat_update_dotdot_info(new_dotdot_bh, new_dotdot_de, old_dir, new_inode);
+ err = vfat_update_dotdot_de(old_dir, new_inode, new_dotdot_bh,
+ new_dotdot_de);
if (err)
goto error_new_dotdot;
-
drop_nlink(new_dir);
inc_nlink(old_dir);
}

- /* update inode version and timestamps */
inode_inc_iversion(old_inode);
inode_inc_iversion(new_inode);
-
vfat_update_dir_metadata(old_dir, &ts);
-
/* if directories are not the same, update new_dir as well */
if (old_dir != new_dir)
vfat_update_dir_metadata(new_dir, &ts);
+
out:
brelse(old_dotdot_bh);
brelse(new_dotdot_bh);
@@ -1157,30 +1132,21 @@ out:
return err;

error_new_dotdot:
- /* data cluster is shared, serious corruption */
- corrupt = 1;
-
if (new_dotdot_de) {
- corrupt |= vfat_update_dotdot_info(new_dotdot_bh, new_dotdot_de,
- new_dir, new_inode);
+ corrupt |= vfat_update_dotdot_de(new_dir, new_inode,
+ new_dotdot_bh, new_dotdot_de);
}

error_old_dotdot:
- /* data cluster is shared, serious corruption */
- corrupt = 1;
-
if (old_dotdot_de) {
- corrupt |= vfat_update_dotdot_info(old_dotdot_bh, old_dotdot_de,
- old_dir, old_inode);
+ corrupt |= vfat_update_dotdot_de(old_dir, old_inode,
+ old_dotdot_bh, old_dotdot_de);
}

error_exchange:
- vfat_exchange_dentries(old_inode, new_inode, new_i_pos, old_i_pos);
-
- if (corrupt) {
- corrupt |= fat_sync_inode(old_inode);
- corrupt |= fat_sync_inode(new_inode);
- }
+ vfat_exchange_ipos(old_inode, new_inode, new_i_pos, old_i_pos);
+ corrupt |= vfat_sync_ipos(new_dir, new_inode);
+ corrupt |= vfat_sync_ipos(old_dir, old_inode);

if (corrupt < 0) {
fat_fs_error(new_dir->i_sb,
@@ -1190,15 +1156,18 @@ error_exchange:
goto out;
}

-static int vfat_rename2(struct user_namespace *mnt_userns, struct inode *old_dir,
+static int vfat_rename2(struct user_namespace *mnt_userns,
+ struct inode *old_dir,
struct dentry *old_dentry, struct inode *new_dir,
struct dentry *new_dentry, unsigned int flags)
{
if (flags & ~(RENAME_NOREPLACE | RENAME_EXCHANGE))
return -EINVAL;

- if (flags & RENAME_EXCHANGE)
- return vfat_rename_exchange(old_dir, old_dentry, new_dir, new_dentry);
+ if (flags & RENAME_EXCHANGE) {
+ return vfat_rename_exchange(old_dir, old_dentry,
+ new_dir, new_dentry);
+ }

/* VFS already handled RENAME_NOREPLACE, handle it as a normal rename */
return vfat_rename(old_dir, old_dentry, new_dir, new_dentry);
_

2022-06-01 06:54:27

by OGAWA Hirofumi

[permalink] [raw]
Subject: Re: [PATCH v3 2/3] fat: add renameat2 RENAME_EXCHANGE flag support

Javier Martinez Canillas <[email protected]> writes:

>> Main purpose of me is to consolidate helpers with vfat_rename(), and
>> tweak coding style to use existent fat codes.
>>
>
> Indeed. What do you think of the following plan for v4 ?
>
> 1) Keep patch "fat: add a vfat_rename2() and make existing .rename callback a helper"
> as the first patch of the series.
> 2) Add a patch #2 with your authorship that adds the helper and use them in the
> vfat_rename() function.
> 3) Make this patch "fat: add renameat2 RENAME_EXCHANGE flag support" to be patch #3
> and use the helpers introduced in patch #2.
> 4) Make patch #4 to not only add a test for RENAME_EXCHANGE but also for renameat()
> and renameat2(..., RENAME_NOREPLACE). That way it will also cover your changes in
> patch #2.

I don't care much about it because whole is not big (in short, I'm ok
with even one patch), so the point is the patches should be able to
bisect easily if separated.

>>> + /* update inode version and timestamps */
>>> + inode_inc_iversion(old_inode);
>>> + inode_inc_iversion(new_inode);
>>
>> Why do we need to update iversion of those inodes? I couldn't get intent
>> of this.
>>
>
> To be honest, I wasn't sure about this either but I saw that the implementation
> of RENAME_EXCHANGE in other filesystems did. For example btrfs_rename_exchange().

Ok. If I'm not overlooking, it looks like only btrfs. Please remove
those inode_inc_iversion() for {new,old}_inode.

Thanks.
--
OGAWA Hirofumi <[email protected]>

2022-06-01 20:05:27

by Javier Martinez Canillas

[permalink] [raw]
Subject: Re: [PATCH v3 2/3] fat: add renameat2 RENAME_EXCHANGE flag support

Hello OGAWA,

Thanks a lot for your feedback and comments.

On 5/31/22 11:51, OGAWA Hirofumi wrote:
> Javier Martinez Canillas <[email protected]> writes:
>
>> The renameat2 RENAME_EXCHANGE flag allows to atomically exchange two paths
>> but is currently not supported by the Linux vfat filesystem driver.
>>
>> Add a vfat_rename_exchange() helper function that implements this support.
>>
>> The super block lock is acquired during the operation to ensure atomicity,
>> and in the error path actions made are reversed also with the mutex held.
>>
>> It makes the operation as transactional as possible, within the limitation
>> impossed by vfat due not having a journal with logs to replay.
>
> I tweaked your patch (tested only slightly), can you review and merge to
> this patch if ok?
>

Your changes make a lot of sense to me and from a quick glance they look OK
to me. I'll look at them in more detail when merging the changes for a v4.

> Main purpose of me is to consolidate helpers with vfat_rename(), and
> tweak coding style to use existent fat codes.
>

Indeed. What do you think of the following plan for v4 ?

1) Keep patch "fat: add a vfat_rename2() and make existing .rename callback a helper"
as the first patch of the series.
2) Add a patch #2 with your authorship that adds the helper and use them in the
vfat_rename() function.
3) Make this patch "fat: add renameat2 RENAME_EXCHANGE flag support" to be patch #3
and use the helpers introduced in patch #2.
4) Make patch #4 to not only add a test for RENAME_EXCHANGE but also for renameat()
and renameat2(..., RENAME_NOREPLACE). That way it will also cover your changes in
patch #2.

>> + if (old_dir != new_dir) {
>> + err = vfat_get_dotdot_info(old_inode, &old_dotdot_bh, &old_dotdot_de);
>> + if (err)
>> + goto out;
>> +
>> + err = vfat_get_dotdot_info(new_inode, &new_dotdot_bh, &new_dotdot_de);
>> + if (err)
>> + goto out;
>
> This should not return -ENOENT here. I tweaked to return -EIO in my patch.
>

Got it. I'm OK with that.

>> + /* update inode version and timestamps */
>> + inode_inc_iversion(old_inode);
>> + inode_inc_iversion(new_inode);
>
> Why do we need to update iversion of those inodes? I couldn't get intent
> of this.
>

To be honest, I wasn't sure about this either but I saw that the implementation
of RENAME_EXCHANGE in other filesystems did. For example btrfs_rename_exchange().

>> +error_new_dotdot:
>> + /* data cluster is shared, serious corruption */
>
> Sharing data cluster would happen here only if one inode success to sync
> and another one failed. Then so I/O error, we would not be able to do
> much for it.
>

I see. Thanks for the clarification.

--
Best regards,

Javier Martinez Canillas
Linux Engineering
Red Hat


2022-06-01 20:24:39

by Javier Martinez Canillas

[permalink] [raw]
Subject: Re: [PATCH v3 2/3] fat: add renameat2 RENAME_EXCHANGE flag support

On 5/31/22 14:41, OGAWA Hirofumi wrote:
> Javier Martinez Canillas <[email protected]> writes:
>
>>> Main purpose of me is to consolidate helpers with vfat_rename(), and
>>> tweak coding style to use existent fat codes.
>>>
>>
>> Indeed. What do you think of the following plan for v4 ?
>>
>> 1) Keep patch "fat: add a vfat_rename2() and make existing .rename callback a helper"
>> as the first patch of the series.
>> 2) Add a patch #2 with your authorship that adds the helper and use them in the
>> vfat_rename() function.
>> 3) Make this patch "fat: add renameat2 RENAME_EXCHANGE flag support" to be patch #3
>> and use the helpers introduced in patch #2.
>> 4) Make patch #4 to not only add a test for RENAME_EXCHANGE but also for renameat()
>> and renameat2(..., RENAME_NOREPLACE). That way it will also cover your changes in
>> patch #2.
>
> I don't care much about it because whole is not big (in short, I'm ok
> with even one patch), so the point is the patches should be able to
> bisect easily if separated.
>

Yes, git bisect-ability is why I mentioned that we could do it in separate patches
but I'll integrate your changes now and see what approach I take depending on how
the code looks then.

>>>> + /* update inode version and timestamps */
>>>> + inode_inc_iversion(old_inode);
>>>> + inode_inc_iversion(new_inode);
>>>
>>> Why do we need to update iversion of those inodes? I couldn't get intent
>>> of this.
>>>
>>
>> To be honest, I wasn't sure about this either but I saw that the implementation
>> of RENAME_EXCHANGE in other filesystems did. For example btrfs_rename_exchange().
>
> Ok. If I'm not overlooking, it looks like only btrfs. Please remove
> those inode_inc_iversion() for {new,old}_inode.
>

Sure.

--
Best regards,

Javier Martinez Canillas
Linux Engineering
Red Hat