2007-05-14 11:10:02

by Cordenner jean noel

Subject: [RFC] [patch 2/2] i_version update - ext4 part

In the ext4 part, the 64-bit i_version counter is stored in two 32-bit
fields of the ext4_inode structure: i_disk_version (low 32 bits) and
i_disk_version_hi (high 32 bits).

Signed-off-by: Jean Noel Cordenner <[email protected]>

fs/ext4/ialloc.c | 1 +
fs/ext4/inode.c | 5 +++++
fs/ext4/super.c | 2 +-
include/linux/ext4_fs.h | 5 +++--

Index: linux-2.6.21-rc4-i_version/fs/ext4/ialloc.c
===================================================================
--- linux-2.6.21-rc4-i_version.orig/fs/ext4/ialloc.c 2007-05-10 17:22:23.000000000 +0200
+++ linux-2.6.21-rc4-i_version/fs/ext4/ialloc.c 2007-05-10 17:26:21.000000000 +0200
@@ -564,6 +564,7 @@
/* This is the optimal IO size (for stat), not the fs block size */
inode->i_blocks = 0;
inode->i_mtime = inode->i_atime = inode->i_ctime = CURRENT_TIME_SEC;
+ inode->i_version = 1;

memset(ei->i_data, 0, sizeof(ei->i_data));
ei->i_dir_start_lookup = 0;
Index: linux-2.6.21-rc4-i_version/fs/ext4/inode.c
===================================================================
--- linux-2.6.21-rc4-i_version.orig/fs/ext4/inode.c 2007-05-10 17:22:23.000000000 +0200
+++ linux-2.6.21-rc4-i_version/fs/ext4/inode.c 2007-05-10 18:14:43.000000000 +0200
@@ -2680,6 +2680,8 @@
inode->i_ctime.tv_sec = le32_to_cpu(raw_inode->i_ctime);
inode->i_mtime.tv_sec = le32_to_cpu(raw_inode->i_mtime);
inode->i_atime.tv_nsec = inode->i_ctime.tv_nsec = inode->i_mtime.tv_nsec = 0;
+ inode->i_version = ((__u64)le32_to_cpu(raw_inode->i_disk_version_hi) << 32) +
+ le32_to_cpu(raw_inode->i_disk_version);

ei->i_state = 0;
ei->i_dir_start_lookup = 0;
@@ -2838,6 +2840,8 @@
raw_inode->i_atime = cpu_to_le32(inode->i_atime.tv_sec);
raw_inode->i_ctime = cpu_to_le32(inode->i_ctime.tv_sec);
raw_inode->i_mtime = cpu_to_le32(inode->i_mtime.tv_sec);
+ raw_inode->i_disk_version = cpu_to_le32(inode->i_version);
+ raw_inode->i_disk_version_hi = cpu_to_le32(inode->i_version >> 32);
raw_inode->i_blocks = cpu_to_le32(inode->i_blocks);
raw_inode->i_dtime = cpu_to_le32(ei->i_dtime);
raw_inode->i_flags = cpu_to_le32(ei->i_flags);
@@ -3111,6 +3115,7 @@
{
int err = 0;

+ inode->i_version++;
/* the do_update_inode consumes one bh->b_count */
get_bh(iloc->bh);

Index: linux-2.6.21-rc4-i_version/include/linux/ext4_fs.h
===================================================================
--- linux-2.6.21-rc4-i_version.orig/include/linux/ext4_fs.h 2007-05-10 17:22:23.000000000 +0200
+++ linux-2.6.21-rc4-i_version/include/linux/ext4_fs.h 2007-05-10 17:26:21.000000000 +0200
@@ -291,7 +291,7 @@
__le32 i_flags; /* File flags */
union {
struct {
- __u32 l_i_reserved1;
+ __u32 l_i_disk_version;
} linux1;
struct {
__u32 h_i_translator;
@@ -331,12 +331,13 @@
} osd2; /* OS dependent 2 */
__le16 i_extra_isize;
__le16 i_pad1;
+ __le32 i_disk_version_hi;
};

#define i_size_high i_dir_acl

#if defined(__KERNEL__) || defined(__linux__)
-#define i_reserved1 osd1.linux1.l_i_reserved1
+#define i_disk_version osd1.linux1.l_i_disk_version
#define i_frag osd2.linux2.l_i_frag
#define i_fsize osd2.linux2.l_i_fsize
#define i_file_acl_high osd2.linux2.l_i_file_acl_high
Index: linux-2.6.21-rc4-i_version/fs/ext4/super.c
===================================================================
--- linux-2.6.21-rc4-i_version.orig/fs/ext4/super.c 2007-05-10 17:22:23.000000000 +0200
+++ linux-2.6.21-rc4-i_version/fs/ext4/super.c 2007-05-10 17:26:21.000000000 +0200
@@ -2804,8 +2804,8 @@
i_size_write(inode, off+len-towrite);
EXT4_I(inode)->i_disksize = inode->i_size;
}
- inode->i_version++;
inode->i_mtime = inode->i_ctime = CURRENT_TIME;
+ inode->i_version = 1;
ext4_mark_inode_dirty(handle, inode);
mutex_unlock(&inode->i_mutex);
return len - towrite;
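
For readers following the patch above, here is a minimal userspace sketch of how a 64-bit counter can be split into and rebuilt from two 32-bit halves. The names disk_version, version_store, version_load and version_example are hypothetical and not part of the patch. Byte-order conversion is left out, so the halves are assumed to be in CPU byte order already. One detail worth noting: the high half is widened to 64 bits before the shift, since shifting a 32-bit value by 32 is undefined in C.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for the two on-disk halves. Values are assumed to be
 * in CPU byte order (the patch uses le32_to_cpu()/cpu_to_le32() for that). */
struct disk_version {
	uint32_t lo;	/* plays the role of i_disk_version */
	uint32_t hi;	/* plays the role of i_disk_version_hi */
};

/* Split a 64-bit counter into its low and high 32-bit halves. */
static struct disk_version version_store(uint64_t v)
{
	struct disk_version d;

	d.lo = (uint32_t)v;		/* low 32 bits */
	d.hi = (uint32_t)(v >> 32);	/* high 32 bits */
	return d;
}

/* Rebuild the counter. The cast to uint64_t happens before the shift,
 * because shifting a 32-bit operand by 32 is undefined behaviour. */
static uint64_t version_load(struct disk_version d)
{
	return ((uint64_t)d.hi << 32) | d.lo;
}

/* Usage: a counter just above 2^32 survives the round trip. */
static void version_example(void)
{
	uint64_t v = (1ULL << 32) + 5;

	assert(version_load(version_store(v)) == v);
}
```

Keeping the split and the recombination in a pair of small helpers makes it harder for the two directions to drift apart.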


2007-05-14 20:21:50

by Andreas Dilger

Subject: Re: [RFC] [patch 2/2] i_version update - ext4 part

On May 14, 2007 13:05 +0200, Cordenner jean noel wrote:
> @@ -331,12 +331,13 @@
> } osd2; /* OS dependent 2 */
> __le16 i_extra_isize;
> __le16 i_pad1;
> + __le32 i_disk_version_hi;

No, this is not correct. There are already several other fields here
(nanosecond ctime, mtime, atime, crtime (creation time)) so you need
to use the correct reserved field for this.

__u16 i_extra_isize;
__u16 i_pad1;
__u32 i_ctime_extra; /* extra Change time (nsec << 2 | epoch) */
__u32 i_mtime_extra; /* extra Modification time (nsec << 2 | epoch)*/
__u32 i_atime_extra; /* extra Access time (nsec << 2 | epoch) */
__u32 i_crtime; /* File creation time */
__u32 i_crtime_extra; /* extra File creation time (nsec << 2 |epoch)*/

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
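
The "(nsec << 2 | epoch)" notation in the comments above can be read as a bit-packing scheme: the nanoseconds occupy the upper 30 bits of the 32-bit field and the epoch bits occupy the lower 2. A rough sketch under that reading follows. The macro and helper names are hypothetical, and the real kernel helpers may differ.

```c
#include <stdint.h>

#define EPOCH_BITS 2				/* low bits reserved for the epoch */
#define EPOCH_MASK ((1u << EPOCH_BITS) - 1)	/* 0x3 */

/* Pack nanoseconds (below 10^9, so they fit in 30 bits) and 2 epoch bits. */
static uint32_t pack_extra(uint32_t nsec, uint32_t epoch)
{
	return (nsec << EPOCH_BITS) | (epoch & EPOCH_MASK);
}

/* Recover the nanoseconds from a packed field. */
static uint32_t extra_nsec(uint32_t extra)
{
	return extra >> EPOCH_BITS;
}

/* Recover the epoch bits from a packed field. */
static uint32_t extra_epoch(uint32_t extra)
{
	return extra & EPOCH_MASK;
}
```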

2007-05-14 20:31:50

by Andreas Dilger

Subject: Re: [RFC] [patch 2/2] i_version update - ext4 part

On May 14, 2007 14:21 -0600, Andreas Dilger wrote:
> On May 14, 2007 13:05 +0200, Cordenner jean noel wrote:
> > @@ -331,12 +331,13 @@
> > } osd2; /* OS dependent 2 */
> > __le16 i_extra_isize;
> > __le16 i_pad1;
> > + __le32 i_disk_version_hi;
>
> No, this is not correct. There are already several other fields here
> (nanosecond ctime, mtime, atime, crtime (creation time)) so you need
> to use the correct reserved field for this.
>
> __u16 i_extra_isize;
> __u16 i_pad1;
> __u32 i_ctime_extra; /* extra Change time (nsec << 2 | epoch) */
> __u32 i_mtime_extra; /* extra Modification time (nsec << 2 | epoch)*/
> __u32 i_atime_extra; /* extra Access time (nsec << 2 | epoch) */
> __u32 i_crtime; /* File creation time */
> __u32 i_crtime_extra; /* extra File creation time (nsec << 2 |epoch)*/

Sorry, I meant to add (before hitting send :-) that the field after
i_crtime_extra is supposed to be "i_disk_version_hi".

See the patch from Kalpak Shah "[RFC] 64-bit inode version" which also handles
the case for expanding i_extra_isize to cover the needed extra fields if
i_extra_isize is not large enough. That patch didn't include the 64-bit
i_version_hi yet, because there wasn't yet agreement at that time if
the iversion_hi should be allocated separately, but that was since decided.

Without that patch, your patch will possibly corrupt the extended attributes
by just overwriting i_disk_version_hi while ignoring the actual value of
i_extra_isize. This would clobber the EA magic and result in loss of all
EAs in that inode.

Cheers, Andreas
--
Andreas Dilger
Principal Software Engineer
Cluster File Systems, Inc.
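
The concern about i_extra_isize can be illustrated with a small userspace sketch. It assumes a simplified extra-area layout based on the field listing quoted in this thread, with i_disk_version_hi placed after i_crtime_extra. The struct extra_fields, the FIELD_FITS macro and store_version_hi() are hypothetical names, not kernel code. The idea is only to refuse the write when the field lies beyond the extra space the inode actually has in use, because otherwise whatever follows (such as in-inode EA data) would be overwritten.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of the area that follows the base inode.
 * i_extra_isize counts the bytes of this area that are actually in use,
 * starting from the i_extra_isize field itself. */
struct extra_fields {
	uint16_t i_extra_isize;
	uint16_t i_pad1;
	uint32_t i_ctime_extra;
	uint32_t i_mtime_extra;
	uint32_t i_atime_extra;
	uint32_t i_crtime;
	uint32_t i_crtime_extra;
	uint32_t i_disk_version_hi;
};

/* True if the whole field lies inside the in-use part of the extra area. */
#define FIELD_FITS(x, field) \
	(offsetof(struct extra_fields, field) + sizeof((x)->field) \
	 <= (size_t)(x)->i_extra_isize)

/* Store the high half only if the inode has room for it.
 * Returns 0 on success, -1 if the write would land past i_extra_isize. */
static int store_version_hi(struct extra_fields *x, uint32_t hi)
{
	if (!FIELD_FITS(x, i_disk_version_hi))
		return -1;	/* would overwrite data after the extra area */
	x->i_disk_version_hi = hi;
	return 0;
}
```

In this layout the field ends at byte 28 of the extra area, so an inode with a smaller i_extra_isize would first need its extra area expanded, which is the case the quoted reply says Kalpak Shah's patch handles.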

2007-05-14 22:11:17

by Mingming Cao

Subject: Re: [RFC] [patch 2/2] i_version update - ext4 part

On Mon, 2007-05-14 at 14:31 -0600, Andreas Dilger wrote:
> On May 14, 2007 14:21 -0600, Andreas Dilger wrote:
> > On May 14, 2007 13:05 +0200, Cordenner jean noel wrote:
> > > @@ -331,12 +331,13 @@
> > > } osd2; /* OS dependent 2 */
> > > __le16 i_extra_isize;
> > > __le16 i_pad1;
> > > + __le32 i_disk_version_hi;
> >
> > No, this is not correct. There are already several other fields here
> > (nanosecond ctime, mtime, atime, crtime (creation time)) so you need
> > to use the correct reserved field for this.
> >
> > __u16 i_extra_isize;
> > __u16 i_pad1;
> > __u32 i_ctime_extra; /* extra Change time (nsec << 2 | epoch) */
> > __u32 i_mtime_extra; /* extra Modification time (nsec << 2 | epoch)*/
> > __u32 i_atime_extra; /* extra Access time (nsec << 2 | epoch) */
> > __u32 i_crtime; /* File creation time */
> > __u32 i_crtime_extra; /* extra File creation time (nsec << 2 |epoch)*/
>
> Sorry, I meant to add (before hitting send :-) that the field after
> i_crtime_extra is supposed to be "i_disk_version_hi".
>


> See the patch from Kalpak Shah "[RFC] 64-bit inode version" which also handles
> the case for expanding i_extra_isize to cover the needed extra fields if
> i_extra_isize is not large enough. That patch didn't include the 64-bit
> i_version_hi yet, because there wasn't yet agreement at that time if
> the iversion_hi should be allocated separately, but that was since decided.
>

Kalpak already sent a patch, "[PATCH] Add i_version_hi for 64-bit version",
which added i_version_hi after i_crtime. His patch has been in the ext4 git
tree for a while.

ftp://ftp.kernel.org/pub/linux/kernel/people/tytso/ext4-patches/2.6.21-ext4-1

Regards,
Mingming

> Without that patch, your patch will possibly corrupt the extended attributes
> by just overwriting i_disk_version_hi while ignoring the actual value of
> i_extra_isize. This would clobber the EA magic and result in loss of all
> EAs in that inode.
>
> Cheers, Andreas
> --
> Andreas Dilger
> Principal Software Engineer
> Cluster File Systems, Inc.