ext4: delayed allocation i_blocks fix for stat(2)
From: Mingming Cao <[email protected]>
Right now i_blocks is not getting updated until the disks are actually
allocaed on disk. This means with delayed allocation, right after files
are copied, "ls -sF" shoes the file as taking 0 blocks on disk. "du"
also shows the files taking zero space, which is highly confusing to the user.
Since current delayed allocation already keep track of per-inode total number
of blocks that are subject to delayed allocation, this patch fix this by using
that to adjust the value returned by stat(2). When real block allocation
is done, the i_blocks will get updated. Since the reserved blocks for delayed
allocation will be decreased, this will be keep value returned by stat(2)
consistent.
Signed-off-by: Mingming Cao <[email protected]>
---
fs/ext4/ext4.h | 2 ++
fs/ext4/file.c | 1 +
fs/ext4/inode.c | 26 ++++++++++++++++++++++++++
3 files changed, 29 insertions(+)
Index: linux-2.6.26-rc9/fs/ext4/ext4.h
===================================================================
--- linux-2.6.26-rc9.orig/fs/ext4/ext4.h 2008-07-07 16:29:23.000000000 -0700
+++ linux-2.6.26-rc9/fs/ext4/ext4.h 2008-07-07 16:32:54.000000000 -0700
@@ -1059,6 +1059,8 @@ int ext4_get_blocks_handle(handle_t *han
extern struct inode *ext4_iget(struct super_block *, unsigned long);
extern int ext4_write_inode (struct inode *, int);
extern int ext4_setattr (struct dentry *, struct iattr *);
+extern int ext4_getattr(struct vfsmount *mnt, struct dentry *dentry,
+ struct kstat *stat);
extern void ext4_delete_inode (struct inode *);
extern int ext4_sync_inode (handle_t *, struct inode *);
extern void ext4_discard_reservation (struct inode *);
Index: linux-2.6.26-rc9/fs/ext4/file.c
===================================================================
--- linux-2.6.26-rc9.orig/fs/ext4/file.c 2008-07-07 16:29:21.000000000 -0700
+++ linux-2.6.26-rc9/fs/ext4/file.c 2008-07-07 16:29:51.000000000 -0700
@@ -161,6 +161,7 @@ const struct file_operations ext4_file_o
const struct inode_operations ext4_file_inode_operations = {
.truncate = ext4_truncate,
.setattr = ext4_setattr,
+ .getattr = ext4_getattr,
#ifdef CONFIG_EXT4DEV_FS_XATTR
.setxattr = generic_setxattr,
.getxattr = generic_getxattr,
Index: linux-2.6.26-rc9/fs/ext4/inode.c
===================================================================
--- linux-2.6.26-rc9.orig/fs/ext4/inode.c 2008-07-07 16:29:22.000000000 -0700
+++ linux-2.6.26-rc9/fs/ext4/inode.c 2008-07-07 16:31:13.000000000 -0700
@@ -4203,6 +4203,32 @@ err_out:
return error;
}
+int ext4_getattr(struct vfsmount *mnt, struct dentry *dentry,
+ struct kstat *stat)
+{
+ struct inode *inode;
+ unsigned long delalloc_blocks;
+
+ inode = dentry->d_inode;
+ generic_fillattr(inode, stat);
+
+ /*
+ * We can't update i_blocks if the block allocation is delayed
+ * otherwise in the case of system crash before the real block
+ * allocation is done, we will have i_blocks inconsistent with
+ * on-disk file blocks.
+ * We always keep i_blocks updated together with real
+ * allocation. But to not confuse with user, stat
+ * will return the blocks that include the delayed allocation
+ * blocks for this file.
+ */
+ spin_lock(&EXT4_I(inode)->i_block_reservation_lock);
+ delalloc_blocks = EXT4_I(inode)->i_reserved_data_blocks;
+ spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
+
+ stat->blocks += (delalloc_blocks << inode->i_sb->s_blocksize_bits)>>9;
+ return 0;
+}
/*
* How many blocks doth make a writepage()?
On Mon, Jul 07, 2008 at 04:36:39PM -0700, Mingming Cao wrote:
> ext4: delayed allocation i_blocks fix for stat(2)
Looks good, thanks!!
Signed-off-by: "Theodore Ts'o" <[email protected]>
I'm going to do a quick test and probably push out a 2.6.26-rc9-ext4-1
snapshot later this evening.
- Ted
Mingming Cao wrote:
> ext4: delayed allocation i_blocks fix for stat(2)
>
> From: Mingming Cao <[email protected]>
>
> Right now i_blocks is not getting updated until the disks are actually
> allocaed on disk. This means with delayed allocation, right after files
> are copied, "ls -sF" shoes the file as taking 0 blocks on disk. "du"
> also shows the files taking zero space, which is highly confusing to the user.
>
> Since current delayed allocation already keep track of per-inode total number
> of blocks that are subject to delayed allocation, this patch fix this by using
> that to adjust the value returned by stat(2). When real block allocation
> is done, the i_blocks will get updated. Since the reserved blocks for delayed
> allocation will be decreased, this will be keep value returned by stat(2)
> consistent.
>
> Signed-off-by: Mingming Cao <[email protected]>
Thanks Mingming, looks like just the right approach.
Something about the spinlock for every stat seems heavy-handed to me but
I'll have to give that more thought. :)
-Eric
> ---
> fs/ext4/ext4.h | 2 ++
> fs/ext4/file.c | 1 +
> fs/ext4/inode.c | 26 ++++++++++++++++++++++++++
> 3 files changed, 29 insertions(+)
>
> Index: linux-2.6.26-rc9/fs/ext4/ext4.h
> ===================================================================
> --- linux-2.6.26-rc9.orig/fs/ext4/ext4.h 2008-07-07 16:29:23.000000000 -0700
> +++ linux-2.6.26-rc9/fs/ext4/ext4.h 2008-07-07 16:32:54.000000000 -0700
> @@ -1059,6 +1059,8 @@ int ext4_get_blocks_handle(handle_t *han
> extern struct inode *ext4_iget(struct super_block *, unsigned long);
> extern int ext4_write_inode (struct inode *, int);
> extern int ext4_setattr (struct dentry *, struct iattr *);
> +extern int ext4_getattr(struct vfsmount *mnt, struct dentry *dentry,
> + struct kstat *stat);
> extern void ext4_delete_inode (struct inode *);
> extern int ext4_sync_inode (handle_t *, struct inode *);
> extern void ext4_discard_reservation (struct inode *);
> Index: linux-2.6.26-rc9/fs/ext4/file.c
> ===================================================================
> --- linux-2.6.26-rc9.orig/fs/ext4/file.c 2008-07-07 16:29:21.000000000 -0700
> +++ linux-2.6.26-rc9/fs/ext4/file.c 2008-07-07 16:29:51.000000000 -0700
> @@ -161,6 +161,7 @@ const struct file_operations ext4_file_o
> const struct inode_operations ext4_file_inode_operations = {
> .truncate = ext4_truncate,
> .setattr = ext4_setattr,
> + .getattr = ext4_getattr,
> #ifdef CONFIG_EXT4DEV_FS_XATTR
> .setxattr = generic_setxattr,
> .getxattr = generic_getxattr,
> Index: linux-2.6.26-rc9/fs/ext4/inode.c
> ===================================================================
> --- linux-2.6.26-rc9.orig/fs/ext4/inode.c 2008-07-07 16:29:22.000000000 -0700
> +++ linux-2.6.26-rc9/fs/ext4/inode.c 2008-07-07 16:31:13.000000000 -0700
> @@ -4203,6 +4203,32 @@ err_out:
> return error;
> }
>
> +int ext4_getattr(struct vfsmount *mnt, struct dentry *dentry,
> + struct kstat *stat)
> +{
> + struct inode *inode;
> + unsigned long delalloc_blocks;
> +
> + inode = dentry->d_inode;
> + generic_fillattr(inode, stat);
> +
> + /*
> + * We can't update i_blocks if the block allocation is delayed
> + * otherwise in the case of system crash before the real block
> + * allocation is done, we will have i_blocks inconsistent with
> + * on-disk file blocks.
> + * We always keep i_blocks updated together with real
> + * allocation. But to not confuse with user, stat
> + * will return the blocks that include the delayed allocation
> + * blocks for this file.
> + */
> + spin_lock(&EXT4_I(inode)->i_block_reservation_lock);
> + delalloc_blocks = EXT4_I(inode)->i_reserved_data_blocks;
> + spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
> +
> + stat->blocks += (delalloc_blocks << inode->i_sb->s_blocksize_bits)>>9;
> + return 0;
> +}
>
> /*
> * How many blocks doth make a writepage()?
>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
Eric Sandeen wrote:
> Mingming Cao wrote:
>
>> ext4: delayed allocation i_blocks fix for stat(2)
>>
>> From: Mingming Cao <[email protected]>
>>
>> Right now i_blocks is not getting updated until the disks are actually
>> allocaed on disk. This means with delayed allocation, right after files
>> are copied, "ls -sF" shoes the file as taking 0 blocks on disk. "du"
>> also shows the files taking zero space, which is highly confusing to the user.
>>
>> Since current delayed allocation already keep track of per-inode total number
>> of blocks that are subject to delayed allocation, this patch fix this by using
>> that to adjust the value returned by stat(2). When real block allocation
>> is done, the i_blocks will get updated. Since the reserved blocks for delayed
>> allocation will be decreased, this will be keep value returned by stat(2)
>> consistent.
>>
>> Signed-off-by: Mingming Cao <[email protected]>
>>
>
> Thanks Mingming, looks like just the right approach.
>
> Something about the spinlock for every stat seems heavy-handed to me but
> I'll have to give that more thought. :)
>
>
Since i_reserved_blocks is an unsigned long, it should be possible
to atomically fetch it on all of the supported architectures,
without the use of the spinlock. It seems to me that this spinlock
is not required here.
ps
> -Eric
>
>
>> ---
>> fs/ext4/ext4.h | 2 ++
>> fs/ext4/file.c | 1 +
>> fs/ext4/inode.c | 26 ++++++++++++++++++++++++++
>> 3 files changed, 29 insertions(+)
>>
>> Index: linux-2.6.26-rc9/fs/ext4/ext4.h
>> ===================================================================
>> --- linux-2.6.26-rc9.orig/fs/ext4/ext4.h 2008-07-07 16:29:23.000000000 -0700
>> +++ linux-2.6.26-rc9/fs/ext4/ext4.h 2008-07-07 16:32:54.000000000 -0700
>> @@ -1059,6 +1059,8 @@ int ext4_get_blocks_handle(handle_t *han
>> extern struct inode *ext4_iget(struct super_block *, unsigned long);
>> extern int ext4_write_inode (struct inode *, int);
>> extern int ext4_setattr (struct dentry *, struct iattr *);
>> +extern int ext4_getattr(struct vfsmount *mnt, struct dentry *dentry,
>> + struct kstat *stat);
>> extern void ext4_delete_inode (struct inode *);
>> extern int ext4_sync_inode (handle_t *, struct inode *);
>> extern void ext4_discard_reservation (struct inode *);
>> Index: linux-2.6.26-rc9/fs/ext4/file.c
>> ===================================================================
>> --- linux-2.6.26-rc9.orig/fs/ext4/file.c 2008-07-07 16:29:21.000000000 -0700
>> +++ linux-2.6.26-rc9/fs/ext4/file.c 2008-07-07 16:29:51.000000000 -0700
>> @@ -161,6 +161,7 @@ const struct file_operations ext4_file_o
>> const struct inode_operations ext4_file_inode_operations = {
>> .truncate = ext4_truncate,
>> .setattr = ext4_setattr,
>> + .getattr = ext4_getattr,
>> #ifdef CONFIG_EXT4DEV_FS_XATTR
>> .setxattr = generic_setxattr,
>> .getxattr = generic_getxattr,
>> Index: linux-2.6.26-rc9/fs/ext4/inode.c
>> ===================================================================
>> --- linux-2.6.26-rc9.orig/fs/ext4/inode.c 2008-07-07 16:29:22.000000000 -0700
>> +++ linux-2.6.26-rc9/fs/ext4/inode.c 2008-07-07 16:31:13.000000000 -0700
>> @@ -4203,6 +4203,32 @@ err_out:
>> return error;
>> }
>>
>> +int ext4_getattr(struct vfsmount *mnt, struct dentry *dentry,
>> + struct kstat *stat)
>> +{
>> + struct inode *inode;
>> + unsigned long delalloc_blocks;
>> +
>> + inode = dentry->d_inode;
>> + generic_fillattr(inode, stat);
>> +
>> + /*
>> + * We can't update i_blocks if the block allocation is delayed
>> + * otherwise in the case of system crash before the real block
>> + * allocation is done, we will have i_blocks inconsistent with
>> + * on-disk file blocks.
>> + * We always keep i_blocks updated together with real
>> + * allocation. But to not confuse with user, stat
>> + * will return the blocks that include the delayed allocation
>> + * blocks for this file.
>> + */
>> + spin_lock(&EXT4_I(inode)->i_block_reservation_lock);
>> + delalloc_blocks = EXT4_I(inode)->i_reserved_data_blocks;
>> + spin_unlock(&EXT4_I(inode)->i_block_reservation_lock);
>> +
>> + stat->blocks += (delalloc_blocks << inode->i_sb->s_blocksize_bits)>>9;
>> + return 0;
>> +}
>>
>> /*
>> * How many blocks doth make a writepage()?
>>
>>
>> --
>> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
>> the body of a message to [email protected]
>> More majordomo info at http://vger.kernel.org/majordomo-info.html
>>
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-ext4" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
>
> Eric Sandeen wrote:
> >Mingming Cao wrote:
> >
> >>ext4: delayed allocation i_blocks fix for stat(2)
> >>
> >>From: Mingming Cao <[email protected]>
> >>
> >>Right now i_blocks is not getting updated until the disks are actually
> >>allocaed on disk. This means with delayed allocation, right after files
> >>are copied, "ls -sF" shoes the file as taking 0 blocks on disk. "du"
> >>also shows the files taking zero space, which is highly confusing to the
> >>user.
> >>
> >>Since current delayed allocation already keep track of per-inode total
> >>number
> >>of blocks that are subject to delayed allocation, this patch fix this by
> >>using
> >>that to adjust the value returned by stat(2). When real block allocation
> >>is done, the i_blocks will get updated. Since the reserved blocks for
> >>delayed
> >>allocation will be decreased, this will be keep value returned by stat(2)
> >>consistent.
> >>
> >>Signed-off-by: Mingming Cao <[email protected]>
> >>
> >
> >Thanks Mingming, looks like just the right approach.
> >
> >Something about the spinlock for every stat seems heavy-handed to me but
> >I'll have to give that more thought. :)
> >
> >
>
> Since i_reserved_blocks is an unsigned long, it should be possible
> to atomically fetch it on all of the supported architectures,
> without the use of the spinlock. It seems to me that this spinlock
> is not required here.
Well, it's certainly not nice to rely on this. The clean solution
would be to convert i_reserved_blocks to atomic_t or atomic64_t on archs
that have it...
Honza
--
Jan Kara <[email protected]>
SuSE CR Labs
在 2008-07-08二的 22:02 +0200,Jan Kara写道:
> > Eric Sandeen wrote:
> > >Mingming Cao wrote:
> > >
> > >>ext4: delayed allocation i_blocks fix for stat(2)
> > >>
> > >>From: Mingming Cao <[email protected]>
> > >>
> > >>Right now i_blocks is not getting updated until the disks are actually
> > >>allocaed on disk. This means with delayed allocation, right after files
> > >>are copied, "ls -sF" shoes the file as taking 0 blocks on disk. "du"
> > >>also shows the files taking zero space, which is highly confusing to the
> > >>user.
> > >>
> > >>Since current delayed allocation already keep track of per-inode total
> > >>number
> > >>of blocks that are subject to delayed allocation, this patch fix this by
> > >>using
> > >>that to adjust the value returned by stat(2). When real block allocation
> > >>is done, the i_blocks will get updated. Since the reserved blocks for
> > >>delayed
> > >>allocation will be decreased, this will be keep value returned by stat(2)
> > >>consistent.
> > >>
> > >>Signed-off-by: Mingming Cao <[email protected]>
> > >>
> > >
> > >Thanks Mingming, looks like just the right approach.
> > >
> > >Something about the spinlock for every stat seems heavy-handed to me but
> > >I'll have to give that more thought. :)
> > >
> > >
> >
> > Since i_reserved_blocks is an unsigned long, it should be possible
> > to atomically fetch it on all of the supported architectures,
> > without the use of the spinlock. It seems to me that this spinlock
> > is not required here.
> Well, it's certainly not nice to rely on this. The clean solution
> would be to convert i_reserved_blocks to atomic_t or atomic64_t on archs
> that have it...
>
I was thinking about the same thing when this lock was initially
introduced ... but this lock is protecting three counters, and these
counters are updated/reset in a couple of places, and being used to
calcuate how much per-fs free blocks counter need to be accounted. It's
doable, just need a careful look if we make them all atomic.
Mingming
> Honza