This patch-set makes ext4 file-system stop using the VFS '->write_supers()'
call-back and the '->s_dirt' superblock field because I plan to remove them
once all users are gone.
Like the other similar patch-sets, we switch to a delayed job for writing out
the superblock instead of using the 's_dirt' flag. Additionally, this patch-set
includes several clean-ups.
Tested with xfstests. And thanks to Jan Kara for the help!
Reminder:
The goal is to get rid of the 'sync_supers()' kernel thread. This kernel thread
wakes up every 5 seconds (by default) and calls '->write_super()' for all
mounted file-systems. And the bad thing is that this is done even if all the
superblocks are clean. Moreover, many file-systems do not even need this end
they do not register the '->write_super()' method at all (e.g., btrfs).
So 'sync_supers()' most often just generates useless wake-ups and wastes power.
I am trying to make all file-systems independent of '->write_super()' and plan
to remove 'sync_supers()' and '->write_super()' completely once there are no
more users.
======
Overall status:
1. ext4: patches submitted,
https://lkml.org/lkml/2012/4/2/111
2. exofs: patch submitted,
https://lkml.org/lkml/2012/6/4/211
3. udf: patch submitted, sits in Jan Kara's tree:
https://lkml.org/lkml/2012/6/4/233
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs for_testing
4. affs: patches submitted, sit in Al Viro's tree:
https://lkml.org/lkml/2012/6/6/400
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs for-next
5. hfs: patches submitted, sit Andrew Morton's tree
http://lkml.org/lkml/2012/6/12/82
6. hfsplus: patches submitted, sit in Andre Morton's tree:
https://lkml.org/lkml/2012/6/13/195
7. ext2: done, see commit f72cf5e223a28d3b3ea7dc9e40464fd534e359e8
8. vfat: done, see commit 78491189ddb6d84d4a4abae992ed891a236d0263
9. jffs2: done, see commit 208b14e507c00ff7f108e1a388dd3d8cc805a443
10. reiserfs: done, see commit 033369d1af1264abc23bea2e174aa47cdd212f6f
TODO: sysv, ufs
======
fs/ext4/ext4.h | 10 +---------
fs/ext4/ext4_jbd2.c | 2 +-
fs/ext4/file.c | 14 +++++++++++++-
fs/ext4/ialloc.c | 2 --
fs/ext4/mballoc.c | 2 --
fs/ext4/super.c | 15 ++-------------
6 files changed, 17 insertions(+), 28 deletions(-)
Thanks,
Artem.
From: Jan Kara <[email protected]>
Commit a0375156 properly notes that superblock doesn't need to be marked
as dirty when only number of free inodes / blocks / number of directories
changes since that is recomputed on each mount anyway. However that comment
leaves some unnecessary markings as dirty in place. Remove these.
Artem: tested using xfstests for both journalled and non-journalled ext4.
Signed-off-by: Jan Kara <[email protected]>
Tested-by: Artem Bityutskiy <[email protected]>
Signed-off-by: Artem Bityutskiy <[email protected]>
---
fs/ext4/ialloc.c | 2 --
fs/ext4/mballoc.c | 2 --
2 files changed, 0 insertions(+), 4 deletions(-)
diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
index d48e8b1..25b918c 100644
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -315,7 +315,6 @@ out:
err = ext4_handle_dirty_metadata(handle, NULL, bitmap_bh);
if (!fatal)
fatal = err;
- ext4_mark_super_dirty(sb);
} else
ext4_error(sb, "bit already cleared for inode %lu", ino);
@@ -830,7 +829,6 @@ got:
percpu_counter_dec(&sbi->s_freeinodes_counter);
if (S_ISDIR(mode))
percpu_counter_inc(&sbi->s_dirs_counter);
- ext4_mark_super_dirty(sb);
if (sbi->s_log_groups_per_flex) {
flex_group = ext4_flex_group(sbi, group);
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 1cd6994..eabfb4c 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -2825,7 +2825,6 @@ ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac,
err = ext4_handle_dirty_metadata(handle, NULL, gdp_bh);
out_err:
- ext4_mark_super_dirty(sb);
brelse(bitmap_bh);
return err;
}
@@ -4694,7 +4693,6 @@ do_more:
put_bh(bitmap_bh);
goto do_more;
}
- ext4_mark_super_dirty(sb);
error_return:
brelse(bitmap_bh);
ext4_std_error(sb, err);
--
1.7.7.6
From: Jan Kara <[email protected]>
The last user of ext4_mark_super_dirty() in ext4_file_open() is so rare it
can well be modifying the superblock properly by journalling the change.
Change it and get rid of ext4_mark_super_dirty() as it's not needed anymore.
Artem: small amendments.
Artem: tested using xfstests for both journalled and non-journalled ext4.
Signed-off-by: Jan Kara <[email protected]>
Tested-by: Artem Bityutskiy <[email protected]>
Signed-off-by: Artem Bityutskiy <[email protected]>
---
fs/ext4/ext4.h | 9 ---------
fs/ext4/file.c | 14 +++++++++++++-
2 files changed, 13 insertions(+), 10 deletions(-)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index cfc4e01..0c4042e 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2321,15 +2321,6 @@ static inline void ext4_unlock_group(struct super_block *sb,
spin_unlock(ext4_group_lock_ptr(sb, group));
}
-static inline void ext4_mark_super_dirty(struct super_block *sb)
-{
- struct ext4_super_block *es = EXT4_SB(sb)->s_es;
-
- ext4_superblock_csum_set(sb, es);
- if (EXT4_SB(sb)->s_journal == NULL)
- sb->s_dirt =1;
-}
-
/*
* Block validity checking
*/
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 8c7642a..3a77494 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -181,9 +181,21 @@ static int ext4_file_open(struct inode * inode, struct file * filp)
path.dentry = mnt->mnt_root;
cp = d_path(&path, buf, sizeof(buf));
if (!IS_ERR(cp)) {
+ handle_t *handle;
+ int err;
+
+ handle = ext4_journal_start_sb(sb, 1);
+ if (IS_ERR(handle))
+ return PTR_ERR(handle);
+ err = ext4_journal_get_write_access(handle, sbi->s_sbh);
+ if (err) {
+ ext4_journal_stop(handle);
+ return err;
+ }
strlcpy(sbi->s_es->s_last_mounted, cp,
sizeof(sbi->s_es->s_last_mounted));
- ext4_mark_super_dirty(sb);
+ ext4_handle_dirty_super(handle, sb);
+ ext4_journal_stop(handle);
}
}
/*
--
1.7.7.6
From: Artem Bityutskiy <[email protected]>
We do not depend on VFS's '->write_super()' anymore and do not need the
's_dirt' flag anymore. This patch weeds out 'ext4_write_super()' and the
's_dirt' usage in VFS.
Signed-off-by: Artem Bityutskiy <[email protected]>
---
fs/ext4/super.c | 10 ----------
1 files changed, 0 insertions(+), 10 deletions(-)
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 9b26ba0..546c6ed 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -73,7 +73,6 @@ static const char *ext4_decode_error(struct super_block *sb, int errno,
static int ext4_remount(struct super_block *sb, int *flags, char *data);
static int ext4_statfs(struct dentry *dentry, struct kstatfs *buf);
static int ext4_unfreeze(struct super_block *sb);
-static void ext4_write_super(struct super_block *sb);
static int ext4_freeze(struct super_block *sb);
static struct dentry *ext4_mount(struct file_system_type *fs_type, int flags,
const char *dev_name, void *data);
@@ -1193,7 +1192,6 @@ static const struct super_operations ext4_nojournal_sops = {
.dirty_inode = ext4_dirty_inode,
.drop_inode = ext4_drop_inode,
.evict_inode = ext4_evict_inode,
- .write_super = ext4_write_super,
.put_super = ext4_put_super,
.statfs = ext4_statfs,
.remount_fs = ext4_remount,
@@ -4202,7 +4200,6 @@ int ext4_commit_super(struct super_block *sb, int sync)
es->s_free_inodes_count =
cpu_to_le32(percpu_counter_sum_positive(
&EXT4_SB(sb)->s_freeinodes_counter));
- sb->s_dirt = 0;
BUFFER_TRACE(sbh, "marking dirty");
ext4_superblock_csum_set(sb, es);
mark_buffer_dirty(sbh);
@@ -4309,13 +4306,6 @@ int ext4_force_commit(struct super_block *sb)
return ret;
}
-static void ext4_write_super(struct super_block *sb)
-{
- lock_super(sb);
- ext4_commit_super(sb, 1);
- unlock_super(sb);
-}
-
static int ext4_sync_fs(struct super_block *sb, int wait)
{
int ret = 0;
--
1.7.7.6
From: Artem Bityutskiy <[email protected]>
This patch changes the '__ext4_handle_dirty_super()' function which is used
by ext4 to update the super-block via the journal in the following cases:
1. When creating the first large file on a file
system without EXT4_FEATURE_RO_COMPAT_LARGE_FILE feature.
2. When re-sizing the file-system
3. When creating an xattr on a file-system without the
EXT4_FEATURE_COMPAT_EXT_ATTR feature.
This function, however, falls back to just marking the superblock as dirty
if the file-system has no journal. But it is user really rarely and it does not
give any benefit to use the delayed superblock write (via the VFS's
'sync_supers()' kernel thread) in these cases. We can just write out the
superblock asynchronously instead.
This patch also removes 's_dirt' condition on the unmount path because we never
set it anymore, so we should not test it.
Tested using xfstests for both journalled and non-journalled ext4.
Signed-off-by: Artem Bityutskiy <[email protected]>
---
fs/ext4/ext4.h | 1 +
fs/ext4/ext4_jbd2.c | 2 +-
fs/ext4/super.c | 5 ++---
3 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 0c4042e..b2439d5 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2041,6 +2041,7 @@ extern int ext4_superblock_csum_verify(struct super_block *sb,
struct ext4_super_block *es);
extern void ext4_superblock_csum_set(struct super_block *sb,
struct ext4_super_block *es);
+extern int ext4_commit_super(struct super_block *sb, int sync);
extern void *ext4_kvmalloc(size_t size, gfp_t flags);
extern void *ext4_kvzalloc(size_t size, gfp_t flags);
extern void ext4_kvfree(void *ptr);
diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c
index 90f7c2e..27354df 100644
--- a/fs/ext4/ext4_jbd2.c
+++ b/fs/ext4/ext4_jbd2.c
@@ -156,6 +156,6 @@ int __ext4_handle_dirty_super(const char *where, unsigned int line,
(struct ext4_super_block *)bh->b_data);
mark_buffer_dirty(bh);
} else
- sb->s_dirt = 1;
+ err = ext4_commit_super(sb, 0);
return err;
}
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index eb7aa3e..9b26ba0 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -63,7 +63,6 @@ static struct ext4_features *ext4_feat;
static int ext4_load_journal(struct super_block *, struct ext4_super_block *,
unsigned long journal_devnum);
static int ext4_show_options(struct seq_file *seq, struct dentry *root);
-static int ext4_commit_super(struct super_block *sb, int sync);
static void ext4_mark_recovery_complete(struct super_block *sb,
struct ext4_super_block *es);
static void ext4_clear_journal_err(struct super_block *sb,
@@ -896,7 +895,7 @@ static void ext4_put_super(struct super_block *sb)
EXT4_CLEAR_INCOMPAT_FEATURE(sb, EXT4_FEATURE_INCOMPAT_RECOVER);
es->s_state = cpu_to_le16(sbi->s_mount_state);
}
- if (sb->s_dirt || !(sb->s_flags & MS_RDONLY))
+ if (!(sb->s_flags & MS_RDONLY))
ext4_commit_super(sb, 1);
if (sbi->s_proc) {
@@ -4155,7 +4154,7 @@ static int ext4_load_journal(struct super_block *sb,
return 0;
}
-static int ext4_commit_super(struct super_block *sb, int sync)
+int ext4_commit_super(struct super_block *sb, int sync)
{
struct ext4_super_block *es = EXT4_SB(sb)->s_es;
struct buffer_head *sbh = EXT4_SB(sb)->s_sbh;
--
1.7.7.6
On Tue, 2012-07-03 at 15:00 +0300, Artem Bityutskiy wrote:
> Like the other similar patch-sets, we switch to a delayed job for writing out
> the superblock instead of using the 's_dirt' flag. Additionally, this patch-set
> includes several clean-ups.
Sorry, this paragraph is wrong, ignore it (copy-paste error). These
patches are the same as I sent at the beginning of April
(https://lkml.org/lkml/2012/4/2/111). Just refreshed and re-tested on
top of v3.5-rc5.
--
Best Regards,
Artem Bityutskiy
Hi Artem,
thanks for your persistence with this. I wanted to give you my
Reviewed-by on this patch but I've noticed two (mostly minor) things
(see below).
On Tue 03-07-12 15:00:09, Artem Bityutskiy wrote:
> From: Artem Bityutskiy <[email protected]>
>
> This patch changes the '__ext4_handle_dirty_super()' function which is used
> by ext4 to update the super-block via the journal in the following cases:
>
> 1. When creating the first large file on a file
> system without EXT4_FEATURE_RO_COMPAT_LARGE_FILE feature.
> 2. When re-sizing the file-system
> 3. When creating an xattr on a file-system without the
> EXT4_FEATURE_COMPAT_EXT_ATTR feature.
Since this was written, we seem to have grown a use of
ext4_handle_dirty_super() in fs/ext4/namei.c when adding inode to orphan
list.
> This function, however, falls back to just marking the superblock as dirty
> if the file-system has no journal. But it is user really rarely and it does not
> give any benefit to use the delayed superblock write (via the VFS's
> 'sync_supers()' kernel thread) in these cases. We can just write out the
> superblock asynchronously instead.
>
> This patch also removes 's_dirt' condition on the unmount path because we never
> set it anymore, so we should not test it.
>
> Tested using xfstests for both journalled and non-journalled ext4.
>
> Signed-off-by: Artem Bityutskiy <[email protected]>
> ---
> fs/ext4/ext4.h | 1 +
> fs/ext4/ext4_jbd2.c | 2 +-
> fs/ext4/super.c | 5 ++---
> 3 files changed, 4 insertions(+), 4 deletions(-)
>
...
> diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c
> index 90f7c2e..27354df 100644
> --- a/fs/ext4/ext4_jbd2.c
> +++ b/fs/ext4/ext4_jbd2.c
> @@ -156,6 +156,6 @@ int __ext4_handle_dirty_super(const char *where, unsigned int line,
> (struct ext4_super_block *)bh->b_data);
> mark_buffer_dirty(bh);
> } else
> - sb->s_dirt = 1;
> + err = ext4_commit_super(sb, 0);
When you look at the whole __ext4_handle_dirty_super() function now, it
doesn't make much sense like this. When now == 1, we mark superblock buffer
as dirty, when now == 0, we set wtime, kbytes_written, free blocks and
inodes in the superblock buffer and then mark it dirty. Looking into all
the places calling this, I'd just drop the 'now' argument and make
everything behave as in now == 1 case.
Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR
On Tue, 2012-07-03 at 14:48 +0200, Jan Kara wrote:
> > 1. When creating the first large file on a file
> > system without EXT4_FEATURE_RO_COMPAT_LARGE_FILE feature.
> > 2. When re-sizing the file-system
> > 3. When creating an xattr on a file-system without the
> > EXT4_FEATURE_COMPAT_EXT_ATTR feature.
> Since this was written, we seem to have grown a use of
> ext4_handle_dirty_super() in fs/ext4/namei.c when adding inode to orphan
> list.
Re-worked the commit message a bit in v4.
> > + err = ext4_commit_super(sb, 0);
> When you look at the whole __ext4_handle_dirty_super() function now, it
> doesn't make much sense like this. When now == 1, we mark superblock buffer
> as dirty, when now == 0, we set wtime, kbytes_written, free blocks and
> inodes in the superblock buffer and then mark it dirty. Looking into all
> the places calling this, I'd just drop the 'now' argument and make
> everything behave as in now == 1 case.
Done in v4 in patch N5. Thanks a lot for suggestions and the help!
--
Best Regards,
Artem Bityutskiy