This patch-set makes ext4 file-system stop using the VFS '->write_supers()'
call-back and the '->s_dirt' superblock field because I plan to remove them
once all users are gone.
What we do in this patch-set is we notice that ext4 does not really needed the
'write_super()' functionality and we can remove it. Instead, we simply mark the
superblock buffer as dirty directly, rather than postponing this till the next
'sync_supers()' kernel thread wake-up. Most of the ext4 changes were done or
suggested by Jan Kara - thanks Jan!
V7 is identical to v6, but I've added Jan's "Reviewed-by" and removed my
worries about slowing down the deletion path - they were false, apologies. So
only the commit messages changed.
Tested with xfstests (both journalled and non-journalled ext4 modes).
fs/ext4/ext4.h | 9 ---------
fs/ext4/ext4_jbd2.c | 8 +++-----
fs/ext4/ext4_jbd2.h | 7 ++-----
fs/ext4/file.c | 14 +++++++++++++-
fs/ext4/ialloc.c | 2 --
fs/ext4/inode.c | 2 +-
fs/ext4/mballoc.c | 2 --
fs/ext4/namei.c | 4 ++--
fs/ext4/resize.c | 2 +-
fs/ext4/super.c | 12 +-----------
10 files changed, 23 insertions(+), 39 deletions(-)
Reminder
========
The goal is to get rid of the 'sync_supers()' kernel thread. This kernel thread
wakes up every 5 seconds (by default) and calls '->write_super()' for all
mounted file-systems. And the bad thing is that this is done even if all the
superblocks are clean. Moreover, many file-systems do not even need this and
they do not even register the '->write_super()' method at all (e.g., btrfs).
So 'sync_supers()' mostly just generates useless wake-ups and wastes power.
I am trying to make all file-systems independent of '->write_super()' and plan
to remove 'sync_supers()' and '->write_super()' completely once there are no
more users.
Overall status
==============
1. ext4: patches submitted,
https://lkml.org/lkml/2012/7/10/195
2. exofs: patch submitted,
https://lkml.org/lkml/2012/6/4/211
3. sysv: patches submitted,
http://lkml.org/lkml/2012/7/3/250
4. udf: patch submitted, sits in Jan Kara's tree:
https://lkml.org/lkml/2012/6/4/233
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs for_testing
5. affs: patches submitted, sit in Al Viro's tree:
https://lkml.org/lkml/2012/6/6/400
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs for-next
6. hfs: patches submitted, sit Andrew Morton's tree
http://lkml.org/lkml/2012/6/12/82
7. hfsplus: patches submitted, sit in Andrew Morton's tree:
https://lkml.org/lkml/2012/6/13/195
8. ext2: done, see commit f72cf5e223a28d3b3ea7dc9e40464fd534e359e8
9. vfat: done, see commit 78491189ddb6d84d4a4abae992ed891a236d0263
10. jffs2: done, see commit 208b14e507c00ff7f108e1a388dd3d8cc805a443
11. reiserfs: done, see commit 033369d1af1264abc23bea2e174aa47cdd212f6f
TODO: ufs
Thanks,
Artem.
From: Artem Bityutskiy <[email protected]>
The '__ext4_handle_dirty_metadata()' does not need the 'now' argument anymore
and we can kill it.
Signed-off-by: Artem Bityutskiy <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
---
fs/ext4/ext4_jbd2.c | 3 +--
fs/ext4/ext4_jbd2.h | 7 ++-----
fs/ext4/inode.c | 2 +-
fs/ext4/namei.c | 4 ++--
fs/ext4/resize.c | 2 +-
5 files changed, 7 insertions(+), 11 deletions(-)
diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c
index c19ab6a..bfa65b4 100644
--- a/fs/ext4/ext4_jbd2.c
+++ b/fs/ext4/ext4_jbd2.c
@@ -138,8 +138,7 @@ int __ext4_handle_dirty_metadata(const char *where, unsigned int line,
}
int __ext4_handle_dirty_super(const char *where, unsigned int line,
- handle_t *handle, struct super_block *sb,
- int now)
+ handle_t *handle, struct super_block *sb)
{
struct buffer_head *bh = EXT4_SB(sb)->s_sbh;
int err = 0;
diff --git a/fs/ext4/ext4_jbd2.h b/fs/ext4/ext4_jbd2.h
index f440e8f1..83b20fc 100644
--- a/fs/ext4/ext4_jbd2.h
+++ b/fs/ext4/ext4_jbd2.h
@@ -213,8 +213,7 @@ int __ext4_handle_dirty_metadata(const char *where, unsigned int line,
struct buffer_head *bh);
int __ext4_handle_dirty_super(const char *where, unsigned int line,
- handle_t *handle, struct super_block *sb,
- int now);
+ handle_t *handle, struct super_block *sb);
#define ext4_journal_get_write_access(handle, bh) \
__ext4_journal_get_write_access(__func__, __LINE__, (handle), (bh))
@@ -226,10 +225,8 @@ int __ext4_handle_dirty_super(const char *where, unsigned int line,
#define ext4_handle_dirty_metadata(handle, inode, bh) \
__ext4_handle_dirty_metadata(__func__, __LINE__, (handle), (inode), \
(bh))
-#define ext4_handle_dirty_super_now(handle, sb) \
- __ext4_handle_dirty_super(__func__, __LINE__, (handle), (sb), 1)
#define ext4_handle_dirty_super(handle, sb) \
- __ext4_handle_dirty_super(__func__, __LINE__, (handle), (sb), 0)
+ __ext4_handle_dirty_super(__func__, __LINE__, (handle), (sb))
handle_t *ext4_journal_start_sb(struct super_block *sb, int nblocks);
int __ext4_journal_stop(const char *where, unsigned int line, handle_t *handle);
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 02bc8cb..b1bd96f 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4034,7 +4034,7 @@ static int ext4_do_update_inode(handle_t *handle,
EXT4_SET_RO_COMPAT_FEATURE(sb,
EXT4_FEATURE_RO_COMPAT_LARGE_FILE);
ext4_handle_sync(handle);
- err = ext4_handle_dirty_super_now(handle, sb);
+ err = ext4_handle_dirty_super(handle, sb);
}
}
raw_inode->i_generation = cpu_to_le32(inode->i_generation);
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 5845cd9..f155d57 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -2397,7 +2397,7 @@ int ext4_orphan_add(handle_t *handle, struct inode *inode)
/* Insert this inode at the head of the on-disk orphan list... */
NEXT_ORPHAN(inode) = le32_to_cpu(EXT4_SB(sb)->s_es->s_last_orphan);
EXT4_SB(sb)->s_es->s_last_orphan = cpu_to_le32(inode->i_ino);
- err = ext4_handle_dirty_super_now(handle, sb);
+ err = ext4_handle_dirty_super(handle, sb);
rc = ext4_mark_iloc_dirty(handle, inode, &iloc);
if (!err)
err = rc;
@@ -2470,7 +2470,7 @@ int ext4_orphan_del(handle_t *handle, struct inode *inode)
if (err)
goto out_brelse;
sbi->s_es->s_last_orphan = cpu_to_le32(ino_next);
- err = ext4_handle_dirty_super_now(handle, inode->i_sb);
+ err = ext4_handle_dirty_super(handle, inode->i_sb);
} else {
struct ext4_iloc iloc2;
struct inode *i_prev =
diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
index 7ea6cbb..09c7aae 100644
--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -798,7 +798,7 @@ static int add_new_gdb(handle_t *handle, struct inode *inode,
ext4_kvfree(o_group_desc);
le16_add_cpu(&es->s_reserved_gdt_blocks, -1);
- err = ext4_handle_dirty_super_now(handle, sb);
+ err = ext4_handle_dirty_super(handle, sb);
if (err)
ext4_std_error(sb, err);
--
1.7.7.6
From: Artem Bityutskiy <[email protected]>
We do not depend on VFS's '->write_super()' anymore and do not need the
's_dirt' flag anymore, so weed out 'ext4_write_super()' and 's_dirt'.
Signed-off-by: Artem Bityutskiy <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
---
fs/ext4/super.c | 10 ----------
1 files changed, 0 insertions(+), 10 deletions(-)
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index a391c53..622d5c7 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -74,7 +74,6 @@ static const char *ext4_decode_error(struct super_block *sb, int errno,
static int ext4_remount(struct super_block *sb, int *flags, char *data);
static int ext4_statfs(struct dentry *dentry, struct kstatfs *buf);
static int ext4_unfreeze(struct super_block *sb);
-static void ext4_write_super(struct super_block *sb);
static int ext4_freeze(struct super_block *sb);
static struct dentry *ext4_mount(struct file_system_type *fs_type, int flags,
const char *dev_name, void *data);
@@ -1194,7 +1193,6 @@ static const struct super_operations ext4_nojournal_sops = {
.dirty_inode = ext4_dirty_inode,
.drop_inode = ext4_drop_inode,
.evict_inode = ext4_evict_inode,
- .write_super = ext4_write_super,
.put_super = ext4_put_super,
.statfs = ext4_statfs,
.remount_fs = ext4_remount,
@@ -4203,7 +4201,6 @@ static int ext4_commit_super(struct super_block *sb, int sync)
es->s_free_inodes_count =
cpu_to_le32(percpu_counter_sum_positive(
&EXT4_SB(sb)->s_freeinodes_counter));
- sb->s_dirt = 0;
BUFFER_TRACE(sbh, "marking dirty");
ext4_superblock_csum_set(sb, es);
mark_buffer_dirty(sbh);
@@ -4310,13 +4307,6 @@ int ext4_force_commit(struct super_block *sb)
return ret;
}
-static void ext4_write_super(struct super_block *sb)
-{
- lock_super(sb);
- ext4_commit_super(sb, 1);
- unlock_super(sb);
-}
-
static int ext4_sync_fs(struct super_block *sb, int wait)
{
int ret = 0;
--
1.7.7.6
From: Artem Bityutskiy <[email protected]>
This patch changes the 'ext4_handle_dirty_super()' function which submits
the superblock for I/O in the following cases:
1. When creating the first large file on a file system without
EXT4_FEATURE_RO_COMPAT_LARGE_FILE feature.
2. When re-sizing the file-system.
3. When creating an xattr on a file-system without the
EXT4_FEATURE_COMPAT_EXT_ATTR feature.
If the file-system has journal enabled, the superblock is written via the
journal. We do not modify this path.
If the file-system has no journal, this function, falls back to just marking
the superblock as dirty using the 's_dirt' superblock flag. This means that it
delays the actual superblock I/O submission by 5 seconds (default setting).
Namely, the 'sync_supers()' kernel thread will call 'ext4_write_super()' later
and will actually submit the superblock for I/O.
And this is the behavior this patch modifies: we stop using 's_dirt' and just
mark the superblock buffer as dirty right away. Indeed, all 3 cases above are
extremely rare and it does not add any value to delay the I/O submission for
them.
Note: 'ext4_handle_dirty_super()' executes '__ext4_handle_dirty_super()' with
'now = 0'. This patch basically makes the 'now' argument unneeded and it will
be deleted in one of the next patches.
This patch also removes 's_dirt' condition on the unmount path because we never
set it anymore, so we should not test it.
Tested using xfstests for both journalled and non-journalled ext4.
Signed-off-by: Artem Bityutskiy <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
---
fs/ext4/ext4_jbd2.c | 5 ++---
fs/ext4/super.c | 2 +-
2 files changed, 3 insertions(+), 4 deletions(-)
diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c
index 90f7c2e..c19ab6a 100644
--- a/fs/ext4/ext4_jbd2.c
+++ b/fs/ext4/ext4_jbd2.c
@@ -151,11 +151,10 @@ int __ext4_handle_dirty_super(const char *where, unsigned int line,
if (err)
ext4_journal_abort_handle(where, line, __func__,
bh, handle, err);
- } else if (now) {
+ } else {
ext4_superblock_csum_set(sb,
(struct ext4_super_block *)bh->b_data);
mark_buffer_dirty(bh);
- } else
- sb->s_dirt = 1;
+ }
return err;
}
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index eb7aa3e..a391c53 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -896,7 +896,7 @@ static void ext4_put_super(struct super_block *sb)
EXT4_CLEAR_INCOMPAT_FEATURE(sb, EXT4_FEATURE_INCOMPAT_RECOVER);
es->s_state = cpu_to_le16(sbi->s_mount_state);
}
- if (sb->s_dirt || !(sb->s_flags & MS_RDONLY))
+ if (!(sb->s_flags & MS_RDONLY))
ext4_commit_super(sb, 1);
if (sbi->s_proc) {
--
1.7.7.6
From: Jan Kara <[email protected]>
Commit a0375156 properly notes that superblock doesn't need to be marked
as dirty when only number of free inodes / blocks / number of directories
changes since that is recomputed on each mount anyway. However that comment
leaves some unnecessary markings as dirty in place. Remove these.
Artem: tested using xfstests for both journalled and non-journalled ext4.
Signed-off-by: Jan Kara <[email protected]>
Tested-by: Artem Bityutskiy <[email protected]>
Signed-off-by: Artem Bityutskiy <[email protected]>
---
fs/ext4/ialloc.c | 2 --
fs/ext4/mballoc.c | 2 --
2 files changed, 0 insertions(+), 4 deletions(-)
diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
index d48e8b1..25b918c 100644
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -315,7 +315,6 @@ out:
err = ext4_handle_dirty_metadata(handle, NULL, bitmap_bh);
if (!fatal)
fatal = err;
- ext4_mark_super_dirty(sb);
} else
ext4_error(sb, "bit already cleared for inode %lu", ino);
@@ -830,7 +829,6 @@ got:
percpu_counter_dec(&sbi->s_freeinodes_counter);
if (S_ISDIR(mode))
percpu_counter_inc(&sbi->s_dirs_counter);
- ext4_mark_super_dirty(sb);
if (sbi->s_log_groups_per_flex) {
flex_group = ext4_flex_group(sbi, group);
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 1cd6994..eabfb4c 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -2825,7 +2825,6 @@ ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac,
err = ext4_handle_dirty_metadata(handle, NULL, gdp_bh);
out_err:
- ext4_mark_super_dirty(sb);
brelse(bitmap_bh);
return err;
}
@@ -4694,7 +4693,6 @@ do_more:
put_bh(bitmap_bh);
goto do_more;
}
- ext4_mark_super_dirty(sb);
error_return:
brelse(bitmap_bh);
ext4_std_error(sb, err);
--
1.7.7.6
From: Jan Kara <[email protected]>
The last user of ext4_mark_super_dirty() in ext4_file_open() is so rare it
can well be modifying the superblock properly by journalling the change.
Change it and get rid of ext4_mark_super_dirty() as it's not needed anymore.
Artem: small amendments.
Artem: tested using xfstests for both journalled and non-journalled ext4.
Signed-off-by: Jan Kara <[email protected]>
Tested-by: Artem Bityutskiy <[email protected]>
Signed-off-by: Artem Bityutskiy <[email protected]>
---
fs/ext4/ext4.h | 9 ---------
fs/ext4/file.c | 14 +++++++++++++-
2 files changed, 13 insertions(+), 10 deletions(-)
diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index cfc4e01..0c4042e 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2321,15 +2321,6 @@ static inline void ext4_unlock_group(struct super_block *sb,
spin_unlock(ext4_group_lock_ptr(sb, group));
}
-static inline void ext4_mark_super_dirty(struct super_block *sb)
-{
- struct ext4_super_block *es = EXT4_SB(sb)->s_es;
-
- ext4_superblock_csum_set(sb, es);
- if (EXT4_SB(sb)->s_journal == NULL)
- sb->s_dirt =1;
-}
-
/*
* Block validity checking
*/
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 8c7642a..3a77494 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -181,9 +181,21 @@ static int ext4_file_open(struct inode * inode, struct file * filp)
path.dentry = mnt->mnt_root;
cp = d_path(&path, buf, sizeof(buf));
if (!IS_ERR(cp)) {
+ handle_t *handle;
+ int err;
+
+ handle = ext4_journal_start_sb(sb, 1);
+ if (IS_ERR(handle))
+ return PTR_ERR(handle);
+ err = ext4_journal_get_write_access(handle, sbi->s_sbh);
+ if (err) {
+ ext4_journal_stop(handle);
+ return err;
+ }
strlcpy(sbi->s_es->s_last_mounted, cp,
sizeof(sbi->s_es->s_last_mounted));
- ext4_mark_super_dirty(sb);
+ ext4_handle_dirty_super(handle, sb);
+ ext4_journal_stop(handle);
}
}
/*
--
1.7.7.6
On Wed, 2012-07-11 at 16:46 +0300, Artem Bityutskiy wrote:
> This patch-set makes ext4 file-system stop using the VFS '->write_supers()'
> call-back and the '->s_dirt' superblock field because I plan to remove them
> once all users are gone.
Hi Ted, is there any chance for this stuff to hit 3.6?
Basically I have 3 pending file-systems:
1. ext4 (these patches)
2. sysv - trivial, I hope Andrew would pick them.
3. ufs - should be straight-forward to change the same way I changed
hfs. I'll send patches soon. I hope Andrew would pick them as well.
Everything else - either in upstream or in -mm or in sub-system trees.
After this, I can kill sync_supers and have a chance that Linus would
accept the patches at 3.6-rc1, and end up with sync_super-less 3.6.
--
Best Regards,
Artem Bityutskiy
On Wed, Jul 11, 2012 at 04:46:39PM +0300, Artem Bityutskiy wrote:
> This patch-set makes ext4 file-system stop using the VFS '->write_supers()'
> call-back and the '->s_dirt' superblock field because I plan to remove them
> once all users are gone.
I've applied this patch series, thanks.
- Ted