2012-07-04 12:20:20

by Artem Bityutskiy

[permalink] [raw]
Subject: [PATCHv4 0/5] ext4: stop using write_super and s_dirt

This patch-set makes ext4 file-system stop using the VFS '->write_supers()'
call-back and the '->s_dirt' superblock field because I plan to remove them
once all users are gone.

Most of the ext4 changes were done or suggested by Jan Kara (thanks!). This
patch does not do anything complex - we basically notice that ext4 does not
really needd the 'write_super()' functionality and we can remove it. We
mark the superblock buffer as dirty directly, rather than posponing this till
the next 'sync_supers()' kernel thread wake-up.

Tested with xfstests (both journalled and non-journalled ext4 modes).

Reminder:

The goal is to get rid of the 'sync_supers()' kernel thread. This kernel thread
wakes up every 5 seconds (by default) and calls '->write_super()' for all
mounted file-systems. And the bad thing is that this is done even if all the
superblocks are clean. Moreover, many file-systems do not even need this end
they do not register the '->write_super()' method at all (e.g., btrfs).

So 'sync_supers()' most often just generates useless wake-ups and wastes power.
I am trying to make all file-systems independent of '->write_super()' and plan
to remove 'sync_supers()' and '->write_super()' completely once there are no
more users.

======
Overall status:

1. ext4: patches submitted,
https://lkml.org/lkml/2012/7/3/210
2. exofs: patch submitted,
https://lkml.org/lkml/2012/6/4/211
3. sysv: patches submitted,
http://lkml.org/lkml/2012/7/3/250
4. udf: patch submitted, sits in Jan Kara's tree:
https://lkml.org/lkml/2012/6/4/233
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs for_testing
5. affs: patches submitted, sit in Al Viro's tree:
https://lkml.org/lkml/2012/6/6/400
git://git.kernel.org/pub/scm/linux/kernel/git/viro/vfs for-next
6. hfs: patches submitted, sit Andrew Morton's tree
http://lkml.org/lkml/2012/6/12/82
7. hfsplus: patches submitted, sit in Andrew Morton's tree:
https://lkml.org/lkml/2012/6/13/195
8. ext2: done, see commit f72cf5e223a28d3b3ea7dc9e40464fd534e359e8
9. vfat: done, see commit 78491189ddb6d84d4a4abae992ed891a236d0263
10. jffs2: done, see commit 208b14e507c00ff7f108e1a388dd3d8cc805a443
11. reiserfs: done, see commit 033369d1af1264abc23bea2e174aa47cdd212f6f

TODO: ufs
======

fs/ext4/ext4.h | 10 +---------
fs/ext4/ext4_jbd2.c | 9 ++-------
fs/ext4/ext4_jbd2.h | 7 ++-----
fs/ext4/file.c | 14 +++++++++++++-
fs/ext4/ialloc.c | 2 --
fs/ext4/inode.c | 2 +-
fs/ext4/mballoc.c | 2 --
fs/ext4/namei.c | 4 ++--
fs/ext4/resize.c | 2 +-
fs/ext4/super.c | 15 ++-------------
10 files changed, 24 insertions(+), 43 deletions(-)

Thanks,
Artem.


2012-07-04 12:21:50

by Artem Bityutskiy

[permalink] [raw]
Subject: [PATCHv4 1/5] ext4: Remove useless marking of superblock dirty

From: Jan Kara <[email protected]>

Commit a0375156 properly notes that superblock doesn't need to be marked
as dirty when only number of free inodes / blocks / number of directories
changes since that is recomputed on each mount anyway. However that comment
leaves some unnecessary markings as dirty in place. Remove these.

Artem: tested using xfstests for both journalled and non-journalled ext4.

Signed-off-by: Jan Kara <[email protected]>
Tested-by: Artem Bityutskiy <[email protected]>
Signed-off-by: Artem Bityutskiy <[email protected]>
---
fs/ext4/ialloc.c | 2 --
fs/ext4/mballoc.c | 2 --
2 files changed, 0 insertions(+), 4 deletions(-)

diff --git a/fs/ext4/ialloc.c b/fs/ext4/ialloc.c
index d48e8b1..25b918c 100644
--- a/fs/ext4/ialloc.c
+++ b/fs/ext4/ialloc.c
@@ -315,7 +315,6 @@ out:
err = ext4_handle_dirty_metadata(handle, NULL, bitmap_bh);
if (!fatal)
fatal = err;
- ext4_mark_super_dirty(sb);
} else
ext4_error(sb, "bit already cleared for inode %lu", ino);

@@ -830,7 +829,6 @@ got:
percpu_counter_dec(&sbi->s_freeinodes_counter);
if (S_ISDIR(mode))
percpu_counter_inc(&sbi->s_dirs_counter);
- ext4_mark_super_dirty(sb);

if (sbi->s_log_groups_per_flex) {
flex_group = ext4_flex_group(sbi, group);
diff --git a/fs/ext4/mballoc.c b/fs/ext4/mballoc.c
index 1cd6994..eabfb4c 100644
--- a/fs/ext4/mballoc.c
+++ b/fs/ext4/mballoc.c
@@ -2825,7 +2825,6 @@ ext4_mb_mark_diskspace_used(struct ext4_allocation_context *ac,
err = ext4_handle_dirty_metadata(handle, NULL, gdp_bh);

out_err:
- ext4_mark_super_dirty(sb);
brelse(bitmap_bh);
return err;
}
@@ -4694,7 +4693,6 @@ do_more:
put_bh(bitmap_bh);
goto do_more;
}
- ext4_mark_super_dirty(sb);
error_return:
brelse(bitmap_bh);
ext4_std_error(sb, err);
--
1.7.7.6

2012-07-04 12:21:52

by Artem Bityutskiy

[permalink] [raw]
Subject: [PATCHv4 3/5] ext4: remove unnecessary superblock dirtying

From: Artem Bityutskiy <[email protected]>

This patch changes the '__ext4_handle_dirty_super()' function which is used
by ext4 to update the superblock via the journal in the following cases:

1. When creating the first large file on a file system without
EXT4_FEATURE_RO_COMPAT_LARGE_FILE feature.
2. When re-sizing the file-system.
3. When creating an xattr on a file-system without the
EXT4_FEATURE_COMPAT_EXT_ATTR feature.
4. When adding or deleting an orphan (because we update the 's_last_orphan'
superblock field).

This function, however, falls back to just marking the superblock as dirty
if the file-system has no journal. This means that we delay the actual
superblock I/O submission by 5 seconds (roughly speaking). Namely, the
'sync_supers()' kernel thread will call 'ext4_write_super()' later, where
we actually will submit the superblock down to the media.

However:
1. For cases 1-3 it does not add any value to delay the I/O submission. These
events are rare and we may just commit submit the superblock for
asynchronous I/O right away.
2. For case 4 - similarly, not terribly frequent event in most of workloads.
It should be good enough to just submit asynchronous superblock write-out.

This patch also removes 's_dirt' condition on the unmount path because we never
set it anymore, so we should not test it.

Tested using xfstests for both journalled and non-journalled ext4.

Signed-off-by: Artem Bityutskiy <[email protected]>
---
fs/ext4/ext4.h | 1 +
fs/ext4/ext4_jbd2.c | 2 +-
fs/ext4/super.c | 5 ++---
3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 0c4042e..b2439d5 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2041,6 +2041,7 @@ extern int ext4_superblock_csum_verify(struct super_block *sb,
struct ext4_super_block *es);
extern void ext4_superblock_csum_set(struct super_block *sb,
struct ext4_super_block *es);
+extern int ext4_commit_super(struct super_block *sb, int sync);
extern void *ext4_kvmalloc(size_t size, gfp_t flags);
extern void *ext4_kvzalloc(size_t size, gfp_t flags);
extern void ext4_kvfree(void *ptr);
diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c
index 90f7c2e..27354df 100644
--- a/fs/ext4/ext4_jbd2.c
+++ b/fs/ext4/ext4_jbd2.c
@@ -156,6 +156,6 @@ int __ext4_handle_dirty_super(const char *where, unsigned int line,
(struct ext4_super_block *)bh->b_data);
mark_buffer_dirty(bh);
} else
- sb->s_dirt = 1;
+ err = ext4_commit_super(sb, 0);
return err;
}
diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index eb7aa3e..9b26ba0 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -63,7 +63,6 @@ static struct ext4_features *ext4_feat;
static int ext4_load_journal(struct super_block *, struct ext4_super_block *,
unsigned long journal_devnum);
static int ext4_show_options(struct seq_file *seq, struct dentry *root);
-static int ext4_commit_super(struct super_block *sb, int sync);
static void ext4_mark_recovery_complete(struct super_block *sb,
struct ext4_super_block *es);
static void ext4_clear_journal_err(struct super_block *sb,
@@ -896,7 +895,7 @@ static void ext4_put_super(struct super_block *sb)
EXT4_CLEAR_INCOMPAT_FEATURE(sb, EXT4_FEATURE_INCOMPAT_RECOVER);
es->s_state = cpu_to_le16(sbi->s_mount_state);
}
- if (sb->s_dirt || !(sb->s_flags & MS_RDONLY))
+ if (!(sb->s_flags & MS_RDONLY))
ext4_commit_super(sb, 1);

if (sbi->s_proc) {
@@ -4155,7 +4154,7 @@ static int ext4_load_journal(struct super_block *sb,
return 0;
}

-static int ext4_commit_super(struct super_block *sb, int sync)
+int ext4_commit_super(struct super_block *sb, int sync)
{
struct ext4_super_block *es = EXT4_SB(sb)->s_es;
struct buffer_head *sbh = EXT4_SB(sb)->s_sbh;
--
1.7.7.6

2012-07-04 12:20:30

by Artem Bityutskiy

[permalink] [raw]
Subject: [PATCHv4 5/5] ext4: simplify superblock dirtying

From: Artem Bityutskiy <[email protected]>

The '__ext4_handle_dirty_metadata()' takes a 'now' argument which does not add
any value anymore (after we removed 'ext4_write_super()'). First of all, it
only affects the non-journalled file-system case.
1. If 'now' is true, we just re-calculate the superblock checksum and submit
it down for asynchronous I/O.
2. If 'now' is false, we call 'ext4_commit_super()' which essentially does the
same, but it performs better error checking and additionally updates few
counters like 's_kbytes_written'.

So we can simplify the code a bit by killing the 'now' argument.

This change was suggested by Jan Kara <[email protected]>.

Signed-off-by: Artem Bityutskiy <[email protected]>
---
fs/ext4/ext4_jbd2.c | 7 +------
fs/ext4/ext4_jbd2.h | 7 ++-----
fs/ext4/inode.c | 2 +-
fs/ext4/namei.c | 4 ++--
fs/ext4/resize.c | 2 +-
5 files changed, 7 insertions(+), 15 deletions(-)

diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c
index 27354df..ce169e6 100644
--- a/fs/ext4/ext4_jbd2.c
+++ b/fs/ext4/ext4_jbd2.c
@@ -138,8 +138,7 @@ int __ext4_handle_dirty_metadata(const char *where, unsigned int line,
}

int __ext4_handle_dirty_super(const char *where, unsigned int line,
- handle_t *handle, struct super_block *sb,
- int now)
+ handle_t *handle, struct super_block *sb)
{
struct buffer_head *bh = EXT4_SB(sb)->s_sbh;
int err = 0;
@@ -151,10 +150,6 @@ int __ext4_handle_dirty_super(const char *where, unsigned int line,
if (err)
ext4_journal_abort_handle(where, line, __func__,
bh, handle, err);
- } else if (now) {
- ext4_superblock_csum_set(sb,
- (struct ext4_super_block *)bh->b_data);
- mark_buffer_dirty(bh);
} else
err = ext4_commit_super(sb, 0);
return err;
diff --git a/fs/ext4/ext4_jbd2.h b/fs/ext4/ext4_jbd2.h
index f440e8f1..83b20fc 100644
--- a/fs/ext4/ext4_jbd2.h
+++ b/fs/ext4/ext4_jbd2.h
@@ -213,8 +213,7 @@ int __ext4_handle_dirty_metadata(const char *where, unsigned int line,
struct buffer_head *bh);

int __ext4_handle_dirty_super(const char *where, unsigned int line,
- handle_t *handle, struct super_block *sb,
- int now);
+ handle_t *handle, struct super_block *sb);

#define ext4_journal_get_write_access(handle, bh) \
__ext4_journal_get_write_access(__func__, __LINE__, (handle), (bh))
@@ -226,10 +225,8 @@ int __ext4_handle_dirty_super(const char *where, unsigned int line,
#define ext4_handle_dirty_metadata(handle, inode, bh) \
__ext4_handle_dirty_metadata(__func__, __LINE__, (handle), (inode), \
(bh))
-#define ext4_handle_dirty_super_now(handle, sb) \
- __ext4_handle_dirty_super(__func__, __LINE__, (handle), (sb), 1)
#define ext4_handle_dirty_super(handle, sb) \
- __ext4_handle_dirty_super(__func__, __LINE__, (handle), (sb), 0)
+ __ext4_handle_dirty_super(__func__, __LINE__, (handle), (sb))

handle_t *ext4_journal_start_sb(struct super_block *sb, int nblocks);
int __ext4_journal_stop(const char *where, unsigned int line, handle_t *handle);
diff --git a/fs/ext4/inode.c b/fs/ext4/inode.c
index 02bc8cb..b1bd96f 100644
--- a/fs/ext4/inode.c
+++ b/fs/ext4/inode.c
@@ -4034,7 +4034,7 @@ static int ext4_do_update_inode(handle_t *handle,
EXT4_SET_RO_COMPAT_FEATURE(sb,
EXT4_FEATURE_RO_COMPAT_LARGE_FILE);
ext4_handle_sync(handle);
- err = ext4_handle_dirty_super_now(handle, sb);
+ err = ext4_handle_dirty_super(handle, sb);
}
}
raw_inode->i_generation = cpu_to_le32(inode->i_generation);
diff --git a/fs/ext4/namei.c b/fs/ext4/namei.c
index 5845cd9..f155d57 100644
--- a/fs/ext4/namei.c
+++ b/fs/ext4/namei.c
@@ -2397,7 +2397,7 @@ int ext4_orphan_add(handle_t *handle, struct inode *inode)
/* Insert this inode at the head of the on-disk orphan list... */
NEXT_ORPHAN(inode) = le32_to_cpu(EXT4_SB(sb)->s_es->s_last_orphan);
EXT4_SB(sb)->s_es->s_last_orphan = cpu_to_le32(inode->i_ino);
- err = ext4_handle_dirty_super_now(handle, sb);
+ err = ext4_handle_dirty_super(handle, sb);
rc = ext4_mark_iloc_dirty(handle, inode, &iloc);
if (!err)
err = rc;
@@ -2470,7 +2470,7 @@ int ext4_orphan_del(handle_t *handle, struct inode *inode)
if (err)
goto out_brelse;
sbi->s_es->s_last_orphan = cpu_to_le32(ino_next);
- err = ext4_handle_dirty_super_now(handle, inode->i_sb);
+ err = ext4_handle_dirty_super(handle, inode->i_sb);
} else {
struct ext4_iloc iloc2;
struct inode *i_prev =
diff --git a/fs/ext4/resize.c b/fs/ext4/resize.c
index 7ea6cbb..09c7aae 100644
--- a/fs/ext4/resize.c
+++ b/fs/ext4/resize.c
@@ -798,7 +798,7 @@ static int add_new_gdb(handle_t *handle, struct inode *inode,
ext4_kvfree(o_group_desc);

le16_add_cpu(&es->s_reserved_gdt_blocks, -1);
- err = ext4_handle_dirty_super_now(handle, sb);
+ err = ext4_handle_dirty_super(handle, sb);
if (err)
ext4_std_error(sb, err);

--
1.7.7.6


2012-07-04 12:21:53

by Artem Bityutskiy

[permalink] [raw]
Subject: [PATCHv4 4/5] ext4: weed out ext4_write_super

From: Artem Bityutskiy <[email protected]>

We do not depend on VFS's '->write_super()' anymore and do not need the
's_dirt' flag anymore. This patch weeds out 'ext4_write_super()' and the
's_dirt' usage in VFS.

Signed-off-by: Artem Bityutskiy <[email protected]>
---
fs/ext4/super.c | 10 ----------
1 files changed, 0 insertions(+), 10 deletions(-)

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 9b26ba0..546c6ed 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -73,7 +73,6 @@ static const char *ext4_decode_error(struct super_block *sb, int errno,
static int ext4_remount(struct super_block *sb, int *flags, char *data);
static int ext4_statfs(struct dentry *dentry, struct kstatfs *buf);
static int ext4_unfreeze(struct super_block *sb);
-static void ext4_write_super(struct super_block *sb);
static int ext4_freeze(struct super_block *sb);
static struct dentry *ext4_mount(struct file_system_type *fs_type, int flags,
const char *dev_name, void *data);
@@ -1193,7 +1192,6 @@ static const struct super_operations ext4_nojournal_sops = {
.dirty_inode = ext4_dirty_inode,
.drop_inode = ext4_drop_inode,
.evict_inode = ext4_evict_inode,
- .write_super = ext4_write_super,
.put_super = ext4_put_super,
.statfs = ext4_statfs,
.remount_fs = ext4_remount,
@@ -4202,7 +4200,6 @@ int ext4_commit_super(struct super_block *sb, int sync)
es->s_free_inodes_count =
cpu_to_le32(percpu_counter_sum_positive(
&EXT4_SB(sb)->s_freeinodes_counter));
- sb->s_dirt = 0;
BUFFER_TRACE(sbh, "marking dirty");
ext4_superblock_csum_set(sb, es);
mark_buffer_dirty(sbh);
@@ -4309,13 +4306,6 @@ int ext4_force_commit(struct super_block *sb)
return ret;
}

-static void ext4_write_super(struct super_block *sb)
-{
- lock_super(sb);
- ext4_commit_super(sb, 1);
- unlock_super(sb);
-}
-
static int ext4_sync_fs(struct super_block *sb, int wait)
{
int ret = 0;
--
1.7.7.6


2012-07-04 12:20:24

by Artem Bityutskiy

[permalink] [raw]
Subject: [PATCHv4 2/5] ext4: Convert last user of ext4_mark_super_dirty() to ext4_handle_dirty_super()

From: Jan Kara <[email protected]>

The last user of ext4_mark_super_dirty() in ext4_file_open() is so rare it
can well be modifying the superblock properly by journalling the change.
Change it and get rid of ext4_mark_super_dirty() as it's not needed anymore.

Artem: small amendments.
Artem: tested using xfstests for both journalled and non-journalled ext4.

Signed-off-by: Jan Kara <[email protected]>
Tested-by: Artem Bityutskiy <[email protected]>
Signed-off-by: Artem Bityutskiy <[email protected]>
---
fs/ext4/ext4.h | 9 ---------
fs/ext4/file.c | 14 +++++++++++++-
2 files changed, 13 insertions(+), 10 deletions(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index cfc4e01..0c4042e 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -2321,15 +2321,6 @@ static inline void ext4_unlock_group(struct super_block *sb,
spin_unlock(ext4_group_lock_ptr(sb, group));
}

-static inline void ext4_mark_super_dirty(struct super_block *sb)
-{
- struct ext4_super_block *es = EXT4_SB(sb)->s_es;
-
- ext4_superblock_csum_set(sb, es);
- if (EXT4_SB(sb)->s_journal == NULL)
- sb->s_dirt =1;
-}
-
/*
* Block validity checking
*/
diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 8c7642a..3a77494 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -181,9 +181,21 @@ static int ext4_file_open(struct inode * inode, struct file * filp)
path.dentry = mnt->mnt_root;
cp = d_path(&path, buf, sizeof(buf));
if (!IS_ERR(cp)) {
+ handle_t *handle;
+ int err;
+
+ handle = ext4_journal_start_sb(sb, 1);
+ if (IS_ERR(handle))
+ return PTR_ERR(handle);
+ err = ext4_journal_get_write_access(handle, sbi->s_sbh);
+ if (err) {
+ ext4_journal_stop(handle);
+ return err;
+ }
strlcpy(sbi->s_es->s_last_mounted, cp,
sizeof(sbi->s_es->s_last_mounted));
- ext4_mark_super_dirty(sb);
+ ext4_handle_dirty_super(handle, sb);
+ ext4_journal_stop(handle);
}
}
/*
--
1.7.7.6


2012-07-04 13:11:27

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCHv4 3/5] ext4: remove unnecessary superblock dirtying

On Wed 04-07-12 15:21:52, Artem Bityutskiy wrote:
> From: Artem Bityutskiy <[email protected]>
>
> This patch changes the '__ext4_handle_dirty_super()' function which is used
> by ext4 to update the superblock via the journal in the following cases:
>
> 1. When creating the first large file on a file system without
> EXT4_FEATURE_RO_COMPAT_LARGE_FILE feature.
> 2. When re-sizing the file-system.
> 3. When creating an xattr on a file-system without the
> EXT4_FEATURE_COMPAT_EXT_ATTR feature.
> 4. When adding or deleting an orphan (because we update the 's_last_orphan'
> superblock field).
>
> This function, however, falls back to just marking the superblock as dirty
> if the file-system has no journal. This means that we delay the actual
> superblock I/O submission by 5 seconds (roughly speaking). Namely, the
> 'sync_supers()' kernel thread will call 'ext4_write_super()' later, where
> we actually will submit the superblock down to the media.
>
> However:
> 1. For cases 1-3 it does not add any value to delay the I/O submission. These
> events are rare and we may just commit submit the superblock for
> asynchronous I/O right away.
> 2. For case 4 - similarly, not terribly frequent event in most of workloads.
> It should be good enough to just submit asynchronous superblock write-out.
Well, it happens for every inode being truncated / deleted to it can be
rather frequent. That's why I wanted to have now == 1 case everywhere -
i.e. just recompute the checksum and do mark_buffer_dirty(). I'd just
remove the 'now' test in this patch and then in patch 5 remove the now
argument from the function and callers as you did.

Honza

>
> This patch also removes 's_dirt' condition on the unmount path because we never
> set it anymore, so we should not test it.
>
> Tested using xfstests for both journalled and non-journalled ext4.
>
> Signed-off-by: Artem Bityutskiy <[email protected]>
> ---
> fs/ext4/ext4.h | 1 +
> fs/ext4/ext4_jbd2.c | 2 +-
> fs/ext4/super.c | 5 ++---
> 3 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
> index 0c4042e..b2439d5 100644
> --- a/fs/ext4/ext4.h
> +++ b/fs/ext4/ext4.h
> @@ -2041,6 +2041,7 @@ extern int ext4_superblock_csum_verify(struct super_block *sb,
> struct ext4_super_block *es);
> extern void ext4_superblock_csum_set(struct super_block *sb,
> struct ext4_super_block *es);
> +extern int ext4_commit_super(struct super_block *sb, int sync);
> extern void *ext4_kvmalloc(size_t size, gfp_t flags);
> extern void *ext4_kvzalloc(size_t size, gfp_t flags);
> extern void ext4_kvfree(void *ptr);
> diff --git a/fs/ext4/ext4_jbd2.c b/fs/ext4/ext4_jbd2.c
> index 90f7c2e..27354df 100644
> --- a/fs/ext4/ext4_jbd2.c
> +++ b/fs/ext4/ext4_jbd2.c
> @@ -156,6 +156,6 @@ int __ext4_handle_dirty_super(const char *where, unsigned int line,
> (struct ext4_super_block *)bh->b_data);
> mark_buffer_dirty(bh);
> } else
> - sb->s_dirt = 1;
> + err = ext4_commit_super(sb, 0);
> return err;
> }
> diff --git a/fs/ext4/super.c b/fs/ext4/super.c
> index eb7aa3e..9b26ba0 100644
> --- a/fs/ext4/super.c
> +++ b/fs/ext4/super.c
> @@ -63,7 +63,6 @@ static struct ext4_features *ext4_feat;
> static int ext4_load_journal(struct super_block *, struct ext4_super_block *,
> unsigned long journal_devnum);
> static int ext4_show_options(struct seq_file *seq, struct dentry *root);
> -static int ext4_commit_super(struct super_block *sb, int sync);
> static void ext4_mark_recovery_complete(struct super_block *sb,
> struct ext4_super_block *es);
> static void ext4_clear_journal_err(struct super_block *sb,
> @@ -896,7 +895,7 @@ static void ext4_put_super(struct super_block *sb)
> EXT4_CLEAR_INCOMPAT_FEATURE(sb, EXT4_FEATURE_INCOMPAT_RECOVER);
> es->s_state = cpu_to_le16(sbi->s_mount_state);
> }
> - if (sb->s_dirt || !(sb->s_flags & MS_RDONLY))
> + if (!(sb->s_flags & MS_RDONLY))
> ext4_commit_super(sb, 1);
>
> if (sbi->s_proc) {
> @@ -4155,7 +4154,7 @@ static int ext4_load_journal(struct super_block *sb,
> return 0;
> }
>
> -static int ext4_commit_super(struct super_block *sb, int sync)
> +int ext4_commit_super(struct super_block *sb, int sync)
> {
> struct ext4_super_block *es = EXT4_SB(sb)->s_es;
> struct buffer_head *sbh = EXT4_SB(sb)->s_sbh;
> --
> 1.7.7.6
>
--
Jan Kara <[email protected]>
SUSE Labs, CR

2012-07-10 10:31:23

by Artem Bityutskiy

[permalink] [raw]
Subject: Re: [PATCHv4 3/5] ext4: remove unnecessary superblock dirtying

On Wed, 2012-07-04 at 15:11 +0200, Jan Kara wrote:
> On Wed 04-07-12 15:21:52, Artem Bityutskiy wrote:
> > From: Artem Bityutskiy <[email protected]>
> >
> > This patch changes the '__ext4_handle_dirty_super()' function which is used
> > by ext4 to update the superblock via the journal in the following cases:
> >
> > 1. When creating the first large file on a file system without
> > EXT4_FEATURE_RO_COMPAT_LARGE_FILE feature.
> > 2. When re-sizing the file-system.
> > 3. When creating an xattr on a file-system without the
> > EXT4_FEATURE_COMPAT_EXT_ATTR feature.
> > 4. When adding or deleting an orphan (because we update the 's_last_orphan'
> > superblock field).
> >
> > This function, however, falls back to just marking the superblock as dirty
> > if the file-system has no journal. This means that we delay the actual
> > superblock I/O submission by 5 seconds (roughly speaking). Namely, the
> > 'sync_supers()' kernel thread will call 'ext4_write_super()' later, where
> > we actually will submit the superblock down to the media.
> >
> > However:
> > 1. For cases 1-3 it does not add any value to delay the I/O submission. These
> > events are rare and we may just commit submit the superblock for
> > asynchronous I/O right away.
> > 2. For case 4 - similarly, not terribly frequent event in most of workloads.
> > It should be good enough to just submit asynchronous superblock write-out.
> Well, it happens for every inode being truncated / deleted to it can be
> rather frequent. That's why I wanted to have now == 1 case everywhere -
> i.e. just recompute the checksum and do mark_buffer_dirty(). I'd just
> remove the 'now' test in this patch and then in patch 5 remove the now
> argument from the function and callers as you did.

I am a bit confused.

It seems you consider that 'ext4_commit_super()' is a considerably
slower than just marking the buffer as dirty right away. But I do not
really understand why - all it does - it just updates a couple of
superblock fields and then marks the buffer as dirty (I assume sync ==
0). So from my POW they are almost the same. And when csum is enabled -
re-calculating csum will probably be the longest part.

More important is that we dirty the superblock on every deletion - this
mean that with my change we will re-calculate checsum on every deletion
and I am not sure it is nice. Ideally, we should be able to calculate
the checksum just before sending the buffer to the IO queue...

I'll prepare a new patch-set and send it to you. Thanks!

--
Best Regards,
Artem Bityutskiy


Attachments:
signature.asc (836.00 B)
This is a digitally signed message part

2012-07-10 12:17:29

by Artem Bityutskiy

[permalink] [raw]
Subject: Re: [PATCHv4 3/5] ext4: remove unnecessary superblock dirtying

On Wed, 2012-07-04 at 15:11 +0200, Jan Kara wrote:
> On Wed 04-07-12 15:21:52, Artem Bityutskiy wrote:
> > From: Artem Bityutskiy <[email protected]>
> >
> > This patch changes the '__ext4_handle_dirty_super()' function which is used
> > by ext4 to update the superblock via the journal in the following cases:
> >
> > 1. When creating the first large file on a file system without
> > EXT4_FEATURE_RO_COMPAT_LARGE_FILE feature.
> > 2. When re-sizing the file-system.
> > 3. When creating an xattr on a file-system without the
> > EXT4_FEATURE_COMPAT_EXT_ATTR feature.
> > 4. When adding or deleting an orphan (because we update the 's_last_orphan'
> > superblock field).
> >
> > This function, however, falls back to just marking the superblock as dirty
> > if the file-system has no journal. This means that we delay the actual
> > superblock I/O submission by 5 seconds (roughly speaking). Namely, the
> > 'sync_supers()' kernel thread will call 'ext4_write_super()' later, where
> > we actually will submit the superblock down to the media.
> >
> > However:
> > 1. For cases 1-3 it does not add any value to delay the I/O submission. These
> > events are rare and we may just commit submit the superblock for
> > asynchronous I/O right away.
> > 2. For case 4 - similarly, not terribly frequent event in most of workloads.
> > It should be good enough to just submit asynchronous superblock write-out.
> Well, it happens for every inode being truncated / deleted to it can be
> rather frequent. That's why I wanted to have now == 1 case everywhere -
> i.e. just recompute the checksum and do mark_buffer_dirty(). I'd just
> remove the 'now' test in this patch and then in patch 5 remove the now
> argument from the function and callers as you did.

It looked logical to me to use 'ext4_commit_super()' always and remove
'now' and marking the buffer dirty directly. Just because I thought the
speed difference should be nearly 0, and 'ext4_commit_super()' is doing
some error checking. But you seem to suggest to do the opposite, and I
do not understand why would that be better. So I dropped this change so
far.

I've sent v5 where I basically only changed the commit message in patch
3 and dropped patch 5. In patch 3 I've explicitly indicated that we'll
do more checksum calculations, but I think this is OK acceptable.

Thanks!

--
Best Regards,
Artem Bityutskiy


Attachments:
signature.asc (836.00 B)
This is a digitally signed message part

2012-07-10 12:52:40

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCHv4 3/5] ext4: remove unnecessary superblock dirtying

On Tue 10-07-12 13:35:36, Artem Bityutskiy wrote:
> On Wed, 2012-07-04 at 15:11 +0200, Jan Kara wrote:
> > On Wed 04-07-12 15:21:52, Artem Bityutskiy wrote:
> > > From: Artem Bityutskiy <[email protected]>
> > >
> > > This patch changes the '__ext4_handle_dirty_super()' function which is used
> > > by ext4 to update the superblock via the journal in the following cases:
> > >
> > > 1. When creating the first large file on a file system without
> > > EXT4_FEATURE_RO_COMPAT_LARGE_FILE feature.
> > > 2. When re-sizing the file-system.
> > > 3. When creating an xattr on a file-system without the
> > > EXT4_FEATURE_COMPAT_EXT_ATTR feature.
> > > 4. When adding or deleting an orphan (because we update the 's_last_orphan'
> > > superblock field).
> > >
> > > This function, however, falls back to just marking the superblock as dirty
> > > if the file-system has no journal. This means that we delay the actual
> > > superblock I/O submission by 5 seconds (roughly speaking). Namely, the
> > > 'sync_supers()' kernel thread will call 'ext4_write_super()' later, where
> > > we actually will submit the superblock down to the media.
> > >
> > > However:
> > > 1. For cases 1-3 it does not add any value to delay the I/O submission. These
> > > events are rare and we may just commit submit the superblock for
> > > asynchronous I/O right away.
> > > 2. For case 4 - similarly, not terribly frequent event in most of workloads.
> > > It should be good enough to just submit asynchronous superblock write-out.
> > Well, it happens for every inode being truncated / deleted to it can be
> > rather frequent. That's why I wanted to have now == 1 case everywhere -
> > i.e. just recompute the checksum and do mark_buffer_dirty(). I'd just
> > remove the 'now' test in this patch and then in patch 5 remove the now
> > argument from the function and callers as you did.
>
> I am a bit confused.
>
> It seems you consider that 'ext4_commit_super()' is a considerably
> slower than just marking the buffer as dirty right away. But I do not
> really understand why - all it does - it just updates a couple of
> superblock fields and then marks the buffer as dirty (I assume sync ==
> 0). So from my POW they are almost the same. And when csum is enabled -
> re-calculating csum will probably be the longest part.
Well, the part you might be missing is:
ext4_free_blocks_count_set(es,
EXT4_C2B(EXT4_SB(sb), percpu_counter_sum_positive(
&EXT4_SB(sb)->s_freeclusters_counter)));
es->s_free_inodes_count =
cpu_to_le32(percpu_counter_sum_positive(
&EXT4_SB(sb)->s_freeinodes_counter));
percpu_counter_sum() *is* rather expensive. At least for big machines.

Also just marking the buffer dirty more corresponds to what we do when
journalling.

> More important is that we dirty the superblock on every deletion - this
> mean that with my change we will re-calculate checsum on every deletion
> and I am not sure it is nice. Ideally, we should be able to calculate
> the checksum just before sending the buffer to the IO queue...
Yes, that would be nice but it's not easy to do currently...

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR