2012-06-01 14:16:59

by Artem Bityutskiy

[permalink] [raw]
Subject: [PATCH 0/5] reiserfs: stop using write_supers and s_dirt

This patch-set makes reiserfs file-system stop using the VFS '->write_supers()'
call-back and the '->s_dirt' superblock field because I plan to remove them
once all users are gone.

This patch-set makes reiserfs submit a delayed job for writing out old commits
instead of using 's_dirt'.

The goal is to get rid of the 'sync_supers()' kernel thread. This kernel thread
wakes up every 5 seconds (by default) and calls '->write_super()' for all
mounted file-systems. And the bad thing is that this is done even if all the
superblocks are clean. Moreover, many file-systems do not even need this end
they do not register the '->write_super()' method at all (e.g., btrfs).

So 'sync_supers()' most often just generates useless wake-ups and wastes power.
I am trying to make all file-systems independent of '->write_super()' and plan
to remove 'sync_supers()' and '->write_super()' completely once there are no
more users.

The '->write_supers()' method is mostly used by baroque file-systems like hfs,
udf, etc. Modern file-systems like btrfs and xfs do not use it. This justifies
removing this stuff from VFS completely and make every FS self-manage own
superblock.

Al, in the past I was trying to upstream patches which optimized 'sync_super()',
but you wanted me to kill it completely instead, which I am trying to do
now, see http://lkml.org/lkml/2010/7/22/96

Tested using xfstests.

======
Overall status:

1. ext4: patches submitted, waiting for reply from Ted Ts'o:
https://lkml.org/lkml/2012/4/2/111
Ted keeps silence so far WRT the fate of this patch-set.
2. ext2: done, see commit f72cf5e223a28d3b3ea7dc9e40464fd534e359e8
3. vfat: done, see commit 78491189ddb6d84d4a4abae992ed891a236d0263
4. jffs2: sits the mtd subsystem tree, see commit
208b14e507c00ff7f108e1a388dd3d8cc805a443 in
git://git.infradead.org/linux-mtd.git

TODO: affs, exofs, hfs, hfsplus, sysv, udf, ufs
======

Thanks,
Artem.


2012-06-01 14:17:07

by Artem Bityutskiy

[permalink] [raw]
Subject: [PATCH 4/5] reiserfs: mark the superblock as dirty a bit later

From: Artem Bityutskiy <[email protected]>

The 'journal_mark_dirty()' function currently first marks the superblock as
dirty by setting 's_dirt' to 1, then does various sanity checks and returns,
then actuall does all the magic with the journal.

This is not an ideal order, though. It makes more sense to first do all the
checks, then do all the internal stuff, and at the end notify the VFS that the
superblock is now dirty.

This patch moves the 's_dirt = 1' assignment from the very beginning of this
function to the very end.

Signed-off-by: Artem Bityutskiy <[email protected]>
---
fs/reiserfs/journal.c | 3 +--
1 files changed, 1 insertions(+), 2 deletions(-)

diff --git a/fs/reiserfs/journal.c b/fs/reiserfs/journal.c
index 68aea62..e5e06dd 100644
--- a/fs/reiserfs/journal.c
+++ b/fs/reiserfs/journal.c
@@ -3231,8 +3231,6 @@ int journal_mark_dirty(struct reiserfs_transaction_handle *th,
th->t_trans_id, journal->j_trans_id);
}

- sb->s_dirt = 1;
-
prepared = test_clear_buffer_journal_prepared(bh);
clear_buffer_journal_restore_dirty(bh);
/* already in this transaction, we are done */
@@ -3316,6 +3314,7 @@ int journal_mark_dirty(struct reiserfs_transaction_handle *th,
journal->j_first = cn;
journal->j_last = cn;
}
+ sb->s_dirt = 1;
return 0;
}

--
1.7.7.6

2012-06-01 14:17:03

by Artem Bityutskiy

[permalink] [raw]
Subject: [PATCH 2/5] reiserfs: clean-up function return type

From: Artem Bityutskiy <[email protected]>

Turn 'reiserfs_flush_old_commits()' into a void function because the callers
do not cares about what it returns anyway.

We are going to remove the 'sb->s_dirt' field completely and this patch is a
small step towards this direction.

Signed-off-by: Artem Bityutskiy <[email protected]>
---
fs/reiserfs/journal.c | 8 +++-----
fs/reiserfs/reiserfs.h | 2 +-
2 files changed, 4 insertions(+), 6 deletions(-)

diff --git a/fs/reiserfs/journal.c b/fs/reiserfs/journal.c
index b1a0857..68aea62 100644
--- a/fs/reiserfs/journal.c
+++ b/fs/reiserfs/journal.c
@@ -3492,7 +3492,7 @@ static void flush_async_commits(struct work_struct *work)
** flushes any old transactions to disk
** ends the current transaction if it is too old
*/
-int reiserfs_flush_old_commits(struct super_block *sb)
+void reiserfs_flush_old_commits(struct super_block *sb)
{
time_t now;
struct reiserfs_transaction_handle th;
@@ -3502,9 +3502,8 @@ int reiserfs_flush_old_commits(struct super_block *sb)
/* safety check so we don't flush while we are replaying the log during
* mount
*/
- if (list_empty(&journal->j_journal_list)) {
- return 0;
- }
+ if (list_empty(&journal->j_journal_list))
+ return;

/* check the current transaction. If there are no writers, and it is
* too old, finish it, and force the commit blocks to disk
@@ -3526,7 +3525,6 @@ int reiserfs_flush_old_commits(struct super_block *sb)
do_journal_end(&th, sb, 1, COMMIT_NOW | WAIT);
}
}
- return sb->s_dirt;
}

/*
diff --git a/fs/reiserfs/reiserfs.h b/fs/reiserfs/reiserfs.h
index a59d271..09664c9 100644
--- a/fs/reiserfs/reiserfs.h
+++ b/fs/reiserfs/reiserfs.h
@@ -2452,7 +2452,7 @@ struct reiserfs_transaction_handle *reiserfs_persistent_transaction(struct
int reiserfs_end_persistent_transaction(struct reiserfs_transaction_handle *);
int reiserfs_commit_page(struct inode *inode, struct page *page,
unsigned from, unsigned to);
-int reiserfs_flush_old_commits(struct super_block *);
+void reiserfs_flush_old_commits(struct super_block *);
int reiserfs_commit_for_inode(struct inode *);
int reiserfs_inode_needs_commit(struct inode *);
void reiserfs_update_inode_transaction(struct inode *);
--
1.7.7.6

2012-06-01 14:17:21

by Artem Bityutskiy

[permalink] [raw]
Subject: [PATCH 5/5] reiserfs: get rid of resierfs_sync_super

From: Artem Bityutskiy <[email protected]>

This patch stops reiserfs using the VFS 'write_super()' method along with the
s_dirt flag, because they are on their way out.

The whole "superblock write-out" VFS infrastructure is served by the
'sync_supers()' kernel thread, which wakes up every 5 (by default) seconds and
writes out all dirty superblock using the '->write_super()' call-back. But the
problem with this thread is that it wastes power by waking up the system every
5 seconds, even if there are no diry superblocks, or there are no client
file-systems which would need this (e.g., btrfs does not use
'->write_super()'). So we want to kill it completely and thus, we need to make
file-systems to stop using the '->write_super()' VFS service, and then remove
it together with the kernel thread.

Signed-off-by: Artem Bityutskiy <[email protected]>
---
fs/reiserfs/journal.c | 6 +++-
fs/reiserfs/reiserfs.h | 6 +++++
fs/reiserfs/super.c | 54 ++++++++++++++++++++++++++++++++++++++++--------
3 files changed, 55 insertions(+), 11 deletions(-)

diff --git a/fs/reiserfs/journal.c b/fs/reiserfs/journal.c
index e5e06dd..afcadcc 100644
--- a/fs/reiserfs/journal.c
+++ b/fs/reiserfs/journal.c
@@ -1923,6 +1923,8 @@ static int do_journal_release(struct reiserfs_transaction_handle *th,
* the workqueue job (flush_async_commit) needs this lock
*/
reiserfs_write_unlock(sb);
+
+ cancel_delayed_work_sync(&REISERFS_SB(sb)->old_work);
flush_workqueue(commit_wq);

if (!reiserfs_mounted_fs_count) {
@@ -3314,7 +3316,7 @@ int journal_mark_dirty(struct reiserfs_transaction_handle *th,
journal->j_first = cn;
journal->j_last = cn;
}
- sb->s_dirt = 1;
+ reiserfs_schedule_old_flush(sb);
return 0;
}

@@ -3952,7 +3954,7 @@ static int do_journal_end(struct reiserfs_transaction_handle *th,
** it tells us if we should continue with the journal_end, or just return
*/
if (!check_journal_end(th, sb, nblocks, flags)) {
- sb->s_dirt = 1;
+ reiserfs_schedule_old_flush(sb);
wake_queued_writers(sb);
reiserfs_async_progress_wait(sb);
goto out;
diff --git a/fs/reiserfs/reiserfs.h b/fs/reiserfs/reiserfs.h
index 09664c9..d6d8a7d 100644
--- a/fs/reiserfs/reiserfs.h
+++ b/fs/reiserfs/reiserfs.h
@@ -480,6 +480,11 @@ struct reiserfs_sb_info {
struct dentry *priv_root; /* root of /.reiserfs_priv */
struct dentry *xattr_root; /* root of /.reiserfs_priv/xattrs */
int j_errno;
+
+ int work_queued; /* non-zero delayed work is queued */
+ struct delayed_work old_work; /* old transactions flush delayed work */
+ spinlock_t old_work_lock; /* protects old_work and work_queued */
+
#ifdef CONFIG_QUOTA
char *s_qf_names[MAXQUOTAS];
int s_jquota_fmt;
@@ -2487,6 +2492,7 @@ void reiserfs_abort(struct super_block *sb, int errno, const char *fmt, ...);
int reiserfs_allocate_list_bitmaps(struct super_block *s,
struct reiserfs_list_bitmap *, unsigned int);

+void reiserfs_schedule_old_flush(struct super_block *s);
void add_save_link(struct reiserfs_transaction_handle *th,
struct inode *inode, int truncate);
int remove_save_link(struct inode *inode, int truncate);
diff --git a/fs/reiserfs/super.c b/fs/reiserfs/super.c
index 92db34b..5cc70e7 100644
--- a/fs/reiserfs/super.c
+++ b/fs/reiserfs/super.c
@@ -72,20 +72,58 @@ static int reiserfs_sync_fs(struct super_block *s, int wait)
if (!journal_begin(&th, s, 1))
if (!journal_end_sync(&th, s, 1))
reiserfs_flush_old_commits(s);
- s->s_dirt = 0; /* Even if it's not true.
- * We'll loop forever in sync_supers otherwise */
reiserfs_write_unlock(s);
return 0;
}

-static void reiserfs_write_super(struct super_block *s)
+static void flush_old_commits(struct work_struct *work)
{
+ struct reiserfs_sb_info *sbi;
+ struct super_block *s;
+
+ sbi = container_of(work, struct reiserfs_sb_info, old_work.work);
+ s = sbi->s_journal->j_work_sb;
+
+ spin_lock(&sbi->old_work_lock);
+ sbi->work_queued = 0;
+ spin_unlock(&sbi->old_work_lock);
+
reiserfs_sync_fs(s, 1);
}

+void reiserfs_schedule_old_flush(struct super_block *s)
+{
+ struct reiserfs_sb_info *sbi = REISERFS_SB(s);
+ unsigned long delay;
+
+ if (s->s_flags & MS_RDONLY)
+ return;
+
+ spin_lock(&sbi->old_work_lock);
+ if (!sbi->work_queued) {
+ delay = msecs_to_jiffies(dirty_writeback_interval * 10);
+ queue_delayed_work(system_long_wq, &sbi->old_work, delay);
+ sbi->work_queued = 1;
+ }
+ spin_unlock(&sbi->old_work_lock);
+}
+
+static void cancel_old_flush(struct super_block *s)
+{
+ struct reiserfs_sb_info *sbi = REISERFS_SB(s);
+
+ cancel_delayed_work_sync(&REISERFS_SB(s)->old_work);
+ spin_lock(&sbi->old_work_lock);
+ sbi->work_queued = 0;
+ spin_unlock(&sbi->old_work_lock);
+}
+
static int reiserfs_freeze(struct super_block *s)
{
struct reiserfs_transaction_handle th;
+
+ cancel_old_flush(s);
+
reiserfs_write_lock(s);
if (!(s->s_flags & MS_RDONLY)) {
int err = journal_begin(&th, s, 1);
@@ -99,7 +137,6 @@ static int reiserfs_freeze(struct super_block *s)
journal_end_sync(&th, s, 1);
}
}
- s->s_dirt = 0;
reiserfs_write_unlock(s);
return 0;
}
@@ -483,9 +520,6 @@ static void reiserfs_put_super(struct super_block *s)

reiserfs_write_lock(s);

- if (s->s_dirt)
- reiserfs_write_super(s);
-
/* change file system state to current state if it was mounted with read-write permissions */
if (!(s->s_flags & MS_RDONLY)) {
if (!journal_begin(&th, s, 10)) {
@@ -692,7 +726,6 @@ static const struct super_operations reiserfs_sops = {
.dirty_inode = reiserfs_dirty_inode,
.evict_inode = reiserfs_evict_inode,
.put_super = reiserfs_put_super,
- .write_super = reiserfs_write_super,
.sync_fs = reiserfs_sync_fs,
.freeze_fs = reiserfs_freeze,
.unfreeze_fs = reiserfs_unfreeze,
@@ -1400,7 +1433,6 @@ static int reiserfs_remount(struct super_block *s, int *mount_flags, char *arg)
err = journal_end(&th, s, 10);
if (err)
goto out_err;
- s->s_dirt = 0;

if (!(*mount_flags & MS_RDONLY)) {
dquot_resume(s, -1);
@@ -1741,6 +1773,8 @@ static int reiserfs_fill_super(struct super_block *s, void *data, int silent)
/* setup default block allocator options */
reiserfs_init_alloc_options(s);

+ spin_lock_init(&sbi->old_work_lock);
+ INIT_DELAYED_WORK(&sbi->old_work, flush_old_commits);
mutex_init(&sbi->lock);
sbi->lock_depth = -1;

@@ -2003,6 +2037,8 @@ error_unlocked:
reiserfs_write_unlock(s);
}

+ cancel_delayed_work_sync(&REISERFS_SB(s)->old_work);
+
reiserfs_free_bitmap_cache(s);
if (SB_BUFFER_WITH_SB(s))
brelse(SB_BUFFER_WITH_SB(s));
--
1.7.7.6

2012-06-01 14:17:50

by Artem Bityutskiy

[permalink] [raw]
Subject: [PATCH 3/5] reiserfs: remove useless superblock dirtying

From: Artem Bityutskiy <[email protected]>

The 'reiserfs_resize()' function marks the superblock as dirty by assigning 1
to 's_dirt' and then calls 'journal_mark_dirty()' which does the same. Thus,
we can remove the assignment from 'reiserfs_resize()'.

Signed-off-by: Artem Bityutskiy <[email protected]>
---
fs/reiserfs/resize.c | 1 -
1 files changed, 0 insertions(+), 1 deletions(-)

diff --git a/fs/reiserfs/resize.c b/fs/reiserfs/resize.c
index 9a17f63..3ce02cf 100644
--- a/fs/reiserfs/resize.c
+++ b/fs/reiserfs/resize.c
@@ -200,7 +200,6 @@ int reiserfs_resize(struct super_block *s, unsigned long block_count_new)
(bmap_nr_new - bmap_nr)));
PUT_SB_BLOCK_COUNT(s, block_count_new);
PUT_SB_BMAP_NR(s, bmap_would_wrap(bmap_nr_new) ? : bmap_nr_new);
- s->s_dirt = 1;

journal_mark_dirty(&th, s, SB_BUFFER_WITH_SB(s));

--
1.7.7.6

2012-06-01 14:18:20

by Artem Bityutskiy

[permalink] [raw]
Subject: [PATCH 1/5] reiserfs: cleanup reiserfs_fill_super a bit

From: Artem Bityutskiy <[email protected]>

We have the reiserfs superblock pointer in the 'sbi' variable in this
function, no need to use the 'REISERFS_SB(s)' macro which is the same.
This is jut a small clean-up.

Signed-off-by: Artem Bityutskiy <[email protected]>
---
fs/reiserfs/super.c | 20 ++++++++++----------
1 files changed, 10 insertions(+), 10 deletions(-)

diff --git a/fs/reiserfs/super.c b/fs/reiserfs/super.c
index 8b7616e..92db34b 100644
--- a/fs/reiserfs/super.c
+++ b/fs/reiserfs/super.c
@@ -1730,19 +1730,19 @@ static int reiserfs_fill_super(struct super_block *s, void *data, int silent)
return -ENOMEM;
s->s_fs_info = sbi;
/* Set default values for options: non-aggressive tails, RO on errors */
- REISERFS_SB(s)->s_mount_opt |= (1 << REISERFS_SMALLTAIL);
- REISERFS_SB(s)->s_mount_opt |= (1 << REISERFS_ERROR_RO);
- REISERFS_SB(s)->s_mount_opt |= (1 << REISERFS_BARRIER_FLUSH);
+ sbi->s_mount_opt |= (1 << REISERFS_SMALLTAIL);
+ sbi->s_mount_opt |= (1 << REISERFS_ERROR_RO);
+ sbi->s_mount_opt |= (1 << REISERFS_BARRIER_FLUSH);
/* no preallocation minimum, be smart in
reiserfs_file_write instead */
- REISERFS_SB(s)->s_alloc_options.preallocmin = 0;
+ sbi->s_alloc_options.preallocmin = 0;
/* Preallocate by 16 blocks (17-1) at once */
- REISERFS_SB(s)->s_alloc_options.preallocsize = 17;
+ sbi->s_alloc_options.preallocsize = 17;
/* setup default block allocator options */
reiserfs_init_alloc_options(s);

- mutex_init(&REISERFS_SB(s)->lock);
- REISERFS_SB(s)->lock_depth = -1;
+ mutex_init(&sbi->lock);
+ sbi->lock_depth = -1;

jdev_name = NULL;
if (reiserfs_parse_options
@@ -1751,8 +1751,8 @@ static int reiserfs_fill_super(struct super_block *s, void *data, int silent)
goto error_unlocked;
}
if (jdev_name && jdev_name[0]) {
- REISERFS_SB(s)->s_jdev = kstrdup(jdev_name, GFP_KERNEL);
- if (!REISERFS_SB(s)->s_jdev) {
+ sbi->s_jdev = kstrdup(jdev_name, GFP_KERNEL);
+ if (!sbi->s_jdev) {
SWARN(silent, s, "", "Cannot allocate memory for "
"journal device name");
goto error;
@@ -1810,7 +1810,7 @@ static int reiserfs_fill_super(struct super_block *s, void *data, int silent)
/* make data=ordered the default */
if (!reiserfs_data_log(s) && !reiserfs_data_ordered(s) &&
!reiserfs_data_writeback(s)) {
- REISERFS_SB(s)->s_mount_opt |= (1 << REISERFS_DATA_ORDERED);
+ sbi->s_mount_opt |= (1 << REISERFS_DATA_ORDERED);
}

if (reiserfs_data_log(s)) {
--
1.7.7.6