2012-05-04 15:22:49

by Artem Bityutskiy

[permalink] [raw]
Subject: [PATCH v3 0/4] do not use s_dirt in FAT FS

This is version 3 of the patch-set which makes FAT file-system stop using
the VFS '->write_super()' method for writing out the FSINFO block.

Fist version: https://lkml.org/lkml/2012/4/11/147
Second version: https://lkml.org/lkml/2012/4/13/215

Comparing to v2 - not much changes except that I fixed patch 3 and now only
mark the fsinfo inode as dirty when there were some changes, as Hirofumi
requested. I think it is OK to attach FSINFO to an inode, unlike Hirofumi
would say, so this part is unchanged.

Hirofumi, if you insist there is an issue, could you please again provide more
details and we'd start the conversation over? I think my patches do not change
ordering and even if they were, I do not see what would be the problem.

Let me recap why I am doing this, and the current status of this exercises.

The final goal is to get rid of the 'sync_supers()' kernel thread. This kernel
thread wakes up every 5 seconds (by default) and calls '->write_super()' for
all mounted file-systems. And the bad thing is that this is done even if all
the superblocks are clean. Moreover, some file-systems do not even need this
end they do not register the '->write_super()' method at all (e.g., btrfs).

So 'sync_supers()' most often just generates useless wake-ups and wastes power.
I am trying to make all file-systems independent of '->write_super()' and plan
to remove 'sync_supers()' and '->write_super' completely once there are no more
users.

The '->write_supers()' method is mostly used by baroque file-systems like hfs,
udf, etc. Modern file-systems like btrfs and xfs do not use it. This justifies
removing this stuff from VFS completely and make every FS self-manage own
superblock.

Tested with xfstests.

Note: in the past I was trying to upstream patches which optimized 'sync_super()',
but Al Viro wanted me to kill it completely instead, which I am trying to do
now, see http://lkml.org/lkml/2010/7/22/96

======
Overall status:

1. ext4: patches submitted, waiting for reply from Ted Ts'o:
https://lkml.org/lkml/2012/4/2/111
Ted keeps silence so far WRT the fate of this patch
2. ext2: patches are in the ext2 tree maintained by Jan Kara:
git://git.kernel.org/pub/scm/linux/kernel/git/jack/linux-fs.git for_next
3. FAT FS patches discussion is ongoin on lkml and fsdevel

TODO: affs, exofs, hfs, hfsplus, jffs2, reiserfs, sysv, udf, ufs
======

fs/fat/fat.h | 1 +
fs/fat/fatent.c | 22 +++++++++++++-----
fs/fat/inode.c | 54 ++++++++++++++++++++-------------------------
include/linux/msdos_fs.h | 3 +-
4 files changed, 43 insertions(+), 37 deletions(-)

Thanks,
Artem.


2012-05-04 15:22:51

by Artem Bityutskiy

[permalink] [raw]
Subject: [PATCH v3 2/4] fat: introduce mark_fsinfo_dirty helper

From: Artem Bityutskiy <[email protected]>

This is a preparation patch which introduces a 'mark_fsinfo_dirty()'
helper function which just sets the 's_dirt' flag to 1 so far. I'll add more
code to this helper later, so I do not mark it as 'inline'.

Signed-off-by: Artem Bityutskiy <[email protected]>
---
fs/fat/fatent.c | 13 +++++++++----
1 files changed, 9 insertions(+), 4 deletions(-)

diff --git a/fs/fat/fatent.c b/fs/fat/fatent.c
index 2e81ac0..e49d274 100644
--- a/fs/fat/fatent.c
+++ b/fs/fat/fatent.c
@@ -308,6 +308,11 @@ void fat_ent_access_init(struct super_block *sb)
}
}

+static void mark_fsinfo_dirty(struct super_block *sb)
+{
+ sb->s_dirt = 1;
+}
+
static inline int fat_ent_update_ptr(struct super_block *sb,
struct fat_entry *fatent,
int offset, sector_t blocknr)
@@ -498,7 +503,7 @@ int fat_alloc_clusters(struct inode *inode, int *cluster, int nr_cluster)
sbi->prev_free = entry;
if (sbi->free_clusters != -1)
sbi->free_clusters--;
- sb->s_dirt = 1;
+ mark_fsinfo_dirty(sb);

cluster[idx_clus] = entry;
idx_clus++;
@@ -520,7 +525,7 @@ int fat_alloc_clusters(struct inode *inode, int *cluster, int nr_cluster)
/* Couldn't allocate the free entries */
sbi->free_clusters = 0;
sbi->free_clus_valid = 1;
- sb->s_dirt = 1;
+ mark_fsinfo_dirty(sb);
err = -ENOSPC;

out:
@@ -587,7 +592,7 @@ int fat_free_clusters(struct inode *inode, int cluster)
ops->ent_put(&fatent, FAT_ENT_FREE);
if (sbi->free_clusters != -1) {
sbi->free_clusters++;
- sb->s_dirt = 1;
+ mark_fsinfo_dirty(sb);
}

if (nr_bhs + fatent.nr_bhs > MAX_BUF_PER_PAGE) {
@@ -677,7 +682,7 @@ int fat_count_free_clusters(struct super_block *sb)
}
sbi->free_clusters = free;
sbi->free_clus_valid = 1;
- sb->s_dirt = 1;
+ mark_fsinfo_dirty(sb);
fatent_brelse(&fatent);
out:
unlock_fat(sbi);
--
1.7.7.6

2012-05-04 15:22:54

by Artem Bityutskiy

[permalink] [raw]
Subject: [PATCH v3 3/4] fat: mark superblock as dirty less often

From: Artem Bityutskiy <[email protected]>

This patch is a preparation for further changes. It touches few functions
in fatent.c and prevents them from marking the superblock as dirty
unnecessarily often. Namely, instead of marking it as dirty in the internal
tight loops - do it only once at the end of the functions. And instead of
marking it as dirty while holding the FAT table lock, do it outside the lock.

The reason for this patch is that marking the superblock as dirty will soon
become a little bit heavier operation, so it is cleaner to do this only when it
is necessary.

Signed-off-by: Artem Bityutskiy <[email protected]>
---
fs/fat/fatent.c | 9 +++++----
1 files changed, 5 insertions(+), 4 deletions(-)

diff --git a/fs/fat/fatent.c b/fs/fat/fatent.c
index e49d274..8181548 100644
--- a/fs/fat/fatent.c
+++ b/fs/fat/fatent.c
@@ -503,7 +503,6 @@ int fat_alloc_clusters(struct inode *inode, int *cluster, int nr_cluster)
sbi->prev_free = entry;
if (sbi->free_clusters != -1)
sbi->free_clusters--;
- mark_fsinfo_dirty(sb);

cluster[idx_clus] = entry;
idx_clus++;
@@ -525,11 +524,11 @@ int fat_alloc_clusters(struct inode *inode, int *cluster, int nr_cluster)
/* Couldn't allocate the free entries */
sbi->free_clusters = 0;
sbi->free_clus_valid = 1;
- mark_fsinfo_dirty(sb);
err = -ENOSPC;

out:
unlock_fat(sbi);
+ mark_fsinfo_dirty(sb);
fatent_brelse(&fatent);
if (!err) {
if (inode_needs_sync(inode))
@@ -554,7 +553,7 @@ int fat_free_clusters(struct inode *inode, int cluster)
struct fat_entry fatent;
struct buffer_head *bhs[MAX_BUF_PER_PAGE];
int i, err, nr_bhs;
- int first_cl = cluster;
+ int first_cl = cluster, dirty_fsinfo = 0;

nr_bhs = 0;
fatent_init(&fatent);
@@ -592,7 +591,7 @@ int fat_free_clusters(struct inode *inode, int cluster)
ops->ent_put(&fatent, FAT_ENT_FREE);
if (sbi->free_clusters != -1) {
sbi->free_clusters++;
- mark_fsinfo_dirty(sb);
+ dirty_fsinfo = 1;
}

if (nr_bhs + fatent.nr_bhs > MAX_BUF_PER_PAGE) {
@@ -622,6 +621,8 @@ error:
for (i = 0; i < nr_bhs; i++)
brelse(bhs[i]);
unlock_fat(sbi);
+ if (dirty_fsinfo)
+ mark_fsinfo_dirty(sb);

return err;
}
--
1.7.7.6

2012-05-04 15:23:10

by Artem Bityutskiy

[permalink] [raw]
Subject: [PATCH v3 4/4] fat: switch to fsinfo_inode

From: Artem Bityutskiy <[email protected]>

Currently FAT file-system maps the VFS "superblock" abstraction to the FSINFO
block. The FSINFO block contains non-essential data about the amount of free
clusters and the next free cluster. FAT file-system can always find out this
information by scanning the FAT table, but having it in the FSINFO block may
speed things up sometimes. So FAT file-system relies on the VFS superblock
write-out services to make sure the FSINFO block is written out to the media
from time to time.

The whole "superblock write-out" VFS infrastructure is served by the
'sync_supers()' kernel thread, which wakes up every 5 (by default) seconds and
writes out all dirty superblock using the '->write_super()' call-back. But the
problem with this thread is that it wastes power by waking up the system every
5 seconds no matter what. So we want to kill it completely and thus, we need to
make file-systems to stop using the '->write_super' VFS service, and then
remove it together with the kernel thread.

This patch switches the FAT FSINFO block management from
'->write_super()'/'->s_dirt' to 'fsinfo_inode'/'->write_inode'. Now, instead of
setting the 's_dirt' flag, we just mark the special 'fsinfo_inode' inode as
dirty and let VFS invoke the '->write_inode' call-back when needed, where we
write-out the FSINFO block.

This patch also makes sure we do not mark the 'fsinfo_inode' inode as dirty if
we are not FAT32 (FAT16 and FAT12 do not have the FSINFO block) or if we are in
R/O mode.

As a bonus, we can also remove the '->sync_fs()' and '->write_super()' FAT
call-back function because they become unneeded.

Signed-off-by: Artem Bityutskiy <[email protected]>
---
fs/fat/fatent.c | 7 ++++++-
fs/fat/inode.c | 42 ++++++++++++------------------------------
2 files changed, 18 insertions(+), 31 deletions(-)

diff --git a/fs/fat/fatent.c b/fs/fat/fatent.c
index 8181548..31f08ab 100644
--- a/fs/fat/fatent.c
+++ b/fs/fat/fatent.c
@@ -310,7 +310,12 @@ void fat_ent_access_init(struct super_block *sb)

static void mark_fsinfo_dirty(struct super_block *sb)
{
- sb->s_dirt = 1;
+ struct msdos_sb_info *sbi = MSDOS_SB(sb);
+
+ if (sb->s_flags & MS_RDONLY || sbi->fat_bits != 32)
+ return;
+
+ __mark_inode_dirty(sbi->fsinfo_inode, I_DIRTY_SYNC);
}

static inline int fat_ent_update_ptr(struct super_block *sb,
diff --git a/fs/fat/inode.c b/fs/fat/inode.c
index 91e9a8a..9a87ec4 100644
--- a/fs/fat/inode.c
+++ b/fs/fat/inode.c
@@ -459,37 +459,10 @@ static void fat_evict_inode(struct inode *inode)
fat_detach(inode);
}

-static void fat_write_super(struct super_block *sb)
-{
- lock_super(sb);
- sb->s_dirt = 0;
-
- if (!(sb->s_flags & MS_RDONLY))
- fat_clusters_flush(sb);
- unlock_super(sb);
-}
-
-static int fat_sync_fs(struct super_block *sb, int wait)
-{
- int err = 0;
-
- if (sb->s_dirt) {
- lock_super(sb);
- sb->s_dirt = 0;
- err = fat_clusters_flush(sb);
- unlock_super(sb);
- }
-
- return err;
-}
-
static void fat_put_super(struct super_block *sb)
{
struct msdos_sb_info *sbi = MSDOS_SB(sb);

- if (sb->s_dirt)
- fat_write_super(sb);
-
iput(sbi->fsinfo_inode);
iput(sbi->fat_inode);

@@ -662,7 +635,18 @@ retry:

static int fat_write_inode(struct inode *inode, struct writeback_control *wbc)
{
- return __fat_write_inode(inode, wbc->sync_mode == WB_SYNC_ALL);
+ int err;
+
+ if (inode->i_ino == MSDOS_FSINFO_INO) {
+ struct super_block *sb = inode->i_sb;
+
+ lock_super(sb);
+ err = fat_clusters_flush(sb);
+ unlock_super(sb);
+ } else
+ err = __fat_write_inode(inode, wbc->sync_mode == WB_SYNC_ALL);
+
+ return err;
}

int fat_sync_inode(struct inode *inode)
@@ -679,8 +663,6 @@ static const struct super_operations fat_sops = {
.write_inode = fat_write_inode,
.evict_inode = fat_evict_inode,
.put_super = fat_put_super,
- .write_super = fat_write_super,
- .sync_fs = fat_sync_fs,
.statfs = fat_statfs,
.remount_fs = fat_remount,

--
1.7.7.6

2012-05-04 15:23:49

by Artem Bityutskiy

[permalink] [raw]
Subject: [PATCH v3 1/4] fat: introduce special inode for managing the FSINFO block

From: Artem Bityutskiy <[email protected]>

This patch is just a preparation for further changes. It introduces a special
inode ('fsinfo_inode') in FAT file-system which we'll later use for managing
the FSINFO block. Note, this there is already one special inode ('fat_inode')
which is used for managing the FAT tables.

Introduce new 'MSDOS_FSINFO_INO' constant for this special inode. It is safe to
do because FAT file-system does not store inode numbers on the media but
generates them run-time.

I've also cleaned up the comment to existing 'MSDOS_ROOT_INO' constant, while
on it.

Signed-off-by: Artem Bityutskiy <[email protected]>
---
fs/fat/fat.h | 1 +
fs/fat/inode.c | 12 ++++++++++++
include/linux/msdos_fs.h | 3 ++-
3 files changed, 15 insertions(+), 1 deletions(-)

diff --git a/fs/fat/fat.h b/fs/fat/fat.h
index 66994f3..951d12b 100644
--- a/fs/fat/fat.h
+++ b/fs/fat/fat.h
@@ -82,6 +82,7 @@ struct msdos_sb_info {
int fatent_shift;
struct fatent_operations *fatent_ops;
struct inode *fat_inode;
+ struct inode *fsinfo_inode;

struct ratelimit_state ratelimit;

diff --git a/fs/fat/inode.c b/fs/fat/inode.c
index 21687e3..91e9a8a 100644
--- a/fs/fat/inode.c
+++ b/fs/fat/inode.c
@@ -490,6 +490,7 @@ static void fat_put_super(struct super_block *sb)
if (sb->s_dirt)
fat_write_super(sb);

+ iput(sbi->fsinfo_inode);
iput(sbi->fat_inode);

unload_nls(sbi->nls_disk);
@@ -1244,6 +1245,7 @@ int fat_fill_super(struct super_block *sb, void *data, int silent, int isvfat,
void (*setup)(struct super_block *))
{
struct inode *root_inode = NULL, *fat_inode = NULL;
+ struct inode *fsinfo_inode = NULL;
struct buffer_head *bh;
struct fat_boot_sector *b;
struct msdos_sb_info *sbi;
@@ -1490,6 +1492,14 @@ int fat_fill_super(struct super_block *sb, void *data, int silent, int isvfat,
goto out_fail;
MSDOS_I(fat_inode)->i_pos = 0;
sbi->fat_inode = fat_inode;
+
+ fsinfo_inode = new_inode(sb);
+ if (!fsinfo_inode)
+ goto out_fail;
+ fsinfo_inode->i_ino = MSDOS_FSINFO_INO;
+ sbi->fsinfo_inode = fsinfo_inode;
+ insert_inode_hash(fsinfo_inode);
+
root_inode = new_inode(sb);
if (!root_inode)
goto out_fail;
@@ -1516,6 +1526,8 @@ out_invalid:
fat_msg(sb, KERN_INFO, "Can't find a valid FAT filesystem");

out_fail:
+ if (fsinfo_inode)
+ iput(fsinfo_inode);
if (fat_inode)
iput(fat_inode);
unload_nls(sbi->nls_io);
diff --git a/include/linux/msdos_fs.h b/include/linux/msdos_fs.h
index 34066e6..11cc2ac 100644
--- a/include/linux/msdos_fs.h
+++ b/include/linux/msdos_fs.h
@@ -21,8 +21,9 @@
#define CT_LE_W(v) cpu_to_le16(v)
#define CT_LE_L(v) cpu_to_le32(v)

+#define MSDOS_ROOT_INO 1 /* The root inode number */
+#define MSDOS_FSINFO_INO 2 /* Used for managing the FSINFO block */

-#define MSDOS_ROOT_INO 1 /* == MINIX_ROOT_INO */
#define MSDOS_DIR_BITS 5 /* log2(sizeof(struct msdos_dir_entry)) */

/* directory limit */
--
1.7.7.6