2014-04-11 04:50:52

by Fabian Frédérick

[permalink] [raw]
Subject: [PATCH 1/1] FS: Add generic data flush to fsync

Currently, there's no generic HW flush management support in kernel.
Only 7 filesystems have their own way to manage the barrier flag.

This patch adds MS_BARRIER in sb_flags and conditionnaly issues the flush
in generic_file_fsync.

That option would be enabled from userspace mount tool.

(-As discussed with Jan, we could add barrier/no barrier in older filesystems
but this would mix both filesystem and vfs barrier terminology plus
generic_file_fsync does not have access to FS sbi.
-Some modern FS will be involved as well ; eg no journal ext4 partitions.
)

Cc: Jan Kara <[email protected]>
Cc: [email protected]
Suggested-by: Jan Kara <[email protected]>
Reviewed-by: Jan Kara <[email protected]>
Signed-off-by: Fabian Frederick <[email protected]>
---
fs/libfs.c | 7 +++++++
include/uapi/linux/fs.h | 1 +
2 files changed, 8 insertions(+)

diff --git a/fs/libfs.c b/fs/libfs.c
index a184424..21983d9 100644
--- a/fs/libfs.c
+++ b/fs/libfs.c
@@ -3,6 +3,7 @@
* Library for filesystems writers.
*/

+#include <linux/blkdev.h>
#include <linux/export.h>
#include <linux/pagemap.h>
#include <linux/slab.h>
@@ -952,6 +953,12 @@ int generic_file_fsync(struct file *file, loff_t start, loff_t end,
err = sync_inode_metadata(inode, 1);
if (ret == 0)
ret = err;
+ if (inode->i_sb->s_flags & MS_BARRIER) {
+ err = blkdev_issue_flush(inode->i_sb->s_bdev, GFP_KERNEL, NULL);
+ if (ret == 0)
+ ret = err;
+ }
+
out:
mutex_unlock(&inode->i_mutex);
return ret;
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index ca1a11b..7c14ebd 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -89,6 +89,7 @@ struct inodes_stat_t {
#define MS_KERNMOUNT (1<<22) /* this is a kern_mount call */
#define MS_I_VERSION (1<<23) /* Update inode I_version field */
#define MS_STRICTATIME (1<<24) /* Always perform atime updates */
+#define MS_BARRIER (1<<25) /* Flush data during sync operations */

/* These sb flags are internal to the kernel */
#define MS_NOSEC (1<<28)
--
1.8.4.5


2014-04-11 07:45:29

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 1/1] FS: Add generic data flush to fsync

On Fri, Apr 11, 2014 at 06:52:42AM +0200, Fabian Frederick wrote:
> Currently, there's no generic HW flush management support in kernel.
> Only 7 filesystems have their own way to manage the barrier flag.
>
> This patch adds MS_BARRIER in sb_flags and conditionnaly issues the flush
> in generic_file_fsync.

Please don't add a binary mount option for something thast already is a
text option in many filesystems. We can actually enable/disable cache
flushes at the block device level, so I think this can be
unconditional.

2014-04-11 13:00:35

by Jan Kara

[permalink] [raw]
Subject: Re: [PATCH 1/1] FS: Add generic data flush to fsync

On Fri 11-04-14 00:45:22, Christoph Hellwig wrote:
> On Fri, Apr 11, 2014 at 06:52:42AM +0200, Fabian Frederick wrote:
> > Currently, there's no generic HW flush management support in kernel.
> > Only 7 filesystems have their own way to manage the barrier flag.
> >
> > This patch adds MS_BARRIER in sb_flags and conditionnaly issues the flush
> > in generic_file_fsync.
>
> Please don't add a binary mount option for something thast already is a
> text option in many filesystems. We can actually enable/disable cache
> flushes at the block device level, so I think this can be
> unconditional.
How can those be disabled? I didn't find that by a quick look through
/sys and flush code in kernel...

Honza
--
Jan Kara <[email protected]>
SUSE Labs, CR

2014-04-11 21:03:40

by Fabian Frédérick

[permalink] [raw]
Subject: Re: [PATCH 1/1] FS: Add generic data flush to fsync

On Fri, 11 Apr 2014 00:45:22 -0700
Christoph Hellwig <[email protected]> wrote:

> On Fri, Apr 11, 2014 at 06:52:42AM +0200, Fabian Frederick wrote:
> > Currently, there's no generic HW flush management support in kernel.
> > Only 7 filesystems have their own way to manage the barrier flag.
> >
> > This patch adds MS_BARRIER in sb_flags and conditionnaly issues the flush
> > in generic_file_fsync.
>
> Please don't add a binary mount option for something thast already is a
> text option in many filesystems. We can actually enable/disable cache
> flushes at the block device level, so I think this can be
> unconditional.

Ok, that would mean 'older' filesystems -without barrier flag-
to call generic_file_fsync would result in a flush but ext4 without journal
also calls it ; maybe with nobarrier and I'm not sure it's the only case.
How can I deal with this problem ?

2014-04-14 07:22:46

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH 1/1] FS: Add generic data flush to fsync

On Fri, Apr 11, 2014 at 11:05:30PM +0200, Fabian Frederick wrote:
> Ok, that would mean 'older' filesystems -without barrier flag-
> to call generic_file_fsync would result in a flush but ext4 without journal
> also calls it ; maybe with nobarrier and I'm not sure it's the only case.
> How can I deal with this problem ?

I don't really think not implementing the barrier option is a problem.
For ext4 if you want consistency either opencode generic_file_fsync
there and add the check, or add a __generic_file_fsync that doesn't
do the flush.

2014-04-15 17:02:30

by Fabian Frédérick

[permalink] [raw]
Subject: Re: [PATCH 1/1] FS: Add generic data flush to fsync

On Mon, 14 Apr 2014 00:22:42 -0700
Christoph Hellwig <[email protected]> wrote:

> On Fri, Apr 11, 2014 at 11:05:30PM +0200, Fabian Frederick wrote:
> > Ok, that would mean 'older' filesystems -without barrier flag-
> > to call generic_file_fsync would result in a flush but ext4 without journal
> > also calls it ; maybe with nobarrier and I'm not sure it's the only case.
> > How can I deal with this problem ?
>
> I don't really think not implementing the barrier option is a problem.
> For ext4 if you want consistency either opencode generic_file_fsync
> there and add the check, or add a __generic_file_fsync that doesn't
> do the flush.

I've been searching once again for generic_file_fsync callsites and EXT4
is the only filesystem with barrier flag calling generic_file_fsync.

If Ted or Jan could tell me whether EXT4 without journal can issue the flush
(blkdev_issue_flush) whatever option has been chosen for 'barrier' (as that option stands for journal barrier) -> simple case, just add blkdev_issue_flush.

or if it requires looking at barrier flag in ext4/fsync.c
In that case I'd be interested to know how (needs_barrier is based on journal
flags) ...

-> generic_file_fsync calls __generic_file_fsync (flush=true)
-> ext4 calls __generic_file_fsync(false) if nobarrier.

Fabian