Subject: [PATCH] ext4: fix blkdev_issue_flush() failure handling

blkdev_issue_flush() may fail (e.g. due to a media error on FLUSH CACHE
command execution), so its users should check the return value.

Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
---
fs/ext4/fsync.c | 9 ++++++---
1 file changed, 6 insertions(+), 3 deletions(-)

Index: b/fs/ext4/fsync.c
===================================================================
--- a/fs/ext4/fsync.c
+++ b/fs/ext4/fsync.c
@@ -48,7 +48,7 @@ int ext4_sync_file(struct file *file, st
 {
 	struct inode *inode = dentry->d_inode;
 	journal_t *journal = EXT4_SB(inode->i_sb)->s_journal;
-	int ret = 0;
+	int ret = 0, tmp_ret;

 	J_ASSERT(ext4_journal_current_handle() == NULL);

@@ -92,8 +92,11 @@ int ext4_sync_file(struct file *file, st
 			.nr_to_write = 0, /* sys_fsync did this */
 		};
 		ret = sync_inode(inode, &wbc);
-		if (journal && (journal->j_flags & JBD2_BARRIER))
-			blkdev_issue_flush(inode->i_sb->s_bdev, NULL);
+		if (journal && (journal->j_flags & JBD2_BARRIER)) {
+			tmp_ret = blkdev_issue_flush(inode->i_sb->s_bdev, NULL);
+			if (ret == 0 && tmp_ret < 0 && tmp_ret != -EOPNOTSUPP)
+				ret = tmp_ret;
+		}
 	}
 out:
 	return ret;
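
The error-merging rule in the second hunk can be read in isolation as the
sketch below. It is a userspace illustration only: merge_flush_error() is a
hypothetical helper name, not something the patch adds. The reading that
-EOPNOTSUPP merely means the device cannot service a flush (and so is not
an fsync failure) is an assumption based on the patch's own check.

```c
#include <errno.h>

/*
 * Hypothetical helper (not part of the patch): combine the result of
 * sync_inode() with the result of the cache flush.
 *
 * - An earlier error in 'ret' is never overwritten.
 * - -EOPNOTSUPP from the flush is ignored, mirroring the patch.
 * - Any other negative flush result becomes the return value.
 */
static int merge_flush_error(int ret, int flush_ret)
{
	if (ret == 0 && flush_ret < 0 && flush_ret != -EOPNOTSUPP)
		return flush_ret;
	return ret;
}
```

With this rule a media error on the flush (-EIO) reaches the fsync()
caller, while an error already reported by sync_inode() keeps priority.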


2009-03-29 17:43:32

by Eric Sandeen

Subject: Re: [PATCH] ext4: fix blkdev_issue_flush() failure handling

Bartlomiej Zolnierkiewicz wrote:
> blkdev_issue_flush() may fail (e.g. due to a media error on FLUSH CACHE
> command execution), so its users should check the return value.
>
> Signed-off-by: Bartlomiej Zolnierkiewicz <[email protected]>
> ---
> fs/ext4/fsync.c | 9 ++++++---
> 1 file changed, 6 insertions(+), 3 deletions(-)
>
> Index: b/fs/ext4/fsync.c
> ===================================================================
> --- a/fs/ext4/fsync.c
> +++ b/fs/ext4/fsync.c
> @@ -48,7 +48,7 @@ int ext4_sync_file(struct file *file, st
>  {
>  	struct inode *inode = dentry->d_inode;
>  	journal_t *journal = EXT4_SB(inode->i_sb)->s_journal;
> -	int ret = 0;
> +	int ret = 0, tmp_ret;
>
>  	J_ASSERT(ext4_journal_current_handle() == NULL);
>
> @@ -92,8 +92,11 @@ int ext4_sync_file(struct file *file, st
>  			.nr_to_write = 0, /* sys_fsync did this */
>  		};
>  		ret = sync_inode(inode, &wbc);
> -		if (journal && (journal->j_flags & JBD2_BARRIER))
> -			blkdev_issue_flush(inode->i_sb->s_bdev, NULL);
> +		if (journal && (journal->j_flags & JBD2_BARRIER)) {
> +			tmp_ret = blkdev_issue_flush(inode->i_sb->s_bdev, NULL);
> +			if (ret == 0 && tmp_ret < 0 && tmp_ret != -EOPNOTSUPP)
> +				ret = tmp_ret;
> +		}
>  	}
>  out:
>  	return ret;

As long as we keep the call there this is probably good, but after
talking w/ Chris Mason, I think the call is extraneous anyway and should
probably just be removed...

-Eric

2009-03-30 02:25:27

by Theodore Ts'o

Subject: Re: [PATCH] ext4: fix blkdev_issue_flush() failure handling

On Sun, Mar 29, 2009 at 12:43:22PM -0500, Eric Sandeen wrote:
>
> As long as we keep the call there this is probably good, but after
> talking w/ Chris Mason, I think the call is extraneous anyway and should
> probably just be removed...
>

Yes, I agree, but it takes a lot of digging to be completely sure
that it's safe to remove it. Interestingly, it was you who added the
patch which added the call to blkdev_issue_flush():

commit d755fb384250d6bd7fd18a0930e71965acc8e72e
Author: Eric Sandeen <[email protected]>
Date: Fri Jul 11 19:27:31 2008 -0400

ext4: call blkdev_issue_flush on fsync

To ensure that bits are truly on-disk after an fsync,
we should call blkdev_issue_flush if barriers are supported.

Inspired by an old thread on barriers, by reiserfs & xfs
which do the same, and by a patch SuSE ships with their kernel

Signed-off-by: Eric Sandeen <[email protected]>
Signed-off-by: Mingming Cao <[email protected]>
Signed-off-by: "Theodore Ts'o" <[email protected]>

When we remove it we should add a comment noting why it's not
necessary.

- Ted

2009-03-30 03:22:20

by Eric Sandeen

Subject: Re: [PATCH] ext4: fix blkdev_issue_flush() failure handling

Theodore Tso wrote:
> On Sun, Mar 29, 2009 at 12:43:22PM -0500, Eric Sandeen wrote:
>> As long as we keep the call there this is probably good, but after
>> talking w/ Chris Mason, I think the call is extraneous anyway and should
>> probably just be removed...
>>
>
> Yes, I agree, but it takes a lot of digging to be completely sure
> that it's safe to remove it. Interestingly, it was you who added the
> patch which added the call to blkdev_issue_flush():

> commit d755fb384250d6bd7fd18a0930e71965acc8e72e
> Author: Eric Sandeen <[email protected]>
> Date: Fri Jul 11 19:27:31 2008 -0400


Yes, it was. Although I got the idea when hch pointed out that SuSE did
this... thanks to Chris. It's come full circle. :)

-Eric

2009-03-30 11:48:14

by Chris Mason

Subject: Re: [PATCH] ext4: fix blkdev_issue_flush() failure handling

On Sun, 2009-03-29 at 22:22 -0500, Eric Sandeen wrote:
> Theodore Tso wrote:
> > On Sun, Mar 29, 2009 at 12:43:22PM -0500, Eric Sandeen wrote:
> >> As long as we keep the call there this is probably good, but after
> >> talking w/ Chris Mason, I think the call is extraneous anyway and should
> >> probably just be removed...
> >>
> >
> > Yes, I agree, but it takes a lot of digging to be completely sure
> > that it's safe to remove it. Interestingly, it was you who added the
> > patch which added the call to blkdev_issue_flush():
>
> > commit d755fb384250d6bd7fd18a0930e71965acc8e72e
> > Author: Eric Sandeen <[email protected]>
> > Date: Fri Jul 11 19:27:31 2008 -0400
>
>
> Yes, it was. Although I got the idea when hch pointed out that SuSE did
> this... thanks to Chris. It's come full circle. :)

Grin. I'm not sure the I_DIRTY checks alone are enough to decide that a
commit is required though. I think the inode could be clean but still
have metadata that needs commit.

-chris



Subject: Re: [PATCH] ext4: fix blkdev_issue_flush() failure handling

Chris Mason wrote:
> On Sun, 2009-03-29 at 22:22 -0500, Eric Sandeen wrote:
>> Theodore Tso wrote:
>>> On Sun, Mar 29, 2009 at 12:43:22PM -0500, Eric Sandeen wrote:
>>>> As long as we keep the call there this is probably good, but after
>>>> talking w/ Chris Mason, I think the call is extraneous anyway and should
>>>> probably just be removed...
>>>>
>>> Yes, I agree, but it takes a lot of digging to be completely sure
>>> that it's safe to remove it. Interestingly, it was you who added the
>>> patch which added the call to blkdev_issue_flush():
>>> commit d755fb384250d6bd7fd18a0930e71965acc8e72e
>>> Author: Eric Sandeen <[email protected]>
>>> Date: Fri Jul 11 19:27:31 2008 -0400
>>
>> Yes, it was. Although I got the idea when hch pointed out that SuSE did
>> this... thanks to Chris. It's come full circle. :)
>
> Grin. I'm not sure the I_DIRTY checks alone are enough to decide that a
> commit is required though. I think the inode could be clean but still
> have metadata that needs commit.

Chris, I have just sent patches that attempt to fix both ext3 and
ext4 while also adding a per-device sysfs knob to disable
write-flushes. A previous version of this patch set added a new
generic mount option but comments from Christoph and others
convinced me to turn it into a per-device tunable. Could you take
a look at the patches?

Bartlomiej, I have just noticed that I happened to be working on
patches for reiserfs and xfs similar to the ones you sent earlier
this week. I picked some bits from your submission so I took the
liberty to add your Signed-off-by to my patches. Could you take a
look at them and let me know if you are comfortable with that?

Latest patches: http://lkml.org/lkml/2009/3/30/100
Beginning of the sub-thread: http://lkml.org/lkml/2009/3/29/28

Best regards,

Fernando

2009-03-30 13:24:19

by Chris Mason

Subject: Re: [PATCH] ext4: fix blkdev_issue_flush() failure handling

On Mon, 2009-03-30 at 22:01 +0900, Fernando Luis Vázquez Cao wrote:
> Chris Mason wrote:
> > On Sun, 2009-03-29 at 22:22 -0500, Eric Sandeen wrote:
> >> Theodore Tso wrote:
> >>> On Sun, Mar 29, 2009 at 12:43:22PM -0500, Eric Sandeen wrote:
> >>>> As long as we keep the call there this is probably good, but after
> >>>> talking w/ Chris Mason, I think the call is extraneous anyway and should
> >>>> probably just be removed...
> >>>>
> >>> Yes, I agree, but it takes a lot of digging to be completely sure
> >>> that it's safe to remove it. Interestingly, it was you who added the
> >>> patch which added the call to blkdev_issue_flush():
> >>> commit d755fb384250d6bd7fd18a0930e71965acc8e72e
> >>> Author: Eric Sandeen <[email protected]>
> >>> Date: Fri Jul 11 19:27:31 2008 -0400
> >>
> >> Yes, it was. Although I got the idea when hch pointed out that SuSE did
> >> this... thanks to Chris. It's come full circle. :)
> >
> > Grin. I'm not sure the I_DIRTY checks alone are enough to decide that a
> > commit is required though. I think the inode could be clean but still
> > have metadata that needs commit.
>
> Chris, I have just sent patches that attempt to fix both ext3 and
> ext4 while also adding a per-device sysfs knob to disable
> write-flushes. A previous version of this patch set added a new
> generic mount option but comments from Christoph and others
> convinced me to turn it into a per-device tunable. Could you take
> a look at the patches?
>

Jens' comments are right on, I think. If we get that fixed up we can get
rid of all the filesystem mount -o barrier=flush,0,1,xyz confusion and
set it via the block devices directly.

That would be nice ;)

-chris

2009-03-30 14:25:41

by Theodore Ts'o

Subject: Re: [PATCH] ext4: fix blkdev_issue_flush() failure handling

On Mon, Mar 30, 2009 at 10:01:16PM +0900, Fernando Luis Vázquez Cao wrote:
>
> Chris, I have just sent patches that attempt to fix both ext3 and
> ext4 while also adding a per-device sysfs knob to disable
> write-flushes. A previous version of this patch set added a new
> generic mount option but comments from Christoph and others
> convinced me to turn it into a per-device tunable. Could you take
> a look at the patches?

Fernando, see my comments on those patches. We don't need to issue a
barrier after a call to sync_inode() or ext[34]_force_commit(), since
those functions will issue a barrier for us. It would probably be a
good idea to use blktrace to test and make sure that we have one and
exactly one barrier op issued for each fsync().

- Ted
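
One way to drive the blktrace check suggested above is a tiny userspace
probe that performs a single write followed by a single fsync(). The sketch
below is only an illustration: write_and_fsync() is a hypothetical helper
name, and the payload is arbitrary.

```c
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * Hypothetical probe: write a few bytes to 'path' and fsync() them once.
 * Returns 0 on success or a negative errno value on failure.
 */
static int write_and_fsync(const char *path)
{
	static const char payload[] = "fsync probe\n";
	const size_t len = sizeof(payload) - 1;
	ssize_t n;
	int fd, ret = 0;

	fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
	if (fd < 0)
		return -errno;

	n = write(fd, payload, len);
	if (n < 0)
		ret = -errno;
	else if ((size_t)n != len)
		ret = -EIO;		/* short write, treated as an error here */
	else if (fsync(fd) < 0)
		ret = -errno;		/* this is where a flush failure would surface */

	if (close(fd) < 0 && ret == 0)
		ret = -errno;
	return ret;
}
```

Calling this on a file in the filesystem under test while something like
"blktrace -d /dev/sdX -o - | blkparse -i -" runs on the underlying device
(the device name is a placeholder) should make it possible to count the
barrier or flush requests issued per fsync(). How those requests are
flagged in the blkparse output depends on the kernel version.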

Subject: Re: [PATCH] ext4: fix blkdev_issue_flush() failure handling


Hi,

On Monday 30 March 2009, Fernando Luis Vázquez Cao wrote:
> Chris Mason wrote:
> > On Sun, 2009-03-29 at 22:22 -0500, Eric Sandeen wrote:
> >> Theodore Tso wrote:
> >>> On Sun, Mar 29, 2009 at 12:43:22PM -0500, Eric Sandeen wrote:
> >>>> As long as we keep the call there this is probably good, but after
> >>>> talking w/ Chris Mason, I think the call is extraneous anyway and should
> >>>> probably just be removed...
> >>>>
> >>> Yes, I agree, but it takes a lot of digging to be completely sure
> >>> that it's safe to remove it. Interestingly, it was you who added the
> >>> patch which added the call to blkdev_issue_flush():
> >>> commit d755fb384250d6bd7fd18a0930e71965acc8e72e
> >>> Author: Eric Sandeen <[email protected]>
> >>> Date: Fri Jul 11 19:27:31 2008 -0400
> >>
> >> Yes, it was. Although I got the idea when hch pointed out that SuSE did
> >> this... thanks to Chris. It's come full circle. :)
> >
> > Grin. I'm not sure the I_DIRTY checks alone are enough to decide that a
> > commit is required though. I think the inode could be clean but still
> > have metadata that needs commit.
>
> Chris, I have just sent patches that attempt to fix both ext3 and
> ext4 while also adding a per-device sysfs knob to disable
> write-flushes. A previous version of this patch set added a new
> generic mount option but comments from Christoph and others
> convinced me to turn it into a per-device tunable. Could you take
> a look at the patches?
>
> Bartlomiej, I have just noticed that I happened to be working on
> patches for reiserfs and xfs similar to the ones you sent earlier
> this week. I picked some bits from your submission so I took the
> liberty to add your Signed-off-by to my patches. Could you take a
> look at them and let me know if you are comfortable with that?

I'm fine with people building bigger changes on top of my patches
but if you do so, please clearly denote in the patch description
what changes you have applied to the original patch...

Thanks,
Bart

2009-03-30 17:46:07

by Theodore Ts'o

Subject: Re: [PATCH] ext4: fix blkdev_issue_flush() failure handling

On Mon, Mar 30, 2009 at 07:47:47AM -0400, Chris Mason wrote:
> > Yes, it was. Although I got the idea when hch pointed out that SuSE did
> > this... thanks to Chris. It's come full circle. :)
>
> Grin. I'm not sure the I_DIRTY checks alone are enough to decide that a
> commit is required though. I think the inode could be clean but still
> have metadata that needs commit.

So to close this hole, I think what we can do is to track the last
transaction id where ext4_do_update_inode() was called, and if that
transaction id == the currently running transaction id, then we need
to call ext4_force_commit() even though the inode is clean. I think
that should fix up the race that you're concerned about.

- Ted
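
The transaction-id idea above can be sketched in plain C as follows. This
is a simplified model, not ext4 or jbd2 code: every type and function name
here (sketch_inode, sketch_journal, sketch_note_inode_update,
sketch_fsync_needs_commit) is hypothetical, and locking is ignored.

```c
#include <stdbool.h>

typedef unsigned int sketch_tid_t;	/* stand-in for a journal transaction id */

struct sketch_journal {
	bool has_running;		/* is there a running transaction? */
	sketch_tid_t running_tid;	/* its id, valid if has_running */
};

struct sketch_inode {
	bool dirty;			/* models the I_DIRTY checks */
	bool ever_updated;		/* has an update been recorded yet? */
	sketch_tid_t last_update_tid;	/* transaction of the last inode update */
};

/* Would be called wherever the inode is written into the journal. */
static void sketch_note_inode_update(struct sketch_inode *inode,
				     const struct sketch_journal *journal)
{
	inode->ever_updated = true;
	inode->last_update_tid = journal->running_tid;
}

/*
 * fsync-time decision: commit if the inode is dirty, or if it is clean
 * but its last update still sits in the running (uncommitted) transaction.
 */
static bool sketch_fsync_needs_commit(const struct sketch_inode *inode,
				      const struct sketch_journal *journal)
{
	if (inode->dirty)
		return true;
	return journal->has_running && inode->ever_updated &&
	       inode->last_update_tid == journal->running_tid;
}
```

In this model a clean inode whose metadata was updated in the running
transaction still forces a commit, which is the case Chris describes.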

Subject: Re: [PATCH] ext4: fix blkdev_issue_flush() failure handling

Bartlomiej Zolnierkiewicz wrote:
> Hi,
>
> On Monday 30 March 2009, Fernando Luis Vázquez Cao wrote:
>> Chris Mason wrote:
>>> On Sun, 2009-03-29 at 22:22 -0500, Eric Sandeen wrote:
>>>> Theodore Tso wrote:
>>>>> On Sun, Mar 29, 2009 at 12:43:22PM -0500, Eric Sandeen wrote:
>>>>>> As long as we keep the call there this is probably good, but after
>>>>>> talking w/ Chris Mason, I think the call is extraneous anyway and should
>>>>>> probably just be removed...
>>>>>>
>>>>> Yes, I agree, but it takes a lot of digging to be completely sure
>>>>> that it's safe to remove it. Interestingly, it was you who added the
>>>>> patch which added the call to blkdev_issue_flush():
>>>>> commit d755fb384250d6bd7fd18a0930e71965acc8e72e
>>>>> Author: Eric Sandeen <[email protected]>
>>>>> Date: Fri Jul 11 19:27:31 2008 -0400
>>>> Yes, it was. Although I got the idea when hch pointed out that SuSE did
>>>> this... thanks to Chris. It's come full circle. :)
>>> Grin. I'm not sure the I_DIRTY checks alone are enough to decide that a
>>> commit is required though. I think the inode could be clean but still
>>> have metadata that needs commit.
>> Chris, I have just sent patches that attempt to fix both ext3 and
>> ext4 while also adding a per-device sysfs knob to disable
>> write-flushes. A previous version of this patch set added a new
>> generic mount option but comments from Christoph and others
>> convinced me to turn it into a per-device tunable. Could you take
>> a look at the patches?
>>
>> Bartlomiej, I have just noticed that I happened to be working on
>> patches for reiserfs and xfs similar to the ones you sent earlier
>> this week. I picked some bits from your submission so I took the
>> liberty to add your Signed-off-by to my patches. Could you take a
>> look at them and let me know if you are comfortable with that?
>
> I'm fine with people building bigger changes on top of my patches
> but if you do so, please clearly denote in the patch description
> what changes you have applied to the original patch...

You are right, sorry about that. I will add a short changelog when I
resubmit the patches.

Thanks!

- Fernando

Subject: Re: [PATCH] ext4: fix blkdev_issue_flush() failure handling

Chris Mason wrote:
> On Mon, 2009-03-30 at 22:01 +0900, Fernando Luis Vázquez Cao wrote:
>> Chris Mason wrote:
>>> On Sun, 2009-03-29 at 22:22 -0500, Eric Sandeen wrote:
>>>> Theodore Tso wrote:
>>>>> On Sun, Mar 29, 2009 at 12:43:22PM -0500, Eric Sandeen wrote:
>>>>>> As long as we keep the call there this is probably good, but after
>>>>>> talking w/ Chris Mason, I think the call is extraneous anyway and should
>>>>>> probably just be removed...
>>>>>>
>>>>> Yes, I agree, but it takes a lot of digging to be completely sure
>>>>> that it's safe to remove it. Interestingly, it was you who added the
>>>>> patch which added the call to blkdev_issue_flush():
>>>>> commit d755fb384250d6bd7fd18a0930e71965acc8e72e
>>>>> Author: Eric Sandeen <[email protected]>
>>>>> Date: Fri Jul 11 19:27:31 2008 -0400
>>>> Yes, it was. Although I got the idea when hch pointed out that SuSE did
>>>> this... thanks to Chris. It's come full circle. :)
>>> Grin. I'm not sure the I_DIRTY checks alone are enough to decide that a
>>> commit is required though. I think the inode could be clean but still
>>> have metadata that needs commit.
>> Chris, I have just sent patches that attempt to fix both ext3 and
>> ext4 while also adding a per-device sysfs knob to disable
>> write-flushes. A previous version of this patch set added a new
>> generic mount option but comments from Christoph and others
>> convinced me to turn it into a per-device tunable. Could you take
>> a look at the patches?
>>
>
> Jens' comments are right on, I think. If we get that fixed up we can get
> rid of all the filesystem mount -o barrier=flush,0,1,xyz confusion and
> set it via the block devices directly.
>
> That would be nice ;)

Thank you for your feedback, Chris! I will address some of the issues spotted
in the mailing list and resend the whole patch-set.

Regards,

Fernando

Subject: Re: [PATCH] ext4: fix blkdev_issue_flush() failure handling

Theodore Tso wrote:
> On Mon, Mar 30, 2009 at 10:01:16PM +0900, Fernando Luis Vázquez Cao wrote:
>> Chris, I have just sent patches that attempt to fix both ext3 and
>> ext4 while also adding a per-device sysfs knob to disable
>> write-flushes. A previous version of this patch set added a new
>> generic mount option but comments from Christoph and others
>> convinced me to turn it into a per-device tunable. Could you take
>> a look at the patches?
>
> Fernando, see my comments on those patches. We don't need to issue a
> barrier after a call to sync_inode() or ext[34]_force_commit(), since
> those functions will issue a barrier for us. It would probably be a
> good idea to use blktrace to test and make sure that we have one and
> exactly one barrier op issued for each fsync().

I'll give blktrace a spin and check if things are working as expected.

Thanks!

- Fernando