From: Jeff Layton
Subject: Re: [PATCH v7 16/22] block: convert to errseq_t based writeback error tracking
Date: Mon, 26 Jun 2017 10:34:18 -0400
Message-ID: <1498487658.5168.8.camel@redhat.com>
References: <20170616193427.13955-1-jlayton@redhat.com>
 <20170616193427.13955-17-jlayton@redhat.com>
 <20170620123544.GC19781@infradead.org>
 <1497980684.4555.16.camel@redhat.com>
 <20170624115946.GA22561@infradead.org>
 <1498310166.4796.4.camel@redhat.com>
Mime-Version: 1.0
Content-Type: text/plain; charset="UTF-8"
Content-Transfer-Encoding: 7bit
Cc: Andrew Morton, Al Viro, Jan Kara, tytso@mit.edu, axboe@kernel.dk,
 mawilcox@microsoft.com, ross.zwisler@linux.intel.com, corbet@lwn.net,
 Chris Mason, Josef Bacik, David Sterba, "Darrick J . Wong",
 Carlos Maiolino, Eryu Guan, David Howells,
 linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
 linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org,
 linux-btrfs@vger.kernel.org, linux-block@vger.kernel.org
To: Christoph Hellwig
Return-path:
Received: from mail-qk0-f169.google.com ([209.85.220.169]:35662 "EHLO
 mail-qk0-f169.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1751491AbdFZOeV (ORCPT ); Mon, 26 Jun 2017 10:34:21 -0400
Received: by mail-qk0-f169.google.com with SMTP id 16so2742657qkg.2 for ;
 Mon, 26 Jun 2017 07:34:20 -0700 (PDT)
In-Reply-To: <1498310166.4796.4.camel@redhat.com>
Sender: linux-ext4-owner@vger.kernel.org
List-ID:

On Sat, 2017-06-24 at 09:16 -0400, Jeff Layton wrote:
> On Sat, 2017-06-24 at 04:59 -0700, Christoph Hellwig wrote:
> > On Tue, Jun 20, 2017 at 01:44:44PM -0400, Jeff Layton wrote:
> > > In order to query for errors with errseq_t, you need a previously-
> > > sampled point from which to check. When you call
> > > filemap_write_and_wait_range though you don't have a struct file and so
> > > no previously-sampled value.
> >
> > So can we simply introduce variants of them that take a struct file?
> > That would be:
> >
> > a) less churn
> > b) less code
> > c) less chance to get data integrity wrong
>
> Yeah, I had that thought after I sent the reply to you earlier.
>
> The main reason I didn't do that before was that I had myself convinced
> that we needed to do the check_and_advance as late as possible in the
> fsync process, after the metadata had been written.
>
> Now that I think about it more, I think you're probably correct. As long
> as we do the check and advance at some point after doing the
> write_and_wait, we're fine here and shouldn't violate exactly once
> semantics on the fsync return.

So I have a file_write_and_wait_range now that should DTRT for this
patch. The bigger question is -- what about more complex filesystems
like ext4?

There are a couple of cases where we can return -EIO or -EROFS on fsync
before filemap_write_and_wait_range is ever called. Like this one for
instance:

        if (unlikely(ext4_forced_shutdown(EXT4_SB(inode->i_sb))))
                return -EIO;

...and the EXT4_MF_FS_ABORTED case.

Are those conditions ever recoverable, such that a later fsync could
succeed? IOW, could I do a remount or something such that the existing
fds are left open and become usable again?

If so, then we really ought to advance the errseq_t in the file when we
catch those cases as well. If we have to do that, then it probably makes
sense to leave the ext4 patch as-is.

--
Jeff Layton
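
For readers following the thread, here is a rough userspace sketch of the
"sample, then check and advance" idea discussed above. It is only a
simplified model, not the kernel's errseq_t implementation: every model_*
name is hypothetical, the real errseq_t also carries a "seen" flag that
this model omits, and the early-exit branch merely illustrates the open
question about ext4's shutdown path rather than what any patch actually
does.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Simplified model: low 12 bits hold the errno, upper bits a counter. */
typedef uint32_t model_errseq_t;

#define MODEL_ERRNO_BITS 12
#define MODEL_ERRNO_MASK ((1u << MODEL_ERRNO_BITS) - 1)

/* Record a writeback error (a negative errno) in the shared value. */
static void model_errseq_set(model_errseq_t *eseq, int err)
{
	assert(err < 0 && (uint32_t)(-err) <= MODEL_ERRNO_MASK);
	uint32_t counter = (*eseq >> MODEL_ERRNO_BITS) + 1;
	*eseq = (counter << MODEL_ERRNO_BITS) | (uint32_t)(-err);
}

/*
 * Report an error recorded since *since was sampled, then advance
 * *since so the same error is not reported again via this cursor.
 */
static int model_errseq_check_and_advance(const model_errseq_t *eseq,
					  model_errseq_t *since)
{
	model_errseq_t cur = *eseq;

	if (cur == *since)
		return 0;
	*since = cur;
	return -(int)(cur & MODEL_ERRNO_MASK);
}

/* Hypothetical stand-ins for an address_space and a struct file. */
struct model_mapping {
	model_errseq_t wb_err;	/* shared by every open file */
};

struct model_file {
	struct model_mapping *mapping;
	model_errseq_t wb_err;	/* this file's sampled cursor */
};

/* Opening a file samples the current value of the mapping. */
static void model_open(struct model_file *f, struct model_mapping *m)
{
	f->mapping = m;
	f->wb_err = m->wb_err;
}

/*
 * Model of an fsync. If fs_shut_down is set we take an early exit, as
 * in the forced-shutdown case, but still advance the file's cursor so
 * that a later fsync does not report stale errors.
 */
static int model_fsync(struct model_file *f, int fs_shut_down)
{
	if (fs_shut_down) {
		(void)model_errseq_check_and_advance(&f->mapping->wb_err,
						     &f->wb_err);
		return -EIO;
	}
	/* A real fsync would write and wait on the data before this. */
	return model_errseq_check_and_advance(&f->mapping->wb_err,
					      &f->wb_err);
}
```

In this model, two files opened on the same mapping each see a given
writeback error once on their next model_fsync() and get 0 after that,
which is the "exactly once" behaviour referred to above. Whether the
early-exit path should advance the cursor at all depends on the answer to
the recoverability question.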