Message-ID: <1492778818.7308.8.camel@redhat.com>
Subject: Re: [PATCH v2 08/17] fs: retrofit old error reporting API onto new infrastructure
From: Jeff Layton
To: NeilBrown, linux-fsdevel@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, akpm@linux-foundation.org, tytso@mit.edu, jack@suse.cz, willy@infradead.org, viro@zeniv.linux.org.uk
Date: Fri, 21 Apr 2017 08:46:58 -0400
In-Reply-To: <87vaq2tzhu.fsf@notabene.neil.brown.name>
References: <20170412120614.6111-1-jlayton@redhat.com> <20170412120614.6111-9-jlayton@redhat.com> <87fuhduvcv.fsf@notabene.neil.brown.name> <1492036881.19286.1.camel@redhat.com> <87vaq2tzhu.fsf@notabene.neil.brown.name>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, 2017-04-18 at 08:56 +1000, NeilBrown wrote:
> On Wed, Apr 12 2017, Jeff Layton wrote:
>
> > On Thu, 2017-04-13 at 08:14 +1000, NeilBrown wrote:
> > >
> > > I suspect that the filemap_check_wb_error() will need to be moved
> > > into some parent of the current call site, which is essentially what you
> > > suggest below. It would be nice if we could do that first, rather than
> > > having the current rather odd code. But maybe this way is an easier
> > > transition. It isn't obviously wrong, it just isn't obviously right
> > > either.
> > >
> >
> > Yeah. It's just such a daunting task to have to change so much of the
> > existing code. I'm looking for ways to make this simpler.
> >
> > I think it probably is reasonable for filemap_write_and_wait* to just
> > sample it as early as possible in those functions. filemap_fdatawait is
> > the real questionable one, as you may have already had some writebacks
> > complete with errors.
> >
> > In any case, my thinking was that the old code is not obviously correct
> > either, so while this shortens the "error capture window" on these
> > calls, it seems like a reasonable place to start improving things.
>
> I agree. It wouldn't hurt to add a note to this effect in the patch
> comment so that people understand that the code isn't seen to be
> "correct" but only "no worse" with clear direction on what sort of
> improvement might be appropriate.
>

I've got a cleaned-up set that is getting close to ready for reposting.
Before I do though, I think there is another option here that's worth
discussing.

We could store a second wb_err_t (aka errseq_t in the new set) in the
mapping that would basically act as a "cursor" for these cases.
filemap_check_errors would need to do something like
filemap_report_wb_error, but it would swap the value into the mapping's
cursor instead of dealing with the one in struct file.

I don't really like adding yet another field here, but the struct
address_space definition has this:

	__attribute__((aligned(sizeof(long))));

Adding the wb_err field means that we end up growing the struct by 8
bytes on x86_64 anyway. Adding another 4 bytes would just consume the
pad, so it wouldn't cost anything there. YMMV on other arches of course.

That's also not perfectly like what we have with AS_EIO/AS_ENOSPC flags,
but is probably close enough not to matter.

So...this would let us limp along for even longer with the model of
reporting since last check. I'm not sure that's a good thing though. A
long term goal here is to have kernel code that's dealing with writeback
be more deliberate about the point from which it's checking errors, and
this doesn't help promote that.

-- 
Jeff Layton
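
For anyone who wants to check the size argument above, here is a rough
user-space sketch. It is not the kernel's struct address_space: the
mapping_* structs, errseq_model_t, the wb_err_cursor field name and the
helper functions are all made up for illustration, and the "cursor" check
is a toy comparison rather than the real wb_err_t/errseq_t logic. It
assumes GCC or Clang for __attribute__.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for wb_err_t/errseq_t: assumed to be 4 bytes. */
typedef uint32_t errseq_model_t;

/* Toy model of a mapping before any error field is added. */
struct mapping_base {
	void *host;
	unsigned long flags;
} __attribute__((aligned(sizeof(long))));

/* Same struct with a single 4-byte error field. */
struct mapping_one {
	void *host;
	unsigned long flags;
	errseq_model_t wb_err;
} __attribute__((aligned(sizeof(long))));

/* Same struct with the error field plus a hypothetical cursor field. */
struct mapping_two {
	void *host;
	unsigned long flags;
	errseq_model_t wb_err;
	errseq_model_t wb_err_cursor;
} __attribute__((aligned(sizeof(long))));

/* Bytes the struct grows by when the first 4-byte field is added. */
size_t first_field_cost(void)
{
	return sizeof(struct mapping_one) - sizeof(struct mapping_base);
}

/* Bytes the struct grows by when the second 4-byte field is added. */
size_t second_field_cost(void)
{
	return sizeof(struct mapping_two) - sizeof(struct mapping_one);
}

/*
 * Toy "report since last check" against the mapping's own cursor
 * instead of a per-file value. Returns 1 if wb_err changed since the
 * cursor was last advanced, 0 otherwise. The real code would also have
 * to deal with errno values and concurrent updates.
 */
int mapping_check_and_advance(struct mapping_two *m)
{
	if (m->wb_err == m->wb_err_cursor)
		return 0;
	m->wb_err_cursor = m->wb_err;
	return 1;
}
```

On a target where sizeof(long) is 8 (x86_64, for instance), the first
field should cost 8 bytes because of the alignment padding, and the second
should cost nothing since it lands in that pad. On 32-bit targets each
field would be expected to cost 4 bytes, which is the "YMMV" case.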