Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1163476AbdDWWjK (ORCPT ); Sun, 23 Apr 2017 18:39:10 -0400 Received: from mx2.suse.de ([195.135.220.15]:33337 "EHLO mx1.suse.de" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1163459AbdDWWjF (ORCPT ); Sun, 23 Apr 2017 18:39:05 -0400 From: NeilBrown To: Jeff Layton , linux-fsdevel@vger.kernel.org Date: Mon, 24 Apr 2017 08:38:54 +1000 Cc: linux-kernel@vger.kernel.org, akpm@linux-foundation.org, tytso@mit.edu, jack@suse.cz, willy@infradead.org, viro@zeniv.linux.org.uk Subject: Re: [PATCH v2 08/17] fs: retrofit old error reporting API onto new infrastructure In-Reply-To: <1492778818.7308.8.camel@redhat.com> References: <20170412120614.6111-1-jlayton@redhat.com> <20170412120614.6111-9-jlayton@redhat.com> <87fuhduvcv.fsf@notabene.neil.brown.name> <1492036881.19286.1.camel@redhat.com> <87vaq2tzhu.fsf@notabene.neil.brown.name> <1492778818.7308.8.camel@redhat.com> Message-ID: <87pog2rbpd.fsf@notabene.neil.brown.name> MIME-Version: 1.0 Content-Type: multipart/signed; boundary="=-=-="; micalg=pgp-sha256; protocol="application/pgp-signature" Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4465 Lines: 104 --=-=-= Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable On Fri, Apr 21 2017, Jeff Layton wrote: > On Tue, 2017-04-18 at 08:56 +1000, NeilBrown wrote: >> On Wed, Apr 12 2017, Jeff Layton wrote: >>=20 >> > On Thu, 2017-04-13 at 08:14 +1000, NeilBrown wrote: >> > >=20 >> > > I suspect that the filemap_check_wb_error() will need to be moved >> > > into some parent of the current call site, which is essentially what= you >> > > suggest below. It would be nice if we could do that first, rather t= han >> > > having the current rather odd code. But maybe this way is an easier >> > > transition. It isn't obviously wrong, it just isn't obviously right >> > > either. >> > >=20 >> >=20 >> > Yeah. It's just such a daunting task to have to change so much of the >> > existing code. I'm looking for ways to make this simpler. >> >=20 >> > I think it probably is reasonable for filemap_write_and_wait* to just >> > sample it as early as possible in those functions. filemap_fdatawait is >> > the real questionable one, as you may have already had some writebacks >> > complete with errors. >> >=20 >> > In any case, my thinking was that the old code is not obviously correct >> > either, so while this shortens the "error capture window" on these >> > calls, it seems like a reasonable place to start improving things. >>=20 >> I agree. It wouldn't hurt to add a note to this effect in the patch >> comment so that people understand that the code isn't seen to be >> "correct" but only "no worse" with clear direction on what sort of >> improvement might be appropriate. >>=20 > > I've got a cleaned-up set that is getting close to ready for > reposting.=C2=A0Before I do though, I think there is another option here > that's worth discussing. > > We could store a second wb_err_t (aka errseq_t in the new set) in the > mapping that would would basically act as a "cursor" for these cases. > filemap_check_errors would need to do something like=20 > filemap_report_wb_error, but it would swap the value into the mapping's > cursor instead of dealing with the one in struct file. > > I don't really like adding yet another field here, but the struct > address_space definition has this: > > __attribute__((aligned(sizeof(long)))); > > Adding the wb_err field means that we end up growing the struct by 8 > bytes on x86_64 anyway. Adding another 4 bytes would just consume the > pad, so it wouldn't cost anything there. YMMV on other arches of > course. > > That's also not perfectly like what we have with AS_EIO/AS_ENOSPC > flags, but is probably close enough not to matter. > > So...this would let us limp along for even longer with the model of > reporting since last check. I'm not sure that's a good thing though. A > long term goal here is to have kernel code that's dealing with > writeback be more deliberate about the point from which it's checking > errors, and this doesn't help promote that. I think this question needs some input from filesystem developers who might be affected by the answer. My preference is to not add this field. I think we would eventually want to remove it again, and it is easier to ensure it doesn't stay forever if it is never added. The version without this field isn't (I think) too bad, but maybe it is bad enough to motivate fs developers to create a better solution in each individual case. If some filesystem developer says they don't like that sort of social engineering, or objects for any other reason, I will bow to the superior stake they hold. NeilBrown --=-=-= Content-Type: application/pgp-signature; name="signature.asc" -----BEGIN PGP SIGNATURE----- iQIzBAEBCAAdFiEEG8Yp69OQ2HB7X0l6Oeye3VZigbkFAlj9LP4ACgkQOeye3VZi gbnXmg/8D8t9AQrUaxRQKAYrmTzMuZVGMHjZUUUsC/Tuu+3lw+4t7iJwramyIXZ4 fBJF3M7Z0jgZD3CadrW9ZwyTIpDHXixDO5M9OPKz01XX/MLcfoFS0wraNvuLVk46 XXM8r2YaiPsEJr/spCSm/NdRJj/wwXTYsjFUpu0WyNHYp7Zkqr8W9TAeciqbi8yu pLB8Uq4exO3nxgHTeDqAorocsgs6lnd5KOQCcORlVdbUgmjX9DWWKa0IdaXN+UPT 7oMijLhEZvvE7xv7yL8rzE7+cOry4RAjeLPy0WCu9plaGLIbQ+LUXCsDCsQOLbjw GEas/lCb3GFgU2lJudZ0ZRS8ibWxdQobOK4YZSL29lCoRJhn0CUgV6bUDLh3w3UN pGvIQYUaifJAEJtVFEeXioDNJT3MtGJae0b+Numzecco3OruafxyCDyWuc+1htIc X/livj7++BgZhRV1VCnPc7N5YgVXI/SfzCUhDI6rJRmzRpwA92Pdx6t8hCWlsnF3 k1+KGS9CUpVmQSoz0HIfg03UIreSHD6rXRRb4jshE8EW7FtJ0fL8HLQt5H0N0UUn uBCpvO88x9ZdvRbjkzXXIeoIrk8U9ZVXwakSieQOmJ/6Hms3NysHFLltW0aGRH/D KYngfwiNcdZmNb7jjS0wC4rj0BHoyAycv3DwjofLU9dpS6K12LM= =lLZ+ -----END PGP SIGNATURE----- --=-=-=--