Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753973AbdCFLnr (ORCPT ); Mon, 6 Mar 2017 06:43:47 -0500 Received: from mail-qk0-f172.google.com ([209.85.220.172]:35651 "EHLO mail-qk0-f172.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753484AbdCFLnh (ORCPT ); Mon, 6 Mar 2017 06:43:37 -0500 Message-ID: <1488800614.2989.4.camel@redhat.com> Subject: Re: [PATCH 0/3] mm/fs: get PG_error out of the writeback reporting business From: Jeff Layton To: NeilBrown , viro@zeniv.linux.org.uk, konishi.ryusuke@lab.ntt.co.jp Cc: linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-kernel@vger.kernel.org, linux-nilfs@vger.kernel.org, Andrew Morton Date: Mon, 06 Mar 2017 06:43:34 -0500 In-Reply-To: <871subkst8.fsf@notabene.neil.brown.name> References: <20170305133535.6516-1-jlayton@redhat.com> <871subkst8.fsf@notabene.neil.brown.name> Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.22.5 (3.22.5-1.fc25) Mime-Version: 1.0 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2816 Lines: 70 On Mon, 2017-03-06 at 14:06 +1100, NeilBrown wrote: > On Sun, Mar 05 2017, Jeff Layton wrote: > > > I recently did some work to wire up -ENOSPC handling in ceph, and found > > I could get back -EIO errors in some cases when I should have instead > > gotten -ENOSPC. The problem was that the ceph writeback code would set > > PG_error on a writeback error, and that error would clobber the mapping > > error. > > > > While I fixed that problem by simply not setting that bit on errors, > > that led me down a rabbit hole of looking at how PG_error is being > > handled in the kernel. > > Speaking of rabbit holes... I thought to wonder how IO error propagate > up from NFS. > It doesn't use SetPageError or mapping_set_error() for files (except in > one case that looks a bit odd). > It has an "nfs_open_context" and store the latest error in ctx->error. > > So when you get around to documenting how this is supposed to work, it > would be worth while describing the required observable behaviour, and > note that while filesystems can use mapping_set_error() to achieve this, > they don't have to. > > I notice that > drivers/staging/lustre/lustre/llite/rw.c > fs/afs/write.c > fs/btrfs/extent_io.c > fs/cifs/file.c > fs/jffs2/file.c > fs/jfs/jfs_metapage.c > fs/ntfs/aops.c > > (and possible others) all have SetPageError() calls that seem to be > in response to a write error to a file, but don't appear to have > matching mapping_set_error() calls. Did you look at these? Did I miss > something? > Those are all in writepage implementations, and the callers of writepage all call mapping_set_error if it returns error. The exception is write_one_page, which is typically used for writing out dir info and such, and so it's not really necessary there. Now that I look though, I think I may have gotten the page migration codepath wrong. I had convinced myself it was ok before, but looking again, I think we probably need to add a mapping_set_error call to writeout() as well. I'll go over that more carefully in a little while. > > > > This patch series is a few fixes for things that I 100% noticed by > > inspection. I don't have a great way to test these since they involve > > error handling. I can certainly doctor up a kernel to inject errors > > in this code and test by hand however if these look plausible up front. > > > > Jeff Layton (3): > > nilfs2: set the mapping error when calling SetPageError on writeback > > mm: don't TestClearPageError in __filemap_fdatawait_range > > mm: set mapping error when launder_pages fails > > > > fs/nilfs2/segment.c | 1 + > > mm/filemap.c | 19 ++++--------------- > > mm/truncate.c | 6 +++++- > > 3 files changed, 10 insertions(+), 16 deletions(-) > > > > -- > > 2.9.3 -- Jeff Layton