Received: by 2002:a25:683:0:0:0:0:0 with SMTP id 125csp2105944ybg; Fri, 5 Jun 2020 05:53:48 -0700 (PDT) X-Google-Smtp-Source: ABdhPJyf5kok+DDFsXe4aGJhz6+SAUJ72T9sgwaWdbyNDX3UChxA1Gzm75azUxzVgqlxudjt8U0O X-Received: by 2002:a05:6402:17af:: with SMTP id j15mr8844286edy.67.1591361628497; Fri, 05 Jun 2020 05:53:48 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1591361628; cv=none; d=google.com; s=arc-20160816; b=jqILHKwMnhQU7PMFzY39zYaqcNpR+GnhqPJB9LU/W/DOEPS0L8VISgKe1xJC/tg+gR NdIBHFrdlWi+pP8r24ScWmQq3CQZ1cZvAn8JYQVxlL0T+mQPBJr2Pv9rzr76cRo7qVZf rQWy9vUxBjPkvx5jnWofeTqARcxkdzmrTF+G6bqoxwpFAWQ7LB4vf+KawjVudprdZIs0 n+TZn0zfArbGi2P+hJpKFpEYku4iZoSk7gdOoPtAWipCu/mrJk39v/NqAPJK4SpetWyW VOII8vIkFM0aonC79kyabj/LFBZAIz7wllN9OeAoVUq9u6BIAePIg8O3aykyy4U7U9vb TWPw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :dkim-signature; bh=+W0k3BgZajHHBqJN7IRx+U2nGAZwZKKC8BfjUAFcIMA=; b=BXAaJ/4QvnbiVAej3GLwEA4EjqKjuDi6P0sHpXLgRT3sko0CcSmb4fYknVAdLhRare WV64KtOrk7DFt9JMdPh/kGjB7h5PtM3XCb/r8aoYboH1jM3V/HMEOVUHmQ+WXy1nNnyY sd6IYjE3zxWF2kX4PBBj8KyJ858KABwwEgPr36cbwU8F5eUACCgov46TrTEjP+VEpz8Q R9nsWfc8MRpbPSk9HF15IvmvLZ/yDvz0QAwUL+jEe8zPGtTMAdOyxqgXqpjccYc3SwfW OUjZCzZ/3cj/eF9SuzciTZ2m21SSy4KvGGKpM4sywAvwll6vabV5IWlaq6KBN8Wpp538 Irkg== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@infradead.org header.s=bombadil.20170209 header.b=kG53gJfK; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id u10si3593401edy.364.2020.06.05.05.53.25; Fri, 05 Jun 2020 05:53:48 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=fail header.i=@infradead.org header.s=bombadil.20170209 header.b=kG53gJfK; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726904AbgFEMsc (ORCPT + 99 others); Fri, 5 Jun 2020 08:48:32 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:34492 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726857AbgFEMsb (ORCPT ); Fri, 5 Jun 2020 08:48:31 -0400 Received: from bombadil.infradead.org (bombadil.infradead.org [IPv6:2607:7c80:54:e::133]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2829EC08C5C2; Fri, 5 Jun 2020 05:48:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=bombadil.20170209; h=In-Reply-To:Content-Type:MIME-Version :References:Message-ID:Subject:Cc:To:From:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=+W0k3BgZajHHBqJN7IRx+U2nGAZwZKKC8BfjUAFcIMA=; b=kG53gJfK5kWHtokiyBxfJbTb7v v5xJMhAwcLxLaxyos1C08nBeenVsOEyKQ6HM1vCbCxSd8fQswpm59H0WHRf9V+tgqDYFM+dhemQU1 sNWSPO90/4ev/NU8qpRPUz7bisgIm22+Ofsnl6dEu2n7WHaYIfTBQblXb6tP9G5PY2FSonicbr+jP cWrOh2VOgwaxkr7Pr6hfiXMKzyVWJWRTGZBsmWLqE07afy0vc4lUT91n1ML+adFJAKzRjOKTvZSpI 6vtM0o/tzVO9JU5IZFClD1STvj85a9/5aB60JqrEurQmzZL202ZxCgPrTU5DExyKe8Hv/TNCx9N5H 3a+9Uv9w==; Received: from willy by bombadil.infradead.org with local (Exim 4.92.3 #3 (Red Hat Linux)) id 1jhBlq-00058s-76; Fri, 05 Jun 2020 12:48:26 +0000 Date: Fri, 5 Jun 2020 05:48:26 -0700 From: Matthew Wilcox To: Dave Chinner Cc: Christoph Hellwig , "Darrick J. Wong" , linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH] iomap: Handle I/O errors gracefully in page_mkwrite Message-ID: <20200605124826.GF19604@bombadil.infradead.org> References: <20200604202340.29170-1-willy@infradead.org> <20200604225726.GU2040@dread.disaster.area> <20200604230519.GW19604@bombadil.infradead.org> <20200604233053.GW2040@dread.disaster.area> <20200604235050.GX19604@bombadil.infradead.org> <20200605003159.GX2040@dread.disaster.area> <20200605022451.GZ19604@bombadil.infradead.org> <20200605030758.GB2040@dread.disaster.area> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20200605030758.GB2040@dread.disaster.area> Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, Jun 05, 2020 at 01:07:58PM +1000, Dave Chinner wrote: > On Thu, Jun 04, 2020 at 07:24:51PM -0700, Matthew Wilcox wrote: > > On Fri, Jun 05, 2020 at 10:31:59AM +1000, Dave Chinner wrote: > > > On Thu, Jun 04, 2020 at 04:50:50PM -0700, Matthew Wilcox wrote: > > > > > Sure, but that's not really what I was asking: why isn't this > > > > > !uptodate state caught before the page fault code calls > > > > > ->page_mkwrite? The page fault code has a reference to the page, > > > > > after all, and in a couple of paths it even has the page locked. > > > > > > > > If there's already a PTE present, then the page fault code doesn't > > > > check the uptodate bit. Here's the path I'm looking at: > > > > > > > > do_wp_page() > > > > -> vm_normal_page() > > > > -> wp_page_shared() > > > > -> do_page_mkwrite() > > > > > > > > I don't see anything in there that checked Uptodate. > > > > > > Yup, exactly the code I was looking at when I asked this question. > > > The kernel has invalidated the contents of a page, yet we still have > > > it mapped into userspace as containing valid contents, and we don't > > > check it at all when userspace generates a protection fault on the > > > page? > > > > Right. The iomap error path only clears PageUptodate. It doesn't go > > to the effort of unmapping the page from userspace, so userspace has a > > read-only view of a !Uptodate page. > > Hmmm - did you miss the ->discard_page() callout just before we call > ClearPageUptodate() on error in iomap_writepage_map()? That results > in XFS calling iomap_invalidatepage() on the page, which .... ... I don't think that's the interesting path. I mean, that's the submission path, and usually we discover errors in the completion path, not the submission path. > /me sighs as he realises that ->invalidatepage doesn't actually > invalidate page mappings but only clears the page dirty state and > releases filesystem references to the page. > > Yay. We leave -invalidated page cache pages- mapped into userspace, > and page faults on those pages don't catch access to invalidated > pages. More than that ... by clearing Uptodate, you're trying to prevent future reads to this page from succeeding without verifying the data is still on storage, but a task that had it mapped before can still load the data that was written but never made it to storage. So at some point it'll teleport backwards when another task has a successful read(). Funfunfun. > Geez, we really suck at this whole software thing, don't we? Certainly at handling errors ... > It's not clear to me that we can actually unmap those pages safely > in a race free manner from this code - can we actually do that from > the page writeback path? I don't see why it can't be done from the submission path. unmap_mapping_range() calls i_mmap_lock_write(), which is down_write(i_mmap_rwsem) in drag. There might be a lock ordering issue there, although lockdep should find it pretty quickly. The bigger problem is the completion path. We're in softirq context, so that will have to punt to a thread that can take mutexes.