Date: Wed, 8 Jul 2020 14:54:37 +0100
From: Matthew Wilcox
To: Dave Chinner
Cc: Christoph Hellwig, Goldwyn Rodrigues, linux-fsdevel@vger.kernel.org,
 linux-btrfs@vger.kernel.org, fdmanana@gmail.com, dsterba@suse.cz,
 darrick.wong@oracle.com, cluster-devel@redhat.com,
 linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org
Subject: Re: always fall back to buffered I/O after invalidation failures,
 was: Re: [PATCH 2/6] iomap: IOMAP_DIO_RWF_NO_STALE_PAGECACHE return if
 page invalidation fails
Message-ID: <20200708135437.GP25523@casper.infradead.org>
References: <20200629192353.20841-1-rgoldwyn@suse.de>
 <20200629192353.20841-3-rgoldwyn@suse.de>
 <20200701075310.GB29884@lst.de>
 <20200707124346.xnr5gtcysuzehejq@fiona>
 <20200707125705.GK25523@casper.infradead.org>
 <20200707130030.GA13870@lst.de>
 <20200708065127.GM2005@dread.disaster.area>
In-Reply-To: <20200708065127.GM2005@dread.disaster.area>

On Wed, Jul 08, 2020 at 04:51:27PM +1000, Dave Chinner wrote:
> On Tue, Jul 07, 2020 at 03:00:30PM +0200, Christoph Hellwig wrote:
> > On Tue, Jul 07, 2020 at 01:57:05PM +0100, Matthew Wilcox wrote:
> > > Indeed, I'm in favour of not invalidating the page cache at all
> > > for direct I/O.  For reads, I think the page cache should be used
> > > to satisfy any portion of the read which is currently cached.
> > > For writes, I think we should write into the page cache pages
> > > which currently exist, and then force those pages to be written
> > > back, but left in cache.
> >
> > Something like that, yes.
>
> So are we really willing to take the performance regression that
> occurs from copying out of the page cache consuming lots more CPU
> than an actual direct IO read?  Or that direct IO writes suddenly
> serialise because there are page cache pages and now we have to do
> buffered IO?
>
> Direct IO should be a deterministic, zero-copy IO path to/from
> storage.  Using the CPU to copy data during direct IO is the
> complete opposite of the intended functionality, not to mention the
> behaviour that many applications have been carefully designed and
> tuned for.

Direct I/O isn't deterministic, though.  If the file isn't shared,
then it works great, but as soon as you get mixed buffered and direct
I/O, everything is already terrible.  Direct I/Os perform pagecache
lookups already, but instead of using the data that we found in the
cache, we (if it's dirty) write it back, wait for the write to
complete, remove the page from the pagecache, and then perform
another I/O to fetch the data that we just wrote out!  And then the
app that's using buffered I/O has to read it back in again.

Nobody's proposing changing Direct I/O to work exclusively through
the pagecache.  The proposal is to behave less weirdly when there's
already data in the pagecache.
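To make that concrete, the collision between a direct I/O read and a
dirty pagecache page today goes roughly like this (a hand-wavy sketch,
not the actual iomap code; submit_dio is a made-up stand-in for the
real bio submission path):

/* Rough sketch of today's behaviour; not real kernel code. */
ssize_t dio_read(struct file *file, char __user *buf, size_t len,
		 loff_t pos)
{
	struct address_space *mapping = file->f_mapping;

	/* Write back any dirty cached data overlapping the read
	 * and wait for it to hit storage ... */
	filemap_write_and_wait_range(mapping, pos, pos + len - 1);

	/* ... throw the now-clean pages away ... */
	invalidate_inode_pages2_range(mapping, pos >> PAGE_SHIFT,
				      (pos + len - 1) >> PAGE_SHIFT);

	/* ... then go back to storage for data we already had
	 * in memory a moment ago. */
	return submit_dio(file, buf, len, pos);
}

Using the copy we already found in the cache would eliminate both the
writeback wait and the extra trip to storage.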
I have had an objection raised off-list.  In a scenario with a block
device shared between two systems, and an application which does
direct I/O, everything is normally fine.  If one of the systems uses
tar to back up the contents of the block device, then the application
on that system will no longer see the writes from the other system,
because there's nothing to invalidate the pagecache on the first
system.

Unfortunately, this is in direct conflict with the performance
problem caused by some little arsewipe deciding to do:

$ while true; do dd if=/lib/x86_64-linux-gnu/libc-2.30.so iflag=direct of=/dev/null; done

... doesn't hurt me, because my root filesystem is on ext4, which
doesn't purge the cache.  But anything using iomap gets all the pages
for libc kicked out of the cache, and that's a lot of fun.
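If you want to watch that happen, something like the mincore(2) loop
below will show libc's residency collapse on an iomap filesystem while
the dd loop runs (untested, error handling trimmed for brevity):

/* Print how many pages of a file are resident in the pagecache. */
#include <fcntl.h>
#include <stdio.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char **argv)
{
	struct stat st;
	int fd = open(argv[1], O_RDONLY);
	long psize = sysconf(_SC_PAGESIZE);

	fstat(fd, &st);
	size_t pages = (st.st_size + psize - 1) / psize;
	unsigned char vec[pages];
	void *map = mmap(NULL, st.st_size, PROT_READ, MAP_SHARED, fd, 0);

	/* The low bit of each vec entry is set if that page is
	 * currently resident in the pagecache. */
	mincore(map, st.st_size, vec);

	size_t resident = 0;
	for (size_t i = 0; i < pages; i++)
		resident += vec[i] & 1;
	printf("%zu/%zu pages resident\n", resident, pages);
	return 0;
}

Run it against /lib/x86_64-linux-gnu/libc-2.30.so before and during
the dd loop and compare the counts.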