From: Miklos Szeredi
Subject: Re: [RFC] [PATCH] Fix race when checking i_size on direct i/o read
Date: Fri, 17 Jan 2014 11:20:49 +0100
Message-ID:
References: <1387273422.2729.13.camel@menhir> <20131217111626.GA7544@gmail.com> <1387282664.2729.42.camel@menhir> <20131217164159.GA7331@gmail.com> <1387456073.2763.20.camel@menhir> <20131219224400.GC31386@dastard> <1387531724.2739.13.camel@menhir> <20131223030006.GD3220@dastard> <1389712933.2790.31.camel@menhir> <20140114191901.GC27863@quack.suse.cz> <20140115071933.GA3449@gmail.com> <1389886553.2779.32.camel@menhir>
Mime-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Cc: Zheng Liu, Jan Kara, Dave Chinner, Linux-Fsdevel, Christoph Hellwig, Dmitry Monakhov, Alexander Viro, Zheng Liu, Lukas Czerner, linux-ext4@vger.kernel.org, Chris Mason, Josef Bacik
To: Steven Whitehouse
Return-path:
In-Reply-To: <1389886553.2779.32.camel@menhir>
Sender: linux-fsdevel-owner@vger.kernel.org
List-Id: linux-ext4.vger.kernel.org

On Thu, Jan 16, 2014 at 4:35 PM, Steven Whitehouse wrote:
>
> Following on from the "Re: [PATCH v3] vfs: fix a bug when we do some dio
> reads with append dio writes" thread on linux-fsdevel, this patch is my
> current version of the fix proposed as option (b) in that thread.
>
> Removing the i_size test from the direct i/o read path at VFS level
> means that filesystems now have to deal with requests which are beyond
> i_size themselves. These I've divided into three sets:
>
> a) Those with "no op" ->direct_IO (9p, cifs, ceph)
> These are obviously not going to be an issue
>
> b) Those with "home brew" ->direct_IO (nfs, fuse)
> I've been told that NFS should not have any problem with the larger
> i_size, however I've added an extra test to FUSE to duplicate the
> original behaviour just to be on the safe side. Someone who knows fuse
> better may be able to confirm whether this is actually required or not.
>
> c) Those using __blockdev_direct_IO()
> These call through to ->get_block() which should deal with the EOF
> condition correctly. I've verified that with GFS2 and I believe that
> Zheng has verified it for ext4. I've also run the test on XFS and it
> passes both before and after this change.
>
> The part of the patch in filemap.c looks a lot larger than it really is
> - there are only two lines of real change. The rest is just indentation
> of the contained code.
>
> There remains a test of i_size though, which was added for btrfs. It
> doesn't cause the other filesystems a problem as the test is performed
> after ->direct_IO has been called. It is possible that there is a race
> that does matter to btrfs, however this patch doesn't change that, so
> it's still an overall improvement.
>
> So please have a look at this and let me know what you think. I guess
> that when time comes to submit it, it should probably be via the vfs
> tree.
>
> Signed-off-by: Steven Whitehouse
> Reported-by: Zheng Liu
> Cc: Jan Kara
> Cc: Dave Chinner
> Cc: Miklos Szeredi
> Cc: Chris Mason
> Cc: Josef Bacik
> Cc: Christoph Hellwig
> Cc: Alexander Viro
>
> diff --git a/fs/fuse/file.c b/fs/fuse/file.c
> index 7e70506..89fdfd1 100644
> --- a/fs/fuse/file.c
> +++ b/fs/fuse/file.c
> @@ -2710,6 +2710,9 @@ fuse_direct_IO(int rw, struct kiocb *iocb, const struct iovec *iov,
>  	inode = file->f_mapping->host;
>  	i_size = i_size_read(inode);
>
> +	if ((rw == READ) && (offset > i_size))
> +		return 0;
> +

Hmm, OK. It's not strictly needed, but a valid optimization.

So ACK.

Thanks,
Miklos
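
The effect of the three added lines in the fuse hunk can be modelled in
userspace C. This is only a sketch under assumptions: struct model_inode
and model_direct_read() are hypothetical names made up for illustration
and are not part of fs/fuse/file.c, the in-memory buffer stands in for
the backing store, and the clamp to EOF is an assumption about what the
lower layer does rather than something taken from the patch.

```c
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical userspace model of an inode; not a kernel structure. */
struct model_inode {
	const char *data;	/* bytes standing in for the backing store */
	off_t i_size;		/* current file size */
};

/*
 * Hypothetical model of a direct I/O read. The first test mirrors the
 * check in the quoted hunk: a read that starts beyond i_size returns 0
 * (EOF) without touching the backing store.
 */
static ssize_t model_direct_read(const struct model_inode *inode,
				 char *buf, size_t count, off_t offset)
{
	/* Sample the size once, as the hunk does via i_size_read(). */
	off_t i_size = inode->i_size;

	if (offset > i_size)
		return 0;

	/* Assumption: the lower layer never transfers past EOF. */
	size_t avail = (size_t)(i_size - offset);
	if (count > avail)
		count = avail;

	memcpy(buf, inode->data + offset, count);
	return (ssize_t)count;
}
```

With i_size = 5, a read at offset 6 returns 0 at the early check. A read
at offset 5 passes that check (the hunk tests offset > i_size, not >=)
and returns 0 from the clamp instead, which fits the description of the
check as an optimization rather than a requirement.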