From: Matthew Wilcox
Subject: Re: [PATCH v3 0/3] Add XIP support to ext4
Date: Tue, 17 Dec 2013 19:31:43 -0700
Message-ID: <20131218023143.GA24491@parisc-linux.org>
References: <20131217223050.GB20579@dastard>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Cc: Matthew Wilcox, linux-ext4@vger.kernel.org, linux-fsdevel@vger.kernel.org
To: Dave Chinner
Return-path:
Content-Disposition: inline
In-Reply-To: <20131217223050.GB20579@dastard>
Sender: linux-fsdevel-owner@vger.kernel.org
List-Id: linux-ext4.vger.kernel.org

On Wed, Dec 18, 2013 at 09:30:50AM +1100, Dave Chinner wrote:
> On Tue, Dec 17, 2013 at 02:18:25PM -0500, Matthew Wilcox wrote:
> > For v3, we've addressed the problem with unwritten extents that Dave
> > Chinner pointed out.
>
> No, you haven't addressed the problem. There is nothing in this
> patch set that converts an unwritten extent after it is written to.
> Hence on every subsequent read will return zeros because the block
> is still marked as unwritten.

I don't understand. Here's the path as I understand it:

xip_file_write -> __xip_file_write ->
	ext4_get_xip_mem(create=0), returns -ENODATA.

So we call ext4_get_xip_mem again, this time with create=1 which causes
ext4_get_block() to allocate blocks.

> Further, write page faults won't do unwritten extent conversion or
> block allocation, either, because:
>
> You wire .mmap up to xip_file_mmap, which wires up .page_mkwrite
> like this:
>
> static const struct vm_operations_struct xip_file_vm_ops = {
>         .fault = xip_file_fault,
>         .page_mkwrite = filemap_page_mkwrite,
>         .remap_pages = generic_file_remap_pages,
> };
>
> and filemap_page_mkwrite() does none of the special stuff that
> ext4_page_mkwrite() does for handling unwritten extents, allocating
> blocks for faults over holes in files, etc.

Again, I don't think that's a problem. The first time we take a page
fault, we call xip_file_fault() which installs a PFN map if there's no
hole. If there is a hole, and the mapping is writable, it calls
get_xip_mem with create=1 again, causing the extent to be allocated,
so we never get an unwritten extent mapped to userspace.

> We actually have an xfstests test that test whether mmap and
> unwritten extents work correctly - xfs/166 - but there's nothing
> XFS specific about it anymore. it could easily be made generic
> simply by replacing xfs_bmap with the xfs_io fiemap command....

Thanks. I'll put that on the increasingly-long todo list ...

> Also, you haven't address the read vs truncate races I pointed out.
> That is, buffered read currently serialises against truncate via a
> combination of inode size checks and page locks. i.e. after each
> page is locked, it is checked to see if it is beyond EOF before
> the read proceeds into that page. the XIP path does not have any
> page locks, nor read IO locks, and so is not in any way serialised
> against a truncate changing the size of the inode while the read is
> in progress.

Umm ... what do you think patch 1/3 does? If you think it doesn't fix
the race, I need you to explain why.

--
Matthew Wilcox				Intel Open Source Technology Centre
"Bill, look, we understand that you're interested in selling us this
operating system, but compare it to ours. We can't possibly take such
a retrograde step."
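
The lookup order described in the message (try with create=0, then retry with create=1 when a hole is reported) can be illustrated with a small user-space model. This is only a sketch under stated assumptions: every `model_*` name, the `struct model_file` type and the fixed block count are hypothetical and do not come from the patch set, and nothing here is kernel code. In particular, it models only allocation of holes, not unwritten-extent conversion or the PFN mapping itself.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_NBLOCKS 8 /* hypothetical file size, in blocks */

/* Hypothetical stand-in for a file: which logical blocks are allocated. */
struct model_file {
	bool allocated[MODEL_NBLOCKS];
};

/*
 * Hypothetical stand-in for the get_xip_mem lookup. With create == 0 a
 * hole is reported as -ENODATA; with create != 0 the block is allocated
 * before returning success.
 */
static int model_get_xip_mem(struct model_file *f, size_t blk, int create)
{
	assert(f != NULL);
	if (blk >= MODEL_NBLOCKS)
		return -EINVAL;
	if (!f->allocated[blk]) {
		if (!create)
			return -ENODATA;
		f->allocated[blk] = true;
	}
	return 0;
}

/* Write path: look up without allocating, retry with create=1 on a hole. */
static int model_xip_write(struct model_file *f, size_t blk)
{
	int err = model_get_xip_mem(f, blk, 0);

	if (err == -ENODATA)
		err = model_get_xip_mem(f, blk, 1);
	return err;
}

/*
 * Fault path: a hole is filled only when the mapping is writable. For a
 * read-only mapping this model simply reports the hole to the caller.
 */
static int model_xip_fault(struct model_file *f, size_t blk, bool writable)
{
	int err = model_get_xip_mem(f, blk, 0);

	if (err == -ENODATA && writable)
		err = model_get_xip_mem(f, blk, 1);
	return err;
}
```

In this model, a write or a writable fault over a hole leaves the block allocated, so a later create=0 lookup on the same block succeeds instead of returning -ENODATA.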