Message-ID: <55CAFBAF.104@plexistor.com>
Date: Wed, 12 Aug 2015 10:54:23 +0300
From: Boaz Harrosh
To: "Kirill A. Shutemov"
CC: Jan Kara, Dave Chinner, "Kirill A. Shutemov", Andrew Morton,
    Matthew Wilcox, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
    linux-kernel@vger.kernel.org, Davidlohr Bueso, "Theodore Ts'o"
Subject: Re: [PATCH, RFC 2/2] dax: use range_lock instead of i_mmap_lock
In-Reply-To: <20150811202639.GA1408@node.dhcp.inet.fi>
References: <1439219664-88088-1-git-send-email-kirill.shutemov@linux.intel.com>
    <1439219664-88088-3-git-send-email-kirill.shutemov@linux.intel.com>
    <20150811081909.GD2650@quack.suse.cz>
    <20150811093708.GB906@dastard>
    <20150811135004.GC2659@quack.suse.cz>
    <55CA0728.7060001@plexistor.com>
    <20150811152850.GA2608@node.dhcp.inet.fi>
    <55CA2008.7070702@plexistor.com>
    <20150811202639.GA1408@node.dhcp.inet.fi>

On 08/11/2015 11:26 PM, Kirill A. Shutemov wrote:
> On Tue, Aug 11, 2015 at 07:17:12PM +0300, Boaz Harrosh wrote:
>> On 08/11/2015 06:28 PM, Kirill A. Shutemov wrote:
>>> We also used lock_page() to make sure we shoot out all pages as we don't
>>> exclude page faults during truncate.
>>> Consider this race:
>>>
>>>     fault                   truncate
>>>     get_block
>>>     check i_size
>>>                             update i_size
>>>                             unmap
>>>     setup pte
>>>
>>
>> Please consider this senario then:
>>
>>     fault                   truncate
>>     read_lock(inode)
>>
>>     get_block
>>     check i_size
>>
>>     read_unlock(inode)
>>
>>                             write_lock(inode)
>>
>>                             update i_size
>>                             * remove allocated blocks
>>                             unmap
>>
>>                             write_unlock(inode)
>>
>>     setup pte
>>
>> IS what you suppose to do in xfs
>
> Do you realize that you describe a race? :-P
>
> Exactly in this scenario pfn your pte point to is not belong to the file
> anymore. Have fun.
>

Sorry, yes, I wrote it wrong. I have now gone back and read the actual
code, and the "setup pte" step is also done under the read lock, inside
the fault handler, before the r_lock is released.

Duh, of course it is: it is the page-fault handler that calls
vm_insert_mixed(vma,,pfn), and in the case of concurrent faults the
second call to vm_insert_mixed() returns -EBUSY, which means all is well.

So the only thing left is the fault-to-fault zero-the-page race that
Matthew described. Dave and I think we can make that part of the FS's
get_block, where it is more natural.

Thanks,
Boaz
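
P.S. To make the corrected ordering concrete, here is a minimal user-space
sketch. It is not kernel code: every toy_* name is hypothetical, the lock is
a single-threaded model where a failed "trylock" stands for "a real caller
would sleep here", and toy_insert_pte() only imitates the -EBUSY behaviour of
vm_insert_mixed() mentioned above. The -EFAULT return is just a marker for
"beyond i_size", not what the kernel reports. The point it tries to show is
that "setup pte" happens before the read lock is dropped, so truncate cannot
slip in between the i_size check and the pte setup.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NPAGES 8

/* Hypothetical stand-in for an inode plus its mappings; not a kernel type. */
struct toy_inode {
	int readers;           /* faults currently holding the lock shared */
	bool writer;           /* true while truncate holds it exclusive */
	size_t i_size;         /* file size, in pages */
	bool pte[TOY_NPAGES];  /* true once a mapping is installed */
};

/* Lock attempts never block; false means a real caller would sleep. */
static bool toy_read_trylock(struct toy_inode *ino)
{
	if (ino->writer)
		return false;
	ino->readers++;
	return true;
}

static void toy_read_unlock(struct toy_inode *ino)
{
	assert(ino->readers > 0);
	ino->readers--;
}

static bool toy_write_trylock(struct toy_inode *ino)
{
	if (ino->writer || ino->readers > 0)
		return false;
	ino->writer = true;
	return true;
}

static void toy_write_unlock(struct toy_inode *ino)
{
	assert(ino->writer);
	ino->writer = false;
}

/* Imitates vm_insert_mixed(): refuses to overwrite an existing pte. */
static int toy_insert_pte(struct toy_inode *ino, size_t pgoff)
{
	if (ino->pte[pgoff])
		return -EBUSY;
	ino->pte[pgoff] = true;
	return 0;
}

/* Fault, first half: read lock, then "get_block" and "check i_size". */
static int toy_fault_begin(struct toy_inode *ino, size_t pgoff)
{
	if (!toy_read_trylock(ino))
		return -EAGAIN;
	if (pgoff >= ino->i_size || pgoff >= TOY_NPAGES) {
		toy_read_unlock(ino);
		return -EFAULT;
	}
	return 0;
}

/* Fault, second half: "setup pte" still under the read lock, then unlock. */
static int toy_fault_finish(struct toy_inode *ino, size_t pgoff)
{
	int ret = toy_insert_pte(ino, pgoff);

	toy_read_unlock(ino);
	/* -EBUSY: another fault already installed the mapping, all is well. */
	return ret == -EBUSY ? 0 : ret;
}

/* Truncate: write lock, "update i_size", "unmap", unlock. */
static int toy_truncate(struct toy_inode *ino, size_t new_size)
{
	if (!toy_write_trylock(ino))
		return -EAGAIN;
	ino->i_size = new_size;
	for (size_t i = new_size; i < TOY_NPAGES; i++)
		ino->pte[i] = false;
	toy_write_unlock(ino);
	return 0;
}
```

Calling toy_fault_begin(), then toy_truncate(), then toy_fault_finish() on
the same inode shows the truncate attempt being refused until the fault has
finished, which is the exclusion the read/write lock is meant to give.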