Date: Tue, 14 Jan 2020 16:57:28 -0800
From: Ira Weiny
To: "Darrick J. Wong"
Cc: linux-kernel@vger.kernel.org, Alexander Viro, Dan Williams,
	Dave Chinner, Christoph Hellwig, "Theodore Y. Ts'o", Jan Kara,
	linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org
Subject: Re: [RFC PATCH V2 08/12] fs/xfs: Add lock/unlock mode to xfs
Message-ID: <20200115005727.GB23311@iweiny-DESK2.sc.intel.com>
References: <20200110192942.25021-1-ira.weiny@intel.com>
	<20200110192942.25021-9-ira.weiny@intel.com>
	<20200113221957.GN8247@magnolia>
	<20200114003521.GB29860@iweiny-DESK2.sc.intel.com>
In-Reply-To: <20200114003521.GB29860@iweiny-DESK2.sc.intel.com>

On Mon, Jan 13, 2020 at 04:35:21PM -0800, 'Ira Weiny' wrote:
> On Mon, Jan 13, 2020 at 02:19:57PM -0800, Darrick J. Wong wrote:
> > On Fri, Jan 10, 2020 at 11:29:38AM -0800, ira.weiny@intel.com wrote:
> > > From: Ira Weiny
>
> [snip]
>
> > >
> > > diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
> > > index 401da197f012..e8fd95b75e5b 100644
> > > --- a/fs/xfs/xfs_inode.c
> > > +++ b/fs/xfs/xfs_inode.c
> > > @@ -142,12 +142,12 @@ xfs_ilock_attr_map_shared(
> > >   *
> > >   * Basic locking order:
> > >   *
> > > - * i_rwsem -> i_mmap_lock -> page_lock -> i_ilock
> > > + * i_rwsem -> i_dax_sem -> i_mmap_lock -> page_lock -> i_ilock
> >
> > Mmmmmm, more locks.  Can we skip the extra lock if CONFIG_FSDAX=n or if
> > the filesystem devices don't support DAX at all?
>
> I'll look into it.
>
> >
> > Also, I don't think we're actually following the i_rwsem -> i_daxsem
> > order in fallocate, and possibly elsewhere too?
>
> I'll have to verify.  It took a lot of iterations to get the order working so
> I'm not going to claim perfection.

Yes this was inconsistent.  The code was right WRT i_rwsem.

mmap_sem may have issues:

What about this?

diff --git a/fs/xfs/xfs_inode.c b/fs/xfs/xfs_inode.c
index c5d11b70d067..8808782a085e 100644
--- a/fs/xfs/xfs_inode.c
+++ b/fs/xfs/xfs_inode.c
@@ -142,12 +142,12 @@ xfs_ilock_attr_map_shared(
  *
  * Basic locking order:
  *
- * i_rwsem -> i_dax_sem -> i_mmap_lock -> page_lock -> i_ilock
+ * i_dax_sem -> i_rwsem -> i_mmap_lock -> page_lock -> i_ilock
  *
  * mmap_sem locking order:
  *
  * i_rwsem -> page lock -> mmap_sem
- * mmap_sem -> i_dax_sem -> i_mmap_lock -> page_lock
+ * i_dax_sem -> mmap_sem -> i_mmap_lock -> page_lock
  *
  * The difference in mmap_sem locking order mean that we cannot hold the
  * i_mmap_lock over syscall based read(2)/write(2) based IO. These IO paths can
diff --git a/mm/mmap.c b/mm/mmap.c
index e6b68924b7ca..b500aef30b27 100644
--- a/mm/mmap.c
+++ b/mm/mmap.c
@@ -1547,18 +1547,12 @@ unsigned long do_mmap(struct file *file, unsigned long addr,
 			vm_flags |= VM_NORESERVE;
 	}
 
-	if (file)
-		lock_inode_mode(file_inode(file));
-
 	addr = mmap_region(file, addr, len, vm_flags, pgoff, uf);
 	if (!IS_ERR_VALUE(addr) &&
 	    ((vm_flags & VM_LOCKED) ||
 	     (flags & (MAP_POPULATE | MAP_NONBLOCK)) == MAP_POPULATE))
 		*populate = len;
 
-	if (file)
-		unlock_inode_mode(file_inode(file));
-
 	return addr;
 }
 
diff --git a/mm/util.c b/mm/util.c
index 988d11e6c17c..1cfead8cd1ce 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -501,11 +501,18 @@ unsigned long vm_mmap_pgoff(struct file *file, unsigned long addr,
 
 	ret = security_mmap_file(file, prot, flag);
 	if (!ret) {
-		if (down_write_killable(&mm->mmap_sem))
+		if (file)
+			lock_inode_mode(file_inode(file));
+		if (down_write_killable(&mm->mmap_sem)) {
+			if (file)
+				unlock_inode_mode(file_inode(file));
 			return -EINTR;
+		}
 		ret = do_mmap_pgoff(file, addr, len, prot, flag, pgoff,
 				    &populate, &uf);
 		up_write(&mm->mmap_sem);
+		if (file)
+			unlock_inode_mode(file_inode(file));
 		userfaultfd_unmap_complete(mm, &uf);
 		if (populate)
 			mm_populate(ret, populate);
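
For anyone following the ordering change, here is a rough user-space model
of the i_dax_sem -> mmap_sem order that the mm/util.c hunk above
establishes. It is only a sketch: struct model_lock, struct trace, and the
model_*() helpers are hypothetical stand-ins invented for illustration, not
kernel APIs, and the "interrupted" flag merely imitates
down_write_killable() failing. It shows the outer lock being taken first,
dropped again on the -EINTR path, and released last on success.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical event IDs recorded in acquisition order. */
enum model_event { EV_DAX_SEM, EV_MMAP_SEM, EV_MAP };

#define TRACE_MAX 8

struct trace {
	enum model_event events[TRACE_MAX];
	size_t count;
};

/* Hypothetical stand-in for a kernel lock; tracks only "held or not". */
struct model_lock {
	enum model_event id;
	bool held;
};

static void trace_add(struct trace *t, enum model_event ev)
{
	if (t->count < TRACE_MAX)
		t->events[t->count++] = ev;
}

static void model_acquire(struct model_lock *l, struct trace *t)
{
	assert(!l->held);	/* no recursive acquisition in this model */
	l->held = true;
	trace_add(t, l->id);
}

static void model_release(struct model_lock *l)
{
	assert(l->held);
	l->held = false;
}

/* Imitates down_write_killable(): returns -EINTR without taking the lock. */
static int model_acquire_killable(struct model_lock *l, struct trace *t,
				  bool interrupted)
{
	if (interrupted)
		return -EINTR;
	model_acquire(l, t);
	return 0;
}

/* Mirrors the shape of the patched vm_mmap_pgoff(): outer lock first. */
int model_mmap(struct model_lock *dax_sem, struct model_lock *mmap_sem,
	       struct trace *t, bool interrupted)
{
	int ret;

	model_acquire(dax_sem, t);
	ret = model_acquire_killable(mmap_sem, t, interrupted);
	if (ret) {
		/* Error path: drop the outer lock before returning. */
		model_release(dax_sem);
		return ret;
	}

	trace_add(t, EV_MAP);	/* the mapping work would happen here */

	/* Release in reverse order of acquisition. */
	model_release(mmap_sem);
	model_release(dax_sem);
	return 0;
}
```

Calling model_mmap() with interrupted == false should record i_dax_sem,
mmap_sem, then the map step, and leave both locks released; with
interrupted == true it returns -EINTR with only i_dax_sem ever recorded and
nothing left held.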