Received: by 2002:a25:d7c1:0:0:0:0:0 with SMTP id o184csp4903573ybg; Mon, 21 Oct 2019 16:46:52 -0700 (PDT) X-Google-Smtp-Source: APXvYqwalEVR4Me+qSsXAbovDRjjVC4QH5LxHkLNqA5APkvDvs0BrE46gnhu0dmJmgsQHwf2XsBL X-Received: by 2002:a50:8f65:: with SMTP id 92mr28906977edy.9.1571701611905; Mon, 21 Oct 2019 16:46:51 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1571701611; cv=none; d=google.com; s=arc-20160816; b=ZBbuz3j6awoJ6Uoatb8GGAlohlNoBfhqwlom3FjZZ13WC3DmN5RDvFNcR5B6usSqve KS6ptejHj0fgwMyCiGZy0/nWor4UVpu89V8eP3fsEdTq21XgjXFRQxUOjbBCTjPXICzd Ti6rc4VoA/wawUKMSdhNusImopm9mwXraUqQuYRXSzKjh0LgLZUVZS4iPldEj1u0ptN7 Ab8mvwUzBAm2LkzYPW8K2VJBvgM7cmN3bGW71oUZ8pIbuO5ezNtC/nWzxanJrkjIh+DD j2kyb/q4hc6VwlfuYlwUqvfYqBOi41XgGA2ZR3sMSqzCnbEwWdvtUL0cR7u7tpU8JP/+ 3syQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date; bh=Y8uPowQ9a0qTgQMVAItTUiJJOGMc5KMxniEg2JddblU=; b=ojSmjPRIf5qRVK1MAehG1NWQY5rnriP+/zd2meCDx8/T2bE/jn+AcICLk5VF4PDOcx IqGoekxJB21XTiROFIxQvV8UA8ZoQSAjwXKJ773BVTxDCUZCogvu/ebF5701wWfwfC/9 qnnLsjJlqEUNdCOJoEOZ4NJsd9Hvf/HezXubnDS9nVsUftbx1O6XsEOSWt154YuJnRn5 0SokyPjzu6LmJfa2RTb4lN8Qa+l90b9+Z08LwE4DspY1RVdYJI0teVLpRFBnN9tgRrhV Wu07zH4ueps8xIael3nOnfHbaGwmNxvpkhfTPlT5NsFMSR9VFYnMGDTfuyUHub95JUuq tMwg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-ext4-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id n2si10171419edt.408.2019.10.21.16.46.20; Mon, 21 Oct 2019 16:46:51 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-ext4-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-ext4-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-ext4-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728819AbfJUXqP (ORCPT + 99 others); Mon, 21 Oct 2019 19:46:15 -0400 Received: from mail105.syd.optusnet.com.au ([211.29.132.249]:45110 "EHLO mail105.syd.optusnet.com.au" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727264AbfJUXqP (ORCPT ); Mon, 21 Oct 2019 19:46:15 -0400 Received: from dread.disaster.area (pa49-180-40-48.pa.nsw.optusnet.com.au [49.180.40.48]) by mail105.syd.optusnet.com.au (Postfix) with ESMTPS id E32A73629FF; Tue, 22 Oct 2019 10:46:04 +1100 (AEDT) Received: from dave by dread.disaster.area with local (Exim 4.92.3) (envelope-from ) id 1iMhNE-0007QX-Aa; Tue, 22 Oct 2019 10:46:04 +1100 Date: Tue, 22 Oct 2019 10:46:04 +1100 From: Dave Chinner To: Ira Weiny Cc: linux-kernel@vger.kernel.org, Alexander Viro , "Darrick J. Wong" , Dan Williams , Christoph Hellwig , "Theodore Y. Ts'o" , Jan Kara , linux-ext4@vger.kernel.org, linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org Subject: Re: [PATCH 5/5] fs/xfs: Allow toggle of physical DAX flag Message-ID: <20191021234604.GB2681@dread.disaster.area> References: <20191020155935.12297-1-ira.weiny@intel.com> <20191020155935.12297-6-ira.weiny@intel.com> <20191021004536.GD8015@dread.disaster.area> <20191021224931.GA25526@iweiny-DESK2.sc.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20191021224931.GA25526@iweiny-DESK2.sc.intel.com> User-Agent: Mutt/1.10.1 (2018-07-13) X-Optus-CM-Score: 0 X-Optus-CM-Analysis: v=2.2 cv=D+Q3ErZj c=1 sm=1 tr=0 a=y881pOMu+B+mZdf5UrsJdA==:117 a=y881pOMu+B+mZdf5UrsJdA==:17 a=jpOVt7BSZ2e4Z31A5e1TngXxSK0=:19 a=kj9zAlcOel0A:10 a=XobE76Q3jBoA:10 a=QyXUC8HyAAAA:8 a=7-415B0cAAAA:8 a=YPogiMHmU-6fRLEyw_MA:9 a=z-c3JQe-jSBONlfH:21 a=Z1gFN9eyTm15yBDe:21 a=CjuIK1q_8ugA:10 a=biEYGPWJfzWAr4FL6Ov7:22 Sender: linux-ext4-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-ext4@vger.kernel.org On Mon, Oct 21, 2019 at 03:49:31PM -0700, Ira Weiny wrote: > On Mon, Oct 21, 2019 at 11:45:36AM +1100, Dave Chinner wrote: > > On Sun, Oct 20, 2019 at 08:59:35AM -0700, ira.weiny@intel.com wrote: > > > @@ -1232,12 +1233,10 @@ xfs_diflags_to_linux( > > > inode->i_flags |= S_NOATIME; > > > else > > > inode->i_flags &= ~S_NOATIME; > > > -#if 0 /* disabled until the flag switching races are sorted out */ > > > if (xflags & FS_XFLAG_DAX) > > > inode->i_flags |= S_DAX; > > > else > > > inode->i_flags &= ~S_DAX; > > > -#endif > > > > This code has bit-rotted. See xfs_setup_iops(), where we now have a > > different inode->i_mapping->a_ops for DAX inodes. > > :-( > > > > > That, fundamentally, is the issue here - it's not setting/clearing > > the DAX flag that is the issue, it's doing a swap of the > > mapping->a_ops while there may be other code using that ops > > structure. > > > > IOWs, if there is any code anywhere in the kernel that > > calls an address space op without holding one of the three locks we > > hold here (i_rwsem, MMAPLOCK, ILOCK) then it can race with the swap > > of the address space operations. > > > > By limiting the address space swap to file sizes of zero, we rule > > out the page fault path (mmap of a zero length file segv's with an > > access beyond EOF on the first read/write page fault, right?). > > Yes I checked that and thought we were safe here... > > > However, other aops callers that might run unlocked and do the wrong > > thing if the aops pointer is swapped between check of the aop method > > existing and actually calling it even if the file size is zero? > > > > A quick look shows that FIBMAP (ioctl_fibmap())) looks susceptible > > to such a race condition with the current definitions of the XFS DAX > > aops. I'm guessing there will be others, but I haven't looked > > further than this... > > I'll check for others and think on what to do about this. ext4 will have the > same problem I think. :-( > > I don't suppose using a single a_ops for both DAX and non-DAX is palatable? IMO, no. It means we have to check IS_DAX() in every aops, and replicate a bunch of callouts to generic code. i.e this sort of thing: if (aops->method) return aops->method(...) /* do something else */ results in us having to replicate that logic as something like: if (!IS_DAX) return filesystem_aops_method() /* do something else */ Indeed, the calling code may well do the wrong thing if we have methods defined just to add IS_DAX() checks to avoid using that functionality because the caller now thinks that functionality is supported when in fact it isn't. So it seems to me like an even bigger can of worms to try to use a single aops structure for vastly different functionality.... Cheers, Dave. -- Dave Chinner david@fromorbit.com