Date: Tue, 26 Sep 2017 09:29:45 +1000
From: Dave Chinner
To: Ross Zwisler
Cc: Andrew Morton, linux-kernel@vger.kernel.org, "Darrick J. Wong", "J. Bruce Fields", Christoph Hellwig, Dan Williams, Jan Kara, Jeff Layton, linux-fsdevel@vger.kernel.org, linux-mm@kvack.org, linux-nvdimm@lists.01.org, linux-xfs@vger.kernel.org
Subject: Re: [PATCH 4/7] xfs: protect S_DAX transitions in XFS write path
Message-ID: <20170925232945.GL10955@dastard>
In-Reply-To: <20170925231404.32723-5-ross.zwisler@linux.intel.com>

On Mon, Sep 25, 2017 at 05:14:01PM -0600, Ross Zwisler wrote:
> In the current XFS write I/O path we check IS_DAX() in
> xfs_file_write_iter() to decide whether to do DAX I/O, direct I/O or
> buffered I/O.  This check is done without holding the XFS_IOLOCK, though,
> which means that if we allow S_DAX to be manipulated via the inode flag we
> can run into this race:
>
> CPU 0                                 CPU 1
> -----                                 -----
> xfs_file_write_iter()
>   IS_DAX() << returns false
>                                       xfs_ioctl_setattr()
>                                         xfs_ioctl_setattr_dax_invalidate()
>                                           xfs_ilock(XFS_MMAPLOCK|XFS_IOLOCK)
>                                         sets S_DAX
>                                         releases XFS_MMAPLOCK and XFS_IOLOCK
>   xfs_file_buffered_aio_write()
>   does buffered I/O to DAX inode, death
>
> Fix this by ensuring that we only check S_DAX when we hold the XFS_IOLOCK
> in the write path.

NACK. This breaks concurrent direct IO write semantics. We must not
take XFS_IOLOCK_EXCL on direct IO writes unless it is absolutely
necessary - there are lots of applications out there that rely on
these semantics for performance.

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com