Date: Tue, 19 Apr 2016 14:23:47 -0400
From: Matthew Wilcox
To: Jan Kara
Cc: Andrew Morton, Toshi Kani, dan.j.williams@intel.com,
	viro@zeniv.linux.org.uk, ross.zwisler@linux.intel.com,
	kirill.shutemov@linux.intel.com, david@fromorbit.com, tytso@mit.edu,
	adilger.kernel@dilger.ca, linux-nvdimm@ml01.01.org,
	linux-fsdevel@vger.kernel.org, linux-mm@kvack.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v3 0/2] Align mmap address for DAX pmd mappings
Message-ID: <20160419182347.GA29068@linux.intel.com>
References: <1460652511-19636-1-git-send-email-toshi.kani@hpe.com>
	<20160415220531.c7b55adb5b26eb749fae3186@linux-foundation.org>
	<20160418202610.GA17889@quack2.suse.cz>
In-Reply-To: <20160418202610.GA17889@quack2.suse.cz>

On Mon, Apr 18, 2016 at 10:26:10PM +0200, Jan Kara wrote:
> On Fri 15-04-16 22:05:31, Andrew Morton wrote:
> > On Thu, 14 Apr 2016 10:48:29 -0600 Toshi Kani wrote:
> >
> > > When CONFIG_FS_DAX_PMD is set, DAX supports mmap() using pmd page
> > > size.  This feature relies on both mmap virtual address and FS
> > > block (i.e. physical address) to be aligned by the pmd page size.
> > > Users can use mkfs options to specify FS to align block allocations.
> > > However, aligning mmap address requires code changes to existing
> > > applications for providing a pmd-aligned address to mmap().
> > >
> > > For instance, fio with "ioengine=mmap" performs I/Os with mmap() [1].
> > > It calls mmap() with a NULL address, which needs to be changed to
> > > provide a pmd-aligned address for testing with DAX pmd mappings.
> > > Changing all applications that call mmap() with NULL is undesirable.
> > >
> > > This patch-set extends filesystems to align an mmap address for
> > > a DAX file so that unmodified applications can use DAX pmd mappings.
> >
> > Matthew sounded unconvinced about the need for this patchset, but I
> > must say that
> >
> > : The point is that we do not need to modify existing applications for using
> > : DAX PMD mappings.
> > :
> > : For instance, fio with "ioengine=mmap" performs I/Os with mmap().
> > : https://github.com/caius/fio/blob/master/engines/mmap.c
> > :
> > : With this change, unmodified fio can be used for testing with DAX PMD
> > : mappings.  There are many examples like this, and I do not think we want
> > : to modify all applications that we want to evaluate/test with.
> >
> > sounds pretty convincing?
> >
> >
> > And if we go ahead with this, it looks like 4.7 material to me - it
> > affects ABI and we want to get that stabilized asap.  What do people
> > think?
>
> So I think Mathew didn't question the patch set as a whole. I think we all
> agree that we should align the virtual address we map to so that PMD
> mappings can be used. What Mathew was questioning was whether we really
> need to play tricks when logical offset in the file where mmap is starting
> is not aligned (and similarly for map length). Whether allowing PMD
> mappings for unaligned file offsets is worth the complication is IMO a
> valid question.

I was questioning the approach as a whole ... since we have userspace
already doing this in the form of NVML, do we really need the kernel to
do this for us?

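For readers following along, here is roughly what "userspace doing this"
can look like.  This is only a sketch of one common technique (reserve a
larger range, then map at an aligned address inside it); it is not taken
from NVML or from the patch set.  The helper name mmap_pmd_aligned() and
the 2 MiB PMD_ALIGN_SIZE value are assumptions for illustration (2 MiB is
the PMD size on x86-64 with 4 KiB base pages).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/* Assumption: 2 MiB PMD mappings, as on x86-64 with 4 KiB base pages. */
#define PMD_ALIGN_SIZE ((size_t)2 << 20)

/* Round addr up to the next multiple of align (a power of two). */
static uintptr_t align_up(uintptr_t addr, size_t align)
{
	assert(align != 0 && (align & (align - 1)) == 0);
	return (addr + align - 1) & ~((uintptr_t)align - 1);
}

/*
 * Hypothetical helper: behaves like mmap(NULL, len, ...), except that the
 * returned address is a multiple of PMD_ALIGN_SIZE.  The caller is still
 * responsible for passing a suitably aligned file offset.
 */
static void *mmap_pmd_aligned(size_t len, int prot, int flags, int fd,
			      off_t off)
{
	size_t span = len + PMD_ALIGN_SIZE;
	size_t page = (size_t)sysconf(_SC_PAGESIZE);
	uintptr_t base, start, tail, end;
	void *resv, *p;

	if (span < len)		/* size_t overflow */
		return MAP_FAILED;

	/* Reserve address space large enough to contain an aligned range. */
	resv = mmap(NULL, span, PROT_NONE, MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
	if (resv == MAP_FAILED)
		return MAP_FAILED;

	base = (uintptr_t)resv;
	start = align_up(base, PMD_ALIGN_SIZE);
	end = base + span;

	/* Replace the aligned part of the reservation with the real mapping. */
	p = mmap((void *)start, len, prot, flags | MAP_FIXED, fd, off);
	if (p == MAP_FAILED) {
		int saved = errno;

		munmap(resv, span);
		errno = saved;
		return MAP_FAILED;
	}

	/* Give back the unused head and tail of the reservation. */
	if (start > base)
		munmap(resv, start - base);
	tail = align_up(start + len, page);
	if (tail < end)
		munmap((void *)tail, end - tail);

	return p;
}
```

An application that currently calls mmap(NULL, len, prot, MAP_SHARED, fd, 0)
on a DAX file would call mmap_pmd_aligned(len, prot, MAP_SHARED, fd, 0)
instead.  That is exactly the kind of per-application change the patch set
is trying to avoid, which is the trade-off being argued in this thread.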
Now, a further wrinkle.  We have two competing patch sets (from Kirill
and Hugh) which are going to give us THP for page cache filesystems.
I would suggest that this is not DAX functionality but rather VFS
functionality to opportunistically align all mmaps on files which are
reasonably likely to be able to use THP.

I hadn't thought about this until earlier today, and I'm sorry I didn't
raise it further.  Perhaps we can do a lightning session on this later
today at LSFMM since all six (Toshi, Andrew, Jan, Hugh, Kirill and myself)
are here.