From: Jan Kara
Subject: Re: [PATCH] mm: replace FAULT_FLAG_SIZE with parameter to huge_fault
Date: Wed, 8 Feb 2017 09:41:56 +0100
Message-ID: <20170208084156.GD26317@quack2.suse.cz>
References: <148615748258.43180.1690152053774975329.stgit@djiang5-desk3.ch.intel.com>
 <20170206143648.GA461@infradead.org>
 <20170206172731.GA17515@infradead.org>
 <20170207084411.GA527@node.shutemov.name>
To: Dan Williams
Cc: linux-xfs-u79uwXL29TY76Z2rM5mHXA@public.gmane.org, Matthew Wilcox,
 "linux-nvdimm-hn68Rpc1hR1g9hUCZPvPmw@public.gmane.org", Dave Hansen,
 Christoph Hellwig, Linux MM, "Kirill A. Shutemov", Jan Kara,
 "Kirill A. Shutemov", Andrew Morton, linux-ext4, Vlastimil Babka
List-Id: linux-ext4.vger.kernel.org

On Tue 07-02-17 08:56:56, Dan Williams wrote:
> On Tue, Feb 7, 2017 at 12:44 AM, Kirill A. Shutemov wrote:
> > On Mon, Feb 06, 2017 at 09:30:22AM -0800, Dan Williams wrote:
> >> On Mon, Feb 6, 2017 at 9:27 AM, Christoph Hellwig wrote:
> >> > On Mon, Feb 06, 2017 at 08:24:48AM -0800, Dan Williams wrote:
> >> >> > Also can be use this opportunity
> >> >> > to fold ->huge_fault into ->fault?
> >
> > BTW, for tmpfs we already use ->fault for both small and huge pages.
> > If ->fault returned THP, core mm look if it's possible to map the page as
> > huge in this particular VMA (due to size/alignment). If yes mm maps the
> > page with PMD, if not fallback to PTE.
> >
> > I think it would be nice to do the same for DAX: filesystem provides core
> > mm with largest page this part of file can be mapped with (base aligned
> > address + lenght for DAX) and core mm sort out the rest.
> >
> For DAX we would need plumb pfn_t into the core mm so that we have the
> PFN_DEV and PFN_MAP flags beyond the raw pfn.

So we can pass the necessary information through struct vm_fault rather
easily. However, due to DAX locking we cannot really "return" a pfn for
generic code to install (we need to unlock radix tree locks after modifying
page tables). So if we want generic code to handle PFNs, what needs to be
done is to teach finish_fault() to handle a pfn_t which is passed to it and
install it in page tables.

Long term we could transition all page fault handlers (at least the
non-trivial ones) to using finish_fault(), which would IMO make the code
flow easier to follow and export less of MM internals into drivers.
However, there are so many fault handlers that I didn't have a good
motivation to do that yet.

								Honza
-- 
Jan Kara
SUSE Labs, CR
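
To illustrate the ordering constraint described in the reply, here is a
minimal user-space sketch. It is not kernel code: every toy_* name, the
struct layouts and the single boolean standing in for the DAX radix tree
entry lock are assumptions made up for illustration. It only shows the
shape of the idea: the handler puts the pfn and its flags into the fault
descriptor, calls a finish_fault()-like helper while still holding the
lock, and unlocks only after the page table has been modified.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Toy user-space model. All names here are hypothetical, not kernel API. */

#define TOY_PFN_DEV (1u << 0)   /* stand-in for the PFN_DEV flag */
#define TOY_PFN_MAP (1u << 1)   /* stand-in for the PFN_MAP flag */
#define TOY_NR_PTES 16

/* A pfn together with its flags, loosely modelled on the idea of pfn_t. */
struct toy_pfn {
	uint64_t val;
	unsigned int flags;
};

struct toy_mm {
	struct toy_pfn ptes[TOY_NR_PTES];  /* toy page table */
	bool present[TOY_NR_PTES];
	bool entry_locked;                 /* models the DAX entry lock */
	bool installed_under_lock;         /* lock state seen at install time */
};

/* Fault descriptor: the handler passes the pfn to generic code through it. */
struct toy_vm_fault {
	struct toy_mm *mm;
	size_t index;                      /* page table slot that faulted */
	struct toy_pfn pfn;
};

/* Generic side: install the pfn carried in the fault descriptor. */
static int toy_finish_fault(struct toy_vm_fault *vmf)
{
	struct toy_mm *mm = vmf->mm;

	if (vmf->index >= TOY_NR_PTES)
		return -1;

	mm->ptes[vmf->index] = vmf->pfn;
	mm->present[vmf->index] = true;
	mm->installed_under_lock = mm->entry_locked;
	return 0;
}

/*
 * Handler side: the pfn is not returned to the caller for later
 * installation. Instead the handler calls the generic helper itself and
 * drops the lock only after the page table has been modified.
 */
static int toy_dax_fault(struct toy_vm_fault *vmf, uint64_t pfn)
{
	int ret;

	vmf->mm->entry_locked = true;
	vmf->pfn.val = pfn;
	vmf->pfn.flags = TOY_PFN_DEV | TOY_PFN_MAP;

	ret = toy_finish_fault(vmf);

	vmf->mm->entry_locked = false;
	return ret;
}
```

A caller would zero-initialise a struct toy_mm, point a struct
toy_vm_fault at it with the faulting index, and call toy_dax_fault(); the
installed_under_lock field then records that the install happened before
the unlock.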