Date: Thu, 4 Jun 2015 22:43:58 -0700
Subject: Re: "Directly mapped persistent memory page cache"
From: Dan Williams
To: Jerome Glisse
Cc: Dave Chinner, Ingo Molnar, Rik van Riel, Linus Torvalds,
 John Stoffel, Dave Hansen, Linux Kernel Mailing List, Boaz Harrosh,
 Jan Kara, Mike Snitzer, Neil Brown, Benjamin Herrenschmidt,
 Heiko Carstens, Chris Mason, Paul Mackerras, "H. Peter Anvin",
 Christoph Hellwig, Alasdair Kergon, "linux-nvdimm@lists.01.org",
 Mel Gorman, Matthew Wilcox, Ross Zwisler, Martin Schwidefsky,
 Jens Axboe, "Theodore Ts'o", "Martin K. Petersen", Julia Lawall,
 Tejun Heo, linux-fsdevel, Andrew Morton
In-Reply-To: <20150512144752.GA4003@gmail.com>
References: <20150507191107.GB22952@gmail.com>
 <554CBE17.4070904@redhat.com>
 <20150508140556.GA2185@gmail.com>
 <21836.51957.715473.780762@quad.stoffel.home>
 <554CEB5D.90209@redhat.com>
 <20150509084510.GA10587@gmail.com>
 <20150511082536.GP4327@dastard>
 <20150511091836.GA29191@gmail.com>
 <20150512005347.GQ4327@dastard>
 <20150512144752.GA4003@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, May 12, 2015 at 7:47 AM, Jerome Glisse wrote:
> On Tue, May 12, 2015 at 10:53:47AM +1000, Dave Chinner wrote:
>> On Mon, May 11, 2015 at 11:18:36AM +0200, Ingo Molnar wrote:
>> IMO, we need to be designing around the concept that the filesystem
>> manages the pmem space, and the MM subsystem simply uses the block
>> mapping information provided to it from the filesystem to decide how
>> it references and maps the regions into the user's address space or
>> for DMA. The mm subsystem does not manage the pmem space, its
>> alignment or how it is allocated to user files. Hence page mappings
>> can only be - at best - reactive to what the filesystem does with
>> its free space. The mm subsystem already has to query the block
>> layer to get mappings on page faults, so it's only a small stretch
>> to enhance the DAX mapping request to ask for a large page mapping
>> rather than a 4k mapping. If the fs can't do a large page mapping,
>> you'll get a 4k aligned mapping back.
>>
>> What I'm trying to say is that the mapping behaviour needs to be
>> designed with the way filesystems and the mm subsystem interact in
>> mind, not from a pre-formed "direct IO is bad, we must use the page
>> cache" point of view. The filesystem and the mm subsystem must
>> co-operate to allow things like large page mappings to be made and
>> hence looking at the problem purely from a mm<->pmem device
>> perspective as you are ignores an important chunk of the system:
>> the part that actually manages the pmem space...
>
> I am all for letting the filesystem manage pmem, but I think having
> struct page exposed to mm allows the mm side to stay ignorant of what
> is really behind.
> Also, if I could share more code with others I would
> be happier :)
>

As this thread directly references one of the topics listed for the
Persistent Memory microconference, I do not think it is unreasonable to
shamelessly hijack it to promote Linux Plumbers 2015. Tomorrow is the
deadline for earlybird registration, and the topic submission tool is
now open for submission of this or any other persistent memory topic.

https://linuxplumbersconf.org/2015/attend/
https://linuxplumbersconf.org/2015/how-to-submit-microconference-discussions-topics/