Subject: Re: [PATCH v2 00/10] evacuate struct page from the block layer, introduce __pfn_t
From: Linus Torvalds
To: Dan Williams
Cc: Linux Kernel Mailing List, Boaz Harrosh, Jan Kara, Mike Snitzer,
    Neil Brown, Benjamin Herrenschmidt, Dave Hansen, Heiko Carstens,
    Chris Mason, Paul Mackerras, "H. Peter Anvin", Christoph Hellwig,
    Alasdair Kergon, linux-nvdimm@ml01.01.org, Ingo Molnar, Mel Gorman,
    Matthew Wilcox, Ross Zwisler, Rik van Riel, Martin Schwidefsky,
    Jens Axboe, "Theodore Ts'o", "Martin K. Petersen", Julia Lawall,
    Tejun Heo, linux-fsdevel, Andrew Morton
Date: Wed, 6 May 2015 15:10:07 -0700
In-Reply-To: <20150506200219.40425.74411.stgit@dwillia2-desk3.amr.corp.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, May 6, 2015 at 1:04 PM, Dan Williams wrote:
>
> The motivation for this change is persistent memory and the desire to
> use it not only via the pmem driver, but also as a memory target for I/O
> (DAX, O_DIRECT, DMA, RDMA, etc) in other parts of the kernel.

I detest this approach.

I'd much rather go exactly the other way around, and do the dynamic
"struct page" instead.

Add a flag to "struct page" to mark it as a fake entry and teach
"page_to_pfn()" to look up the actual pfn some way (that union that
contains "index" looks like a good target to also contain 'pfn', for
example).
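
[Illustrative sketch, not part of the original message.] The "fake
entry" idea above could look roughly like the user-space toy model
below. Every name in it (toy_page, PG_FAKE, toy_mem_map, TOY_BASE_PFN,
toy_page_to_pfn, toy_page_from_pfn, toy_put_page) is hypothetical and
made up for this sketch; it does not reflect the kernel's real
"struct page" layout, flag bits, or memory model. It only assumes
that normal pages derive their pfn from their position in a page
array, while flagged pages carry the pfn in the union slot that
otherwise holds "index".

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical flag bit marking a dynamically created ("fake") page. */
#define PG_FAKE (1UL << 0)

/* Toy stand-in for "struct page"; NOT the kernel's actual layout. */
struct toy_page {
    unsigned long flags;
    int refcount;
    union {
        unsigned long index; /* offset within a mapping (normal pages) */
        unsigned long pfn;   /* stored frame number (fake pages only)  */
    };
};

/* Toy flat memory map: normal pages live here, one entry per frame. */
#define TOY_NR_PAGES 16
#define TOY_BASE_PFN 0x1000UL
static struct toy_page toy_mem_map[TOY_NR_PAGES];

/* Normal pages: pfn from array position. Fake pages: pfn from the union. */
static unsigned long toy_page_to_pfn(const struct toy_page *page)
{
    if (page->flags & PG_FAKE)
        return page->pfn;
    return TOY_BASE_PFN + (unsigned long)(page - toy_mem_map);
}

/* Build a fake page on demand for a pfn that has no toy_mem_map entry. */
static struct toy_page *toy_page_from_pfn(unsigned long pfn)
{
    struct toy_page *page = calloc(1, sizeof(*page));

    if (!page)
        return NULL;
    page->flags = PG_FAKE;
    page->refcount = 1; /* the caller holds the initial reference */
    page->pfn = pfn;
    return page;
}

/* Drop one reference; a fake page is freed when the last one goes away. */
static void toy_put_page(struct toy_page *page)
{
    assert(page->refcount > 0);
    if (--page->refcount == 0 && (page->flags & PG_FAKE))
        free(page);
}
```

In this toy model, a caller that only has a pfn for persistent memory
would call toy_page_from_pfn(), hand the result to code that expects
a page, and release it with toy_put_page() when done; code that
already works on array-backed pages is left unchanged.
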

Especially if this is mainly for persistent storage, we'll never have
issues with worrying about writing it back under memory pressure, so
allocating a "struct page" for these things shouldn't be a problem.
There's likely only a few paths that actually generate IO for those
things.

In other words, I'd really like our basic infrastructure to be for the
*normal* case, and the "struct page" is about so much more than just
"what's the target for IO". For normal IO, "struct page" is also what
serializes the IO so that you have a consistent view of the end result,
and there's obviously the reference count there too. So I really
*really* think that "struct page" is the better entity for describing
the actual IO, because it's the common and the generic thing, while a
"pfn" is not actually *enough* for IO in general, and you now end up
having to look up the "struct page" for the locking and refcounting
etc.

If you go the other way, and instead generate a "struct page" from the
pfn for the few cases that need it, you put the onus on odd behavior
where it belongs.

Yes, it might not be any simpler in the end, but I think it would be
conceptually much better.

                    Linus