In-Reply-To: <20160312183005.GA2525@linux.intel.com>
References: <1457730784-9890-1-git-send-email-matthew.r.wilcox@intel.com>
 <1457730784-9890-2-git-send-email-matthew.r.wilcox@intel.com>
 <20160312183005.GA2525@linux.intel.com>
Date: Sun, 13 Mar 2016 16:09:38 -0700
Subject: Re: [PATCH 1/3] pfn_t: Change the encoding
From: Dan Williams
To: Matthew Wilcox
Cc: Matthew Wilcox, linux-nvdimm@lists.01.org, linux-kernel@vger.kernel.org,
 Linux MM, Dave Hansen

On Sat, Mar 12, 2016 at 10:30 AM, Matthew Wilcox wrote:
> On Fri, Mar 11, 2016 at 01:40:20PM -0800, Dan Williams wrote:
>> On Fri, Mar 11, 2016 at 1:13 PM, Matthew Wilcox wrote:
>> > By moving the flag bits to the bottom, we encourage commonality
>> > between SGs with pages and those using pfn_t. We can also then insert
>> > a pfn_t into a radix tree, as it uses the same two bits for indirect &
>> > exceptional indicators.
>>
>> It's not immediately clear to me what we gain with SG entry
>> commonality. The down side is that we lose the property that
>> pfn_to_pfn_t() is a nop. This was Dave's suggestion so that the
>> nominal case did not change the binary layout of a typical pfn.
>
> I understand that motivation!
>
>> Can we just bit swizzle a pfn_t on insertion/retrieval from the radix?
>
> Of course we *can*, but we end up doing more swizzling that way than we
> do this way. In the Brave New Future where we're storing pfn_t in the
> radix tree, on a page fault we find the pfn_t in the radix tree then
> we want to insert it into the page tables. So DAX would first have to
> convert the radix tree entry to a pfn_t, then the page table code has to
> convert the pfn_t into a pte/pmd/pud (which we currently do by converting
> a pfn_t to a pfn, then converting the pfn to a pte/pmd/pud, but I assume
> that either the compiler optimises that into a single conversion, or we'll
> add pfn_t_pte to each architecture in future if it's actually a problem).
>
> Much easier to look up a pfn_t in the radix tree and pass it directly
> to vm_insert_mixed().
>
> If there's any part of the kernel that is doing a *lot* of conversion
> between pfn_t and pfn, that surely indicates a place in the kernel where
> we need to convert an interface from pfn to pfn_t.

So this is dependent on where pfn_t gets pushed in the future. For
example, if we revive using a pfn_t in a bio then I think the
pfn_to_pfn_t() conversions will be more prevalent than the fs/dax.c
radix usages.
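
For readers following the thread, the trade-off between the two encodings can be sketched in userspace C. This is only an illustration under stated assumptions: every name below (`demo_pfn_hi_t`, `demo_pfn_lo_t`, `DEMO_FLAG_BITS`, `DEMO_FLAG_DEV`) is hypothetical, the flag width of 4 bits is picked arbitrarily, and none of it is taken from the patch or from the kernel headers. It shows why a flagless conversion is a nop when the flags sit in the top bits, and why it becomes a shift when they sit in the bottom bits.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical names and widths, for illustration only. */
#define DEMO_FLAG_BITS 4u   /* assumed number of flag bits */
#define DEMO_FLAG_DEV  1u   /* assumed example flag value */

/* Encoding A: flags in the top bits, pfn bits left where they are. */
typedef struct { uint64_t val; } demo_pfn_hi_t;

/* Encoding B: flags in the bottom bits, pfn shifted up to make room. */
typedef struct { uint64_t val; } demo_pfn_lo_t;

static inline demo_pfn_hi_t hi_encode(uint64_t pfn, uint64_t flags)
{
    demo_pfn_hi_t p = { pfn | (flags << (64 - DEMO_FLAG_BITS)) };
    return p;
}

static inline uint64_t hi_to_pfn(demo_pfn_hi_t p)
{
    /* Mask off the top flag bits; the pfn needs no shift. */
    return p.val & (UINT64_MAX >> DEMO_FLAG_BITS);
}

static inline demo_pfn_lo_t lo_encode(uint64_t pfn, uint64_t flags)
{
    demo_pfn_lo_t p = { (pfn << DEMO_FLAG_BITS) | flags };
    return p;
}

static inline uint64_t lo_to_pfn(demo_pfn_lo_t p)
{
    /* Shift the flag bits out to recover the pfn. */
    return p.val >> DEMO_FLAG_BITS;
}

/* Self-check of the properties discussed in the thread. */
static inline void demo_check(void)
{
    uint64_t pfn = 0x1234;

    /* Top-bit flags: a flagless conversion leaves the value unchanged. */
    assert(hi_encode(pfn, 0).val == pfn);

    /* Bottom-bit flags: even a flagless conversion changes the layout. */
    assert(lo_encode(pfn, 0).val != pfn);

    /* Both encodings round-trip the pfn when a flag is set. */
    assert(hi_to_pfn(hi_encode(pfn, DEMO_FLAG_DEV)) == pfn);
    assert(lo_to_pfn(lo_encode(pfn, DEMO_FLAG_DEV)) == pfn);
}
```

Calling `demo_check()` from a test harness exercises both layouts. Which layout costs less in practice depends, as the reply above says, on whether plain-pfn conversions or radix tree insertions end up being the more common operation.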