Date: Fri, 14 Nov 2014 02:58:33 +0200
From: "Kirill A. Shutemov"
To: Linus Torvalds
Cc: Jerome Glisse, Andrew Morton, Linux Kernel Mailing List, linux-mm,
    Joerg Roedel, Mel Gorman, "H. Peter Anvin", Peter Zijlstra,
    Andrea Arcangeli, Johannes Weiner, Larry Woodman, Rik van Riel,
    Dave Airlie, Brendan Conoboy, Joe Donohue, Duncan Poole,
    Sherry Cheung, Subhash Gutti, John Hubbard, Mark Hairgrove,
    Lucien Dunning, Cameron Buschardt, Arvind Gopalakrishnan,
    Shachar Raindel, Liran Liss, Roland Dreier, Ben Sander,
    Greg Stoner, John Bridgman, Michael Mantor, Paul Blinzer,
    Laurent Morichetti, Alexander Deucher, Oded Gabbay, Jérôme Glisse
Subject: Re: [PATCH 3/5] lib: lockless generic and arch independent page table (gpt) v2.
Message-ID: <20141114005833.GA1572@node.dhcp.inet.fi>
References: <1415644096-3513-1-git-send-email-j.glisse@gmail.com>
    <1415644096-3513-4-git-send-email-j.glisse@gmail.com>
    <20141110205814.GA4186@gmail.com>
    <20141110225036.GB4186@gmail.com>

On Thu, Nov 13, 2014 at 03:50:02PM -0800, Linus Torvalds wrote:
> +/*
> + * The 'tree_level' data only describes one particular level
> + * of the tree. The upper levels are totally invisible to the
> + * user of the tree walker, since the tree walker will walk
> + * those using the tree definitions.
> + *
> + * NOTE! "struct tree_entry" is an opaque type, and is just a
> + * used as a pointer to the particular level. You can figure
> + * out which level you are at by looking at the "tree_level",
> + * but even better is to just use different "lookup()"
> + * functions for different levels, at which point the
> + * function is inherent to the level.

Please, don't.

We will end up with the same last-level-centric code that we have now in
the mm subsystem: all code only cares about the pte. That makes
implementing variable page size support really hard and leads to a
copy-paste approach. And to the hugetlb parallel world...

It would be nice to have a tree_level description generic enough to get
rid of pte_present()/pte_dirty()/pte_* and implement generic helpers
instead.

Apart from the variable page size problem, we could one day support
different CPU page table formats selected at runtime: PAE/non-PAE on
32-bit x86 or LPAE/non-LPAE on ARM in one binary kernel image.

The big topic is how to get it done without significant runtime cost :-/

--
 Kirill A. Shutemov
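
One way to picture the generic helpers suggested above is a per-level
descriptor that the helpers consult instead of hard-coding pte semantics.
The sketch below is only an illustration under stated assumptions:
struct level_desc, entry_present(), entry_is_leaf(), entry_dirty(),
entry_map_size() and example_levels are hypothetical names, not part of
the patch or of the kernel, and the bit positions are loosely modelled on
x86-64.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-level description; NOT the struct from the patch. */
struct level_desc {
	unsigned int shift;	/* log2 of the bytes mapped by one entry */
	uint64_t present_mask;	/* bit(s) meaning "entry is valid" */
	uint64_t leaf_mask;	/* bit(s) meaning "maps memory directly" */
	uint64_t dirty_mask;	/* bit(s) meaning "mapping was written" */
	bool always_leaf;	/* true for the last level of the tree */
};

/* Generic replacement for pte_present()/pmd_present()/... */
static inline bool entry_present(const struct level_desc *l, uint64_t e)
{
	return (e & l->present_mask) != 0;
}

/* A present entry is a leaf at the last level, or when the leaf bit is set. */
static inline bool entry_is_leaf(const struct level_desc *l, uint64_t e)
{
	return entry_present(l, e) &&
	       (l->always_leaf || (e & l->leaf_mask) != 0);
}

/* Generic replacement for pte_dirty()/pmd_dirty(): only leaves can be dirty. */
static inline bool entry_dirty(const struct level_desc *l, uint64_t e)
{
	return entry_is_leaf(l, e) && (e & l->dirty_mask) != 0;
}

/* Size of the region one entry at this level maps (the "page size"). */
static inline uint64_t entry_map_size(const struct level_desc *l)
{
	assert(l->shift < 64);
	return UINT64_C(1) << l->shift;
}

/* Illustrative values only: a 4 KiB level and a 2 MiB level. */
static const struct level_desc example_levels[] = {
	{ .shift = 12, .present_mask = UINT64_C(1) << 0, .leaf_mask = 0,
	  .dirty_mask = UINT64_C(1) << 6, .always_leaf = true },
	{ .shift = 21, .present_mask = UINT64_C(1) << 0,
	  .leaf_mask = UINT64_C(1) << 7,
	  .dirty_mask = UINT64_C(1) << 6, .always_leaf = false },
};
```

With this shape the same entry_dirty() call serves both a 4 KiB entry and
a 2 MiB huge entry, so callers need not be written per level. The open
question from the message still applies: every helper now loads a
descriptor, and whether that cost is acceptable on hot paths is not
something this sketch answers.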