Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752881AbbF0Beg (ORCPT ); Fri, 26 Jun 2015 21:34:36 -0400 Received: from hqemgate15.nvidia.com ([216.228.121.64]:14765 "EHLO hqemgate15.nvidia.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753526AbbF0BeZ (ORCPT ); Fri, 26 Jun 2015 21:34:25 -0400 X-PGP-Universal: processed; by hqnvupgp08.nvidia.com on Fri, 26 Jun 2015 18:34:25 -0700 Date: Fri, 26 Jun 2015 18:34:16 -0700 From: Mark Hairgrove To: Jerome Glisse CC: "akpm@linux-foundation.org" , "linux-kernel@vger.kernel.org" , "linux-mm@kvack.org" , Linus Torvalds , "joro@8bytes.org" , Mel Gorman , "H. Peter Anvin" , Peter Zijlstra , Andrea Arcangeli , Johannes Weiner , Larry Woodman , Rik van Riel , Dave Airlie , Brendan Conoboy , Joe Donohue , Duncan Poole , Sherry Cheung , Subhash Gutti , John Hubbard , Lucien Dunning , Cameron Buschardt , Arvind Gopalakrishnan , Haggai Eran , Shachar Raindel , Liran Liss , Roland Dreier , Ben Sander , Greg Stoner , John Bridgman , Michael Mantor , Paul Blinzer , Laurent Morichetti , Alexander Deucher , Oded Gabbay , =?ISO-8859-15?Q?J=E9r=F4me_Glisse?= , Jatin Kumar Subject: Re: [PATCH 06/36] HMM: add HMM page table v2. In-Reply-To: <20150626163030.GA3748@gmail.com> Message-ID: References: <1432236705-4209-1-git-send-email-j.glisse@gmail.com> <1432236705-4209-7-git-send-email-j.glisse@gmail.com> <20150626163030.GA3748@gmail.com> User-Agent: Alpine 2.00 (DEB 1167 2008-08-23) X-NVConfidentiality: public MIME-Version: 1.0 X-Originating-IP: [172.17.162.12] X-ClientProxiedBy: HQMAIL105.nvidia.com (172.20.187.12) To HQMAIL108.nvidia.com (172.18.146.13) Content-Type: multipart/mixed; boundary="8323329-1411506140-1435368864=:20890" Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5361 Lines: 131 --8323329-1411506140-1435368864=:20890 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8BIT On Fri, 26 Jun 2015, Jerome Glisse wrote: > On Thu, Jun 25, 2015 at 03:57:29PM -0700, Mark Hairgrove wrote: > > On Thu, 21 May 2015, j.glisse@gmail.com wrote: > > > From: J?r?me Glisse > > > [...] > > > + > > > +void hmm_pt_iter_init(struct hmm_pt_iter *iter); > > > +void hmm_pt_iter_fini(struct hmm_pt_iter *iter, struct hmm_pt *pt); > > > +unsigned long hmm_pt_iter_next(struct hmm_pt_iter *iter, > > > + struct hmm_pt *pt, > > > + unsigned long addr, > > > + unsigned long end); > > > +dma_addr_t *hmm_pt_iter_update(struct hmm_pt_iter *iter, > > > + struct hmm_pt *pt, > > > + unsigned long addr); > > > +dma_addr_t *hmm_pt_iter_fault(struct hmm_pt_iter *iter, > > > + struct hmm_pt *pt, > > > + unsigned long addr); > > > > I've got a few more thoughts on hmm_pt_iter after looking at some of the > > later patches. I think I've convinced myself that this patch functionally > > works as-is, but I've got some suggestions and questions about the design. > > > > Right now there are these three major functions: > > > > 1) hmm_pt_iter_update(addr) > > - Returns the hmm_pte * for addr, or NULL if none exists. > > > > 2) hmm_pt_iter_fault(addr) > > - Returns the hmm_pte * for addr, allocating a new one if none exists. > > > > 3) hmm_pt_iter_next(addr, end) > > - Returns the next possibly-valid address. The caller must use > > hmm_pt_iter_update to check if there really is an hmm_pte there. > > > > In my view, there are two sources of confusion here: > > - Naming. "update" shares a name with the HMM mirror callback, and it also > > implies that the page tables are "updated" as a result of the call. > > "fault" likewise implies that the function handles a fault in some way. > > Neither of these implications are true. > > Maybe hmm_pt_iter_walk & hmm_pt_iter_populate are better name ? hmm_pt_iter_populate sounds good. See below for _walk. > > > > - hmm_pt_iter_next and hmm_pt_iter_update have some overlapping > > functionality when compared to traditional iterators, requiring the > > callers to all do this sort of thing: > > > > hmm_pte = hmm_pt_iter_update(&iter, &mirror->pt, addr); > > if (!hmm_pte) { > > addr = hmm_pt_iter_next(&iter, &mirror->pt, > > addr, event->end); > > continue; > > } > > > > Wouldn't it be more efficient and simpler to have _next do all the > > iteration internally so it always returns the next valid entry? Then you > > could combine _update and _next into a single function, something along > > these lines (which also addresses the naming concern): > > > > void hmm_pt_iter_init(iter, pt, start, end); > > unsigned long hmm_pt_iter_next(iter, hmm_pte *); > > unsigned long hmm_pt_iter_next_alloc(iter, hmm_pte *); > > > > hmm_pt_iter_next would return the address and ptep of the next valid > > entry, taking the place of the existing _update and _next functions. > > hmm_pt_iter_next_alloc takes the place of _fault. > > > > Also, since the _next functions don't take in an address, the iterator > > doesn't have to handle the input addr being different from iter->cur. > > It would still need to do the same kind of test, this test is really to > know when you switch from one directory to the next and to drop and take > reference accordingly. But all of the directory references are already hidden entirely in the iterator _update function. The caller only has to worry about taking references on the bottom level, so I don't understand why the iterator needs to return to the caller when it hits the end of a directory. Or for that matter, why it returns every possible index within a directory to the caller whether that index is valid or not. If _next only returned to the caller when it hit a valid hmm_pte (or end), then only one function would be needed (_next) instead of two (_update/_walk and _next). > > > > The logical extent of this is a callback approach like mm_walk. That would > > be nice because the caller wouldn't have to worry about making the _init > > and _fini calls. I assume you didn't go with this approach because > > sometimes you need to iterate over hmm_pt while doing an mm_walk itself, > > and you didn't want the overhead of nesting those? > > Correct i do not want to do a hmm_pt_walk inside a mm_walk, that sounded and > looked bad in my mind. That being said i could add a hmm_pt_walk like mm_walk > for device driver and simply have it using the hmm_pt_iter internally. I agree that nesting walks feels bad. If we can get the hmm_pt_iter API simple enough, I don't think an hmm_pt_walk callback approach is necessary. > > > > Finally, another minor thing I just noticed: shouldn't hmm_pt.h include > > since it uses all of the clear/set/test bit APIs? > > Good catch, i forgot that. > > Cheers, > J?r?me > --8323329-1411506140-1435368864=:20890-- -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/