Date: Fri, 26 Jun 2015 20:02:03 -0700
From: Mark Hairgrove
To: Jerome Glisse
CC: "akpm@linux-foundation.org", "linux-kernel@vger.kernel.org",
    "linux-mm@kvack.org", Linus Torvalds, "joro@8bytes.org", Mel Gorman,
    "H. Peter Anvin", Peter Zijlstra, Andrea Arcangeli, Johannes Weiner,
    Larry Woodman, Rik van Riel, Dave Airlie, Brendan Conoboy, Joe Donohue,
    Duncan Poole, Sherry Cheung, Subhash Gutti, John Hubbard, Lucien Dunning,
    Cameron Buschardt, Arvind Gopalakrishnan, Haggai Eran, Shachar Raindel,
    Liran Liss, Roland Dreier, Ben Sander, Greg Stoner, John Bridgman,
    Michael Mantor, Paul Blinzer, Laurent Morichetti, Alexander Deucher,
    Oded Gabbay, Jérôme Glisse, Jatin Kumar
Subject: Re: [PATCH 07/36] HMM: add per mirror page table v3.
In-Reply-To: <20150626164338.GB3748@gmail.com>
References: <1432236705-4209-1-git-send-email-j.glisse@gmail.com>
    <1432236705-4209-8-git-send-email-j.glisse@gmail.com>
    <20150626164338.GB3748@gmail.com>

On Fri, 26 Jun 2015, Jerome Glisse wrote:

> On Thu, Jun 25, 2015 at 04:05:48PM -0700, Mark Hairgrove wrote:
> > On Thu, 21 May 2015, j.glisse@gmail.com wrote:
> > > From: Jérôme Glisse
> > > [...]
> > >
> > > +     /* update() - update device mmu following an event.
> > > +      *
> > > +      * @mirror: The mirror that link process address space with the device.
> > > +      * @event: The event that triggered the update.
> > > +      * Returns: 0 on success or error code {-EIO, -ENOMEM}.
> > > +      *
> > > +      * Called to update device page table for a range of address.
> > > +      * The event type provide the nature of the update :
> > > +      *   - Range is no longer valid (munmap).
> > > +      *   - Range protection changes (mprotect, COW, ...).
> > > +      *   - Range is unmapped (swap, reclaim, page migration, ...).
> > > +      *   - Device page fault.
> > > +      *   - ...
> > > +      *
> > > +      * Thought most device driver only need to use pte_mask as it reflects
> > > +      * change that will happen to the HMM page table ie :
> > > +      *   new_pte = old_pte & event->pte_mask;
> >
> > Documentation request: It would be useful to break down exactly what is
> > required from the driver for each event type here, and what extra
> > information is provided by the type that isn't provided by the pte_mask.
>
> Mostly event tell you if you need to free or not the device page table for
> the range, which is not something you can infer from the pte_mask reliably.
> Difference btw migration and munmap for instance, same pte_mask but range
> is still valid in the migration case it will just be backed by a new set of
> pages.

Given that event->pte_mask and event->type provide redundant information,
are they both necessary?

With or without pte_mask, the below table would be helpful to have in the
comments for the ->update callback:

Event type          Driver action

HMM_NONE            N/A (driver will never get this)

HMM_FORK            Same as HMM_WRITE_PROTECT

HMM_ISDIRTY         Same as HMM_WRITE_PROTECT

HMM_MIGRATE         Make device PTEs invalid and use hmm_pte_set_dirty or
                    hmm_mirror_range_dirty if applicable

HMM_MUNMAP          Same as HMM_MIGRATE, but the driver may take this as a
                    hint to free device page tables and other resources
                    associated with this range

HMM_DEVICE_RFAULT   Read hmm_ptes using hmm_pt_iter and write them on the
                    device

HMM_DEVICE_WFAULT   Same as HMM_DEVICE_RFAULT

HMM_WRITE_PROTECT   Remove write permission from device PTEs and use
                    hmm_pte_set_dirty or hmm_mirror_range_dirty if
                    applicable

> > > [...]
> > > @@ -142,6 +223,7 @@ int hmm_device_unregister(struct hmm_device *device);
> > >   * @kref: Reference counter (private to HMM do not use).
> > >   * @dlist: List of all hmm_mirror for same device.
> > >   * @mlist: List of all hmm_mirror for same process.
> > > + * @pt: Mirror page table.
> > >   *
> > >   * Each device that want to mirror an address space must register one of this
> > >   * struct for each of the address space it wants to mirror. Same device can
> > > @@ -154,6 +236,7 @@ struct hmm_mirror {
> > >       struct kref             kref;
> > >       struct list_head        dlist;
> > >       struct hlist_node       mlist;
> > > +     struct hmm_pt           pt;
> >
> > Documentation request: Why does each mirror have its own separate set of
> > page tables rather than the hmm keeping one set for all devices? This is
> > so different devices can have different permissions for the same address
> > range, correct?
>
> Several reasons, first and mostly dma mapping, while i have plan to allow
> to share dma mapping directory btw devices this require work in the dma
> layer first. Second reasons is, like you point out, different permissions,
> like one device requesting atomic access ie the device will be the only
> one with write permission and HMM need somewhere to store that information
> per device per address. It also helps to avoid calling device driver on a
> range that one device does not mirror.

Sure, that makes sense. Can you put this in the documentation somewhere,
perhaps in the header comments for struct hmm_mirror?

Thanks!
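
For illustration only, here is a rough sketch of how a driver might fold
the event table from earlier in this message into a single dispatch. It
is not code from the patchset: the enum values, the sketch_action enum,
and both function names are assumptions made up for this example. Only
the HMM_* event names come from the thread.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the HMM event types named in this thread.
 * The ordering and values are assumptions, not the real definitions. */
enum hmm_event_type {
	HMM_NONE,
	HMM_FORK,
	HMM_ISDIRTY,
	HMM_MIGRATE,
	HMM_MUNMAP,
	HMM_DEVICE_RFAULT,
	HMM_DEVICE_WFAULT,
	HMM_WRITE_PROTECT,
};

/* Hypothetical summary of what the driver does to its device PTEs. */
enum sketch_action {
	SKETCH_INVALID_EVENT,	/* driver should never see this event */
	SKETCH_WRITE_PROTECT,	/* drop write permission, report dirtiness */
	SKETCH_INVALIDATE,	/* make device PTEs invalid, report dirtiness */
	SKETCH_POPULATE,	/* read hmm_ptes and write them on the device */
};

/* Map an event type to the driver action listed in the table. */
enum sketch_action sketch_action_for(enum hmm_event_type type)
{
	switch (type) {
	case HMM_FORK:
	case HMM_ISDIRTY:
	case HMM_WRITE_PROTECT:
		return SKETCH_WRITE_PROTECT;
	case HMM_MIGRATE:
	case HMM_MUNMAP:
		return SKETCH_INVALIDATE;
	case HMM_DEVICE_RFAULT:
	case HMM_DEVICE_WFAULT:
		return SKETCH_POPULATE;
	case HMM_NONE:
	default:
		return SKETCH_INVALID_EVENT;
	}
}

/* Only munmap hints that device page tables for the range may be freed;
 * after a migration the range stays valid with new backing pages. */
bool sketch_may_free_range(enum hmm_event_type type)
{
	return type == HMM_MUNMAP;
}
```

The second helper is the part pte_mask alone cannot express: migration
and munmap would map to the same action, but only munmap lets the driver
release its page tables for the range.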