Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752780AbbFIByo (ORCPT ); Mon, 8 Jun 2015 21:54:44 -0400 Received: from hqemgate16.nvidia.com ([216.228.121.65]:13003 "EHLO hqemgate16.nvidia.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751956AbbFIByj (ORCPT ); Mon, 8 Jun 2015 21:54:39 -0400 X-PGP-Universal: processed; by hqnvupgp07.nvidia.com on Mon, 08 Jun 2015 18:51:56 -0700 Date: Mon, 8 Jun 2015 18:54:29 -0700 From: Mark Hairgrove To: Jerome Glisse CC: "akpm@linux-foundation.org" , "linux-kernel@vger.kernel.org" , "linux-mm@kvack.org" , Linus Torvalds , "joro@8bytes.org" , Mel Gorman , "H. Peter Anvin" , Peter Zijlstra , Andrea Arcangeli , Johannes Weiner , Larry Woodman , Rik van Riel , Dave Airlie , Brendan Conoboy , Joe Donohue , Duncan Poole , Sherry Cheung , Subhash Gutti , John Hubbard , Lucien Dunning , Mark Hairgrove , Cameron Buschardt , Arvind Gopalakrishnan , Haggai Eran , Shachar Raindel , Liran Liss , Roland Dreier , Ben Sander , Greg Stoner , John Bridgman , Michael Mantor , Paul Blinzer , Laurent Morichetti , Alexander Deucher , Oded Gabbay , =?ISO-8859-15?Q?J=E9r=F4me_Glisse?= , Jatin Kumar , "linux-rdma@vger.kernel.org" Subject: Re: [PATCH 05/36] HMM: introduce heterogeneous memory management v3. In-Reply-To: <20150608211740.GA5241@gmail.com> Message-ID: References: <1432236705-4209-1-git-send-email-j.glisse@gmail.com> <1432236705-4209-6-git-send-email-j.glisse@gmail.com> <20150608211740.GA5241@gmail.com> User-Agent: Alpine 2.00 (DEB 1167 2008-08-23) X-NVConfidentiality: public MIME-Version: 1.0 X-Originating-IP: [172.17.162.12] X-ClientProxiedBy: HQMAIL101.nvidia.com (172.20.187.10) To HQMAIL107.nvidia.com (172.20.187.13) Content-Type: multipart/mixed; boundary="8323329-1917143692-1433814877=:1802" Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4386 Lines: 98 --8323329-1917143692-1433814877=:1802 Content-Type: text/plain; charset="iso-8859-1" Content-Transfer-Encoding: 8BIT On Mon, 8 Jun 2015, Jerome Glisse wrote: > On Mon, Jun 08, 2015 at 12:40:18PM -0700, Mark Hairgrove wrote: > > > > > > On Thu, 21 May 2015, j.glisse@gmail.com wrote: > > > > > From: J?r?me Glisse > > > > > > This patch only introduce core HMM functions for registering a new > > > mirror and stopping a mirror as well as HMM device registering and > > > unregistering. > > > > > > [...] > > > > > > +/* struct hmm_device_operations - HMM device operation callback > > > + */ > > > +struct hmm_device_ops { > > > + /* release() - mirror must stop using the address space. > > > + * > > > + * @mirror: The mirror that link process address space with the device. > > > + * > > > + * When this is call, device driver must kill all device thread using > > > + * this mirror. Also, this callback is the last thing call by HMM and > > > + * HMM will not access the mirror struct after this call (ie no more > > > + * dereference of it so it is safe for the device driver to free it). > > > + * It is call either from : > > > + * - mm dying (all process using this mm exiting). > > > + * - hmm_mirror_unregister() (if no other thread holds a reference) > > > + * - outcome of some device error reported by any of the device > > > + * callback against that mirror. > > > + */ > > > + void (*release)(struct hmm_mirror *mirror); > > > +}; > > > > The comment that ->release is called when the mm dies doesn't match the > > implementation. ->release is only called when the mirror is destroyed, and > > that can only happen after the mirror has been unregistered. This may not > > happen until after the mm dies. > > > > Is the intent for the driver to get the callback when the mm goes down? > > That seems beneficial so the driver can kill whatever's happening on the > > device. Otherwise the device may continue operating in a dead address > > space until the driver's file gets closed and it unregisters the mirror. > > This was the intent before merging free & release. I guess i need to > reinstate the free versus release callback. Sadly the lifetime for HMM > is more complex than mmu_notifier as we intend the mirror struct to > be embedded into a driver private struct. Can you clarify how that's different from mmu_notifiers? Those are also embedded into a driver-owned struct. Is the goal to allow calling hmm_mirror_unregister from within the "mm is dying" HMM callback? I don't know whether that's really necessary as long as there's some association between the driver files and the mirrors. > > If so, I think there's a race here in the case of mm teardown happening > > concurrently with hmm_mirror_unregister. > > > > [...] > > > > Do you agree that this sequence can happen, or am I missing something > > which prevents it? > > Can't happen because child have mm->hmm = NULL ie only one hmm per mm > and hmm is tie to only one mm. It is the responsability of the device > driver to make sure same apply to private reference to the hmm mirror > struct ie hmm_mirror should never be tie to a private file struct. It's useful for the driver to have some association between files and mirrors. If the file is closed prior to process exit we would like to unregister the mirror, otherwise it will persist until process teardown. The association doesn't have to be 1:1 but having the files ref count the mirror or something would be useful. But even if we assume no association at all between files and mirrors, are you sure that prevents the race? The driver may choose to unregister the hmm_device at any point once its files are closed. In the case of module unload the device unregister can't be prevented. If mm teardown hasn't happened yet mirrors may still be active and registered on that hmm_device. The driver thus has to first call hmm_mirror_unregister on all active mirrors, then call hmm_device_unregister. mm teardown of those mirrors may trigger at any point in this sequence, so we're right back to that race. --8323329-1917143692-1433814877=:1802-- -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/