To: Christoph Lameter
Cc: akpm@linux-foundation.org, Andrea Arcangeli, Robin Holt, Avi Kivity,
    Izik Eidus, kvm-devel@lists.sourceforge.net, Peter Zijlstra,
    general@lists.openfabrics.org, Steve Wise, Kanoj Sarcar,
    steiner@sgi.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    daniel.blueman@quadrics.com
Subject: Re: [patch 1/6] mmu_notifier: Core code
From: Roland Dreier
Date: Mon, 18 Feb 2008 14:33:32 -0800
In-Reply-To: <20080215064932.371510599@sgi.com> (Christoph Lameter's message of "Thu, 14 Feb 2008 22:49:00 -0800")

It seems that we've come up with two reasonable cases where it makes
sense to use these notifiers for InfiniBand/RDMA:

First, the ability to safely DMA to/from userspace memory with the
memory regions mlock()ed but the pages not pinned.
In this case the notifiers here would seem to suit us well:

 > +	void (*invalidate_range_begin)(struct mmu_notifier *mn,
 > +				 struct mm_struct *mm,
 > +				 unsigned long start, unsigned long end,
 > +				 int atomic);
 > +
 > +	void (*invalidate_range_end)(struct mmu_notifier *mn,
 > +				 struct mm_struct *mm,
 > +				 unsigned long start, unsigned long end,
 > +				 int atomic);

If I understand correctly, the IB stack would have to get the hardware
driver to shoot down translation entries and suspend access to the
region when an invalidate_range_begin notifier is called, and wait for
the invalidate_range_end notifier to repopulate the adapter
translation tables.  This will probably work OK as long as the
interval between the invalidate_range_begin and invalidate_range_end
calls is not "too long."

Also, using this effectively requires us to figure out how we want to
mlock() regions that are going to be used for RDMA.  We could require
userspace to do it, but it's not clear to me that we're safe in the
case where userspace decides not to... what happens if some pages get
swapped out after the invalidate_range_begin notifier?

The second case where some form of notifier is useful is for
userspace to know when a memory registration is still valid, i.e.,
Pete Wyckoff's work:

    http://www.osc.edu/~pw/papers/wyckoff-memreg-ccgrid05.pdf
    http://www.osc.edu/~pw/dreg/

However, these MMU notifiers seem orthogonal to that: the registration
cache is concerned with address spaces, not page mapping, and hence
the existing vma operations seem to be a better fit.

 - R.
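
For illustration, the begin/end handling described for the first case
can be modeled in user space roughly as follows.  This is only a
sketch under stated assumptions, not driver or IB stack code: struct
rdma_region and the region_* functions are hypothetical names, the
range is assumed to be [start, end) with an exclusive end, and a plain
counter stands in for whatever the hardware would do to shoot down and
reload its translation entries.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of one registered RDMA region (not from the patch). */
struct rdma_region {
	unsigned long start;		/* first address covered */
	unsigned long end;		/* one past the last address (assumed) */
	int invalidations;		/* begin calls not yet matched by end */
	bool translations_valid;	/* adapter translation entries loaded */
};

/* True if [start, end) intersects the region. */
static bool overlaps(const struct rdma_region *r,
		     unsigned long start, unsigned long end)
{
	return start < r->end && r->start < end;
}

/* Models invalidate_range_begin: drop translations, suspend access. */
void region_invalidate_begin(struct rdma_region *r,
			     unsigned long start, unsigned long end)
{
	if (!overlaps(r, start, end))
		return;
	r->invalidations++;
	r->translations_valid = false;
}

/* Models invalidate_range_end: repopulate once the last overlapping
 * invalidation has finished. */
void region_invalidate_end(struct rdma_region *r,
			   unsigned long start, unsigned long end)
{
	if (!overlaps(r, start, end))
		return;
	assert(r->invalidations > 0);	/* end must follow a begin */
	if (--r->invalidations == 0)
		r->translations_valid = true;
}

/* DMA may proceed only outside any begin/end window. */
bool region_dma_allowed(const struct rdma_region *r)
{
	return r->invalidations == 0 && r->translations_valid;
}
```

The counter is there because overlapping invalidations may nest; in
this model the region stays suspended until every begin has seen its
matching end, which is exactly the window that must not be "too long."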