Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1755733AbYA2VST (ORCPT ); Tue, 29 Jan 2008 16:18:19 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1751727AbYA2VSG (ORCPT ); Tue, 29 Jan 2008 16:18:06 -0500 Received: from host36-195-149-62.serverdedicati.aruba.it ([62.149.195.36]:59251 "EHLO mx.cpushare.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1750811AbYA2VSE (ORCPT ); Tue, 29 Jan 2008 16:18:04 -0500 Date: Tue, 29 Jan 2008 22:17:59 +0100 From: Andrea Arcangeli To: Christoph Lameter Cc: Robin Holt , Avi Kivity , Izik Eidus , Nick Piggin , kvm-devel@lists.sourceforge.net, Benjamin Herrenschmidt , Peter Zijlstra , steiner@sgi.com, linux-kernel@vger.kernel.org, linux-mm@kvack.org, daniel.blueman@quadrics.com, Hugh Dickins Subject: Re: [patch 2/6] mmu_notifier: Callbacks to invalidate address ranges Message-ID: <20080129211759.GV7233@v2.random> References: <20080128202840.974253868@sgi.com> <20080128202923.849058104@sgi.com> <20080129162004.GL7233@v2.random> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2657 Lines: 71 On Tue, Jan 29, 2008 at 11:55:10AM -0800, Christoph Lameter wrote: > I am not sure. AFAICT you wrote that code. Actually I didn't need to change a single line in do_wp_page because ptep_clear_flush was already doing everything transparently for me. This was the memory.c part of my last patch I posted, it only touches zap_page_range, remap_pfn_range and apply_to_page_range. diff --git a/mm/memory.c b/mm/memory.c --- a/mm/memory.c +++ b/mm/memory.c @@ -889,6 +889,7 @@ unsigned long zap_page_range(struct vm_a end = unmap_vmas(&tlb, vma, address, end, &nr_accounted, details); if (tlb) tlb_finish_mmu(tlb, address, end); + mmu_notifier(invalidate_range, mm, address, end); return end; } @@ -1317,7 +1318,7 @@ int remap_pfn_range(struct vm_area_struc { pgd_t *pgd; unsigned long next; - unsigned long end = addr + PAGE_ALIGN(size); + unsigned long start = addr, end = addr + PAGE_ALIGN(size); struct mm_struct *mm = vma->vm_mm; int err; @@ -1358,6 +1359,7 @@ int remap_pfn_range(struct vm_area_struc if (err) break; } while (pgd++, addr = next, addr != end); + mmu_notifier(invalidate_range, mm, start, end); return err; } EXPORT_SYMBOL(remap_pfn_range); @@ -1441,7 +1443,7 @@ int apply_to_page_range(struct mm_struct { pgd_t *pgd; unsigned long next; - unsigned long end = addr + size; + unsigned long start = addr, end = addr + size; int err; BUG_ON(addr >= end); @@ -1452,6 +1454,7 @@ int apply_to_page_range(struct mm_struct if (err) break; } while (pgd++, addr = next, addr != end); + mmu_notifier(invalidate_range, mm, start, end); return err; } EXPORT_SYMBOL_GPL(apply_to_page_range); > It seems to be okay to invalidate range if you hold mmap_sem writably. In > that case no additional faults can happen that would create new ptes. In that place the mmap_sem is taken but in readonly mode. I never rely on the mmap_sem in the mmu notifier methods. Not invoking the notifier before releasing the PT lock adds quite some uncertainty on the smp safety of the spte invalidates, because the pte may be unmapped and remapped by a minor fault before invalidate_range is invoked, but I didn't figure out a kernel crashing race yet thanks to the pin we take through get_user_pages (and only thanks to it). The requirement is that invalidate_range is invoked after the last ptep_clear_flush or it leaks pins that's why I had to move it at the end. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/