Date: Sat, 4 Dec 2010 01:28:17 +0900
From: Minchan Kim
To: Oleg Nesterov
Cc: Roland McGrath, michal.simek@petalogix.com, Andrew Morton, LKML, linux-mm@kvack.org, John Williams, "Edgar E. Iglesias", Hugh Dickins, Nick Piggin
Subject: Re: Flushing whole page instead of work for ptrace
Message-ID: <20101203162817.GA21438@barrios-desktop>
References: <4CEFA8AE.2090804@petalogix.com> <20101130233250.35603401C8@magilla.sf.frob.com> <20101203150021.GA11114@redhat.com>
In-Reply-To: <20101203150021.GA11114@redhat.com>

On Fri, Dec 03, 2010 at 04:00:21PM +0100, Oleg Nesterov wrote:
> On 11/30, Roland McGrath wrote:
> >
> > Documentation/cachetlb.txt says:
> >
> > 	Any time the kernel writes to a page cache page, _OR_
> > 	the kernel is about to read from a page cache page and
> > 	user space shared/writable mappings of this page potentially
> > 	exist, this routine is called.
> >
> > In your case, the kernel is only reading (write=0 passed to
> > access_process_vm and get_user_pages).
> > In normal situations,
> > the page in question will have only a private and read-only
> > mapping in user space.  So the call should not be required in
> > these cases--if the code can tell that's so.
> >
> > Perhaps something like the following would be safe.
> > But you really need some VM folks to tell you for sure.
> >
> > diff --git a/mm/memory.c b/mm/memory.c
> > index 02e48aa..2864ee7 100644
> > --- a/mm/memory.c
> > +++ b/mm/memory.c
> > @@ -1484,7 +1484,8 @@ int __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
> >  				pages[i] = page;
> >
> >  				flush_anon_page(vma, page, start);
> > -				flush_dcache_page(page);
> > +				if ((vm_flags & VM_WRITE) || (vma->vm_flags & VM_SHARED)
> > +					flush_dcache_page(page);
>
> First of all, I know absolutely nothing about D-cache aliasing.
> My poor understanding of flush_dcache_page() is: synchronize the
> kernel/user vision of this memory, in the case when either side
> can change it.
>
> If this is true, then this change doesn't look right in general.
>
> Even if (vma->vm_flags & VM_SHARED) == 0, it is possible that
> tsk can write to this memory, this mapping can be writable and
> private.
>
> Even if we ensure that this mapping is readonly/private, another
> user-space process can write to this page via shared/writable
> mapping.
>

I think you're right. Other processes could potentially have such a
mapping.

>
> I'd like to know if my understanding is correct, I am just curious.
>
> Oleg.

How about this?
This patch might mitigate the overhead, but I am not sure it is
correct, so I have Cc'ed the GUP experts.

From 8fb3d84c7bb32c4ba9c4a0063198ce7cfcca6b37 Mon Sep 17 00:00:00 2001
From: Minchan Kim
Date: Sat, 4 Dec 2010 01:19:43 +0900
Subject: [PATCH] Remove redundant flush_dcache_page in GUP

If the page was obtained through handle_mm_fault(), the fault path has
already flushed it, so the flush_dcache_page() call in GUP is redundant.

Cc: Hugh Dickins
Cc: Nick Piggin
Signed-off-by: Minchan Kim
---
 mm/memory.c |    5 ++++-
 1 files changed, 4 insertions(+), 1 deletions(-)

diff --git a/mm/memory.c b/mm/memory.c
index ebfeedf..9166f4b 100644
--- a/mm/memory.c
+++ b/mm/memory.c
@@ -1430,6 +1430,7 @@ int __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
 		do {
 			struct page *page;
 			unsigned int foll_flags = gup_flags;
+			bool dcache_flushed = false;
 
 			/*
 			 * If we have a pending SIGKILL, don't keep faulting
@@ -1464,6 +1465,7 @@ int __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
 					tsk->maj_flt++;
 				else
 					tsk->min_flt++;
+				dcache_flushed = true;
 
 				/*
 				 * The VM_FAULT_WRITE bit tells us that
@@ -1489,7 +1491,8 @@ int __get_user_pages(struct task_struct *tsk, struct mm_struct *mm,
 				pages[i] = page;
 
 				flush_anon_page(vma, page, start);
-				flush_dcache_page(page);
+				if (!dcache_flushed)
+					flush_dcache_page(page);
 			}
 			if (vmas)
 				vmas[i] = vma;
-- 
1.7.0.4

-- 
Kind regards,
Minchan Kim
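
To make the control flow of the patch above easier to follow, here is a
minimal user-space sketch. It is not kernel code: struct toy_page,
toy_flush_dcache_page(), toy_handle_fault() and toy_get_pages() are all
hypothetical stand-ins invented for illustration. The sketch also bakes in
the assumption the patch relies on and that still needs confirmation from
the VM people, namely that the fault path has already flushed the page by
the time it returns.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for struct page; not a kernel type. */
struct toy_page {
	bool present;		/* false: lookup misses, a fault is needed */
	int flush_count;	/* how many times the toy flush ran */
};

/* Stand-in for flush_dcache_page(): it only counts calls. */
static void toy_flush_dcache_page(struct toy_page *page)
{
	page->flush_count++;
}

/*
 * Stand-in for handle_mm_fault().
 * ASSUMPTION (the premise of the patch, unverified here): the fault
 * path flushes the page before it returns.
 */
static void toy_handle_fault(struct toy_page *page)
{
	page->present = true;
	toy_flush_dcache_page(page);
}

/*
 * Mirrors the shape of the patched loop: flush in the GUP-like path
 * only when no fault was taken for this page.
 * Returns the number of flushes issued by this function itself.
 */
static int toy_get_pages(struct toy_page *pages, size_t n)
{
	int gup_flushes = 0;

	for (size_t i = 0; i < n; i++) {
		bool dcache_flushed = false;

		/* Like the follow_page() retry loop: fault until present. */
		while (!pages[i].present) {
			toy_handle_fault(&pages[i]);
			dcache_flushed = true;
		}

		if (!dcache_flushed) {
			toy_flush_dcache_page(&pages[i]);
			gup_flushes++;
		}
	}
	return gup_flushes;
}
```

With this model, a page that was already present is flushed once by the
GUP-like path, and a page that had to be faulted in is flushed once by the
fault stand-in, so no page is flushed twice. Whether the real fault path
gives that guarantee on every architecture is exactly the open question.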