Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754643AbZG0WuF (ORCPT ); Mon, 27 Jul 2009 18:50:05 -0400 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1753695AbZG0WuE (ORCPT ); Mon, 27 Jul 2009 18:50:04 -0400 Received: from smtp-out.google.com ([216.239.45.13]:19425 "EHLO smtp-out.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753000AbZG0WuB (ORCPT ); Mon, 27 Jul 2009 18:50:01 -0400 DomainKey-Signature: a=rsa-sha1; s=beta; d=google.com; c=nofws; q=dns; h=date:from:x-x-sender:to:cc:subject:in-reply-to:message-id: references:user-agent:mime-version:content-type:x-system-of-record; b=A/6WPNoAyOqmJpDdNEqjSWO7rQKlwBNzXUfKgueSfymizKzxZ8T31uO4giFxfHS9E G3vVJ8Oz3qMvoYL1aWV5Q== Date: Mon, 27 Jul 2009 15:49:54 -0700 (PDT) From: David Rientjes X-X-Sender: rientjes@chino.kir.corp.google.com To: Moussa Ba cc: linux-kernel@vger.kernel.org, xiyou.wangcong@gmail.com, Andrew Morton , Alexey Dobriyan , Matt Mackall , Mel Gorman , Ying Han , Nick Piggin , jaredeh@gmail.com Subject: Re: [PATCH 1/1] pagemap clear_refs: modify to specify anon or mapped vma clearing In-Reply-To: <7928e7bd0907271514y699d1de4j54f9c562b94ef0cc@mail.gmail.com> Message-ID: References: <4A693122.6060503@gmail.com> <20090724083926.GA6372@cr0.nay.redhat.com> <4A6E0BD6.50102@gmail.com> <7928e7bd0907271514y699d1de4j54f9c562b94ef0cc@mail.gmail.com> User-Agent: Alpine 2.00 (DEB 1167 2008-08-23) MIME-Version: 1.0 Content-Type: MULTIPART/MIXED; BOUNDARY="497827084-1684126342-1248734996=:23587" X-System-Of-Record: true Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 8171 Lines: 219 This message is in MIME format. The first part should be readable text, while the remaining parts are likely unreadable without MIME-aware tools. --497827084-1684126342-1248734996=:23587 Content-Type: TEXT/PLAIN; charset=ISO-8859-1 Content-Transfer-Encoding: 8BIT On Mon, 27 Jul 2009, Moussa Ba wrote: > > clear_refs currently accepts any non-zero value, so it's possible that > > this will break user scripts that, for whatever reason, write '2' or '3'. > > I think that's acceptable, but it would be helpful to make all other > > values a no-op similar to drop_caches at this point to avoid the potential > > for breakage if this is ever extended any further. > > In your latest post, I see you implemented this in clear_refs_walk_vma_area() by checking for CLEAR_REFS_ALL > type > CLEAR_REFS_MAPPED. Thanks! Don't forget to update Documentation/filesystems/proc.txt to specify that. > >> Selective clearing the pages has a measurable impact on performance as it > >> limits the number of page walks. ?We have been using this interface and ?this > >> adds flexibility to the user user space application implementing the reference > >> clearing. > >> > >> Signed-off-by: Jared Hulbert (jaredeh@gmail.com) > >> Signed-off-by: Moussa A. Ba (moussa.a.ba@gmail.com) > > > > Email addresses in < > braces, please. > > > > The first sign-off line normally indicates who wrote the patch, but your > > submission lacks a From: line, so git would indicate you wrote it. ?If > > that's incorrect, please add a From: line as described in > > Documentation/SubmittingPatches. ?If it's correct, please reorder your > > sign-off lines. > > > I will reorder the sign-off lines Ok, that removes the ambiguity concerning authorship, thanks. > >> ------- > >> Documentation/filesystems/proc.txt | ? ?7 +++++++ > >> fs/proc/task_mmu.c ? ? ? ? ? ? ? ? | ? 29 +++++++++++++++++++++++++---- > >> 2 files changed, 32 insertions(+), 4 deletions(-) > >> > >> --- a/fs/proc/task_mmu.c ? ? ?2009-07-21 14:30:01.000000000 -0700 > >> +++ b/fs/proc/task_mmu.c ? ? ?2009-07-27 11:46:05.000000000 -0700 > >> @@ -462,6 +462,27 @@ > >> ? ? ? return 0; > >> ?} > >> > >> +static void walk_vma_area(struct mm_walk *this_walk, > >> + ? ? ? ? ? ? ? ? ? ? ? struct vm_area_struct *vma, int type) > >> +{ > > > > This is a very generic name for something that is only applicable to > > clear_refs, so please name it accordingly. ?This will also avoid having to > > pass the struct mm_walk * in since its only user is clear_refs_walk. > > > Done. > >> + > >> + ? ? /* Writing 2 to /proc/pid/clear_refs will clear all Anonymous > >> + ? ? ?* pages. > >> + ? ? ?* > >> + ? ? ?* Writing 3 to /proc/pid/clear_refs will clear all file mapped > >> + ? ? ?* pages. > >> + ? ? ?* > >> + ? ? ?* Writing any other value including 1 will clear all pages > >> + ? ? ?*/ > > > > Documentation/CodingStyle would suggest this format: > > > > ? ? ? ?/* > > ? ? ? ? * Multi-line kernel comments always start .. > > ? ? ? ? * with an empty first line. > > ? ? ? ? */ > > > Done. > >> + ? ? if (is_vm_hugetlb_page(vma)) > >> + ? ? ? ? ? ? return; > >> + ? ? if (type == 2 && vma->vm_file) > >> + ? ? ? ? ? ? return; > >> + ? ? if (type == 3 && !vma->vm_file) > >> + ? ? ? ? ? ? return; > >> + ? ? walk_page_range(vma->vm_start, vma->vm_end, this_walk); > >> +} > > > > K&R would suggest #define's (or enums) for those hard-coded values. ?I > > think that's already been suggested for this patch, actually. > > > > Would this be acceptable? > > enum clear_refs_walk_type { > CLEAR_REFS_ALL = 1, > CLEAR_REFS_ANON = 2, > CLEAR_REFS_MAPPED = 3 > }; > #define seems more appropriate for this particular use case. We simply try to avoid hard-coded integers in source code (rationale: K&R). > static void clear_refs_walk_vma_area(struct mm_walk *this_walk, > struct vm_area_struct *vma, enum clear_refs_walk_type type) > { Ddi you consider my suggestion of dropping the struct mm_walk * formal since this is now a clear_refs-specific function? walk_page_range() will always take clear_refs_walk, there's no need to pass it in. > > /* > * Writing 1 to /proc/pid/clear_refs clears all pages. > * Writing 2 to /proc/pid/clear_refs clears Anonymous pages. > * Writing 3 to /proc/pid/clear_refs clears file mapped pages. > */ > if (type < CLEAR_REFS_ALL || type > CLEAR_REFS_MAPPED) > return; > if (is_vm_hugetlb_page(vma)) > return; > if (type == CLEAR_REFS_ANON && vma->vm_file) > return; > if (type == CLEAR_REFS_MAPPED && !vma->vm_file) > return; > walk_page_range(vma->vm_start, vma->vm_end, this_walk); > } > > > >> + > >> ?static ssize_t clear_refs_write(struct file *file, const char __user * buf, > >> ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? size_t count, loff_t * ppos) > >> ?{ > >> @@ -469,13 +490,15 @@ > >> ? ? ? char buffer[PROC_NUMBUF], *end; > >> ? ? ? struct mm_struct *mm; > >> ? ? ? struct vm_area_struct *vma; > >> + ? ? int type; > >> > >> ? ? ? memset(buffer, 0, sizeof(buffer)); > >> ? ? ? if (count > sizeof(buffer) - 1) > >> ? ? ? ? ? ? ? count = sizeof(buffer) - 1; > >> ? ? ? if (copy_from_user(buffer, buf, count)) > >> ? ? ? ? ? ? ? return -EFAULT; > >> - ? ? if (!simple_strtol(buffer, &end, 0)) > >> + ? ? type = strict_strtol(buffer, &end, 0); > >> + ? ? if (!type) > >> ? ? ? ? ? ? ? return -EINVAL; > >> ? ? ? if (*end == '\n') > >> ? ? ? ? ? ? ? end++; > >> @@ -491,9 +514,7 @@ > >> ? ? ? ? ? ? ? down_read(&mm->mmap_sem); > >> ? ? ? ? ? ? ? for (vma = mm->mmap; vma; vma = vma->vm_next) { > >> ? ? ? ? ? ? ? ? ? ? ? clear_refs_walk.private = vma; > >> - ? ? ? ? ? ? ? ? ? ? if (!is_vm_hugetlb_page(vma)) > >> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? walk_page_range(vma->vm_start, vma->vm_end, > >> - ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? ? &clear_refs_walk); > >> + ? ? ? ? ? ? ? ? ? ? walk_vma_area(&clear_refs_walk, vma, type); > >> ? ? ? ? ? ? ? } > >> ? ? ? ? ? ? ? flush_tlb_mm(mm); > >> ? ? ? ? ? ? ? up_read(&mm->mmap_sem); > >> --- a/Documentation/filesystems/proc.txt ? ? ?2009-07-20 17:29:11.000000000 > >> -0700 > >> +++ b/Documentation/filesystems/proc.txt ? ? ?2009-07-27 12:08:34.000000000 > >> -0700 > >> @@ -375,6 +375,13 @@ > >> ?This file is only present if the CONFIG_MMU kernel configuration option is > >> ?enabled. > >> > >> +The clear_refs entry is used to reset the Referenced bits on virtual and physical > >> +pages associated with a process. > >> +echo 1 > /proc/PID/clear_refs clears all pages associated with the process > >> +echo 2 > /proc/PID/clear_refs clears anonymous pages only > >> +echo 3 > /proc/PID/clear_refs clears file mapped pages only > >> +Any other value written to the proc entry will clear all pages. > >> + > > > > Please follow the format in this document for how other /proc/PID/* > > entries are described. > > > > That format could really be improved here, perhaps you could clean > > proc.txt up a little bit while you're here? > > > > > > I am not sure what you mean by "clean" proc.txt, I did not detect much > formatting in the PID proc enries description, beyond what I rewrote > below: > Right, the file is pretty sloppy, so I was wondering if you wanted to take the time to clean it up a little so there's a more consistent style. > > The /proc/PID/clear_refs is used to reset the Referenced bits on virtual and Probably better to say PG_referenced instead of Referenced. > physical pages associated with a process. > To clear all pages associated with the process > > echo 1 > /proc/PID/clear_refs > > To clear all anonymous pages associated with the process > > echo 2 > /proc/PID/clear_refs > > To clear all file mapped pages associated with the process > > echo 3 > /proc/PID/clear_refs > Any other value written to /proc/PID/clear_refs will have no effect. > You should probably say these all clear the bit instead of saying they "clear pages," which doesn't make a lot of sense. --497827084-1684126342-1248734996=:23587-- -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/