Message-ID: <516CC675.8020903@linaro.org>
Date: Mon, 15 Apr 2013 20:33:09 -0700
From: John Stultz
To: Minchan Kim
CC: KOSAKI Motohiro , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Michael Kerrisk , Arun Sharma , Mel Gorman , Hugh Dickins , Dave Hansen , Rik van Riel , Neil Brown , Mike Hommey , Taras Glek , KOSAKI Motohiro , KAMEZAWA Hiroyuki , Jason Evans , sanjay@google.com, Paul Turner , Johannes Weiner , Michel Lespinasse , Andrew Morton
Subject: Re: [RFC v7 00/11] Support vrange for anonymous page
References: <1363073915-25000-1-git-send-email-minchan@kernel.org> <5165CA22.6080808@gmail.com> <20130411065546.GA10303@blaptop> <5166643E.6050704@gmail.com> <20130411080243.GA12626@blaptop> <5166712C.7040802@gmail.com> <20130411083146.GB12626@blaptop> <5166D037.6040405@gmail.com> <20130414074204.GC8241@blaptop>
In-Reply-To: <20130414074204.GC8241@blaptop>

On 04/14/2013 12:42 AM, Minchan Kim wrote:
> Hi KOSAKI,
>
> On Thu, Apr 11, 2013 at 11:01:11AM -0400, KOSAKI Motohiro wrote:
>>>>>> and adding new syscall invokation is unwelcome.
>>>>> Sure. But one more system call could be cheaper than page-granuarity
>>>>> operation on purged range.
>>>> I don't think vrange(VOLATILE) cost is the related of this discusstion.
>>>> Whether sending SIGBUS or just nuke pte, purge should be done on vmscan,
>>>> not vrange() syscall.
>>> Again, please see the MADV_FREE. http://lwn.net/Articles/230799/
>>> It does changes pte and page flags on all pages of the range through
>>> zap_pte_range. So it would make vrange(VOLASTILE) expensive and
>>> the bigger cost is, the bigger range is.
>> This haven't been crossed my mind. now try_to_discard_one() insert vrange
>> for making SIGBUS. then, we can insert pte_none() as the same cost too. Am
>> I missing something?
> For your requirement, we need some tracking model to detect some page is
> using by the process currently before VM discards it *if* we don't give
> vrange(NOVOLATILE) pair system call(Look at below). So the tracking model
> should be formed in vrange(VOLATILE) system call context.

To further clarify Minchan's note here: the reason it's important for
the application to use vrange(NOVOLATILE) is really to help define
_when the range stops being volatile_.

In your libc hack to use vrange(), you see the benefit of not
immediately purging the memory as you do with MADV_DONTNEED. However, if
the heap grows again, and those addresses are re-used, nothing has
stopped those pages from continuing to be volatile. Thus the kernel
could then decide to purge those pages after they start to be used
again, and you'd lose data. I suspect that's not what you want. :)

Rik's MADV_FREE implementation is very similar to vrange(VOLATILE), but
has an implicit vrange(NOVOLATILE) on any page write. So by dirtying a
page, it stops the kernel from later purging it. This MADV_FREE semantic
works very well if you always want zerofill (as in the case of
malloc/free). But for other data, it's important to know something was
lost (as a zero page could be valid data), and that's why we provide the
SIGBUS, as well as the purged notification on vrange(NOVOLATILE).

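To make the difference concrete, here is a rough user-space toy model in Python. It is not kernel code and not the patchset's implementation: the class and method names (VrangePage, MadvFreePage, reclaim(), BusError) are made up for illustration, a "page" is just one integer, and reclaim() stands in for the VM deciding to purge under memory pressure. It only sketches the semantics described above, assuming a purged page reads back as zero once it is non-volatile again.

```python
class BusError(Exception):
    """Stands in for SIGBUS in this toy model (hypothetical name)."""


class VrangePage:
    """Toy model of one page under the proposed vrange() semantics."""

    def __init__(self):
        self.data = 0          # 0 models a zero-filled page
        self.volatile = False
        self.purged = False

    def vrange_volatile(self):
        self.volatile = True

    def vrange_novolatile(self):
        """Unmark the page; return True if it was purged while volatile."""
        was_purged = self.purged
        self.volatile = False
        self.purged = False
        return was_purged

    def reclaim(self):
        """Simulated memory pressure: purge only if still volatile."""
        if self.volatile and not self.purged:
            self.data = 0
            self.purged = True
            return True
        return False

    def _fault_if_purged(self):
        if self.volatile and self.purged:
            raise BusError("access to purged volatile page")

    def write(self, value):
        self._fault_if_purged()
        self.data = value      # note: a write does NOT end volatility

    def read(self):
        self._fault_if_purged()
        return self.data


class MadvFreePage:
    """Toy model of one page under MADV_FREE-style semantics."""

    def __init__(self):
        self.data = 0
        self.lazyfree = False

    def madv_free(self):
        self.lazyfree = True

    def reclaim(self):
        if self.lazyfree:
            self.data = 0      # silently zero-filled, no notification
            self.lazyfree = False
            return True
        return False

    def write(self, value):
        self.lazyfree = False  # dirtying acts as an implicit NOVOLATILE
        self.data = value

    def read(self):
        return self.data       # a 0 here may be real data or a purge


def reuse_without_novolatile():
    """Heap regrows and reuses the page while it is still volatile."""
    page = VrangePage()
    page.vrange_volatile()     # allocator marks the freed range volatile
    page.write(42)             # page reused, but never unmarked
    purged = page.reclaim()    # the kernel is still free to purge it
    try:
        page.read()
        faulted = False
    except BusError:
        faulted = True
    return purged, faulted


def reuse_with_novolatile():
    """Same reuse, but with NOVOLATILE issued before the page is used."""
    page = VrangePage()
    page.vrange_volatile()
    was_purged = page.vrange_novolatile()
    page.write(42)
    purged = page.reclaim()    # refused: the page is no longer volatile
    return was_purged, purged, page.read()
```

In this model, reuse_without_novolatile() loses the 42 and faults on the next read, while reuse_with_novolatile() keeps it. MadvFreePage gets the same protection from the write alone, but a caller reading 0 after a purge cannot tell it from valid zero data.
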
In other words, as long as you do a vrange(NOVOLATILE) when you grow the
heap again (before it's used), it should be very similar to the
MADV_FREE behavior, but is more flexible for other use cases.

thanks
-john