Date: Thu, 3 Jul 2014 17:37:29 +0900
From: Minchan Kim
To: Martin Schwidefsky
Cc: "Kirill A. Shutemov", Andrew Morton, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, Michael Kerrisk, Linux API, Hugh Dickins,
	Johannes Weiner, Rik van Riel, KOSAKI Motohiro, Mel Gorman,
	Jason Evans, Zhang Yanfei, Heiko Carstens, linux390@de.ibm.com,
	Gerald Schaefer
Subject: Re: [PATCH v9] mm: support madvise(MADV_FREE)
Message-ID: <20140703083729.GE2939@bbox>
In-Reply-To: <20140703102901.322bfdb0@mschwide>

Hello,

On Thu, Jul 03, 2014 at 10:29:01AM +0200, Martin Schwidefsky wrote:
> On Thu, 3 Jul 2014 16:29:54 +0900
> Minchan Kim wrote:
>
> > Hello,
> >
> > On Thu, Jul 03, 2014 at 10:03:19AM +0900, Minchan Kim wrote:
> > > Hello,
> > >
> > > On Tue, Jul 01, 2014 at 05:50:58PM +0300, Kirill A. Shutemov wrote:
> > > > On Tue, Jul 01, 2014 at 09:36:15AM +0900, Minchan Kim wrote:
> > > > > +	do {
> > > > > +		/*
> > > > > +		 * XXX: We can optimize with supporting Hugepage free
> > > > > +		 * if the range covers.
> > > > > +		 */
> > > > > +		next = pmd_addr_end(addr, end);
> > > > > +		if (pmd_trans_huge(*pmd))
> > > > > +			split_huge_page_pmd(vma, addr, pmd);
> > > >
> > > > Could you implement proper THP support before upstreaming the feature?
> > > > It shouldn't be a big deal.
> > >
> > > Okay, Hope to review.
> > >
> > > Thanks for the feedback!
> > >
> >
> > I tried to implement it but had a issue.
> >
> > I need pmd_mkold, pmd_mkclean for MADV_FREE operation and pmd_dirty for
> > page_referenced. When I investigate all of arches supported THP,
> > it's not a big deal but s390 is not sure to me who has no idea of
> > soft tracking of s390 by storage key instead of page table information.
> > Cced s390 maintainer. Hope to help.
>
> Storage key for dirty and referenced tracking is a thing of the past.
> The current code for s390 uses software tracking for dirty and referenced.
> There is one catch though, for ptes the software implementation covers
> dirty and referenced bit but for pmds only referenced bit is available.
> The reason is that there is no free bit left in the pmd entry for the
> software dirty bit.

Thanks for the quick reply.

>
> > So, if there isn't any help from s390, I should introduce
> > HAVE_ARCH_THP_MADVFREE to disable MADV_FREE support of THP in s390 but
> > not want to introduce such new config.
>
> Why is the dirty bit for pmds needed for the MADV_FREE implementation?

The MADV_FREE semantics require it.

When the madvise syscall is called, the VM clears the dirty bit of the ptes
in the range. If memory pressure happens, the VM checks the dirty bit in the
page table, and if the pte is still "clean", the page is a "lazyfree" page,
so the VM can discard it instead of swapping it out. If there was a store to
the page before the VM picked it for reclaim, the dirty bit is set, so the VM
swaps the page out instead of discarding it, to keep the up-to-date contents.
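
To make the dirty-bit logic above concrete, here is a toy userspace model in
C. It is only a sketch, not the kernel implementation: struct toy_page,
toy_store(), toy_madv_free() and toy_reclaim_page() are made-up names, and a
single flag stands in for the pte dirty bit.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy model: one flag stands in for the pte dirty bit (hypothetical type). */
struct toy_page {
	bool dirty;
};

enum toy_reclaim { TOY_DISCARD, TOY_SWAP_OUT };

/* Any store to the page sets the dirty bit. */
static void toy_store(struct toy_page *page)
{
	page->dirty = true;
}

/* madvise(MADV_FREE) on the range: clear the dirty bit. */
static void toy_madv_free(struct toy_page *page)
{
	page->dirty = false;
}

/* Under memory pressure: a clean page is lazyfree, a dirty one is not. */
static enum toy_reclaim toy_reclaim_page(const struct toy_page *page)
{
	return page->dirty ? TOY_SWAP_OUT : TOY_DISCARD;
}

/* Walk through both cases described in the mail. */
void toy_demo(void)
{
	struct toy_page page = { .dirty = false };

	toy_store(&page);	/* the application used the page */
	toy_madv_free(&page);	/* and then marked it as freeable */
	assert(toy_reclaim_page(&page) == TOY_DISCARD);

	toy_store(&page);	/* a later store revives the page */
	assert(toy_reclaim_page(&page) == TOY_SWAP_OUT);
}
```

In this model a pmd without a dirty bit has nothing for toy_madv_free() to
clear and nothing for toy_reclaim_page() to test, which is the problem with
THP on s390.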

If it's hard on s390, maybe we could just use the referenced bit instead of
the dirty bit to check for recent access, but that might make the semantics
differ a bit from other OSes. :(

>
> --
> blue skies,
>    Martin.
>
> "Reality continues to ruin my life." - Calvin.

--
Kind regards,
Minchan Kim