Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753344Ab1B1KS4 (ORCPT ); Mon, 28 Feb 2011 05:18:56 -0500 Received: from gir.skynet.ie ([193.1.99.77]:47339 "EHLO gir.skynet.ie" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751839Ab1B1KSz (ORCPT ); Mon, 28 Feb 2011 05:18:55 -0500 Date: Mon, 28 Feb 2011 10:18:27 +0000 From: Mel Gorman To: KAMEZAWA Hiroyuki Cc: Andrea Arcangeli , Andrew Morton , Arthur Marsh , Clemens Ladisch , Linux-MM , Linux Kernel Mailing List Subject: Re: [PATCH 2/2] mm: compaction: Minimise the time IRQs are disabled while isolating pages for migration Message-ID: <20110228101827.GE9548@csn.ul.ie> References: <1298664299-10270-1-git-send-email-mel@csn.ul.ie> <1298664299-10270-3-git-send-email-mel@csn.ul.ie> <20110228111746.34f3f3e0.kamezawa.hiroyu@jp.fujitsu.com> <20110228054818.GF22700@random.random> <20110228145402.65e6f200.kamezawa.hiroyu@jp.fujitsu.com> <20110228092814.GC9548@csn.ul.ie> <20110228184230.7c2eefb7.kamezawa.hiroyu@jp.fujitsu.com> MIME-Version: 1.0 Content-Type: text/plain; charset=iso-8859-15 Content-Disposition: inline In-Reply-To: <20110228184230.7c2eefb7.kamezawa.hiroyu@jp.fujitsu.com> User-Agent: Mutt/1.5.17+20080114 (2008-01-14) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3714 Lines: 87 On Mon, Feb 28, 2011 at 06:42:30PM +0900, KAMEZAWA Hiroyuki wrote: > On Mon, 28 Feb 2011 09:28:14 +0000 > Mel Gorman wrote: > > > On Mon, Feb 28, 2011 at 02:54:02PM +0900, KAMEZAWA Hiroyuki wrote: > > > On Mon, 28 Feb 2011 06:48:18 +0100 > > > Andrea Arcangeli wrote: > > > > > > > On Mon, Feb 28, 2011 at 11:17:46AM +0900, KAMEZAWA Hiroyuki wrote: > > > > > BTW, I forget why we always take zone->lru_lock with IRQ disabled.... > > > > > > > > To decrease lock contention in SMP to deliver overall better > > > > performance (not sure how much it helps though). It was supposed to be > > > > hold for a very short time (PAGEVEC_SIZE) to avoid giving irq latency > > > > problems. > > > > > > > > > > memory hotplug uses MIGRATE_ISOLATED migrate types for scanning pfn range > > > without lru_lock. I wonder whether we can make use of it (the function > > > which memory hotplug may need rework for the compaction but migrate_type can > > > be used, I think). > > > > > > > I don't see how migrate_type would be of any benefit here particularly > > as compaction does not directly affect the migratetype of a pageblock. I > > have not checked closely which part of hotplug you are on about but if > > you're talking about when pages actually get offlined, the zone lock is > > not necessary there because the pages are not on the LRU. In compactions > > case, they are. Did I misunderstand? > > > > memory offline code doesn't take big lru_lock (and call isolate_lru_page()) > at picking up migration target pages from LRU. Right - the scanning doesn't hold the lock but you get and release the lock for every LRU page encountered. This is fairly expensive but considered ok during memory hotplug. It's less ideal in compaction which is expected to be a more frequent operation than memory hotplug. > While this, allocation from > the zone is allowed. memory offline is done by mem_section unit. > > memory offline does. > > 1. making a whole section as MIGRATETYPE_ISOLATED. > 2. scan pfn within section. > 3. find a page on LRU > 4. isolate_lru_page() -> take/release lru_lock. ----(*) > 5. migrate it. > 6. making all pages in the range as RESERVED. > > During this, by marking the pageblock as MIGRATETYPE_ISOLATED, > > - new allocation will never picks up a page in the range. Overkill for compaction though. To use MIGRATE_ISOLATE properly, all pages on the freelist for that block have to be moved as well. It also operates at the granularity of a pageblock which is potentially far higher than the allocation target. It'd might make some sense for transparent hugepage support. > - newly freed pages in the range will never be allocated and never in pcp. > - page type of the range will never change. > > then, memory offline success. > > If (*) seems too heavy anyway and will be no help even if with some batching > as isolate_lru_page_pagevec() or some, okay please forget offlining. (*) is indeed too heavy but if IRQ disabled times are found to be too high it could be batched like I described in my previous mail. Right now, the interrupt disabled times are not showing up as very high. > BTW, can't we drop disable_irq() from all lru_lock related codes ? > I don't think so - at least not right now. Some LRU operations such as LRU pagevec draining are run from IPI which is running from an interrupt so minimally spin_lock_irq is necessary. -- Mel Gorman SUSE Labs -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/