Message-ID: <50996B8B.30404@redhat.com>
Date: Tue, 06 Nov 2012 14:56:59 -0500
From: Rik van Riel
To: Mel Gorman
CC: Peter Zijlstra, Andrea Arcangeli, Ingo Molnar, Johannes Weiner,
 Hugh Dickins, Thomas Gleixner, Linus Torvalds, Andrew Morton,
 Linux-MM, LKML
Subject: Re: [PATCH 19/19] mm: sched: numa: Implement slow start for working set sampling
References: <1352193295-26815-1-git-send-email-mgorman@suse.de>
 <1352193295-26815-20-git-send-email-mgorman@suse.de>
In-Reply-To: <1352193295-26815-20-git-send-email-mgorman@suse.de>

On 11/06/2012 04:14 AM, Mel Gorman wrote:
> From: Peter Zijlstra
>
> Add a 1 second delay before starting to scan the working set of
> a task and starting to balance it amongst nodes.
>
> [ note that before the constant per task WSS sampling rate patch
>   the initial scan would happen much later still, in effect that
>   patch caused this regression. ]
>
> The theory is that short-run tasks benefit very little from NUMA
> placement: they come and go, and they better stick to the node
> they were started on. As tasks mature and rebalance to other CPUs
> and nodes, so does their NUMA placement have to change and so
> does it start to matter more and more.
>
> In practice this change fixes an observable kbuild regression:
>
>    # [ a perf stat --null --repeat 10 test of ten bzImage builds to /dev/shm ]
>
>    !NUMA:
>    45.291088843 seconds time elapsed    ( +- 0.40% )
>    45.154231752 seconds time elapsed    ( +- 0.36% )
>
>    +NUMA, no slow start:
>    46.172308123 seconds time elapsed    ( +- 0.30% )
>    46.343168745 seconds time elapsed    ( +- 0.25% )
>
>    +NUMA, 1 sec slow start:
>    45.224189155 seconds time elapsed    ( +- 0.25% )
>    45.160866532 seconds time elapsed    ( +- 0.17% )
>
> and it also fixes an observable perf bench (hackbench) regression:
>
>    # perf stat --null --repeat 10 perf bench sched messaging
>
>    -NUMA:
>
>    -NUMA:                  0.246225691 seconds time elapsed    ( +- 1.31% )
>    +NUMA no slow start:    0.252620063 seconds time elapsed    ( +- 1.13% )
>
>    +NUMA 1sec delay:       0.248076230 seconds time elapsed    ( +- 1.35% )
>
> The implementation is simple and straightforward, most of the patch
> deals with adding the /proc/sys/kernel/balance_numa_scan_delay_ms tunable
> knob.
>
> Signed-off-by: Peter Zijlstra
> Cc: Linus Torvalds
> Cc: Andrew Morton
> Cc: Peter Zijlstra
> Cc: Andrea Arcangeli
> Cc: Rik van Riel
> [ Wrote the changelog, ran measurements, tuned the default. ]
> Signed-off-by: Ingo Molnar
> Signed-off-by: Mel Gorman

Reviewed-by: Rik van Riel
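
For readers skimming the thread, the slow-start idea in the quoted changelog can be sketched roughly as below. This is an illustrative model only, not the kernel code from the patch: the `Task` class, `may_scan`, `tick`, and the millisecond clock are all hypothetical names made up for the example. The only detail taken from the changelog is the 1 second default delay.

```python
from dataclasses import dataclass

# Assumption: mirrors the 1 second default delay described in the changelog.
SCAN_DELAY_MS = 1000


@dataclass
class Task:
    """Hypothetical stand-in for a task; not a kernel structure."""
    start_ms: int        # time the task started, on an illustrative clock
    scans_done: int = 0  # how many working-set scans have been performed


def may_scan(task: Task, now_ms: int, delay_ms: int = SCAN_DELAY_MS) -> bool:
    """Return True once the task has been alive for at least delay_ms."""
    return now_ms - task.start_ms >= delay_ms


def tick(task: Task, now_ms: int, delay_ms: int = SCAN_DELAY_MS) -> None:
    """Periodic hook: scan only if the slow-start delay has elapsed."""
    if may_scan(task, now_ms, delay_ms):
        task.scans_done += 1
```

Under this model a short-lived task that exits before the delay expires is never scanned at all, which is the effect the changelog attributes the kbuild and hackbench improvements to.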