Date: Thu, 21 Jun 2012 16:55:52 +0200
From: Andrea Arcangeli
To: Alex Shi
Cc: Petr Holasek, "Kirill A. Shutemov", linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, Hillf Danton, Dan Smith, Peter Zijlstra,
    Linus Torvalds, Andrew Morton, Thomas Gleixner, Ingo Molnar,
    Paul Turner, Suresh Siddha, Mike Galbraith, "Paul E. McKenney",
    Lai Jiangshan, Bharata B Rao, Lee Schermerhorn, Rik van Riel,
    Johannes Weiner, Srivatsa Vaddagiri, Christoph Lameter, Alex Shi,
    "Chen, Tim C"
Subject: Re: AutoNUMA15
Message-ID: <20120621145552.GG4954@redhat.com>
References: <1337965359-29725-1-git-send-email-aarcange@redhat.com>
    <20120529133627.GA7637@shutemov.name>
    <20120529154308.GA10790@dhcp-27-244.brq.redhat.com>
    <20120531180834.GP21339@redhat.com>

On Thu, Jun 21, 2012 at 03:29:52PM +0800, Alex Shi wrote:
> > I released an AutoNUMA15 branch that includes all pending fixes:
> >
> > git clone --reference linux -b autonuma15 git://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git
> >
>
> I did a quick testing on our
> specjbb2005/oltp/hackbench/tbench/netperf-loop/fio/ffsb on NHM EP/EX,
> Core2 EP, Romely EP machine, In generally no clear performance change
> found. Is this results expected for this patch set?

hackbench and the network benchmarks won't benefit. The former
overschedules like crazy, so there's no way any AutoNUMA balancing can
have an effect with that much overscheduling and a zillion threads.
The latter are I/O dominated and usually take so little RAM that it
doesn't matter: the memory accesses on the kernel side and the DMA
should dominate the CPU utilization. Filesystem benchmarks like fio
have a similar issue.

On all _system_ time dominated kernel benchmarks no performance
improvement is expected; if you don't measure a regression, that's
more than enough.

The only benchmarks that benefit are userland ones where the user/nice
time in top dominates. AutoNUMA cannot optimize or move kernel memory
around, it only optimizes userland computations. So you should run HPC
jobs.

The only strange thing here is that specjbb2005 gets a measurable,
significant boost with AutoNUMA, so if you didn't get a boost even
with that you may want to verify:

cat /sys/kernel/mm/autonuma/enabled == 1

Also verify:

CONFIG_AUTONUMA_DEFAULT_ENABLED=y

If that's 1, well, maybe the memory interconnect is so fast that
there's no benefit? My numa01/02 benchmarks measure the best and worst
case of the hardware (not software) with the -DINVERSE_BIND and
-DHARD_BIND parameters; you can consider running them to verify.

Probably there should be a little boot-time kernel benchmark to
measure inverse bind vs hard bind performance across the first two
nodes; if the difference is nil, AutoNUMA should disengage and not
even allocate the page_autonuma (now only 12 bytes per page, but
anyway).
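
The two checks above could be scripted roughly as follows. This is only
a sketch: the sysfs path is the one quoted in this mail, while the
kernel config location (/boot/config-$(uname -r)) and the function
names autonuma_knob_state and autonuma_default_enabled are assumptions
made for illustration, not part of the patch set.

```shell
#!/bin/sh
# Sketch: report whether AutoNUMA is switched on and enabled by default.
# Default paths are assumptions; pass other paths as the first argument.

# Print "enabled", "disabled" or "missing" for the sysfs knob in $1.
autonuma_knob_state() {
    knob="${1:-/sys/kernel/mm/autonuma/enabled}"
    if [ ! -r "$knob" ]; then
        # No such file: kernel built without AutoNUMA, or a different tree.
        echo "missing"
        return 0
    fi
    # The mail expects the knob to read 1 when AutoNUMA is active.
    if [ "$(cat "$knob")" = "1" ]; then
        echo "enabled"
    else
        echo "disabled"
    fi
}

# Print "yes" if the kernel config file in $1 contains
# CONFIG_AUTONUMA_DEFAULT_ENABLED=y, otherwise "no".
autonuma_default_enabled() {
    config="${1:-/boot/config-$(uname -r)}"
    if [ -r "$config" ] &&
       grep -q '^CONFIG_AUTONUMA_DEFAULT_ENABLED=y$' "$config"; then
        echo "yes"
    else
        echo "no"
    fi
}
```

Calling both functions without arguments checks the running system;
some distributions keep the config elsewhere (for example
/proc/config.gz), in which case the second check needs adapting.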

If you can retest with autonuma17 it would help too, as some
performance issue got fixed there and it'd stress the new autonuma
migration lru code:

git clone --reference linux -b autonuma17 git://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git autonuma17

And the very latest is always at the autonuma branch:

git clone --reference linux -b autonuma git://git.kernel.org/pub/scm/linux/kernel/git/andrea/aa.git autonuma

Thanks,
Andrea