Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753192AbYAVUAQ (ORCPT ); Tue, 22 Jan 2008 15:00:16 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1751559AbYAVUAE (ORCPT ); Tue, 22 Jan 2008 15:00:04 -0500 Received: from relay2.sgi.com ([192.48.171.30]:48584 "EHLO relay.sgi.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1751393AbYAVUAD (ORCPT ); Tue, 22 Jan 2008 15:00:03 -0500 Date: Tue, 22 Jan 2008 12:00:00 -0800 (PST) From: Christoph Lameter X-X-Sender: clameter@schroedinger.engr.sgi.com To: Matthew Wilcox cc: linux-mm@kvack.org, linux-kernel@vger.kernel.org Subject: Re: SLUB: Increasing partial pages In-Reply-To: <20080118191430.GD20490@parisc-linux.org> Message-ID: References: <20080116195949.GO18741@parisc-linux.org> <20080116214127.GA11559@parisc-linux.org> <20080116221618.GB11559@parisc-linux.org> <20080118191430.GD20490@parisc-linux.org> MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3032 Lines: 75 On Fri, 18 Jan 2008, Matthew Wilcox wrote: > > I repeatedly saw patches from Intel to do minor changes to SLAB that > > increase performance by 0.5% or so (like the recent removal of a BUG_ON > > for performance reasons). These do not regress again when you build a > > newer kernel release? > > No, they don't. Unless someone's broken something (eg putting pages on > the LRU in the wrong order so we don't get merges any more ;-) I tried over the weekend to get reproducable results after applying each patch using tbench (shows a similar regression to what you are seeing 1-4%) but each kernel build has different performance charateristics. Probably due to the way code is aligned in cachelines? Here are the slab numbes for a number of runs with different kernel builds: 2.6.24-rc8 slab (x86_64 8p SMP 8G) Throughput 2316.87 MB/sec 8 procs Throughput 2257.49 MB/sec 8 procs Throughput 2264.11 MB/sec 8 procs Throughput 2280.62 MB/sec 8 procs So a ~3% variance! I was able to get a feel (from the measurements that are jumping around) that some patches may cause tbench performance to drop a bit. In particular the patch before the cmpxchg_local must add the page->end field. The cmpxchg_local patch must offset that cost in order to make this a performance increase. It would be great if you could get stable results on these (with multiple differently compiled kernels! Apply some patch that should have no performance impact but adds some code to verify). I saw an overall slight performance decrease on tbench (still doubting how much I can trust those numbers) so lets not merge the series upstream until we have more data). Patches that I would recommend to test individually if you could do it (get the series via git pull git://git.kernel.org/pub/scm/linux/kernel/git/christoph/vm.git performance): 0006-Use-non-atomic-unlock.patch Surprisingly this one caused a 1% regression in some of my tests. Maybe the cacheline is held longer if no atomic op is used during unlock? 0005-Add-parameter-to-add_partial-to-avoid-having-two-fun.patch I mostly saw performance increases (1-2%) on this one. 0009-SLUB-Avoid-folding-functions-into-__slab_alloc-and.patch 0010-Move-kmem_cache_node-determination-into-add_full-par.patch The cmpxchg stuff is a group of 3 patches. The first two should cause a slight performance decrease which needs at least to be offset by the third one. 0014-SLUB-Use-unique-end-pointer-for-each-slab-page.patch 0015-slub-provide-unique-end-marker-for-each-slab-fix.patch 0017-SLUB-Alternate-fast-paths-using-cmpxchg_local.patch 0018-SLUB-Do-our-own-locking-to-avoid-extraneous-memory.patch 0019-SLUB-Own-locking-checkpatch-fixes.patch 0021-SLUB-Restructure-slab_alloc-to-flow-in-execution-se.patch -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/