Date: Tue, 18 Nov 2008 08:02:10 -0800 (PST)
From: Linus Torvalds
To: Nick Piggin
cc: Paul Mackerras, Benjamin Herrenschmidt, Steven Rostedt, LKML, linuxppc-dev@ozlabs.org, Andrew Morton, Ingo Molnar, Thomas Gleixner
Subject: Re: Large stack usage in fs code (especially for PPC64)
In-Reply-To: <200811182124.33141.nickpiggin@yahoo.com.au>

On Tue, 18 Nov 2008, Nick Piggin wrote:
> >
> > The fact is, Intel (and to a lesser degree, AMD) has shown how hardware
> > can do good TLB's with essentially gang lookups, giving almost effective
> > page sizes of 32kB with hardly any of the downsides. Couple that with
>
> It's much harder to do this with powerpc I think because they would need
> to calculate 8 hashes and touch 8 cachelines to prefill 8 translations,
> wouldn't they?

Oh, absolutely. It's why I despise hashed page tables. It's a broken
concept.

> The per-page processing costs are interesting too, but IMO there is more
> work that should be done to speed up order-0 pages. The patches I had to
> remove the sync instruction for smp_mb() in unlock_page sped up pagecache
> throughput (populate, write(2), reclaim) on my G5 by something really
> crazy like 50% (most of that's in, but I'm still sitting on that fancy
> unlock_page speedup to remove the final smp_mb).
>
> I suspect some of the costs are also in powerpc specific code to insert
> linux ptes into their hash table. I think some of the synchronisation for
> those could possibly be shared with generic code so you don't need the
> extra layer of locks there.

Yeah, the hashed page tables get extra costs from the fact that it can't
share the software page tables with the hardware ones, and the associated
coherency logic. It's even worse at unmap time, I think.

		Linus
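
The "8 hashes and 8 cachelines" point in the thread can be made concrete with
a rough model. This is only a sketch and not kernel code: the 64-byte line
size, the 8-byte PTE, the table size, and the multiplicative hash are all
assumptions chosen for illustration, and the hash is not the real PowerPC
hash function. It counts how many distinct cache lines hold the translations
for 8 consecutive virtual pages under a linear (radix-style) PTE array versus
a hashed table with one bucket per cache line.

```c
#include <stdint.h>

#define CACHELINE_BYTES 64u   /* assumed cache line size */
#define PTE_BYTES        8u   /* assumed size of one radix-style PTE */
#define GANG             8u   /* translations prefilled together */
#define HASH_BUCKETS  1024u   /* assumed table size, one bucket per line */

/* Radix-style layout: PTEs of adjacent pages are adjacent in memory,
 * so the cache line index is just the byte offset divided by line size. */
static uint64_t radix_line(uint64_t vpn)
{
    return (vpn * PTE_BYTES) / CACHELINE_BYTES;
}

/* Hashed layout: each page number is hashed to its own bucket.
 * Toy multiplicative hash (hypothetical, NOT the PowerPC hash). */
static uint64_t hashed_line(uint64_t vpn)
{
    return ((vpn * 0x9E3779B97F4A7C15ull) >> 32) % HASH_BUCKETS;
}

/* Count the distinct cache lines touched when looking up GANG
 * consecutive virtual page numbers starting at first_vpn. */
static unsigned count_lines(uint64_t first_vpn, uint64_t (*line_of)(uint64_t))
{
    uint64_t seen[GANG];
    unsigned n = 0;

    for (unsigned i = 0; i < GANG; i++) {
        uint64_t line = line_of(first_vpn + i);
        unsigned j;

        for (j = 0; j < n; j++)
            if (seen[j] == line)
                break;
        if (j == n)             /* not seen before: record it */
            seen[n++] = line;
    }
    return n;
}
```

With these assumptions, count_lines(0, radix_line) is 1 (all 8 PTEs share one
line when the group is aligned), while count_lines(0, hashed_line) is 8, one
line per translation. An unaligned group such as count_lines(4, radix_line)
still touches only 2 lines.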