Date: Mon, 12 Jan 2009 11:02:17 -0800 (PST)
From: Linus Torvalds
To: Bernd Schmidt
Cc: Andi Kleen, David Woodhouse, Andrew Morton, Ingo Molnar,
    Harvey Harrison, "H. Peter Anvin", Chris Mason, Peter Zijlstra,
    Steven Rostedt, paulmck@linux.vnet.ibm.com, Gregory Haskins,
    Matthew Wilcox, Linux Kernel Mailing List, linux-fsdevel,
    linux-btrfs, Thomas Gleixner, Nick Piggin, Peter Morreale,
    Sven Dietrich, jh@suse.cz
Subject: Re: gcc inlining heuristics was Re: [PATCH -v7][RFC]: mutex: implement adaptive spinning
In-Reply-To: <496B86B5.3090707@t-online.de>

On Mon, 12 Jan 2009, Bernd Schmidt wrote:
>
> Something at the back of my mind said "aliasing".
>
> $ gcc linus.c -O2 -S ; grep subl linus.s
>         subl    $1624, %esp
> $ gcc linus.c -O2 -S -fno-strict-aliasing; grep subl linus.s
>         subl    $824, %esp
>
> That's with 4.3.2.

Interesting.
Nonsensical, but interesting.

Since they have no overlap in lifetime, confusing this with aliasing is
really really broken (if the functions _hadn't_ been inlined, you'd have
gotten the same address for the two variables anyway! So anybody who
thinks that they need different addresses because they are different
types is really really fundamentally confused!).

But your numbers are unambiguous, and I can see the effect of that
compiler flag myself.

The good news is that the kernel obviously already uses
-fno-strict-aliasing for other reasons, so we should see this effect
already, _despite_ it making no sense. And the stack usage still causes
problems.

Oh, and I see why. This test-case shows it clearly.

Note how the max stack usage _should_ be "struct b" + "struct c". Note
how it isn't (it's "struct a" + "struct b/c").

So what seems to be going on is that gcc is able to do some per-slot
sharing, but if you have one function with a single large entity, and
another with a couple of different ones, gcc can't do any smart
allocation.

Put another way: gcc doesn't create a "union of the set of different
stack usages" (which would be optimal given a single frame, and generate
the stack layout of just the maximum possible size), it creates a "set
of unions of different stack usages" (which can be optimal in the
trivial cases, but not nearly optimal in practical cases).

That explains the ioctl behavior - the structure use is usually pretty
complicated (ie it's almost never about just _one_ large stack slot, but
the ioctl cases tend to do random stuff with multiple slots).

So it doesn't add up to some horrible maximum of all sizes, but it also
doesn't end up coalescing stack usage very well.
		Linus

---
struct a {
	int a;
	unsigned long array[200];
};

struct b {
	int b;
	unsigned long array[100];
};

struct c {
	int c;
	unsigned long array[100];
};

extern int fn3(int, void *);
extern int fn4(int, void *);

static inline __attribute__ ((always_inline))
int fn1(int flag)
{
	struct a a;
	return fn3(flag, &a);
}

static inline __attribute__ ((always_inline))
int fn2(int flag)
{
	struct b b;
	struct c c;
	return fn4(flag, &b) + fn4(flag, &c);
}

int fn(int flag)
{
	fn1(flag);
	if (flag & 1)
		return 0;
	return fn2(flag);
}
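
The two layouts described above ("union of the set" versus "set of
unions") can be modelled in plain C to make the arithmetic concrete. This
is only an illustrative sketch: the names ideal_frame, slotwise_frame,
ideal_frame_size and slotwise_frame_size are made up for this example,
and the second layout is an assumption about how the per-slot sharing
plays out for this test case, not a description of gcc's actual stack
slot allocator.

```c
#include <assert.h>
#include <stddef.h>

struct a { int a; unsigned long array[200]; };
struct b { int b; unsigned long array[100]; };
struct c { int c; unsigned long array[100]; };

/*
 * "Union of the set of different stack usages": each inlined function's
 * locals form one group, and the groups overlap as a whole.
 * Size is max(sizeof(a), sizeof(b) + sizeof(c)).
 */
union ideal_frame {
	struct { struct a a; } fn1_locals;
	struct { struct b b; struct c c; } fn2_locals;
};

/*
 * "Set of unions": sharing happens one slot at a time. Here 'a' shares a
 * slot with 'b' (assumed pairing), and 'c' gets a slot of its own.
 * Size is max(sizeof(a), sizeof(b)) + sizeof(c).
 */
struct slotwise_frame {
	union { struct a a; struct b b; } slot0;
	struct c slot1;
};

/* The ideal frame can never be smaller than the largest single local. */
static_assert(sizeof(union ideal_frame) >= sizeof(struct a),
	      "ideal frame must hold struct a");

size_t ideal_frame_size(void)
{
	return sizeof(union ideal_frame);
}

size_t slotwise_frame_size(void)
{
	return sizeof(struct slotwise_frame);
}
```

With these definitions the first layout comes out as "struct b" +
"struct c" and the second as "struct a" + "struct c", which is the gap
the test case is meant to expose. Real frames also carry spill slots,
saved registers and alignment padding, so the numbers from grep subl
will not match these sizes exactly.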