Date: Wed, 21 Jan 2009 10:54:18 +0100
From: Andi Kleen
To: Nick Piggin
Cc: Andi Kleen, Ingo Molnar, Linus Torvalds, David Woodhouse,
    Bernd Schmidt, Andrew Morton, Harvey Harrison, "H. Peter Anvin",
    Chris Mason, Peter Zijlstra, Steven Rostedt, paulmck@linux.vnet.ibm.com,
    Gregory Haskins, Matthew Wilcox, Linux Kernel Mailing List,
    linux-fsdevel, linux-btrfs, Thomas Gleixner, Peter Morreale,
    Sven Dietrich, jh@suse.cz
Subject: Re: gcc inlining heuristics was Re: [PATCH -v7][RFC]: mutex: implement adaptive spinning
Message-ID: <20090121095418.GG15750@one.firstfloor.org>
References: <20090120005124.GD16304@wotan.suse.de>
            <20090120123824.GD7790@elte.hu>
            <1232480940.22233.1435.camel@macbook.infradead.org>
            <20090120210515.GC19710@elte.hu>
            <20090120220516.GA10483@elte.hu>
            <20090121085402.GD15750@one.firstfloor.org>
            <20090121085208.GO24891@wotan.suse.de>
            <20090121092049.GE15750@one.firstfloor.org>
            <20090121092550.GP24891@wotan.suse.de>
In-Reply-To: <20090121092550.GP24891@wotan.suse.de>

> The point is that the compiler is then free to do it. If things
> slow down after the compiler gets *more* information, then that
> is a problem with the compiler heuristics rather than the
> information we give it.

The point was that -Os typically disables it then (not always;
compiler heuristics are far from perfect).

> > Then x86s tend to have very very fast L1 caches and
> > if something is not in L1 on reads then the cost of fetching
> > something for a read dwarfs the few cycles you can typically
> > get out of this.
>
> Well, most architectures have L1 caches of several cycles. And
> an L1 miss typically means going to L2, which in some cases the
> compiler is expected to attempt to cover as much as possible
> (e.g. in-order architectures).

L2 cache is so much slower that scheduling a few instructions
earlier doesn't help much.

> stall, so you still want to get loads out early if possible.
>
> Even a lot of OOOE CPUs I think won't have the best alias
> analysis, so all else being equal, it wouldn't hurt them to
> move loads earlier.

Hmm, but if the load is nearby it won't matter whether a store is
in the middle, because the CPU will just execute over it. The
real big win is when you do some computation in between, but at
least for typical list manipulation there isn't really any.

> > Also, at least on x86, gcc normally doesn't do scheduling
> > beyond basic blocks, so any if () shuts it up.
>
> I don't think any of this is a reason not to use restrict, though.
> But... there are so many places we could add it to the kernel, and
> probably so few where it makes much difference. Maybe it should be
> able to help some critical core code, though.

Frankly I think it would be another unlikely().

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.
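
[Editor's illustrative sketch of the restrict/alias-analysis point discussed
above. The struct and function names are made up for illustration and are not
from any kernel code; the example only shows the reordering restrict permits.]

/* Hypothetical example: without restrict, gcc must assume the store
 * to dst->x might alias src->y (dst and src could overlap), so the
 * load of src->y has to stay after that store.  With restrict the
 * compiler is told the objects do not overlap, so it is free to
 * hoist both loads early and schedule the stores afterwards - the
 * kind of load hoisting debated in this thread.
 */

struct point {
	long x;
	long y;
};

/* No restrict: loads and stores stay in program order because
 * dst->x could be the same memory as src->y. */
void copy_point(struct point *dst, const struct point *src)
{
	dst->x = src->x;
	dst->y = src->y;
}

/* With restrict: both loads may be issued before either store. */
void copy_point_restrict(struct point *restrict dst,
			 const struct point *restrict src)
{
	dst->x = src->x;
	dst->y = src->y;
}

Whether that reordering buys anything in practice is exactly the question
above: an out-of-order x86 simply executes over the intervening store, and
on an L1 miss the handful of saved cycles vanishes into the miss latency.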