From: Linus Torvalds
To: "H. Peter Anvin"
Cc: Steven Rostedt, LKML, gcc, Ingo Molnar, Mathieu Desnoyers,
    Thomas Gleixner, David Daney, Behan Webster, Peter Zijlstra, Herbert Xu
Date: Mon, 5 Aug 2013 12:01:01 -0700
Subject: Re: [RFC] gcc feature request: Moving blocks into sections
In-Reply-To: <51FFF430.1060701@linux.intel.com>

On Mon, Aug 5, 2013 at 11:51 AM, H. Peter Anvin wrote:
>>
>> Also, how would you pass the parameters? Every tracepoint has its own
>> parameters to pass to it. How would a trap know where to get "prev"
>> and "next"?
>
> How do you do that now?
>
> You have to do an IP lookup to find out what you are doing.

No, he just generates the code for the call and then uses a static_key
to jump to it. So normally it's all out-of-line, and the only thing in
the hot-path is that 5-byte nop (which gets turned into a 5-byte jump
when the tracing key is enabled).

Works fine, but the normally unused stubs end up mixing in the normal
code segment.
Which I actually think is fine, but right now we don't get the
short-jump advantage from it (and there is likely some I$ disadvantage
from just fragmentation of the code).

With two-byte jumps, you'd still get the I$ fragmentation (the
argument generation and the call and the branch back would all be in
the same code segment as the hot code), but that would be offset by
the fact that at least the hot code itself could use a short jump when
possible (ie a 2-byte nop rather than a 5-byte one).

Don't know which way it would go performance-wise. But it shouldn't
need gcc changes, it just needs the static key branch/nop rewriting to
be able to handle both sizes. I couldn't tell why Steven's series to
do that was so complex, though - I only glanced through the patches.

              Linus
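To make the "handle both sizes" point concrete, here is a rough user-space
sketch of what rewriting a branch/nop site of either size could look like.
It is not the kernel's jump-label code: jump_site_size() and
jump_site_write() are hypothetical names made up for this illustration, and
addresses are plain integers. The only things taken as given are the
standard x86 encodings: a short jump is EB + rel8 (2 bytes), a near jump is
E9 + rel32 (5 bytes), and displacements are counted from the end of the
instruction.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Common x86 nop encodings for the two site sizes. */
static const uint8_t nop2[2] = { 0x66, 0x90 };
static const uint8_t nop5[5] = { 0x0f, 0x1f, 0x44, 0x00, 0x00 };

/*
 * Hypothetical helper: decide whether a site at address 'site' can reach
 * 'target' with a 2-byte short jump, or needs the 5-byte near jump.
 * The rel8 displacement is measured from the end of the 2-byte insn.
 */
static size_t jump_site_size(int64_t site, int64_t target)
{
	int64_t disp = target - (site + 2);

	return (disp >= INT8_MIN && disp <= INT8_MAX) ? 2 : 5;
}

/*
 * Hypothetical helper: fill 'buf' (at least 'size' bytes) with either a
 * nop or a jump of the given size, depending on whether the key is enabled.
 */
static void jump_site_write(uint8_t *buf, size_t size,
			    int64_t site, int64_t target, int enabled)
{
	assert(size == 2 || size == 5);

	if (!enabled) {
		memcpy(buf, size == 2 ? nop2 : nop5, size);
		return;
	}

	if (size == 2) {
		int64_t disp = target - (site + 2);

		assert(disp >= INT8_MIN && disp <= INT8_MAX);
		buf[0] = 0xeb;			/* jmp rel8 */
		buf[1] = (uint8_t)disp;
	} else {
		uint32_t rel = (uint32_t)(target - (site + 5));
		int i;

		buf[0] = 0xe9;			/* jmp rel32, little-endian */
		for (i = 0; i < 4; i++)
			buf[1 + i] = (uint8_t)(rel >> (8 * i));
	}
}
```

With this, a site at 0x1000 whose out-of-line stub sits at 0x1040 gets the
2-byte form, while a stub at 0x2000 forces the 5-byte form. The rewriting
code only has to remember (or detect) which size each site was built with.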