Date: Fri, 1 May 2015 13:49:52 -0700
Subject: Re: [PATCH] x86: Optimize variable_test_bit()
From: Linus Torvalds
To: Vladimir Makarov
Cc: Jakub Jelinek, Richard Henderson, Peter Zijlstra, Ingo Molnar,
    "H. Peter Anvin", Thomas Gleixner, Linux Kernel Mailing List,
    Borislav Petkov

On Fri, May 1, 2015 at 12:02 PM, Vladimir Makarov wrote:
>
> GCC RA is a major reason to prohibit output operands for asm goto.

Hmm.. Thinking some more about it, I think that what would actually
work really well at least for the kernel is:

 (a) allow *memory* operands (ie "=m") as outputs and have them be
     meaningful even at any output labels (obviously with the caveat
     that the asm instructions that write to memory would have to
     happen before the branch ;)

     This covers the somewhat common case of having magic instructions
     that result in conditions that can't be tested at a C level.
     Things like "bit clear and test" on x86 (with or without the
     lock).

 (b) allow other operands to be meaningful only for the fallthrough
     case.

From a register allocation standpoint, these should be the easy cases.
(a) doesn't need any register allocation of the output (only on the
input to set up the effective address of the memory location), and (b)
would explicitly mean that an "asm goto" would leave any non-memory
outputs undefined in any of the goto cases, so from a RA standpoint it
ends up being equivalent to a non-goto asm..

Hmm?

So as an example of something that the kernel does and which wants to
have an output register: a load from user space that can fault. When
it faults, we obviously simply don't *have* an actual result, and we
return an error. But for the successful fallthrough case, we get a
value in a register.

I'd love to be able to write it as (this is simplified, and doesn't
worry about all the different access sizes, or the "stac/clac"
sequence to enable user accesses on modern Intel CPUs):

        asm goto(
                "1:"
                "\tmovl %1,%0\n"
                _ASM_EXTABLE(1b,%l[error])
                : "=r" (val)
                : "m" (*userptr)
                : : error);

where that "_ASM_EXTABLE()" is our magic macro for generating an
exception entry for that instruction, so that if the load takes an
exception, it will instead go to the "error" label.

But if it goes to the error label, the "val" output register really
doesn't contain anything, so we wouldn't even *want* gcc to try to do
any register allocation for the "jump to label from assembly" case.

So at least for one of the major cases that I'd like to use "asm goto"
with an output, I actually don't *want* any register allocation for
anything but the fallthrough case. And I suspect that's a
not-too-uncommon pattern - it's probably often about error handling.

                   Linus
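
A minimal sketch of case (a), assuming x86-64 and GCC or Clang, as it
can be written with an "asm goto" that has no outputs: the memory
operand is passed as an input, and a "memory" clobber stands in for the
"=m" output that the proposal would allow. The helper name
sketch_test_and_clear_bit() is hypothetical, not a kernel API, and the
non-x86 branch is only a portable fallback built on the compiler's
__atomic builtins.

```c
#include <stdbool.h>

/*
 * Hypothetical helper (illustration only, not a kernel API):
 * atomically clear bit `nr` in *addr and report whether it was set.
 */
static inline bool sketch_test_and_clear_bit(long nr,
                                             volatile unsigned long *addr)
{
#if defined(__x86_64__) && defined(__GNUC__)
        /*
         * No output operands: *addr is listed as an input, and the
         * "memory" clobber tells the compiler that memory changed.
         * The carry flag holds the old bit value, so "jc" branches
         * to the label when the bit was previously set.
         */
        __asm__ goto("lock; btrq %1,%0\n\t"
                     "jc %l[was_set]"
                     : /* no outputs */
                     : "m" (*addr), "r" (nr)
                     : "memory", "cc"
                     : was_set);
        return false;
was_set:
        return true;
#else
        /* Assumed fallback for other targets: compiler atomic builtin. */
        unsigned long mask = 1UL << nr;

        return (__atomic_fetch_and(addr, ~mask, __ATOMIC_SEQ_CST) & mask) != 0;
#endif
}
```

With an "=m" output as in (a), the blanket "memory" clobber would
presumably no longer be needed, since the compiler would know that only
*addr was written.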