Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1030328AbbEAQDh (ORCPT ); Fri, 1 May 2015 12:03:37 -0400 Received: from mail-ie0-f169.google.com ([209.85.223.169]:33781 "EHLO mail-ie0-f169.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1030310AbbEAQDd (ORCPT ); Fri, 1 May 2015 12:03:33 -0400 MIME-Version: 1.0 In-Reply-To: <20150501151630.GH5029@twins.programming.kicks-ass.net> References: <20150501151630.GH5029@twins.programming.kicks-ass.net> Date: Fri, 1 May 2015 09:03:32 -0700 X-Google-Sender-Auth: XtFhmOT9u_KalRS60QdvvAMS4u8 Message-ID: Subject: Re: [PATCH] x86: Optimize variable_test_bit() From: Linus Torvalds To: Peter Zijlstra Cc: Ingo Molnar , "H. Peter Anvin" , Thomas Gleixner , Linux Kernel Mailing List , Borislav Petkov , Jakub Jelinek Content-Type: text/plain; charset=UTF-8 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2895 Lines: 67 On Fri, May 1, 2015 at 8:16 AM, Peter Zijlstra wrote: > > Since test_bit() doesn't actually have any output variables, we can use > asm goto without having to add a memory clobber. This reduces the code > to something sensible: Yes, looks good, except if we have anything that actually wants to use the value rather than branch on it. But a quick grep seems to show that the vast majority of them are all about just directly testing the result. It worries me a bit that gcc now cannot pick the likely branch any more. It will always branch out for the bit being set. So code like this: net/core/dev.c: if (likely(!test_bit(__QDISC_STATE_DEACTIVATED, &q->state))) wouldn't work, but almost all of those seem to be the constant case that doesn't get to this anyway. So on the whole, this would seem to be a win. > PS. should we kill the memory clobber for __test_and_change_bit()? It > seems inconsistent and out of place. We don't seem to have it for the clear case, so yeah.. > PPS. Jakub, I see gcc5.1 still hasn't got output operands for asm goto; > is this something we can get 'fixed' ? I suspect the problem is that now the particular register allocation choices are basically not just around the asm, they'd affect the target labels of the asm too. I think that for the kernel, it would *generally* be ok to just say that the outputs are only valid in the case the asm does *not* branch out, assuming that the *clobbers* obviously clobber things regardless. Keeping the register allocation for the asm itself still purely "local" to the asm. Something with a memory output we could just turn into a memory clobber (so we could do the test-and-change bits today without using any outputs - just mark memory clobbered). Don't get me wrong - it would be even better if outputs would be valid even for the labels the asm can jump to, but I can see that being fundamentally hard. For example, two different asm goto's that share one label, but that have different output register allocation. That would be a nightmare (the compiler could basically have to duplicate the label etc - although maybe you have to do that anyway). And many of the places where I would personally like to use "asm goto" are where the goto label is for an error case. Things like a failed cmpxchg, or a failed user access etc. It would generally be ok if the output values from the asm were "lost", because it's about the cleanup (or trying again).. So we might be ok in many cases with that kind of "weaker output", where the output is only valid when the goto is *not* taken. Linus -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/