Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751650AbbBLXqJ (ORCPT ); Thu, 12 Feb 2015 18:46:09 -0500 Received: from ns.horizon.com ([71.41.210.147]:41306 "HELO ns.horizon.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP id S1750992AbbBLXqI (ORCPT ); Thu, 12 Feb 2015 18:46:08 -0500 Date: 12 Feb 2015 18:46:03 -0500 Message-ID: <20150212234603.30908.qmail@ns.horizon.com> From: "George Spelvin" To: linux@horizon.com, linux@rasmusvillemoes.dk Subject: Re: [PATCH v3 1/3] lib: find_*_bit reimplementation Cc: akpm@linux-foundation.org, chris@chris-wilson.co.uk, davem@davemloft.net, dborkman@redhat.com, hannes@stressinduktion.org, klimov.linux@gmail.com, laijs@cn.fujitsu.com, linux-kernel@vger.kernel.org, msalter@redhat.com, takahiro.akashi@linaro.org, tgraf@suug.ch, valentinrothberg@gmail.com, yury.norov@gmail.com In-Reply-To: <87zj8jzc11.fsf@rasmusvillemoes.dk> Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4099 Lines: 114 > That, and if the compiler was smart enough, the AND should actually be > free (at least on x86), since it can replace a test instruction. [1] Ah, right, x86 loads don't set the flags, so you need a TEST instruction anyway. So the AND is free! Of course, not all the world's an x86. But load/store architectures generally need a test instruction as well. It's things like MIPS and Aloha, that don't have condition codes, but only conditional branches on register values, that will need an extra cycle. > In any case, my code compiles to fewer bytes (though not an entire cache > line), and I think it is somewhat clearer - I'm obviously biased on the > latter point. Myself, I think the code that shows that only the first word gets masked by control flow is clearer, but since the complexity is completely localized to this function, I'll take smaller and faster. > [1] Unfortunately, the compiler doesn't seem to be smart enough :-( When I > compile my code with gcc 5.0, I get > > 0000000000000000 : > 0: 53 push %rbx > 1: 89 f1 mov %esi,%ecx > 3: 48 8d 5e 3f lea 0x3f(%rsi),%rbx > 7: f7 d9 neg %ecx > 9: 48 c7 c2 ff ff ff ff mov $0xffffffffffffffff,%rdx > 10: 48 83 e3 c0 and $0xffffffffffffffc0,%rbx > 14: 48 d3 ea shr %cl,%rdx > 17: eb 1a jmp 33 > 19: 0f 1f 80 00 00 00 00 nopl 0x0(%rax) > > 20: 48 89 d1 mov %rdx,%rcx > 23: 48 23 0c df and (%rdi,%rbx,8),%rcx > 27: 48 c7 c2 ff ff ff ff mov $0xffffffffffffffff,%rdx > 2e: 48 85 c9 test %rcx,%rcx > 31: 75 15 jne 48 > 33: 48 83 eb 01 sub $0x1,%rbx > 37: 48 83 fb ff cmp $0xffffffffffffffff,%rbx > 3b: 75 e3 jne 20 > > 3d: 48 89 f0 mov %rsi,%rax > 40: 5b pop %rbx > 41: c3 retq > 42: 66 0f 1f 44 00 00 nopw 0x0(%rax,%rax,1) > 48: 48 89 cf mov %rcx,%rdi > 4b: 48 c1 e3 06 shl $0x6,%rbx > 4f: e8 00 00 00 00 callq 54 > 54: 48 98 cltq > 56: 48 01 d8 add %rbx,%rax > 59: 5b pop %rbx > 5a: c3 retq > 5b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1) > the main loop is 20--3b. The test instruction at 2e seems to be > redundant. The same at 37: the sub instruction already sets plenty of > flags that could be used, so explicitly comparing %rbx to -1 seems > redundant. Er... I think you hand-edited that code; it's wrong. The loop assumes that %rbx is in units of words, but the prologue sets it up in units of bits. The mov to %rcx is also redundant, since it could be eliminated with some minor rescheduling. The code generation I *want* for that function is: # addr in %rdi, size in %rsi movl %esi, %ecx leaq 0x3f(%rsi), %rax negl %ecx movq $-1, %rdx shrq $6, %rax shrq %cl, %rdx jmp 2f 1: movq $-1, %rdx 2: subq $1, %rax jc 3f andq (%rdi,%rax,8), %rdx jeq 1b bsrq %rdx, %rdx salq $6, %rax addq %rdx, %rax ret 3: movq %rsi, %rax retq I wonder if the compiler can be convinced by a bit of source tweaking? unsigned long find_last_bit(const unsigned long *addr, unsigned long size) { size_t idx = DIV_ROUND_UP(size, BITS_PER_LONG); unsigned long val = LAST_WORD_MASK(size); while (idx--) { val &= addr[idx]; if (val) return idx * BITS_PER_LONG + __fls(val); val = ~0ul; } return size; } Darn, no difference... -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/