Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1756068AbbBDXpY (ORCPT ); Wed, 4 Feb 2015 18:45:24 -0500 Received: from mail-la0-f54.google.com ([209.85.215.54]:55689 "EHLO mail-la0-f54.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1755565AbbBDXpX (ORCPT ); Wed, 4 Feb 2015 18:45:23 -0500 Message-ID: <54D2AF0E.30500@gmail.com> Date: Thu, 05 Feb 2015 02:45:18 +0300 From: Yury User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:31.0) Gecko/20100101 Thunderbird/31.4.0 MIME-Version: 1.0 To: Rasmus Villemoes , George Spelvin CC: akpm@linux-foundation.org, chris@chris-wilson.co.uk, davem@davemloft.net, dborkman@redhat.com, hannes@stressinduktion.org, klimov.linux@gmail.com, laijs@cn.fujitsu.com, linux-kernel@vger.kernel.org, msalter@redhat.com, takahiro.akashi@linaro.org, tgraf@suug.ch, valentinrothberg@gmail.com, Yury Norov Subject: Re: [PATCH v2 1/3] lib: find_*_bit reimplementation References: <878uggr3tm.fsf@rasmusvillemoes.dk> <20150202114759.25525.qmail@ns.horizon.com> <87twz4pj44.fsf@rasmusvillemoes.dk> In-Reply-To: <87twz4pj44.fsf@rasmusvillemoes.dk> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 3201 Lines: 80 On 02.02.2015 15:56, Rasmus Villemoes wrote: > On Mon, Feb 02 2015, "George Spelvin" wrote: > >> Rasmus Villemoes wrote: >>> ... and this be part of _find_next_bit? Can find_next_bit not be simply >>> 'return _find_next_bit(addr, size, offset, 1);', and similarly for >>> find_next_zero_bit? Btw., passing true and false for the boolean >>> parameter may be a little clearer. >> Looking at the generated code, it would be better to replace the boolean >> parameter with 0ul or ~0ul and XOR with it. The same number of registers, >> and saves a conditional branch. > Nice trick. When I compiled it, gcc inlined _find_next_bit into both its > callers, making the conditional go away completely. That was with gcc > 4.7. When I try with 5.0, I do see _find_next_bit being compiled > separately. > > With the proposed change, 4.7 also makes find_next{,_zero}_bit wrappers > for _find_next_bit, further reducing the total size, which is a good > thing. And, if some other version decides to still inline it, it > should then know how to optimize the xor with 0ul or ~0ul just as well > as when the conditional was folded away. > > Yury, please also incorporate this in the next round. > > Rasmus > Ok. What are you thinking about joining _find_next_bit and _find_next_bit_le? They really differ in 2 lines. It's generally good to remove duplications, and it may decrease text size for big-endian machines. But it definitely doesn't make code easier for reading, and maybe affects performance after the optimization suggested by George... (I didn't test it yet) 29 #if !defined(find_next_bit) || !defined(find_next_zero_bit) \ 30 || (defined(BIG_ENDIAN) && \ 31 (!defined(find_next_bit_le) || !defined(find_next_zero_bit_le))) 32 static unsigned long _find_next_bit(const unsigned long *addr, 33 unsigned long nbits, unsigned long start, unsigned long flags) 34 { 35 unsigned long xor_mask = (flags & SET) ? 0UL : ULONG_MAX; 36 unsigned long tmp = addr[start / BITS_PER_LONG] ^ xor_mask; 37 38 /* Handle 1st word. */ 39 if (!IS_ALIGNED(start, BITS_PER_LONG)) { 40 #ifdef BIG_ENDIAN 41 if (flags & LE) 42 tmp &= ext2_swab(HIGH_BITS_MASK(start % BITS_PER_LONG)); 43 else 44 #endif 45 tmp &= HIGH_BITS_MASK(start % BITS_PER_LONG); 46 47 start = round_down(start, BITS_PER_LONG); 48 } 49 50 while (!tmp) { 51 start += BITS_PER_LONG; 52 if (start >= nbits) 53 return nbits; 54 55 tmp = addr[start / BITS_PER_LONG] ^ xor_mask; 56 } 57 58 #ifdef BIG_ENDIAN 59 if (flags & LE) 60 return start + __ffs(ext2_swab(tmp)); 61 62 #endif 63 return start + __ffs(tmp); 64 } 65 #endif -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/