Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S932279AbbBLIPu (ORCPT ); Thu, 12 Feb 2015 03:15:50 -0500
Received: from ns.horizon.com ([71.41.210.147]:41425 "HELO ns.horizon.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with SMTP
	id S932176AbbBLIPt (ORCPT ); Thu, 12 Feb 2015 03:15:49 -0500
Date: 12 Feb 2015 03:15:47 -0500
Message-ID: <20150212081547.14999.qmail@ns.horizon.com>
From: "George Spelvin"
To: linux@horizon.com, linux@rasmusvillemoes.dk, yury.norov@gmail.com
Subject: Re: [PATCH v3 1/3] lib: find_*_bit reimplementation
Cc: akpm@linux-foundation.org, chris@chris-wilson.co.uk, davem@davemloft.net,
	dborkman@redhat.com, hannes@stressinduktion.org, klimov.linux@gmail.com,
	laijs@cn.fujitsu.com, linux-kernel@vger.kernel.org, msalter@redhat.com,
	takahiro.akashi@linaro.org, tgraf@suug.ch, valentinrothberg@gmail.com
In-Reply-To: <54DBE027.3010209@gmail.com>
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 991
Lines: 22

> Rasmus, your version has ANDing by mask, and resetting the mask at each iteration
> of main loop. I think we can avoid it. What do you think on next?

Yes, that's basically what I proposed (modulo checking for zero size and
my buggy LAST_WORD_MASK).

But two unconditional instructions in the loop are awfully minor; it's
loads and conditional branches that cost. The reset of the mask can be
done in parallel with other operations; it's only the AND that actually
takes a cycle.

I can definitely see the argument that, for code that's not used often
enough to stay resident in the L1 cache, any speedup has to win by at
least one L2 cache access to be worth taking another cache line.

For Ivy Bridge, those numbers are 32 KB (L1 size) and 12 cycles (L2
access latency).
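
For concreteness, here is a rough user-space sketch of the loop shape under
discussion: the mask is ANDed into every loaded word and reset to all-ones
on each iteration. It is not the code from the patch. The names
sketch_find_next_bit() and lowest_set_bit() are made up for illustration,
and the portable bit scan stands in for the kernel's __ffs().

```c
#include <limits.h>
#include <stddef.h>

#define BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Hypothetical helper (not the kernel's __ffs): index of the lowest
 * set bit in w.  The caller must guarantee w != 0. */
static unsigned long lowest_set_bit(unsigned long w)
{
	unsigned long n = 0;

	while (!(w & 1UL)) {
		w >>= 1;
		n++;
	}
	return n;
}

/* Illustrative only: return the index of the first set bit at or
 * after 'start', or 'size' if there is none. */
static unsigned long sketch_find_next_bit(const unsigned long *addr,
					  unsigned long size,
					  unsigned long start)
{
	unsigned long idx, mask, word;

	if (start >= size)		/* also covers size == 0 */
		return size;

	idx = start / BITS_PER_LONG;
	mask = ~0UL << (start % BITS_PER_LONG);	/* hide bits below start */

	for (;;) {
		word = addr[idx] & mask;	/* the AND sits on the load's dependency chain */
		mask = ~0UL;			/* the reset depends on nothing in the loop */
		if (word)
			break;
		idx++;
		if (idx * BITS_PER_LONG >= size)
			return size;
	}

	/* Clamp, since the last word may have bits set beyond 'size'. */
	idx = idx * BITS_PER_LONG + lowest_set_bit(word);
	return idx < size ? idx : size;
}
```

The two unconditional instructions are marked in the loop body. Only the
AND has to wait for the load; the mask reset has no input from the current
iteration, so an out-of-order core is free to schedule it alongside the
load or the branch.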