Message-Id: <50323427020000780009660E@nat28.tlf.novell.com>
Date: Mon, 20 Aug 2012 11:57:11 +0100
From: "Jan Beulich"
To: "Andi Kleen"
Subject: Re: [PATCH 46/74] x86, lto: Disable fancy hweight optimizations for LTO
In-Reply-To: <20120819151516.GS11413@one.firstfloor.org>
X-Mailing-List: linux-kernel@vger.kernel.org

>>> On 19.08.12 at 17:15, Andi Kleen wrote:
>> >--- a/arch/x86/include/asm/arch_hweight.h
>> >+++ b/arch/x86/include/asm/arch_hweight.h
>> >@@ -25,9 +25,14 @@ static inline unsigned int __arch_hweight32(unsigned int w)
>> >{
>> > unsigned int res = 0;
>> >
>> >+#ifdef CONFIG_LTO
>> >+ res = __sw_hweight32(w);
>> >+#else
>> >+
>> > asm (ALTERNATIVE("call __sw_hweight32", POPCNT32, X86_FEATURE_POPCNT)
>> > : "="REG_OUT (res)
>> > : REG_IN (w));
>> >+#endif
>>
>> Isn't this a little too harsh? Rather than not using popcnt at all, why don't
>> you just add the necessary clobbers to the asm() in the LTO case?
>
> gcc lacks the means to declare that an asm uses an external symbol
> currently. Ok we could make it visible. But there's no way to make the
> special calling convention work anyways, at least not without someone
> changing gcc to allow declaring this per function.

That's not the point: The point really is that you could allow the
alternative regardless of LTO, and just penalize the LTO case by
having even the asm clobber the registers that a function call
would not preserve.

> I'm not sure the optimization is really worth it anyways, hweight should
> be uncommon.

That's a separate question (but I sort of agree - not sure whether
CPU mask weights ever get calculated on hot paths).

Jan
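
For readers unfamiliar with the fallback under discussion: the CONFIG_LTO
branch of the quoted patch calls __sw_hweight32() directly instead of
letting ALTERNATIVE patch in POPCNT. Below is a minimal, portable sketch of
what such a software population count can look like. It is not the kernel's
actual implementation, and the name sw_hweight32_sketch is hypothetical,
chosen only for this illustration.

```c
#include <stdint.h>

/*
 * Hypothetical illustration of a software Hamming weight (population
 * count) for 32-bit values. This is a generic bit-parallel technique,
 * not a copy of the kernel's __sw_hweight32().
 */
unsigned int sw_hweight32_sketch(uint32_t w)
{
	/* Count bits in each 2-bit group. */
	w -= (w >> 1) & 0x55555555u;
	/* Sum adjacent 2-bit counts into 4-bit groups. */
	w = (w & 0x33333333u) + ((w >> 2) & 0x33333333u);
	/* Sum adjacent 4-bit counts into bytes. */
	w = (w + (w >> 4)) & 0x0f0f0f0fu;
	/* Add the four byte counts; the total lands in the top byte. */
	return (w * 0x01010101u) >> 24;
}
```

A plain C call like this follows the normal calling convention, so the
compiler already assumes all call-clobbered registers are lost across it.
The asm variant in the patch hides the call from the compiler, which is
why the clobber list matters there.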