Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1758838AbbKSQqc (ORCPT ); Thu, 19 Nov 2015 11:46:32 -0500
Received: from unicorn.mansr.com ([81.2.72.234]:50918 "EHLO unicorn.mansr.com"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1758744AbbKSQo1 convert rfc822-to-8bit (ORCPT );
        Thu, 19 Nov 2015 11:44:27 -0500
From: =?iso-8859-1?Q?M=E5ns_Rullg=E5rd?=
To: Nicolas Pitre
Cc: Alexey Brodkin, Arnd Bergmann, rmk+kernel@arm.linux.org.uk,
        linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 5/5] ARM: asm/div64.h: adjust to generic codde
References: <1446503610-6942-1-git-send-email-nicolas.pitre@linaro.org>
        <1446503610-6942-6-git-send-email-nicolas.pitre@linaro.org>
Date: Thu, 19 Nov 2015 16:44:25 +0000
In-Reply-To: (Nicolas Pitre's message of "Thu, 19 Nov 2015 11:42:45 -0500 (EST)")
Message-ID:
User-Agent: Gnus/5.13 (Gnus v5.13) Emacs/24.5 (gnu/linux)
MIME-Version: 1.0
Content-Type: text/plain; charset=iso-8859-1
Content-Transfer-Encoding: 8BIT
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1637
Lines: 51

Nicolas Pitre writes:

> On Thu, 19 Nov 2015, Måns Rullgård wrote:
>
>> Nicolas Pitre writes:
>>
>> > +static inline uint64_t __arch_xprod_64(uint64_t m, uint64_t n, bool bias)
>> > +{
>> > +        unsigned long long res;
>> > +        unsigned int tmp = 0;
>> > +
>> > +        if (!bias) {
>> > +                asm (   "umull  %Q0, %R0, %Q1, %Q2\n\t"
>> > +                        "mov    %Q0, #0"
>> > +                        : "=&r" (res)
>> > +                        : "r" (m), "r" (n)
>> > +                        : "cc");
>> > +        } else if (!(m & ((1ULL << 63) | (1ULL << 31)))) {
>> > +                res = m;
>> > +                asm (   "umlal  %Q0, %R0, %Q1, %Q2\n\t"
>> > +                        "mov    %Q0, #0"
>> > +                        : "+&r" (res)
>> > +                        : "r" (m), "r" (n)
>> > +                        : "cc");
>> > +        } else {
>> > +                asm (   "umull  %Q0, %R0, %Q2, %Q3\n\t"
>> > +                        "cmn    %Q0, %Q2\n\t"
>> > +                        "adcs   %R0, %R0, %R2\n\t"
>> > +                        "adc    %Q0, %1, #0"
>> > +                        : "=&r" (res), "+&r" (tmp)
>> > +                        : "r" (m), "r" (n)
>>
>> Why is tmp using a +r constraint here?  The register is not written, so
>> using an input-only operand could/should result in better code.  That is
>> also what the old code did.
>
> No, it is worse.  gcc allocates two registers because, somehow, it
> doesn't think that the first one still holds zero after the first usage.
> This way usage of only one temporary register is forced throughout,
> producing better code.

Makes sense.  Thanks for explaining.

-- 
Måns Rullgård
mans@mansr.com
--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
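
For readers following the quoted asm, here is a portable C sketch of what I understand __arch_xprod_64() is meant to return, going by the semantic documented in the generic div64 code: ((bias ? m : 0) + m * n) >> 64. It is only a reference model and not the code from the patch. The names xprod_64_ref and xprod_64_selfcheck are made up for illustration, and the 32-bit column split is just one way to avoid needing a 128-bit type.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical reference model (not the kernel implementation):
 * returns the upper 64 bits of the 128-bit value m * n + (bias ? m : 0),
 * built from four 32x32->64 partial products.
 */
static uint64_t xprod_64_ref(uint64_t m, uint64_t n, bool bias)
{
    uint32_t m_lo = (uint32_t)m, m_hi = (uint32_t)(m >> 32);
    uint32_t n_lo = (uint32_t)n, n_hi = (uint32_t)(n >> 32);

    uint64_t lo_lo = (uint64_t)m_lo * n_lo;
    uint64_t lo_hi = (uint64_t)m_lo * n_hi;
    uint64_t hi_lo = (uint64_t)m_hi * n_lo;
    uint64_t hi_hi = (uint64_t)m_hi * n_hi;

    /* Bits 0..31: only the carry out of this column is needed. */
    uint64_t col0 = (uint64_t)(uint32_t)lo_lo + (bias ? m_lo : 0);

    /* Bits 32..63: again only the carry survives the final shift. */
    uint64_t col1 = (col0 >> 32) + (lo_lo >> 32)
                  + (uint32_t)lo_hi + (uint32_t)hi_lo
                  + (bias ? m_hi : 0);

    /* Bits 64..127: cannot overflow, since m * (n + 1) < 2^128. */
    return hi_hi + (lo_hi >> 32) + (hi_lo >> 32) + (col1 >> 32);
}

/* Small usage example with values that can be checked by hand. */
static void xprod_64_selfcheck(void)
{
    assert(xprod_64_ref(3, 5, false) == 0);
    assert(xprod_64_ref(1ULL << 63, 2, false) == 1);
    assert(xprod_64_ref(UINT64_MAX, UINT64_MAX, true) == UINT64_MAX);
}
```

A model like this can serve as an oracle when comparing the output of an asm variant on a target board. It says nothing about register allocation, which is what the "+r" question above is about.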