In-Reply-To: <1457898620-1867-2-git-send-email-apinski@cavium.com>
References: <1457898620-1867-1-git-send-email-apinski@cavium.com> <1457898620-1867-2-git-send-email-apinski@cavium.com>
Date: Mon, 14 Mar 2016 07:55:38 +0100
Subject: Re: [PATCH 1/2] ARM64:VDSO: Improve gettimeofday, don't use udiv
From: Ard Biesheuvel
To: Andrew Pinski, Mark Rutland
Cc: pinskia@gmail.com, "linux-arm-kernel@lists.infradead.org", "linux-kernel@vger.kernel.org"

On 13 March 2016 at 20:50, Andrew Pinski wrote:
> On many cores, udiv with a large value is slow, expand instead
> the division out to be what GCC would have generated for the
> divide by 1000.
>
> On ThunderX, the speeds up gettimeofday by 5%.
>
> Signed-off-by: Andrew Pinski
> ---
>  arch/arm64/kernel/vdso/gettimeofday.S |   20 ++++++++++++++++----
>  1 files changed, 16 insertions(+), 4 deletions(-)
>
> diff --git a/arch/arm64/kernel/vdso/gettimeofday.S b/arch/arm64/kernel/vdso/gettimeofday.S
> index efa79e8..e5caef9 100644
> --- a/arch/arm64/kernel/vdso/gettimeofday.S
> +++ b/arch/arm64/kernel/vdso/gettimeofday.S
> @@ -64,10 +64,22 @@ ENTRY(__kernel_gettimeofday)
>         bl      __do_get_tspec
>         seqcnt_check w9, 1b
>
> -       /* Convert ns to us. */
> -       mov     x13, #1000
> -       lsl     x13, x13, x12
> -       udiv    x11, x11, x13
> +       /* Undo the shift. */
> +       lsr     x11, x11, x12
> +
> +       /* Convert ns to us (division by 1000 by using multiply high).
> +        * This is how GCC converts the division by 1000 into.
> +        * This is faster than divide on most cores.
> +        */
> +       mov     x13, 63439

Please don't mix hex and decimal constants.

> +       movk    x13, 0xe353, lsl 16
> +       lsr     x11, x11, 3
> +       movk    x13, 0x9ba5, lsl 32
> +       movk    x13, 0x20c4, lsl 48
> +       /* x13 = 0x20c49ba5e353f7cf */

Could we clean this up a bit? Something along the lines of

        .set    m, 0x20c49ba5e353f7cf

        movz    x13, #:abs_g3:m
        movk    x13, #:abs_g2_nc:m
        movk    x13, #:abs_g1_nc:m
        movk    x13, #:abs_g0_nc:m

Actually, the movz/movk sequence should probably be implemented as a
macro in asm/assembler.h, with parameters for the register and the
symbol name. I think Mark proposed such a patch at some point.

> +       umulh   x11, x11, x13
> +       lsr     x11, x11, 4
> +
>         stp     x10, x11, [x0, #TVAL_TV_SEC]
>  2:
>         /* If tz is NULL, return 0. */
> --
> 1.7.2.5
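
For reference, here is a minimal Python sketch (not part of the patch) of the arithmetic the new sequence performs, assuming x11 holds an unsigned 64-bit nanosecond count once the shift has been undone. The names MAGIC, umulh, build_magic and ns_to_us are hypothetical helpers chosen for illustration. The sketch rebuilds the constant from the four 16-bit chunks used by the mov/movk instructions and models the lsr #3, umulh, lsr #4 steps.

```python
MASK64 = (1 << 64) - 1

# Constant from the patch comment: x13 = 0x20c49ba5e353f7cf
MAGIC = 0x20C49BA5E353F7CF


def build_magic():
    """Assemble the constant from 16-bit chunks, as mov/movk would."""
    value = 63439                 # mov  x13, 63439 (same as 0xf7cf)
    value |= 0xE353 << 16         # movk x13, 0xe353, lsl 16
    value |= 0x9BA5 << 32         # movk x13, 0x9ba5, lsl 32
    value |= 0x20C4 << 48         # movk x13, 0x20c4, lsl 48
    return value


def umulh(a, b):
    """Model of the A64 umulh instruction: high 64 bits of a 64x64 product."""
    return ((a & MASK64) * (b & MASK64)) >> 64


def ns_to_us(ns):
    """Divide an unsigned 64-bit value by 1000 without a divide instruction."""
    q = (ns & MASK64) >> 3        # lsr   x11, x11, 3   (1000 = 8 * 125)
    q = umulh(q, MAGIC)           # umulh x11, x11, x13
    return q >> 4                 # lsr   x11, x11, 4


if __name__ == "__main__":
    for ns in (0, 999, 1000, 1_457_898_620_123_456_789, MASK64):
        print(ns, ns_to_us(ns), ns // 1000)
```

The constant equals 2^68 / 125 rounded up, so the multiply-high followed by the final shift of 4 computes (ns >> 3) / 125, which is ns / 1000. Writing the low chunk as 0xf7cf instead of 63439 keeps all four chunks in hex, which is the point of the first review comment.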