Date: Thu, 22 Apr 2021 22:36:12 +0200 (CEST)
From: "Maciej W. Rozycki"
To: Thomas Bogendoerfer
cc: Huacai Chen, Huacai Chen, Jiaxun Yang, linux-mips@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 2/2] MIPS: Avoid handcoded DIVU in `__div64_32' altogether
X-Mailing-List: linux-kernel@vger.kernel.org

Remove the inline asm with a DIVU instruction from `__div64_32' and use
plain C code for the intended DIVMOD calculation instead.
GCC is smart enough to know that both the quotient and the remainder are
calculated with a single DIVU, so with ISAs up to R5 the same instruction
is actually produced with overall similar code.

For R6 compiled code will work, but separate DIVU and MODU instructions
will be produced, which are also interlocked, so scalar implementations
will likely not perform as well as older ISAs with their asynchronous MD
unit.  Likely still faster than the generic algorithm though.

This removes a compilation error for R6 however, where the original DIVU
instruction is not supported anymore and the MDU accumulator registers
have been removed, and consequently GCC complains about a constraint it
cannot find a register for:

In file included from ./include/linux/math.h:5,
                 from ./include/linux/kernel.h:13,
                 from mm/page-writeback.c:15:
./include/linux/math64.h: In function 'div_u64_rem':
./arch/mips/include/asm/div64.h:76:17: error: inconsistent operand constraints in an 'asm'
   76 |                 __asm__("divu   $0, %z1, %z2"                           \
      |                 ^~~~~~~
./include/asm-generic/div64.h:245:25: note: in expansion of macro '__div64_32'
  245 |                 __rem = __div64_32(&(n), __base);       \
      |                         ^~~~~~~~~~
./include/linux/math64.h:91:22: note: in expansion of macro 'do_div'
   91 |         *remainder = do_div(dividend, divisor);
      |                      ^~~~~~

This has passed correctness verification with test_div64 and reduced the
module's average execution time down to 1.0404s from 1.0445s with R3400
@40MHz.  The module's MIPS I machine code has also shrunk by 12 bytes or
3 instructions.

Signed-off-by: Maciej W. Rozycki
---
 arch/mips/include/asm/div64.h |    8 ++------
 1 file changed, 2 insertions(+), 6 deletions(-)

Index: linux-3maxp-div64/arch/mips/include/asm/div64.h
===================================================================
--- linux-3maxp-div64.orig/arch/mips/include/asm/div64.h
+++ linux-3maxp-div64/arch/mips/include/asm/div64.h
@@ -58,7 +58,6 @@
 
 #define __div64_32(n, base) ({						\
 	unsigned long __upper, __low, __high, __radix;			\
-	unsigned long long __modquot;					\
 	unsigned long long __quot;					\
 	unsigned long long __div;					\
 	unsigned long __mod;						\
@@ -73,11 +72,8 @@
 		__upper = __high;					\
 		__high = 0;						\
 	} else {							\
-		__asm__("divu	$0, %z1, %z2"				\
-			: "=x" (__modquot)				\
-			: "Jr" (__high), "Jr" (__radix));		\
-		__upper = __modquot >> 32;				\
-		__high = __modquot;					\
+		__upper = __high % __radix;				\
+		__high /= __radix;					\
 	}								\
 									\
 	__mod = do_div64_32(__low, __upper, __low, __radix);		\
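
For illustration only, the two-step division the macro performs can be
sketched as a standalone userspace function.  This is not the kernel
code: the name `div64_32_sketch' is hypothetical, and a native 64-bit
division stands in for `do_div64_32', which the kernel implements
separately.  The point is just to show how plain `%' and `/' on the high
word replace the handcoded DIVU.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical userspace sketch, not the kernel macro: divide the 64-bit
 * value at *n by a 32-bit radix in place and return the remainder.
 */
static uint32_t div64_32_sketch(uint64_t *n, uint32_t radix)
{
	uint32_t high = (uint32_t)(*n >> 32);
	uint32_t low = (uint32_t)*n;
	uint32_t upper;
	uint64_t rest;

	assert(radix != 0);

	if (high < radix) {
		/* High word contributes nothing to the quotient. */
		upper = high;
		high = 0;
	} else {
		/* Plain C; the compiler can pair these into one divide. */
		upper = high % radix;
		high /= radix;
	}

	/*
	 * Assumption: a native 64-bit division stands in for do_div64_32
	 * here.  Since upper < radix, the quotient fits in 32 bits.
	 */
	rest = ((uint64_t)upper << 32) | low;
	low = (uint32_t)(rest / radix);

	*n = ((uint64_t)high << 32) | low;
	return (uint32_t)(rest % radix);
}
```

A caller would pass the dividend by pointer, getting the quotient back in
place and the remainder as the return value, which mirrors how `do_div'
is used.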