Date: Tue, 20 Apr 2021 04:50:48 +0200 (CEST)
From: "Maciej W.
Rozycki"
To: Arnd Bergmann, Thomas Bogendoerfer
cc: Huacai Chen, Huacai Chen, Jiaxun Yang, linux-arch@vger.kernel.org, linux-mips@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: [PATCH 4/4] MIPS: Avoid DIVU in `__div64_32' is result would be zero
X-Mailing-List: linux-kernel@vger.kernel.org

We already check the high part of the dividend against zero to avoid the
costly DIVU instruction in that case, needed to reduce the high part of
the dividend, so we may as well check against the divisor instead and set
the high part of the quotient to zero right away.  We need to treat the
high part of the dividend in that case though as the remainder that would
be calculated by the DIVU instruction we avoided.

This has passed correctness verification with test_div64 and reduced the
module's average execution time down to 1.0445s and 0.2619s from 1.0668s
and 0.2629s respectively for an R3400 CPU @40MHz and a 5Kc CPU @160MHz.

Signed-off-by: Maciej W. Rozycki
---
I have made an experimental change on top of this to put `__div64_32'
out of line, and that increases the averages respectively up to 1.0785s
and 0.2705s.  Not a terrible loss, especially compared to generic times
quoted with 3/4, but still, so I think it would best be made only where
optimising for size, as noted in the cover letter.
---
 arch/mips/include/asm/div64.h |    6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Index: linux-3maxp-div64/arch/mips/include/asm/div64.h
===================================================================
--- linux-3maxp-div64.orig/arch/mips/include/asm/div64.h
+++ linux-3maxp-div64/arch/mips/include/asm/div64.h
@@ -68,9 +68,11 @@
 									\
 	__high = __div >> 32;						\
 	__low = __div;							\
-	__upper = __high;						\
 									\
-	if (__high) {							\
+	if (__high < __radix) {						\
+		__upper = __high;					\
+		__high = 0;						\
+	} else {							\
 		__asm__("divu	$0, %z1, %z2"				\
 		: "=x" (__modquot)					\
 		: "Jr" (__high), "Jr" (__radix));			\
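
For illustration only, the control flow after this change can be modelled in portable C roughly as below.  This is a sketch, not the kernel macro: the function name `div64_32_model' is hypothetical, the DIVU step is replaced by plain `/' and `%', and the second reduction step, which the real header does differently, is assumed here to be a native 64-bit division.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical portable model of the patched logic; not the kernel code.
 * Divides *n by base in place and returns the remainder.
 */
static uint32_t div64_32_model(uint64_t *n, uint32_t base)
{
	uint32_t high = (uint32_t)(*n >> 32);
	uint32_t low = (uint32_t)*n;
	uint32_t upper;
	uint64_t rest;
	uint32_t rem;

	assert(base != 0);

	if (high < base) {
		/*
		 * The high word of the quotient would be zero, so skip the
		 * first division and carry the high word of the dividend
		 * over as the remainder that division would have produced.
		 */
		upper = high;
		high = 0;
	} else {
		/* Stands in for the DIVU instruction in the real macro. */
		upper = high % base;
		high /= base;
	}

	/*
	 * Second step (assumption: native 64-bit division).  Since
	 * upper < base, the quotient of this step fits in 32 bits.
	 */
	rest = ((uint64_t)upper << 32) | low;
	rem = (uint32_t)(rest % base);
	low = (uint32_t)(rest / base);

	*n = ((uint64_t)high << 32) | low;
	return rem;
}
```

The shortcut is taken whenever the high word is smaller than the divisor, which includes the previously checked case of a zero high word.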