Date: Thu, 12 Aug 2021 06:41:45 -0700
From: Guenter Roeck
To: Akira Tsukamoto
Cc: Paul Walmsley, Palmer Dabbelt, Geert Uytterhoeven, Qiu Wenbo, Albert Ou, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/1] riscv: __asm_copy_to-from_user: Improve using word copy if size < 9*SZREG
Message-ID: <20210812134145.GA4132779@roeck-us.net>
References: <65f08f01-d4ce-75c2-030b-f8759003e061@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jul 30, 2021 at 10:52:44PM +0900, Akira Tsukamoto wrote:
> Reduce the number of slow byte_copy when the size is in between
> 2*SZREG to 9*SZREG by using none unrolled word_copy.
>
> Without it any size smaller than 9*SZREG will be using slow byte_copy
> instead of none unrolled word_copy.
>
> Signed-off-by: Akira Tsukamoto

Tested-by: Guenter Roeck

> ---
>  arch/riscv/lib/uaccess.S | 46 ++++++++++++++++++++++++++++++++++++----
>  1 file changed, 42 insertions(+), 4 deletions(-)
>
> diff --git a/arch/riscv/lib/uaccess.S b/arch/riscv/lib/uaccess.S
> index 63bc691cff91..6a80d5517afc 100644
> --- a/arch/riscv/lib/uaccess.S
> +++ b/arch/riscv/lib/uaccess.S
> @@ -34,8 +34,10 @@ ENTRY(__asm_copy_from_user)
>  	/*
>  	 * Use byte copy only if too small.
>  	 * SZREG holds 4 for RV32 and 8 for RV64
> +	 * a3 - 2*SZREG is minimum size for word_copy
> +	 *      1*SZREG for aligning dst + 1*SZREG for word_copy
>  	 */
> -	li	a3, 9*SZREG /* size must be larger than size in word_copy */
> +	li	a3, 2*SZREG
>  	bltu	a2, a3, .Lbyte_copy_tail
>
>  	/*
> @@ -66,9 +68,40 @@ ENTRY(__asm_copy_from_user)
>  	andi	a3, a1, SZREG-1
>  	bnez	a3, .Lshift_copy
>
> +.Lcheck_size_bulk:
> +	/*
> +	 * Evaluate the size if possible to use unrolled.
> +	 * The word_copy_unlrolled requires larger than 8*SZREG
> +	 */
> +	li	a3, 8*SZREG
> +	add	a4, a0, a3
> +	bltu	a4, t0, .Lword_copy_unlrolled
> +
>  .Lword_copy:
> -	/*
> -	 * Both src and dst are aligned, unrolled word copy
> +	/*
> +	 * Both src and dst are aligned
> +	 * None unrolled word copy with every 1*SZREG iteration
> +	 *
> +	 * a0 - start of aligned dst
> +	 * a1 - start of aligned src
> +	 * t0 - end of aligned dst
> +	 */
> +	bgeu	a0, t0, .Lbyte_copy_tail /* check if end of copy */
> +	addi	t0, t0, -(SZREG) /* not to over run */
> +1:
> +	REG_L	a5, 0(a1)
> +	addi	a1, a1, SZREG
> +	REG_S	a5, 0(a0)
> +	addi	a0, a0, SZREG
> +	bltu	a0, t0, 1b
> +
> +	addi	t0, t0, SZREG /* revert to original value */
> +	j	.Lbyte_copy_tail
> +
> +.Lword_copy_unlrolled:
> +	/*
> +	 * Both src and dst are aligned
> +	 * Unrolled word copy with every 8*SZREG iteration
>  	 *
>  	 * a0 - start of aligned dst
>  	 * a1 - start of aligned src
> @@ -97,7 +130,12 @@ ENTRY(__asm_copy_from_user)
>  	bltu	a0, t0, 2b
>
>  	addi	t0, t0, 8*SZREG /* revert to original value */
> -	j	.Lbyte_copy_tail
> +
> +	/*
> +	 * Remaining might large enough for word_copy to reduce slow byte
> +	 * copy
> +	 */
> +	j	.Lcheck_size_bulk
>
>  .Lshift_copy:
>
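
For anyone following the control flow in the quoted hunks, here is a rough
C model of the size-based dispatch. It is a sketch under stated assumptions,
not a translation of the assembly: model_copy() and struct copy_stats are
hypothetical names, SZREG is defined as sizeof(uintptr_t) as a stand-in for
the kernel macro, the misaligned-src case (which the patch sends to
.Lshift_copy) is simplified to a plain byte copy, and the loops use simple
remaining-size checks, so iteration counts at the boundaries may not match
the assembly exactly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: stand-in for the kernel's SZREG (4 on RV32, 8 on RV64). */
#define SZREG (sizeof(uintptr_t))

/* Hypothetical bookkeeping, not part of the patch. */
struct copy_stats {
	size_t unrolled_words;	/* words moved by the 8*SZREG loop */
	size_t single_words;	/* words moved by the 1*SZREG loop */
	size_t bytes;		/* bytes moved one at a time */
};

static void model_copy(unsigned char *dst, const unsigned char *src,
		       size_t n, struct copy_stats *st)
{
	unsigned char *end = dst + n;

	memset(st, 0, sizeof(*st));

	/* As in the patch: below 2*SZREG go straight to the byte tail. */
	if (n >= 2 * SZREG) {
		/* Byte copy until dst is word aligned. */
		while ((uintptr_t)dst % SZREG != 0) {
			*dst++ = *src++;
			st->bytes++;
		}

		/*
		 * Simplification: the patch branches to .Lshift_copy when
		 * src is misaligned; this model leaves it to the byte tail.
		 */
		if ((uintptr_t)src % SZREG == 0) {
			/* .Lcheck_size_bulk: unrolled only above 8*SZREG. */
			while ((size_t)(end - dst) > 8 * SZREG) {
				memcpy(dst, src, 8 * SZREG);
				dst += 8 * SZREG;
				src += 8 * SZREG;
				st->unrolled_words += 8;
			}
			/* .Lword_copy: one word per iteration. */
			while ((size_t)(end - dst) >= SZREG) {
				memcpy(dst, src, SZREG);
				dst += SZREG;
				src += SZREG;
				st->single_words++;
			}
		}
	}

	/* .Lbyte_copy_tail: whatever is left. */
	while (dst < end) {
		*dst++ = *src++;
		st->bytes++;
	}
	assert(dst == end);
}
```

With word-aligned buffers and n = 5*SZREG + 3, the model moves five words
and three bytes. With the previous 9*SZREG threshold the same request would
have gone through byte copy entirely, which is the case the patch
description targets.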