From: Eugeniy Paltsev
To: linux-snps-arc@lists.infradead.org, Vineet Gupta
Cc: linux-kernel@vger.kernel.org, Alexey Brodkin, Eugeniy Paltsev
Subject: [PATCH 5/5] ARCv2: LIB: MEMCPY: fixed and optimised routine
Date: Tue, 29 Jan 2019 13:49:42 +0300
Message-Id: <20190129104942.31705-6-Eugeniy.Paltsev@synopsys.com>
X-Mailer: git-send-email 2.14.5
In-Reply-To: <20190129104942.31705-1-Eugeniy.Paltsev@synopsys.com>
References: <20190129104942.31705-1-Eugeniy.Paltsev@synopsys.com>

Optimise memcpy to use the efficient unaligned memory access that is
available on ARCv2. This greatly simplifies the memcpy code and makes
it about one and a half times faster when the source or destination
is unaligned.

Signed-off-by: Eugeniy Paltsev
---
 arch/arc/Kconfig                      |  4 +++
 arch/arc/lib/Makefile                 |  5 +++-
 arch/arc/lib/memcpy-archs-unaligned.S | 46 +++++++++++++++++++++++++++++++++++
 3 files changed, 54 insertions(+), 1 deletion(-)
 create mode 100644 arch/arc/lib/memcpy-archs-unaligned.S

diff --git a/arch/arc/Kconfig b/arch/arc/Kconfig
index a1d976c612a6..88f1a3205b8f 100644
--- a/arch/arc/Kconfig
+++ b/arch/arc/Kconfig
@@ -396,6 +396,10 @@ config ARC_USE_UNALIGNED_MEM_ACCESS
 	  which is disabled by default. Enable unaligned access in
 	  hardware and use it in software.
 
+# Dummy symbol for use in the Makefile
+config ARC_NO_UNALIGNED_MEM_ACCESS
+	def_bool !ARC_USE_UNALIGNED_MEM_ACCESS
+
 config ARC_HAS_LL64
 	bool "Insn: 64bit LDD/STD"
 	help
diff --git a/arch/arc/lib/Makefile b/arch/arc/lib/Makefile
index b1656d156097..59cc8b61342e 100644
--- a/arch/arc/lib/Makefile
+++ b/arch/arc/lib/Makefile
@@ -8,4 +8,7 @@
 lib-y	:= strchr-700.o strcpy-700.o strlen.o memcmp.o
 
 lib-$(CONFIG_ISA_ARCOMPACT)	+= memcpy-700.o memset.o strcmp.o
-lib-$(CONFIG_ISA_ARCV2)		+= memcpy-archs.o memset-archs.o strcmp-archs.o
+lib-$(CONFIG_ISA_ARCV2)		+= memset-archs.o strcmp-archs.o
+
+lib-$(CONFIG_ARC_NO_UNALIGNED_MEM_ACCESS)	+= memcpy-archs.o
+lib-$(CONFIG_ARC_USE_UNALIGNED_MEM_ACCESS)	+= memcpy-archs-unaligned.o
diff --git a/arch/arc/lib/memcpy-archs-unaligned.S b/arch/arc/lib/memcpy-archs-unaligned.S
new file mode 100644
index 000000000000..e09b51d4de70
--- /dev/null
+++ b/arch/arc/lib/memcpy-archs-unaligned.S
@@ -0,0 +1,46 @@
+/* SPDX-License-Identifier: GPL-2.0+ */
+//
+// ARCv2 memcpy implementation optimized for use with unaligned memory access.
+//
+// Copyright (C) 2019 Synopsys
+// Author: Eugeniy Paltsev
+
+#include
+
+#ifdef CONFIG_ARC_HAS_LL64
+# define LOADX(DST,RX)		ldd.ab	DST, [RX, 8]
+# define STOREX(SRC,RX)		std.ab	SRC, [RX, 8]
+# define ZOLSHFT		5
+# define ZOLAND			0x1F
+#else
+# define LOADX(DST,RX)		ld.ab	DST, [RX, 4]
+# define STOREX(SRC,RX)		st.ab	SRC, [RX, 4]
+# define ZOLSHFT		4
+# define ZOLAND			0xF
+#endif
+
+ENTRY_CFI(memcpy)
+	mov	r3, r0		; don't clobber ret val
+
+	lsr.f	lp_count, r2, ZOLSHFT
+	lpnz	@.Lcopy32_64bytes
+	;; LOOP START
+	LOADX	(r6, r1)
+	LOADX	(r8, r1)
+	LOADX	(r10, r1)
+	LOADX	(r4, r1)
+	STOREX	(r6, r3)
+	STOREX	(r8, r3)
+	STOREX	(r10, r3)
+	STOREX	(r4, r3)
+.Lcopy32_64bytes:
+
+	and.f	lp_count, r2, ZOLAND	; Copy the remaining tail bytes (at most ZOLAND)
+	lpnz	@.Lcopyremainingbytes
+	;; LOOP START
+	ldb.ab	r5, [r1, 1]
+	stb.ab	r5, [r3, 1]
+.Lcopyremainingbytes:
+
+	j	[blink]
+END_CFI(memcpy)
-- 
2.14.5
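
For readers who do not read ARC assembly, the control flow of the routine
above can be modelled in C. This is only a rough sketch and not part of the
patch: model_memcpy, BLOCK_SHIFT and BLOCK_MASK are hypothetical names, the
constants mirror ZOLSHFT/ZOLAND for the CONFIG_ARC_HAS_LL64 case (32-byte
blocks), and the library memcpy() calls inside the loop merely stand in for
the four unaligned LOADX/STOREX pairs.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical constants mirroring ZOLSHFT / ZOLAND with LL64 enabled. */
#define BLOCK_SHIFT 5
#define BLOCK_MASK  0x1F

/* Illustrative model only: bulk blocks first, then a byte-wise tail. */
static void *model_memcpy(void *dst, const void *src, size_t n)
{
	unsigned char *d = dst;            /* plays the role of r3 */
	const unsigned char *s = src;      /* plays the role of r1 */
	size_t blocks = n >> BLOCK_SHIFT;  /* lsr.f lp_count, r2, ZOLSHFT */
	size_t tail = n & BLOCK_MASK;      /* and.f lp_count, r2, ZOLAND */

	while (blocks--) {
		uint64_t w[4];

		/* Stand-in for four 8-byte loads at any alignment. */
		memcpy(w, s, sizeof(w));
		/* Stand-in for four 8-byte stores at any alignment. */
		memcpy(d, w, sizeof(w));
		s += sizeof(w);
		d += sizeof(w);
	}

	/* Byte loop, matching the ldb.ab / stb.ab pair. */
	while (tail--)
		*d++ = *s++;

	return dst;                        /* r0 is left untouched */
}
```

Neither loop checks alignment: the model, like the assembly, relies on the
hardware handling unaligned word accesses, which is why the routine is only
built when CONFIG_ARC_USE_UNALIGNED_MEM_ACCESS is set.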