From: Chris Stillson
Cc: Greentime Hu, Han-Kuan Chen, Paul Walmsley, Palmer Dabbelt,
    Albert Ou, Eric Biederman, Kees Cook, Anup Patel, Atish Patra,
    Oleg Nesterov, Heinrich Schuchardt, Guo Ren, Conor Dooley,
    Arnaud Pouliquen, Chris Stillson, Paolo Bonzini, Qinglin Pan,
    Alexandre Ghiti, Arnd Bergmann, Vincent Chen, Heiko Stuebner,
    Dao Lu, Jisheng Zhang, Sunil V L, Li Zhengyu, Alexander Graf,
    Ard Biesheuvel, Tsukasa OI, Yury Norov, "Paul E. McKenney",
    Nicolas Saenz Julienne, Mark Rutland, Frederic Weisbecker,
    Changbin Du, Vitaly Wool, Myrtle Shah, Catalin Marinas,
    Will Deacon, Mark Brown, Huacai Chen, Alexey Dobriyan,
    Janosch Frank, Christian Brauner, Colin Cross,
    Eugene Syromiatnikov, Peter Collingbourne, Andrew Morton,
    Suren Baghdasaryan, Barret Rhoden, Davidlohr Bueso,
    linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org,
    linux-mm@kvack.org, kvm@vger.kernel.org,
    kvm-riscv@lists.infradead.org
Subject: [PATCH v12 13/17] riscv: Add vector extension XOR implementation
Date: Wed, 21 Sep 2022 14:43:55 -0700
Message-Id: <20220921214439.1491510-13-stillson@rivosinc.com>
In-Reply-To: <20220921214439.1491510-1-stillson@rivosinc.com>
References: <20220921214439.1491510-1-stillson@rivosinc.com>
X-Mailing-List: linux-kernel@vger.kernel.org

From: Greentime Hu

This patch adds support for vector-optimized XOR. It has been tested in
QEMU.

Co-developed-by: Han-Kuan Chen
Signed-off-by: Han-Kuan Chen
Signed-off-by: Greentime Hu
---
 arch/riscv/include/asm/xor.h | 82 ++++++++++++++++++++++++++++++++++++
 arch/riscv/lib/Makefile      |  1 +
 arch/riscv/lib/xor.S         | 81 +++++++++++++++++++++++++++++++++++
 3 files changed, 164 insertions(+)
 create mode 100644 arch/riscv/include/asm/xor.h
 create mode 100644 arch/riscv/lib/xor.S

diff --git a/arch/riscv/include/asm/xor.h b/arch/riscv/include/asm/xor.h
new file mode 100644
index 000000000000..d1f2eeb14afb
--- /dev/null
+++ b/arch/riscv/include/asm/xor.h
@@ -0,0 +1,82 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Copyright (C) 2021 SiFive
+ */
+
+#include
+#include
+#ifdef CONFIG_VECTOR
+#include
+#include
+
+void xor_regs_2_(unsigned long bytes, unsigned long *__restrict p1,
+		 const unsigned long *__restrict p2);
+void xor_regs_3_(unsigned long bytes, unsigned long *__restrict p1,
+		 const unsigned long *__restrict p2,
+		 const unsigned long *__restrict p3);
+void xor_regs_4_(unsigned long bytes, unsigned long *__restrict p1,
+		 const unsigned long *__restrict p2,
+		 const unsigned long *__restrict p3,
+		 const unsigned long *__restrict p4);
+void xor_regs_5_(unsigned long bytes, unsigned long *__restrict p1,
+		 const unsigned long *__restrict p2,
+		 const unsigned long *__restrict p3,
+		 const unsigned long *__restrict p4,
+		 const unsigned long *__restrict p5);
+
+static void xor_rvv_2(unsigned long bytes, unsigned long *__restrict p1,
+		      const unsigned long *__restrict p2)
+{
+	kernel_rvv_begin();
+	xor_regs_2_(bytes, p1, p2);
+	kernel_rvv_end();
+}
+
+static void xor_rvv_3(unsigned long bytes, unsigned long *__restrict p1,
+		      const unsigned long *__restrict p2,
+		      const unsigned long *__restrict p3)
+{
+	kernel_rvv_begin();
+	xor_regs_3_(bytes, p1, p2, p3);
+	kernel_rvv_end();
+}
+
+static void xor_rvv_4(unsigned long bytes, unsigned long *__restrict p1,
+		      const unsigned long *__restrict p2,
+		      const unsigned long *__restrict p3,
+		      const unsigned long *__restrict p4)
+{
+	kernel_rvv_begin();
+	xor_regs_4_(bytes, p1, p2, p3, p4);
+	kernel_rvv_end();
+}
+
+static void xor_rvv_5(unsigned long bytes, unsigned long *__restrict p1,
+		      const unsigned long *__restrict p2,
+		      const unsigned long *__restrict p3,
+		      const unsigned long *__restrict p4,
+		      const unsigned long *__restrict p5)
+{
+	kernel_rvv_begin();
+	xor_regs_5_(bytes, p1, p2, p3, p4, p5);
+	kernel_rvv_end();
+}
+
+static struct xor_block_template xor_block_rvv = {
+	.name = "rvv",
+	.do_2 = xor_rvv_2,
+	.do_3 = xor_rvv_3,
+	.do_4 = xor_rvv_4,
+	.do_5 = xor_rvv_5
+};
+
+#undef XOR_TRY_TEMPLATES
+#define XOR_TRY_TEMPLATES \
+	do { \
+		xor_speed(&xor_block_8regs); \
+		xor_speed(&xor_block_32regs); \
+		if (has_vector()) { \
+			xor_speed(&xor_block_rvv);\
+		} \
+	} while (0)
+#endif
diff --git a/arch/riscv/lib/Makefile b/arch/riscv/lib/Makefile
index 25d5c9664e57..acd87ac86d24 100644
--- a/arch/riscv/lib/Makefile
+++ b/arch/riscv/lib/Makefile
@@ -7,3 +7,4 @@ lib-$(CONFIG_MMU)	+= uaccess.o
 lib-$(CONFIG_64BIT)	+= tishift.o
 
 obj-$(CONFIG_FUNCTION_ERROR_INJECTION) += error-inject.o
+lib-$(CONFIG_VECTOR)	+= xor.o
diff --git a/arch/riscv/lib/xor.S b/arch/riscv/lib/xor.S
new file mode 100644
index 000000000000..3bc059e18171
--- /dev/null
+++ b/arch/riscv/lib/xor.S
@@ -0,0 +1,81 @@
+/* SPDX-License-Identifier: GPL-2.0-or-later */
+/*
+ * Copyright (C) 2021 SiFive
+ */
+#include
+#include
+#include
+
+ENTRY(xor_regs_2_)
+	vsetvli a3, a0, e8, m8, ta, ma
+	vle8.v v0, (a1)
+	vle8.v v8, (a2)
+	sub a0, a0, a3
+	vxor.vv v16, v0, v8
+	add a2, a2, a3
+	vse8.v v16, (a1)
+	add a1, a1, a3
+	bnez a0, xor_regs_2_
+	ret
+END(xor_regs_2_)
+EXPORT_SYMBOL(xor_regs_2_)
+
+ENTRY(xor_regs_3_)
+	vsetvli a4, a0, e8, m8, ta, ma
+	vle8.v v0, (a1)
+	vle8.v v8, (a2)
+	sub a0, a0, a4
+	vxor.vv v0, v0, v8
+	vle8.v v16, (a3)
+	add a2, a2, a4
+	vxor.vv v16, v0, v16
+	add a3, a3, a4
+	vse8.v v16, (a1)
+	add a1, a1, a4
+	bnez a0, xor_regs_3_
+	ret
+END(xor_regs_3_)
+EXPORT_SYMBOL(xor_regs_3_)
+
+ENTRY(xor_regs_4_)
+	vsetvli a5, a0, e8, m8, ta, ma
+	vle8.v v0, (a1)
+	vle8.v v8, (a2)
+	sub a0, a0, a5
+	vxor.vv v0, v0, v8
+	vle8.v v16, (a3)
+	add a2, a2, a5
+	vxor.vv v0, v0, v16
+	vle8.v v24, (a4)
+	add a3, a3, a5
+	vxor.vv v16, v0, v24
+	add a4, a4, a5
+	vse8.v v16, (a1)
+	add a1, a1, a5
+	bnez a0, xor_regs_4_
+	ret
+END(xor_regs_4_)
+EXPORT_SYMBOL(xor_regs_4_)
+
+ENTRY(xor_regs_5_)
+	vsetvli a6, a0, e8, m8, ta, ma
+	vle8.v v0, (a1)
+	vle8.v v8, (a2)
+	sub a0, a0, a6
+	vxor.vv v0, v0, v8
+	vle8.v v16, (a3)
+	add a2, a2, a6
+	vxor.vv v0, v0, v16
+	vle8.v v24, (a4)
+	add a3, a3, a6
+	vxor.vv v0, v0, v24
+	vle8.v v8, (a5)
+	add a4, a4, a6
+	vxor.vv v16, v0, v8
+	add a5, a5, a6
+	vse8.v v16, (a1)
+	add a1, a1, a6
+	bnez a0, xor_regs_5_
+	ret
+END(xor_regs_5_)
+EXPORT_SYMBOL(xor_regs_5_)
--
2.25.1
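
For readers less familiar with RVV strip-mining, here is a rough scalar C model of what the xor_regs_2_ loop appears to compute. It is only a sketch and not part of the patch. The names model_vsetvli(), model_xor_regs_2(), model_self_check() and MODEL_VLMAX are hypothetical. The model assumes VLEN = 128 bits, so e8/m8 gives a VLMAX of 128 bytes, and it assumes vsetvli grants min(AVL, VLMAX). That is one behaviour the V spec permits; hardware may choose a smaller vl when AVL is between VLMAX and 2*VLMAX.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical VLMAX in bytes for e8/m8, assuming VLEN = 128 bits. */
#define MODEL_VLMAX 128

/* Assumed vsetvli behaviour: grant min(avl, VLMAX) elements. */
static size_t model_vsetvli(size_t avl)
{
	return avl < MODEL_VLMAX ? avl : MODEL_VLMAX;
}

/* Scalar model of the strip-mined loop: p1[i] ^= p2[i] for all bytes. */
static void model_xor_regs_2(size_t bytes, unsigned char *p1,
			     const unsigned char *p2)
{
	while (bytes) {
		size_t vl = model_vsetvli(bytes);	/* vsetvli a3, a0, ... */
		size_t i;

		/* The loop only terminates if each pass makes progress. */
		assert(vl > 0 && vl <= bytes);

		for (i = 0; i < vl; i++)		/* vle8.v, vxor.vv, vse8.v */
			p1[i] ^= p2[i];

		bytes -= vl;				/* sub a0, a0, a3 */
		p2 += vl;				/* add a2, a2, a3 */
		p1 += vl;				/* add a1, a1, a3 */
	}
}

/* Returns 1 if the model matches a plain byte-wise XOR on 300 bytes. */
static int model_self_check(void)
{
	unsigned char dst[300], src[300], want[300];
	size_t i;

	for (i = 0; i < sizeof(dst); i++) {
		dst[i] = (unsigned char)i;
		src[i] = (unsigned char)(3 * i + 1);
		want[i] = (unsigned char)(dst[i] ^ src[i]);
	}
	model_xor_regs_2(sizeof(dst), dst, src);
	return memcmp(dst, want, sizeof(dst)) == 0;
}
```

With 300 bytes the model runs three passes (128, 128, 44), so the tail is handled by the same loop body. Calling model_self_check() from a small test harness is enough to exercise it. The 3-, 4- and 5-source variants would follow the same pattern, with extra source pointers advanced by vl on each pass.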