From: Jisheng Zhang
To: Paul Walmsley, Palmer Dabbelt, Albert Ou
Cc: linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org
Subject: [PATCH 2/2] riscv: select DCACHE_WORD_ACCESS for efficient unaligned access HW
Date: Sat, 2 Dec 2023 19:18:22 +0800
Message-Id: <20231202111822.3569-3-jszhang@kernel.org>
In-Reply-To: <20231202111822.3569-1-jszhang@kernel.org>
References: <20231202111822.3569-1-jszhang@kernel.org>

DCACHE_WORD_ACCESS uses the word-at-a-time API for optimised string
comparisons in the vfs layer.

This patch implements support for load_unaligned_zeropad in much the
same way as has been done for arm64.

Here is the test program and the steps to run it:

 $ cat tt.c
 #include <sys/types.h>
 #include <sys/stat.h>
 #include <unistd.h>

 #define ITERATIONS 1000000

 #define PATH "123456781234567812345678123456781"

 int main(void)
 {
         unsigned long i;
         struct stat buf;

         for (i = 0; i < ITERATIONS; i++)
                 stat(PATH, &buf);

         return 0;
 }

 $ gcc -O2 tt.c
 $ touch 123456781234567812345678123456781
 $ time ./a.out

In my testing on T-HEAD C910 platforms, the performance of the above
test improves by about 7.5%.

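For readers unfamiliar with load_unaligned_zeropad, the fixup arithmetic can be modelled in user space. The following is only a sketch under stated assumptions: zeropad_model() is a hypothetical helper, not part of this patch, and it assumes a little-endian 64-bit layout as on RV64. It shows why shifting the aligned word right by (offset * 8) yields the bytes from the original address onward, with zeroes standing in for the bytes that would have come from the unmapped page.

```c
#include <stdint.h>

/*
 * Hypothetical user-space model of the fixup path (illustration only).
 *
 * aligned_word: the 8 bytes at the faulting address rounded down to an
 *               8-byte boundary (these all live in the mapped page).
 * offset:       faulting address & 0x7, i.e. how far into that word the
 *               unaligned load started.
 *
 * The word is assembled byte by byte in little-endian order so the
 * result does not depend on the host's endianness.
 */
static uint64_t zeropad_model(const unsigned char aligned_word[8],
			      unsigned int offset)
{
	uint64_t word = 0;
	unsigned int i;

	for (i = 0; i < 8; i++)
		word |= (uint64_t)aligned_word[i] << (8 * i);

	/* Drop the bytes below the original address; zeroes shift in on top. */
	return word >> ((offset & 0x7u) * 8);
}
```

With the bytes "abcdefgh" in the aligned word and an offset of 5, the model returns the bytes 'f', 'g', 'h' in the low positions and zero in the rest, which is the value a word-at-a-time string comparison expects to see at the end of a page.
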
Signed-off-by: Jisheng Zhang
---
 arch/riscv/Kconfig                      |  1 +
 arch/riscv/include/asm/asm-extable.h    | 15 ++++++++++++
 arch/riscv/include/asm/word-at-a-time.h | 23 ++++++++++++++++++
 arch/riscv/mm/extable.c                 | 31 +++++++++++++++++++++++++
 4 files changed, 70 insertions(+)

diff --git a/arch/riscv/Kconfig b/arch/riscv/Kconfig
index 0a76209e9b02..bb366eb1870e 100644
--- a/arch/riscv/Kconfig
+++ b/arch/riscv/Kconfig
@@ -657,6 +657,7 @@ config RISCV_MISALIGNED
 config RISCV_EFFICIENT_UNALIGNED_ACCESS
 	bool "Use unaligned access for some functions"
 	depends on NONPORTABLE
+	select DCACHE_WORD_ACCESS if MMU
 	select HAVE_EFFICIENT_UNALIGNED_ACCESS
 	default n
 	help
diff --git a/arch/riscv/include/asm/asm-extable.h b/arch/riscv/include/asm/asm-extable.h
index 00a96e7a9664..0c8bfd54fc4e 100644
--- a/arch/riscv/include/asm/asm-extable.h
+++ b/arch/riscv/include/asm/asm-extable.h
@@ -6,6 +6,7 @@
 #define EX_TYPE_FIXUP			1
 #define EX_TYPE_BPF			2
 #define EX_TYPE_UACCESS_ERR_ZERO	3
+#define EX_TYPE_LOAD_UNALIGNED_ZEROPAD	4
 
 #ifdef CONFIG_MMU
 
@@ -47,6 +48,11 @@
 #define EX_DATA_REG_ZERO_SHIFT	5
 #define EX_DATA_REG_ZERO	GENMASK(9, 5)
 
+#define EX_DATA_REG_DATA_SHIFT	0
+#define EX_DATA_REG_DATA	GENMASK(4, 0)
+#define EX_DATA_REG_ADDR_SHIFT	5
+#define EX_DATA_REG_ADDR	GENMASK(9, 5)
+
 #define EX_DATA_REG(reg, gpr)						\
 	"((.L__gpr_num_" #gpr ") << " __stringify(EX_DATA_REG_##reg##_SHIFT) ")"
 
@@ -62,6 +68,15 @@
 #define _ASM_EXTABLE_UACCESS_ERR(insn, fixup, err)			\
 	_ASM_EXTABLE_UACCESS_ERR_ZERO(insn, fixup, err, zero)
 
+#define _ASM_EXTABLE_LOAD_UNALIGNED_ZEROPAD(insn, fixup, data, addr)		\
+	__DEFINE_ASM_GPR_NUMS							\
+	__ASM_EXTABLE_RAW(#insn, #fixup,					\
+			  __stringify(EX_TYPE_LOAD_UNALIGNED_ZEROPAD),		\
+			  "("							\
+			    EX_DATA_REG(DATA, data) " | "			\
+			    EX_DATA_REG(ADDR, addr)				\
+			  ")")
+
 #endif /* __ASSEMBLY__ */
 
 #else /* CONFIG_MMU */
diff --git a/arch/riscv/include/asm/word-at-a-time.h b/arch/riscv/include/asm/word-at-a-time.h
index 7c086ac6ecd4..5a3865ac3623 100644
--- a/arch/riscv/include/asm/word-at-a-time.h
+++ b/arch/riscv/include/asm/word-at-a-time.h
@@ -9,6 +9,7 @@
 #define _ASM_RISCV_WORD_AT_A_TIME_H
 
 
+#include <asm/asm-extable.h>
 #include <linux/kernel.h>
 
 struct word_at_a_time {
@@ -45,4 +46,26 @@ static inline unsigned long find_zero(unsigned long mask)
 /* The mask we created is directly usable as a bytemask */
 #define zero_bytemask(mask) (mask)
 
+/*
+ * Load an unaligned word from kernel space.
+ *
+ * In the (very unlikely) case of the word being a page-crosser
+ * and the next page not being mapped, take the exception and
+ * return zeroes in the non-existing part.
+ */
+static inline unsigned long load_unaligned_zeropad(const void *addr)
+{
+	unsigned long ret;
+
+	/* Load word from unaligned pointer addr */
+	asm(
+	"1:	" REG_L " %0, %2\n"
+	"2:\n"
+	_ASM_EXTABLE_LOAD_UNALIGNED_ZEROPAD(1b, 2b, %0, %1)
+	: "=&r" (ret)
+	: "r" (addr), "m" (*(unsigned long *)addr));
+
+	return ret;
+}
+
 #endif /* _ASM_RISCV_WORD_AT_A_TIME_H */
diff --git a/arch/riscv/mm/extable.c b/arch/riscv/mm/extable.c
index 35484d830fd6..dd1530af3ef1 100644
--- a/arch/riscv/mm/extable.c
+++ b/arch/riscv/mm/extable.c
@@ -27,6 +27,14 @@ static bool ex_handler_fixup(const struct exception_table_entry *ex,
 	return true;
 }
 
+static inline unsigned long regs_get_gpr(struct pt_regs *regs, unsigned int offset)
+{
+	if (unlikely(!offset || offset > MAX_REG_OFFSET))
+		return 0;
+
+	return *(unsigned long *)((unsigned long)regs + offset);
+}
+
 static inline void regs_set_gpr(struct pt_regs *regs, unsigned int offset,
 				unsigned long val)
 {
@@ -50,6 +58,27 @@ static bool ex_handler_uaccess_err_zero(const struct exception_table_entry *ex,
 	return true;
 }
 
+static bool
+ex_handler_load_unaligned_zeropad(const struct exception_table_entry *ex,
+				  struct pt_regs *regs)
+{
+	int reg_data = FIELD_GET(EX_DATA_REG_DATA, ex->data);
+	int reg_addr = FIELD_GET(EX_DATA_REG_ADDR, ex->data);
+	unsigned long data, addr, offset;
+
+	addr = regs_get_gpr(regs, reg_addr * sizeof(unsigned long));
+
+	offset = addr & 0x7UL;
+	addr &= ~0x7UL;
+
+	data = *(unsigned long *)addr >> (offset * 8);
+
+	regs_set_gpr(regs, reg_data * sizeof(unsigned long), data);
+
+	regs->epc = get_ex_fixup(ex);
+	return true;
+}
+
 bool fixup_exception(struct pt_regs *regs)
 {
 	const struct exception_table_entry *ex;
@@ -65,6 +94,8 @@ bool fixup_exception(struct pt_regs *regs)
 		return ex_handler_bpf(ex, regs);
 	case EX_TYPE_UACCESS_ERR_ZERO:
 		return ex_handler_uaccess_err_zero(ex, regs);
+	case EX_TYPE_LOAD_UNALIGNED_ZEROPAD:
+		return ex_handler_load_unaligned_zeropad(ex, regs);
 	}
 
 	BUG();
-- 
2.42.0
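
As a reading aid for the EX_DATA_REG_DATA and EX_DATA_REG_ADDR fields above, here is a small user-space sketch of how two 5-bit register numbers share one extable data value: the destination register in bits 4:0 and the address register in bits 9:5. The demo_* names and the hard-coded masks are assumptions made for illustration. The patch itself builds the value in the assembler through EX_DATA_REG() and decodes it with FIELD_GET().

```c
#include <stdint.h>

/* Illustrative stand-ins for GENMASK(4, 0) and GENMASK(9, 5). */
#define DEMO_REG_DATA_SHIFT	0
#define DEMO_REG_DATA_MASK	0x001fu
#define DEMO_REG_ADDR_SHIFT	5
#define DEMO_REG_ADDR_MASK	0x03e0u

/* Pack the two GPR numbers the way the extable entry's data field holds them. */
static uint16_t demo_pack(unsigned int data_reg, unsigned int addr_reg)
{
	return (uint16_t)(((data_reg << DEMO_REG_DATA_SHIFT) & DEMO_REG_DATA_MASK) |
			  ((addr_reg << DEMO_REG_ADDR_SHIFT) & DEMO_REG_ADDR_MASK));
}

/* Recover the destination register number (what FIELD_GET does for DATA). */
static unsigned int demo_data_reg(uint16_t data)
{
	return (data & DEMO_REG_DATA_MASK) >> DEMO_REG_DATA_SHIFT;
}

/* Recover the address register number (what FIELD_GET does for ADDR). */
static unsigned int demo_addr_reg(uint16_t data)
{
	return (data & DEMO_REG_ADDR_MASK) >> DEMO_REG_ADDR_SHIFT;
}
```

For example, packing destination register 10 and address register 11 gives 0x16a, and both numbers decode back unchanged. The handler then multiplies each number by sizeof(unsigned long) to index the saved register in pt_regs.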