From: Jisheng Zhang
To: Paul Walmsley, Palmer Dabbelt, Albert Ou
Cc: Conor Dooley, linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, Qingfang DENG, Eric Biggers
Subject: [PATCH v3 0/2] riscv: enable EFFICIENT_UNALIGNED_ACCESS and DCACHE_WORD_ACCESS
Date: Sat, 23 Dec 2023 23:52:24 +0800
Message-Id: <20231223155226.4050-1-jszhang@kernel.org>

Some riscv implementations such as T-HEAD's C906, C908, C910 and C920
support efficient unaligned access; for performance reasons we want to
enable HAVE_EFFICIENT_UNALIGNED_ACCESS on these platforms. To avoid
performance regressions on platforms without efficient unaligned
access, HAVE_EFFICIENT_UNALIGNED_ACCESS can't be selected globally.

The proper solution is runtime code patching based on the detected
access speed, but that's not easy: it involves a lot of work across
various subsystems such as net, mm and lib. That can be done step by
step. So let's take an easier route for now: add support for efficient
unaligned access and hide it behind NONPORTABLE.

patch1 introduces RISCV_EFFICIENT_UNALIGNED_ACCESS, which depends on
NONPORTABLE. If users know at config time that the kernel will only
run on hardware with efficient unaligned access, they can enable it.
Obviously, a generic unified kernel Image shouldn't enable it.
patch2 adds support for DCACHE_WORD_ACCESS when both MMU and
RISCV_EFFICIENT_UNALIGNED_ACCESS are enabled.

The test program and steps below show how much performance can be
improved:

$ cat tt.c
#include <sys/types.h>
#include <sys/stat.h>
#include <unistd.h>

#define ITERATIONS 1000000
#define PATH "123456781234567812345678123456781"

int main(void)
{
	unsigned long i;
	struct stat buf;

	for (i = 0; i < ITERATIONS; i++)
		stat(PATH, &buf);
	return 0;
}
$ gcc -O2 tt.c
$ touch 123456781234567812345678123456781
$ time ./a.out

Per my test on T-HEAD C910 platforms, the performance of the above test
is improved by about 7.5%.

Since v2:
 - Don't set "-mstrict-align" CFLAGS if HAVE_EFFICIENT_UNALIGNED_ACCESS
 - collect Reviewed-by tag

Since v1:
 - fix typo in commit msg
 - fix build error if NOMMU

Jisheng Zhang (2):
  riscv: introduce RISCV_EFFICIENT_UNALIGNED_ACCESS
  riscv: select DCACHE_WORD_ACCESS for efficient unaligned access HW

 arch/riscv/Kconfig                      | 13 +++++++++++
 arch/riscv/Makefile                     |  2 ++
 arch/riscv/include/asm/asm-extable.h    | 15 ++++++++++++
 arch/riscv/include/asm/word-at-a-time.h | 27 +++++++++++++++++++++
 arch/riscv/mm/extable.c                 | 31 +++++++++++++++++++++++++
 5 files changed, 88 insertions(+)

-- 
2.40.0