From: Hector Martin
Subject: [PATCH 0/4] arm64: Support the TSO memory model
Date: Thu, 11 Apr 2024 09:51:19 +0900
Message-Id: <20240411-tso-v1-0-754f11abfbff@marcan.st>
X-Mailing-List: linux-kernel@vger.kernel.org
To: Catalin Marinas, Will Deacon, Marc Zyngier, Mark Rutland
Cc: Zayd Qumsieh, Justin Lu, Ryan Houdek, Mark Brown, Ard Biesheuvel,
    Mateusz Guzik, Anshuman Khandual, Oliver Upton, Miguel Luis,
    Joey Gouly, Christoph Paasch, Kees Cook, Sami Tolvanen, Baoquan He,
    Joel Granados, Dawei Li, Andrew Morton, Florent Revest,
    David Hildenbrand, Stefan Roesch, Andy Chiu, Josh Triplett,
    Oleg Nesterov, Helge Deller, Zev Weiss, Ondrej Mosnacek,
    Miguel Ojeda, linux-arm-kernel@lists.infradead.org,
    linux-kernel@vger.kernel.org, Asahi Linux, Hector Martin

x86 CPUs implement a stricter memory model (TSO) than ARM64.
For this reason, x86 emulation on baseline ARM64 systems requires very
expensive memory model emulation. Having hardware that supports this
natively is therefore very attractive. Such hardware, in fact, exists.
This series adds support for userspace to identify when TSO is available
and toggle it on, if supported.

Some ARM64 CPUs intrinsically implement the TSO memory model, while
others expose it as an IMPDEF control. Apple Silicon SoCs are in the
latter category. Using TSO for x86 emulation on chips that support it
has been shown to provide a massive performance boost [1].

Patch 1 introduces the PR_{SET,GET}_MEM_MODEL userspace control, which
is initially not implemented for any architecture.

Patch 2 implements it for CPUs which are known, to the best of my
knowledge, to implement the TSO memory model unconditionally. It uses
the cpufeature mechanism to enable the control only if *all* cores in
the system meet the requirements.

Patch 3 adds the scaffolding necessary to save/restore the ACTLR_EL1
register across context switches. This register contains IMPDEF flags
related to CPU execution, and on Apple CPUs this is where the runtime
TSO toggle bit is implemented. Other CPUs could conceivably benefit from
this scaffolding if they also use ACTLR_EL1 for things that could
ostensibly be runtime controlled and context-switched. For this to work,
ACTLR_EL1 must have a uniform layout across all cores in the system.

Finally, patch 4 implements PR_{SET,GET}_MEM_MODEL for Apple CPUs by
hooking it up to flip the appropriate ACTLR_EL1 bit when the Apple TSO
feature is detected (on all CPUs, which also implies the uniform
ACTLR_EL1 layout).

This series has been brewing in the downstream Asahi Linux tree [2] for
a while now, and ships to thousands of users. A subset of them have been
using it with FEX-Emu, which already supports this feature.
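
For illustration, a userspace consumer such as an emulator might use the
new control roughly as follows. This is a minimal sketch, not code from
the series. The numeric values given for PR_GET_MEM_MODEL,
PR_SET_MEM_MODEL and the two model constants are assumptions that must be
checked against include/uapi/linux/prctl.h in patch 1, and the helper
names try_enable_tso(), current_mem_model() and pick_strategy() are
hypothetical. It also assumes that the get operation reports the current
model through the prctl() return value.

```c
#include <sys/prctl.h>

/*
 * ASSUMPTION: these values are not stated in the cover letter. They are
 * stand-ins so the sketch compiles against headers that predate the
 * series; verify them against patch 1 before relying on them.
 */
#ifndef PR_SET_MEM_MODEL
#define PR_GET_MEM_MODEL         0x6d4d444c
#define PR_SET_MEM_MODEL         0x4d4d444c
#define PR_SET_MEM_MODEL_DEFAULT 0
#define PR_SET_MEM_MODEL_TSO     1
#endif

/*
 * Hypothetical helper: ask the kernel to switch the caller to TSO.
 * Returns 1 on success and 0 if the request was rejected, for example
 * on a kernel without this series or on CPUs without TSO support.
 */
int try_enable_tso(void)
{
    return prctl(PR_SET_MEM_MODEL, (unsigned long)PR_SET_MEM_MODEL_TSO,
                 0UL, 0UL, 0UL) == 0;
}

/*
 * Hypothetical helper: query the current memory model.
 * Returns the value reported by the kernel, or -1 if the query failed.
 */
int current_mem_model(void)
{
    return prctl(PR_GET_MEM_MODEL, 0UL, 0UL, 0UL, 0UL);
}

/*
 * Hypothetical helper: choose how an emulator would enforce x86
 * ordering, falling back to explicit barriers when TSO is unavailable.
 */
const char *pick_strategy(void)
{
    return try_enable_tso() ? "hardware TSO" : "software barriers";
}
```

A caller would probe once at startup with pick_strategy() and keep its
existing barrier-based path for the case where the request is rejected,
so the same binary keeps working on kernels and CPUs without the control.
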
This rebase on v6.9-rc1 is only build-tested (all intermediate commits
with and without the config enabled, on ARM64) but I'll update the
downstream branch soon with this version and get it pushed out to
users/testers.

The Apple support works on bare metal and *should* work exactly the same
way on macOS VMs (as alluded to by Zayd in his independent submission
[3]), though I haven't personally verified this. KVM support for this is
left for a future patchset.

(Apologies for the large Cc: list; I want to make sure nobody who got
Cced on Zayd's alternate take is left out of this one.)

[1] https://fex-emu.com/FEX-2306/
[2] https://github.com/AsahiLinux/linux/tree/bits/220-tso
[3] https://lore.kernel.org/lkml/20240410211652.16640-1-zayd_qumsieh@apple.com/

Signed-off-by: Hector Martin
---
Hector Martin (4):
      prctl: Introduce PR_{SET,GET}_MEM_MODEL
      arm64: Implement PR_{GET,SET}_MEM_MODEL for always-TSO CPUs
      arm64: Introduce scaffolding to add ACTLR_EL1 to thread state
      arm64: Implement Apple IMPDEF TSO memory model control

 arch/arm64/Kconfig                        | 14 ++++++
 arch/arm64/include/asm/apple_cpufeature.h | 15 +++++++
 arch/arm64/include/asm/cpufeature.h       | 10 +++++
 arch/arm64/include/asm/processor.h        |  3 ++
 arch/arm64/kernel/Makefile                |  3 +-
 arch/arm64/kernel/cpufeature.c            | 11 ++---
 arch/arm64/kernel/cpufeature_impdef.c     | 61 ++++++++++++++++++++++++++
 arch/arm64/kernel/process.c               | 71 +++++++++++++++++++++++++++++++
 arch/arm64/kernel/setup.c                 |  8 ++++
 arch/arm64/tools/cpucaps                  |  2 +
 include/linux/memory_ordering_model.h     | 11 +++++
 include/uapi/linux/prctl.h                |  5 +++
 kernel/sys.c                              | 21 +++++++++
 13 files changed, 229 insertions(+), 6 deletions(-)
---
base-commit: 4cece764965020c22cff7665b18a012006359095
change-id: 20240411-tso-e86fdceb94b8

Best regards,
-- 
Hector Martin