Date: Fri, 23 Feb 2024 11:07:02 +0000
Message-ID: <86r0h330dl.wl-maz@kernel.org>
From: Marc Zyngier
To: Mark Brown
Cc: Catalin Marinas, Will Deacon, Oliver Upton, James Morse, Suzuki K Poulose, Jonathan Corbet, Shuah Khan, linux-arm-kernel@lists.infradead.org, linux-kernel@vger.kernel.org, Dave Martin, kvmarm@lists.linux.dev, linux-doc@vger.kernel.org, linux-kselftest@vger.kernel.org
Subject: Re: [PATCH v4 03/14] arm64/fpsimd: Support FEAT_FPMR
In-Reply-To: <20240122-arm64-2023-dpisa-v4-3-776e094861df@kernel.org>
References: <20240122-arm64-2023-dpisa-v4-0-776e094861df@kernel.org> <20240122-arm64-2023-dpisa-v4-3-776e094861df@kernel.org>

On Mon, 22 Jan 2024 16:28:06 +0000,
Mark Brown wrote:
>
> FEAT_FPMR defines a new EL0 accessible register FPMR used to configure the
> FP8 related features added to the architecture at the same time. Detect
> support for this register and context switch it for EL0 when present.
>
> Due to the sharing of responsibility for saving floating point state
> between the host kernel and KVM FP8 support is not yet implemented in KVM
> and a stub similar to that used for SVCR is provided for FPMR in order to
> avoid bisection issues. To make it easier to share host state with the
> hypervisor we store FPMR as a hardened usercopy field in uw (along with
> some padding).
>
> Signed-off-by: Mark Brown
> ---
>  arch/arm64/include/asm/cpufeature.h |  5 +++++
>  arch/arm64/include/asm/fpsimd.h     |  2 ++
>  arch/arm64/include/asm/kvm_host.h   |  1 +
>  arch/arm64/include/asm/processor.h  |  4 ++++
>  arch/arm64/kernel/cpufeature.c      |  9 +++++++++
>  arch/arm64/kernel/fpsimd.c          | 13 +++++++++++++
>  arch/arm64/kvm/fpsimd.c             |  1 +
>  arch/arm64/tools/cpucaps            |  1 +
>  8 files changed, 36 insertions(+)
>
> diff --git a/arch/arm64/include/asm/cpufeature.h b/arch/arm64/include/asm/cpufeature.h
> index 21c824edf8ce..34fcdbc65d7d 100644
> --- a/arch/arm64/include/asm/cpufeature.h
> +++ b/arch/arm64/include/asm/cpufeature.h
> @@ -768,6 +768,11 @@ static __always_inline bool system_supports_tpidr2(void)
>  	return system_supports_sme();
>  }
>
> +static __always_inline bool system_supports_fpmr(void)
> +{
> +	return alternative_has_cap_unlikely(ARM64_HAS_FPMR);
> +}
> +
>  static __always_inline bool system_supports_cnp(void)
>  {
>  	return alternative_has_cap_unlikely(ARM64_HAS_CNP);
> diff --git a/arch/arm64/include/asm/fpsimd.h b/arch/arm64/include/asm/fpsimd.h
> index 50e5f25d3024..6cf72b0d2c04 100644
> --- a/arch/arm64/include/asm/fpsimd.h
> +++ b/arch/arm64/include/asm/fpsimd.h
> @@ -89,6 +89,7 @@ struct cpu_fp_state {
>  	void *sve_state;
>  	void *sme_state;
>  	u64 *svcr;
> +	unsigned long *fpmr;
>  	unsigned int sve_vl;
>  	unsigned int sme_vl;
>  	enum fp_type *fp_type;
> @@ -154,6 +155,7 @@ extern void cpu_enable_sve(const struct arm64_cpu_capabilities *__unused);
>  extern void cpu_enable_sme(const struct arm64_cpu_capabilities *__unused);
>  extern void cpu_enable_sme2(const struct arm64_cpu_capabilities *__unused);
>  extern void cpu_enable_fa64(const struct arm64_cpu_capabilities *__unused);
> +extern void cpu_enable_fpmr(const struct arm64_cpu_capabilities *__unused);
>
>  extern u64 read_smcr_features(void);
>
> diff --git a/arch/arm64/include/asm/kvm_host.h b/arch/arm64/include/asm/kvm_host.h
> index 21c57b812569..7993694a54af 100644
> --- a/arch/arm64/include/asm/kvm_host.h
> +++ b/arch/arm64/include/asm/kvm_host.h
> @@ -543,6 +543,7 @@ struct kvm_vcpu_arch {
>  	enum fp_type fp_type;
>  	unsigned int sve_max_vl;
>  	u64 svcr;
> +	unsigned long fpmr;

As this directly represents a register, I'd rather you use a type that
represents the size of that register unambiguously (u64).

>
>  	/* Stage 2 paging state used by the hardware on next switch */
>  	struct kvm_s2_mmu *hw_mmu;
> diff --git a/arch/arm64/include/asm/processor.h b/arch/arm64/include/asm/processor.h
> index 5b0a04810b23..b453c66d3fae 100644
> --- a/arch/arm64/include/asm/processor.h
> +++ b/arch/arm64/include/asm/processor.h
> @@ -155,6 +155,8 @@ struct thread_struct {
>  	struct {
>  		unsigned long tp_value; /* TLS register */
>  		unsigned long tp2_value;
> +		unsigned long fpmr;
> +		unsigned long pad;
>  		struct user_fpsimd_state fpsimd_state;
>  	} uw;
>
> @@ -253,6 +255,8 @@ static inline void arch_thread_struct_whitelist(unsigned long *offset,
>  	BUILD_BUG_ON(sizeof_field(struct thread_struct, uw) !=
>  		     sizeof_field(struct thread_struct, uw.tp_value) +
>  		     sizeof_field(struct thread_struct, uw.tp2_value) +
> +		     sizeof_field(struct thread_struct, uw.fpmr) +
> +		     sizeof_field(struct thread_struct, uw.pad) +
>  		     sizeof_field(struct thread_struct, uw.fpsimd_state));
>
>  	*offset = offsetof(struct thread_struct, uw);
> diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
> index eae59ec0f4b0..0263565f617a 100644
> --- a/arch/arm64/kernel/cpufeature.c
> +++ b/arch/arm64/kernel/cpufeature.c
> @@ -272,6 +272,7 @@ static const struct arm64_ftr_bits ftr_id_aa64pfr1[] = {
>  };
>
>  static const struct arm64_ftr_bits ftr_id_aa64pfr2[] = {
> +	ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR2_EL1_FPMR_SHIFT, 4, 0),
>  	ARM64_FTR_END,
>  };
>
> @@ -2767,6 +2768,14 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
>  		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
>  		.matches = has_lpa2,
>  	},
> +	{
> +		.desc = "FPMR",
> +		.type = ARM64_CPUCAP_SYSTEM_FEATURE,
> +		.capability = ARM64_HAS_FPMR,
> +		.matches = has_cpuid_feature,
> +		.cpu_enable = cpu_enable_fpmr,
> +		ARM64_CPUID_FIELDS(ID_AA64PFR2_EL1, FPMR, IMP)
> +	},
>  	{},
>  };
>
> diff --git a/arch/arm64/kernel/fpsimd.c b/arch/arm64/kernel/fpsimd.c
> index a5dc6f764195..8e24b5e5e192 100644
> --- a/arch/arm64/kernel/fpsimd.c
> +++ b/arch/arm64/kernel/fpsimd.c
> @@ -359,6 +359,9 @@ static void task_fpsimd_load(void)
>  	WARN_ON(preemptible());
>  	WARN_ON(test_thread_flag(TIF_KERNEL_FPSTATE));
>
> +	if (system_supports_fpmr())
> +		write_sysreg_s(current->thread.uw.fpmr, SYS_FPMR);
> +
>  	if (system_supports_sve() || system_supports_sme()) {
>  		switch (current->thread.fp_type) {
>  		case FP_STATE_FPSIMD:
> @@ -446,6 +449,9 @@ static void fpsimd_save_user_state(void)
>  	if (test_thread_flag(TIF_FOREIGN_FPSTATE))
>  		return;
>
> +	if (system_supports_fpmr())
> +		*(last->fpmr) = read_sysreg_s(SYS_FPMR);
> +
>  	/*
>  	 * If a task is in a syscall the ABI allows us to only
>  	 * preserve the state shared with FPSIMD so don't bother
> @@ -688,6 +694,12 @@ static void sve_to_fpsimd(struct task_struct *task)
>  	}
>  }
>
> +void cpu_enable_fpmr(const struct arm64_cpu_capabilities *__always_unused p)
> +{
> +	write_sysreg_s(read_sysreg_s(SYS_SCTLR_EL1) | SCTLR_EL1_EnFPM_MASK,
> +		       SYS_SCTLR_EL1);
> +}
> +
>  #ifdef CONFIG_ARM64_SVE
>  /*
>   * Call __sve_free() directly only if you know task can't be scheduled
> @@ -1680,6 +1692,7 @@ static void fpsimd_bind_task_to_cpu(void)
>  	last->sve_vl = task_get_sve_vl(current);
>  	last->sme_vl = task_get_sme_vl(current);
>  	last->svcr = &current->thread.svcr;
> +	last->fpmr = &current->thread.uw.fpmr;
>  	last->fp_type = &current->thread.fp_type;
>  	last->to_save = FP_STATE_CURRENT;
>  	current->thread.fpsimd_cpu = smp_processor_id();
> diff --git a/arch/arm64/kvm/fpsimd.c b/arch/arm64/kvm/fpsimd.c
> index 8c1d0d4853df..e3e611e30e91 100644
> --- a/arch/arm64/kvm/fpsimd.c
> +++ b/arch/arm64/kvm/fpsimd.c
> @@ -153,6 +153,7 @@ void kvm_arch_vcpu_ctxsync_fp(struct kvm_vcpu *vcpu)
>  	fp_state.sve_vl = vcpu->arch.sve_max_vl;
>  	fp_state.sme_state = NULL;
>  	fp_state.svcr = &vcpu->arch.svcr;
> +	fp_state.fpmr = &vcpu->arch.fpmr;
>  	fp_state.fp_type = &vcpu->arch.fp_type;

Given the number of fields you keep track of, it would make a lot more
sense if these FP-related fields were in their own little structure
and tracked by a single pointer (I don't think there is a case where
we track them independently).

Thanks,

	M.

-- 
Without deviation from the norm, progress is not possible.
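
For illustration only, the two review suggestions (a fixed-width type for the register, and grouping the FP-related fields behind a single pointer) could look roughly like the user-space sketch below. Everything in it is an assumption made for the example: struct fp_ctrl_regs, struct fp_state_sketch, the sketch_*() helpers and the fake_fpmr_hw variable are hypothetical names that appear neither in the patch nor in the kernel, and uint64_t stands in for the kernel's u64.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical grouping of the per-task FP control registers.
 * Fixed-width fields make the register size explicit.
 */
struct fp_ctrl_regs {
	uint64_t svcr;
	uint64_t fpmr;
};

/* The field width is checked at build time rather than assumed. */
static_assert(sizeof(((struct fp_ctrl_regs *)0)->fpmr) == 8,
	      "fpmr must be 64 bits wide");

/*
 * Simplified stand-in for struct cpu_fp_state: one pointer to the
 * group instead of one pointer per register.
 */
struct fp_state_sketch {
	struct fp_ctrl_regs *regs;
	unsigned int sve_vl;
	unsigned int sme_vl;
};

/* Stand-in for the hardware register so the sketch runs in user space. */
static uint64_t fake_fpmr_hw;

/* Bind a task's register group to the tracked state with one assignment. */
static void sketch_bind(struct fp_state_sketch *last, struct fp_ctrl_regs *regs)
{
	last->regs = regs;
}

/* Save the (fake) hardware value into the bound group. */
static void sketch_save(struct fp_state_sketch *last)
{
	last->regs->fpmr = fake_fpmr_hw;
}

/* Load the saved value back into the (fake) hardware register. */
static void sketch_load(const struct fp_ctrl_regs *regs)
{
	fake_fpmr_hw = regs->fpmr;
}
```

With this shape, the host and KVM paths would each set a single pointer when binding state, and a register added later would only touch the group structure. Whether that fits the real save/restore paths is a question for the patch author.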