2022-04-14 12:49:52

by Tianjia Zhang

Subject: [PATCH] arm64/sme: Add hwcap for Scalable Matrix Extension

Allow userspace to detect support for SME (Scalable Matrix Extension)
by providing a hwcap for it, using the official feature name FEAT_SME,
declared in ARM DDI 0487H.a specification.

Signed-off-by: Tianjia Zhang <[email protected]>
---
Documentation/arm64/elf_hwcaps.rst | 4 ++++
arch/arm64/include/asm/hwcap.h | 1 +
arch/arm64/include/asm/sysreg.h | 1 +
arch/arm64/include/uapi/asm/hwcap.h | 1 +
arch/arm64/kernel/cpufeature.c | 13 +++++++++++++
arch/arm64/kernel/cpuinfo.c | 1 +
arch/arm64/tools/cpucaps | 1 +
7 files changed, 22 insertions(+)

diff --git a/Documentation/arm64/elf_hwcaps.rst b/Documentation/arm64/elf_hwcaps.rst
index a8f30963e550..50d2309a60d5 100644
--- a/Documentation/arm64/elf_hwcaps.rst
+++ b/Documentation/arm64/elf_hwcaps.rst
@@ -264,6 +264,10 @@ HWCAP2_MTE3
Functionality implied by ID_AA64PFR1_EL1.MTE == 0b0011, as described
by Documentation/arm64/memory-tagging-extension.rst.

+HWCAP2_SME
+
+ Functionality implied by ID_AA64PFR1_EL1.SME == 0b0001.
+
4. Unused AT_HWCAP bits
-----------------------

diff --git a/arch/arm64/include/asm/hwcap.h b/arch/arm64/include/asm/hwcap.h
index 8db5ec0089db..5299afc30fb0 100644
--- a/arch/arm64/include/asm/hwcap.h
+++ b/arch/arm64/include/asm/hwcap.h
@@ -109,6 +109,7 @@
#define KERNEL_HWCAP_AFP __khwcap2_feature(AFP)
#define KERNEL_HWCAP_RPRES __khwcap2_feature(RPRES)
#define KERNEL_HWCAP_MTE3 __khwcap2_feature(MTE3)
+#define KERNEL_HWCAP_SME __khwcap2_feature(SME)

/*
* This yields a mask that user programs can use to figure out what
diff --git a/arch/arm64/include/asm/sysreg.h b/arch/arm64/include/asm/sysreg.h
index fbf5f8bb9055..e66f9360cd93 100644
--- a/arch/arm64/include/asm/sysreg.h
+++ b/arch/arm64/include/asm/sysreg.h
@@ -836,6 +836,7 @@
#define ID_AA64PFR0_ELx_32BIT_64BIT 0x2

/* id_aa64pfr1 */
+#define ID_AA64PFR1_SME_SHIFT 24
#define ID_AA64PFR1_MPAMFRAC_SHIFT 16
#define ID_AA64PFR1_RASFRAC_SHIFT 12
#define ID_AA64PFR1_MTE_SHIFT 8
diff --git a/arch/arm64/include/uapi/asm/hwcap.h b/arch/arm64/include/uapi/asm/hwcap.h
index 99cb5d383048..0371779c7ca2 100644
--- a/arch/arm64/include/uapi/asm/hwcap.h
+++ b/arch/arm64/include/uapi/asm/hwcap.h
@@ -79,5 +79,6 @@
#define HWCAP2_AFP (1 << 20)
#define HWCAP2_RPRES (1 << 21)
#define HWCAP2_MTE3 (1 << 22)
+#define HWCAP2_SME (1 << 23)

#endif /* _UAPI__ASM_HWCAP_H */
diff --git a/arch/arm64/kernel/cpufeature.c b/arch/arm64/kernel/cpufeature.c
index d72c4b4d389c..55c5e4b9c50e 100644
--- a/arch/arm64/kernel/cpufeature.c
+++ b/arch/arm64/kernel/cpufeature.c
@@ -261,6 +261,7 @@ static const struct arm64_ftr_bits ftr_id_aa64pfr0[] = {
};

static const struct arm64_ftr_bits ftr_id_aa64pfr1[] = {
+ ARM64_FTR_BITS(FTR_VISIBLE, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR1_SME_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR1_MPAMFRAC_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_HIDDEN, FTR_STRICT, FTR_LOWER_SAFE, ID_AA64PFR1_RASFRAC_SHIFT, 4, 0),
ARM64_FTR_BITS(FTR_VISIBLE_IF_IS_ENABLED(CONFIG_ARM64_MTE),
@@ -2442,6 +2443,17 @@ static const struct arm64_cpu_capabilities arm64_features[] = {
.matches = has_cpuid_feature,
.min_field_value = 1,
},
+ {
+ .desc = "Scalable Matrix Extension",
+ .capability = ARM64_SME,
+ .type = ARM64_CPUCAP_SYSTEM_FEATURE,
+ .matches = has_cpuid_feature,
+ .sys_reg = SYS_ID_AA64PFR1_EL1,
+ .field_pos = ID_AA64PFR1_SME_SHIFT,
+ .field_width = 4,
+ .sign = FTR_UNSIGNED,
+ .min_field_value = 1,
+ },
{},
};

@@ -2572,6 +2584,7 @@ static const struct arm64_cpu_capabilities arm64_elf_hwcaps[] = {
HWCAP_CAP(SYS_ID_AA64PFR1_EL1, ID_AA64PFR1_MTE_SHIFT, 4, FTR_UNSIGNED, ID_AA64PFR1_MTE, CAP_HWCAP, KERNEL_HWCAP_MTE),
HWCAP_CAP(SYS_ID_AA64PFR1_EL1, ID_AA64PFR1_MTE_SHIFT, 4, FTR_UNSIGNED, ID_AA64PFR1_MTE_ASYMM, CAP_HWCAP, KERNEL_HWCAP_MTE3),
#endif /* CONFIG_ARM64_MTE */
+ HWCAP_CAP(SYS_ID_AA64PFR1_EL1, ID_AA64PFR1_SME_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_SME),
HWCAP_CAP(SYS_ID_AA64MMFR0_EL1, ID_AA64MMFR0_ECV_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_ECV),
HWCAP_CAP(SYS_ID_AA64MMFR1_EL1, ID_AA64MMFR1_AFP_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_AFP),
HWCAP_CAP(SYS_ID_AA64ISAR2_EL1, ID_AA64ISAR2_RPRES_SHIFT, 4, FTR_UNSIGNED, 1, CAP_HWCAP, KERNEL_HWCAP_RPRES),
diff --git a/arch/arm64/kernel/cpuinfo.c b/arch/arm64/kernel/cpuinfo.c
index 330b92ea863a..87be4ba601eb 100644
--- a/arch/arm64/kernel/cpuinfo.c
+++ b/arch/arm64/kernel/cpuinfo.c
@@ -98,6 +98,7 @@ static const char *const hwcap_str[] = {
[KERNEL_HWCAP_AFP] = "afp",
[KERNEL_HWCAP_RPRES] = "rpres",
[KERNEL_HWCAP_MTE3] = "mte3",
+ [KERNEL_HWCAP_SME] = "sme",
};

#ifdef CONFIG_COMPAT
diff --git a/arch/arm64/tools/cpucaps b/arch/arm64/tools/cpucaps
index 3ed418f70e3b..c0c05399b24a 100644
--- a/arch/arm64/tools/cpucaps
+++ b/arch/arm64/tools/cpucaps
@@ -49,6 +49,7 @@ SPECTRE_V4
SPECTRE_BHB
SSBS
SVE
+SME
UNMAP_KERNEL_AT_EL0
WORKAROUND_834220
WORKAROUND_843419
--
2.24.3 (Apple Git-128)
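
Editor's note: the cpufeature.c hunks above rely on two ideas, an unsigned
4-bit ID register field compared against .min_field_value, and
FTR_LOWER_SAFE, which treats the lower of two CPUs' field values as the
safe system-wide value. The sketch below is a simplified userspace model
of those two ideas, not the kernel's implementation. Only
ID_AA64PFR1_SME_SHIFT comes from the patch; the helper names are
hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* Shift value taken from the sysreg.h hunk of the patch. */
#define ID_AA64PFR1_SME_SHIFT 24

/* Extract an unsigned field of `width` bits (width < 64) starting at `shift`. */
uint64_t id_field_unsigned(uint64_t reg, unsigned shift, unsigned width)
{
	return (reg >> shift) & ((UINT64_C(1) << width) - 1);
}

/* Hypothetical helper: a capability matches when the field reaches the minimum. */
bool field_meets_minimum(uint64_t reg, unsigned shift, unsigned width,
			 uint64_t min_field_value)
{
	return id_field_unsigned(reg, shift, width) >= min_field_value;
}

/* Hypothetical helper: FTR_LOWER_SAFE modelled as the lower of two values. */
uint64_t lower_safe(uint64_t cpu_a_field, uint64_t cpu_b_field)
{
	return cpu_a_field < cpu_b_field ? cpu_a_field : cpu_b_field;
}
```

With these helpers, field_meets_minimum(reg, ID_AA64PFR1_SME_SHIFT, 4, 1)
mirrors the ".min_field_value = 1" entry: any non-zero SME field matches.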


2022-04-14 13:39:36

by Mark Brown

Subject: Re: [PATCH] arm64/sme: Add hwcap for Scalable Matrix Extension

On Thu, Apr 14, 2022 at 07:55:44PM +0800, Tianjia Zhang wrote:

> Allow userspace to detect support for SME (Scalable Matrix Extension)
> by providing a hwcap for it, using the official feature name FEAT_SME,
> declared in ARM DDI 0487H.a specification.

There's already a hwcap for the core feature and all the subfeatures
added as part of the series I've been posting for SME:

https://lore.kernel.org/linux-arm-kernel/[email protected]/

Why add something independently, especially given that there is no way
for userspace to do anything constructive with the feature without the
rest of the kernel support? Any attempt to use SME instructions without
kernel support will trap and generate a SIGILL even if the feature is
present in hardware.
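
Editor's note: because SME instructions fault without kernel support,
userspace would normally test the hwcap before executing any of them. A
minimal sketch of such a check follows, assuming a Linux arm64 system and
the HWCAP2_SME bit value (1 << 23) proposed in the patch above. The
function names are hypothetical, and the fallback defines are only there
for toolchains whose headers lack the constants.

```c
#include <stdbool.h>
#include <sys/auxv.h>

/* Fallbacks for older headers; values assumed from the ELF auxv ABI and the patch. */
#ifndef AT_HWCAP2
#define AT_HWCAP2 26
#endif
#ifndef HWCAP2_SME
#define HWCAP2_SME (1UL << 23)
#endif

/* Pure bit test, kept separate so it can be checked without real hardware. */
bool hwcap2_has_sme(unsigned long hwcap2)
{
	return (hwcap2 & HWCAP2_SME) != 0;
}

/* Ask the kernel, via the auxiliary vector, whether it advertises SME.
 * The result is only meaningful on arm64; other architectures use
 * AT_HWCAP2 bit 23 for unrelated features. */
bool cpu_has_sme(void)
{
	return hwcap2_has_sme(getauxval(AT_HWCAP2));
}
```

A caller would branch on cpu_has_sme() and fall back to a non-SME code
path when it returns false, instead of relying on catching SIGILL.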

Do you have a system with SME that you're trying to use? Review/testing
on the current series would be appreciated.



2022-04-15 16:05:59

by Marc Zyngier

Subject: Re: [PATCH] arm64/sme: Add hwcap for Scalable Matrix Extension

On Thu, 14 Apr 2022 12:55:44 +0100,
Tianjia Zhang <[email protected]> wrote:
>
> Allow userspace to detect support for SME (Scalable Matrix Extension)
> by providing a hwcap for it, using the official feature name FEAT_SME,
> declared in ARM DDI 0487H.a specification.

Err, not just that, for sure. What does this patch buy you on its
own, given that the kernel doesn't implement anything yet and that all
the SME instructions will UNDEF?

[1] is the real deal.

Thanks,

M.

[1] https://lore.kernel.org/r/[email protected]

--
Without deviation from the norm, progress is not possible.

2022-04-16 01:15:56

by Tianjia Zhang

Subject: Re: [PATCH] arm64/sme: Add hwcap for Scalable Matrix Extension

Hi Marc,

On 4/14/22 8:06 PM, Marc Zyngier wrote:
> On Thu, 14 Apr 2022 12:55:44 +0100,
> Tianjia Zhang <[email protected]> wrote:
>>
>> Allow userspace to detect support for SME (Scalable Matrix Extension)
>> by providing a hwcap for it, using the official feature name FEAT_SME,
>> declared in ARM DDI 0487H.a specification.
>
> Err, not just that, for sure. What does this patch buy you on its
> own, given that the kernel doesn't implement anything yet and that all
> the SME instructions will UNDEF?
>
> [1] is the real deal.
>
> Thanks,
>
> M.
>
> [1] https://lore.kernel.org/r/[email protected]
>

Thanks for your suggestion. My scenario is very simple: I wanted to see
in cpuinfo whether the SME feature is supported. That seems impractical
at the moment.
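
Editor's note: the patch above adds the string "sme" to the hwcap_str
table, so the check described here amounts to looking for that token on a
"Features" line of /proc/cpuinfo. A minimal sketch of the token match
follows; the function name is hypothetical, and the caller is assumed to
have already read the line into a string.

```c
#include <stdbool.h>
#include <string.h>

/* Return true if `features` contains `name` as a whole whitespace-delimited
 * token, so that "sme" does not match a longer flag that merely starts
 * with the same letters. */
bool features_has(const char *features, const char *name)
{
	size_t len = strlen(name);
	const char *p = features;

	if (len == 0)
		return false;

	while ((p = strstr(p, name)) != NULL) {
		bool starts = (p == features) || p[-1] == ' ' || p[-1] == '\t';
		bool ends = p[len] == '\0' || p[len] == ' ' || p[len] == '\n';

		if (starts && ends)
			return true;
		p += len;	/* keep scanning after this partial match */
	}
	return false;
}
```

Whole-token matching matters because feature names often share prefixes;
a plain substring search would report false positives.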

Kind regards,
Tianjia

2022-04-16 01:28:10

by Tianjia Zhang

Subject: Re: [PATCH] arm64/sme: Add hwcap for Scalable Matrix Extension

Hi Mark,

On 4/14/22 8:02 PM, Mark Brown wrote:
> On Thu, Apr 14, 2022 at 07:55:44PM +0800, Tianjia Zhang wrote:
>
>> Allow userspace to detect support for SME (Scalable Matrix Extension)
>> by providing a hwcap for it, using the official feature name FEAT_SME,
>> declared in ARM DDI 0487H.a specification.
>
> There's already a hwcap for the core feature and all the subfeatures
> added as part of the series I've been posting for SME:
>
> https://lore.kernel.org/linux-arm-kernel/[email protected]/
>
> Why add something independently, especially given that there is no way
> for userspace to do anything constructive with the feature without the
> rest of the kernel support? Any attempt to use SME instructions without
> kernel support will trap and generate a SIGILL even if the feature is
> present in hardware.

Great job. I ran into an invalid REVD instruction (it requires FEAT_SME)
while developing SVE2 programs, so I had planned to add SME support to
the kernel gradually. Thanks for your contribution; you can ignore my
patch.

In addition, I would like to ask whether there is an alternative SVE2
instruction that can perform the same operation as REVD on a machine
that does not support SME.

>
> Do you have a system with SME that you're trying to use? Review/testing
> on the current series would be appreciated.

Unfortunately, the value my machine currently reads from the
ID_AA64PFR1_EL1 register is 0x121, so it seems that the hardware does
not support SME. Is there any other help I can provide?
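
Editor's note: the conclusion above follows from splitting 0x121 into
4-bit fields. A small sketch of that arithmetic is shown below. The SME
and MTE shifts come from the patch; the names given to bits [3:0] (BT)
and [7:4] (SSBS) are taken from the Arm architecture reference rather
than from this thread, and the helper name is hypothetical.

```c
#include <stdint.h>

/* Shifts for SME and MTE as defined in the sysreg.h hunk of the patch. */
#define ID_AA64PFR1_SME_SHIFT  24
#define ID_AA64PFR1_MTE_SHIFT   8
/* Assumed from the Arm architecture reference, not from the patch. */
#define ID_AA64PFR1_SSBS_SHIFT  4
#define ID_AA64PFR1_BT_SHIFT    0

/* Extract one 4-bit ID field from a raw ID_AA64PFR1_EL1 value. */
unsigned pfr1_field(uint64_t reg, unsigned shift)
{
	return (unsigned)((reg >> shift) & 0xf);
}
```

For the reported value 0x121, the BT field is 1, SSBS is 2, MTE is 1 and
the SME field at bits [27:24] is 0, which is why the hwcap would not be
set on that machine.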

Kind regards,
Tianjia

2022-04-20 13:02:08

by Mark Brown

Subject: Re: [PATCH] arm64/sme: Add hwcap for Scalable Matrix Extension

On Fri, Apr 15, 2022 at 10:25:33AM +0800, Tianjia Zhang wrote:
> On 4/14/22 8:02 PM, Mark Brown wrote:
> > On Thu, Apr 14, 2022 at 07:55:44PM +0800, Tianjia Zhang wrote:

> > Why add something independently, especially given that there is no way
> > for userspace to do anything constructive with the feature without the
> > rest of the kernel support? Any attempt to use SME instructions without
> > kernel support will trap and generate a SIGILL even if the feature is
> > present in hardware.

> Great job, I encountered the issue of invalid REVD (requires FEAT_SME)
> instruction when developing SVE2 programs, so I plan to gradually
> support SME in the kernel, thanks for your contribution, you can ignore
> my patch.

I see. Unfortunately all the new registers mean that we really need to
define all the ABI as soon as we enable anything and the only thing we
can really skip out on when doing initial enablement is KVM (which I
have in fact skipped for the time being, I'll look at that at some point
after the initial support is landed).

> In addition, I would like to ask a question, whether there is an
> alternative SVE2 instruction for the REVD instruction that can complete
> this operation, if the machine does not support SME.

I'm not aware of anything, but I am mostly focused on the OS support
rather than any of the actual mathematical operations that are more the
point of these architecture features so I might be missing something.

> > Do you have a system with SME that you're trying to use? Review/testing
> > on the current series would be appreciated.

> Unfortunately, the value currently read by my machine ID_AA64PFR1_EL1
> register is 0x121. It seems that the hardware does not support SME. Is
> there any other help I can provide?

Other than verifying that the series doesn't cause trouble for systems
without SME



2022-04-24 17:44:25

by Tianjia Zhang

Subject: Re: [PATCH] arm64/sme: Add hwcap for Scalable Matrix Extension

Hi Mark,

On 4/19/22 9:58 PM, Mark Brown wrote:
> On Fri, Apr 15, 2022 at 10:25:33AM +0800, Tianjia Zhang wrote:
>> On 4/14/22 8:02 PM, Mark Brown wrote:
>>> On Thu, Apr 14, 2022 at 07:55:44PM +0800, Tianjia Zhang wrote:
>
>>> Why add something independently, especially given that there is no way
>>> for userspace to do anything constructive with the feature without the
>>> rest of the kernel support? Any attempt to use SME instructions without
>>> kernel support will trap and generate a SIGILL even if the feature is
>>> present in hardware.
>
>> Great job, I encountered the issue of invalid REVD (requires FEAT_SME)
>> instruction when developing SVE2 programs, so I plan to gradually
>> support SME in the kernel, thanks for your contribution, you can ignore
>> my patch.
>
> I see. Unfortunately all the new registers mean that we really need to
> define all the ABI as soon as we enable anything and the only thing we
> can really skip out on when doing initial enablement is KVM (which I
> have in fact skipped for the time being, I'll look at that at some point
> after the initial support is landed).
>
>> In addition, I would like to ask a question, whether there is an
>> alternative SVE2 instruction for the REVD instruction that can complete
>> this operation, if the machine does not support SME.
>
> I'm not aware of anything, but I am mostly focused on the OS support
> rather than any of the actual mathematical operations that are more the
> point of these architecture features so I might be missing something.
>
>>> Do you have a system with SME that you're trying to use? Review/testing
>>> on the current series would be appreciated.
>
>> Unfortunately, the value currently read by my machine ID_AA64PFR1_EL1
>> register is 0x121. It seems that the hardware does not support SME. Is
>> there any other help I can provide?
>
> Other than verifying that the series doesn't cause trouble for systems
> without SME

Thanks for your reply. I have indirectly implemented the functionality
of the REVD instruction using the tbl instruction on a machine that does
not support SME.
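
Editor's note: the thread does not show the actual implementation, so the
following is only a scalar C model of the idea, under the assumption that
REVD swaps the two 64-bit doublewords inside each 128-bit quadword and
that predication is ignored. The TBL-style lookup is modelled with an
index vector of i ^ 1; all names here are hypothetical, and real code
would use SVE instructions or intrinsics instead.

```c
#include <stddef.h>
#include <stdint.h>

/* Upper bound on 64-bit lanes: the largest SVE vector is 2048 bits. */
#define MAX_LANES 32

/* Scalar model of a TBL-style lookup: out-of-range indices yield zero.
 * dst must not alias src. */
void tbl_u64(uint64_t *dst, const uint64_t *src, const uint64_t *idx,
	     size_t lanes)
{
	for (size_t i = 0; i < lanes; i++)
		dst[i] = idx[i] < lanes ? src[idx[i]] : 0;
}

/* Swap the doublewords of every 128-bit quadword via the lookup above.
 * Returns 0 on success, -1 if the lane count is zero, odd or too large. */
int revd_via_tbl(uint64_t *dst, const uint64_t *src, size_t lanes)
{
	uint64_t idx[MAX_LANES];

	if (lanes == 0 || lanes > MAX_LANES || (lanes & 1))
		return -1;

	for (size_t i = 0; i < lanes; i++)
		idx[i] = i ^ 1;	/* pairs up lanes 0<->1, 2<->3, ... */

	tbl_u64(dst, src, idx, lanes);
	return 0;
}
```

The index vector is the only REVD-specific part; the same lookup with a
different index pattern would model other permutations.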

As for this patch series, I will run some tests later. That may take a
while, as I currently have no dedicated machine at hand.

Best regards,
Tianjia