A couple of test updates are included:
* With this option [1,2], the kernel's altstack check becomes stringent.
The x86 sigaltstack test is ignorant about this. Adjust the test now.
This check was established [3] to ensure every AMX task's altstack is
sufficient (regardless of that option) [4].
* The AMX test wrongly fails on non-AMX machines. Fix the code to skip the
test instead.
The series is available in this repository:
git://github.com/intel/amx-linux.git selftest
[1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/Kconfig#n2432
[2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/kernel-parameters.txt#n5676
[3] 3aac3ebea08f ("x86/signal: Implement sigaltstack size validation")
[4] 4b7ca609a33d ("x86/signal: Use fpu::__state_user_size for sigalt stack validation")
Chang S. Bae (2):
selftests/x86/signal: Adjust the test to the kernel's altstack check
selftests/x86/amx: Fix the test to avoid failure when AMX is
unavailable
tools/testing/selftests/x86/amx.c | 42 +++++++++++++++++------
tools/testing/selftests/x86/sigaltstack.c | 12 ++++++-
2 files changed, 42 insertions(+), 12 deletions(-)
base-commit: f443e374ae131c168a065ea1748feac6b2e76613
--
2.17.1
When a CPU does not have AMX, the test fails. But this is wrong as it
should be runnable regardless. Skip the test instead.
Reported-by: Thomas Gleixner <[email protected]>
Fixes: 6a3e0651b4a ("selftests/x86/amx: Add test cases for AMX state management")
Signed-off-by: Chang S. Bae <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
tools/testing/selftests/x86/amx.c | 42 +++++++++++++++++++++++--------
1 file changed, 31 insertions(+), 11 deletions(-)
diff --git a/tools/testing/selftests/x86/amx.c b/tools/testing/selftests/x86/amx.c
index 3615ef4a48bb..14abb6072a7d 100644
--- a/tools/testing/selftests/x86/amx.c
+++ b/tools/testing/selftests/x86/amx.c
@@ -106,6 +106,12 @@ static void clearhandler(int sig)
#define CPUID_LEAF1_ECX_XSAVE_MASK (1 << 26)
#define CPUID_LEAF1_ECX_OSXSAVE_MASK (1 << 27)
+
+static struct {
+ unsigned xsave: 1;
+ unsigned osxsave: 1;
+} cpuinfo;
+
static inline void check_cpuid_xsave(void)
{
uint32_t eax, ebx, ecx, edx;
@@ -118,10 +124,8 @@ static inline void check_cpuid_xsave(void)
eax = 1;
ecx = 0;
cpuid(&eax, &ebx, &ecx, &edx);
- if (!(ecx & CPUID_LEAF1_ECX_XSAVE_MASK))
- fatal_error("cpuid: no CPU xsave support");
- if (!(ecx & CPUID_LEAF1_ECX_OSXSAVE_MASK))
- fatal_error("cpuid: no OS xsave support");
+ cpuinfo.xsave = !!(ecx & CPUID_LEAF1_ECX_XSAVE_MASK);
+ cpuinfo.osxsave = !!(ecx & CPUID_LEAF1_ECX_OSXSAVE_MASK);
}
static uint32_t xbuf_size;
@@ -161,14 +165,31 @@ static void check_cpuid_xtiledata(void)
* eax: XTILEDATA state component size
* ebx: XTILEDATA state component offset in user buffer
*/
- if (!eax || !ebx)
- fatal_error("xstate cpuid: invalid tile data size/offset: %d/%d",
- eax, ebx);
-
xtiledata.size = eax;
xtiledata.xbuf_offset = ebx;
}
+static bool amx_available(void)
+{
+ check_cpuid_xsave();
+ if (!cpuinfo.xsave) {
+ printf("[SKIP]\tcpuid: no CPU xsave support\n");
+ return false;
+ } else if (!cpuinfo.osxsave) {
+ printf("[SKIP]\tcpuid: no OS xsave support\n");
+ return false;
+ }
+
+ check_cpuid_xtiledata();
+ if (!xtiledata.size || !xtiledata.xbuf_offset) {
+ printf("[SKIP]\txstate cpuid: no tile data (size/offset: %d/%d)\n",
+ xtiledata.size, xtiledata.xbuf_offset);
+ return false;
+ }
+
+ return true;
+}
+
/* The helpers for managing XSAVE buffer and tile states: */
struct xsave_buffer *alloc_xbuf(void)
@@ -826,9 +847,8 @@ static void test_context_switch(void)
int main(void)
{
- /* Check hardware availability at first */
- check_cpuid_xsave();
- check_cpuid_xtiledata();
+ if (!amx_available())
+ return 0;
init_stashed_xsave();
sethandler(SIGILL, handle_noperm, 0);
--
2.17.1
The test assumes an insufficient altstack is allowed. Then it raises a
signal to test the delivery failure due to an altstack overflow.
The kernel now provides an option to tweak sigaltstack()'s sanity check to
prevent an insufficient altstack. ENOMEM is returned on the check failure.
Adjust the code to skip the test when this option is on.
Signed-off-by: Chang S. Bae <[email protected]>
Cc: [email protected]
Cc: [email protected]
---
tools/testing/selftests/x86/sigaltstack.c | 12 +++++++++++-
1 file changed, 11 insertions(+), 1 deletion(-)
diff --git a/tools/testing/selftests/x86/sigaltstack.c b/tools/testing/selftests/x86/sigaltstack.c
index f689af75e979..22a88b764a8e 100644
--- a/tools/testing/selftests/x86/sigaltstack.c
+++ b/tools/testing/selftests/x86/sigaltstack.c
@@ -88,8 +88,18 @@ static void sigalrm(int sig, siginfo_t *info, void *ctx_void)
static void test_sigaltstack(void *altstack, unsigned long size)
{
- if (setup_altstack(altstack, size))
+ if (setup_altstack(altstack, size)) {
+ /*
+ * The kernel may return ENOMEM when the altstack size
+ * is insufficient. Skip the test in this case.
+ */
+ if (errno == ENOMEM && size < at_minstack_size) {
+ printf("[SKIP]\tThe running kernel disallows an insufficient size.\n");
+ return;
+ }
+
err(1, "sigaltstack()");
+ }
sigalrm_expected = (size > at_minstack_size) ? true : false;
--
2.17.1
On 4/1/22 4:10 PM, Chang S. Bae wrote:
> The test assumes an insufficient altstack is allowed. Then it raises a
> signal to test the delivery failure due to an altstack overflow.
>
> The kernel now provides an option to tweak sigaltstack()'s sanity check to
> prevent an insufficient altstack. ENOMEM is returned on the check failure.
>
> Adjust the code to skip the test when this option is on.
Mention the option name here and in the Skip message.
>
> Signed-off-by: Chang S. Bae <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> ---
> tools/testing/selftests/x86/sigaltstack.c | 12 +++++++++++-
> 1 file changed, 11 insertions(+), 1 deletion(-)
>
> diff --git a/tools/testing/selftests/x86/sigaltstack.c b/tools/testing/selftests/x86/sigaltstack.c
> index f689af75e979..22a88b764a8e 100644
> --- a/tools/testing/selftests/x86/sigaltstack.c
> +++ b/tools/testing/selftests/x86/sigaltstack.c
> @@ -88,8 +88,18 @@ static void sigalrm(int sig, siginfo_t *info, void *ctx_void)
>
> static void test_sigaltstack(void *altstack, unsigned long size)
> {
> - if (setup_altstack(altstack, size))
> + if (setup_altstack(altstack, size)) {
> + /*
> + * The kernel may return ENOMEM when the altstack size
> + * is insufficient. Skip the test in this case.
> + */
> + if (errno == ENOMEM && size < at_minstack_size) {
> + printf("[SKIP]\tThe running kernel disallows an insufficient size.\n");
Please improve this message to clearly why it is okay to skip the test.
Mention that the option to disallowing insufficient is enabled and that
the test can't be run.
> + return;
> + }
> +
> err(1, "sigaltstack()");
> + }
>
> sigalrm_expected = (size > at_minstack_size) ? true : false;
>
>
With these changes:
Reviewed-by: Shuah Khan <[email protected]>
thanks,
-- Shuah
On 4/1/22 4:10 PM, Chang S. Bae wrote:
> When a CPU does not have AMX, the test fails. But this is wrong as it
> should be runnable regardless. Skip the test instead.
>
> Reported-by: Thomas Gleixner <[email protected]>
> Fixes: 6a3e0651b4a ("selftests/x86/amx: Add test cases for AMX state management")
> Signed-off-by: Chang S. Bae <[email protected]>
> Cc: [email protected]
> Cc: [email protected]
> ---
> tools/testing/selftests/x86/amx.c | 42 +++++++++++++++++++++++--------
> 1 file changed, 31 insertions(+), 11 deletions(-)
>
> diff --git a/tools/testing/selftests/x86/amx.c b/tools/testing/selftests/x86/amx.c
> index 3615ef4a48bb..14abb6072a7d 100644
> --- a/tools/testing/selftests/x86/amx.c
> +++ b/tools/testing/selftests/x86/amx.c
> @@ -106,6 +106,12 @@ static void clearhandler(int sig)
>
> #define CPUID_LEAF1_ECX_XSAVE_MASK (1 << 26)
> #define CPUID_LEAF1_ECX_OSXSAVE_MASK (1 << 27)
> +
> +static struct {
> + unsigned xsave: 1;
> + unsigned osxsave: 1;
> +} cpuinfo;
> +
Why is this needed? Also naming this cpuinfo is confuing.
> static inline void check_cpuid_xsave(void)
> {
> uint32_t eax, ebx, ecx, edx;
> @@ -118,10 +124,8 @@ static inline void check_cpuid_xsave(void)
> eax = 1;
> ecx = 0;
> cpuid(&eax, &ebx, &ecx, &edx);
> - if (!(ecx & CPUID_LEAF1_ECX_XSAVE_MASK))
> - fatal_error("cpuid: no CPU xsave support");
> - if (!(ecx & CPUID_LEAF1_ECX_OSXSAVE_MASK))
> - fatal_error("cpuid: no OS xsave support");
> + cpuinfo.xsave = !!(ecx & CPUID_LEAF1_ECX_XSAVE_MASK);
> + cpuinfo.osxsave = !!(ecx & CPUID_LEAF1_ECX_OSXSAVE_MASK);
Why add this complexity. Why not just Skip here?
> }
>
> static uint32_t xbuf_size;
> @@ -161,14 +165,31 @@ static void check_cpuid_xtiledata(void)
> * eax: XTILEDATA state component size
> * ebx: XTILEDATA state component offset in user buffer
> */
> - if (!eax || !ebx)
> - fatal_error("xstate cpuid: invalid tile data size/offset: %d/%d",
> - eax, ebx);
> -
> xtiledata.size = eax;
> xtiledata.xbuf_offset = ebx;
> }
>
> +static bool amx_available(void)
> +{
> + check_cpuid_xsave();
> + if (!cpuinfo.xsave) {
> + printf("[SKIP]\tcpuid: no CPU xsave support\n");
> + return false;
> + } else if (!cpuinfo.osxsave) {
> + printf("[SKIP]\tcpuid: no OS xsave support\n");
> + return false;
> + }
> +
> + check_cpuid_xtiledata();
> + if (!xtiledata.size || !xtiledata.xbuf_offset) {
> + printf("[SKIP]\txstate cpuid: no tile data (size/offset: %d/%d)\n",
> + xtiledata.size, xtiledata.xbuf_offset);
> + return false;
> + }
> +
> + return true;
> +}
> +
I am not seeing any value in adding this layer of abstraction.
Keep it simple and do the handling in main()
> /* The helpers for managing XSAVE buffer and tile states: */
>
> struct xsave_buffer *alloc_xbuf(void)
> @@ -826,9 +847,8 @@ static void test_context_switch(void)
>
> int main(void)
> {
> - /* Check hardware availability at first */
> - check_cpuid_xsave();
> - check_cpuid_xtiledata();
> + if (!amx_available())
> + return 0;
This should KSFT_SKIP for this to be reported as a skip. Returning 0
will be reported as a Pass.
>
> init_stashed_xsave();
> sethandler(SIGILL, handle_noperm, 0);
>
thanks,
-- Shuah
On 4/1/22 4:10 PM, Chang S. Bae wrote:
> A couple of test updates are included:
>
> * With this option [1,2], the kernel's altstack check becomes stringent.
> The x86 sigaltstack test is ignorant about this. Adjust the test now.
> This check was established [3] to ensure every AMX task's altstack is
> sufficient (regardless of that option) [4].
>
> * The AMX test wrongly fails on non-AMX machines. Fix the code to skip the
> test instead.
>
> The series is available in this repository:
> git://github.com/intel/amx-linux.git selftest
>
> [1] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/arch/x86/Kconfig#n2432
> [2] https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/Documentation/admin-guide/kernel-parameters.txt#n5676
> [3] 3aac3ebea08f ("x86/signal: Implement sigaltstack size validation")
> [4] 4b7ca609a33d ("x86/signal: Use fpu::__state_user_size for sigalt stack validation")
>
> Chang S. Bae (2):
> selftests/x86/signal: Adjust the test to the kernel's altstack check
> selftests/x86/amx: Fix the test to avoid failure when AMX is
> unavailable
>
> tools/testing/selftests/x86/amx.c | 42 +++++++++++++++++------
> tools/testing/selftests/x86/sigaltstack.c | 12 ++++++-
> 2 files changed, 42 insertions(+), 12 deletions(-)
>
>
> base-commit: f443e374ae131c168a065ea1748feac6b2e76613
>
These are fixes and I can take them through kselftest once the review
comments are addressed and get an ack from x86 maintainers.
thanks,
-- Shuah
On 6/16/2022 3:54 PM, Shuah Khan wrote:
> On 4/1/22 4:10 PM, Chang S. Bae wrote:
>>
>> +
>> +static struct {
>> + unsigned xsave: 1;
>> + unsigned osxsave: 1;
>> +} cpuinfo;
>> +
>
> Why is this needed? Also naming this cpuinfo is confuing.
This came from the below CPUID check which seems to be moot.
>
>> static inline void check_cpuid_xsave(void)
>> {
>> uint32_t eax, ebx, ecx, edx;
>> @@ -118,10 +124,8 @@ static inline void check_cpuid_xsave(void)
>> eax = 1;
>> ecx = 0;
>> cpuid(&eax, &ebx, &ecx, &edx);
>> - if (!(ecx & CPUID_LEAF1_ECX_XSAVE_MASK))
>> - fatal_error("cpuid: no CPU xsave support");
>> - if (!(ecx & CPUID_LEAF1_ECX_OSXSAVE_MASK))
>> - fatal_error("cpuid: no OS xsave support");
>> + cpuinfo.xsave = !!(ecx & CPUID_LEAF1_ECX_XSAVE_MASK);
>> + cpuinfo.osxsave = !!(ecx & CPUID_LEAF1_ECX_OSXSAVE_MASK);
>
> Why add this complexity. Why not just Skip here?
I think these CPUID checks can go away with ARCH_GET_XCOMP_SUPP.
>
>> }
>> static uint32_t xbuf_size;
>> @@ -161,14 +165,31 @@ static void check_cpuid_xtiledata(void)
>> * eax: XTILEDATA state component size
>> * ebx: XTILEDATA state component offset in user buffer
>> */
>> - if (!eax || !ebx)
>> - fatal_error("xstate cpuid: invalid tile data size/offset:
>> %d/%d",
>> - eax, ebx);
>> -
>> xtiledata.size = eax;
>> xtiledata.xbuf_offset = ebx;
>> }
>> +static bool amx_available(void)
>> +{
>> + check_cpuid_xsave();
>> + if (!cpuinfo.xsave) {
>> + printf("[SKIP]\tcpuid: no CPU xsave support\n");
>> + return false;
>> + } else if (!cpuinfo.osxsave) {
>> + printf("[SKIP]\tcpuid: no OS xsave support\n");
>> + return false;
>> + }
>> +
>> + check_cpuid_xtiledata();
>> + if (!xtiledata.size || !xtiledata.xbuf_offset) {
>> + printf("[SKIP]\txstate cpuid: no tile data (size/offset:
>> %d/%d)\n",
>> + xtiledata.size, xtiledata.xbuf_offset);
>> + return false;
>> + }
>> +
>> + return true;
>> +}
>> +
>
> I am not seeing any value in adding this layer of abstraction.
> Keep it simple and do the handling in main()
Sure.
>
>> /* The helpers for managing XSAVE buffer and tile states: */
>> struct xsave_buffer *alloc_xbuf(void)
>> @@ -826,9 +847,8 @@ static void test_context_switch(void)
>> int main(void)
>> {
>> - /* Check hardware availability at first */
>> - check_cpuid_xsave();
>> - check_cpuid_xtiledata();
>> + if (!amx_available())
>> + return 0;
>
> This should KSFT_SKIP for this to be reported as a skip. Returning 0
> will be reported as a Pass.
I think that's a good point, thanks.
Now, along with the on-going documentation [1], this test code can be
simplified like the below changes, instead of having those cpuid functions:
diff --git a/tools/testing/selftests/x86/amx.c
b/tools/testing/selftests/x86/amx.c
index 625e42901237..83705c472a5c 100644
--- a/tools/testing/selftests/x86/amx.c
+++ b/tools/testing/selftests/x86/amx.c
@@ -348,6 +348,7 @@ enum expected_result { FAIL_EXPECTED,
SUCCESS_EXPECTED };
/* arch_prctl() and sigaltstack() test */
+#define ARCH_GET_XCOMP_SUPP 0x1021
#define ARCH_GET_XCOMP_PERM 0x1022
#define ARCH_REQ_XCOMP_PERM 0x1023
@@ -828,9 +829,14 @@ static void test_context_switch(void)
int main(void)
{
- /* Check hardware availability at first */
- check_cpuid_xsave();
- check_cpuid_xtiledata();
+ unsigned long features;
+ long rc;
+
+ rc = syscall(SYS_arch_prctl, ARCH_GET_XCOMP_SUPP, &features);
+ if (rc || (features & XFEATURE_MASK_XTILE) != XFEATURE_MASK_XTILE) {
+ printf("[SKIP]\tno AMX support\n");
+ exit(KSFT_FAIL);
+ }
init_stashed_xsave();
sethandler(SIGILL, handle_noperm, 0);
Thanks,
Chang
[1]
https://lore.kernel.org/lkml/[email protected]/
On 6/17/22 4:21 PM, Chang S. Bae wrote:
> On 6/16/2022 3:54 PM, Shuah Khan wrote:
>> On 4/1/22 4:10 PM, Chang S. Bae wrote:
>>>
>>
>> This should KSFT_SKIP for this to be reported as a skip. Returning 0
>> will be reported as a Pass.
>
> I think that's a good point, thanks.
>
> Now, along with the on-going documentation [1], this test code can be simplified like the below changes, instead of having those cpuid functions:
>
Simplifying is always good. Send me v2 and I will review it.
thanks,
-- Shuah