2020-12-08 04:12:47

Subject: [PATCH 0/2] Enumerate and expose AVX512_FP16 feature

Introduce AVX512_FP16 feature and expose it to KVM CPUID for processors
that support it. KVM reports this information and guests can make use
of it.

Detailed information on the instruction and CPUID feature flag can be found
in the latest "extensions" manual [1].

Reference:
[1]. https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html

Cathy Zhang (1):
x86: Expose AVX512_FP16 for supported CPUID

Kyung Min Park (1):
Enumerate AVX512 FP16 CPUID feature flag

arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/kernel/cpu/cpuid-deps.c | 1 +
arch/x86/kvm/cpuid.c | 2 +-
3 files changed, 3 insertions(+), 1 deletion(-)

--
2.17.1

2020-12-08 04:12:52

by Kyung Min Park

Subject: [PATCH 1/2] Enumerate AVX512 FP16 CPUID feature flag

Enumerate AVX512 Half-precision floating point (FP16) CPUID feature
flag. Compared with FP32, FP16 cuts the number of bits required for
storage in half, reducing the exponent from 8 bits to 5 and the
mantissa from 23 bits to 10. FP16 also lets developers train and run
inference on deep learning models faster when the full precision or
magnitude of FP32 is not needed.
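
For illustration, the 1/5/10 bit split described above can be decoded in
user space roughly as follows. This is a minimal sketch and not part of the
patch: the helper names fp16_to_float and scale_pow2 are hypothetical, and
the code assumes the IEEE 754 binary16 layout (exponent bias 15).

```c
#include <math.h>    /* NAN and INFINITY macros only */
#include <stdint.h>

/* Multiply v by 2^e using exact float operations (hypothetical helper). */
static float scale_pow2(float v, int e)
{
    for (; e > 0; e--)
        v *= 2.0f;
    for (; e < 0; e++)
        v *= 0.5f;
    return v;
}

/* Decode a binary16 bit pattern: 1 sign bit, 5 exponent bits, 10 mantissa bits. */
static float fp16_to_float(uint16_t h)
{
    int sign = (h >> 15) & 0x1;
    int exp  = (h >> 10) & 0x1f;   /* 5-bit exponent, bias 15 */
    int man  = h & 0x3ff;          /* 10-bit mantissa */
    float value;

    if (exp == 0)                  /* zero or subnormal: man * 2^-24 */
        value = scale_pow2((float)man, -24);
    else if (exp == 0x1f)          /* all-ones exponent: infinity or NaN */
        value = man ? NAN : INFINITY;
    else                           /* normal: (1024 + man) * 2^(exp - 25) */
        value = scale_pow2((float)(man | 0x400), exp - 25);

    return sign ? -value : value;
}
```

With 5 exponent bits the largest finite value is 65504 (bit pattern 0x7bff),
which is the "magnitude" limit the changelog refers to.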

A processor supports AVX512 FP16 if CPUID.(EAX=7,ECX=0):EDX[bit 23]
is set. AVX512 FP16 requires the AVX512BW feature to be implemented,
since the instructions for manipulating 32-bit masks are associated
with AVX512BW.
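
As a rough user-space illustration of the enumeration rule above (not kernel
code and not part of the patch), the two bits can be checked with the
compiler-provided <cpuid.h>. The function names are hypothetical. The
AVX512BW position, CPUID.(EAX=7,ECX=0):EBX[bit 30], is not stated in this
changelog and is taken from Intel's documentation. The sketch only looks at
the enumeration bits; it does not check whether the OS has enabled AVX-512
register state.

```c
#include <stdbool.h>
#include <stdint.h>

#if defined(__x86_64__) || defined(__i386__)
#include <cpuid.h>   /* GCC/Clang helper: __get_cpuid_count() */
#endif

#define LEAF7_EBX_AVX512BW    (UINT32_C(1) << 30)
#define LEAF7_EDX_AVX512_FP16 (UINT32_C(1) << 23)

/* Pure check on the EBX/EDX values returned by CPUID leaf 7, subleaf 0. */
static bool leaf7_has_avx512_fp16(uint32_t ebx, uint32_t edx)
{
    return (edx & LEAF7_EDX_AVX512_FP16) && (ebx & LEAF7_EBX_AVX512BW);
}

/* Query the running CPU; reports false on non-x86 builds. */
static bool cpu_has_avx512_fp16(void)
{
#if defined(__x86_64__) || defined(__i386__)
    unsigned int eax, ebx, ecx, edx;

    /* Returns 0 if leaf 7 is not supported by this CPU. */
    if (!__get_cpuid_count(7, 0, &eax, &ebx, &ecx, &edx))
        return false;
    return leaf7_has_avx512_fp16(ebx, edx);
#else
    return false;
#endif
}
```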

The only in-kernel usage of this is KVM passthrough. The CPU feature
flag is shown as "avx512_fp16" in /proc/cpuinfo.

Signed-off-by: Kyung Min Park <[email protected]>
Acked-by: Dave Hansen <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
---
arch/x86/include/asm/cpufeatures.h | 1 +
arch/x86/kernel/cpu/cpuid-deps.c | 1 +
2 files changed, 2 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index b6b9b3407c22..bec37ec7101e 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -375,6 +375,7 @@
#define X86_FEATURE_TSXLDTRK (18*32+16) /* TSX Suspend Load Address Tracking */
#define X86_FEATURE_PCONFIG (18*32+18) /* Intel PCONFIG */
#define X86_FEATURE_ARCH_LBR (18*32+19) /* Intel ARCH LBR */
+#define X86_FEATURE_AVX512_FP16 (18*32+23) /* AVX512 FP16 */
#define X86_FEATURE_SPEC_CTRL (18*32+26) /* "" Speculation Control (IBRS + IBPB) */
#define X86_FEATURE_INTEL_STIBP (18*32+27) /* "" Single Thread Indirect Branch Predictors */
#define X86_FEATURE_FLUSH_L1D (18*32+28) /* Flush L1D cache */
diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
index d502241995a3..42af31b64c2c 100644
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -69,6 +69,7 @@ static const struct cpuid_dep cpuid_deps[] = {
{ X86_FEATURE_CQM_MBM_TOTAL, X86_FEATURE_CQM_LLC },
{ X86_FEATURE_CQM_MBM_LOCAL, X86_FEATURE_CQM_LLC },
{ X86_FEATURE_AVX512_BF16, X86_FEATURE_AVX512VL },
+ { X86_FEATURE_AVX512_FP16, X86_FEATURE_AVX512BW },
{ X86_FEATURE_ENQCMD, X86_FEATURE_XSAVES },
{ X86_FEATURE_PER_THREAD_MBA, X86_FEATURE_MBA },
{}
--
2.17.1
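
To make the two hunks above concrete: the value (18*32+23) packs a feature
word index and a bit position into one integer, and the cpuid_deps[] entry
records that AVX512_FP16 depends on AVX512BW. The stand-alone sketch below
mimics both. It is a simplified model rather than the kernel's
implementation: feature_word, feature_bit and find_dependency are
hypothetical helpers, and the AVX512BW value (9*32+30) comes from mainline
cpufeatures.h, not from this patch.

```c
#include <stddef.h>

#define X86_FEATURE_AVX512BW    ( 9*32+30) /* assumed from mainline, not in the patch */
#define X86_FEATURE_AVX512_FP16 (18*32+23) /* value added by the patch */

/* Each feature number encodes (word * 32 + bit). */
static int feature_word(int feature) { return feature / 32; }
static int feature_bit(int feature)  { return feature % 32; }

struct cpuid_dep {
    int feature;
    int depends;
};

/* One-entry stand-in for the kernel table; {0, 0} terminates the list. */
static const struct cpuid_dep deps[] = {
    { X86_FEATURE_AVX512_FP16, X86_FEATURE_AVX512BW },
    { 0, 0 }
};

/* Return the feature that 'feature' depends on, or -1 if none is listed. */
static int find_dependency(int feature)
{
    const struct cpuid_dep *d;

    for (d = deps; d->feature; d++) {
        if (d->feature == feature)
            return d->depends;
    }
    return -1;
}
```

Here feature_word(X86_FEATURE_AVX512_FP16) is 18 and feature_bit() is 23,
matching EDX bit 23 of CPUID leaf 7 quoted in the changelog.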

2020-12-08 04:12:57

by Kyung Min Park

Subject: [PATCH 2/2] x86: Expose AVX512_FP16 for supported CPUID

From: Cathy Zhang <[email protected]>

AVX512_FP16 is supported by Intel processors such as Sapphire Rapids.
It can improve performance because it is faster than FP32 in cases where
FP16 still meets the precision or magnitude requirements. Its availability
is indicated by CPUID.(EAX=7,ECX=0):EDX[bit 23].

Expose it in KVM's supported CPUID so that guests can make use of it.

Signed-off-by: Cathy Zhang <[email protected]>
Signed-off-by: Kyung Min Park <[email protected]>
Acked-by: Dave Hansen <[email protected]>
Reviewed-by: Tony Luck <[email protected]>
---
arch/x86/kvm/cpuid.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
index e83bfe2daf82..d7707cfc9401 100644
--- a/arch/x86/kvm/cpuid.c
+++ b/arch/x86/kvm/cpuid.c
@@ -416,7 +416,7 @@ void kvm_set_cpu_caps(void)
F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(SPEC_CTRL) |
F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP) |
F(MD_CLEAR) | F(AVX512_VP2INTERSECT) | F(FSRM) |
- F(SERIALIZE) | F(TSXLDTRK)
+ F(SERIALIZE) | F(TSXLDTRK) | F(AVX512_FP16)
);

/* TSC_ADJUST and ARCH_CAPABILITIES are emulated in software. */
--
2.17.1
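
As a rough model of what the one-line change above does: a feature bit
reaches the guest only if it is both in KVM's mask for that CPUID word and
reported by the host. The sketch below is a simplified stand-in, not KVM's
actual code; the F() definition and guest_visible_edx() are assumptions made
for illustration, and only the two feature values shown in the patches are
used.

```c
#include <stdint.h>

#define X86_FEATURE_TSXLDTRK    (18*32+16)
#define X86_FEATURE_AVX512_FP16 (18*32+23)

/* Simplified stand-in for KVM's F(): the bit position within a 32-bit word. */
#define F(name) (UINT32_C(1) << (X86_FEATURE_##name & 31))

/* A bit is guest-visible only if KVM allows it and the host CPU reports it. */
static uint32_t guest_visible_edx(uint32_t host_edx, uint32_t kvm_mask)
{
    return host_edx & kvm_mask;
}
```

For example, with a host that reports both bits, a mask of F(TSXLDTRK) alone
hides AVX512_FP16 from the guest, while F(TSXLDTRK) | F(AVX512_FP16) lets it
through.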

2020-12-08 09:31:29

by Borislav Petkov

Subject: Re: [PATCH 1/2] Enumerate AVX512 FP16 CPUID feature flag

On Mon, Dec 07, 2020 at 07:34:40PM -0800, Kyung Min Park wrote:
> Enumerate AVX512 Half-precision floating point (FP16) CPUID feature
> flag. Compared with FP32, FP16 cuts the number of bits required for
> storage in half, reducing the exponent from 8 bits to 5 and the
> mantissa from 23 bits to 10. FP16 also lets developers train and run
> inference on deep learning models faster when the full precision or
> magnitude of FP32 is not needed.
>
> A processor supports AVX512 FP16 if CPUID.(EAX=7,ECX=0):EDX[bit 23]
> is set. AVX512 FP16 requires the AVX512BW feature to be implemented,
> since the instructions for manipulating 32-bit masks are associated
> with AVX512BW.
>
> The only in-kernel usage of this is KVM passthrough. The CPU feature
> flag is shown as "avx512_fp16" in /proc/cpuinfo.
>
> Signed-off-by: Kyung Min Park <[email protected]>
> Acked-by: Dave Hansen <[email protected]>
> Reviewed-by: Tony Luck <[email protected]>
> ---
> arch/x86/include/asm/cpufeatures.h | 1 +
> arch/x86/kernel/cpu/cpuid-deps.c | 1 +
> 2 files changed, 2 insertions(+)
>
> diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
> index b6b9b3407c22..bec37ec7101e 100644
> --- a/arch/x86/include/asm/cpufeatures.h
> +++ b/arch/x86/include/asm/cpufeatures.h
> @@ -375,6 +375,7 @@
> #define X86_FEATURE_TSXLDTRK (18*32+16) /* TSX Suspend Load Address Tracking */
> #define X86_FEATURE_PCONFIG (18*32+18) /* Intel PCONFIG */
> #define X86_FEATURE_ARCH_LBR (18*32+19) /* Intel ARCH LBR */
> +#define X86_FEATURE_AVX512_FP16 (18*32+23) /* AVX512 FP16 */
> #define X86_FEATURE_SPEC_CTRL (18*32+26) /* "" Speculation Control (IBRS + IBPB) */
> #define X86_FEATURE_INTEL_STIBP (18*32+27) /* "" Single Thread Indirect Branch Predictors */
> #define X86_FEATURE_FLUSH_L1D (18*32+28) /* Flush L1D cache */
> diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
> index d502241995a3..42af31b64c2c 100644
> --- a/arch/x86/kernel/cpu/cpuid-deps.c
> +++ b/arch/x86/kernel/cpu/cpuid-deps.c
> @@ -69,6 +69,7 @@ static const struct cpuid_dep cpuid_deps[] = {
> { X86_FEATURE_CQM_MBM_TOTAL, X86_FEATURE_CQM_LLC },
> { X86_FEATURE_CQM_MBM_LOCAL, X86_FEATURE_CQM_LLC },
> { X86_FEATURE_AVX512_BF16, X86_FEATURE_AVX512VL },
> + { X86_FEATURE_AVX512_FP16, X86_FEATURE_AVX512BW },
> { X86_FEATURE_ENQCMD, X86_FEATURE_XSAVES },
> { X86_FEATURE_PER_THREAD_MBA, X86_FEATURE_MBA },
> {}
> --

Acked-by: Borislav Petkov <[email protected]>

Paolo, you can pick those up if you prefer.

Thx.

--
Regards/Gruss,
Boris.

https://people.kernel.org/tglx/notes-about-netiquette

2020-12-11 10:48:17

by Sean Christopherson

Subject: Re: [PATCH 2/2] x86: Expose AVX512_FP16 for supported CPUID

Shortlog should use "KVM: x86: ...", and probably s/for/in. It currently reads
like the kernel is exposing the flag to KVM for KVM's supported CPUID, e.g.:

KVM: x86: Expose AVX512_FP16 in supported CPUID

On Mon, Dec 07, 2020, Kyung Min Park wrote:
> From: Cathy Zhang <[email protected]>
>
> AVX512_FP16 is supported by Intel processors such as Sapphire Rapids.
> It can improve performance because it is faster than FP32 in cases where
> FP16 still meets the precision or magnitude requirements. Its availability
> is indicated by CPUID.(EAX=7,ECX=0):EDX[bit 23].
>
> Expose it in KVM's supported CPUID so that guests can make use of it.

For new features like this that don't require additional KVM enabling, it would
be nice to explicitly state as much in the changelog, along with a brief
explanation of why additional KVM enabling is not necessary. It doesn't have to
be much, just something to help people that aren't already familiar with FP16
understand what this patch actually exposes to the guest. E.g. I assume there
are new instructions that are available with FP16?

> Signed-off-by: Cathy Zhang <[email protected]>
> Signed-off-by: Kyung Min Park <[email protected]>
> Acked-by: Dave Hansen <[email protected]>
> Reviewed-by: Tony Luck <[email protected]>
> ---
> arch/x86/kvm/cpuid.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/arch/x86/kvm/cpuid.c b/arch/x86/kvm/cpuid.c
> index e83bfe2daf82..d7707cfc9401 100644
> --- a/arch/x86/kvm/cpuid.c
> +++ b/arch/x86/kvm/cpuid.c
> @@ -416,7 +416,7 @@ void kvm_set_cpu_caps(void)
> F(AVX512_4VNNIW) | F(AVX512_4FMAPS) | F(SPEC_CTRL) |
> F(SPEC_CTRL_SSBD) | F(ARCH_CAPABILITIES) | F(INTEL_STIBP) |
> F(MD_CLEAR) | F(AVX512_VP2INTERSECT) | F(FSRM) |
> - F(SERIALIZE) | F(TSXLDTRK)
> + F(SERIALIZE) | F(TSXLDTRK) | F(AVX512_FP16)
> );
>
> /* TSC_ADJUST and ARCH_CAPABILITIES are emulated in software. */
> --
> 2.17.1
>

2020-12-13 21:22:47

by Paolo Bonzini

Subject: Re: [PATCH 0/2] Enumerate and expose AVX512_FP16 feature

On 08/12/20 04:34, Kyung Min Park wrote:
> Introduce AVX512_FP16 feature and expose it to KVM CPUID for processors
> that support it. KVM reports this information and guests can make use
> of it.
>
> Detailed information on the instruction and CPUID feature flag can be found
> in the latest "extensions" manual [1].
>
> Reference:
> [1]. https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html
>
> Cathy Zhang (1):
> x86: Expose AVX512_FP16 for supported CPUID
>
> Kyung Min Park (1):
> Enumerate AVX512 FP16 CPUID feature flag
>
> arch/x86/include/asm/cpufeatures.h | 1 +
> arch/x86/kernel/cpu/cpuid-deps.c | 1 +
> arch/x86/kvm/cpuid.c | 2 +-
> 3 files changed, 3 insertions(+), 1 deletion(-)
>

Queued, with adjusted commit message according to Sean's review.

Paolo

2020-12-15 02:24:19

by Zhang, Cathy

Subject: Re: [PATCH 0/2] Enumerate and expose AVX512_FP16 feature

Thanks Paolo and Sean! Sorry for the delayed response. I'm back from
vacation and have just seen Sean's comment, and I see Paolo has already
made the changes. Thanks a bunch!

On 12/12/2020 7:42 AM, Paolo Bonzini wrote:
> On 08/12/20 04:34, Kyung Min Park wrote:
>> Introduce AVX512_FP16 feature and expose it to KVM CPUID for processors
>> that support it. KVM reports this information and guests can make use
>> of it.
>>
>> Detailed information on the instruction and CPUID feature flag can be found
>> in the latest "extensions" manual [1].
>>
>> Reference:
>> [1]. https://software.intel.com/content/www/us/en/develop/download/intel-architecture-instruction-set-extensions-programming-reference.html
>>
>> Cathy Zhang (1):
>>    x86: Expose AVX512_FP16 for supported CPUID
>>
>> Kyung Min Park (1):
>>    Enumerate AVX512 FP16 CPUID feature flag
>>
>> arch/x86/include/asm/cpufeatures.h | 1 +
>> arch/x86/kernel/cpu/cpuid-deps.c   | 1 +
>> arch/x86/kvm/cpuid.c               | 2 +-
>> 3 files changed, 3 insertions(+), 1 deletion(-)
>>
>
> Queued, with adjusted commit message according to Sean's review.
>
> Paolo
>