From: Kyung Min Park
To: x86@kernel.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org
Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, hpa@zytor.com, pbonzini@redhat.com, sean.j.christopherson@intel.com, jmattson@google.com, joro@8bytes.org, vkuznets@redhat.com, wanpengli@tencent.com, kyung.min.park@intel.com, cathy.zhang@intel.com
Subject: [PATCH 1/2] Enumerate AVX512 FP16
CPUID feature flag
Date: Mon, 7 Dec 2020 19:34:40 -0800
Message-Id: <20201208033441.28207-2-kyung.min.park@intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20201208033441.28207-1-kyung.min.park@intel.com>
References: <20201208033441.28207-1-kyung.min.park@intel.com>
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

Enumerate the AVX512 half-precision floating point (FP16) CPUID feature
flag. Compared with FP32, FP16 halves the number of bits required for
storage, reducing the exponent from 8 bits to 5 and the mantissa from
23 bits to 10. FP16 also lets developers train and run inference on
deep learning models faster when the full precision or magnitude of
FP32 is not needed.

A processor supports AVX512 FP16 if CPUID.(EAX=7,ECX=0):EDX[bit 23]
is set. AVX512 FP16 requires the AVX512BW feature to be implemented,
since the instructions for manipulating 32-bit masks are associated
with AVX512BW.

The only in-kernel usage of this is KVM passthrough. The CPU feature
flag is shown as "avx512_fp16" in /proc/cpuinfo.

Signed-off-by: Kyung Min Park
Acked-by: Dave Hansen
Reviewed-by: Tony Luck
---
 arch/x86/include/asm/cpufeatures.h | 1 +
 arch/x86/kernel/cpu/cpuid-deps.c   | 1 +
 2 files changed, 2 insertions(+)

diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h
index b6b9b3407c22..bec37ec7101e 100644
--- a/arch/x86/include/asm/cpufeatures.h
+++ b/arch/x86/include/asm/cpufeatures.h
@@ -375,6 +375,7 @@
 #define X86_FEATURE_TSXLDTRK		(18*32+16) /* TSX Suspend Load Address Tracking */
 #define X86_FEATURE_PCONFIG		(18*32+18) /* Intel PCONFIG */
 #define X86_FEATURE_ARCH_LBR		(18*32+19) /* Intel ARCH LBR */
+#define X86_FEATURE_AVX512_FP16		(18*32+23) /* AVX512 FP16 */
 #define X86_FEATURE_SPEC_CTRL		(18*32+26) /* "" Speculation Control (IBRS + IBPB) */
 #define X86_FEATURE_INTEL_STIBP		(18*32+27) /* "" Single Thread Indirect Branch Predictors */
 #define X86_FEATURE_FLUSH_L1D		(18*32+28) /* Flush L1D cache */
diff --git a/arch/x86/kernel/cpu/cpuid-deps.c b/arch/x86/kernel/cpu/cpuid-deps.c
index d502241995a3..42af31b64c2c 100644
--- a/arch/x86/kernel/cpu/cpuid-deps.c
+++ b/arch/x86/kernel/cpu/cpuid-deps.c
@@ -69,6 +69,7 @@ static const struct cpuid_dep cpuid_deps[] = {
 	{ X86_FEATURE_CQM_MBM_TOTAL,		X86_FEATURE_CQM_LLC   },
 	{ X86_FEATURE_CQM_MBM_LOCAL,		X86_FEATURE_CQM_LLC   },
 	{ X86_FEATURE_AVX512_BF16,		X86_FEATURE_AVX512VL  },
+	{ X86_FEATURE_AVX512_FP16,		X86_FEATURE_AVX512BW  },
 	{ X86_FEATURE_ENQCMD,			X86_FEATURE_XSAVES    },
 	{ X86_FEATURE_PER_THREAD_MBA,		X86_FEATURE_MBA       },
 	{}
-- 
2.17.1