From: "Chang S. Bae"
To: bp@suse.de, luto@kernel.org, tglx@linutronix.de, mingo@kernel.org,
    x86@kernel.org
Cc: len.brown@intel.com, dave.hansen@intel.com, jing2.liu@intel.com,
    ravi.v.shankar@intel.com, linux-kernel@vger.kernel.org,
    chang.seok.bae@intel.com
Subject: [PATCH v5 20/28] x86/fpu/amx: Define AMX state components and have it used for boot-time checks
Date: Sun, 23 May 2021 12:32:51 -0700
Message-Id: <20210523193259.26200-21-chang.seok.bae@intel.com>
X-Mailer: git-send-email 2.17.1
In-Reply-To: <20210523193259.26200-1-chang.seok.bae@intel.com>
References: <20210523193259.26200-1-chang.seok.bae@intel.com>
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

Linux uses check_xstate_against_struct() to sanity check the size of
XSTATE-enabled features. AMX is an XSAVE-enabled feature, and its size is
not hard-coded but discoverable at run-time via CPUID.

The AMX state is composed of state components 17 and 18, both of which are
user state components. The first component is the XTILECFG state, a
64-byte tile-related control register. State component 18, called
XTILEDATA, contains the actual tile data, and its size varies by
implementation. The architectural maximum, as defined in
CPUID(0x1d, 1): EAX[15:0], is one byte less than 64KB. The first
implementation supports 8KB.

Check the XTILEDATA state size dynamically. The feature introduces the new
tile register, TMM. Define one register struct only and read the number of
registers from CPUID. Cross-check the overall size with CPUID again.

Signed-off-by: Chang S. Bae
Reviewed-by: Len Brown
Cc: x86@kernel.org
Cc: linux-kernel@vger.kernel.org
---
Changes from v4:
* Changed to return an error when tile data size mismatches. (Thomas
  Gleixner)
* Updated the function description and code comments.

Changes from v2:
* Updated the code comments.

Changes from v1:
* Rebased on the upstream kernel (5.10)
---
 arch/x86/include/asm/fpu/types.h  | 27 +++++++++++
 arch/x86/include/asm/fpu/xstate.h |  2 +
 arch/x86/kernel/fpu/xstate.c      | 74 +++++++++++++++++++++++++++++++
 3 files changed, 103 insertions(+)

diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index 387b96e0b643..4e620e006787 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -120,6 +120,9 @@ enum xfeature {
 	XFEATURE_RSRVD_COMP_13,
 	XFEATURE_RSRVD_COMP_14,
 	XFEATURE_LBR,
+	XFEATURE_RSRVD_COMP_16,
+	XFEATURE_XTILE_CFG,
+	XFEATURE_XTILE_DATA,
 
 	XFEATURE_MAX,
 };
@@ -136,11 +139,15 @@ enum xfeature {
 #define XFEATURE_MASK_PKRU		(1 << XFEATURE_PKRU)
 #define XFEATURE_MASK_PASID		(1 << XFEATURE_PASID)
 #define XFEATURE_MASK_LBR		(1 << XFEATURE_LBR)
+#define XFEATURE_MASK_XTILE_CFG		(1 << XFEATURE_XTILE_CFG)
+#define XFEATURE_MASK_XTILE_DATA	(1 << XFEATURE_XTILE_DATA)
 
 #define XFEATURE_MASK_FPSSE		(XFEATURE_MASK_FP | XFEATURE_MASK_SSE)
 #define XFEATURE_MASK_AVX512		(XFEATURE_MASK_OPMASK \
 					 | XFEATURE_MASK_ZMM_Hi256 \
 					 | XFEATURE_MASK_Hi16_ZMM)
+#define XFEATURE_MASK_XTILE		(XFEATURE_MASK_XTILE_DATA \
+					 | XFEATURE_MASK_XTILE_CFG)
 
 #define FIRST_EXTENDED_XFEATURE	XFEATURE_YMM
 
@@ -153,6 +160,9 @@ struct reg_256_bit {
 struct reg_512_bit {
 	u8	regbytes[512/8];
 };
+struct reg_1024_byte {
+	u8	regbytes[1024];
+};
 
 /*
  * State component 2:
@@ -255,6 +265,23 @@ struct arch_lbr_state {
 	u64 ler_to;
 	u64 ler_info;
 	struct lbr_entry		entries[];
+};
+
+/*
+ * State component 17: 64-byte tile configuration register.
+ */
+struct xtile_cfg {
+	u64				tcfg[8];
+} __packed;
+
+/*
+ * State component 18: 1KB tile data register.
+ * Each register represents 16 64-byte rows of the matrix
+ * data. But the number of registers depends on the actual
+ * implementation.
+ */
+struct xtile_data {
+	struct reg_1024_byte		tmm;
 } __packed;
 
 /*
diff --git a/arch/x86/include/asm/fpu/xstate.h b/arch/x86/include/asm/fpu/xstate.h
index 30f83eb0aef4..2c9156e4f799 100644
--- a/arch/x86/include/asm/fpu/xstate.h
+++ b/arch/x86/include/asm/fpu/xstate.h
@@ -13,6 +13,8 @@
 
 #define XSTATE_CPUID		0x0000000d
 
+#define TILE_CPUID		0x0000001d
+
 #define FXSAVE_SIZE	512
 
 #define XSAVE_HDR_SIZE	    64
diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c
index 544e35a9d777..0d10a9ec29da 100644
--- a/arch/x86/kernel/fpu/xstate.c
+++ b/arch/x86/kernel/fpu/xstate.c
@@ -42,6 +42,14 @@ static const char *xfeature_names[] =
 	"Protection Keys User registers",
 	"PASID state",
 	"unknown xstate feature"	,
+	"unknown xstate feature"	,
+	"unknown xstate feature"	,
+	"unknown xstate feature"	,
+	"unknown xstate feature"	,
+	"unknown xstate feature"	,
+	"AMX Tile config"		,
+	"AMX Tile data"			,
+	"unknown xstate feature"	,
 };
 
 static unsigned short xsave_cpuid_features[] __initdata = {
@@ -55,6 +63,8 @@ static unsigned short xsave_cpuid_features[] __initdata = {
 	[XFEATURE_PT_UNIMPLEMENTED_SO_FAR]	= X86_FEATURE_INTEL_PT,
 	[XFEATURE_PKRU]				= X86_FEATURE_PKU,
 	[XFEATURE_PASID]			= X86_FEATURE_ENQCMD,
+	[XFEATURE_XTILE_CFG]			= X86_FEATURE_AMX_TILE,
+	[XFEATURE_XTILE_DATA]			= X86_FEATURE_AMX_TILE,
 };
 
 /*
@@ -479,6 +489,8 @@ static void __init print_xstate_features(void)
 	print_xstate_feature(XFEATURE_MASK_Hi16_ZMM);
 	print_xstate_feature(XFEATURE_MASK_PKRU);
 	print_xstate_feature(XFEATURE_MASK_PASID);
+	print_xstate_feature(XFEATURE_MASK_XTILE_CFG);
+	print_xstate_feature(XFEATURE_MASK_XTILE_DATA);
 }
 
 /*
@@ -738,6 +750,63 @@ static void __xstate_dump_leaves(void)
 	}							\
 } while (0)
 
+/**
+ * check_xtile_data_against_struct() - Do a sanity check for tile data state size.
+ *
+ * Calculate the state size by multiplying the single tile size, which is recorded in a C struct, by
+ * the number of tiles that the CPU reports. Compare the provided size with the calculation.
+ *
+ * @size:	The tile data state size
+ *
+ * Returns:	0 on success, -EINVAL on mismatch.
+ */
+static int check_xtile_data_against_struct(int size)
+{
+	u32 max_palid, palid, state_size;
+	u32 eax, ebx, ecx, edx;
+	u16 max_tile;
+
+	/*
+	 * Check the maximum palette id:
+	 *   eax: the highest numbered palette subleaf.
+	 */
+	cpuid_count(TILE_CPUID, 0, &max_palid, &ebx, &ecx, &edx);
+
+	/* Cross-check each tile size and find the maximum number of supported tiles. */
+	for (palid = 1, max_tile = 0; palid <= max_palid; palid++) {
+		u16 tile_size, max;
+
+		/*
+		 * Check the tile size info:
+		 *   eax[31:16]:  bytes per tile
+		 *   ebx[31:16]:  the max names (or max number of tiles)
+		 */
+		cpuid_count(TILE_CPUID, palid, &eax, &ebx, &edx, &edx);
+		tile_size = eax >> 16;
+		max = ebx >> 16;
+
+		if (tile_size != sizeof(struct xtile_data)) {
+			pr_err("%s: struct is %zu bytes, cpu xtile %d bytes\n",
+			       __stringify(XFEATURE_XTILE_DATA),
+			       sizeof(struct xtile_data), tile_size);
+			__xstate_dump_leaves();
+			return -EINVAL;
+		}
+
+		if (max > max_tile)
+			max_tile = max;
+	}
+
+	state_size = sizeof(struct xtile_data) * max_tile;
+	if (size != state_size) {
+		pr_err("%s: calculated size is %u bytes, cpu state %d bytes\n",
+		       __stringify(XFEATURE_XTILE_DATA), state_size, size);
+		__xstate_dump_leaves();
+		return -EINVAL;
+	}
+	return 0;
+}
+
 /*
  * We have a C struct for each 'xstate'. We need to ensure
  * that our software representation matches what the CPU
@@ -761,6 +830,11 @@ static int check_xstate_against_struct(int nr)
 	XCHECK_SZ(sz, nr, XFEATURE_Hi16_ZMM,  struct avx_512_hi16_state);
 	XCHECK_SZ(sz, nr, XFEATURE_PKRU,      struct pkru_state);
 	XCHECK_SZ(sz, nr, XFEATURE_PASID,     struct ia32_pasid_state);
+	XCHECK_SZ(sz, nr, XFEATURE_XTILE_CFG, struct xtile_cfg);
+
+	/* The tile data size varies between implementations. */
+	if (nr == XFEATURE_XTILE_DATA)
+		check_xtile_data_against_struct(sz);
 
 	/*
 	 * Make *SURE* to add any feature numbers in below if
-- 
2.17.1
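
For readers who want to try the size arithmetic outside the kernel, the
cross-check in check_xtile_data_against_struct() can be approximated in plain
userspace C. This is only a sketch under stated assumptions: struct
palette_info, TILE_BYTES and check_tile_data_size() are hypothetical names
that are not part of the patch, and the CPUID results are passed in as plain
values instead of being read from hardware. The field layout (eax[31:16] for
bytes per tile, ebx[31:16] for the maximum number of tiles) follows the
comments in the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one CPUID(0x1d, palid) result; not a kernel type. */
struct palette_info {
	uint32_t eax;	/* eax[31:16]: bytes per tile */
	uint32_t ebx;	/* ebx[31:16]: max names (max number of tiles) */
};

/* Assumption: mirrors sizeof(struct reg_1024_byte) from the patch. */
#define TILE_BYTES 1024u

/*
 * Return 0 when reported_size equals TILE_BYTES multiplied by the largest
 * tile count found over all palettes, and -EINVAL on any mismatch.
 */
static int check_tile_data_size(const struct palette_info *pal, size_t npal,
				uint32_t reported_size)
{
	uint16_t max_tile = 0;
	size_t i;

	assert(pal != NULL || npal == 0);

	for (i = 0; i < npal; i++) {
		uint16_t tile_size = pal[i].eax >> 16;
		uint16_t max = pal[i].ebx >> 16;

		/* Every palette must agree with the single-tile struct size. */
		if (tile_size != TILE_BYTES)
			return -EINVAL;

		if (max > max_tile)
			max_tile = max;
	}

	return reported_size == TILE_BYTES * max_tile ? 0 : -EINVAL;
}
```

With a single palette describing 8 tiles of 1024 bytes each, the computed
size is 8192 bytes, which matches the 8KB quoted above for the first
implementation. Any other reported size, or a palette whose tile size
differs from 1024 bytes, is rejected with -EINVAL.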