From: Aubrey Li
To: tglx@linutronix.de, mingo@redhat.com, peterz@infradead.org, hpa@zytor.com
Cc: ak@linux.intel.com, tim.c.chen@linux.intel.com, dave.hansen@intel.com, arjan@linux.intel.com, aubrey.li@intel.com, linux-kernel@vger.kernel.org, Aubrey Li
Subject: [PATCH v3 1/2] x86/fpu: track AVX-512 usage of tasks
Date: Thu, 15 Nov 2018 07:00:06 +0800
Message-Id: <1542236407-4323-1-git-send-email-aubrey.li@intel.com>
X-Mailer: git-send-email 2.7.4
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

User space tools which do automated task placement need information about
AVX-512 usage of tasks, because AVX-512 usage could cause a core turbo
frequency drop and impact the running task on the sibling CPU.

The XSAVE header contains a state-component bitmap, which allows software
to discover the state of the init optimization used by XSAVEOPT and
XSAVES. Set bits in the bitmap denote the usage of the components.

The AVX-512 component has three states; only the Hi16_ZMM state causes a
notable frequency drop. Add per-task Hi16_ZMM state tracking to the
context switch path.

The tracking turns on the usage flag immediately, but requires three
consecutive context switches with no usage to clear it. This decay is
required because AVX-512-using tasks could set the Hi16_ZMM state back to
the init state themselves.

Signed-off-by: Aubrey Li
Cc: Peter Zijlstra
Cc: Andi Kleen
Cc: Tim Chen
Cc: Dave Hansen
Cc: Arjan van de Ven
---
 arch/x86/include/asm/fpu/internal.h | 26 ++++++++++++++++++++++++++
 arch/x86/include/asm/fpu/types.h    |  9 +++++++++
 2 files changed, 35 insertions(+)

diff --git a/arch/x86/include/asm/fpu/internal.h b/arch/x86/include/asm/fpu/internal.h
index a38bf5a..f382449 100644
--- a/arch/x86/include/asm/fpu/internal.h
+++ b/arch/x86/include/asm/fpu/internal.h
@@ -275,6 +275,31 @@ static inline void copy_fxregs_to_kernel(struct fpu *fpu)
 		     : "D" (st), "m" (*st), "a" (lmask), "d" (hmask)	\
 		     : "memory")
 
+#define HI16ZMM_STATE_DECAY_COUNT	3
+/*
+ * This function is called during context switch to update Hi16_ZMM state
+ */
+static inline void update_hi16zmm_state(struct fpu *fpu)
+{
+	/*
+	 * XSAVE header contains a state-component bitmap (xfeatures),
+	 * which allows software to discover the state of the init
+	 * optimization used by XSAVEOPT and XSAVES.
+	 *
+	 * Hi16_ZMM state (one state of the AVX-512 component) is tracked here
+	 * because its usage could cause a notable core turbo frequency drop.
+	 *
+	 * AVX512-using tasks could set Hi16_ZMM state back to the init
+	 * state themselves. Thus, this tracking mechanism can miss.
+	 * The usage decay ensures that false negatives do not immediately
+	 * cause a task to be considered as not using Hi16_ZMM registers.
+	 */
+	if (fpu->state.xsave.header.xfeatures & XFEATURE_MASK_Hi16_ZMM)
+		fpu->hi16zmm_usage = HI16ZMM_STATE_DECAY_COUNT;
+	else if (fpu->hi16zmm_usage)
+		fpu->hi16zmm_usage--;
+}
+
 /*
  * This function is called only during boot time when x86 caps are not set
  * up and alternative can not be used yet.
@@ -411,6 +436,7 @@ static inline int copy_fpregs_to_fpstate(struct fpu *fpu)
 {
 	if (likely(use_xsave())) {
 		copy_xregs_to_kernel(&fpu->state.xsave);
+		update_hi16zmm_state(fpu);
 		return 1;
 	}
 
diff --git a/arch/x86/include/asm/fpu/types.h b/arch/x86/include/asm/fpu/types.h
index 202c539..c0c7577 100644
--- a/arch/x86/include/asm/fpu/types.h
+++ b/arch/x86/include/asm/fpu/types.h
@@ -303,6 +303,15 @@ struct fpu {
 	unsigned char			initialized;
 
 	/*
+	 * @hi16zmm_usage:
+	 *
+	 * Records the usage of the upper 16 AVX512 registers: ZMM16-ZMM31.
+	 * A non-zero value indicates that there is valid
+	 * state in these AVX512 registers.
+	 */
+	unsigned char			hi16zmm_usage;
+
+	/*
 	 * @state:
 	 *
 	 * In-memory copy of all FPU registers that we save/restore
-- 
2.7.4
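
Illustration only, not part of the patch: the decay behaviour described in
the changelog can be tried out in user space with a small stand-alone
sketch. Everything here is an assumption made for the example: struct
demo_fpu and run_switches() are hypothetical names, the mask is
re-declared locally as bit 7 (the position of XFEATURE_Hi16_ZMM in the
kernel's xfeature enumeration), and the xfeatures value is set by hand
rather than read from a real XSAVE header.

```c
/* User-space model of the decay counter; all names here are hypothetical. */
#define DEMO_HI16ZMM_DECAY_COUNT	3
#define DEMO_MASK_Hi16_ZMM		(1ULL << 7)	/* assumed: Hi16_ZMM is state component 7 */

struct demo_fpu {
	unsigned long long xfeatures;	/* stand-in for xsave.header.xfeatures */
	unsigned char hi16zmm_usage;	/* decay counter, 0 means "not in use" */
};

/* Same logic as update_hi16zmm_state() in the patch, on the model struct. */
static void demo_update_hi16zmm_state(struct demo_fpu *fpu)
{
	if (fpu->xfeatures & DEMO_MASK_Hi16_ZMM)
		fpu->hi16zmm_usage = DEMO_HI16ZMM_DECAY_COUNT;	/* re-arm on use */
	else if (fpu->hi16zmm_usage)
		fpu->hi16zmm_usage--;				/* decay, stop at 0 */
}

/*
 * Replay n context switches. in_use[i] != 0 means the Hi16_ZMM bit was
 * set in the bitmap at switch i. Returns the counter after the last one.
 */
static unsigned char run_switches(struct demo_fpu *fpu, const int *in_use, int n)
{
	int i;

	for (i = 0; i < n; i++) {
		fpu->xfeatures = in_use[i] ? DEMO_MASK_Hi16_ZMM : 0;
		demo_update_hi16zmm_state(fpu);
	}
	return fpu->hi16zmm_usage;
}
```

Replaying one switch with the bit set followed by two without leaves the
counter at 1, so the task still counts as an AVX-512 user. A third idle
switch is needed before the counter reaches 0, and any later use re-arms
it to 3 at once.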