From: Andy Lutomirski
To: x86@kernel.org
Cc: LKML, Krzysztof Mazur, Krzysztof Olędzki, Arnd Bergmann, Andy Lutomirski
Subject: [PATCH 1/4] x86/fpu: Add kernel_fpu_begin_mask() to selectively initialize state
Date: Sun, 17 Jan 2021 22:20:38 -0800
Message-Id:
X-Mailer: git-send-email 2.29.2
In-Reply-To:
References:
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit

Currently, requesting kernel FPU access doesn't distinguish which parts of
the extended ("FPU") state are needed. This is nice for simplicity, but
there are a few cases in which it's suboptimal:

 - The vast majority of in-kernel FPU users want XMM/YMM/ZMM state but do
   not use legacy 387 state. These users want MXCSR initialized but don't
   care about the FPU control word. Skipping FNINIT would save time.
   (Empirically, FNINIT is several times slower than LDMXCSR.)

 - Code that wants MMX doesn't need MXCSR or FCW initialized.
   _mmx_memcpy(), for example, can run before CR4.OSFXSR gets set, and
   initializing MXCSR will fail.

 - Any future in-kernel users of XFD (eXtended Feature Disable)-capable
   dynamic states will need special handling.

This patch adds a more specific API that allows callers to specify exactly
what they want.

Signed-off-by: Andy Lutomirski
---
 arch/x86/include/asm/fpu/api.h | 16 ++++++++++++++--
 arch/x86/kernel/fpu/core.c     | 17 +++++++++++------
 2 files changed, 25 insertions(+), 8 deletions(-)
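As a rough usage sketch (not part of the diff below), callers could then
request only the state they actually touch. The two functions here are
hypothetical examples; only kernel_fpu_begin_mask(), kernel_fpu_end(),
KFPU_MMX and KFPU_XYZMM come from this patch:

	/*
	 * Illustrative sketch only: example_mmx_user() and example_avx_user()
	 * are made-up callers, not code added by this patch.
	 */
	#include <asm/fpu/api.h>

	static void example_mmx_user(void)
	{
		/* MMX-only: skip both FNINIT and LDMXCSR (works before CR4.OSFXSR is set) */
		kernel_fpu_begin_mask(KFPU_MMX);
		/* ... MMX register work ... */
		kernel_fpu_end();
	}

	static void example_avx_user(void)
	{
		/* SSE/AVX: initialize MXCSR, but skip the much slower FNINIT */
		kernel_fpu_begin_mask(KFPU_XYZMM);
		/* ... XMM/YMM/ZMM register work ... */
		kernel_fpu_end();
	}

Existing kernel_fpu_begin() callers keep their current behavior: the inline
wrapper added below passes KFPU_387 | KFPU_XYZMM.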
diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h
index dcd9503b1098..133907a200ef 100644
--- a/arch/x86/include/asm/fpu/api.h
+++ b/arch/x86/include/asm/fpu/api.h
@@ -16,14 +16,26 @@
  * Use kernel_fpu_begin/end() if you intend to use FPU in kernel context. It
  * disables preemption so be careful if you intend to use it for long periods
  * of time.
- * If you intend to use the FPU in softirq you need to check first with
+ * If you intend to use the FPU in irq/softirq you need to check first with
  * irq_fpu_usable() if it is possible.
  */
-extern void kernel_fpu_begin(void);
+
+/* Kernel FPU states to initialize in kernel_fpu_begin_mask() */
+#define KFPU_387	_BITUL(0)	/* FCW will be initialized */
+#define KFPU_XYZMM	_BITUL(1)	/* MXCSR will be initialized */
+#define KFPU_MMX	0		/* nothing gets initialized */
+
+extern void kernel_fpu_begin_mask(unsigned int kfpu_mask);
 extern void kernel_fpu_end(void);
 extern bool irq_fpu_usable(void);
 extern void fpregs_mark_activate(void);
 
+/* Code that is unaware of kernel_fpu_begin_mask() can use this */
+static inline void kernel_fpu_begin(void)
+{
+	kernel_fpu_begin_mask(KFPU_387 | KFPU_XYZMM);
+}
+
 /*
  * Use fpregs_lock() while editing CPU's FPU registers or fpu->state.
  * A context switch will (and softirq might) save CPU's FPU registers to
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index eb86a2b831b1..52d05c806aa6 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -121,7 +121,7 @@ int copy_fpregs_to_fpstate(struct fpu *fpu)
 }
 EXPORT_SYMBOL(copy_fpregs_to_fpstate);
 
-void kernel_fpu_begin(void)
+void kernel_fpu_begin_mask(unsigned int kfpu_mask)
 {
 	preempt_disable();
 
@@ -141,13 +141,18 @@ void kernel_fpu_begin(void)
 	}
 	__cpu_invalidate_fpregs_state();
 
-	if (boot_cpu_has(X86_FEATURE_XMM))
-		ldmxcsr(MXCSR_DEFAULT);
+	/* Put sane initial values into the control registers. */
+	if (likely(kfpu_mask & KFPU_XYZMM)) {
+		if (boot_cpu_has(X86_FEATURE_XMM))
+			ldmxcsr(MXCSR_DEFAULT);
+	}
 
-	if (boot_cpu_has(X86_FEATURE_FPU))
-		asm volatile ("fninit");
+	if (unlikely(kfpu_mask & KFPU_387)) {
+		if (boot_cpu_has(X86_FEATURE_FPU))
+			asm volatile ("fninit");
+	}
 }
-EXPORT_SYMBOL_GPL(kernel_fpu_begin);
+EXPORT_SYMBOL_GPL(kernel_fpu_begin_mask);
 
 void kernel_fpu_end(void)
 {
-- 
2.29.2