From: Andy Lutomirski
To: x86@kernel.org
Cc: LKML, Krzysztof Mazur, Krzysztof Olędzki, Arnd Bergmann, Andy Lutomirski
Subject: [PATCH v3 1/4] x86/fpu: Add kernel_fpu_begin_mask() to selectively initialize state
Date: Wed, 20 Jan 2021 21:09:48 -0800
X-Mailer: git-send-email 2.29.2
X-Mailing-List: linux-kernel@vger.kernel.org

Currently, requesting kernel FPU access doesn't distinguish which parts of
the extended ("FPU") state are needed.  This is nice for simplicity, but
there are a few cases in which it's suboptimal:

 - The vast majority of in-kernel FPU users want XMM/YMM/ZMM state but do
   not use legacy 387 state.  These users want MXCSR initialized but don't
   care about the FPU control word.  Skipping FNINIT would save time.
   (Empirically, FNINIT is several times slower than LDMXCSR.)

 - Code that wants MMX doesn't want or need MXCSR initialized.
   _mmx_memcpy(), for example, can run before CR4.OSFXSR gets set, and
   initializing MXCSR will fail.

 - Any future in-kernel users of XFD (eXtended Feature Disable)-capable
   dynamic states will need special handling.

Add a more specific API that allows callers to specify exactly what they
want.

Signed-off-by: Andy Lutomirski
---
 arch/x86/include/asm/fpu/api.h | 15 +++++++++++++--
 arch/x86/kernel/fpu/core.c     |  9 +++++----
 2 files changed, 18 insertions(+), 6 deletions(-)

diff --git a/arch/x86/include/asm/fpu/api.h b/arch/x86/include/asm/fpu/api.h
index dcd9503b1098..38f4936045ab 100644
--- a/arch/x86/include/asm/fpu/api.h
+++ b/arch/x86/include/asm/fpu/api.h
@@ -16,14 +16,25 @@
  * Use kernel_fpu_begin/end() if you intend to use FPU in kernel context. It
  * disables preemption so be careful if you intend to use it for long periods
  * of time.
- * If you intend to use the FPU in softirq you need to check first with
+ * If you intend to use the FPU in irq/softirq you need to check first with
  * irq_fpu_usable() if it is possible.
  */
-extern void kernel_fpu_begin(void);
+
+/* Kernel FPU states to initialize in kernel_fpu_begin_mask() */
+#define KFPU_387	_BITUL(0)	/* 387 state will be initialized */
+#define KFPU_MXCSR	_BITUL(1)	/* MXCSR will be initialized */
+
+extern void kernel_fpu_begin_mask(unsigned int kfpu_mask);
 extern void kernel_fpu_end(void);
 extern bool irq_fpu_usable(void);
 extern void fpregs_mark_activate(void);
 
+/* Code that is unaware of kernel_fpu_begin_mask() can use this */
+static inline void kernel_fpu_begin(void)
+{
+	kernel_fpu_begin_mask(KFPU_387 | KFPU_MXCSR);
+}
+
 /*
  * Use fpregs_lock() while editing CPU's FPU registers or fpu->state.
  * A context switch will (and softirq might) save CPU's FPU registers to
diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
index eb86a2b831b1..571220ac8bea 100644
--- a/arch/x86/kernel/fpu/core.c
+++ b/arch/x86/kernel/fpu/core.c
@@ -121,7 +121,7 @@ int copy_fpregs_to_fpstate(struct fpu *fpu)
 }
 EXPORT_SYMBOL(copy_fpregs_to_fpstate);
 
-void kernel_fpu_begin(void)
+void kernel_fpu_begin_mask(unsigned int kfpu_mask)
 {
 	preempt_disable();
 
@@ -141,13 +141,14 @@ void kernel_fpu_begin(void)
 	}
 	__cpu_invalidate_fpregs_state();
 
-	if (boot_cpu_has(X86_FEATURE_XMM))
+	/* Put sane initial values into the control registers. */
+	if (likely(kfpu_mask & KFPU_MXCSR) && boot_cpu_has(X86_FEATURE_XMM))
 		ldmxcsr(MXCSR_DEFAULT);
 
-	if (boot_cpu_has(X86_FEATURE_FPU))
+	if (unlikely(kfpu_mask & KFPU_387) && boot_cpu_has(X86_FEATURE_FPU))
 		asm volatile ("fninit");
 }
-EXPORT_SYMBOL_GPL(kernel_fpu_begin);
+EXPORT_SYMBOL_GPL(kernel_fpu_begin_mask);
 
 void kernel_fpu_end(void)
 {
-- 
2.29.2