Received: by 2002:a05:6a10:2785:0:0:0:0 with SMTP id ia5csp230377pxb; Thu, 14 Jan 2021 04:38:42 -0800 (PST) X-Google-Smtp-Source: ABdhPJxowA8LUsQZpfe+Esx5wQLuAoGGL4MwQJJxNqby5AUjxzH2bXLto0T0ABMWVUOtZijlZD21 X-Received: by 2002:a17:907:a8a:: with SMTP id by10mr4972528ejc.423.1610627922766; Thu, 14 Jan 2021 04:38:42 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1610627922; cv=none; d=google.com; s=arc-20160816; b=f8OXUx75P9mhyF+BAdCGU/v7HOgH4FjtaaZPM8DnTgd0EjaYMw55LhwYY+wDFvhE1E jEEF6cI+NrC6wkB7d5h6R/4yy2uPzhwnaowc+JRrgtWoSUEvP3PSR8tWv553eLYb2LOU jV06DsoBr0IdJQkw+s8qub8lmJzPpQZ4HgBgrbTgAEykDwP2Iak0xZge11o7UlrA0XMM A+XV5L0lpoSaz9RQL9dRmq8A/B90TsWkLHk1USnfc5c6r/5UNtsdEliWVeO7M3eEYci7 Sz5vopo+MUSDARtmNRahhBxeHCXIOYQwDRzCwPbUTsjYLSsvjrXBiG/OKcGa445nf3fC mGVg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:user-agent:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date; bh=y8LbPFhZqIxl+ejVvwFXNE1IRqgQYtjoW/4InSAVQI8=; b=ymWAdbuAsjBl7RwF4Jw2UbZaL0+ElB1cKgIxCyecEMUKFs6z1nZoWZMuqEKDQ7uo1o Os2WkPziRipLGL9dm2MlgFXU8cOL15oP62T6x1FdoYjtjD/fdgspAUqjBpSqHlbZfYSb 6FQSFtDFwq4s40kCV31JmqyotJ+AlbanZAFjzqqZPEINDyonXAgKjEm6LN9UwxIlr6ZG 2hbyb8aHEmjcyu5t1/o3YX2wJcy+o6vaZk9o80uGvU0IhiyUjaMlqRug9u1w+2jgN+Fh I3JxmmoKQ8syQmk7Y6Q4vJGc4qVt64J0zrFd+bw31e/HD0pwsQWAJ9sGtbIDEJx+FWqQ N8SQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=podlesie.net Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id a15si2327019ejd.293.2021.01.14.04.38.17; Thu, 14 Jan 2021 04:38:42 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=podlesie.net Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726620AbhANMhl (ORCPT + 99 others); Thu, 14 Jan 2021 07:37:41 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40740 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726259AbhANMhl (ORCPT ); Thu, 14 Jan 2021 07:37:41 -0500 Received: from shrek.podlesie.net (shrek-3s.podlesie.net [IPv6:2a00:13a0:3010::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id EE66FC061574; Thu, 14 Jan 2021 04:36:58 -0800 (PST) Received: by shrek.podlesie.net (Postfix, from userid 603) id D198549B; Thu, 14 Jan 2021 13:36:57 +0100 (CET) Date: Thu, 14 Jan 2021 13:36:57 +0100 From: Krzysztof Mazur To: Borislav Petkov Cc: Thomas Gleixner , Ingo Molnar , x86@kernel.org, linux-kernel@vger.kernel.org, stable@vger.kernel.org Subject: Re: [PATCH] x86/lib: don't use MMX before FPU initialization Message-ID: <20210114123657.GA6358@shrek.podlesie.net> References: <20201228160631.32732-1-krzysiek@podlesie.net> <20210112000923.GK25645@zn.tnic> <20210114092218.GA26786@shrek.podlesie.net> <20210114094425.GA12284@zn.tnic> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20210114094425.GA12284@zn.tnic> User-Agent: Mutt/1.6.2 (2016-07-01) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Thu, Jan 14, 2021 at 10:44:25AM +0100, Borislav Petkov wrote: > I believe the correct fix should be > > if (unlikely(in_interrupt()) || !(cr4_read_shadow() & X86_CR4_OSFXSR)) > return __memcpy(to, from, len); > > in _mmx_memcpy() as you had it in your first patch. > The OSFXSR must be set only on CPUs with SSE. There are some CPUs with 3DNow!, but without SSE and FXSR (like AMD Geode LX, which is still used in many embedded systems). So, I've changed that to: if (unlikely(in_interrupt()) || (boot_cpu_has(X86_FEATURE_XMM) && unlikely(!(cr4_read_shadow() & X86_CR4_OSFXSR)))) However, I'm not sure that adding two taken branches (that second unlikely() does not help gcc to generate better code) and a call to cr4_read_shadow() (not inlined) is a good idea. The static branch adds a single jump, which is changed to NOP. The code with boot_cpu_has() and CR4 checks adds: c09932ad: a1 50 50 c3 c0 mov 0xc0c35050,%eax c09932b2: a9 00 00 00 02 test $0x2000000,%eax /* this branch is taken */ c09932b7: 0f 85 13 01 00 00 jne c09933d0 <_mmx_memcpy+0x140> c09932bd: /* MMX code is here */ /* cr4_read_shadow is not inlined */ c09933d0: e8 8b 01 78 ff call c0113560 c09933d5: f6 c4 02 test $0x2,%ah /* this branch is also taken */ c09933d8: 0f 85 df fe ff ff jne c09932bd <_mmx_memcpy+0x2d> However, except for some embedded systems, probably almost nobody uses that code today, and the most imporant is avoiding future breakage. And I'm not sure which approach is better. Because that CR4 test tests for a feature that is not used in mmx_memcpy(), but it's used in kernel_fpu_begin(). And in future kernel_fpu_begin() may change and require also other stuff. So I think that the best approach would be delay any FPU optimized memcpy() after fpu__init_system() is executed. Best regards, Krzysiek -- >8 -- Subject: [PATCH] x86/lib: don't use mmx_memcpy() to early The MMX 3DNow! optimized memcpy() is used very early, even before FPU is initialized in the kernel. It worked fine, but commit 7ad816762f9bf89e940e618ea40c43138b479e10 ("x86/fpu: Reset MXCSR to default in kernel_fpu_begin()") broke that. After that commit the kernel_fpu_begin() assumes that FXSR is enabled in the CR4 register on all processors with SSE. Because memcpy() is used before FXSR is enabled, the kernel crashes just after "Booting the kernel." message. It affects all kernels with CONFIG_X86_USE_3DNOW (enabled when some AMD/Cyrix processors are selected) on processors with SSE (like AMD K7, which supports both MMX 3DNow! and SSE). Cc: Thomas Gleixner Cc: Ingo Molnar Cc: Borislav Petkov Cc: "H. Peter Anvin" Signed-off-by: Krzysztof Mazur --- arch/x86/lib/mmx_32.c | 9 ++++++++- 1 file changed, 8 insertions(+), 1 deletion(-) diff --git a/arch/x86/lib/mmx_32.c b/arch/x86/lib/mmx_32.c index 4321fa02e18d..70aa769570e6 100644 --- a/arch/x86/lib/mmx_32.c +++ b/arch/x86/lib/mmx_32.c @@ -25,13 +25,20 @@ #include #include +#include void *_mmx_memcpy(void *to, const void *from, size_t len) { void *p; int i; - if (unlikely(in_interrupt())) + /* + * kernel_fpu_begin() assumes that FXSR is enabled on all processors + * with SSE. Thus, MMX-optimized version can't be used + * before the kernel enables FXSR (OSFXSR bit in the CR4 register). + */ + if (unlikely(in_interrupt()) || (boot_cpu_has(X86_FEATURE_XMM) && + unlikely(!(cr4_read_shadow() & X86_CR4_OSFXSR)))) return __memcpy(to, from, len); p = to; -- 2.27.0.rc1.207.gb85828341f