From: "Bryan O'Donoghue"
To: Andy Lutomirski
Cc: Andy Shevchenko, Borislav Petkov, Ingo Molnar,
    "linux-kernel@vger.kernel.org", "x86@kernel.org", Fenghua Yu,
    Linus Torvalds, "H. Peter Anvin", Thomas Gleixner, Andrew Morton,
    Dave Hansen, Oleg Nesterov, "Yu, Yu-cheng"
Subject: Re: Got FPU related warning on Intel Quark during boot
Date: Fri, 11 Mar 2016 10:50:38 +0000
Message-ID: <1457693438.2007.8.camel@nexus-software.ie>
References: <20160310111935.GB13102@gmail.com>
    <20160310125610.GA26708@pd.tnic>
    <20160310145940.GB26708@pd.tnic>
    <1457624721.5784.0.camel@nexus-software.ie>

On Thu, 2016-03-10 at 17:31 -0800, Andy Lutomirski wrote:
> On Thu, Mar 10, 2016 at 7:45 AM, Bryan O'Donoghue wrote:
> > On Thu, 2016-03-10 at 17:22 +0200, Andy Shevchenko wrote:
> > > On Thu, Mar 10, 2016 at 4:59 PM, Borislav Petkov wrote:
> > > > On Thu, Mar 10, 2016 at 03:31:43PM +0200, Andy Shevchenko wrote:
> > > > > Looks like it lacks that one.
> > > > >
> > > > > # grep -i fxsr /proc/cpuinfo; echo $?
> > > > > 1
> > > >
> > > > Ok, so looking at where the warning comes from:
> > > >
> > > > [ 14.714533] WARNING: CPU: 0 PID: 823 at
> > > > arch/x86/include/asm/fpu/internal.h:163 fpu__clear+0x8c/0x160
> > > >
> > > > static inline void copy_kernel_to_fxregs(struct fxregs_state *fx)
> > > > {
> > > >         int err;
> > > >
> > > >         if (config_enabled(CONFIG_X86_32)) {
> > > >                 err = check_insn(fxrstor %[fx], "=m" (*fx), [fx] "m" (*fx));
> > > >                                  ^^^^^^^^^^^^^^^^^
> > > >         } else {
> > > >
> > > > ...
> > > >
> > > >         /* Copying from a kernel buffer to FPU registers should never fail: */
> > > >         WARN_ON_FPU(err);
> > > >
> > > >
> > > > and the stacktrace is pretty clear:
> > > >
> > > > flush_thread
> > > >  |-> fpu__clear(&tsk->thread.fpu);
> > > >    |-> we are eager by default here:
> > > >
> > > >         if (!use_eager_fpu() || !static_cpu_has(X86_FEATURE_FPU)) {
> > > >                 /* FPU state will be reallocated lazily at the first use. */
> > > >                 fpu__drop(fpu);
> > > >         } else {
> > > >
> > > >         --> we're in that branch.
> > > >
> > > >         copy_init_fpstate_to_fpregs();
> > > >          |-> copy_kernel_to_fxregs()
> > > >
> > > >
> > > > I think we should use FRSTOR on quark, i.e., copy_kernel_to_fregs().
> > > >
> > > > Does this untested wild guess even work?
> > > >
> > > > ---
> > > > diff --git a/arch/x86/kernel/fpu/core.c b/arch/x86/kernel/fpu/core.c
> > > > index dea8e76d60c6..bbafe5e8a1a6 100644
> > > > --- a/arch/x86/kernel/fpu/core.c
> > > > +++ b/arch/x86/kernel/fpu/core.c
> > > > @@ -474,8 +474,11 @@ static inline void copy_init_fpstate_to_fpregs(void)
> > > >  {
> > > >         if (use_xsave())
> > > >                 copy_kernel_to_xregs(&init_fpstate.xsave, -1);
> > > > -       else
> > > > +       else if (static_cpu_has(X86_FEATURE_FXSR))
> > > >                 copy_kernel_to_fxregs(&init_fpstate.fxsave);
> > > > +       else
> > > > +               copy_kernel_to_fregs(&init_fpstate.fsave);
> > > > +
> > >
> > > Obviously redundant line, otherwise it indeed works
> > >
> > > Tested-by: Andy Shevchenko
> > >
> > > >  }
> > > >
> > > >  /*
> > > >
> > >
> > It works but user-space FPU is broken; something's wrong with the
> > initial state of the FPU regs - it looks as though they aren't being
> > properly initialized and FPU context in the signal handler is wrong
> > too.
> >
> > Linux 3.8.7:
> > /root@galileo:~# ./fpu
> > f is 10.000000 g is 10.100000
> > Double value is 0.000000
> > Double value is 0.100000
> > Double value is 0.200000
> > ^Chandler value of variable is 0.300000
> > Double value is 0.300000
> > Double value is 0.400000
> >
> > Linux-next + Boris' fix:
> > root@galileo:~# ./fpu
> > f is -nan g is -nan
> > Double value is 0.000000
> > Double value is 0.100000
> > Double value is 0.200000^C
> > handler value of variable is -nan
> > Double value is 0.300000
> > Double value is 0.400000^Z[1]+ Stopped
> >
>
> Just to check: are you running the exact same compiled binary on both
> kernels? Because your test case invokes undefined behavior, and I'm a
> bit surprised you get anything sensible from it. That being said, the
> f = -nan part is worrisome.
>
> --Andy

It's the same binary yes.