Date: Sat, 27 Jun 2015 10:18:55 +0200
From: Ingo Molnar
To: Mike Galbraith
Cc: Ingo Molnar, LKML, Borislav Petkov, "H. Peter Anvin", Thomas Gleixner
Subject: Re: regression: massive trouble with fpu rework
Message-ID: <20150627081855.GA10192@gmail.com>
In-Reply-To: <1435386316.3664.23.camel@gmail.com>

* Mike Galbraith wrote:

> Hi Ingo,
>
> My i7-4790 box is having one hell of a time with this merge window, is
> dead in the water.  The netconsole log below is v4.1-7254-gc13c81006314,
> but trouble begins at bisected point much earlier.  If I turn off kvm,
> such that I can kinda sorta boot, systemd says many services "enter
> failed state", box is pretty much a doorstop.  Though I can get to a
> prompt, I can't login.  If kvm is enabled, it explodes as soon as it
> autoloads (wtf does it do that when it's not being used?)
>
> Bisecting to the beginning of my woes takes me to the below.  Before
> that, it doesn't matter if kvm is enabled or not, all is well.  Below
> the current gripage with kvm disabled, find the kvm explosion, and
> another explosion as I approach the beginning of my box's woes.
>
> 067051ccd209623cb56152cf4cb06616ee2bcc5c is the first bad commit
> commit 067051ccd209623cb56152cf4cb06616ee2bcc5c
> Author: Ingo Molnar
> Date:   Sat Apr 25 08:27:44 2015 +0200
>
>     x86/fpu: Do system-wide setup from fpu__detect()
>
>     fpu__cpu_init() is called on every CPU, so it is the wrong place
>     to call fpu__init_system() from. Call it from fpu__detect():
>     this is early CPU init code, but we already have CPU features detected,
>     so we can call the system-wide FPU init code from here.
>
>     Reviewed-by: Borislav Petkov
>     Cc: Andy Lutomirski
>     Cc: Dave Hansen
>     Cc: Fenghua Yu
>     Cc: H. Peter Anvin
>     Cc: Linus Torvalds
>     Cc: Oleg Nesterov
>     Cc: Peter Zijlstra
>     Cc: Thomas Gleixner
>     Signed-off-by: Ingo Molnar

Just as a quick workaround, if you add back a per CPU init
fpu__init_system() call, as per the disgusting hack below, do things
get happier?

( You might trigger a few WARN_ON_ONCE() whinges, especially if you have
  CONFIG_X86_DEBUG_FPU=y, but those should be one-time warnings that are
  not fatal. )

Totally untested, unfortunately.

My theory of the bug is that there is something that needs to be set up
per CPU, which is a side effect of fpu__init_system(), and which the new
fpu__init_cpu() does not capture. If this patch helps then the real fix
would be to figure out that side effect.

Thanks,

	Ingo

 arch/x86/kernel/fpu/init.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

diff --git a/arch/x86/kernel/fpu/init.c b/arch/x86/kernel/fpu/init.c
index fc878fee6a51..421babb08fe6 100644
--- a/arch/x86/kernel/fpu/init.c
+++ b/arch/x86/kernel/fpu/init.c
@@ -4,6 +4,9 @@
 #include
 #include
 
+#undef __init
+#define __init
+
 /*
  * Initialize the TS bit in CR0 according to the style of context-switches
  * we are using:
@@ -44,13 +47,18 @@ static void fpu__init_cpu_generic(void)
 /*
  * Enable all supported FPU features. Called when a CPU is brought online:
  */
-void fpu__init_cpu(void)
+static void __fpu__init_cpu(void)
 {
 	fpu__init_cpu_generic();
 	fpu__init_cpu_xstate();
 	fpu__init_cpu_ctx_switch();
 }
 
+void fpu__init_cpu(void)
+{
+	fpu__init_system(NULL);
+}
+
 /*
  * The earliest FPU detection code.
  *
@@ -267,13 +275,14 @@ static void __init fpu__init_system_ctx_switch(void)
  */
 void __init fpu__init_system(struct cpuinfo_x86 *c)
 {
-	fpu__init_system_early_generic(c);
+	if (c)
+		fpu__init_system_early_generic(c);
 
 	/*
 	 * The FPU has to be operational for some of the
 	 * later FPU init activities:
 	 */
-	fpu__init_cpu();
+	__fpu__init_cpu();
 
 	/*
 	 * But don't leave CR0::TS set yet, as some of the FPU setup