From: Joerg Roedel
To: Thomas Gleixner, Ingo Molnar, "H. Peter Anvin"
Cc: x86@kernel.org, Borislav Petkov, linux-kernel@vger.kernel.org, Joerg Roedel
Subject: [PATCH] x86/smpboot: Check for cpu_active on cpu initialization
Date: Thu, 16 Jul 2015 11:17:17 +0200
Message-Id: <1437038237-16741-1-git-send-email-joro@8bytes.org>
X-Mailer: git-send-email 1.9.1

From: Joerg Roedel

Currently the code to bring up secondary CPUs only checks for
cpu_online before it proceeds with launching the per-cpu threads
for the freshly booted remote CPU. But the code to move these
threads to the new CPU checks for cpu_active to do so. If this
check fails, the threads end up on the wrong CPU, causing warnings
and bugs like:

	WARNING: CPU: 0 PID: 1 at ../kernel/workqueue.c:4417 workqueue_cpu_up_callback

and/or:

	kernel BUG at ../kernel/smpboot.c:135!

The reason is that the cpu_active bit for the new CPU becomes
visible significantly later than the cpu_online bit. One possible
cause is that the kernel runs in a KVM guest, where the vCPU thread
gets preempted after the cpu_online bit is set but while cpu_active
is still clear.

But this could also happen on bare-metal systems with lots of CPUs.
We have observed this issue on a bare-metal 88-core x86 system.

To fix this issue, wait until the remote CPU is online *and* active
before launching the per-cpu threads.

Signed-off-by: Joerg Roedel
---
 arch/x86/kernel/smpboot.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/arch/x86/kernel/smpboot.c b/arch/x86/kernel/smpboot.c
index d3010aa..30b7b8b 100644
--- a/arch/x86/kernel/smpboot.c
+++ b/arch/x86/kernel/smpboot.c
@@ -1006,7 +1006,7 @@ int native_cpu_up(unsigned int cpu, struct task_struct *tidle)
 	check_tsc_sync_source(cpu);
 	local_irq_restore(flags);
 
-	while (!cpu_online(cpu)) {
+	while (!cpu_online(cpu) || !cpu_active(cpu)) {
 		cpu_relax();
 		touch_nmi_watchdog();
 	}
-- 
1.9.1
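
For illustration only, the window described in the changelog can be
modelled in plain userspace C. This is a sketch under stated
assumptions, not kernel code: struct fake_cpu, its fields and all
function names below are hypothetical, and a simple tick counter stands
in for real time passing while the boot CPU spins in cpu_relax(). The
model assumes the online bit becomes visible no later than the active
bit, as in the report.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of a booting CPU; not a kernel data structure. */
struct fake_cpu {
	int ticks;	/* simulated time since bring-up started */
	int online_at;	/* tick at which the online bit becomes visible */
	int active_at;	/* tick at which the active bit becomes visible */
};

static bool fake_cpu_online(const struct fake_cpu *c)
{
	return c->ticks >= c->online_at;
}

static bool fake_cpu_active(const struct fake_cpu *c)
{
	return c->ticks >= c->active_at;
}

/*
 * Old loop condition: spin until the CPU is online.
 * Returns whether the CPU is also active when the loop exits.
 */
static bool wait_online_only(struct fake_cpu *c)
{
	assert(c->online_at <= c->active_at);	/* assumption of this model */

	while (!fake_cpu_online(c))
		c->ticks++;	/* stands in for cpu_relax() */

	return fake_cpu_active(c);
}

/*
 * Patched loop condition: spin until the CPU is online *and* active.
 * Returns whether the CPU is active when the loop exits.
 */
static bool wait_online_and_active(struct fake_cpu *c)
{
	assert(c->online_at <= c->active_at);	/* assumption of this model */

	while (!fake_cpu_online(c) || !fake_cpu_active(c))
		c->ticks++;	/* stands in for cpu_relax() */

	return fake_cpu_active(c);
}
```

With online_at = 3 and active_at = 10, wait_online_only() stops at tick
3 and reports the CPU as not yet active, which corresponds to the window
in which the per-cpu threads would be launched too early.
wait_online_and_active() keeps spinning until tick 10, so the caller
only proceeds once both bits are visible.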