Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1755932AbcLOKgJ (ORCPT ); Thu, 15 Dec 2016 05:36:09 -0500
Received: from smtp5-g21.free.fr ([212.27.42.5]:61924 "EHLO smtp5-g21.free.fr"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1752693AbcLOKgH (ORCPT ); Thu, 15 Dec 2016 05:36:07 -0500
Subject: Re: Linux crashes when trying to online secondary core
From: Mason
To: Linux ARM, LKML
Cc: Thomas Gleixner, Mark Rutland, Anna-Maria Gleixner, Richard Cochran,
	Sebastian Andrzej Siewior, Daniel Lezcano, Peter Zijlstra, Ingo Molnar,
	Sebastian Frias, Thibaud Cornic, Robin Murphy
References: <8a021a90-e69e-f38b-c8df-ea8963f3973f@free.fr>
Message-ID: <1147ef90-7877-e4d2-bb2b-5c4fa8d3144b@free.fr>
Date: Thu, 15 Dec 2016 11:35:12 +0100
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:50.0) Gecko/20100101 Firefox/50.0 SeaMonkey/2.47
MIME-Version: 1.0
In-Reply-To: <8a021a90-e69e-f38b-c8df-ea8963f3973f@free.fr>
Content-Type: text/plain; charset=ISO-8859-1
Content-Language:
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2576
Lines: 76

On 14/12/2016 18:47, Mason wrote:

> On 14/12/2016 18:08, Thomas Gleixner wrote:
>
>> On Wed, 14 Dec 2016, Mason wrote:
>>
>>> I'm seeing Linux v4.9 crash (dereferencing NULL) when I try to online
>>> the secondary core, after putting it offline.
>>
>> Does the patch below fix the issue?
>>
>> Thanks,
>>
>> tglx
>>
>> 8<---------------
>>
>> diff --git a/include/linux/cpuhotplug.h b/include/linux/cpuhotplug.h
>> index 22acee76cf4c..2594c287b078 100644
>> --- a/include/linux/cpuhotplug.h
>> +++ b/include/linux/cpuhotplug.h
>> @@ -101,7 +101,6 @@ enum cpuhp_state {
>>  	CPUHP_AP_ARM_L2X0_STARTING,
>>  	CPUHP_AP_ARM_ARCH_TIMER_STARTING,
>>  	CPUHP_AP_ARM_GLOBAL_TIMER_STARTING,
>> -	CPUHP_AP_DUMMY_TIMER_STARTING,
>>  	CPUHP_AP_JCORE_TIMER_STARTING,
>>  	CPUHP_AP_EXYNOS4_MCT_TIMER_STARTING,
>>  	CPUHP_AP_ARM_TWD_STARTING,
>> @@ -111,6 +110,7 @@ enum cpuhp_state {
>>  	CPUHP_AP_MARCO_TIMER_STARTING,
>>  	CPUHP_AP_MIPS_GIC_TIMER_STARTING,
>>  	CPUHP_AP_ARC_TIMER_STARTING,
>> +	CPUHP_AP_DUMMY_TIMER_STARTING,
>>  	CPUHP_AP_KVM_STARTING,
>>  	CPUHP_AP_KVM_ARM_VGIC_INIT_STARTING,
>>  	CPUHP_AP_KVM_ARM_VGIC_STARTING,
>
> $ patch -p1 < tglx.patch
> patching file include/linux/cpuhotplug.h
> Hunk #1 succeeded at 80 (offset -21 lines).
> Hunk #2 succeeded at 89 (offset -21 lines).
>
> It does seem to fix the problem:
>
> # echo 0 > /sys/devices/system/cpu/cpu1/online
> SMC called with a0=0x00[000001 a1=0x00000121 a2=0x00000005 a3 =0xc01189b4 0x00000121
> [1][flow/suspend3.c:39] CPU 1 die: jumping6 to. post-boot WFE
> 402826] CPU1: shutdown
> SMC called with a0=0x00000001 a1=0x00000122 a2=0x00000000 a3=0x00000000 0x00000122
> [0][flow/suspend.c:82] Killing core1
> armor+++ armor: core 1 booted, entering wfe...
> # echo 1 > /sys/devices/system/cpu/cpu1/online
> [ 215.692700] tango_boot_secondary from __cpu_up
> SMC called with a0=0x80101500 a1=0x00000105 a2=0x00000000 a3=0x00000000 0x00000105
> [ 215.704494] tango_set_aux_boot_addr=0
> SMC called with a0=0x00000001 a1=0x00000104 a2=0x00000000 a3=0x00000000 0x00000104
> [0][flow/smc_handler.c:127] waking up CPU1
> [ 215.719308] tango_start_aux_core=0
>
>
> I reverted your patch, and the kernel blows up again.
>
> So what's the problem, and how does your patch solve it?

Link to the original report:
https://marc.info/?l=linux-arm-kernel&m=148173152524746&w=2

Forgot to CC Robin Murphy, who had provided valuable input in
similar circumstances a few months back.

Also add LKML, since this doesn't appear to be ARM-specific.

Do I need to specify which device tree I was using?

Regards.