Date: Mon, 17 Apr 2023 16:50:53 +0100
From: Mark Rutland
To: Thomas Gleixner
Cc: LKML, x86@kernel.org, David Woodhouse, Andrew Cooper, Brian Gerst,
 Arjan van de Veen, Paolo Bonzini, Paul McKenney, Tom Lendacky,
 Sean Christopherson, Oleksandr Natalenko, Paul Menzel,
 "Guilherme G. Piccoli", Piotr Gorski, Catalin Marinas, Will Deacon,
 linux-arm-kernel@lists.infradead.org, David Woodhouse, Usama Arif,
 Juergen Gross, Boris Ostrovsky, xen-devel@lists.xenproject.org,
 Russell King, Arnd Bergmann, Guo Ren, linux-csky@vger.kernel.org,
 Thomas Bogendoerfer, linux-mips@vger.kernel.org, "James E.J. Bottomley",
 Helge Deller, linux-parisc@vger.kernel.org, Paul Walmsley, Palmer Dabbelt,
 linux-riscv@lists.infradead.org, Sabin Rapan
Subject: Re: [patch 22/37] arm64: smp: Switch to hotplug core state synchronization
References: <20230414225551.858160935@linutronix.de> <20230414232310.569498144@linutronix.de>
In-Reply-To: <20230414232310.569498144@linutronix.de>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Apr 15, 2023 at 01:44:49AM +0200, Thomas Gleixner wrote:
> Switch to the CPU hotplug core state tracking and synchronization
> mechanism. No functional change intended.
>
> Signed-off-by: Thomas Gleixner
> Cc: Catalin Marinas
> Cc: Will Deacon
> Cc: linux-arm-kernel@lists.infradead.org

I gave this a spin on arm64 (in a 64-vCPU VM on an M1 host), and it seems to
work fine with a bunch of vCPUs being hotplugged off and on again randomly.

FWIW:

Tested-by: Mark Rutland

I also hacked the code to have the dying CPU spin forever before the call to
cpuhp_ap_report_dead(). In that case I see a warning, and that we don't call
arch_cpuhp_cleanup_dead_cpu(), and that the CPU is marked as offline (per
/sys/devices/system/cpu/$N/online).
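
To make the two outcomes above concrete, here is a minimal userspace sketch of
the handshake. It is a model under stated assumptions, not the kernel code: all
model_* names and the two-state enum are hypothetical, and the bounded poll
loop merely stands in for whatever timeout the hotplug core applies. Only
cpuhp_ap_report_dead(), cpuhp_bp_sync_dead() and arch_cpuhp_cleanup_dead_cpu()
are names taken from the patch.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical states; the real hotplug core has its own state encoding. */
enum model_sync_state { MODEL_ALIVE, MODEL_DEAD };

struct model_cpu {
	atomic_int state;	/* written by the dying CPU, polled by the control CPU */
	bool cleaned_up;	/* set once the (modelled) arch cleanup hook has run */
};

static void model_cpu_init(struct model_cpu *cpu)
{
	assert(cpu != NULL);
	atomic_init(&cpu->state, MODEL_ALIVE);
	cpu->cleaned_up = false;
}

/* Dying CPU side: stands in for cpuhp_ap_report_dead(). */
static void model_report_dead(struct model_cpu *cpu)
{
	atomic_store_explicit(&cpu->state, MODEL_DEAD, memory_order_release);
}

/* Stands in for arch_cpuhp_cleanup_dead_cpu(). */
static void model_cleanup_dead_cpu(struct model_cpu *cpu)
{
	cpu->cleaned_up = true;
}

/*
 * Control CPU side: stands in for cpuhp_bp_sync_dead().
 * Polls at most max_polls times (an assumption made for this sketch).
 * Runs the cleanup hook only if the dying CPU reported itself dead.
 */
static bool model_sync_dead(struct model_cpu *cpu, unsigned int max_polls)
{
	for (unsigned int i = 0; i < max_polls; i++) {
		if (atomic_load_explicit(&cpu->state, memory_order_acquire) == MODEL_DEAD) {
			model_cleanup_dead_cpu(cpu);
			return true;
		}
	}
	fprintf(stderr, "model: CPU did not report dead, skipping cleanup\n");
	return false;
}
```

Calling model_report_dead() before model_sync_dead() takes the normal path and
runs the cleanup hook. Never calling it models the spinning CPU from my hack:
the sync gives up, warns, and the cleanup hook is skipped.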

As a tangent/aside, we might need to improve that for confidential compute
architectures, and we might want to generically track CPUs which might still be
using kernel text/data. On arm64 we ensure that via our cpu_kill() callback
(which'll use PSCI CPU_AFFINITY_INFO), but I'm not sure if TDX and/or SEV-SNP
have a similar mechanism.

Otherwise, a malicious hypervisor can pause a vCPU just before it leaves the
kernel (e.g. immediately after the arch_cpuhp_cleanup_dead_cpu() call), wait
for a kexec (or reuse of stack memory), and unpause the vCPU to cause things to
blow up.

Thanks,
Mark.

> ---
>  arch/arm64/Kconfig           |    1 +
>  arch/arm64/include/asm/smp.h |    2 +-
>  arch/arm64/kernel/smp.c      |   14 +++++---------
>  3 files changed, 7 insertions(+), 10 deletions(-)
>
> --- a/arch/arm64/Kconfig
> +++ b/arch/arm64/Kconfig
> @@ -216,6 +216,7 @@ config ARM64
>  	select HAVE_KPROBES
>  	select HAVE_KRETPROBES
>  	select HAVE_GENERIC_VDSO
> +	select HOTPLUG_CORE_SYNC_DEAD if HOTPLUG_CPU
>  	select IRQ_DOMAIN
>  	select IRQ_FORCED_THREADING
>  	select KASAN_VMALLOC if KASAN
> --- a/arch/arm64/include/asm/smp.h
> +++ b/arch/arm64/include/asm/smp.h
> @@ -99,7 +99,7 @@ static inline void arch_send_wakeup_ipi_
>
>  extern int __cpu_disable(void);
>
> -extern void __cpu_die(unsigned int cpu);
> +static inline void __cpu_die(unsigned int cpu) { }
>  extern void cpu_die(void);
>  extern void cpu_die_early(void);
>
> --- a/arch/arm64/kernel/smp.c
> +++ b/arch/arm64/kernel/smp.c
> @@ -333,17 +333,13 @@ static int op_cpu_kill(unsigned int cpu)
>  }
>
>  /*
> - * called on the thread which is asking for a CPU to be shutdown -
> - * waits until shutdown has completed, or it is timed out.
> + * Called on the thread which is asking for a CPU to be shutdown after the
> + * shutdown completed.
>   */
> -void __cpu_die(unsigned int cpu)
> +void arch_cpuhp_cleanup_dead_cpu(unsigned int cpu)
>  {
>  	int err;
>
> -	if (!cpu_wait_death(cpu, 5)) {
> -		pr_crit("CPU%u: cpu didn't die\n", cpu);
> -		return;
> -	}
>  	pr_debug("CPU%u: shutdown\n", cpu);
>
>  	/*
> @@ -370,8 +366,8 @@ void cpu_die(void)
>
>  	local_daif_mask();
>
> -	/* Tell __cpu_die() that this CPU is now safe to dispose of */
> -	(void)cpu_report_death();
> +	/* Tell cpuhp_bp_sync_dead() that this CPU is now safe to dispose of */
> +	cpuhp_ap_report_dead();
>
>  	/*
>  	 * Actually shutdown the CPU. This must never fail. The specific hotplug
>