From: James Gowans
To: Eric Biederman, Sean Christopherson
CC: Paolo Bonzini, Marc Zyngier, Arnd Bergmann, Tony Luck, Borislav Petkov,
    Thomas Gleixner, Ingo Molnar, Chen-Yu Tsai, Jernej Skrabec,
    Samuel Holland, Pavel Machek, Sebastian Reichel, Orson Zhai,
    Alexander Graf, Jan H. Schoenherr
Subject: [PATCH] kexec: do syscore_shutdown() in kernel_kexec
Date: Wed, 13 Dec 2023 08:40:04 +0200
Message-ID: <20231213064004.2419447-1-jgowans@amazon.com>
X-Mailer: git-send-email 2.34.1
X-Mailing-List: linux-kernel@vger.kernel.org

syscore_shutdown() runs driver and module callbacks to get the system
into a state where it can be correctly shut down. In commit
6f389a8f1dd2 ("PM / reboot: call syscore_shutdown() after disable_nonboot_cpus()")
syscore_shutdown() was removed from kernel_restart_prepare() and hence
got (incorrectly?) removed from the kexec flow.

This was innocuous until commit 6735150b6997 ("KVM: Use syscore_ops
instead of reboot_notifier to hook restart/shutdown") changed the way
that KVM registered its shutdown callbacks, switching from reboot
notifiers to syscore_ops.shutdown. As syscore_shutdown() is missing from
kexec, KVM's shutdown hook is not run and virtualisation is left enabled
on the boot CPU which results in triple faults when switching to the new
kernel on Intel x86 VT-x with VMXE enabled.

Fix this by adding syscore_shutdown() to the kexec sequence. In terms of
where to add it, it is being added after migrating the kexec task to the
boot CPU, but before APs are shut down.
It is not totally clear if this is the best place: in commit
6f389a8f1dd2 ("PM / reboot: call syscore_shutdown() after disable_nonboot_cpus()")
it is stated that "syscore_ops operations should be carried with one
CPU on-line and interrupts disabled." APs are only offlined later in
machine_shutdown(), so this syscore_shutdown() is being run while APs
are still online. This seems to be the correct place as it matches where
syscore_shutdown() is run in the reboot and halt flows - they also run
it before APs are shut down. The assumption is that the commit message
in commit 6f389a8f1dd2 ("PM / reboot: call syscore_shutdown() after
disable_nonboot_cpus()") is no longer valid.

KVM has been discussed here as it is what broke loudly by not having
syscore_shutdown() in kexec, but this change impacts more than just KVM;
all drivers/modules which register a syscore_ops.shutdown callback will
now be invoked in the kexec flow. Looking at some of them like x86 MCE
it is probably more correct to also shut these down during kexec.
Maintainers of all drivers which use syscore_ops.shutdown are added on
CC for visibility. They are:

arch/powerpc/platforms/cell/spu_base.c  .shutdown = spu_shutdown,
arch/x86/kernel/cpu/mce/core.c          .shutdown = mce_syscore_shutdown,
arch/x86/kernel/i8259.c                 .shutdown = i8259A_shutdown,
drivers/irqchip/irq-i8259.c             .shutdown = i8259A_shutdown,
drivers/irqchip/irq-sun6i-r.c           .shutdown = sun6i_r_intc_shutdown,
drivers/leds/trigger/ledtrig-cpu.c      .shutdown = ledtrig_cpu_syscore_shutdown,
drivers/power/reset/sc27xx-poweroff.c   .shutdown = sc27xx_poweroff_shutdown,
kernel/irq/generic-chip.c               .shutdown = irq_gc_shutdown,
virt/kvm/kvm_main.c                     .shutdown = kvm_shutdown,

This has been tested by doing a kexec on x86_64 and aarch64.

Fixes: 6735150b6997 ("KVM: Use syscore_ops instead of reboot_notifier to hook restart/shutdown")
Signed-off-by: James Gowans
Cc: Eric Biederman
Cc: Paolo Bonzini
Cc: Sean Christopherson
Cc: Marc Zyngier
Cc: Arnd Bergmann
Cc: Tony Luck
Cc: Borislav Petkov
Cc: Thomas Gleixner
Cc: Ingo Molnar
Cc: Chen-Yu Tsai
Cc: Jernej Skrabec
Cc: Samuel Holland
Cc: Pavel Machek
Cc: Sebastian Reichel
Cc: Orson Zhai
Cc: Alexander Graf
Cc: Jan H. Schoenherr
---
 kernel/kexec_core.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/kernel/kexec_core.c b/kernel/kexec_core.c
index be5642a4ec49..b926c4db8a91 100644
--- a/kernel/kexec_core.c
+++ b/kernel/kexec_core.c
@@ -1254,6 +1254,7 @@ int kernel_kexec(void)
 		kexec_in_progress = true;
 		kernel_restart_prepare("kexec reboot");
 		migrate_to_reboot_cpu();
+		syscore_shutdown();
 
 		/*
 		 * migrate_to_reboot_cpu() disables CPU hotplug assuming that
-- 
2.34.1