Received: by 2002:a05:6a10:2726:0:0:0:0 with SMTP id ib38csp1013445pxb; Wed, 6 Apr 2022 06:43:56 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxcNqe9vDGFbAI7X0KmcvBKt87ArYlqdXOsY6FtCCGT731DyPiafdFcFhTxfTMG9O+TEQEM X-Received: by 2002:a63:1c22:0:b0:385:fcae:cb3f with SMTP id c34-20020a631c22000000b00385fcaecb3fmr7283130pgc.102.1649252636556; Wed, 06 Apr 2022 06:43:56 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1649252636; cv=none; d=google.com; s=arc-20160816; b=kwYSAm/3K4K7ZrHZbHl57xe1A96FCbI8KHc1lLoz33aHi3FJhXPulSDe5dM2X8ELAx q99wKS4DWNYdqeAOlX54Y7XCjMR6M/HoPpWXJDyGAre5Ils6u2UdGdRQucPsJG9mFXxV VXbDz6dR1hQLrwpCC0zBndDdL9lXPQ217BnGtaOQaI70rlnSp9rgl+uxkFr9dLvx8pSg Hi3d3vISkP7upHjTo/8hOIiTEssFaEiL6hVrLGBSaHyg78i9ka3tZHBYSUAac21bJZxw uLuGue6oqLYOvkFotPfKLLHvR/r2vr/QrTRHtZvJF2gpb0EdxYePCn8apZufDKJx/805 kwXg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=Qd1AGpL4z1bYZ2y3JLPp7Wa4GphlIKh/E72nQo1Zo3o=; b=QJHASYWU1cF+d884xph8tI2MQPqnHYMpqDERpVoZb55Mxh1F1nGiwGlJRc4W9ikxWB KnKGDym2y1xnLC/KXMnEXw0bqWYZn+7qPyZjOYZRoR2NxBjbklzEv6v2IBwaG5uiXat0 ErdMl5j8I0zcLKlM7SOZVkX7IR79A/D8LcyBapPMAJTRtxM7BwLdvg51waFr9Ryf9uSS tHkikj01RoqqiRgcR0GPLl2IlO3kZ7e8MEpGOmH2XfLKVkM69cRX6YxnsgZvjCx8dZAF 4T1XAIj3zx1HSYx+NiXDsGp6r6vy4QGiLgRb7D6OqofnZXNWfy93UZeWqhzHvMhGhWTJ 424Q== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=DgZHSkM7; spf=softfail (google.com: domain of transitioning linux-kernel-owner@vger.kernel.org does not designate 23.128.96.19 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net. [23.128.96.19]) by mx.google.com with ESMTPS id f31-20020a63101f000000b003816043ee23si16302580pgl.24.2022.04.06.06.43.56 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 06 Apr 2022 06:43:56 -0700 (PDT) Received-SPF: softfail (google.com: domain of transitioning linux-kernel-owner@vger.kernel.org does not designate 23.128.96.19 as permitted sender) client-ip=23.128.96.19; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=DgZHSkM7; spf=softfail (google.com: domain of transitioning linux-kernel-owner@vger.kernel.org does not designate 23.128.96.19 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 8880C60382D; Wed, 6 Apr 2022 04:32:08 -0700 (PDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2359889AbiDFDXK (ORCPT + 99 others); Tue, 5 Apr 2022 23:23:10 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56712 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1847223AbiDFCQt (ORCPT ); Tue, 5 Apr 2022 22:16:49 -0400 Received: from mga04.intel.com (mga04.intel.com [192.55.52.120]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4B86F1B7553 for ; Tue, 5 Apr 2022 16:35:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1649201740; x=1680737740; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=+iHwbUcUMwezIvQ6jBwvOkSCgSGXsZZzYZ+ChiocSJk=; b=DgZHSkM72I4tnnxyMN0SWpznMaclGgzCzuKEeJ9642vNV5d8lMEvG7zE Gp0aHIoW6Ek+0KA6HM6nOkxmug+FEBnI8XFv1+DKmgfJjoYVF6LzXgvU3 gn3/JhHAVudsGndeTjk6kP+UKfPf/CTQ/IlzQNDLktD2OSNoWqILU+yaz YPBnT8trm0g+A8g9XsCuF6lgsD/e3PLf19PKZgl8k3Kmnve5FTyDrFaql F1GDEacspg6yYyQaQA4uJgbBlo6DJT1LP226SUC3Jv2v4prasyeH8W6sb DqLIQSgQriOttB44lrH4nN+VoSN1ThzhWbG2XO77j0El+hqk6ZF4isRqA A==; X-IronPort-AV: E=McAfee;i="6200,9189,10308"; a="259721576" X-IronPort-AV: E=Sophos;i="5.90,238,1643702400"; d="scan'208";a="259721576" Received: from orsmga007.jf.intel.com ([10.7.209.58]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 05 Apr 2022 16:34:56 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.90,238,1643702400"; d="scan'208";a="549273521" Received: from black.fi.intel.com ([10.237.72.28]) by orsmga007.jf.intel.com with ESMTP; 05 Apr 2022 16:34:48 -0700 Received: by black.fi.intel.com (Postfix, from userid 1000) id 95842CC3; Wed, 6 Apr 2022 02:29:46 +0300 (EEST) From: "Kirill A. Shutemov" To: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@intel.com, luto@kernel.org, peterz@infradead.org Cc: sathyanarayanan.kuppuswamy@linux.intel.com, aarcange@redhat.com, ak@linux.intel.com, dan.j.williams@intel.com, david@redhat.com, hpa@zytor.com, jgross@suse.com, jmattson@google.com, joro@8bytes.org, jpoimboe@redhat.com, knsathya@kernel.org, pbonzini@redhat.com, sdeep@vmware.com, seanjc@google.com, tony.luck@intel.com, vkuznets@redhat.com, wanpengli@tencent.com, thomas.lendacky@amd.com, brijesh.singh@amd.com, x86@kernel.org, linux-kernel@vger.kernel.org, Sean Christopherson , "Rafael J . Wysocki" , "Kirill A . Shutemov" , Dave Hansen Subject: [PATCHv8 21/30] x86/acpi, x86/boot: Add multiprocessor wake-up support Date: Wed, 6 Apr 2022 02:29:30 +0300 Message-Id: <20220405232939.73860-22-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405232939.73860-1-kirill.shutemov@linux.intel.com> References: <20220405232939.73860-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,RDNS_NONE,SPF_HELO_NONE,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org From: Kuppuswamy Sathyanarayanan Secondary CPU startup is currently performed with something called the "INIT/SIPI protocol". This protocol requires assistance from VMMs to boot guests. As should be a familiar story by now, that support can not be provded to TDX guests because TDX VMMs are not trusted by guests. To remedy this situation a new[1] "Multiprocessor Wakeup Structure" has been added to to an existing ACPI table (MADT). This structure provides the physical address of a "mailbox". A write to the mailbox then steers the secondary CPU to the boot code. Add ACPI MADT wake structure parsing support and wake support. Use this support to wake CPUs whenever it is present instead of INIT/SIPI. While this structure can theoretically be used on 32-bit kernels, there are no 32-bit TDX guest kernels. It has not been tested and can not practically *be* tested on 32-bit. Make it 64-bit only. 1. Details about the new structure can be found in ACPI v6.4, in the "Multiprocessor Wakeup Structure" section. Co-developed-by: Sean Christopherson Signed-off-by: Sean Christopherson Reviewed-by: Andi Kleen Reviewed-by: Rafael J. Wysocki Signed-off-by: Kuppuswamy Sathyanarayanan Signed-off-by: Kirill A. Shutemov Reviewed-by: Dave Hansen --- arch/x86/include/asm/apic.h | 5 ++ arch/x86/kernel/acpi/boot.c | 93 ++++++++++++++++++++++++++++++++++++- arch/x86/kernel/apic/apic.c | 10 ++++ 3 files changed, 107 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/asm/apic.h b/arch/x86/include/asm/apic.h index 35006e151774..bd8ae0a7010a 100644 --- a/arch/x86/include/asm/apic.h +++ b/arch/x86/include/asm/apic.h @@ -490,6 +490,11 @@ static inline unsigned int read_apic_id(void) return apic->get_apic_id(reg); } +#ifdef CONFIG_X86_64 +typedef int (*wakeup_cpu_handler)(int apicid, unsigned long start_eip); +extern void acpi_wake_cpu_handler_update(wakeup_cpu_handler handler); +#endif + extern int default_apic_id_valid(u32 apicid); extern int default_acpi_madt_oem_check(char *, char *); extern void default_setup_apic_routing(void); diff --git a/arch/x86/kernel/acpi/boot.c b/arch/x86/kernel/acpi/boot.c index 0d01e7f5078c..6d2c50819501 100644 --- a/arch/x86/kernel/acpi/boot.c +++ b/arch/x86/kernel/acpi/boot.c @@ -65,6 +65,13 @@ static u64 acpi_lapic_addr __initdata = APIC_DEFAULT_PHYS_BASE; static bool acpi_support_online_capable; #endif +#ifdef CONFIG_X86_64 +/* Physical address of the Multiprocessor Wakeup Structure mailbox */ +static u64 acpi_mp_wake_mailbox_paddr; +/* Virtual address of the Multiprocessor Wakeup Structure mailbox */ +static struct acpi_madt_multiproc_wakeup_mailbox *acpi_mp_wake_mailbox; +#endif + #ifdef CONFIG_X86_IO_APIC /* * Locks related to IOAPIC hotplug @@ -336,7 +343,60 @@ acpi_parse_lapic_nmi(union acpi_subtable_headers * header, const unsigned long e return 0; } -#endif /*CONFIG_X86_LOCAL_APIC */ +#ifdef CONFIG_X86_64 +static int acpi_wakeup_cpu(int apicid, unsigned long start_ip) +{ + /* + * Remap mailbox memory only for the first call to acpi_wakeup_cpu(). + * + * Wakeup of secondary CPUs is fully serialized in the core code. + * No need to protect acpi_mp_wake_mailbox from concurrent accesses. + */ + if (!acpi_mp_wake_mailbox) { + acpi_mp_wake_mailbox = memremap(acpi_mp_wake_mailbox_paddr, + sizeof(*acpi_mp_wake_mailbox), + MEMREMAP_WB); + } + + /* + * Mailbox memory is shared between the firmware and OS. Firmware will + * listen on mailbox command address, and once it receives the wakeup + * command, the CPU associated with the given apicid will be booted. + * + * The value of 'apic_id' and 'wakeup_vector' must be visible to the + * firmware before the wakeup command is visible. smp_store_release() + * ensures ordering and visibility. + */ + acpi_mp_wake_mailbox->apic_id = apicid; + acpi_mp_wake_mailbox->wakeup_vector = start_ip; + smp_store_release(&acpi_mp_wake_mailbox->command, + ACPI_MP_WAKE_COMMAND_WAKEUP); + + /* + * Wait for the CPU to wake up. + * + * The CPU being woken up is essentially in a spin loop waiting to be + * woken up. It should not take long for it wake up and acknowledge by + * zeroing out ->command. + * + * ACPI specification doesn't provide any guidance on how long kernel + * has to wait for a wake up acknowledgement. It also doesn't provide + * a way to cancel a wake up request if it takes too long. + * + * In TDX environment, the VMM has control over how long it takes to + * wake up secondary. It can postpone scheduling secondary vCPU + * indefinitely. Giving up on wake up request and reporting error opens + * possible attack vector for VMM: it can wake up a secondary CPU when + * kernel doesn't expect it. Wait until positive result of the wake up + * request. + */ + while (READ_ONCE(acpi_mp_wake_mailbox->command)) + cpu_relax(); + + return 0; +} +#endif /* CONFIG_X86_64 */ +#endif /* CONFIG_X86_LOCAL_APIC */ #ifdef CONFIG_X86_IO_APIC #define MP_ISA_BUS 0 @@ -1083,6 +1143,29 @@ static int __init acpi_parse_madt_lapic_entries(void) } return 0; } + +#ifdef CONFIG_X86_64 +static int __init acpi_parse_mp_wake(union acpi_subtable_headers *header, + const unsigned long end) +{ + struct acpi_madt_multiproc_wakeup *mp_wake; + + if (!IS_ENABLED(CONFIG_SMP)) + return -ENODEV; + + mp_wake = (struct acpi_madt_multiproc_wakeup *)header; + if (BAD_MADT_ENTRY(mp_wake, end)) + return -EINVAL; + + acpi_table_print_madt_entry(&header->common); + + acpi_mp_wake_mailbox_paddr = mp_wake->base_address; + + acpi_wake_cpu_handler_update(acpi_wakeup_cpu); + + return 0; +} +#endif /* CONFIG_X86_64 */ #endif /* CONFIG_X86_LOCAL_APIC */ #ifdef CONFIG_X86_IO_APIC @@ -1278,6 +1361,14 @@ static void __init acpi_process_madt(void) smp_found_config = 1; } + +#ifdef CONFIG_X86_64 + /* + * Parse MADT MP Wake entry. + */ + acpi_table_parse_madt(ACPI_MADT_TYPE_MULTIPROC_WAKEUP, + acpi_parse_mp_wake, 1); +#endif } if (error == -EINVAL) { /* diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c index b70344bf6600..3c8f2c797a98 100644 --- a/arch/x86/kernel/apic/apic.c +++ b/arch/x86/kernel/apic/apic.c @@ -2551,6 +2551,16 @@ u32 x86_msi_msg_get_destid(struct msi_msg *msg, bool extid) } EXPORT_SYMBOL_GPL(x86_msi_msg_get_destid); +#ifdef CONFIG_X86_64 +void __init acpi_wake_cpu_handler_update(wakeup_cpu_handler handler) +{ + struct apic **drv; + + for (drv = __apicdrivers; drv < __apicdrivers_end; drv++) + (*drv)->wakeup_secondary_cpu_64 = handler; +} +#endif + /* * Override the generic EOI implementation with an optimized version. * Only called during early boot when only one CPU is active and with -- 2.35.1