Received: by 2002:a05:7412:8d10:b0:f3:1519:9f41 with SMTP id bj16csp6939846rdb; Fri, 15 Dec 2023 12:29:25 -0800 (PST) X-Google-Smtp-Source: AGHT+IHlBWfr5BpDfhsZ/4TGQeywAjyf1B51ZewX3+9ky6Ojl517/rftAx28DrkWB5EGYJHDQSMW X-Received: by 2002:a05:6e02:1c4b:b0:35d:6930:e924 with SMTP id d11-20020a056e021c4b00b0035d6930e924mr17724093ilg.48.1702672164765; Fri, 15 Dec 2023 12:29:24 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1702672164; cv=none; d=google.com; s=arc-20160816; b=xJrnXdtopTT4pyCYjmyeLct6IBXah3Dy20hmj1IL61tif7j/nXVLXUfPVJa0VWbnY3 v6PBh69PCE7NoDxAncIUGyDNJUxjiY+u10D2LtCcaWJLhi2duAftzNPYXMsmq9TJw5d5 Qu8WpBc4Vh4Nym3GUlm1LwsaFjR+x6jc3Leja/1G6/W3XumpEKJNmYR7b7lCvbuxVdU8 EzIhstehfBYN+mlTKymn8OjkF+GTE+4UYrBd/YXJq8cNjq9ZpCMW3LwDgcMPgySRew8x Mfix4c07bR7YV8e0IjzW0ThVTTjYJ9TbEvsy6WSW2ums/AEGisXz2frwmcFtBe4q78sw LfNA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=mime-version:list-unsubscribe:list-subscribe:list-id:precedence :message-id:date:references:in-reply-to:subject:cc:to:dkim-signature :dkim-signature:from; bh=pdq43RkAMkdg64YcpCIrr5Wg9aGVqWCQPynJbOk//bU=; fh=8QDqhbUYqUmo6njZfsgNvzC+8/sB9djjWa/jdu7CcQ8=; b=nFwRDfTi6Gghr0oPYAaadL4j40lCuoY8H/r/NZ696kLQN2TNM4u6Hco2zzAk8ZjW// pE97e7b79Q2h26U/tesCJZO/9BppDpiQ3YqHcdUhImb2a6JjJpZIeJlO9nX1genAvs7m M1CdLIyaNEiNaD/OPrk/Tp+fc2q0ngJCQQfxLtgnF2FGCOjau58DSreFjj/xyBmltc9o Mg/s9mDAbuzHUOXBktFSFkV2ho26LHHtN8QY50LKodu0lCg0B2vYvopNRRLfPRqmRPUv 4h/bpRiQiDkLtYrwqN1KQz/EJP/RF6s2W4O6nhowpTUQ9wFuhVVaspLeTdpSR8PwFixX 37oA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=TtQP+GF6; dkim=neutral (no key) header.i=@linutronix.de header.s=2020e header.b=d849ArFb; spf=pass (google.com: domain of linux-kernel+bounces-1626-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45e3:2400::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-1626-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Return-Path: Received: from sv.mirrors.kernel.org (sv.mirrors.kernel.org. [2604:1380:45e3:2400::1]) by mx.google.com with ESMTPS id v3-20020a655683000000b005775e2a7951si13539001pgs.345.2023.12.15.12.29.24 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 15 Dec 2023 12:29:24 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-1626-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45e3:2400::1 as permitted sender) client-ip=2604:1380:45e3:2400::1; Authentication-Results: mx.google.com; dkim=pass header.i=@linutronix.de header.s=2020 header.b=TtQP+GF6; dkim=neutral (no key) header.i=@linutronix.de header.s=2020e header.b=d849ArFb; spf=pass (google.com: domain of linux-kernel+bounces-1626-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45e3:2400::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-1626-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=linutronix.de Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sv.mirrors.kernel.org (Postfix) with ESMTPS id 1E4E42867A2 for ; Fri, 15 Dec 2023 20:29:24 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 0193E45C1B; Fri, 15 Dec 2023 20:29:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="TtQP+GF6"; dkim=permerror (0-bit key) header.d=linutronix.de header.i=@linutronix.de header.b="d849ArFb" X-Original-To: linux-kernel@vger.kernel.org Received: from galois.linutronix.de (Galois.linutronix.de [193.142.43.55]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CE11A45BE8 for ; Fri, 15 Dec 2023 20:29:15 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linutronix.de Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=linutronix.de From: Thomas Gleixner DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020; t=1702672154; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=pdq43RkAMkdg64YcpCIrr5Wg9aGVqWCQPynJbOk//bU=; b=TtQP+GF6498gZz0p5GV9BYwaeHviZOgZWKhk7Fv+0l5DAjQSNO16kXf3H/3Ec5x3Zk3veD 6Mv5luUPKrBmwgLzqIVKZ30I8T6Ups/t+8RG12TESFuVzfjai1GtQUnyNz5zukSJWVPq2V rGUz8E090pXzNqQpRxLd4Cw1tIvIaNsyJ7pGGdpaa6SRXPiO3btEE8U/IYdNzx7OJvLdg+ nCI5OUFbKeewFcermt0SokMBUzIzVSA4kVzN72x5qOCY/ybspROM1CwFd/DYelIRac4CMW nQY3av0SKR8IiRif3d2L0DtL423+Xix5WKG5FlEkcKnkKjYJBk2xd4rCNsCmig== DKIM-Signature: v=1; a=ed25519-sha256; c=relaxed/relaxed; d=linutronix.de; s=2020e; t=1702672154; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: in-reply-to:in-reply-to:references:references; bh=pdq43RkAMkdg64YcpCIrr5Wg9aGVqWCQPynJbOk//bU=; b=d849ArFbwUVVVXS+AJgHszyAoTXPSCgpdH4Mzi0wPuRdF+SUHe3NpdPovJ4cHqVgo4s2tC 1hKUNFtrIwjsXgDQ== To: "Kirill A. Shutemov" , Ingo Molnar , Borislav Petkov , Dave Hansen , x86@kernel.org Cc: "Rafael J. Wysocki" , Peter Zijlstra , Adrian Hunter , Kuppuswamy Sathyanarayanan , Elena Reshetova , Jun Nakajima , Rick Edgecombe , Tom Lendacky , "Kalra, Ashish" , Sean Christopherson , "Huang, Kai" , Baoquan He , kexec@lists.infradead.org, linux-coco@lists.linux.dev, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: Re: [PATCHv4 14/14] x86/acpi: Add support for CPU offlining for ACPI MADT wakeup method In-Reply-To: <20231205004510.27164-15-kirill.shutemov@linux.intel.com> References: <20231205004510.27164-1-kirill.shutemov@linux.intel.com> <20231205004510.27164-15-kirill.shutemov@linux.intel.com> Date: Fri, 15 Dec 2023 21:29:13 +0100 Message-ID: <877clfmcpy.ffs@tglx> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain On Tue, Dec 05 2023 at 03:45, Kirill A. Shutemov wrote: > MADT Multiprocessor Wakeup structure version 1 brings support of CPU > offlining: BIOS provides a reset vector where the CPU has to jump to > offline itself. CPU has to jump to for offlining itself. > The new TEST mailbox command can be used to test the CPU offlined > successfully and BIOS has control over it. test whether the CPU offlined itself which means the BIOS has control over the CPU and can online it again via the ACPI MADT wakeup method. > Add CPU offling support for ACPI MADT wakeup method by implementing for the ACPI > custom cpu_die, play_dead and stop_other_cpus SMP operations. cpu_die(), play_dead() ... > CPU offlining makes is possible to hand over secondary CPUs over kexec, > not limiting the second kernel to single CPU. to a single CPU. > diff --git a/arch/x86/include/asm/smp.h b/arch/x86/include/asm/smp.h > index 4fab2ed454f3..3c8efba86d5c 100644 > --- a/arch/x86/include/asm/smp.h > +++ b/arch/x86/include/asm/smp.h > @@ -38,6 +38,7 @@ struct smp_ops { > int (*cpu_disable)(void); > void (*cpu_die)(unsigned int cpu); > void (*play_dead)(void); > + void (*crash_play_dead)(void); This new callback and the callsite change wants to be introduced in a preparatory patch. This one is doing too many things at once, really. > diff --git a/arch/x86/kernel/acpi/madt_playdead.S b/arch/x86/kernel/acpi/madt_playdead.S > new file mode 100644 > index 000000000000..68f83865a1e3 > --- /dev/null > +++ b/arch/x86/kernel/acpi/madt_playdead.S > @@ -0,0 +1,21 @@ > +#include > +#include > +#include > +#include > + > + .text > + .align PAGE_SIZE Newline please Please document what the register arguments to this function are. > +SYM_FUNC_START(asm_acpi_mp_play_dead) > + /* Turn off global entries. Following CR3 write will flush them. */ > + movq %cr4, %rdx > + andq $~(X86_CR4_PGE), %rdx > + movq %rdx, %cr4 > + > + /* Switch to identity mapping */ > + movq %rsi, %rax > + movq %rax, %cr3 > + > + /* Jump to reset vector */ > + ANNOTATE_RETPOLINE_SAFE > + jmp *%rdi > +SYM_FUNC_END(asm_acpi_mp_play_dead) > +static u64 acpi_mp_pgd __ro_after_init; > +static u64 acpi_mp_reset_vector_paddr __ro_after_init; > + > +void asm_acpi_mp_play_dead(u64 reset_vector, u64 pgd_pa); Declarations want to be in a header file. > +static void crash_acpi_mp_play_dead(void) > +{ > + asm_acpi_mp_play_dead(acpi_mp_reset_vector_paddr, > + acpi_mp_pgd); Pointless line break. > +} > + > +static void acpi_mp_play_dead(void) > +{ > + play_dead_common(); > + asm_acpi_mp_play_dead(acpi_mp_reset_vector_paddr, > + acpi_mp_pgd); Ditto. > +} > + > +static void acpi_mp_cpu_die(unsigned int cpu) > +{ > + u32 apicid = per_cpu(x86_cpu_to_apicid, cpu); > + unsigned long timeout; > + > + /* > + * Use TEST mailbox command to prove that BIOS got control over > + * the CPU before declaring it dead. > + * > + * BIOS has to clear 'command' field of the mailbox. > + */ > + acpi_mp_wake_mailbox->apic_id = apicid; > + smp_store_release(&acpi_mp_wake_mailbox->command, > + ACPI_MP_WAKE_COMMAND_TEST); > + > + /* Don't wait longer than a second. */ > + timeout = USEC_PER_SEC; > + while (READ_ONCE(acpi_mp_wake_mailbox->command) && timeout--) > + udelay(1); So this waits and then does nothing if the wait fails. What's the point? ... Do we really need this specific hackery or is there some similar identity mapping muck which can be generalized? > + smp_ops.play_dead = acpi_mp_play_dead; > + smp_ops.crash_play_dead = crash_acpi_mp_play_dead; > + smp_ops.cpu_die = acpi_mp_cpu_die; > + smp_ops.stop_other_cpus = acpi_mp_stop_other_cpus; > + > + acpi_mp_reset_vector_paddr = reset_vector; > + acpi_mp_pgd = __pa(pgd); > + > + return 0; > +} > + > static int acpi_wakeup_cpu(u32 apicid, unsigned long start_ip) > { > if (!acpi_mp_wake_mailbox_paddr) { > @@ -68,37 +299,63 @@ static int acpi_wakeup_cpu(u32 apicid, unsigned long start_ip) > return 0; > } > > +static void acpi_mp_disable_offlining(struct acpi_madt_multiproc_wakeup *mp_wake) > +{ > + cpu_hotplug_disable_offlining(); > + > + /* > + * Zero out mailbox address in the ACPI MADT wakeup structure > + * to indicate that the mailbox is not usable. This prevents > + * the kexec()-ed kernel from reading a vaild mailbox, which in > + * turn makes the kexec()-ed kernel only be able to use the boot > + * CPU. > + * > + * This is Linux-specific protocol and not reflected in ACPI spec. > + * > + * acpi_mp_wake_mailbox_paddr already has the mailbox address. > + * The acpi_wakeup_cpu() will use it to bring up secondary cpus for > + * the current kernel. > + */ > + mp_wake->mailbox_address = 0; > +} The previous patch could have split this out into a helper already, no? > + > int __init acpi_parse_mp_wake(union acpi_subtable_headers *header, > const unsigned long end) > { > struct acpi_madt_multiproc_wakeup *mp_wake; > > mp_wake = (struct acpi_madt_multiproc_wakeup *)header; > - if (BAD_MADT_ENTRY(mp_wake, end)) > + > + /* > + * Cannot use the standard BAD_MADT_ENTRY() to sanity check the @mp_wake > + * entry. 'sizeof (struct acpi_madt_multiproc_wakeup)' can be larger > + * than the actual size of the MP wakeup entry in ACPI table because the > + * 'reset_vector' is only available in the V1 MP wakeup structure. > + */ The comment is white space damaged. Use tabs everywhere please and not only in one line. Thanks, tglx