From: "Kirill A. Shutemov"
To: Thomas Gleixner, Ingo Molnar, Borislav Petkov, Dave Hansen,
	x86@kernel.org
Cc: "Rafael J. Wysocki", Peter Zijlstra, Adrian Hunter,
	Kuppuswamy Sathyanarayanan, Elena Reshetova, Jun Nakajima,
	Rick Edgecombe, Tom Lendacky, "Kalra, Ashish",
	Sean Christopherson, "Huang, Kai", Ard Biesheuvel, Baoquan He,
	"H. Peter Anvin", "Kirill A. Shutemov", "K. Y. Srinivasan",
	Haiyang Zhang, kexec@lists.infradead.org,
	linux-hyperv@vger.kernel.org, linux-acpi@vger.kernel.org,
	linux-coco@lists.linux.dev, linux-kernel@vger.kernel.org,
	Nikolay Borisov, Tao Liu
Subject: [PATCHv11 10/19] x86/mm: Add callbacks to prepare encrypted memory for kexec
Date: Tue, 28 May 2024 12:55:13 +0300
Message-ID: <20240528095522.509667-11-kirill.shutemov@linux.intel.com>
In-Reply-To: <20240528095522.509667-1-kirill.shutemov@linux.intel.com>
References: <20240528095522.509667-1-kirill.shutemov@linux.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

AMD SEV and Intel TDX guests allocate shared buffers for performing I/O.
This is done by allocating pages normally from the buddy allocator and
then converting them to shared using set_memory_decrypted().

On kexec, the second kernel is unaware of which memory has been
converted in this manner. It only sees E820_TYPE_RAM. Accessing shared
memory as private is fatal.

Therefore, the memory state must be reset to its original state before
starting the new kernel with kexec.

The process of converting shared memory back to private occurs in two
steps:

- enc_kexec_begin() stops new conversions.

- enc_kexec_finish() unshares all existing shared memory, reverting it
  back to private.

Signed-off-by: Kirill A. Shutemov
Reviewed-by: Nikolay Borisov
Reviewed-by: Kai Huang
Tested-by: Tao Liu
---
 arch/x86/include/asm/x86_init.h |  9 +++++++++
 arch/x86/kernel/crash.c         | 12 ++++++++++++
 arch/x86/kernel/reboot.c        | 12 ++++++++++++
 arch/x86/kernel/x86_init.c      |  4 ++++
 4 files changed, 37 insertions(+)

diff --git a/arch/x86/include/asm/x86_init.h b/arch/x86/include/asm/x86_init.h
index 28ac3cb9b987..6cade48811cc 100644
--- a/arch/x86/include/asm/x86_init.h
+++ b/arch/x86/include/asm/x86_init.h
@@ -149,12 +149,21 @@ struct x86_init_acpi {
  * @enc_status_change_finish	Notify HV after the encryption status of a range is changed
  * @enc_tlb_flush_required	Returns true if a TLB flush is needed before changing page encryption status
  * @enc_cache_flush_required	Returns true if a cache flush is needed before changing page encryption status
+ * @enc_kexec_begin		Begin the two-step process of converting shared memory back
+ *				to private. It stops the new conversions from being started
+ *				and waits for in-flight conversions to finish, if possible.
+ * @enc_kexec_finish		Finish the two-step process of converting shared memory to
+ *				private. All memory is private after the call.
+ *				It is called with all CPUs but one shut down and interrupts
+ *				disabled.
  */
 struct x86_guest {
 	int (*enc_status_change_prepare)(unsigned long vaddr, int npages, bool enc);
 	int (*enc_status_change_finish)(unsigned long vaddr, int npages, bool enc);
 	bool (*enc_tlb_flush_required)(bool enc);
 	bool (*enc_cache_flush_required)(void);
+	void (*enc_kexec_begin)(bool crash);
+	void (*enc_kexec_finish)(void);
 };
 
 /**
diff --git a/arch/x86/kernel/crash.c b/arch/x86/kernel/crash.c
index f06501445cd9..74f6305eb9ec 100644
--- a/arch/x86/kernel/crash.c
+++ b/arch/x86/kernel/crash.c
@@ -128,6 +128,18 @@ void native_machine_crash_shutdown(struct pt_regs *regs)
 #ifdef CONFIG_HPET_TIMER
 	hpet_disable();
 #endif
+
+	/*
+	 * Non-crash kexec calls enc_kexec_begin() while scheduling is still
+	 * active. This allows the callback to wait until all in-flight
+	 * shared<->private conversions are complete. In a crash scenario,
+	 * enc_kexec_begin() gets called after all but one CPU has been shut down
+	 * and interrupts have been disabled. This only allows the callback to
+	 * detect a race with the conversion and report it.
+	 */
+	x86_platform.guest.enc_kexec_begin(true);
+	x86_platform.guest.enc_kexec_finish();
+
 	crash_save_cpu(regs, safe_smp_processor_id());
 }
 
diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
index f3130f762784..097313147ad3 100644
--- a/arch/x86/kernel/reboot.c
+++ b/arch/x86/kernel/reboot.c
@@ -12,6 +12,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -716,6 +717,14 @@ static void native_machine_emergency_restart(void)
 
 void native_machine_shutdown(void)
 {
+	/*
+	 * Call enc_kexec_begin() while all CPUs are still active and
+	 * interrupts are enabled. This will allow all in-flight memory
+	 * conversions to finish cleanly.
+	 */
+	if (kexec_in_progress)
+		x86_platform.guest.enc_kexec_begin(false);
+
 	/* Stop the cpus and apics */
 #ifdef CONFIG_X86_IO_APIC
 	/*
@@ -752,6 +761,9 @@ void native_machine_shutdown(void)
 #ifdef CONFIG_X86_64
 	x86_platform.iommu_shutdown();
 #endif
+
+	if (kexec_in_progress)
+		x86_platform.guest.enc_kexec_finish();
 }
 
 static void __machine_emergency_restart(int emergency)
diff --git a/arch/x86/kernel/x86_init.c b/arch/x86/kernel/x86_init.c
index a7143bb7dd93..8a79fb505303 100644
--- a/arch/x86/kernel/x86_init.c
+++ b/arch/x86/kernel/x86_init.c
@@ -138,6 +138,8 @@ static int enc_status_change_prepare_noop(unsigned long vaddr, int npages, bool
 static int enc_status_change_finish_noop(unsigned long vaddr, int npages, bool enc) { return 0; }
 static bool enc_tlb_flush_required_noop(bool enc) { return false; }
 static bool enc_cache_flush_required_noop(void) { return false; }
+static void enc_kexec_begin_noop(bool crash) {}
+static void enc_kexec_finish_noop(void) {}
 static bool is_private_mmio_noop(u64 addr) {return false; }
 
 struct x86_platform_ops x86_platform __ro_after_init = {
@@ -161,6 +163,8 @@ struct x86_platform_ops x86_platform __ro_after_init = {
 		.enc_status_change_finish = enc_status_change_finish_noop,
 		.enc_tlb_flush_required = enc_tlb_flush_required_noop,
 		.enc_cache_flush_required = enc_cache_flush_required_noop,
+		.enc_kexec_begin = enc_kexec_begin_noop,
+		.enc_kexec_finish = enc_kexec_finish_noop,
 	},
 };
 
-- 
2.43.0
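
For readers who want to see the two-step ordering in isolation, here is a
minimal userspace sketch of it. It is not kernel code and not what any
platform actually installs: struct guest_ops, the page_shared[] array and
every model_*() helper are hypothetical names made up for illustration.
Only the callback names and the begin-early/finish-late ordering are taken
from the patch above.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the two callbacks added to struct x86_guest. */
struct guest_ops {
	void (*enc_kexec_begin)(bool crash);
	void (*enc_kexec_finish)(void);
};

/* Toy model of guest memory state (an assumption, not a kernel structure). */
#define NR_PAGES 8
static bool page_shared[NR_PAGES];
static bool conversions_stopped;

/* Models a private->shared conversion; refused once step 1 has run. */
static int model_share_page(size_t idx)
{
	if (conversions_stopped || idx >= NR_PAGES)
		return -1;
	page_shared[idx] = true;
	return 0;
}

/*
 * Step 1: stop new conversions. A real callback could also wait for
 * in-flight conversions when crash is false; this model has none.
 */
static void model_kexec_begin(bool crash)
{
	(void)crash;
	conversions_stopped = true;
}

/* Step 2: revert every shared page to private. */
static void model_kexec_finish(void)
{
	for (size_t i = 0; i < NR_PAGES; i++)
		page_shared[i] = false;
}

/* Counts pages that are still marked shared. */
static size_t model_count_shared(void)
{
	size_t n = 0;

	for (size_t i = 0; i < NR_PAGES; i++)
		n += page_shared[i];
	return n;
}

static const struct guest_ops model_guest = {
	.enc_kexec_begin  = model_kexec_begin,
	.enc_kexec_finish = model_kexec_finish,
};

/* Mirrors the non-crash ordering: begin first, finish last. */
static size_t model_kexec(const struct guest_ops *ops)
{
	ops->enc_kexec_begin(false);
	/* The kernel stops the other CPUs and shuts devices down here. */
	ops->enc_kexec_finish();
	return model_count_shared();
}
```

Sharing a couple of pages with model_share_page() and then calling
model_kexec(&model_guest) leaves no page shared, and any later
model_share_page() call is refused. In the patch itself the defaults are
no-ops, so platforms that do not override the callbacks see no change.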