Date: Wed, 6 Dec 2023 18:07:43 +0300
From: "kirill.shutemov@linux.intel.com"
To: "Edgecombe, Rick P"
Cc: "tglx@linutronix.de", "mingo@redhat.com", "x86@kernel.org",
 "bp@alien8.de", "dave.hansen@linux.intel.com",
 "kexec@lists.infradead.org", "Reshetova, Elena", "Nakajima, Jun",
 "rafael@kernel.org", "peterz@infradead.org", "Huang, Kai",
 "sathyanarayanan.kuppuswamy@linux.intel.com", "Hunter, Adrian",
 "thomas.lendacky@amd.com", "ashish.kalra@amd.com",
 "linux-coco@lists.linux.dev", "seanjc@google.com", "bhe@redhat.com",
 "linux-kernel@vger.kernel.org"
Subject: Re: [PATCHv4 10/14] x86/tdx: Convert shared memory back to private on kexec
Message-ID: <20231206150743.ylgdh2b3qjnacws3@box.shutemov.name>
References: <20231205004510.27164-1-kirill.shutemov@linux.intel.com>
 <20231205004510.27164-11-kirill.shutemov@linux.intel.com>
 <3cf8b953c449320cc4c085924ef0e2eed5eadcf7.camel@intel.com>
In-Reply-To:
 <3cf8b953c449320cc4c085924ef0e2eed5eadcf7.camel@intel.com>

On Wed, Dec 06, 2023 at 01:28:08AM +0000, Edgecombe, Rick P wrote:
> On Tue, 2023-12-05 at 03:45 +0300, Kirill A. Shutemov wrote:
> > +static void tdx_kexec_unshare_mem(bool crash)
> > +{
> > +        unsigned long addr, end;
> > +        long found = 0, shared;
> > +
> > +        /* Stop new private<->shared conversions */
> > +        conversion_allowed = false;
>
> I wonder if this might need a compiler barrier here to be totally safe.
> I'm not sure.

Yeah, it should be cleaner with a barrier.

> > +
> > +        /*
> > +         * Crash kernel reaches here with interrupts disabled: can't wait for
> > +         * conversions to finish.
> > +         *
> > +         * If race happened, just report and proceed.
> > +         */
> > +        if (!crash) {
> > +                unsigned long timeout;
> > +
> > +                /*
> > +                 * Wait for in-flight conversions to complete.
> > +                 *
> > +                 * Do not wait more than 30 seconds.
> > +                 */
> > +                timeout = 30 * USEC_PER_SEC;
> > +                while (atomic_read(&conversions_in_progress) && timeout--)
> > +                        udelay(1);
> > +        }
> > +
> > +        if (atomic_read(&conversions_in_progress))
> > +                pr_warn("Failed to finish shared<->private conversions\n");
>
> I can't think of any non-ridiculous way to handle this case.
> Maybe we need VMM help.

Do you see a specific way the VMM can help here?

> > diff --git a/arch/x86/kernel/reboot.c b/arch/x86/kernel/reboot.c
> > index 830425e6d38e..c81afffaa954 100644
> > --- a/arch/x86/kernel/reboot.c
> > +++ b/arch/x86/kernel/reboot.c
> > @@ -12,6 +12,7 @@
> >  #include
> >  #include
> >  #include
> > +#include
> >  #include
> >  #include
> >  #include
> > @@ -31,6 +32,7 @@
> >  #include
> >  #include
> >  #include
> > +#include
> >
> >  /*
> >   * Power off function, if any
> > @@ -716,6 +718,14 @@ static void native_machine_emergency_restart(void)
> >
> >  void native_machine_shutdown(void)
> >  {
> > +        /*
> > +         * Call enc_kexec_unshare_mem() while all CPUs are still active and
> > +         * interrupts are enabled. This will allow all in-flight memory
> > +         * conversions to finish cleanly before unsharing all memory.
> > +         */
> > +        if (cc_platform_has(CC_ATTR_GUEST_MEM_ENCRYPT) && kexec_in_progress)
> > +                x86_platform.guest.enc_kexec_unshare_mem(false);
>
> These questions are coming from an incomplete understanding of the
> kexec/reboot operation. Please disregard if it is not helpful.
>
> By doing this while other tasks can still run, it handles the
> conversion races in the !crash case. But then it sets shared pages to
> NP. What happens if another active task tries to write to one?
>
> I guess we rely on the kernel_restart_prepare()->device_shutdown() to
> clean up, which runs before native_machine_shutdown(). So there might
> be conversions in progress when tdx_kexec_unshare_mem() is called, from
> the allocator work queues. But the actual memory won't be accessed
> during that operation.

Right, devices have to be shut down by then.

> But the console must be active? Or otherwise who can see these
> warnings. It doesn't use a shared page? Or the KVM clock, which looks
> to clean up at cpu tear down, which now happens after
> tdx_kexec_unshare_mem()?
> So I wonder if there might be cases.

Virtio console is not functional by then, but serial is. Serial uses
port I/O and doesn't need shared memory.

> If so, maybe you could halt the conversions in
> native_machine_shutdown(), then do the actual reset to private after
> tasks can't schedule.

It would also mean that we cannot use set_memory_np() there, as it
requires sleepable context. I would rather keep the conversion in the
native_machine_shutdown() path.

> I'd still wonder about if anything might try to
> access a shared page triggered by the console output.

set_memory_np() would make it obvious if it ever happens.

-- 
  Kiryl Shutsemau / Kirill A. Shutemov