Received: by 2002:a05:6358:c692:b0:131:369:b2a3 with SMTP id fe18csp422333rwb; Tue, 25 Jul 2023 18:41:38 -0700 (PDT) X-Google-Smtp-Source: APBJJlEVhx+zJIxwxf+sgCxA3+6HCh2yStiTS2A3rpkyDVkUjm16IDz3WNc0zGnGKUxg5fNYlILa X-Received: by 2002:a05:6358:89d:b0:133:c62:59f4 with SMTP id m29-20020a056358089d00b001330c6259f4mr556997rwj.24.1690335698486; Tue, 25 Jul 2023 18:41:38 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1690335698; cv=none; d=google.com; s=arc-20160816; b=RkCjJzEfblVFZERQWmwJ5aG5GJ0HkufpO+1CulduA1qpnY6hbmHgJ+jBQvWOoRZnOT H+H4awBk+BHN6CjhTioTv7hvK0V2s0aVjryfLqGZcfqYusq6dho2dj3rADnWJXuuc4OF 5SiUMgTuJ/+ciNT9t1pVpfi5lx5/UknEAXp31nC1uwEI1ZIaUb1FZd/6DApLOzEuzfEa T5RkFp0e6dRz4YKjM7HKe1L+pq/v2wtvsm5qBlRSqDSkFumGikG2NAX3LPdULzSgzmOb kpSxoFY4YN86yH1yj6sbkE/XqYSVwbgAvoFUkLGH2u6T3djVdAq0yAwm3ftgRiKejZUl 7BUA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:in-reply-to:content-disposition:mime-version :references:message-id:subject:cc:to:from:date:dkim-signature; bh=AiaTv32zBX7LzK4EHHyJzhObJfD9EUca5Ecsb0mzi8c=; fh=4XslgJTnG8HtKjsWD9DQZuXpVj4IlumluARxO2PrSm8=; b=Ox3cUNS73N1BltXsqC3MJDGsKAgPMBi2P9J2XI2nPojRhbby7oGiv1swG3vMAqqDjH LWiASVQ174cBPsBuWtkUV8Kq3k9Ma7a06kZRMEvnHDtxHgfcaPJtnfPwMv2kxdbnWTQh nOaPLs1ktD2pwL8GLWaMakdT7WQcPcJJCbmFHvz59uQNdY4Z04k9mAkqtsMCGHBlfXPa DijleMLibyaRecq4KX9TIgx7AkC7mGb+BDyeCoCUx+9WJBx6tK5HLbbSp3x3cvkFrBCO LMG7AFq07VlIROH66TzbDyZNdNGIuWGY78kXOdUCpP5CR553MF9s8B8Yq6sTMw+9HYZB Bxew== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=eAhiDWPC; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id w4-20020a656944000000b0053ef08b29bbsi11838379pgq.564.2023.07.25.18.41.26; Tue, 25 Jul 2023 18:41:38 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=eAhiDWPC; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231518AbjGYXMo (ORCPT + 99 others); Tue, 25 Jul 2023 19:12:44 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33178 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231379AbjGYXMm (ORCPT ); Tue, 25 Jul 2023 19:12:42 -0400 Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 71733193 for ; Tue, 25 Jul 2023 16:12:40 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1690326760; x=1721862760; h=date:from:to:cc:subject:message-id:references: mime-version:in-reply-to; bh=4s9F2RMzca1SnjP8iJv+eGAM2teJG1YvpI+AnkF73n0=; b=eAhiDWPCb1a5cDcwcOnG3a9L48NSFiJkhy2MCGfC5t+L0uKRQvXHABGS zy9a8fCiSz2dud3euK8lMMhwvme55VDIDIDLToiuptqCGqbdnU6jIJasV Z7b8OgjIHxcFKlp/KVzMjvuMaklfi8NnamlSS0dIQ8Ni3DUfAM1hHeyhc W/ud3AfD/FUpMaZTfMyZG9Ujoo4/XVyU7RRnbJkVbc+Yp4N6Opujxev8J pnH5Xq23xtvqOGRvvbHyUb54VV5xgQrWL48JXbOdYOQFJAuxETRMnGO1h c91p9TvR3Pdt3VsQjxDkRSdeX8vHSub6YSjhtHH8ZcBu1CBcXK3RkYclU Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10782"; a="434119089" X-IronPort-AV: E=Sophos;i="6.01,231,1684825200"; d="scan'208";a="434119089" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Jul 2023 16:12:39 -0700 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10782"; a="850212118" X-IronPort-AV: E=Sophos;i="6.01,231,1684825200"; d="scan'208";a="850212118" Received: from mlytkin-mobl2.ger.corp.intel.com (HELO box.shutemov.name) ([10.252.57.129]) by orsmga004-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 25 Jul 2023 16:12:36 -0700 Received: by box.shutemov.name (Postfix, from userid 1000) id 55635103A12; Wed, 26 Jul 2023 02:12:33 +0300 (+03) Date: Wed, 26 Jul 2023 02:12:33 +0300 From: "Kirill A. Shutemov" To: "Michael Kelley (LINUX)" Cc: "Kirill A. Shutemov" , "dave.hansen@intel.com" , "tglx@linutronix.de" , "mingo@redhat.com" , "bp@alien8.de" , Dexuan Cui , "rick.p.edgecombe@intel.com" , "sathyanarayanan.kuppuswamy@linux.intel.com" , "seanjc@google.com" , "thomas.lendacky@amd.com" , "x86@kernel.org" , "linux-kernel@vger.kernel.org" Subject: Re: [PATCHv3 0/3] x86/tdx: Fix one more load_unaligned_zeropad() issue Message-ID: <20230725231233.mu2yarso3tcamimu@box.shutemov.name> References: <20230606095622.1939-1-kirill.shutemov@linux.intel.com> <20230707140633.jzuucz52d7jdc763@box.shutemov.name> <20230709060904.w3czdz23453eyx2h@box.shutemov.name> <20230724231927.pah3dt6gszwtsu45@box.shutemov.name> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Spam-Status: No, score=-4.3 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_EF,RCVD_IN_DNSWL_MED, RCVD_IN_MSPIKE_H2,SPF_HELO_NONE,SPF_NONE,T_SCC_BODY_TEXT_LINE, URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Tue, Jul 25, 2023 at 03:51:24PM +0000, Michael Kelley (LINUX) wrote: > From: Kirill A. Shutemov Sent: Monday, July 24, 2023 4:19 PM > > > > On Thu, Jul 13, 2023 at 02:43:39PM +0000, Michael Kelley (LINUX) wrote: > > > From: Kirill A. Shutemov Sent: Saturday, July 8, 2023 11:09 PM > > > > > > > > On Sat, Jul 08, 2023 at 11:53:08PM +0000, Michael Kelley (LINUX) wrote: > > > > > From: Kirill A. Shutemov Sent: Friday, July 7, 2023 7:07 AM > > > > > > > > > > > > On Thu, Jul 06, 2023 at 04:48:32PM +0000, Michael Kelley (LINUX) wrote: > > > > > > > From: Kirill A. Shutemov Sent: Tuesday, June 6, 2023 2:56 AM > > > > > > > > > > [snip] > > > > > > > > > > > > > > > > > It only addresses the problem that happens on transition, but > > > > > > load_unaligned_zeropad() is still a problem for the shared mappings in > > > > > > general, after transition is complete. Like if load_unaligned_zeropad() > > > > > > steps from private to shared mapping and shared mapping triggers #VE, > > > > > > kernel should be able to handle it. > > > > > > > > > > I'm showing my ignorance of TDX architectural details, but what's the > > > > > situation where shared mappings in general can trigger a #VE? How > > > > > do such situations get handled for references that aren't from > > > > > load_unaligned_zeropad()? > > > > > > > > > > > > > Shared mappings are under host/VMM control. It can just not map the page > > > > in shared-ept and trigger ept-violation #VE. > > > > > > I know you are out on vacation, but let me follow up now for further > > > discussion when you are back. > > > > > > Isn't the scenario you are describing a malfunctioning or malicious > > > host/VMM? Would what you are describing be done as part of normal > > > operation? Kernel code must have switched the page from private to > > > shared for some purpose. As soon as that code (which presumably > > > does not have any entry in the exception table) touches the page, it > > > would take the #VE and the enter the die path because there's no fixup. > > > So is there value in having load_unaligned_zeropad() handle the #VE and > > > succeed where a normal reference would fail? > > > > #VE on shared memory is legitimately used for MMIO. But MMIO region is > > usually separate from the real memory in physical address space. > > > > But we also have DMA. > > > > DMA pages allocated from common pool of memory and they can be next to > > dentry cache that kernel accesses with load_unaligned_zeropad(). DMA pages > > are shared, but they usually backed by memory and not cause #VE. However > > shared memory is under full control from VMM and VMM can remove page at > > any point which would make platform to deliver #VE to the guest. This is > > pathological scenario, but I think it still worth handling gracefully. > > Yep, pages targeted by DMA have been transitioned to shared, and could > be scattered anywhere in the guest physical address space. Such a page > could be touched by load_unaligned_zeropad(). But could you elaborate > more on the "pathological scenario" where the guest physical page isn't > backed by memory? > > Sure, the VMM can remove the page at any point, but why would it do > so? Is doing so a legitimate part of the host/guest contract that the guest > must handle cleanly? More importantly, what is the guest expected to > do in such a case? I would expect that at some point such a DMA page > is accessed by a guest vCPU with an explicit reference that is not > load_unaligned_zeropad(). Then the guest would take a #VE that > goes down the #GP path and invokes die(). > > I don't object to make the load_unaligned_zeropad() fixup path work > correctly in response to a #VE, as it seems like general goodness. I'm > just trying to make sure I understand the nuances of "why". :-) We actually triggered the issue during internal testing. I wrote initial patch after that. But looking back on the report I don't have an answer why the page triggered #VE. Maybe VMM or virtual BIOS screwed up and put MMIO next to normal memory, I donno. > > > I'd still like to see the private <-> shared transition code mark the pages > > > as invalid during the transition, and avoid the possibility of #VE and > > > similar cases with SEV-SNP. Such approach reduces (eliminates?) > > > entanglement between CoCo-specific exceptions and > > > load_unaligned_zeropad(). It also greatly simplifies TD Partition cases > > > and SEV-SNP cases where a paravisor is used. > > > > I doesn't eliminates issue for TDX as the scenario above is not transient. > > It can happen after memory is converted to shared. > > Notwithstanding, do you have any objections to the private <-> shared > transition code being changed so it won't be the cause of #VE, and > similar on SEV-SNP? I am not yet convinced it is needed. But let's see the code. -- Kiryl Shutsemau / Kirill A. Shutemov