Date: Fri, 20 May 2022 21:43:35 +0300
From: "Kirill A. Shutemov" <kirill.shutemov@linux.intel.com>
Shutemov" To: Sean Christopherson Cc: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de, dave.hansen@intel.com, luto@kernel.org, peterz@infradead.org, ak@linux.intel.com, dan.j.williams@intel.com, david@redhat.com, hpa@zytor.com, linux-kernel@vger.kernel.org, sathyanarayanan.kuppuswamy@linux.intel.com, thomas.lendacky@amd.com, x86@kernel.org Subject: Re: [PATCHv2 3/3] x86/tdx: Handle load_unaligned_zeropad() page-cross to a shared page Message-ID: <20220520184335.oygw2q3rov2go45b@black.fi.intel.com> References: <20220520031316.47722-1-kirill.shutemov@linux.intel.com> <20220520031316.47722-4-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: X-Spam-Status: No, score=-2.4 required=5.0 tests=BAYES_00,DKIMWL_WL_HIGH, DKIM_SIGNED,DKIM_VALID,HEADER_FROM_DIFFERENT_DOMAINS, MAILING_LIST_MULTI,RDNS_NONE,SPF_HELO_NONE,T_SCC_BODY_TEXT_LINE autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Fri, May 20, 2022 at 05:47:30PM +0000, Sean Christopherson wrote: > On Fri, May 20, 2022, Kirill A. Shutemov wrote: > > load_unaligned_zeropad() can lead to unwanted loads across page boundaries. > > The unwanted loads are typically harmless. But, they might be made to > > totally unrelated or even unmapped memory. load_unaligned_zeropad() > > relies on exception fixup (#PF, #GP and now #VE) to recover from these > > unwanted loads. > > > > In TDX guests, the second page can be shared page and VMM may configure > > it to trigger #VE. > > > > Kernel assumes that #VE on a shared page is MMIO access and tries to > > decode instruction to handle it. In case of load_unaligned_zeropad() it > > may result in confusion as it is not MMIO access. > > > > Check fixup table before trying to handle MMIO. > > > > The issue was discovered by analysis. It was not triggered during the > > testing. > > > > Signed-off-by: Kirill A. Shutemov > > --- > > arch/x86/coco/tdx/tdx.c | 20 ++++++++++++++++++++ > > 1 file changed, 20 insertions(+) > > > > diff --git a/arch/x86/coco/tdx/tdx.c b/arch/x86/coco/tdx/tdx.c > > index 010dc229096a..1a1c8a92cfa5 100644 > > --- a/arch/x86/coco/tdx/tdx.c > > +++ b/arch/x86/coco/tdx/tdx.c > > @@ -11,6 +11,8 @@ > > #include > > #include > > #include > > +#include > > +#include > > > > /* TDX module Call Leaf IDs */ > > #define TDX_GET_INFO 1 > > @@ -299,6 +301,24 @@ static int handle_mmio(struct pt_regs *regs, struct ve_info *ve) > > if (WARN_ON_ONCE(user_mode(regs))) > > return -EFAULT; > > > > + /* > > + * load_unaligned_zeropad() relies on exception fixups in case of the > > + * word being a page-crosser and the second page is not accessible. > > + * > > + * In TDX guests, the second page can be shared page and VMM may > > + * configure it to trigger #VE. > > + * > > + * Kernel assumes that #VE on a shared page is MMIO access and tries to > > + * decode instruction to handle it. In case of load_unaligned_zeropad() > > + * it may result in confusion as it is not MMIO access. > > The guest kernel can't know that it's not "MMIO", e.g. nothing prevents the host > from manually serving accesses to some chunk of shared memory instead of backing > the shared chunk with host DRAM. It would require the guest to access shared memory only with instructions that we can deal with. I don't think we have such guarantee. 
> > +	 *
> > +	 * Check the fixup table before trying to handle MMIO.
> 
> This ordering is wrong, fixup should be done if and only if the instruction
> truly "faults". E.g. if there's an MMIO access lurking in the kernel that is
> wrapped in exception fixup, then this will break that usage and provide
> garbage data on a read and drop any write.

When I tried to trigger the bug, the #VE handling actually succeeded,
because load_unaligned_zeropad() uses an instruction we can decode. But
due to the misalignment, the part of the word that came from the
non-shared page got overwritten with data that came from the VMM.

I guess we can try to detect misaligned accesses and handle them
correctly. But it gets complicated and is easier to screw up.

Do we ever use exception fixups for MMIO accesses, to justify the
complication?

-- 
  Kirill A. Shutemov
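[For concreteness, the "detect misaligned accesses" direction floated above
could take roughly this shape inside handle_mmio(). A sketch, not the code
from the patch under review; it assumes 'insn' and 'size' were already
populated by the instruction decode earlier in the function.]

	unsigned long vaddr;

	/*
	 * MMIO accesses are expected to be naturally aligned and therefore
	 * never cross a page boundary. A page-crossing access seen in the
	 * #VE handler can only come from load_unaligned_zeropad() (or a
	 * bug), so reject it and let the exception fixup recover, instead
	 * of decoding it as MMIO.
	 */
	vaddr = (unsigned long)insn_get_addr_ref(&insn, regs);
	if (vaddr / PAGE_SIZE != (vaddr + size - 1) / PAGE_SIZE)
		return -EFAULT;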