Message-Id: <80aad2f9-9612-4e87-a27a-755d3fa97c92@www.fastmail.com>
References: <20220310140911.50924-1-chao.p.peng@linux.intel.com>
 <88620519-029e-342b-0a85-ce2a20eaf41b@arm.com>
Date: Thu, 31 Mar 2022 09:04:56 -0700
From: "Andy Lutomirski"
To: "Sean Christopherson", "Quentin Perret"
Cc: "Steven Price", "Chao Peng", "kvm list",
 "Linux Kernel Mailing List", linux-mm@kvack.org,
 linux-fsdevel@vger.kernel.org, "Linux API", qemu-devel@nongnu.org,
 "Paolo Bonzini", "Jonathan Corbet", "Vitaly Kuznetsov", "Wanpeng Li",
 "Jim Mattson", "Joerg Roedel", "Thomas Gleixner", "Ingo Molnar",
 "Borislav Petkov", "the arch/x86 maintainers", "H. Peter Anvin",
 "Hugh Dickins", "Jeff Layton", "J . Bruce Fields", "Andrew Morton",
 "Mike Rapoport", "Maciej S . Szmigiero", "Vlastimil Babka",
 "Vishal Annapurve", "Yu Zhang", "Kirill A. Shutemov", "Nakajima, Jun",
 "Dave Hansen", "Andi Kleen", "David Hildenbrand", "Marc Zyngier",
 "Will Deacon"
Subject: Re: [PATCH v5 00/13] KVM: mm: fd-based approach for supporting KVM guest private memory
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Mar 30, 2022, at 10:58 AM, Sean Christopherson wrote:
> On Wed, Mar 30, 2022, Quentin Perret wrote:
>> On Wednesday 30 Mar 2022 at 09:58:27 (+0100), Steven Price wrote:
>> > On 29/03/2022 18:01, Quentin Perret wrote:
>> > > Is implicit sharing a thing? E.g., if a guest makes a memory access in
>> > > the shared gpa range at an address that doesn't have a backing memslot,
>> > > will KVM check whether there is a corresponding private memslot at the
>> > > right offset with a hole punched and report a KVM_EXIT_MEMORY_ERROR? Or
>> > > would that just generate an MMIO exit as usual?
>> >
>> > My understanding is that the guest needs some way of tagging whether a
>> > page is expected to be shared or private. On the architectures I'm aware
>> > of this is done by effectively stealing a bit from the IPA space and
>> > pretending it's a flag bit.
>>
>> Right, and that is in fact the main point of divergence we have I think.
>> While I understand this might be necessary for TDX and the likes, this
>> makes little sense for pKVM.
>> This would effectively embed into the IPA a
>> purely software-defined non-architectural property/protocol although we
>> don't actually need to: we (pKVM) can reasonably expect the guest to
>> explicitly issue hypercalls to share pages in-place. So I'd be really
>> keen to avoid baking in assumptions about that model too deep in the
>> host mm bits if at all possible.
>
> There is no assumption about stealing PA bits baked into this API. Even within
> x86 KVM, I consider it a hard requirement that the common flows not assume the
> private vs. shared information is communicated through the PA.

Quentin, I think we might need a clarification. The API in this
patchset indeed has no requirement that a PA bit distinguish between
private and shared, but I think it makes at least a weak assumption
that *something*, a priori, distinguishes them. In particular, there
are private memslots and shared memslots, so the logical flow of
resolving a guest memory access looks like:

1. guest accesses a GVA
2. read guest paging structures
3. determine whether this is a shared or private access
4. read host (KVM memslots and anything else, EPT, NPT, RMP, etc)
   structures accordingly. In particular, the memslot to reference is
   different depending on the access type.

For TDX, this maps on to the fd-based model perfectly: the host-side
paging structures for the shared and private slots are completely
separate. For SEV, the structures are shared and KVM will need to
figure out what to do in case a private and shared memslot overlap.
Presumably it's sufficient to declare that one of them wins, although
actually determining which one is active for a given GPA may involve
checking whether the backing store for a given page actually exists.

But I don't understand pKVM well enough to understand how it fits in.
Quentin, how is the shared vs private mode of a memory access
determined? How do the paging structures work?
Can a guest switch between shared and private by issuing a hypercall
without changing any guest-side paging structures or anything else?

It's plausible that SEV and (maybe) pKVM would be better served if
memslots could be sparse or if there was otherwise a direct way for
host userspace to indicate to KVM which address ranges are actually
active (not hole-punched) in a given memslot or to otherwise be able
to make a rule that two different memslots (one shared and one
private) can't claim the same address.
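
To make the resolution flow and the "can't claim the same address" rule from this message concrete, here is a rough toy model in Python. It is only a sketch and not kernel code: `Memslot`, `resolve`, and `add_slot_exclusive` are hypothetical names invented for illustration, addresses are plain guest frame numbers, and hole-punched pages are modelled as a set of page offsets. It assumes the shared-vs-private decision (step 3) has already been made by some other means.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Memslot:
    """Toy stand-in for a memslot (hypothetical, not the kernel structure)."""
    base_gfn: int
    npages: int
    private: bool
    # Page offsets within the slot that have no backing store (hole-punched).
    holes: Set[int] = field(default_factory=set)

    def contains(self, gfn: int) -> bool:
        return self.base_gfn <= gfn < self.base_gfn + self.npages

    def is_backed(self, gfn: int) -> bool:
        return self.contains(gfn) and (gfn - self.base_gfn) not in self.holes


def resolve(slots: List[Memslot], gfn: int, private_access: bool) -> Optional[Memslot]:
    """Step 4 of the flow: consult only slots matching the access type."""
    for slot in slots:
        if slot.private == private_access and slot.is_backed(gfn):
            return slot
    return None  # no active backing for this access type


def overlapping(a: Memslot, b: Memslot) -> bool:
    return a.base_gfn < b.base_gfn + b.npages and b.base_gfn < a.base_gfn + a.npages


def add_slot_exclusive(slots: List[Memslot], new: Memslot) -> None:
    """Hypothetical rule: a shared and a private slot may not claim the same address."""
    for slot in slots:
        if slot.private != new.private and overlapping(slot, new):
            raise ValueError("shared and private memslots overlap")
    slots.append(new)
```

With a rule like `add_slot_exclusive` in place, the "which one wins" question never arises, since at most one slot of either kind can cover a given address. Without it, something like the `is_backed` check would be needed to decide which of two overlapping slots is active.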