Date: Thu, 15 Jul 2021 21:53:18 +0000
From: Sean Christopherson
To: Brijesh Singh
Cc: Dave Hansen, x86@kernel.org, linux-kernel@vger.kernel.org,
    kvm@vger.kernel.org, linux-efi@vger.kernel.org,
    platform-driver-x86@vger.kernel.org, linux-coco@lists.linux.dev,
    linux-mm@kvack.org, linux-crypto@vger.kernel.org, Thomas Gleixner,
    Ingo Molnar, Joerg Roedel, Tom Lendacky, "H. Peter Anvin",
    Ard Biesheuvel, Paolo Bonzini, Vitaly Kuznetsov, Wanpeng Li,
    Jim Mattson, Andy Lutomirski, Dave Hansen, Sergio Lopez, Peter Gonda,
    Peter Zijlstra, Srinivas Pandruvada, David Rientjes, Dov Murik,
    Tobin Feldman-Fitzthum, Borislav Petkov, Michael Roth, Vlastimil Babka,
    tony.luck@intel.com, npmccallum@redhat.com, brijesh.ksingh@gmail.com
Subject: Re: [PATCH Part2 RFC v4 10/40] x86/fault: Add support to handle
    the RMP fault for user address
References: <20210707183616.5620-1-brijesh.singh@amd.com>
    <20210707183616.5620-11-brijesh.singh@amd.com>
    <3c6b6fc4-05b2-8d18-2eb8-1bd1a965c632@intel.com>
    <2b4accb6-b68e-02d3-6fed-975f90558099@amd.com>
    <5592d8ff-e2c3-6474-4a10-96abe1962d6f@amd.com>
    <298d2e19-566d-2e58-b639-724c10885646@amd.com>
In-Reply-To: <298d2e19-566d-2e58-b639-724c10885646@amd.com>
X-Mailing-List: linux-crypto@vger.kernel.org

On Mon, Jul 12, 2021, Brijesh Singh wrote:
>
>
> On 7/12/21 11:29 AM, Dave Hansen wrote:
> > On 7/12/21 9:24 AM, Brijesh Singh wrote:
> > > Apologies if I was not clear in the messaging, that's exactly what I
> > > mean that we don't feed RMP entries during the page state change.
> > >
> > > The sequence of the operation is:
> > >
> > > 1. Guest issues a VMGEXIT (page state change) to add a page in the RMP
> > > 2. Hypervisor adds the page in the RMP table.
> > >
> > > The check will be inside the hypervisor (#2), to query the backing page
> > > type, if the backing page is from the hugetlbfs, then don't add the page
> > > in the RMP, and fail the page state change VMGEXIT.
> >
> > Right, but *LOOOOOONG* before that, something walked the page tables and
> > stuffed the PFN into the NPT (that's the AMD equivalent of EPT, right?).
> > You could also avoid this whole mess by refusing to allow hugetlbfs to
> > be mapped into the guest in the first place.
> >
>
> Ah, that should be doable.
> For SEV stuff, we require the VMM to register the
> memory region to the hypervisor during the VM creation time. I can check the
> hugetlbfs while registering the memory region and fail much earlier.

That's technically unnecessary, because this patch is working on the wrong
set of page tables when handling faults from KVM. The host page tables
constrain KVM's NPT, but the two are not mirrors of each other.
Specifically, an NPT mapping cannot be larger than the corresponding host
mapping because that would give the guest access to memory it does not own,
but KVM isn't required to use the same size as the host. E.g. a 1gb page in
the host can be 1gb, 2mb, or 4kb in the NPT.

The code "works" because the size constraints mean it can't get false
negatives, only false positives, and false positives will never be fatal.
E.g. the fault handler may unnecessarily demote a 1gb page, and demoting a
host page will further constrain KVM's NPT.

The distinction matters because it changes our options. For RMP violations
on NPT due to page size mismatches, KVM can and should handle the fault
without consulting the primary MMU, i.e. by demoting the NPT entry. That
means KVM does not need to care about hugetlbfs or any other backing type
that cannot be split, since KVM will never initiate a host page split in
response to a #NPT RMP violation.

That doesn't mean that hugetlbfs will magically work since e.g.
get/put_user() will fault and fail, but that's a generic non-KVM problem
since nothing prevents remapping and/or accessing the page(s) outside of
KVM context.

The other reason to not disallow hugetlbfs and co. is that a guest that's
enlightened to operate at 2mb granularity, e.g. always do page state
changes on 2mb chunks, can play nice with hugetlbfs without ever hitting an
RMP violation.

Last thought, have we taken care in the guest side of things to work at 2mb
granularity when possible? AFAICT, PSMASH is effectively a one-way street
since RMPUPDATE to restore a 2mb RMP is destructive, i.e.
requires PVALIDATE on the entire 2mb chunk, and the guest can't safely do that without reinitializing the whole page, e.g. would either lose data or have to save/init/restore.
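
To make the host-vs-NPT size relationship above concrete, here is a rough
userspace sketch in C. It is a toy model, not kernel code: the enum, the
struct, and both function names are made up for illustration, and it
ignores most of what the real hardware and KVM's MMU actually check. It
only shows the two points argued above: the NPT level is clamped by the
host level, and a size-mismatch RMP fault can be resolved by demoting the
NPT entry alone.

```c
#include <assert.h>

/* Hypothetical page-size levels, ordered small to large (not KVM's enums). */
enum pg_level { PG_4K = 1, PG_2M = 2, PG_1G = 3 };

/* Toy view of one guest mapping; all fields are illustrative assumptions. */
struct mapping {
	enum pg_level host_level;	/* size used by the host page tables */
	enum pg_level npt_level;	/* size used by KVM's NPT */
	enum pg_level rmp_level;	/* size recorded in the RMP entry */
};

/* The NPT may map at the host's size or smaller, never larger. */
static enum pg_level npt_max_level(enum pg_level host_level,
				   enum pg_level wanted)
{
	return wanted < host_level ? wanted : host_level;
}

/*
 * Resolve a size-mismatch RMP fault by shrinking only the NPT entry.
 * Returns 1 if the NPT entry was demoted, 0 if there was no mismatch.
 * The host mapping is never touched, so an unsplittable backing such as
 * hugetlbfs is not a problem on this path.
 */
static int handle_npt_rmp_size_fault(struct mapping *m)
{
	/* Invariant from the clamp above: NPT never exceeds the host size. */
	assert(m->npt_level <= m->host_level);

	if (m->npt_level <= m->rmp_level)
		return 0;

	m->npt_level = m->rmp_level;
	return 1;
}
```

In this model, a mapping that is 1gb in both the host and the NPT but 4kb
in the RMP ends up 4kb in the NPT and still 1gb in the host after one call
to handle_npt_rmp_size_fault(); a second call finds nothing to do.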