Received: by 2002:a05:6358:489b:b0:bb:da1:e618 with SMTP id x27csp8151719rwn; Wed, 14 Sep 2022 09:40:22 -0700 (PDT) X-Google-Smtp-Source: AA6agR4YvrguoRuqcPGb8yeBcbAhwP1FVjdPJJl7meae4WA58KOALGmiZXg4HV/KzhmwzUdqBU0T X-Received: by 2002:aa7:80d3:0:b0:52b:9237:a355 with SMTP id a19-20020aa780d3000000b0052b9237a355mr38161527pfn.73.1663173622721; Wed, 14 Sep 2022 09:40:22 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1663173622; cv=none; d=google.com; s=arc-20160816; b=CxZ/FsNkVQcCI9F2i0EKyj55zeH2/D1Vjh6b46i+bOFFJNC/4ZMCBJ9o+u4+5QNtYW hGZRqBYA09H7hESLT+J0s3GSX/dyy5c8evv+M8h3cGVWF4iPgB5eoU2+PosNFC+K490J M85J0QyBF9v2zP5ga8ypZjRviSywoXMTCcNeH4XWE45+XY67nfWl/K5A8rCkKedzoZB5 eHa+I4/2iW+xsHw+/lhKu+bKB5+DMEewIQvkgeq4c5zVGFXT8NZiBwgvVRmd4IuiNFOx TgeSzJ4NfCxw6rgJlmuSg+gjNyJigcxW65MbsxII1We3vJyCImedBrFIm3cBUxe+f0Vp xUsQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=TLSZ0fG4jkVsKxHaOJUp349np8FVF8YfoTMtMUTuG8A=; b=sdZYAFJupzz9BhLDlcloAu9nDA/+mq3/p6+A6yKYFuS1up8wCZ30/JfIUtkfUDU3X8 SZPCUsW6aPcH3/Q5ylxZvQkakX/FNXiIoMMRzm2eJioOBVqkAj3KUUJtqTup/LXP7+D9 bDAGicKf+IMAaQK3/NjnItWAS0jrLs3uo12drXY6L3kmVnKBmLagsUtJnBs3psgIeVm6 pTopARvvtvbDqC2HhJ1Zpxr9VZhcAA6Bf9kRmO7qBWpzNjfgcY2xl6zsrHu0IQ6Rfg+J zgjK608jHQwpoHSaHM/VXunc6kci8pVUyV7rG4EkDqdRrV8SCF+0Ntqj1oy8Aa/QojHu cjjA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=lYAYteHJ; spf=pass (google.com: domain of linux-crypto-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-crypto-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id t10-20020a17090340ca00b00177e9288cdesi14020006pld.180.2022.09.14.09.40.01; Wed, 14 Sep 2022 09:40:22 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-crypto-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20210112 header.b=lYAYteHJ; spf=pass (google.com: domain of linux-crypto-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-crypto-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S229797AbiINQdS (ORCPT + 99 others); Wed, 14 Sep 2022 12:33:18 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:56226 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S229875AbiINQdA (ORCPT ); Wed, 14 Sep 2022 12:33:00 -0400 Received: from mail-yw1-x1135.google.com (mail-yw1-x1135.google.com [IPv6:2607:f8b0:4864:20::1135]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C6949832F4 for ; Wed, 14 Sep 2022 09:32:20 -0700 (PDT) Received: by mail-yw1-x1135.google.com with SMTP id 00721157ae682-349c4310cf7so21584417b3.3 for ; Wed, 14 Sep 2022 09:32:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:from:to:cc:subject:date; bh=TLSZ0fG4jkVsKxHaOJUp349np8FVF8YfoTMtMUTuG8A=; b=lYAYteHJi3IbnOFnnIGskPJm4LFXQyIlElbzGWypdvdmuOzPYj/pxEWqsuDYmXdtDT +J1fUhuR0nhJSF9vE2IpcB6JMQQVhOxL1FrdrpmKfq/+Pf/0NtZGKAKyIJNPCIIbB3kj vWi4nL+PVud7ZLGOFm7hlsQsP4WGqcGQY5y652IsMJt0X4Satpn4pwnGti6aqOjR7WNb b5tss/uSLmxVOvAGyxZK6jMqSNUzxkIwbjBSIb/bf0uu+j7cYjltNEAMKT0SrlRij08o vO6mJbmXCXzYCHWU2jM86xagKPMTdIsdFaNh0fAXa8lsKrIt9q1/gcJSJ01jSLIuS/+g IKcw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=cc:to:subject:message-id:date:from:in-reply-to:references :mime-version:x-gm-message-state:from:to:cc:subject:date; bh=TLSZ0fG4jkVsKxHaOJUp349np8FVF8YfoTMtMUTuG8A=; b=v2fTTHi+r2dA4WxJo220JfNs93Hp/3vwxqk6hLH11r6vWjrWMmf1k9W8H0I4xf2DO9 dqMLJLRFxMtpXUDqMspCw7T6t4x31yVETCoxjdzovkR4I7vQkMW1ekaT0DscUxT/jIKw BW2PJKBJdw5IT+2B8NC3QcKghjdxRcOropb3k4oGOAYYH4uXS13G7By4TfVowE4f203r V07OzcdZUHw+d/LLTLL8EeWT3V990ld9xznlT0NMyne6MbajZIH1SzRV88RQHzIAztfc x5DJMfD6UjgSm/ttAAQuaz3Q7nzqYBbenWGx4odu0SLQKkPelZfbTDeWQ90Vp0ci/ihN z7OQ== X-Gm-Message-State: ACgBeo16ApqtMHrUtD3g2XH1aF2JBRobJH4u6jk0SpLHndL0gXuae839 R5IjMHloDHF9syuqT1xpks5iJ7f9TqDLjaEMMR4lkw== X-Received: by 2002:a81:7cd7:0:b0:345:221c:5671 with SMTP id x206-20020a817cd7000000b00345221c5671mr30574218ywc.297.1663173139128; Wed, 14 Sep 2022 09:32:19 -0700 (PDT) MIME-Version: 1.0 References: <20210820155918.7518-1-brijesh.singh@amd.com> <20210820155918.7518-40-brijesh.singh@amd.com> <4e41dcff-7c7b-cf36-434a-c7732e7e8ff2@amd.com> <20220908212114.sqne7awimfwfztq7@amd.com> In-Reply-To: From: Marc Orr Date: Wed, 14 Sep 2022 17:32:08 +0100 Message-ID: Subject: Re: [PATCH Part2 v5 39/45] KVM: SVM: Introduce ops for the post gfn map and unmap To: Sean Christopherson Cc: Michael Roth , Brijesh Singh , x86 , LKML , kvm list , linux-coco@lists.linux.dev, Linux Memory Management List , Linux Crypto Mailing List , Thomas Gleixner , Ingo Molnar , Joerg Roedel , Tom Lendacky , "H. Peter Anvin" , Ard Biesheuvel , Paolo Bonzini , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Andy Lutomirski , Dave Hansen , Sergio Lopez , Peter Gonda , Peter Zijlstra , Srinivas Pandruvada , David Rientjes , Dov Murik , Tobin Feldman-Fitzthum , Borislav Petkov , Vlastimil Babka , "Kirill A . Shutemov" , Andi Kleen , Tony Luck , Sathyanarayanan Kuppuswamy , jarkko@profian.com Content-Type: text/plain; charset="UTF-8" X-Spam-Status: No, score=-17.6 required=5.0 tests=BAYES_00,DKIMWL_WL_MED, DKIM_SIGNED,DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF, ENV_AND_HDR_SPF_MATCH,RCVD_IN_DNSWL_NONE,SPF_HELO_NONE,SPF_PASS, T_SCC_BODY_TEXT_LINE,USER_IN_DEF_DKIM_WL,USER_IN_DEF_SPF_WL autolearn=unavailable autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-crypto@vger.kernel.org On Wed, Sep 14, 2022 at 5:15 PM Sean Christopherson wrote: > > On Wed, Sep 14, 2022, Marc Orr wrote: > > On Wed, Sep 14, 2022 at 9:05 AM Sean Christopherson wrote: > > > > > > On Thu, Sep 08, 2022, Michael Roth wrote: > > > > On Fri, Oct 15, 2021 at 05:16:28PM +0000, Sean Christopherson wrote: > > > > So in the context of this interim solution, we're trying to look for a > > > > solution that's simple enough that it can be used reliably, without > > > > introducing too much additional complexity into KVM. There is one > > > > approach that seems to fit that bill, that Brijesh attempted in an > > > > earlier version of this series (I'm not sure what exactly was the > > > > catalyst to changing the approach, as I wasn't really in the loop at > > > > the time, but AIUI there weren't any showstoppers there, but please > > > > correct me if I'm missing anything): > > > > > > > > - if the host is writing to a page that it thinks is supposed to be > > > > shared, and the guest switches it to private, we get an RMP fault > > > > (actually, we will get a !PRESENT fault, since as of v5 we now > > > > remove the mapping from the directmap as part of conversion) > > > > - in the host #PF handler, if we see that the page is marked private > > > > in the RMP table, simply switch it back to shared > > > > - if this was a bug on the part of the host, then the guest will see > > > > > > As discussed off-list, attempting to fix up RMP violations in the host #PF handler > > > is not a viable approach. There was also extensive discussion on-list a while back: > > > > > > https://lore.kernel.org/all/8a244d34-2b10-4cf8-894a-1bf12b59cf92@www.fastmail.com > > > > I mentioned this during Mike's talk at the micro-conference: For pages > > mapped in by the kernel can we disallow them to be converted to > > private? > > In theory, yes. Do we want to do something like this? No. kmap() does something > vaguely similar for 32-bit PAE/PSE kernels, but that's a lot of complexity and > overhead to take on. And this issue goes far beyond a kmap(); when the kernel gup()s > a page, the kernel expects the pfn to be available, no exceptions (pun intended). > > > Note, userspace accesses are already handled by UPM. > > I'm confused by the UPM comment. Isn't the gist of this thread about the ability > to merge SNP _without_ UPM? Or am I out in left field? I think that was the overall gist: yes. But it's not what I was trying to comment on :-). HOWEVER, thinking about this more: I was confused when I wrote out my last reply. I had thought that the issue that Michael brought up applied even with UPM. That is, I was thinking it was still possibly for a guest to maliciously convert a page to private mapped in by the kernel and assumed to be shared. But I now realize that is not what will actually happen. To be concrete, let's assume the GHCB page. What will happen is: - KVM has GHCB page mapped in. GHCB is always assumed to be shared. So far so good. - Malicious guest converts GHCB page to private (e.g., via Page State Change request) - Guest exits to KVM - KVM exits to userspace VMM - Userspace VM allocates page in private FD. Now, what happens here depends on how UPM works. If we allow double allocation then our host kernel is safe. However, now we have the "double allocation problem". If on the other hand, we deallocate the page in the shared FD, the host kernel can segfault. And now we actually do have essentially the same problem Michael was describing that we have without UPM. Because we'll end up in fault.c in the kernel context and likely panic the host. I hope I got this right this time. Sorry for the confusion on my last reply. > > In pseudo-code, I'm thinking something like this: > > > > kmap_helper() { > > // And all other interfaces where the kernel can map a GPA > > // into the kernel page tables > > mapped_into_kernel_mem_set[hpa] = true; > > } > > > > kunmap_helper() { > > // And all other interfaces where the kernel can unmap a GPA > > // into the kernel page tables > > mapped_into_kernel_mem_set[hpa] = false; > > > > // Except it's not this simple because we probably need ref counting > > // for multiple mappings. Sigh. But you get the idea. > > A few issues off the top of my head: > > - It's not just refcounting, there would also likely need to be locking to > guarantee sane behavior. > - kmap() isn't allowed to fail and RMPUPDATE isn't strictly guaranteed to succeed, > which is problematic if the kernel attempts to kmap() a page that's already > private, especially for kmap_atomic(), which isn't allowed to sleep. > - Not all kernel code is well behaved and bounces through kmap(); undoubtedly > some of the 1200+ users of page_address() will be problematic. > > $ git grep page_address | wc -l > 1267 > - It's not sufficient for TDX. Merging something this complicated when we know > we still need UPM would be irresponsible from a maintenance perspective. > - KVM would need to support two separate APIs for SNP, which I very much don't > want to do. Ack on merging without UPM. I wasn't trying to chime in on merging before/after UPM. See my other comment above. Sorry for the confusion. Ack on other concerns about "enlightening kmap" as well. I agree.