Date: Wed, 13 Mar 2024 17:11:29 +0000
Subject: Re: [PATCH RFC gmem v1 4/8] KVM: x86: Add gmem hook for invalidating memory
To: Sean Christopherson, Michael Roth
Cc: kvm@vger.kernel.org, Suzuki K Poulose, tabba@google.com,
    linux-coco@lists.linux.dev, linux-mm@kvack.org, linux-crypto@vger.kernel.org,
    x86@kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    pbonzini@redhat.com, isaku.yamahata@intel.com,
    ackerleytng@google.com, vbabka@suse.cz, ashish.kalra@amd.com,
    nikunj.dadhania@amd.com, jroedel@suse.de, pankaj.gupta@amd.com
References: <20231016115028.996656-1-michael.roth@amd.com>
 <20231016115028.996656-5-michael.roth@amd.com>
 <84d62953-527d-4837-acf8-315391f4b225@arm.com>
 <20240311172431.zqymfqd4xlpd3pft@amd.com>
From: Steven Price

On 12/03/2024 20:26, Sean Christopherson wrote:
> On Mon, Mar 11, 2024, Michael Roth wrote:
>> On Fri, Feb 09, 2024 at 07:13:13AM -0800, Sean Christopherson wrote:
>>> On Fri, Feb 09, 2024, Steven Price wrote:
>>>>>> One option that I've considered is to implement a separate CCA ioctl to
>>>>>> notify KVM whether the memory should be mapped protected.
>>>>>
>>>>> That's what KVM_SET_MEMORY_ATTRIBUTES+KVM_MEMORY_ATTRIBUTE_PRIVATE is for, no?
>>>>
>>>> Sorry, I really didn't explain that well. Yes, effectively this is the
>>>> attribute flag, but there are corner cases for destruction of the VM. My
>>>> thought was that if the VMM wanted to tear down part of the protected
>>>> range (without making it shared) then a separate ioctl would be needed
>>>> to notify KVM of the unmap.
>>>
>>> No new uAPI should be needed, because the only scenario where a benign VMM
>>> should do this is if the guest also knows the memory is being removed, in
>>> which case PUNCH_HOLE will suffice.
>>>
>>>>>> This 'solves' the problem nicely except for the case where the VMM
>>>>>> deliberately punches holes in memory which the guest is using.
>>>>>
>>>>> I don't see what problem there is to solve in this case. PUNCH_HOLE is
>>>>> destructive, so don't do that.
>>>>
>>>> A well-behaved VMM wouldn't PUNCH_HOLE while the guest is using the memory,
>>>> but my concern here is a VMM which is trying to break the host. In this
>>>> case either the PUNCH_HOLE needs to fail, or we actually need to recover
>>>> the memory from the guest (effectively killing the guest in the process).
>>>
>>> The latter. IIRC, we talked about this exact case somewhere in the hour-long
>>> rambling discussion on guest_memfd at PUCK[1]. And we've definitely discussed
>>> this multiple times on-list, though I don't know that there is a single
>>> thread that captures the entire plan.
>>>
>>> The TL;DR is that gmem will invoke an arch hook for every "struct kvm_gmem"
>>> instance that's attached to a given guest_memfd inode when a page is being
>>> fully removed, i.e. when a page is being freed back to the normal memory
>>> pool. Something like this proposed SNP patch[2].
>>>
>>> Mike, do you have WIP patches you can share?
>>
>> Sorry, I missed this query earlier. I'm a bit confused though; I thought
>> the kvm_arch_gmem_invalidate() hook provided in this patch was what we
>> ended up agreeing on during the PUCK call in question.
>
> Heh, I trust your memory of things far more than I trust mine. I'm just proving
> Cunningham's Law. :-)
>
>> There was an open question about what to do if a use case came along
>> where we needed to pass additional parameters to
>> kvm_arch_gmem_invalidate() other than just the start/end PFN range for
>> the pages being freed, but we'd determined that SNP and TDX did not
>> currently need this, so I didn't have any changes planned in this
>> regard.
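(Just to make sure I'm reading the above correctly, the shape I have in
mind is roughly the sketch below. This is only my illustration, not the
actual patch: the config symbol and the exact free_folio wiring are
guesses on my part; only the kvm_arch_gmem_invalidate() name and the
start/end PFN range come from this thread.

#ifdef CONFIG_HAVE_KVM_GMEM_INVALIDATE
/*
 * Sketch only: guest_memfd's free_folio callback hands the PFN range back
 * to the architecture so it can scrub/reclaim the pages before they are
 * returned to the normal memory pool.
 */
static void kvm_gmem_free_folio(struct folio *folio)
{
	struct page *page = folio_page(folio, 0);
	kvm_pfn_t pfn = page_to_pfn(page);
	int order = folio_order(folio);

	/* Only the start/end PFN range is passed, no per-instance data. */
	kvm_arch_gmem_invalidate(pfn, pfn + (1ul << order));
}
#endif

i.e. no iteration over the attached kvm_gmem instances and no extra
parameters beyond the PFN range.)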
>>
>> If we now have such a need, what we had proposed was to modify
>> __filemap_remove_folio()/page_cache_delete() to defer setting
>> folio->mapping to NULL so that we could still access it in
>> kvm_gmem_free_folio(), and thus still reach mapping->i_private_list
>> to get the list of gmem/KVM instances and pass them on via
>> kvm_arch_gmem_invalidate().
>
> Yeah, this is what I was remembering. I obviously forgot that we didn't have a
> need to iterate over all bindings at this time.
>
>> So that's doable, but it's not clear from this discussion that it's
>> needed.
>
> Same here. And even if it is needed, it's not your problem to solve. The above
> blurb about needing to preserve folio->mapping during free_folio() is
> sufficient to get the ARM code moving in the right direction.
>
> Thanks!
>
>> If the idea is to block/kill the guest if the VMM tries to hole-punch,
>> and ARM CCA already has plans to wire up the shared/private flags in
>> kvm_unmap_gfn_range(), wouldn't that have all the information needed to
>> kill that guest? At that point, kvm_gmem_free_folio() can handle
>> additional per-page cleanup (with additional gmem/KVM info plumbed in
>> if necessary).

Yes, the missing piece of the puzzle was provided by "KVM: Prepare for
handling only shared mappings in mmu_notifier events"[1] - namely the
"only_shared" flag. We don't need to actually block/kill the guest until
it attempts to access the memory which has been removed from the guest -
at that point the guest cannot continue, because the security properties
have been violated (the protected memory contents have been lost), so any
attempt to continue the guest will fail.

You can ignore most of my other ramblings - as long as everyone is happy
with that flag then Arm CCA should be fine. I was just looking at other
options.

Thanks,
Steve

[1] https://lore.kernel.org/lkml/20231027182217.3615211-13-seanjc@google.com/
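P.S. In case it is useful to see it spelled out, the way I picture the Arm
CCA side consuming that flag is roughly the sketch below. This is purely
illustrative: the realm_* helper names are placeholders rather than
anything I have posted; only the only_shared field and the kvm_gfn_range
start/end come from [1] and the existing API.

/*
 * Illustrative arch-side handling of a kvm_gfn_range carrying the
 * only_shared hint. realm_unmap_shared_range() and
 * realm_reclaim_protected_range() are placeholder names, not functions
 * from any posted series.
 */
static bool realm_unmap_gfn_range(struct kvm *kvm, struct kvm_gfn_range *range)
{
	/*
	 * Invalidations that only cover the shared alias cannot lose any
	 * protected contents, so treat them as a normal unmap.
	 */
	if (range->only_shared)
		return realm_unmap_shared_range(kvm, range->start, range->end);

	/*
	 * Protected pages are being removed and their contents destroyed.
	 * Nothing needs to be done to the guest here; if it later accesses
	 * this range the access cannot be satisfied and the guest cannot
	 * usefully continue.
	 */
	return realm_reclaim_protected_range(kvm, range->start, range->end);
}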