From: Sean Christopherson
To: Michael Roth
Cc: Steven Price, kvm@vger.kernel.org, Suzuki K Poulose, tabba@google.com,
	linux-coco@lists.linux.dev, linux-mm@kvack.org, linux-crypto@vger.kernel.org,
	x86@kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	pbonzini@redhat.com, isaku.yamahata@intel.com, ackerleytng@google.com,
	vbabka@suse.cz, ashish.kalra@amd.com, nikunj.dadhania@amd.com,
	jroedel@suse.de, pankaj.gupta@amd.com
Date: Tue, 12 Mar 2024 13:26:42 -0700
Subject: Re: [PATCH RFC gmem v1 4/8] KVM: x86: Add gmem hook for invalidating memory
In-Reply-To: <20240311172431.zqymfqd4xlpd3pft@amd.com>
References: <20231016115028.996656-1-michael.roth@amd.com>
	<20231016115028.996656-5-michael.roth@amd.com>
	<84d62953-527d-4837-acf8-315391f4b225@arm.com>
	<20240311172431.zqymfqd4xlpd3pft@amd.com>
Content-Type: text/plain; charset="us-ascii"

On Mon, Mar 11, 2024, Michael Roth wrote:
> On Fri, Feb 09, 2024 at 07:13:13AM -0800, Sean Christopherson wrote:
> > On Fri, Feb 09, 2024, Steven Price wrote:
> > > >> One option that I've considered is to implement a separate CCA ioctl to
> > > >> notify KVM whether the memory should be mapped protected.
> > > >
> > > > That's what KVM_SET_MEMORY_ATTRIBUTES+KVM_MEMORY_ATTRIBUTE_PRIVATE is for, no?
> > >
> > > Sorry, I really didn't explain that well. Yes, effectively this is the
> > > attribute flag, but there are corner cases for destruction of the VM. My
> > > thought was that if the VMM wanted to tear down part of the protected
> > > range (without making it shared) then a separate ioctl would be needed
> > > to notify KVM of the unmap.
> >
> > No new uAPI should be needed, because the only time a benign VMM should
> > do this is if the guest also knows the memory is being removed, in which
> > case PUNCH_HOLE will suffice.
> >
> > > >> This 'solves' the problem nicely except for the case where the VMM
> > > >> deliberately punches holes in memory which the guest is using.
> > > >
> > > > I don't see what problem there is to solve in this case. PUNCH_HOLE is
> > > > destructive, so don't do that.
> > >
> > > A well-behaved VMM wouldn't PUNCH_HOLE when the guest is using it, but
> > > my concern here is a VMM which is trying to break the host.
> > > In this case either the PUNCH_HOLE needs to fail, or we actually need
> > > to recover the memory from the guest (effectively killing the guest in
> > > the process).
> >
> > The latter. IIRC, we talked about this exact case somewhere in the
> > hour-long rambling discussion on guest_memfd at PUCK[1]. And we've
> > definitely discussed this multiple times on-list, though I don't know
> > that there is a single thread that captures the entire plan.
> >
> > The TL;DR is that gmem will invoke an arch hook for every "struct
> > kvm_gmem" instance that's attached to a given guest_memfd inode when a
> > page is being fully removed, i.e. when a page is being freed back to the
> > normal memory pool. Something like this proposed SNP patch[2].
> >
> > Mike, do you have WIP patches you can share?
>
> Sorry, I missed this query earlier. I'm a bit confused though; I thought
> the kvm_arch_gmem_invalidate() hook provided in this patch was what we
> ended up agreeing on during the PUCK call in question.

Heh, I trust your memory of things far more than I trust mine. I'm just
proving Cunningham's Law. :-)

> There was an open question about what to do if a use case came along
> where we needed to pass additional parameters to
> kvm_arch_gmem_invalidate() other than just the start/end PFN range for
> the pages being freed, but we'd determined that SNP and TDX did not
> currently need this, so I didn't have any changes planned in this
> regard.
>
> If we now have such a need, what we had proposed was to modify
> __filemap_remove_folio()/page_cache_delete() to defer setting
> folio->mapping to NULL so that we could still access it in
> kvm_gmem_free_folio(), and thus still access mapping->i_private_list
> to get the list of gmem/KVM instances and pass them on via
> kvm_arch_gmem_invalidate().

Yeah, this is what I was remembering. I obviously forgot that we didn't
have a need to iterate over all bindings at this time.
> So that's doable, but it's not clear from this discussion that that's
> needed.

Same here. And even if it is needed, it's not your problem to solve. The
above blurb about needing to preserve folio->mapping until free_folio()
is sufficient to get the ARM code moving in the right direction. Thanks!

> If the idea is to block/kill the guest if the VMM tries to hole-punch,
> and ARM CCA already has plans to wire up the shared/private flags in
> kvm_unmap_gfn_range(), wouldn't that have all the information needed to
> kill that guest? At that point, kvm_gmem_free_folio() can handle
> additional per-page cleanup (with additional gmem/KVM info plumbed in
> if necessary).