Date: Thu, 14 Sep 2023 12:12:14 -0700
In-Reply-To: <253965df-6d80-bbfd-ab01-f9e69b274bf3@quicinc.com>
References: <253965df-6d80-bbfd-ab01-f9e69b274bf3@quicinc.com>
Subject: Re: [RFC PATCH v11 12/29] KVM: Add KVM_CREATE_GUEST_MEMFD ioctl() for guest-specific backing memory
From: Sean Christopherson
To: Elliot Berman
Cc: Ackerley Tng, pbonzini@redhat.com, maz@kernel.org, oliver.upton@linux.dev,
    chenhuacai@kernel.org, mpe@ellerman.id.au, anup@brainfault.org,
    paul.walmsley@sifive.com, palmer@dabbelt.com, aou@eecs.berkeley.edu,
    willy@infradead.org, akpm@linux-foundation.org,
    paul@paul-moore.com, jmorris@namei.org, serge@hallyn.com,
    kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org,
    kvmarm@lists.linux.dev, linux-mips@vger.kernel.org,
    linuxppc-dev@lists.ozlabs.org, kvm-riscv@lists.infradead.org,
    linux-riscv@lists.infradead.org, linux-fsdevel@vger.kernel.org,
    linux-mm@kvack.org, linux-security-module@vger.kernel.org,
    linux-kernel@vger.kernel.org, chao.p.peng@linux.intel.com,
    tabba@google.com, jarkko@kernel.org, yu.c.zhang@linux.intel.com,
    vannapurve@google.com, mail@maciej.szmigiero.name, vbabka@suse.cz,
    david@redhat.com, qperret@google.com, michael.roth@amd.com,
    wei.w.wang@intel.com, liam.merwick@oracle.com, isaku.yamahata@gmail.com,
    kirill.shutemov@linux.intel.com

On Mon, Aug 28, 2023, Elliot Berman wrote:
> I had a 3rd question that's related to how to wire the gmem up to a virtual
> machine:
>
> I learned of a usecase to implement copy-on-write for gmem. The premise
> would be to have a "golden copy" of the memory that multiple virtual
> machines can map in as RO. If a virtual machine tries to write to those
> pages, they get copied to a virtual machine-specific page that isn't shared
> with other VMs. How do we track those pages?

The answer is going to be gunyah specific, because gmem itself isn't designed to
provide a virtualization layer ("virtual" in the virtual memory sense, not in the
virtual machine sense).
Like any other CoW implementation, the RO page would need to be copied to a
different physical page, and whatever layer translates gfns to physical pages
would need to be updated.  E.g. in gmem terms, allocate a new gmem page/instance
and update the gfn=>gmem[offset] translation in KVM/gunyah.

For VMA-based memory, that translation happens in the primary MMU, and is largely
transparent to KVM (or any other secondary MMU).  E.g. the primary MMU works with
the backing store (if necessary) to allocate a new page and do the copy, notifies
secondary MMUs, zaps the old PTE(s), and then installs the new PTE(s).  KVM/gunyah
just needs to react to the mmu_notifier event, e.g. zap secondary MMU PTEs, and
then KVM/gunyah naturally gets the new, writable page/PTE when following the host
virtual address, e.g. via gup().

The downside of eliminating the middle-man (primary MMU) from gmem is that the
"owner" (KVM or gunyah) is now responsible for these types of operations.  For
some things, e.g. page migration, it's actually easier in some ways, but for CoW
it's quite a bit more work for KVM/gunyah because KVM/gunyah now needs to do
things that were previously handled by the primary MMU.

In KVM, assuming no additional support in KVM, doing CoW would mean modifying
memslots to redirect the gfn from the RO page to the writable page.  For a variety
of reasons, that would be _extremely_ expensive in KVM, but still possible.  If
there were a strong use case for supporting CoW with KVM+gmem, then I suspect that
we'd probably implement new KVM uAPI of some form to provide reasonable
performance.  But I highly doubt we'll ever do that, because one of the core
tenets of KVM+gmem is to isolate guest memory from the rest of the world, and
especially from host userspace, and that just doesn't mesh well with CoW'd memory
being shared across multiple VMs.
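
To make the "copy the page, then update the gfn=>gmem[offset] translation" step
concrete, here is a toy userspace model of what the owner would have to do on a
write to a shared RO page.  This is only a sketch under loud assumptions: it is
not KVM or gunyah code, and Backing, Mapping, ToyVM and all of their methods are
hypothetical names invented for illustration.  A "backing" stands in for one gmem
instance, and the per-VM dict stands in for whatever layer translates gfns.

```python
from dataclasses import dataclass, field

PAGE_SIZE = 4096  # assumed page size for the toy model


@dataclass
class Backing:
    """Hypothetical stand-in for one gmem instance: pages indexed by offset."""
    pages: list = field(default_factory=list)

    def alloc_page(self, src=None):
        # Allocate a zeroed page, or a private copy of src if one is given.
        page = bytearray(PAGE_SIZE) if src is None else bytearray(src)
        self.pages.append(page)
        return len(self.pages) - 1


@dataclass
class Mapping:
    """One gfn => backing[offset] translation entry."""
    backing: Backing
    offset: int
    writable: bool


class ToyVM:
    """Hypothetical owner of the gfn translation (the KVM/gunyah role)."""

    def __init__(self):
        self.private = Backing()  # VM-specific pages, never shared
        self.gfn_map = {}         # gfn -> Mapping

    def map_readonly(self, gfn, golden, offset):
        self.gfn_map[gfn] = Mapping(golden, offset, writable=False)

    def read(self, gfn, index):
        m = self.gfn_map[gfn]
        return m.backing.pages[m.offset][index]

    def write(self, gfn, index, value):
        m = self.gfn_map[gfn]
        if not m.writable:
            # CoW: copy the RO page into a VM-private page, then redirect
            # the gfn to the new backing/offset.  No primary MMU does this
            # for us; the owner updates the translation itself.
            new_offset = self.private.alloc_page(m.backing.pages[m.offset])
            m = Mapping(self.private, new_offset, writable=True)
            self.gfn_map[gfn] = m
        m.backing.pages[m.offset][index] = value


# Two VMs map the same golden page RO; only vm_a writes to it.
golden = Backing()
off = golden.alloc_page()
vm_a, vm_b = ToyVM(), ToyVM()
vm_a.map_readonly(0x100, golden, off)
vm_b.map_readonly(0x100, golden, off)
vm_a.write(0x100, 0, 0xAB)
```

After the write, vm_a's gfn 0x100 points at its private copy, while vm_b still
resolves the same gfn to the untouched golden page.  The real cost discussed
above is in the translation update itself (memslots in KVM's case), which the
dict assignment here glosses over entirely.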