Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 34D3BC433EF for ; Thu, 23 Dec 2021 12:33:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S243825AbhLWMd6 (ORCPT ); Thu, 23 Dec 2021 07:33:58 -0500 Received: from mga17.intel.com ([192.55.52.151]:7858 "EHLO mga17.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1348474AbhLWMdw (ORCPT ); Thu, 23 Dec 2021 07:33:52 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1640262832; x=1671798832; h=from:to:cc:subject:date:message-id:in-reply-to: references; bh=gr3qIQtqhPgZ2sWGOjcNk2j5JakW0dks3WgAyQ1CBMA=; b=K71yiEJ/fCwQmNrv28asR3IyWbW8VKgc8yn1G/fvQ+NkHoreUFgshfY9 Iam2RisJwnuOngCOY7PZfvYs19LSV9Hr0EiEKUKrYlrzcHUpjAhBeuI5F MdsJGe5AttFnOvR6Cy47OXO6m/PSMua60LgFRkYnDvHPIyMdjy8Now83m Zll5R99S7pwW1xeYEaqAqKBu3yhM3Ha7x0vobk8JVVWi2m3kEcUKS28or ImHHTDqCNk0CHT9UyqqmclYYnBffpWvGMj0mjUQpOJKYI3rcQod7VUNLr bqmLLOh3988I41YWVoUVqRv+DtVFD43S2+wm04OgAtSnsQGc5DIpn9Ef2 g==; X-IronPort-AV: E=McAfee;i="6200,9189,10206"; a="221493021" X-IronPort-AV: E=Sophos;i="5.88,229,1635231600"; d="scan'208";a="221493021" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 23 Dec 2021 04:33:02 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.88,229,1635231600"; d="scan'208";a="522079184" Received: from chaop.bj.intel.com ([10.240.192.101]) by orsmga008.jf.intel.com with ESMTP; 23 Dec 2021 04:32:54 -0800 From: Chao Peng To: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, qemu-devel@nongnu.org Cc: Paolo Bonzini , Jonathan Corbet , Sean Christopherson , Vitaly Kuznetsov , Wanpeng Li , Jim Mattson , Joerg Roedel , Thomas Gleixner , Ingo Molnar , Borislav Petkov , x86@kernel.org, "H . Peter Anvin" , Hugh Dickins , Jeff Layton , "J . Bruce Fields" , Andrew Morton , Yu Zhang , Chao Peng , "Kirill A . Shutemov" , luto@kernel.org, john.ji@intel.com, susie.li@intel.com, jun.nakajima@intel.com, dave.hansen@intel.com, ak@linux.intel.com, david@redhat.com Subject: [PATCH v3 kvm/queue 16/16] KVM: Register/unregister private memory slot to memfd Date: Thu, 23 Dec 2021 20:30:11 +0800 Message-Id: <20211223123011.41044-17-chao.p.peng@linux.intel.com> X-Mailer: git-send-email 2.17.1 In-Reply-To: <20211223123011.41044-1-chao.p.peng@linux.intel.com> References: <20211223123011.41044-1-chao.p.peng@linux.intel.com> Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Expose KVM_MEM_PRIVATE flag and register/unregister private memory slot to memfd when userspace sets the flag. KVM_MEM_PRIVATE is disallowed by default but architecture code can turn on it by implementing kvm_arch_private_memory_supported(). Signed-off-by: Yu Zhang Signed-off-by: Chao Peng --- include/linux/kvm_host.h | 1 + virt/kvm/kvm_main.c | 34 ++++++++++++++++++++++++++++++++-- 2 files changed, 33 insertions(+), 2 deletions(-) diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h index fabab3b77d57..5173c52e70d4 100644 --- a/include/linux/kvm_host.h +++ b/include/linux/kvm_host.h @@ -1229,6 +1229,7 @@ bool kvm_arch_dy_has_pending_interrupt(struct kvm_vcpu *vcpu); int kvm_arch_post_init_vm(struct kvm *kvm); void kvm_arch_pre_destroy_vm(struct kvm *kvm); int kvm_arch_create_vm_debugfs(struct kvm *kvm); +bool kvm_arch_private_memory_supported(struct kvm *kvm); #ifndef __KVM_HAVE_ARCH_VM_ALLOC /* diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index cf8dcb3b8c7f..1caebded52c4 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1514,10 +1514,19 @@ static void kvm_replace_memslot(struct kvm *kvm, } } -static int check_memory_region_flags(const struct kvm_userspace_memory_region_ext *mem) +bool __weak kvm_arch_private_memory_supported(struct kvm *kvm) +{ + return false; +} + +static int check_memory_region_flags(struct kvm *kvm, + const struct kvm_userspace_memory_region_ext *mem) { u32 valid_flags = KVM_MEM_LOG_DIRTY_PAGES; + if (kvm_arch_private_memory_supported(kvm)) + valid_flags |= KVM_MEM_PRIVATE; + #ifdef __KVM_HAVE_READONLY_MEM valid_flags |= KVM_MEM_READONLY; #endif @@ -1756,6 +1765,8 @@ static void kvm_delete_memslot(struct kvm *kvm, struct kvm_memory_slot *old, struct kvm_memory_slot *invalid_slot) { + if (old->flags & KVM_MEM_PRIVATE) + kvm_memfd_unregister(old); /* * Remove the old memslot (in the inactive memslots) by passing NULL as * the "new" slot, and for the invalid version in the active slots. @@ -1836,6 +1847,14 @@ static int kvm_set_memslot(struct kvm *kvm, kvm_invalidate_memslot(kvm, old, invalid_slot); } + if (new->flags & KVM_MEM_PRIVATE && change == KVM_MR_CREATE) { + r = kvm_memfd_register(kvm, new); + if (r) { + mutex_unlock(&kvm->slots_arch_lock); + return r; + } + } + r = kvm_prepare_memory_region(kvm, old, new, change); if (r) { /* @@ -1850,6 +1869,10 @@ static int kvm_set_memslot(struct kvm *kvm, } else { mutex_unlock(&kvm->slots_arch_lock); } + + if (new->flags & KVM_MEM_PRIVATE && change == KVM_MR_CREATE) + kvm_memfd_unregister(new); + return r; } @@ -1917,7 +1940,7 @@ int __kvm_set_memory_region(struct kvm *kvm, int as_id, id; int r; - r = check_memory_region_flags(mem); + r = check_memory_region_flags(kvm, mem); if (r) return r; @@ -1974,6 +1997,10 @@ int __kvm_set_memory_region(struct kvm *kvm, if ((kvm->nr_memslot_pages + npages) < kvm->nr_memslot_pages) return -EINVAL; } else { /* Modify an existing slot. */ + /* Private memslots are immutable, they can only be deleted. */ + if (mem->flags & KVM_MEM_PRIVATE) + return -EINVAL; + if ((mem->userspace_addr != old->userspace_addr) || (npages != old->npages) || ((mem->flags ^ old->flags) & KVM_MEM_READONLY)) @@ -2002,6 +2029,9 @@ int __kvm_set_memory_region(struct kvm *kvm, new->npages = npages; new->flags = mem->flags; new->userspace_addr = mem->userspace_addr; + new->fd = mem->fd; + new->file = NULL; + new->ofs = mem->ofs; r = kvm_set_memslot(kvm, old, new, change); if (r) -- 2.17.1