Date: Thu, 23 Dec 2021 17:35:37 +0000
From: Sean Christopherson
To: Chao Peng
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
	linux-fsdevel@vger.kernel.org, qemu-devel@nongnu.org, Paolo Bonzini,
	Jonathan Corbet, Vitaly Kuznetsov, Wanpeng Li, Jim Mattson,
	Joerg Roedel, Thomas Gleixner, Ingo Molnar, Borislav Petkov,
	x86@kernel.org, "H . Peter Anvin", Hugh Dickins, Jeff Layton,
	"J . Bruce Fields", Andrew Morton, Yu Zhang, "Kirill A . Shutemov",
	luto@kernel.org, john.ji@intel.com, susie.li@intel.com,
	jun.nakajima@intel.com, dave.hansen@intel.com, ak@linux.intel.com,
	david@redhat.com
Subject: Re: [PATCH v3 kvm/queue 04/16] KVM: Extend the memslot to support fd-based private memory
References: <20211223123011.41044-1-chao.p.peng@linux.intel.com>
	<20211223123011.41044-5-chao.p.peng@linux.intel.com>
In-Reply-To: <20211223123011.41044-5-chao.p.peng@linux.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Dec 23, 2021, Chao Peng wrote:
> Extend the memslot definition to provide fd-based private memory support
> by adding two new fields(fd/ofs).
> The memslot then can maintain memory
> for both shared and private pages in a single memslot. Shared pages are
> provided in the existing way by using userspace_addr(hva) field and
> get_user_pages() while private pages are provided through the new
> fields(fd/ofs). Since there is no 'hva' concept anymore for private
> memory we cannot call get_user_pages() to get a pfn, instead we rely on
> the newly introduced MEMFD_OPS callbacks to do the same job.
>
> This new extension is indicated by a new flag KVM_MEM_PRIVATE.
>
> Signed-off-by: Yu Zhang
> Signed-off-by: Chao Peng
> ---
>  include/linux/kvm_host.h | 10 ++++++++++
>  include/uapi/linux/kvm.h | 12 ++++++++++++
>  2 files changed, 22 insertions(+)
>
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index f8ed799e8674..2cd35560c44b 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -460,8 +460,18 @@ struct kvm_memory_slot {
>  	u32 flags;
>  	short id;
>  	u16 as_id;
> +	u32 fd;

There should be no need to store the fd in the memslot; the fd should be
unneeded outside of __kvm_set_memory_region(), e.g.

diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 1caebded52c4..4e43262887a3 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -2029,10 +2029,10 @@ int __kvm_set_memory_region(struct kvm *kvm,
 	new->npages = npages;
 	new->flags = mem->flags;
 	new->userspace_addr = mem->userspace_addr;
-	new->fd = mem->fd;
-	new->file = NULL;
-	new->ofs = mem->ofs;
-
+	if (mem->flags & KVM_MEM_PRIVATE) {
+		new->private_file = fget(mem->private_fd);
+		new->private_offset = mem->private_offset;
+	}
 	r = kvm_set_memslot(kvm, old, new, change);
 	if (r)
 		kfree(new);

> +	struct file *file;

Please use more descriptive names, shaving characters is not at all a priority.

> +	u64 ofs;

I believe this should be loff_t.

	struct file *private_file;
	loff_t private_offset;

>  };
>
> +static inline bool kvm_slot_is_private(const struct kvm_memory_slot *slot)
> +{
> +	if (slot && (slot->flags & KVM_MEM_PRIVATE))
> +		return true;
> +	return false;

	return slot && (slot->flags & KVM_MEM_PRIVATE);

> +}
> +
>  static inline bool kvm_slot_dirty_track_enabled(const struct kvm_memory_slot *slot)
>  {
>  	return slot->flags & KVM_MEM_LOG_DIRTY_PAGES;
> diff --git a/include/uapi/linux/kvm.h b/include/uapi/linux/kvm.h
> index 1daa45268de2..41434322fa23 100644
> --- a/include/uapi/linux/kvm.h
> +++ b/include/uapi/linux/kvm.h
> @@ -103,6 +103,17 @@ struct kvm_userspace_memory_region {
>  	__u64 userspace_addr; /* start of the userspace allocated memory */
>  };
>
> +struct kvm_userspace_memory_region_ext {
> +	__u32 slot;
> +	__u32 flags;
> +	__u64 guest_phys_addr;
> +	__u64 memory_size; /* bytes */
> +	__u64 userspace_addr; /* hva */

Would it make sense to embed "struct kvm_userspace_memory_region"?

> +	__u64 ofs; /* offset into fd */
> +	__u32 fd;

Again, use descriptive names, then comments like "offset into fd" are unnecessary.

	__u64 private_offset;
	__u32 private_fd;

> +	__u32 padding[5];
> +};
> +
>  /*
>   * The bit 0 ~ bit 15 of kvm_memory_region::flags are visible for userspace,
>   * other bits are reserved for kvm internal use which are defined in
> @@ -110,6 +121,7 @@ struct kvm_userspace_memory_region {
>   */
>  #define KVM_MEM_LOG_DIRTY_PAGES	(1UL << 0)
>  #define KVM_MEM_READONLY	(1UL << 1)
> +#define KVM_MEM_PRIVATE	(1UL << 2)
>
>  /* for KVM_IRQ_LINE */
>  struct kvm_irq_level {
> --
> 2.17.1
>
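
To illustrate the embedding question raised above, here is a minimal
userspace sketch, not the actual uapi definition. The names region,
region_ext, MEM_PRIVATE and region_ext_is_private() are hypothetical
stand-ins, and the fixed-width typedefs only approximate __u32/__u64. It
suggests that embedding the base struct can keep the same 64-byte layout as
the open-coded version while letting a flag check use the single-expression
form.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: userspace stand-ins for the kernel's __u32/__u64. */
typedef uint32_t u32_t;
typedef uint64_t u64_t;

/* Hypothetical stand-in for KVM_MEM_PRIVATE from the quoted patch. */
#define MEM_PRIVATE (1UL << 2)

/* Mirrors the fields of struct kvm_userspace_memory_region. */
struct region {
	u32_t slot;
	u32_t flags;
	u64_t guest_phys_addr;
	u64_t memory_size;	/* bytes */
	u64_t userspace_addr;	/* start of the userspace allocated memory */
};

/* Hypothetical extended layout: the base struct is embedded first. */
struct region_ext {
	struct region region;
	u64_t private_offset;
	u32_t private_fd;
	u32_t padding[5];
};

/* Compile-time checks that embedding does not change the layout. */
_Static_assert(sizeof(struct region) == 32, "base struct is 32 bytes");
_Static_assert(sizeof(struct region_ext) == 64, "extended struct is 64 bytes");
_Static_assert(offsetof(struct region_ext, private_offset) ==
	       sizeof(struct region), "new fields follow the base struct");

/* Single-expression form of the check, tolerant of a NULL pointer. */
static inline bool region_ext_is_private(const struct region_ext *ext)
{
	return ext && (ext->region.flags & MEM_PRIVATE);
}
```

With this shape, code that only understands the base struct could be handed
&ext->region unchanged; whether that is desirable for the real ioctl is
exactly the open question.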