Date: Tue, 29 Mar 2022 19:01:52 +0000
From: Sean Christopherson
To: Chao Peng
Cc: kvm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-mm@kvack.org,
    linux-fsdevel@vger.kernel.org, linux-api@vger.kernel.org,
    qemu-devel@nongnu.org, Paolo Bonzini, Jonathan Corbet, Vitaly Kuznetsov,
    Wanpeng Li, Jim Mattson, Joerg Roedel, Thomas Gleixner, Ingo Molnar,
    Borislav Petkov, x86@kernel.org, "H. Peter Anvin", Hugh Dickins,
    Jeff Layton, "J. Bruce Fields", Andrew Morton, Mike Rapoport,
    Steven Price, "Maciej S. Szmigiero", Vlastimil Babka, Vishal Annapurve,
    Yu Zhang, "Kirill A. Shutemov", luto@kernel.org, jun.nakajima@intel.com,
    dave.hansen@intel.com, ak@linux.intel.com, david@redhat.com
Subject: Re: [PATCH v5 10/13] KVM: Register private memslot to memory backing store
References: <20220310140911.50924-1-chao.p.peng@linux.intel.com>
    <20220310140911.50924-11-chao.p.peng@linux.intel.com>
In-Reply-To: <20220310140911.50924-11-chao.p.peng@linux.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Mar 10, 2022, Chao Peng wrote:
> Add 'notifier' to memslot to make it a memfile_notifier node and then
> register it to memory backing store via memfile_register_notifier() when
> memslot gets created. When memslot is deleted, do the reverse with
> memfile_unregister_notifier(). Note each KVM memslot can be registered
> to different memory backing stores (or the same backing store but at
> different offset) independently.
>
> Signed-off-by: Yu Zhang
> Signed-off-by: Chao Peng
> ---
>  include/linux/kvm_host.h |  1 +
>  virt/kvm/kvm_main.c      | 75 ++++++++++++++++++++++++++++++++++++----
>  2 files changed, 70 insertions(+), 6 deletions(-)
>
> diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
> index 6e1d770d6bf8..9b175aeca63f 100644
> --- a/include/linux/kvm_host.h
> +++ b/include/linux/kvm_host.h
> @@ -567,6 +567,7 @@ struct kvm_memory_slot {
>  	struct file *private_file;
>  	loff_t private_offset;
>  	struct memfile_pfn_ops *pfn_ops;
> +	struct memfile_notifier notifier;
>  };
>
>  static inline bool kvm_slot_is_private(const struct kvm_memory_slot *slot)
> diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
> index d11a2628b548..67349421eae3 100644
> --- a/virt/kvm/kvm_main.c
> +++ b/virt/kvm/kvm_main.c
> @@ -840,6 +840,37 @@ static int kvm_init_mmu_notifier(struct kvm *kvm)
>
>  #endif /* CONFIG_MMU_NOTIFIER && KVM_ARCH_WANT_MMU_NOTIFIER */
>
> +#ifdef CONFIG_MEMFILE_NOTIFIER
> +static inline int kvm_memfile_register(struct kvm_memory_slot *slot)

This is a good opportunity to hide away the memfile details a bit.  Maybe
kvm_private_mem_{,un}register()?

> +{
> +	return memfile_register_notifier(file_inode(slot->private_file),
> +					 &slot->notifier,
> +					 &slot->pfn_ops);
> +}
> +
> +static inline void kvm_memfile_unregister(struct kvm_memory_slot *slot)
> +{
> +	if (slot->private_file) {
> +		memfile_unregister_notifier(file_inode(slot->private_file),
> +					    &slot->notifier);
> +		fput(slot->private_file);

This should not do fput(), it makes the helper imbalanced with respect to the
register path and will likely lead to double fput().  Indeed, if preparing the
region fails, __kvm_set_memory_region() will double up on fput() due to checking
its local "file" for null, not slot->private_file for null.

> +		slot->private_file = NULL;
> +	}
> +}
> +
> +#else /* !CONFIG_MEMFILE_NOTIFIER */
> +
> +static inline int kvm_memfile_register(struct kvm_memory_slot *slot)
> +{

This should WARN_ON_ONCE().
Ditto for unregister.

> +	return -EOPNOTSUPP;
> +}
> +
> +static inline void kvm_memfile_unregister(struct kvm_memory_slot *slot)
> +{
> +}
> +
> +#endif /* CONFIG_MEMFILE_NOTIFIER */
> +
>  #ifdef CONFIG_HAVE_KVM_PM_NOTIFIER
>  static int kvm_pm_notifier_call(struct notifier_block *bl,
>  				unsigned long state,
> @@ -884,6 +915,9 @@ static void kvm_destroy_dirty_bitmap(struct kvm_memory_slot *memslot)
>  /* This does not remove the slot from struct kvm_memslots data structures */
>  static void kvm_free_memslot(struct kvm *kvm, struct kvm_memory_slot *slot)
>  {
> +	if (slot->flags & KVM_MEM_PRIVATE)
> +		kvm_memfile_unregister(slot);

With fput() moved out of unregister, this needs to be:

	if (slot->flags & KVM_MEM_PRIVATE) {
		kvm_private_mem_unregister(slot);
		fput(slot->private_file);
	}

> +
>  	kvm_destroy_dirty_bitmap(slot);
>
>  	kvm_arch_free_memslot(kvm, slot);
> @@ -1738,6 +1772,12 @@ static int kvm_set_memslot(struct kvm *kvm,
>  		kvm_invalidate_memslot(kvm, old, invalid_slot);
>  	}
>
> +	if (new->flags & KVM_MEM_PRIVATE && change == KVM_MR_CREATE) {
> +		r = kvm_memfile_register(new);
> +		if (r)
> +			return r;
> +	}

This belongs in kvm_prepare_memory_region().  The shenanigans for DELETE and
MOVE are special.

> +
>  	r = kvm_prepare_memory_region(kvm, old, new, change);
>  	if (r) {
>  		/*
> @@ -1752,6 +1792,10 @@ static int kvm_set_memslot(struct kvm *kvm,
>  		} else {
>  			mutex_unlock(&kvm->slots_arch_lock);
>  		}
> +
> +		if (new->flags & KVM_MEM_PRIVATE && change == KVM_MR_CREATE)
> +			kvm_memfile_unregister(new);
> +
>  		return r;
>  	}
>
> @@ -1817,6 +1861,7 @@ int __kvm_set_memory_region(struct kvm *kvm,
>  	enum kvm_mr_change change;
>  	unsigned long npages;
>  	gfn_t base_gfn;
> +	struct file *file = NULL;

Nit, naming this private_file would help understand its use.  Though I think
it's easier to not have a local variable.  More below.
>  	int as_id, id;
>  	int r;
>
> @@ -1890,14 +1935,24 @@ int __kvm_set_memory_region(struct kvm *kvm,
>  		return 0;
>  	}
>
> +	if (mem->flags & KVM_MEM_PRIVATE) {
> +		file = fdget(region_ext->private_fd).file;

This can use fget() instead of fdget().

> +		if (!file)
> +			return -EINVAL;
> +	}
> +
>  	if ((change == KVM_MR_CREATE || change == KVM_MR_MOVE) &&
> -	    kvm_check_memslot_overlap(slots, id, base_gfn, base_gfn + npages))
> -		return -EEXIST;
> +	    kvm_check_memslot_overlap(slots, id, base_gfn, base_gfn + npages)) {
> +		r = -EEXIST;
> +		goto out;
> +	}
>
>  	/* Allocate a slot that will persist in the memslot. */
>  	new = kzalloc(sizeof(*new), GFP_KERNEL_ACCOUNT);
> -	if (!new)
> -		return -ENOMEM;
> +	if (!new) {
> +		r = -ENOMEM;
> +		goto out;
> +	}
>
>  	new->as_id = as_id;
>  	new->id = id;
> @@ -1905,10 +1960,18 @@ int __kvm_set_memory_region(struct kvm *kvm,
>  	new->npages = npages;
>  	new->flags = mem->flags;
>  	new->userspace_addr = mem->userspace_addr;
> +	new->private_file = file;
> +	new->private_offset = mem->flags & KVM_MEM_PRIVATE ?
> +			      region_ext->private_offset : 0;

"new" is zero-allocated, so all the private stuff, including the fget(), can be
wrapped in a single KVM_MEM_PRIVATE check.  Moving fget() reduces the number of
gotos needed (the above -EEXIST and -ENOMEM paths don't need to be modified).

>  	r = kvm_set_memslot(kvm, old, new, change);
> -	if (r)
> -		kfree(new);
> +	if (!r)
> +		return r;

Use goto, e.g.

	if (r)
		goto out;

	return 0;

Burying the happy path in a taken if-statement is confusing and error prone,
mostly because it breaks well-established kernel patterns.

Note, there's no need for a separate out_free since new->private_file will be
NULL in either case.  I don't have a strong preference, I just find it easier to
read code that's more explicit, but I'm a-ok collapsing them into a single label.

	if ((change == KVM_MR_CREATE || change == KVM_MR_MOVE) &&
	    kvm_check_memslot_overlap(slots, id, base_gfn, base_gfn + npages))
		return -EEXIST;

	/* Allocate a slot that will persist in the memslot. */
	new = kzalloc(sizeof(*new), GFP_KERNEL_ACCOUNT);
	if (!new)
		return -ENOMEM;

	new->as_id = as_id;
	new->id = id;
	new->base_gfn = base_gfn;
	new->npages = npages;
	new->flags = mem->flags;
	new->userspace_addr = mem->userspace_addr;
	if (mem->flags & KVM_MEM_PRIVATE) {
		new->private_file = fget(mem->private_fd);
		if (!new->private_file) {
			r = -EINVAL;
			goto out_free;
		}
		new->private_offset = mem->private_offset;
	}

	r = kvm_set_memslot(kvm, old, new, change);
	if (r)
		goto out;

	return 0;

out:
	if (new->private_file)
		fput(new->private_file);
out_free:
	kfree(new);
	return r;
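
The reference counting in the flow above can be checked outside the kernel with
a small userspace model.  This is only a sketch under stated assumptions:
toy_file, toy_slot, toy_fget(), toy_fput(), toy_set_region() and toy_free_slot()
are hypothetical stand-ins invented for illustration, not kernel APIs, and the
set_memslot_ret parameter simply fakes the return value of kvm_set_memslot().
The point it illustrates is that the reference taken when the slot is set up is
dropped exactly once, either on the error path or when the slot is freed.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct file: only a reference count. */
struct toy_file {
	int refcount;
};

/* Hypothetical stand-in for struct kvm_memory_slot. */
struct toy_slot {
	struct toy_file *private_file;
};

/* Take a reference, loosely mirroring fget(); NULL models a bad fd. */
static struct toy_file *toy_fget(struct toy_file *f)
{
	if (f)
		f->refcount++;
	return f;
}

/* Drop a reference, loosely mirroring fput(). */
static void toy_fput(struct toy_file *f)
{
	assert(f->refcount > 0);	/* trips if a reference is put twice */
	f->refcount--;
}

/*
 * Models the tail of the suggested __kvm_set_memory_region() flow.
 * 'set_memslot_ret' fakes the result of kvm_set_memslot().  On success
 * the slot, which now owns one file reference, is returned via *out.
 */
static int toy_set_region(struct toy_file *backing, int is_private,
			  int set_memslot_ret, struct toy_slot **out)
{
	struct toy_slot *new;
	int r;

	new = calloc(1, sizeof(*new));	/* zeroed, so private_file is NULL */
	if (!new)
		return -ENOMEM;

	if (is_private) {
		new->private_file = toy_fget(backing);
		if (!new->private_file) {
			r = -EINVAL;
			goto out_free;
		}
	}

	r = set_memslot_ret;
	if (r)
		goto out;

	*out = new;
	return 0;

out:
	if (new->private_file)
		toy_fput(new->private_file);
out_free:
	free(new);
	return r;
}

/* Models the free path: the caller drops the reference, not unregister. */
static void toy_free_slot(struct toy_slot *slot)
{
	if (slot->private_file)
		toy_fput(slot->private_file);
	free(slot);
}
```

Calling toy_set_region() with a failing set_memslot_ret leaves the backing
file's count unchanged, and a successful call followed by toy_free_slot()
returns it to its starting value.  In this model the put lives next to the
code that did the get, which is the balance the review asks for.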