From: Yan Zhao
To: iommu@lists.linux.dev, kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: alex.williamson@redhat.com, jgg@nvidia.com, pbonzini@redhat.com,
    seanjc@google.com, joro@8bytes.org, will@kernel.org, robin.murphy@arm.com,
    kevin.tian@intel.com, baolu.lu@linux.intel.com, dwmw2@infradead.org,
    yi.l.liu@intel.com, Yan Zhao
Subject: [RFC PATCH 04/42] KVM: Skeleton of KVM TDP FD object
Date: Sat, 2 Dec 2023 17:16:15 +0800
Message-Id: <20231202091615.13643-1-yan.y.zhao@intel.com>
In-Reply-To: <20231202091211.13376-1-yan.y.zhao@intel.com>
References: <20231202091211.13376-1-yan.y.zhao@intel.com>

This is a skeleton implementation of KVM TDP FD object.
The KVM TDP FD object is created by ioctl KVM_CREATE_TDP_FD in
kvm_create_tdp_fd(). It consists of:

Public part (defined in ):
- A file object used for reference counting.
  The file reference count is 1 when the KVM TDP FD object is created.
  When the reference count of the file object drops to 0, its .release()
  handler destroys the KVM TDP FD object.
- ops kvm_exported_tdp_ops (empty implementation in this patch).

Private part (kvm_exported_tdp object defined in this patch):
  The kvm_exported_tdp object is linked in kvm->exported_tdp_list, one for
  each KVM address space. It records the address space id and the "kvm"
  pointer for the TDP FD object, and a KVM VM reference is held for the
  object's lifetime.
  In later patches, this kvm_exported_tdp object will be associated with a
  TDP page table exported by KVM.

Two symbols, kvm_tdp_fd_get() and kvm_tdp_fd_put(), are implemented and
exported so that external components can get/put the KVM TDP FD object.

Signed-off-by: Yan Zhao
---
 include/linux/kvm_host.h |  18 ++++
 virt/kvm/Kconfig         |   3 +
 virt/kvm/Makefile.kvm    |   1 +
 virt/kvm/kvm_main.c      |   5 +
 virt/kvm/tdp_fd.c        | 208 +++++++++++++++++++++++++++++++++++++++
 virt/kvm/tdp_fd.h        |   5 +
 6 files changed, 240 insertions(+)
 create mode 100644 virt/kvm/tdp_fd.c

diff --git a/include/linux/kvm_host.h b/include/linux/kvm_host.h
index 4944136efaa22..122f47c94ecae 100644
--- a/include/linux/kvm_host.h
+++ b/include/linux/kvm_host.h
@@ -44,6 +44,7 @@
 
 #include
 #include
+#include
 
 #ifndef KVM_MAX_VCPU_IDS
 #define KVM_MAX_VCPU_IDS KVM_MAX_VCPUS
@@ -808,6 +809,11 @@ struct kvm {
 	struct notifier_block pm_notifier;
 #endif
 	char stats_id[KVM_STATS_NAME_SIZE];
+
+#ifdef CONFIG_HAVE_KVM_EXPORTED_TDP
+	struct list_head exported_tdp_list;
+	spinlock_t exported_tdplist_lock;
+#endif
 };
 
 #define kvm_err(fmt, ...) \
@@ -2318,4 +2324,16 @@ static inline void kvm_account_pgtable_pages(void *virt, int nr)
 /* Max number of entries allowed for each kvm dirty ring */
 #define KVM_DIRTY_RING_MAX_ENTRIES 65536
 
+#ifdef CONFIG_HAVE_KVM_EXPORTED_TDP
+
+struct kvm_exported_tdp {
+	struct kvm_tdp_fd *tdp_fd;
+
+	struct kvm *kvm;
+	u32 as_id;
+	/* head at kvm->exported_tdp_list */
+	struct list_head list_node;
+};
+
+#endif /* CONFIG_HAVE_KVM_EXPORTED_TDP */
 #endif
diff --git a/virt/kvm/Kconfig b/virt/kvm/Kconfig
index 484d0873061ca..63b5d55c84e95 100644
--- a/virt/kvm/Kconfig
+++ b/virt/kvm/Kconfig
@@ -92,3 +92,6 @@ config HAVE_KVM_PM_NOTIFIER
 
 config KVM_GENERIC_HARDWARE_ENABLING
 	bool
+
+config HAVE_KVM_EXPORTED_TDP
+	bool
diff --git a/virt/kvm/Makefile.kvm b/virt/kvm/Makefile.kvm
index 2c27d5d0c367c..fad4638e407c5 100644
--- a/virt/kvm/Makefile.kvm
+++ b/virt/kvm/Makefile.kvm
@@ -12,3 +12,4 @@ kvm-$(CONFIG_KVM_ASYNC_PF) += $(KVM)/async_pf.o
 kvm-$(CONFIG_HAVE_KVM_IRQ_ROUTING) += $(KVM)/irqchip.o
 kvm-$(CONFIG_HAVE_KVM_DIRTY_RING) += $(KVM)/dirty_ring.o
 kvm-$(CONFIG_HAVE_KVM_PFNCACHE) += $(KVM)/pfncache.o
+kvm-$(CONFIG_HAVE_KVM_EXPORTED_TDP) += $(KVM)/tdp_fd.o
diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c
index 494b6301a6065..9fa9132055807 100644
--- a/virt/kvm/kvm_main.c
+++ b/virt/kvm/kvm_main.c
@@ -1232,6 +1232,11 @@ static struct kvm *kvm_create_vm(unsigned long type, const char *fdname)
 	INIT_HLIST_HEAD(&kvm->irq_ack_notifier_list);
 #endif
 
+#ifdef CONFIG_HAVE_KVM_EXPORTED_TDP
+	INIT_LIST_HEAD(&kvm->exported_tdp_list);
+	spin_lock_init(&kvm->exported_tdplist_lock);
+#endif
+
 	r = kvm_init_mmu_notifier(kvm);
 	if (r)
 		goto out_err_no_mmu_notifier;
diff --git a/virt/kvm/tdp_fd.c b/virt/kvm/tdp_fd.c
new file mode 100644
index 0000000000000..a5c4c3597e94f
--- /dev/null
+++ b/virt/kvm/tdp_fd.c
@@ -0,0 +1,208 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * KVM TDP FD
+ *
+ */
+#include
+#include
+#include
+
+#include "tdp_fd.h"
+
+static inline int is_tdp_fd_file(struct file *file);
+static const struct file_operations kvm_tdp_fd_fops;
+static const struct kvm_exported_tdp_ops exported_tdp_ops;
+
+int kvm_create_tdp_fd(struct kvm *kvm, struct kvm_create_tdp_fd *ct)
+{
+	struct kvm_exported_tdp *tdp;
+	struct kvm_tdp_fd *tdp_fd;
+	int as_id = ct->as_id;
+	int ret;
+
+	if (as_id >= KVM_ADDRESS_SPACE_NUM || ct->pad || ct->mode)
+		return -EINVAL;
+
+	/* for each address space, only one exported tdp is allowed */
+	spin_lock(&kvm->exported_tdplist_lock);
+	list_for_each_entry(tdp, &kvm->exported_tdp_list, list_node) {
+		if (tdp->as_id != as_id)
+			continue;
+
+		spin_unlock(&kvm->exported_tdplist_lock);
+		return -EEXIST;
+	}
+	spin_unlock(&kvm->exported_tdplist_lock);
+
+	tdp_fd = kzalloc(sizeof(*tdp_fd), GFP_KERNEL_ACCOUNT);
+	if (!tdp_fd)
+		return -ENOMEM;
+
+	tdp = kzalloc(sizeof(*tdp), GFP_KERNEL_ACCOUNT);
+	if (!tdp) {
+		kfree(tdp_fd);
+		return -ENOMEM;
+	}
+	tdp_fd->priv = tdp;
+	tdp->tdp_fd = tdp_fd;
+	tdp->as_id = as_id;
+
+	if (!kvm_get_kvm_safe(kvm)) {
+		ret = -ENODEV;
+		goto out;
+	}
+	tdp->kvm = kvm;
+
+	tdp_fd->file = anon_inode_getfile("tdp_fd", &kvm_tdp_fd_fops,
+					  tdp_fd, O_RDWR | O_CLOEXEC);
+	if (IS_ERR(tdp_fd->file)) {
+		ret = PTR_ERR(tdp_fd->file);
+		goto out;
+	}
+
+	ret = get_unused_fd_flags(O_RDWR | O_CLOEXEC);
+	if (ret < 0)
+		goto out;
+
+	fd_install(ret, tdp_fd->file);
+	ct->fd = ret;
+	tdp_fd->ops = &exported_tdp_ops;
+
+	spin_lock(&kvm->exported_tdplist_lock);
+	list_add(&tdp->list_node, &kvm->exported_tdp_list);
+	spin_unlock(&kvm->exported_tdplist_lock);
+	return 0;
+
+out:
+	if (!IS_ERR_OR_NULL(tdp_fd->file))
+		fput(tdp_fd->file);
+
+	if (tdp->kvm)
+		kvm_put_kvm_no_destroy(tdp->kvm);
+	kfree(tdp);
+	kfree(tdp_fd);
+	return ret;
+}
+
+static int kvm_tdp_fd_release(struct inode *inode, struct file *file)
+{
+	struct kvm_exported_tdp *tdp;
+	struct kvm_tdp_fd *tdp_fd;
+
+	if (!is_tdp_fd_file(file))
+		return -EINVAL;
+
+	tdp_fd = file->private_data;
+	tdp = tdp_fd->priv;
+
+	if (WARN_ON(!tdp || !tdp->kvm))
+		return -EFAULT;
+
+	spin_lock(&tdp->kvm->exported_tdplist_lock);
+	list_del(&tdp->list_node);
+	spin_unlock(&tdp->kvm->exported_tdplist_lock);
+
+	kvm_put_kvm(tdp->kvm);
+	kfree(tdp);
+	kfree(tdp_fd);
+	return 0;
+}
+
+static long kvm_tdp_fd_ioctl(struct file *file, unsigned int cmd,
+			     unsigned long arg)
+{
+	/* Do not support ioctl currently. May add it in future */
+	return -ENODEV;
+}
+
+static int kvm_tdp_fd_mmap(struct file *filp, struct vm_area_struct *vma)
+{
+	return -ENODEV;
+}
+
+static const struct file_operations kvm_tdp_fd_fops = {
+	.unlocked_ioctl = kvm_tdp_fd_ioctl,
+	.compat_ioctl = compat_ptr_ioctl,
+	.release = kvm_tdp_fd_release,
+	.mmap = kvm_tdp_fd_mmap,
+};
+
+static inline int is_tdp_fd_file(struct file *file)
+{
+	return file->f_op == &kvm_tdp_fd_fops;
+}
+
+static int kvm_tdp_register_importer(struct kvm_tdp_fd *tdp_fd,
+				     struct kvm_tdp_importer_ops *ops, void *data)
+{
+	return -EOPNOTSUPP;
+}
+
+static void kvm_tdp_unregister_importer(struct kvm_tdp_fd *tdp_fd,
+					struct kvm_tdp_importer_ops *ops)
+{
+}
+
+static void *kvm_tdp_get_metadata(struct kvm_tdp_fd *tdp_fd)
+{
+	return ERR_PTR(-EOPNOTSUPP);
+}
+
+static int kvm_tdp_fault(struct kvm_tdp_fd *tdp_fd, struct mm_struct *mm,
+			 unsigned long gfn, struct kvm_tdp_fault_type type)
+{
+	return -EOPNOTSUPP;
+}
+
+static const struct kvm_exported_tdp_ops exported_tdp_ops = {
+	.register_importer = kvm_tdp_register_importer,
+	.unregister_importer = kvm_tdp_unregister_importer,
+	.get_metadata = kvm_tdp_get_metadata,
+	.fault = kvm_tdp_fault,
+};
+
+/**
+ * kvm_tdp_fd_get - Public interface to get KVM TDP FD object.
+ *
+ * @fd: fd of the KVM TDP FD object.
+ * @return: KVM TDP FD object if @fd corresponds to a valid KVM TDP FD file.
+ *          -EBADF if @fd does not correspond to a struct file.
+ *          -EINVAL if @fd does not correspond to a KVM TDP FD file.
+ *
+ * Callers of this interface will get a KVM TDP FD object with ref count
+ * increased.
+ */
+struct kvm_tdp_fd *kvm_tdp_fd_get(int fd)
+{
+	struct file *file;
+
+	file = fget(fd);
+	if (!file)
+		return ERR_PTR(-EBADF);
+
+	if (!is_tdp_fd_file(file)) {
+		fput(file);
+		return ERR_PTR(-EINVAL);
+	}
+	return file->private_data;
+}
+EXPORT_SYMBOL_GPL(kvm_tdp_fd_get);
+
+/**
+ * kvm_tdp_fd_put - Public interface to put ref count of a KVM TDP FD object.
+ *
+ * @tdp_fd: KVM TDP FD object.
+ *
+ * Put reference count of the KVM TDP FD object.
+ * After the last reference count of the TDP fd goes away,
+ * kvm_tdp_fd_release() will be called to decrease KVM VM ref count and destroy
+ * the KVM TDP FD object.
+ */
+void kvm_tdp_fd_put(struct kvm_tdp_fd *tdp_fd)
+{
+	if (WARN_ON(!tdp_fd || !tdp_fd->file || !is_tdp_fd_file(tdp_fd->file)))
+		return;
+
+	fput(tdp_fd->file);
+}
+EXPORT_SYMBOL_GPL(kvm_tdp_fd_put);
diff --git a/virt/kvm/tdp_fd.h b/virt/kvm/tdp_fd.h
index 05c8a6d767469..85da9d8cc1ce4 100644
--- a/virt/kvm/tdp_fd.h
+++ b/virt/kvm/tdp_fd.h
@@ -2,9 +2,14 @@
 #ifndef __TDP_FD_H
 #define __TDP_FD_H
 
+#ifdef CONFIG_HAVE_KVM_EXPORTED_TDP
+int kvm_create_tdp_fd(struct kvm *kvm, struct kvm_create_tdp_fd *ct);
+
+#else
 static inline int kvm_create_tdp_fd(struct kvm *kvm, struct kvm_create_tdp_fd *ct)
 {
 	return -EOPNOTSUPP;
 }
+#endif /* CONFIG_HAVE_KVM_EXPORTED_TDP */
 
 #endif /* __TDP_FD_H */
-- 
2.17.1
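
Note on the reference counting described in the commit message: the
lifecycle (file reference count of 1 at creation, one more per get, object
destroyed and VM reference dropped on the last put) can be modeled in plain
userspace C. This is only a sketch for illustration and is not part of the
patch. Every name below (model_vm, model_tdp_fd, model_create, model_get,
model_put) is hypothetical, and plain integers stand in for the kernel's
struct file and KVM VM reference counts.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct kvm: only the VM user count. */
struct model_vm {
	int users;
};

/* Hypothetical stand-in for the KVM TDP FD object. */
struct model_tdp_fd {
	int file_refs;		/* models the struct file reference count */
	struct model_vm *vm;	/* VM reference held for the object's lifetime */
};

/* Creation: file reference count starts at 1, one VM reference is taken. */
static struct model_tdp_fd *model_create(struct model_vm *vm)
{
	struct model_tdp_fd *t = calloc(1, sizeof(*t));

	if (!t)
		return NULL;
	t->file_refs = 1;
	t->vm = vm;
	vm->users++;
	return t;
}

/* Get: the caller receives the object with one more reference. */
static struct model_tdp_fd *model_get(struct model_tdp_fd *t)
{
	t->file_refs++;
	return t;
}

/*
 * Put: drop one reference. On the last put, drop the VM reference and free
 * the object, which models what the .release() handler does.
 * Returns 1 if the object was destroyed, 0 otherwise.
 */
static int model_put(struct model_tdp_fd *t)
{
	assert(t->file_refs > 0);
	if (--t->file_refs > 0)
		return 0;
	t->vm->users--;
	free(t);
	return 1;
}
```

In this model an importer pairs each model_get() with exactly one
model_put(), and the creator's own put (closing the fd, in the real patch)
is what finally lets the object and its VM reference go away.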