From: Yan Zhao
To: iommu@lists.linux.dev, kvm@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: alex.williamson@redhat.com, jgg@nvidia.com, pbonzini@redhat.com,
    seanjc@google.com, joro@8bytes.org, will@kernel.org,
    robin.murphy@arm.com, kevin.tian@intel.com, baolu.lu@linux.intel.com,
    dwmw2@infradead.org, yi.l.liu@intel.com, Yan Zhao
Subject: [RFC PATCH 14/42] iommufd: Enable KVM HW page table object to be proxy between KVM and IOMMU
Date: Sat, 2 Dec 2023 17:22:16 +0800
Message-Id: <20231202092216.14278-1-yan.y.zhao@intel.com>
In-Reply-To: <20231202091211.13376-1-yan.y.zhao@intel.com>
References: <20231202091211.13376-1-yan.y.zhao@intel.com>

Enable the IOMMUFD KVM HW page table object to serve as a proxy between
KVM and the IOMMU driver. Config IOMMUFD_KVM_HWPT is added to turn this
ability on or off.

The KVM HW page table object first gets the KVM TDP fd object via the
KVM exported interface kvm_tdp_fd_get() and then queries KVM for the
vendor meta data of the page tables exported (shared) by KVM. It then
passes the meta data to the IOMMU driver to create an IOMMU_DOMAIN_KVM
domain via op domain_alloc_kvm. The IOMMU driver is responsible for
checking compatibility between the IOMMU hardware and the KVM exported
page tables.

After successfully creating the IOMMU_DOMAIN_KVM domain, the IOMMUFD KVM
HW page table object registers an invalidation callback with KVM to
receive invalidation notifications. It then passes each notification to
the IOMMU driver via op cache_invalidate_kvm to invalidate hardware TLBs.

Signed-off-by: Yan Zhao
---
 drivers/iommu/iommufd/Kconfig            |  10 ++
 drivers/iommu/iommufd/Makefile           |   1 +
 drivers/iommu/iommufd/hw_pagetable_kvm.c | 183 +++++++++++++++++++++++
 drivers/iommu/iommufd/iommufd_private.h  |   9 ++
 4 files changed, 203 insertions(+)
 create mode 100644 drivers/iommu/iommufd/hw_pagetable_kvm.c

diff --git a/drivers/iommu/iommufd/Kconfig b/drivers/iommu/iommufd/Kconfig
index 99d4b075df49e..d79e0c1e00a4d 100644
--- a/drivers/iommu/iommufd/Kconfig
+++ b/drivers/iommu/iommufd/Kconfig
@@ -32,6 +32,16 @@ config IOMMUFD_VFIO_CONTAINER
 
 	  Unless testing IOMMUFD, say N here.
 
+config IOMMUFD_KVM_HWPT
+	bool "Supports KVM managed HW page tables"
+	default n
+	help
+	  Selecting this option will allow IOMMUFD to create IOMMU stage 2
+	  page tables whose paging structure and mappings are managed by
+	  KVM MMU. IOMMUFD serves as proxy between KVM and IOMMU driver to
+	  allow IOMMU driver to get paging structure meta data and cache
+	  invalidate notifications from KVM.
+
 config IOMMUFD_TEST
 	bool "IOMMU Userspace API Test support"
 	depends on DEBUG_KERNEL
diff --git a/drivers/iommu/iommufd/Makefile b/drivers/iommu/iommufd/Makefile
index 34b446146961c..ae1e0b5c300dc 100644
--- a/drivers/iommu/iommufd/Makefile
+++ b/drivers/iommu/iommufd/Makefile
@@ -8,6 +8,7 @@ iommufd-y := \
 	pages.o \
 	vfio_compat.o
 
+iommufd-$(CONFIG_IOMMUFD_KVM_HWPT) += hw_pagetable_kvm.o
 iommufd-$(CONFIG_IOMMUFD_TEST) += selftest.o
 
 obj-$(CONFIG_IOMMUFD) += iommufd.o
diff --git a/drivers/iommu/iommufd/hw_pagetable_kvm.c b/drivers/iommu/iommufd/hw_pagetable_kvm.c
new file mode 100644
index 0000000000000..e0e205f384ed5
--- /dev/null
+++ b/drivers/iommu/iommufd/hw_pagetable_kvm.c
@@ -0,0 +1,183 @@
+// SPDX-License-Identifier: GPL-2.0-only
+#include
+#include
+#include
+
+#include "../iommu-priv.h"
+#include "iommufd_private.h"
+
+static void iommufd_kvmtdp_invalidate(void *data,
+				      unsigned long start, unsigned long size)
+{
+	void (*invalidate_fn)(struct iommu_domain *domain,
+			      unsigned long iova, unsigned long size);
+	struct iommufd_hw_pagetable *hwpt = data;
+
+	if (!hwpt || !hwpt_is_kvm(hwpt))
+		return;
+
+	invalidate_fn = hwpt->domain->ops->cache_invalidate_kvm;
+
+	if (!invalidate_fn)
+		return;
+
+	invalidate_fn(hwpt->domain, start, size);
+
+}
+
+struct kvm_tdp_importer_ops iommufd_import_ops = {
+	.invalidate = iommufd_kvmtdp_invalidate,
+};
+
+static inline int kvmtdp_register(struct kvm_tdp_fd *tdp_fd, void *data)
+{
+	if (!tdp_fd->ops->register_importer || !tdp_fd->ops->unregister_importer)
+		return -EOPNOTSUPP;
+
+	return tdp_fd->ops->register_importer(tdp_fd, &iommufd_import_ops, data);
+}
+
+static inline void kvmtdp_unregister(struct kvm_tdp_fd *tdp_fd)
+{
+	WARN_ON(!tdp_fd->ops->unregister_importer);
+
+	tdp_fd->ops->unregister_importer(tdp_fd, &iommufd_import_ops);
+}
+
+static inline void *kvmtdp_get_metadata(struct kvm_tdp_fd *tdp_fd)
+{
+	if (!tdp_fd->ops->get_metadata)
+		return ERR_PTR(-EOPNOTSUPP);
+
+	return tdp_fd->ops->get_metadata(tdp_fd);
+}
+
+/*
+ * Get KVM TDP FD object and ensure tdp_fd->ops is available
+ */
+static inline struct kvm_tdp_fd *kvmtdp_get(int fd)
+{
+	struct kvm_tdp_fd *tdp_fd = NULL;
+	struct kvm_tdp_fd *(*get_func)(int fd) = NULL;
+	void (*put_func)(struct kvm_tdp_fd *) = NULL;
+
+	get_func = symbol_get(kvm_tdp_fd_get);
+
+	if (!get_func)
+		goto out;
+
+	put_func = symbol_get(kvm_tdp_fd_put);
+	if (!put_func)
+		goto out;
+
+	tdp_fd = get_func(fd);
+	if (!tdp_fd)
+		goto out;
+
+	if (tdp_fd->ops) {
+		/* success */
+		goto out;
+	}
+
+	put_func(tdp_fd);
+	tdp_fd = NULL;
+
+out:
+	if (get_func)
+		symbol_put(kvm_tdp_fd_get);
+
+	if (put_func)
+		symbol_put(kvm_tdp_fd_put);
+
+	return tdp_fd;
+}
+
+static void kvmtdp_put(struct kvm_tdp_fd *tdp_fd)
+{
+	void (*put_func)(struct kvm_tdp_fd *) = NULL;
+
+	put_func = symbol_get(kvm_tdp_fd_put);
+	WARN_ON(!put_func);
+
+	put_func(tdp_fd);
+
+	symbol_put(kvm_tdp_fd_put);
+}
+
+void iommufd_hwpt_kvm_destroy(struct iommufd_object *obj)
+{
+	struct kvm_tdp_fd *tdp_fd;
+	struct iommufd_hwpt_kvm *hwpt_kvm =
+		container_of(obj, struct iommufd_hwpt_kvm, common.obj);
+
+	if (hwpt_kvm->common.domain)
+		iommu_domain_free(hwpt_kvm->common.domain);
+
+	tdp_fd = hwpt_kvm->context;
+	kvmtdp_unregister(tdp_fd);
+	kvmtdp_put(tdp_fd);
+}
+
+void iommufd_hwpt_kvm_abort(struct iommufd_object *obj)
+{
+	iommufd_hwpt_kvm_destroy(obj);
+}
+
+struct iommufd_hwpt_kvm *
+iommufd_hwpt_kvm_alloc(struct iommufd_ctx *ictx,
+		       struct iommufd_device *idev, u32 flags,
+		       const struct iommu_hwpt_kvm_info *kvm_data)
+{
+
+	const struct iommu_ops *ops = dev_iommu_ops(idev->dev);
+	struct iommufd_hwpt_kvm *hwpt_kvm;
+	struct iommufd_hw_pagetable *hwpt;
+	struct kvm_tdp_fd *tdp_fd;
+	void *meta_data;
+	int rc;
+
+	if (!ops->domain_alloc_kvm)
+		return ERR_PTR(-EOPNOTSUPP);
+
+	if (kvm_data->fd < 0)
+		return ERR_PTR(-EINVAL);
+
+	tdp_fd = kvmtdp_get(kvm_data->fd);
+	if (!tdp_fd)
+		return ERR_PTR(-EOPNOTSUPP);
+
+	meta_data = kvmtdp_get_metadata(tdp_fd);
+	if (!meta_data || IS_ERR(meta_data)) {
+		rc = -EFAULT;
+		goto out_put_tdp;
+	}
+
+	hwpt_kvm = __iommufd_object_alloc(ictx, hwpt_kvm, IOMMUFD_OBJ_HWPT_KVM,
+					  common.obj);
+	if (IS_ERR(hwpt_kvm)) {
+		rc = PTR_ERR(hwpt_kvm);
+		goto out_put_tdp;
+	}
+
+	hwpt_kvm->context = tdp_fd;
+	hwpt = &hwpt_kvm->common;
+
+	hwpt->domain = ops->domain_alloc_kvm(idev->dev, flags, meta_data);
+	if (IS_ERR(hwpt->domain)) {
+		rc = PTR_ERR(hwpt->domain);
+		hwpt->domain = NULL;
+		goto out_abort;
+	}
+
+	rc = kvmtdp_register(tdp_fd, hwpt);
+	if (rc)
+		goto out_abort;
+
+	return hwpt_kvm;
+
+out_abort:
+	iommufd_object_abort_and_destroy(ictx, &hwpt->obj);
+out_put_tdp:
+	kvmtdp_put(tdp_fd);
+	return ERR_PTR(rc);
+}
diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h
index a46a6e3e537f9..2c3149b1d5b55 100644
--- a/drivers/iommu/iommufd/iommufd_private.h
+++ b/drivers/iommu/iommufd/iommufd_private.h
@@ -432,6 +432,14 @@ static inline bool iommufd_selftest_is_mock_dev(struct device *dev)
 #endif
 
 struct iommu_hwpt_kvm_info;
+#ifdef CONFIG_IOMMUFD_KVM_HWPT
+struct iommufd_hwpt_kvm *
+iommufd_hwpt_kvm_alloc(struct iommufd_ctx *ictx,
+		       struct iommufd_device *idev, u32 flags,
+		       const struct iommu_hwpt_kvm_info *kvm_data);
+void iommufd_hwpt_kvm_abort(struct iommufd_object *obj);
+void iommufd_hwpt_kvm_destroy(struct iommufd_object *obj);
+#else
 static inline struct iommufd_hwpt_kvm *
 iommufd_hwpt_kvm_alloc(struct iommufd_ctx *ictx,
 		       struct iommufd_device *idev, u32 flags,
@@ -447,5 +455,6 @@ static inline void iommufd_hwpt_kvm_abort(struct iommufd_object *obj)
 static inline void iommufd_hwpt_kvm_destroy(struct iommufd_object *obj)
 {
 }
+#endif
 
 #endif
-- 
2.17.1
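
For readers who want to follow the proxy flow without a kernel tree, here
is a minimal user-space sketch of the sequence this patch describes: query
metadata, let the driver check compatibility, register an invalidation
callback, then forward KVM invalidations to the domain. It is not kernel
code and not the implementation in this patch. Every mock_* type and
function is a hypothetical stand-in, the integer "metadata" is an
assumption replacing the vendor-specific structure, and the
single-importer limit is a simplification of the mock only.

```c
/* User-space mock of the proxy flow; NOT kernel code.
 * Every mock_* name is hypothetical and exists only for illustration. */
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Stand-in for an IOMMU_DOMAIN_KVM domain; records the last flush. */
struct mock_domain {
	unsigned long last_start;
	unsigned long last_size;
	int flushes;
};

/* Stand-in for struct kvm_tdp_importer_ops. */
struct mock_importer_ops {
	void (*invalidate)(void *data, unsigned long start, unsigned long size);
};

/* Stand-in for struct kvm_tdp_fd: metadata plus at most one importer. */
struct mock_tdp_fd {
	int metadata;
	const struct mock_importer_ops *importer;
	void *importer_data;
};

/* Stand-in for the KVM HW page table object. */
struct mock_hwpt {
	struct mock_domain domain;
	struct mock_tdp_fd *tdp_fd;
};

/* Proxy callback: forward a KVM invalidation to the "IOMMU driver". */
static void mock_hwpt_invalidate(void *data, unsigned long start,
				 unsigned long size)
{
	struct mock_hwpt *hwpt = data;

	if (!hwpt)
		return;
	/* Plays the role of op cache_invalidate_kvm. */
	hwpt->domain.last_start = start;
	hwpt->domain.last_size = size;
	hwpt->domain.flushes++;
}

static const struct mock_importer_ops mock_import_ops = {
	.invalidate = mock_hwpt_invalidate,
};

/* Query metadata, let the "driver" check compatibility, then register. */
static int mock_hwpt_alloc(struct mock_hwpt *hwpt, struct mock_tdp_fd *tdp_fd,
			   int supported_metadata)
{
	if (tdp_fd->metadata != supported_metadata)
		return -EOPNOTSUPP;	/* driver rejects incompatible tables */
	if (tdp_fd->importer)
		return -EBUSY;		/* the mock allows a single importer */

	hwpt->domain = (struct mock_domain){ 0 };
	hwpt->tdp_fd = tdp_fd;
	tdp_fd->importer = &mock_import_ops;
	tdp_fd->importer_data = hwpt;
	return 0;
}

/* Unregister so later KVM invalidations no longer reach this object. */
static void mock_hwpt_destroy(struct mock_hwpt *hwpt)
{
	hwpt->tdp_fd->importer = NULL;
	hwpt->tdp_fd->importer_data = NULL;
	hwpt->tdp_fd = NULL;
}

/* "KVM" side: notify the registered importer about an invalidated range. */
static void mock_kvm_invalidate(struct mock_tdp_fd *tdp_fd,
				unsigned long start, unsigned long size)
{
	if (tdp_fd->importer && tdp_fd->importer->invalidate)
		tdp_fd->importer->invalidate(tdp_fd->importer_data, start, size);
}
```

To try it, call mock_hwpt_alloc() with a matching metadata value, then
mock_kvm_invalidate(), and check that the domain's flush counter
increments. After mock_hwpt_destroy(), further invalidations leave the
counter unchanged.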