Received: by 2002:a89:413:0:b0:1fd:dba5:e537 with SMTP id m19csp1736978lqs; Sun, 16 Jun 2024 02:22:54 -0700 (PDT) X-Forwarded-Encrypted: i=3; AJvYcCV5AClxkkY6FndupxT00RcqvNtVwoPSvfiUnuX8X3TB/DDH7H8jwNARbmX/SwavtmF36sGgwAiu9+Ovbogwy2zeVlsoK843qC5x7yFy6g== X-Google-Smtp-Source: AGHT+IEkRMz+F7JjfosKHRPkk5KZfeimSj/oqWCf/fpFkoSrgWX+SmqcF+NB4uONrT2CHCITzSPB X-Received: by 2002:a05:6808:200a:b0:3d2:2e87:c110 with SMTP id 5614622812f47-3d24e8b5cccmr7949460b6e.8.1718529774032; Sun, 16 Jun 2024 02:22:54 -0700 (PDT) ARC-Seal: i=2; a=rsa-sha256; t=1718529773; cv=pass; d=google.com; s=arc-20160816; b=BKlMZKN8l9EmguZ76u61obDe6JJXgl0+6JP/wSPlXCcGdy0/xkDM+LfOL8be0v7lJD v1jSMyCEgfe+txoVyzIgW8/G126akKbvyHNiBXumyoyi04GULwKVQ7NgAkBgL46dWGQ6 XbNJDn/Dka/bhDnsEKhAhcxMC5YP5Tyjj45YGUnbQOcNxT2BHH72DKm7CFwjntuQKOV0 0T0jyzkXCTTaOP9XAXUY9PLyeP/EfljYe2zwm6dsLeAoXxV5etFzsiroocXe+0dNCdZi J3lO9gOgw+jn0KrnjJngfWb8jwaA6fAuPrzZxl+tzwywLTUF9+HdiS00/6XZbv40xh1Q MMUQ== ARC-Message-Signature: i=2; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:mime-version:list-unsubscribe :list-subscribe:list-id:precedence:references:in-reply-to:message-id :date:subject:cc:to:from:dkim-signature; bh=qH7lzJ7d39rSPHATpn+EwffKysEVsaDWNVBs3PE/KLY=; fh=tXslIEC/iII+0o9YvlUtRXapl6zSPusBr1yEidN1Q7w=; b=pXPyOPuSjjZRtxJ6LeSO81GJf7XpZOZWL+cPYXgrqvWUll6OIXdLQkAnd6YSSUDi3z pujAyENxmV1TkSSvEhNkMXQX2cZOmlriFLQR4DquOk2KRRaQjIcBgDBvhi0KYLepc2fd yva9psU/YOU+kibN0p5/rDzo71Gd8cGHfIHXRXgO867ocwtkNt9CfU3xP9iDUxCEaa36 VQsZlk9SG1skh1iHU/1xtDxwsAje77t4Vz1UN2rEwPJp0489qhPxWnmKlssKWjOIbT4y 3Gf1dsSQ+uh3XC37wj/eC8oOfItWqRxddhSdZ8Al512GP8t1GEoRAocdl/wqtqy8H2ug DMGQ==; dara=google.com ARC-Authentication-Results: i=2; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=YPFRGza8; arc=pass (i=1 dkim=pass dkdomain=intel.com dmarc=pass fromdomain=linux.intel.com); spf=pass (google.com: domain of linux-kernel+bounces-216138-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45e3:2400::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-216138-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from sv.mirrors.kernel.org (sv.mirrors.kernel.org. [2604:1380:45e3:2400::1]) by mx.google.com with ESMTPS id 41be03b00d2f7-6fee61fdabfsi7014213a12.786.2024.06.16.02.22.53 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 16 Jun 2024 02:22:53 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel+bounces-216138-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45e3:2400::1 as permitted sender) client-ip=2604:1380:45e3:2400::1; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=YPFRGza8; arc=pass (i=1 dkim=pass dkdomain=intel.com dmarc=pass fromdomain=linux.intel.com); spf=pass (google.com: domain of linux-kernel+bounces-216138-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45e3:2400::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-216138-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sv.mirrors.kernel.org (Postfix) with ESMTPS id C633A284336 for ; Sun, 16 Jun 2024 06:16:20 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 00E68181CE1; Sun, 16 Jun 2024 06:14:49 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="YPFRGza8" Received: from mgamail.intel.com (mgamail.intel.com [192.198.163.13]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id F181C181330 for ; Sun, 16 Jun 2024 06:14:45 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=192.198.163.13 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718518487; cv=none; b=LyVWUWMdtc8u1O3gteObZfrjqGFck1rWseNLrrRF6iicdkEBWcm7umyKEPRR8hAhjrAA0Tn2KCLFtJy1rS9snab5tTvAKnUhqMBrgGynHGgVvT/rQpF6dUgTV+x1vBXovF2i85pqKiSDT4usW4JHR1Gxp1ucUY1D8N2f+ynT5u0= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1718518487; c=relaxed/simple; bh=9b3M3MA5gdEVgznGAvwgNvwxuj/7B6/TUJxXVMNH9hA=; h=From:To:Cc:Subject:Date:Message-Id:In-Reply-To:References: MIME-Version; b=dtGQnwH5SjVBtj28I1cnpEuF/oPId9HpETZRifTIG9LIJJbCbiGollI+NZMJ2sudBm5dItbYODyQVmNA7t0paQPSH4n0FV7owHPzgDN6f17wxI/1C3uYX5RhCR6YWEZ4SrCZ5BqzuXdexONbC7RXI6qBT+sc54569n3HLtoXn34= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com; spf=none smtp.mailfrom=linux.intel.com; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b=YPFRGza8; arc=none smtp.client-ip=192.198.163.13 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1718518486; x=1750054486; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=9b3M3MA5gdEVgznGAvwgNvwxuj/7B6/TUJxXVMNH9hA=; b=YPFRGza8IXI/60kYAyOBidPjPfszICWlkYtbLsKB/3Fz1COsqaNf6FkF NCbfaiuFyEkPNhaFXKtyycMBW7nPFduMJTmSzBDgbBnVyOX7Mne2fgPsb 2Y8PtZ+6DEH6xilxgGZHcIoUXBJVTtaISPLY+5dKfb7qRMrG9g0pJAyQc DuVmKiLSpqitL2JbHPfcS8QyPbT1Mrg6bti19sgXmXmNNelUHvixjc/+5 1VGFEPPxTkQ+XkkY+WTlksCnmZBjIcbB2dnxtMNdd4JAZrkxajbmzDU83 jSWan5I9OElXKRnltG5tfOjGxrWtAMAMZ8KPZLJxkCjU70vjkTe69sk48 w==; X-CSE-ConnectionGUID: VGgPiAntTLi6CFzDR6PaBw== X-CSE-MsgGUID: UpU4ru0ERzqw1XbMlMNaVQ== X-IronPort-AV: E=McAfee;i="6700,10204,11104"; a="18290071" X-IronPort-AV: E=Sophos;i="6.08,241,1712646000"; d="scan'208";a="18290071" Received: from fmviesa007.fm.intel.com ([10.60.135.147]) by fmvoesa107.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 15 Jun 2024 23:14:45 -0700 X-CSE-ConnectionGUID: /90wdJUoTgiKTRE07sAaXg== X-CSE-MsgGUID: 6yJ1Z6NtRUqipDZ7+JCt6w== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="6.08,241,1712646000"; d="scan'208";a="40748195" Received: from unknown (HELO allen-box.sh.intel.com) ([10.239.159.127]) by fmviesa007.fm.intel.com with ESMTP; 15 Jun 2024 23:14:42 -0700 From: Lu Baolu To: Jason Gunthorpe , Kevin Tian , Joerg Roedel , Will Deacon , Robin Murphy , Jean-Philippe Brucker , Nicolin Chen , Yi Liu , Jacob Pan , Joel Granados Cc: iommu@lists.linux.dev, virtualization@lists.linux-foundation.org, linux-kernel@vger.kernel.org, Lu Baolu Subject: [PATCH v7 07/10] iommufd: Fault-capable hwpt attach/detach/replace Date: Sun, 16 Jun 2024 14:11:52 +0800 Message-Id: <20240616061155.169343-8-baolu.lu@linux.intel.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20240616061155.169343-1-baolu.lu@linux.intel.com> References: <20240616061155.169343-1-baolu.lu@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Add iopf-capable hw page table attach/detach/replace helpers. The pointer to iommufd_device is stored in the domain attachment handle, so that it can be echo'ed back in the iopf_group. The iopf-capable hw page tables can only be attached to devices that support the IOMMU_DEV_FEAT_IOPF feature. On the first attachment of an iopf-capable hw_pagetable to the device, the IOPF feature is enabled on the device. Similarly, after the last iopf-capable hwpt is detached from the device, the IOPF feature is disabled on the device. The current implementation allows a replacement between iopf-capable and non-iopf-capable hw page tables. This matches the nested translation use case, where a parent domain is attached by default and can then be replaced with a nested user domain with iopf support. Signed-off-by: Lu Baolu --- drivers/iommu/iommufd/iommufd_private.h | 41 +++++ drivers/iommu/iommufd/device.c | 7 +- drivers/iommu/iommufd/fault.c | 190 ++++++++++++++++++++++++ 3 files changed, 235 insertions(+), 3 deletions(-) diff --git a/drivers/iommu/iommufd/iommufd_private.h b/drivers/iommu/iommufd/iommufd_private.h index c8a4519f1405..aa4c26c87cb9 100644 --- a/drivers/iommu/iommufd/iommufd_private.h +++ b/drivers/iommu/iommufd/iommufd_private.h @@ -11,6 +11,7 @@ #include #include #include +#include "../iommu-priv.h" struct iommu_domain; struct iommu_group; @@ -293,6 +294,7 @@ int iommufd_check_iova_range(struct io_pagetable *iopt, struct iommufd_hw_pagetable { struct iommufd_object obj; struct iommu_domain *domain; + struct iommufd_fault *fault; }; struct iommufd_hwpt_paging { @@ -396,6 +398,9 @@ struct iommufd_device { /* always the physical device */ struct device *dev; bool enforce_cache_coherency; + /* protect iopf_enabled counter */ + struct mutex iopf_lock; + unsigned int iopf_enabled; }; static inline struct iommufd_device * @@ -456,6 +461,42 @@ struct iommufd_attach_handle { int iommufd_fault_alloc(struct iommufd_ucmd *ucmd); void iommufd_fault_destroy(struct iommufd_object *obj); +int iommufd_fault_domain_attach_dev(struct iommufd_hw_pagetable *hwpt, + struct iommufd_device *idev); +void iommufd_fault_domain_detach_dev(struct iommufd_hw_pagetable *hwpt, + struct iommufd_device *idev); +int iommufd_fault_domain_replace_dev(struct iommufd_device *idev, + struct iommufd_hw_pagetable *hwpt, + struct iommufd_hw_pagetable *old); + +static inline int iommufd_hwpt_attach_device(struct iommufd_hw_pagetable *hwpt, + struct iommufd_device *idev) +{ + if (hwpt->fault) + return iommufd_fault_domain_attach_dev(hwpt, idev); + + return iommu_attach_group(hwpt->domain, idev->igroup->group); +} + +static inline void iommufd_hwpt_detach_device(struct iommufd_hw_pagetable *hwpt, + struct iommufd_device *idev) +{ + if (hwpt->fault) + iommufd_fault_domain_detach_dev(hwpt, idev); + + iommu_detach_group(hwpt->domain, idev->igroup->group); +} + +static inline int iommufd_hwpt_replace_device(struct iommufd_device *idev, + struct iommufd_hw_pagetable *hwpt, + struct iommufd_hw_pagetable *old) +{ + if (old->fault || hwpt->fault) + return iommufd_fault_domain_replace_dev(idev, hwpt, old); + + return iommu_group_replace_domain(idev->igroup->group, hwpt->domain); +} + #ifdef CONFIG_IOMMUFD_TEST int iommufd_test(struct iommufd_ucmd *ucmd); void iommufd_selftest_destroy(struct iommufd_object *obj); diff --git a/drivers/iommu/iommufd/device.c b/drivers/iommu/iommufd/device.c index 873630c111c1..9a7ec5997c61 100644 --- a/drivers/iommu/iommufd/device.c +++ b/drivers/iommu/iommufd/device.c @@ -215,6 +215,7 @@ struct iommufd_device *iommufd_device_bind(struct iommufd_ctx *ictx, refcount_inc(&idev->obj.users); /* igroup refcount moves into iommufd_device */ idev->igroup = igroup; + mutex_init(&idev->iopf_lock); /* * If the caller fails after this success it must call @@ -376,7 +377,7 @@ int iommufd_hw_pagetable_attach(struct iommufd_hw_pagetable *hwpt, * attachment. */ if (list_empty(&idev->igroup->device_list)) { - rc = iommu_attach_group(hwpt->domain, idev->igroup->group); + rc = iommufd_hwpt_attach_device(hwpt, idev); if (rc) goto err_unresv; idev->igroup->hwpt = hwpt; @@ -402,7 +403,7 @@ iommufd_hw_pagetable_detach(struct iommufd_device *idev) mutex_lock(&idev->igroup->lock); list_del(&idev->group_item); if (list_empty(&idev->igroup->device_list)) { - iommu_detach_group(hwpt->domain, idev->igroup->group); + iommufd_hwpt_detach_device(hwpt, idev); idev->igroup->hwpt = NULL; } if (hwpt_is_paging(hwpt)) @@ -497,7 +498,7 @@ iommufd_device_do_replace(struct iommufd_device *idev, goto err_unlock; } - rc = iommu_group_replace_domain(igroup->group, hwpt->domain); + rc = iommufd_hwpt_replace_device(idev, hwpt, old_hwpt); if (rc) goto err_unresv; diff --git a/drivers/iommu/iommufd/fault.c b/drivers/iommu/iommufd/fault.c index 68ff94671d48..4934ae572638 100644 --- a/drivers/iommu/iommufd/fault.c +++ b/drivers/iommu/iommufd/fault.c @@ -8,6 +8,7 @@ #include #include #include +#include #include #include #include @@ -15,6 +16,195 @@ #include "../iommu-priv.h" #include "iommufd_private.h" +static int iommufd_fault_iopf_enable(struct iommufd_device *idev) +{ + struct device *dev = idev->dev; + int ret; + + /* + * Once we turn on PCI/PRI support for VF, the response failure code + * should not be forwarded to the hardware due to PRI being a shared + * resource between PF and VFs. There is no coordination for this + * shared capability. This waits for a vPRI reset to recover. + */ + if (dev_is_pci(dev) && to_pci_dev(dev)->is_virtfn) + return -EINVAL; + + mutex_lock(&idev->iopf_lock); + /* Device iopf has already been on. */ + if (++idev->iopf_enabled > 1) { + mutex_unlock(&idev->iopf_lock); + return 0; + } + + ret = iommu_dev_enable_feature(dev, IOMMU_DEV_FEAT_IOPF); + if (ret) + --idev->iopf_enabled; + mutex_unlock(&idev->iopf_lock); + + return ret; +} + +static void iommufd_fault_iopf_disable(struct iommufd_device *idev) +{ + mutex_lock(&idev->iopf_lock); + if (!WARN_ON(idev->iopf_enabled == 0)) { + if (--idev->iopf_enabled == 0) + iommu_dev_disable_feature(idev->dev, IOMMU_DEV_FEAT_IOPF); + } + mutex_unlock(&idev->iopf_lock); +} + +static int __fault_domain_attach_dev(struct iommufd_hw_pagetable *hwpt, + struct iommufd_device *idev) +{ + struct iommufd_attach_handle *handle; + int ret; + + handle = kzalloc(sizeof(*handle), GFP_KERNEL); + if (!handle) + return -ENOMEM; + + handle->idev = idev; + ret = iommu_attach_group_handle(hwpt->domain, idev->igroup->group, + &handle->handle); + if (ret) + kfree(handle); + + return ret; +} + +int iommufd_fault_domain_attach_dev(struct iommufd_hw_pagetable *hwpt, + struct iommufd_device *idev) +{ + int ret; + + if (!hwpt->fault) + return -EINVAL; + + ret = iommufd_fault_iopf_enable(idev); + if (ret) + return ret; + + ret = __fault_domain_attach_dev(hwpt, idev); + if (ret) + iommufd_fault_iopf_disable(idev); + + return ret; +} + +static void iommufd_auto_response_faults(struct iommufd_hw_pagetable *hwpt, + struct iommufd_attach_handle *handle) +{ + struct iommufd_fault *fault = hwpt->fault; + struct iopf_group *group, *next; + unsigned long index; + + if (!fault) + return; + + mutex_lock(&fault->mutex); + list_for_each_entry_safe(group, next, &fault->deliver, node) { + if (group->attach_handle != &handle->handle) + continue; + list_del(&group->node); + iopf_group_response(group, IOMMU_PAGE_RESP_INVALID); + iopf_free_group(group); + } + + xa_for_each(&fault->response, index, group) { + if (group->attach_handle != &handle->handle) + continue; + xa_erase(&fault->response, index); + iopf_group_response(group, IOMMU_PAGE_RESP_INVALID); + iopf_free_group(group); + } + mutex_unlock(&fault->mutex); +} + +static struct iommufd_attach_handle * +iommufd_device_get_attach_handle(struct iommufd_device *idev) +{ + struct iommu_attach_handle *handle; + + handle = iommu_attach_handle_get(idev->igroup->group, IOMMU_NO_PASID, 0); + if (!handle) + return NULL; + + return to_iommufd_handle(handle); +} + +void iommufd_fault_domain_detach_dev(struct iommufd_hw_pagetable *hwpt, + struct iommufd_device *idev) +{ + struct iommufd_attach_handle *handle; + + handle = iommufd_device_get_attach_handle(idev); + iommu_detach_group_handle(hwpt->domain, idev->igroup->group); + iommufd_auto_response_faults(hwpt, handle); + iommufd_fault_iopf_disable(idev); + kfree(handle); +} + +static int __fault_domain_replace_dev(struct iommufd_device *idev, + struct iommufd_hw_pagetable *hwpt, + struct iommufd_hw_pagetable *old) +{ + struct iommufd_attach_handle *handle, *curr = NULL; + int ret; + + if (old->fault) + curr = iommufd_device_get_attach_handle(idev); + + if (hwpt->fault) { + handle = kzalloc(sizeof(*handle), GFP_KERNEL); + if (!handle) + return -ENOMEM; + + handle->handle.domain = hwpt->domain; + handle->idev = idev; + ret = iommu_replace_group_handle(idev->igroup->group, + hwpt->domain, &handle->handle); + } else { + ret = iommu_replace_group_handle(idev->igroup->group, + hwpt->domain, NULL); + } + + if (!ret && curr) { + iommufd_auto_response_faults(old, curr); + kfree(curr); + } + + return ret; +} + +int iommufd_fault_domain_replace_dev(struct iommufd_device *idev, + struct iommufd_hw_pagetable *hwpt, + struct iommufd_hw_pagetable *old) +{ + bool iopf_off = !hwpt->fault && old->fault; + bool iopf_on = hwpt->fault && !old->fault; + int ret; + + if (iopf_on) { + ret = iommufd_fault_iopf_enable(idev); + if (ret) + return ret; + } + + ret = __fault_domain_replace_dev(idev, hwpt, old); + if (ret) { + if (iopf_on) + iommufd_fault_iopf_disable(idev); + return ret; + } + + if (iopf_off) + iommufd_fault_iopf_disable(idev); + + return 0; +} + void iommufd_fault_destroy(struct iommufd_object *obj) { struct iommufd_fault *fault = container_of(obj, struct iommufd_fault, obj); -- 2.34.1