From: Liu Yi L
To: alex.williamson@redhat.com, eric.auger@redhat.com,
	baolu.lu@linux.intel.com, joro@8bytes.org
Cc: kevin.tian@intel.com, jacob.jun.pan@linux.intel.com,
	ashok.raj@intel.com, yi.l.liu@intel.com, jun.j.tian@intel.com,
	yi.y.sun@intel.com, jean-philippe@linaro.org, peterx@redhat.com,
	hao.wu@intel.com, stefanha@gmail.com,
	iommu@lists.linux-foundation.org, kvm@vger.kernel.org,
	linux-kernel@vger.kernel.org
Subject: [PATCH v4 04/15] vfio/type1: Report iommu nesting info to userspace
Date: Sat, 4 Jul 2020 04:26:18 -0700
Message-Id: <1593861989-35920-5-git-send-email-yi.l.liu@intel.com>
In-Reply-To: <1593861989-35920-1-git-send-email-yi.l.liu@intel.com>
References: <1593861989-35920-1-git-send-email-yi.l.liu@intel.com>

This patch exports iommu nesting capability info to user space through
VFIO. User space is expected to check this info for the supported uAPIs
(e.g. PASID alloc/free, bind page table, and cache invalidation) and for
the vendor specific format information of the first level/stage page
table that will be bound.

The nesting info is available only after the nesting iommu type is set
for a container. The current implementation imposes one limitation: one
nesting container should include at most one group. The philosophy of a
vfio container is to have all groups/devices within the container share
the same IOMMU context. When vSVA is enabled, one IOMMU context could
include one 2nd-level address space and multiple 1st-level address
spaces. While the 2nd-level address space is reasonably sharable by
multiple groups, blindly sharing 1st-level address spaces across all
groups within the container might instead break the guest expectation.
In the future a sub/super container concept might be introduced to allow
partial address space sharing within an IOMMU context. But for now let's
go with this restriction by requiring a singleton container for using
nesting iommu features. The link below has the related discussion about
this decision.

https://lkml.org/lkml/2020/5/15/1028

Cc: Kevin Tian
Cc: Jacob Pan
Cc: Alex Williamson
Cc: Eric Auger
Cc: Jean-Philippe Brucker
Cc: Joerg Roedel
Cc: Lu Baolu
Signed-off-by: Liu Yi L
---
v3 -> v4:
*) address comments against v3.
v1 -> v2:
*) added in v2
---
 drivers/vfio/vfio_iommu_type1.c | 105 +++++++++++++++++++++++++++++++++++-----
 include/uapi/linux/vfio.h       |  16 ++++++
 2 files changed, 109 insertions(+), 12 deletions(-)

diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c
index 7accb59..80623b8 100644
--- a/drivers/vfio/vfio_iommu_type1.c
+++ b/drivers/vfio/vfio_iommu_type1.c
@@ -62,18 +62,20 @@ MODULE_PARM_DESC(dma_entry_limit,
 		 "Maximum number of user DMA mappings per container (65535).");
 
 struct vfio_iommu {
-	struct list_head	domain_list;
-	struct list_head	iova_list;
-	struct vfio_domain	*external_domain; /* domain for external user */
-	struct mutex		lock;
-	struct rb_root		dma_list;
-	struct blocking_notifier_head notifier;
-	unsigned int		dma_avail;
-	uint64_t		pgsize_bitmap;
-	bool			v2;
-	bool			nesting;
-	bool			dirty_page_tracking;
-	bool			pinned_page_dirty_scope;
+	struct list_head		domain_list;
+	struct list_head		iova_list;
+	struct vfio_domain		*external_domain; /* domain for
+							     external user */
+	struct mutex			lock;
+	struct rb_root			dma_list;
+	struct blocking_notifier_head	notifier;
+	unsigned int			dma_avail;
+	uint64_t			pgsize_bitmap;
+	bool				v2;
+	bool				nesting;
+	bool				dirty_page_tracking;
+	bool				pinned_page_dirty_scope;
+	struct iommu_nesting_info	*nesting_info;
 };
 
 struct vfio_domain {
@@ -130,6 +132,9 @@ struct vfio_regions {
 #define IS_IOMMU_CAP_DOMAIN_IN_CONTAINER(iommu)	\
 					(!list_empty(&iommu->domain_list))
 
+#define IS_DOMAIN_IN_CONTAINER(iommu)	((iommu->external_domain) || \
+					 (!list_empty(&iommu->domain_list)))
+
 #define DIRTY_BITMAP_BYTES(n)	(ALIGN(n, BITS_PER_TYPE(u64)) / BITS_PER_BYTE)
 
 /*
@@ -1929,6 +1934,13 @@ static void vfio_iommu_iova_insert_copy(struct vfio_iommu *iommu,
 
 	list_splice_tail(iova_copy, iova);
 }
+
+static void vfio_iommu_release_nesting_info(struct vfio_iommu *iommu)
+{
+	kfree(iommu->nesting_info);
+	iommu->nesting_info = NULL;
+}
+
 static int vfio_iommu_type1_attach_group(void *iommu_data,
 					 struct iommu_group *iommu_group)
 {
@@ -1959,6 +1971,12 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 		}
 	}
 
+	/* Nesting type container can include only one group */
+	if (iommu->nesting && IS_DOMAIN_IN_CONTAINER(iommu)) {
+		mutex_unlock(&iommu->lock);
+		return -EINVAL;
+	}
+
 	group = kzalloc(sizeof(*group), GFP_KERNEL);
 	domain = kzalloc(sizeof(*domain), GFP_KERNEL);
 	if (!group || !domain) {
@@ -2029,6 +2047,36 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	if (ret)
 		goto out_domain;
 
+	/* Nesting cap info is available only after attaching */
+	if (iommu->nesting) {
+		struct iommu_nesting_info tmp;
+		struct iommu_nesting_info *info;
+
+		/* First get the size of vendor specific nesting info */
+		ret = iommu_domain_get_attr(domain->domain,
+					    DOMAIN_ATTR_NESTING,
+					    &tmp);
+		if (ret)
+			goto out_detach;
+
+		info = kzalloc(tmp.size, GFP_KERNEL);
+		if (!info) {
+			ret = -ENOMEM;
+			goto out_detach;
+		}
+
+		/* Now get the nesting info */
+		info->size = tmp.size;
+		ret = iommu_domain_get_attr(domain->domain,
+					    DOMAIN_ATTR_NESTING,
+					    info);
+		if (ret) {
+			kfree(info);
+			goto out_detach;
+		}
+		iommu->nesting_info = info;
+	}
+
 	/* Get aperture info */
 	iommu_domain_get_attr(domain->domain, DOMAIN_ATTR_GEOMETRY, &geo);
 
@@ -2138,6 +2186,7 @@ static int vfio_iommu_type1_attach_group(void *iommu_data,
 	return 0;
 
 out_detach:
+	vfio_iommu_release_nesting_info(iommu);
 	vfio_iommu_detach_group(domain, group);
 out_domain:
 	iommu_domain_free(domain->domain);
@@ -2338,6 +2387,8 @@ static void vfio_iommu_type1_detach_group(void *iommu_data,
 					vfio_iommu_unmap_unpin_all(iommu);
 				else
 					vfio_iommu_unmap_unpin_reaccount(iommu);
+
+				vfio_iommu_release_nesting_info(iommu);
 			}
 			iommu_domain_free(domain->domain);
 			list_del(&domain->next);
@@ -2546,6 +2597,30 @@ static int vfio_iommu_migration_build_caps(struct vfio_iommu *iommu,
 	return vfio_info_add_capability(caps, &cap_mig.header, sizeof(cap_mig));
 }
 
+static int vfio_iommu_info_add_nesting_cap(struct vfio_iommu *iommu,
+					   struct vfio_info_cap *caps)
+{
+	struct vfio_info_cap_header *header;
+	struct vfio_iommu_type1_info_cap_nesting *nesting_cap;
+	size_t size;
+
+	size = sizeof(*nesting_cap) + iommu->nesting_info->size;
+
+	header = vfio_info_cap_add(caps, size,
+				   VFIO_IOMMU_TYPE1_INFO_CAP_NESTING, 1);
+	if (IS_ERR(header))
+		return PTR_ERR(header);
+
+	nesting_cap = container_of(header,
+				   struct vfio_iommu_type1_info_cap_nesting,
+				   header);
+
+	memcpy(&nesting_cap->info, iommu->nesting_info,
+	       iommu->nesting_info->size);
+
+	return 0;
+}
+
 static int vfio_iommu_type1_get_info(struct vfio_iommu *iommu,
 				     unsigned long arg)
 {
@@ -2586,6 +2661,12 @@ static int vfio_iommu_type1_get_info(struct vfio_iommu *iommu,
 	if (ret)
 		return ret;
 
+	if (iommu->nesting_info) {
+		ret = vfio_iommu_info_add_nesting_cap(iommu, &caps);
+		if (ret)
+			return ret;
+	}
+
 	if (caps.size) {
 		info.flags |= VFIO_IOMMU_INFO_CAPS;
 
diff --git a/include/uapi/linux/vfio.h b/include/uapi/linux/vfio.h
index 9204705..3e3de9c 100644
--- a/include/uapi/linux/vfio.h
+++ b/include/uapi/linux/vfio.h
@@ -1039,6 +1039,22 @@ struct vfio_iommu_type1_info_cap_migration {
 	__u64	max_dirty_bitmap_size;		/* in bytes */
 };
 
+#define VFIO_IOMMU_TYPE1_INFO_CAP_NESTING  3
+
+/*
+ * Reporting nesting info to user space.
+ *
+ * @info:	the nesting info provided by IOMMU driver. Today
+ *		it is expected to be a struct iommu_nesting_info
+ *		data.
+ */
+struct vfio_iommu_type1_info_cap_nesting {
+	struct	vfio_info_cap_header header;
+	__u32	flags;
+	__u32	padding;
+	__u8	info[];
+};
+
 #define VFIO_IOMMU_GET_INFO _IO(VFIO_TYPE, VFIO_BASE + 12)
 
 /**
-- 
2.7.4
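
Note (illustration only, not part of the patch): the commit message
expects user space to check the reported nesting info. A minimal sketch
of how user space might locate the capability in the buffer returned by
VFIO_IOMMU_GET_INFO is given below. It assumes the usual VFIO capability
chain convention, where each header's "next" field is an offset from the
start of the info buffer and 0 ends the chain. The struct and function
names (cap_header, cap_nesting, find_cap, demo_find_nesting) are
hypothetical local mirrors so the sketch builds without the patched
<linux/vfio.h>; real code would use the uAPI definitions and an actual
ioctl() on the container fd.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local mirror of struct vfio_info_cap_header (id, version, next). */
struct cap_header {
	uint16_t id;
	uint16_t version;
	uint32_t next;	/* offset of the next capability, 0 ends the chain */
};

/* Capability ID proposed by this patch (assumed, mirrors the diff). */
#define CAP_ID_NESTING 3

/* Local mirror of struct vfio_iommu_type1_info_cap_nesting. */
struct cap_nesting {
	struct cap_header header;
	uint32_t flags;
	uint32_t padding;
	uint8_t info[];	/* vendor specific nesting info */
};

/*
 * Walk the capability chain in buf (len bytes), starting at offset
 * first, and return the offset of the capability with the given id.
 * Returns 0 if the id is absent or the chain is malformed.
 */
static uint32_t find_cap(const uint8_t *buf, size_t len,
			 uint32_t first, uint16_t id)
{
	uint32_t off = first;
	/* Bound the walk so a cyclic chain cannot loop forever. */
	size_t budget = len / sizeof(struct cap_header) + 1;

	while (off != 0 && budget-- > 0) {
		struct cap_header hdr;

		if (off > len || len - off < sizeof(hdr))
			return 0;	/* header would run past the buffer */
		/* memcpy avoids unaligned access into the byte buffer */
		memcpy(&hdr, buf + off, sizeof(hdr));
		if (hdr.id == id)
			return off;
		off = hdr.next;
	}
	return 0;
}

/*
 * Hypothetical self-check: fake a 16-byte fixed info header followed
 * by two chained capabilities, then look up the nesting one.
 */
static uint32_t demo_find_nesting(void)
{
	uint8_t buf[64] = {0};
	struct cap_header other = { .id = 2, .version = 1, .next = 32 };
	struct cap_header nest = { .id = CAP_ID_NESTING, .version = 1,
				   .next = 0 };

	assert(sizeof(struct cap_header) == 8);
	memcpy(buf + 16, &other, sizeof(other));
	memcpy(buf + 32, &nest, sizeof(nest));
	return find_cap(buf, sizeof(buf), 16, CAP_ID_NESTING);
}
```

Once the offset is known, the vendor specific payload would start at
that offset plus offsetof(struct cap_nesting, info). Since the patch
only adds the capability when iommu->nesting_info is set, a lookup that
returns 0 would mean no nesting info was reported for the container.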