Received: by 2002:a25:1985:0:0:0:0:0 with SMTP id 127csp430465ybz; Tue, 21 Apr 2020 11:49:11 -0700 (PDT) X-Google-Smtp-Source: APiQypLMX2/Zd4NnQKUQErKAII7rbusW1epuRIYN5kEdcRHlbGIGJ5v6yKxsBVd4IIy2RtpvTGv4 X-Received: by 2002:aa7:d689:: with SMTP id d9mr17755374edr.22.1587494951132; Tue, 21 Apr 2020 11:49:11 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1587494951; cv=none; d=google.com; s=arc-20160816; b=awmq83awTezXTCyE717N3iUFmOEcALAHSQ+KDItEXxV6+FI5JG9hGFpx8w2hIDgT8a /T+IktQGLXD517XwfYoCM4ms3vRemYiOXiLxsBSAyRH75ZpYrNjvcW3rHjRHdId1REIG wvZAkKGbzHaVO8E6w30ex0OQr5IM+pTuyZmEwUsFfbofzQVCn0XwMpo+AdLoZlw+k+fR W4EMxktAzH1ugCUr1HV+DcMm8dPrm6szKGxJBYsLg5n5jTh/1RqGWG3bVF7Na1fz7onT GPj4p3aLNBM9TOWzHjv0l011rym7yHwaeJQ+DU8Hk1U3W9PuZYg75mnOwhgOhk8RTff+ oLog== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:references:in-reply-to:message-id:date :subject:cc:to:from:ironport-sdr:ironport-sdr; bh=Pom1xFQRnDZqGP0GCj6IL5vii4nlbkbf5Dix9IY1hqY=; b=vgOOgJXuI7jno5dSaPezWRAeFEFoFFMu9hvZm4/FwtvGsofeE//j2AA6jhZ/tKCmwN HedUTzVCHC/UsF4+TOlFebCjtxGjlWhZhwrEcLUdbpitAVzdGaQxKF89bKwl5Xy1Y6Mb Zs34Hvm/f7VXWjubBAUC2jUrIeK8mtN6gSmbb9LgReMzVo2RKGV0IdNJpOb/qTSwjRBs 87G/nJHChKMcGXdLfiU9E8zlw6QugmfQSEEfC2MBXU+xnSxr3YmmCLBkZKGLZo8mGDmF xK4kwZCsyqrgBX2eh+QX1hmiAbbWyJa21IcJe3yRF5phSwG8jspOA01MemVCqg4fe0FA xSEQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id u19si2074255eja.75.2020.04.21.11.48.47; Tue, 21 Apr 2020 11:49:11 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726695AbgDUSrW (ORCPT + 99 others); Tue, 21 Apr 2020 14:47:22 -0400 Received: from mga04.intel.com ([192.55.52.120]:22255 "EHLO mga04.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726467AbgDUSrK (ORCPT ); Tue, 21 Apr 2020 14:47:10 -0400 IronPort-SDR: Gvi6BQtE2Yzl9dj9qO556vWO6bHOIcFf9WJkT0+ZLDUrWxLcdQESQVfd6xr3cWFJeXF7dwcHkt VNeDFKLw2glQ== X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from orsmga001.jf.intel.com ([10.7.209.18]) by fmsmga104.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 21 Apr 2020 11:46:48 -0700 IronPort-SDR: +1mfGseSW/2ckPRgtTnd3DaqQTmgZ4aN70nKaKVQiD0QKYcqU6Aqmzze8A7ThPOv8oLWsj+F3d Ktaez4n5CqzA== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.72,411,1580803200"; d="scan'208";a="334367883" Received: from jacob-builder.jf.intel.com ([10.7.199.155]) by orsmga001.jf.intel.com with ESMTP; 21 Apr 2020 11:46:48 -0700 From: Jacob Pan To: "Lu Baolu" , iommu@lists.linux-foundation.org, LKML , Joerg Roedel , David Woodhouse , Jean-Philippe Brucker , Eric Auger Cc: "Yi Liu" , "Tian, Kevin" , Raj Ashok , Alex Williamson , "Christoph Hellwig" , Jonathan Cameron , Jacob Pan , Liu@vger.kernel.org Subject: [PATCH v12 6/8] iommu/vt-d: Add svm/sva invalidate function Date: Tue, 21 Apr 2020 11:52:43 -0700 Message-Id: <1587495165-80096-7-git-send-email-jacob.jun.pan@linux.intel.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1587495165-80096-1-git-send-email-jacob.jun.pan@linux.intel.com> References: <1587495165-80096-1-git-send-email-jacob.jun.pan@linux.intel.com> Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org When Shared Virtual Address (SVA) is enabled for a guest OS via vIOMMU, we need to provide invalidation support at IOMMU API and driver level. This patch adds Intel VT-d specific function to implement iommu passdown invalidate API for shared virtual address. The use case is for supporting caching structure invalidation of assigned SVM capable devices. Emulated IOMMU exposes queue invalidation capability and passes down all descriptors from the guest to the physical IOMMU. The assumption is that guest to host device ID mapping should be resolved prior to calling IOMMU driver. Based on the device handle, host IOMMU driver can replace certain fields before submit to the invalidation queue. --- v12 - Use ratelimited prints for all user called APIs. - Check for domain nesting attr --- Signed-off-by: Jacob Pan Signed-off-by: Ashok Raj Signed-off-by: Liu, Yi L Signed-off-by: Jacob Pan --- drivers/iommu/intel-iommu.c | 175 ++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 175 insertions(+) diff --git a/drivers/iommu/intel-iommu.c b/drivers/iommu/intel-iommu.c index 8862d6b0ef21..24de233faaf5 100644 --- a/drivers/iommu/intel-iommu.c +++ b/drivers/iommu/intel-iommu.c @@ -5595,6 +5595,180 @@ static void intel_iommu_aux_detach_device(struct iommu_domain *domain, aux_domain_remove_dev(to_dmar_domain(domain), dev); } +/* + * 2D array for converting and sanitizing IOMMU generic TLB granularity to + * VT-d granularity. Invalidation is typically included in the unmap operation + * as a result of DMA or VFIO unmap. However, for assigned devices guest + * owns the first level page tables. Invalidations of translation caches in the + * guest are trapped and passed down to the host. + * + * vIOMMU in the guest will only expose first level page tables, therefore + * we do not support IOTLB granularity for request without PASID (second level). + * + * For example, to find the VT-d granularity encoding for IOTLB + * type and page selective granularity within PASID: + * X: indexed by iommu cache type + * Y: indexed by enum iommu_inv_granularity + * [IOMMU_CACHE_INV_TYPE_IOTLB][IOMMU_INV_GRANU_ADDR] + */ + +const static int +inv_type_granu_table[IOMMU_CACHE_INV_TYPE_NR][IOMMU_INV_GRANU_NR] = { + /* + * PASID based IOTLB invalidation: PASID selective (per PASID), + * page selective (address granularity) + */ + {-EINVAL, QI_GRAN_NONG_PASID, QI_GRAN_PSI_PASID}, + /* PASID based dev TLBs */ + {-EINVAL, -EINVAL, QI_DEV_IOTLB_GRAN_PASID_SEL}, + /* PASID cache */ + {-EINVAL, -EINVAL, -EINVAL} +}; + +static inline int to_vtd_granularity(int type, int granu) +{ + return inv_type_granu_table[type][granu]; +} + +static inline u64 to_vtd_size(u64 granu_size, u64 nr_granules) +{ + u64 nr_pages = (granu_size * nr_granules) >> VTD_PAGE_SHIFT; + + /* VT-d size is encoded as 2^size of 4K pages, 0 for 4k, 9 for 2MB, etc. + * IOMMU cache invalidate API passes granu_size in bytes, and number of + * granu size in contiguous memory. + */ + return order_base_2(nr_pages); +} + +#ifdef CONFIG_INTEL_IOMMU_SVM +static int +intel_iommu_sva_invalidate(struct iommu_domain *domain, struct device *dev, + struct iommu_cache_invalidate_info *inv_info) +{ + struct dmar_domain *dmar_domain = to_dmar_domain(domain); + struct device_domain_info *info; + struct intel_iommu *iommu; + unsigned long flags; + int cache_type; + u8 bus, devfn; + u16 did, sid; + int ret = 0; + u64 size = 0; + + if (!inv_info || !dmar_domain || + inv_info->version != IOMMU_CACHE_INVALIDATE_INFO_VERSION_1) + return -EINVAL; + + if (!dev || !dev_is_pci(dev)) + return -ENODEV; + + iommu = device_to_iommu(dev, &bus, &devfn); + if (!iommu) + return -ENODEV; + + if (!(dmar_domain->flags & DOMAIN_FLAG_NESTING_MODE)) + return -EINVAL; + + spin_lock_irqsave(&device_domain_lock, flags); + spin_lock(&iommu->lock); + info = iommu_support_dev_iotlb(dmar_domain, iommu, bus, devfn); + if (!info) { + ret = -EINVAL; + goto out_unlock; + } + did = dmar_domain->iommu_did[iommu->seq_id]; + sid = PCI_DEVID(bus, devfn); + + /* Size is only valid in non-PASID selective invalidation */ + if (inv_info->granularity != IOMMU_INV_GRANU_PASID) + size = to_vtd_size(inv_info->addr_info.granule_size, + inv_info->addr_info.nb_granules); + + for_each_set_bit(cache_type, + (unsigned long *)&inv_info->cache, + IOMMU_CACHE_INV_TYPE_NR) { + int granu = 0; + u64 pasid = 0; + + granu = to_vtd_granularity(cache_type, inv_info->granularity); + if (granu == -EINVAL) { + pr_err_ratelimited("Invalid cache type and granu combination %d/%d\n", + cache_type, inv_info->granularity); + break; + } + + /* + * PASID is stored in different locations based on the + * granularity. + */ + if (inv_info->granularity == IOMMU_INV_GRANU_PASID && + (inv_info->pasid_info.flags & IOMMU_INV_PASID_FLAGS_PASID)) + pasid = inv_info->pasid_info.pasid; + else if (inv_info->granularity == IOMMU_INV_GRANU_ADDR && + (inv_info->addr_info.flags & IOMMU_INV_ADDR_FLAGS_PASID)) + pasid = inv_info->addr_info.pasid; + + switch (BIT(cache_type)) { + case IOMMU_CACHE_INV_TYPE_IOTLB: + if (inv_info->granularity == IOMMU_INV_GRANU_ADDR && + size && + (inv_info->addr_info.addr & ((BIT(VTD_PAGE_SHIFT + size)) - 1))) { + pr_err_ratelimited("Address out of range, 0x%llx, size order %llu\n", + inv_info->addr_info.addr, size); + ret = -ERANGE; + goto out_unlock; + } + + /* + * If granu is PASID-selective, address is ignored. + * We use npages = -1 to indicate that. + */ + qi_flush_piotlb(iommu, did, pasid, + mm_to_dma_pfn(inv_info->addr_info.addr), + (granu == QI_GRAN_NONG_PASID) ? -1 : 1 << size, + inv_info->addr_info.flags & IOMMU_INV_ADDR_FLAGS_LEAF); + + /* + * Always flush device IOTLB if ATS is enabled. vIOMMU + * in the guest may assume IOTLB flush is inclusive, + * which is more efficient. + */ + if (info->ats_enabled) + qi_flush_dev_iotlb_pasid(iommu, sid, + info->pfsid, pasid, + info->ats_qdep, + inv_info->addr_info.addr, + size, granu); + break; + case IOMMU_CACHE_INV_TYPE_DEV_IOTLB: + if (info->ats_enabled) + qi_flush_dev_iotlb_pasid(iommu, sid, + info->pfsid, pasid, + info->ats_qdep, + inv_info->addr_info.addr, + size, granu); + else + pr_warn_ratelimited("Passdown device IOTLB flush w/o ATS!\n"); + break; + case IOMMU_CACHE_INV_TYPE_PASID: + qi_flush_pasid_cache(iommu, did, granu, + inv_info->pasid_info.pasid); + break; + default: + dev_err_ratelimited(dev, "Unsupported IOMMU invalidation type %d\n", + cache_type); + ret = -EINVAL; + } + } +out_unlock: + spin_unlock(&iommu->lock); + spin_unlock_irqrestore(&device_domain_lock, flags); + + return ret; +} +#endif + static int intel_iommu_map(struct iommu_domain *domain, unsigned long iova, phys_addr_t hpa, size_t size, int iommu_prot, gfp_t gfp) @@ -6180,6 +6354,7 @@ const struct iommu_ops intel_iommu_ops = { .is_attach_deferred = intel_iommu_is_attach_deferred, .pgsize_bitmap = INTEL_IOMMU_PGSIZES, #ifdef CONFIG_INTEL_IOMMU_SVM + .cache_invalidate = intel_iommu_sva_invalidate, .sva_bind_gpasid = intel_svm_bind_gpasid, .sva_unbind_gpasid = intel_svm_unbind_gpasid, #endif -- 2.7.4