Received: by 2002:ac0:950e:0:0:0:0:0 with SMTP id f14csp1321720imc; Sun, 17 Mar 2019 10:26:57 -0700 (PDT) X-Google-Smtp-Source: APXvYqwtojLW5Z2L/+lqBbMZd3fohSqWuisc5VhlG3dmvfN3vlRK4JshBKtKeDcw3zDQKmTpcmrh X-Received: by 2002:a65:62c5:: with SMTP id m5mr13565653pgv.77.1552843616924; Sun, 17 Mar 2019 10:26:56 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1552843616; cv=none; d=google.com; s=arc-20160816; b=iAetbB8kAmAQiyX2usx6JgO41sJKtYh8Z2EighTbkJAGfXKXp+DREuXFI0mcB+mS75 H774+8Ocx15pFe3Y8zyCMz59hQ1c9Wsr59c062cATZSWcrwp3G7FM4HTQ4oFIF9AJMqn jhb/E7Df0FiRUu6wtxRZdbpr6gN7gverhHoPHPhtk3ECxRtMKWYk/HPsHWQGwzrft5/i H+XkjqOojyJ6nw5gRrVTM6ixum9WnzBPfxOFBlZeQzK6/Pcmj0UVACPUoYvofrPyfcwv GQS1MH6a8YQJMeaR0Nsh8r+GLQtauLAKBiKz80BkTBUqn4DpewDV6P2zyChGrPJ1fi5/ feyw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from; bh=yK7GRKWfSEYGSOjrwhVa/OEnzCTHB9TaLWpso6IIcO8=; b=SqEF3U2oAFdGuXEmaH3BbHi5ZU3X7jpneCGI/Lsgqh82SHMMNshP4N8F7KBcuoa0wC exJmKGScOXKVfREMQxKu/Av0vLCtRpEAxvD/rLswzQ5mCaqLOp9FpxjEY+LK9nI5LnSL 4at8Quthbv6VJhJYA6bfTTLjBBKMnZx267+gE9rpUsnWzWaIy5Sj1QgllpK9bSv2+CXo PIjl3ryNE4QyxLnvqDXRWruMQ2OiSEptANgsOHjDnGtmTss+W3cxQihthmL7UzkVXKNT 94QQKqK+RP4qZZngsSe0CBDzOEbI+rQ/AVq7wMi8Y+myN84XSBQKJ/wishMGMTV2crT+ C8tQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id b11si6697211pgt.234.2019.03.17.10.26.41; Sun, 17 Mar 2019 10:26:56 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727617AbfCQRYP (ORCPT + 99 others); Sun, 17 Mar 2019 13:24:15 -0400 Received: from mx1.redhat.com ([209.132.183.28]:54496 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726735AbfCQRYO (ORCPT ); Sun, 17 Mar 2019 13:24:14 -0400 Received: from smtp.corp.redhat.com (int-mx08.intmail.prod.int.phx2.redhat.com [10.5.11.23]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 90AF5356DA; Sun, 17 Mar 2019 17:24:13 +0000 (UTC) Received: from laptop.redhat.com (ovpn-116-102.ams2.redhat.com [10.36.116.102]) by smtp.corp.redhat.com (Postfix) with ESMTP id 79BF219C71; Sun, 17 Mar 2019 17:24:07 +0000 (UTC) From: Eric Auger To: eric.auger.pro@gmail.com, eric.auger@redhat.com, iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org, kvm@vger.kernel.org, kvmarm@lists.cs.columbia.edu, joro@8bytes.org, alex.williamson@redhat.com, jacob.jun.pan@linux.intel.com, yi.l.liu@linux.intel.com, jean-philippe.brucker@arm.com, will.deacon@arm.com, robin.murphy@arm.com Cc: kevin.tian@intel.com, ashok.raj@intel.com, marc.zyngier@arm.com, christoffer.dall@arm.com, peter.maydell@linaro.org, vincent.stehle@arm.com Subject: [PATCH v6 15/22] dma-iommu: Implement NESTED_MSI cookie Date: Sun, 17 Mar 2019 18:22:25 +0100 Message-Id: <20190317172232.1068-16-eric.auger@redhat.com> In-Reply-To: <20190317172232.1068-1-eric.auger@redhat.com> References: <20190317172232.1068-1-eric.auger@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Scanned-By: MIMEDefang 2.84 on 10.5.11.23 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.30]); Sun, 17 Mar 2019 17:24:13 +0000 (UTC) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Up to now, when the type was UNMANAGED, we used to allocate IOVA pages within a range provided by the user. This does not work in nested mode. If both the host and the guest are exposed with SMMUs, each would allocate an IOVA. The guest allocates an IOVA (gIOVA) to map onto the guest MSI doorbell (gDB). The Host allocates another IOVA (hIOVA) to map onto the physical doorbell (hDB). So we end up with 2 unrelated mappings, at S1 and S2: S1 S2 gIOVA -> gDB hIOVA -> hDB The PCI device would be programmed with hIOVA. iommu_dma_bind_guest_msi allows to pass gIOVA/gDB to the host so that gIOVA can be used by the host instead of re-allocating a new hIOVA. The device handle also is passed to garantee devices belonging to different stage1 domains record distinguishable stage1 mappings. That way the host can create the following nested mapping: S1 S2 gIOVA -> gDB -> hDB this time, the PCI device will be programmed with the gIOVA MSI doorbell which is correctly mapped through the 2 stages. Signed-off-by: Eric Auger --- v3 -> v4: - change function names; add unregister - protect with msi_lock v2 -> v3: - also store the device handle on S1 mapping registration. This garantees we associate the associated S2 mapping binds to the correct physical MSI controller. v1 -> v2: - unmap stage2 on put() --- drivers/iommu/dma-iommu.c | 145 ++++++++++++++++++++++++++++++++++++-- include/linux/dma-iommu.h | 18 +++++ 2 files changed, 159 insertions(+), 4 deletions(-) diff --git a/drivers/iommu/dma-iommu.c b/drivers/iommu/dma-iommu.c index 77aabe637a60..77ec3d35d41e 100644 --- a/drivers/iommu/dma-iommu.c +++ b/drivers/iommu/dma-iommu.c @@ -35,12 +35,16 @@ struct iommu_dma_msi_page { struct list_head list; dma_addr_t iova; + dma_addr_t gpa; phys_addr_t phys; + size_t s1_granule; + struct device *dev; }; enum iommu_dma_cookie_type { IOMMU_DMA_IOVA_COOKIE, IOMMU_DMA_MSI_COOKIE, + IOMMU_DMA_NESTED_MSI_COOKIE, }; struct iommu_dma_cookie { @@ -110,14 +114,17 @@ EXPORT_SYMBOL(iommu_get_dma_cookie); * * Users who manage their own IOVA allocation and do not want DMA API support, * but would still like to take advantage of automatic MSI remapping, can use - * this to initialise their own domain appropriately. Users should reserve a + * this to initialise their own domain appropriately. Users may reserve a * contiguous IOVA region, starting at @base, large enough to accommodate the * number of PAGE_SIZE mappings necessary to cover every MSI doorbell address - * used by the devices attached to @domain. + * used by the devices attached to @domain. The other way round is to provide + * usable iova pages through the iommu_dma_bind_doorbell API (nested stages + * use case) */ int iommu_get_msi_cookie(struct iommu_domain *domain, dma_addr_t base) { struct iommu_dma_cookie *cookie; + int nesting, ret; if (domain->type != IOMMU_DOMAIN_UNMANAGED) return -EINVAL; @@ -125,7 +132,12 @@ int iommu_get_msi_cookie(struct iommu_domain *domain, dma_addr_t base) if (domain->iova_cookie) return -EEXIST; - cookie = cookie_alloc(IOMMU_DMA_MSI_COOKIE); + ret = iommu_domain_get_attr(domain, DOMAIN_ATTR_NESTING, &nesting); + if (!ret && nesting) + cookie = cookie_alloc(IOMMU_DMA_NESTED_MSI_COOKIE); + else + cookie = cookie_alloc(IOMMU_DMA_MSI_COOKIE); + if (!cookie) return -ENOMEM; @@ -146,6 +158,7 @@ void iommu_put_dma_cookie(struct iommu_domain *domain) { struct iommu_dma_cookie *cookie = domain->iova_cookie; struct iommu_dma_msi_page *msi, *tmp; + bool s2_unmap = false; if (!cookie) return; @@ -153,7 +166,15 @@ void iommu_put_dma_cookie(struct iommu_domain *domain) if (cookie->type == IOMMU_DMA_IOVA_COOKIE && cookie->iovad.granule) put_iova_domain(&cookie->iovad); + if (cookie->type == IOMMU_DMA_NESTED_MSI_COOKIE) + s2_unmap = true; + list_for_each_entry_safe(msi, tmp, &cookie->msi_page_list, list) { + if (s2_unmap && msi->phys) { + size_t size = cookie_msi_granule(cookie); + + WARN_ON(iommu_unmap(domain, msi->gpa, size) != size); + } list_del(&msi->list); kfree(msi); } @@ -162,6 +183,85 @@ void iommu_put_dma_cookie(struct iommu_domain *domain) } EXPORT_SYMBOL(iommu_put_dma_cookie); +/** + * iommu_dma_bind_guest_msi - Allows to pass the stage 1 + * binding of a virtual MSI doorbell used by @dev. + * + * @domain: domain handle + * @dev: device handle + * @iova: guest iova + * @gpa: gpa of the virtual doorbell + * @size: size of the granule used for the stage1 mapping + * + * In nested stage use case, the user can provide IOVA/IPA bindings + * corresponding to a guest MSI stage 1 mapping. When the host needs + * to map its own MSI doorbells, it can use @gpa as stage 2 input + * and map it onto the physical MSI doorbell. + */ +int iommu_dma_bind_guest_msi(struct iommu_domain *domain, struct device *dev, + dma_addr_t iova, phys_addr_t gpa, size_t size) +{ + struct iommu_dma_cookie *cookie = domain->iova_cookie; + struct iommu_dma_msi_page *msi; + int ret = 0; + + if (!cookie) + return -EINVAL; + + if (cookie->type != IOMMU_DMA_NESTED_MSI_COOKIE) + return -EINVAL; + + iova = iova & ~(dma_addr_t)(size - 1); + gpa = gpa & ~(phys_addr_t)(size - 1); + + spin_lock(&cookie->msi_lock); + + list_for_each_entry(msi, &cookie->msi_page_list, list) { + if (msi->iova == iova && msi->dev == dev) + goto unlock; /* this page is already registered */ + } + + msi = kzalloc(sizeof(*msi), GFP_ATOMIC); + if (!msi) { + ret = -ENOMEM; + goto unlock; + } + + msi->iova = iova; + msi->gpa = gpa; + msi->dev = dev; + msi->s1_granule = size; + list_add(&msi->list, &cookie->msi_page_list); +unlock: + spin_unlock(&cookie->msi_lock); + return ret; +} +EXPORT_SYMBOL(iommu_dma_bind_guest_msi); + +void iommu_dma_unbind_guest_msi(struct iommu_domain *domain, struct device *dev, + dma_addr_t giova) +{ + struct iommu_dma_cookie *cookie = domain->iova_cookie; + struct iommu_dma_msi_page *msi, *tmp; + + list_for_each_entry_safe(msi, tmp, &cookie->msi_page_list, list) { + dma_addr_t aligned_giova = + giova & ~(dma_addr_t)(msi->s1_granule - 1); + + if (msi->dev == dev && msi->iova == aligned_giova) { + if (msi->phys) { + /* unmap the stage 2 */ + size_t size = cookie_msi_granule(cookie); + + WARN_ON(iommu_unmap(domain, msi->gpa, size) != size); + } + list_del(&msi->list); + kfree(msi); + } + } +} +EXPORT_SYMBOL(iommu_dma_unbind_guest_msi); + /** * iommu_dma_get_resv_regions - Reserved region driver helper * @dev: Device from iommu_get_resv_regions() @@ -855,6 +955,16 @@ void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle, __iommu_dma_unmap(iommu_get_dma_domain(dev), handle, size); } +static bool msi_page_match(struct iommu_dma_msi_page *msi_page, + struct device *dev, phys_addr_t msi_addr) +{ + bool match = msi_page->phys == msi_addr; + + if (msi_page->dev) + match &= (msi_page->dev == dev); + return match; +} + static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev, phys_addr_t msi_addr, struct iommu_domain *domain) { @@ -866,9 +976,36 @@ static struct iommu_dma_msi_page *iommu_dma_get_msi_page(struct device *dev, msi_addr &= ~(phys_addr_t)(size - 1); list_for_each_entry(msi_page, &cookie->msi_page_list, list) - if (msi_page->phys == msi_addr) + if (msi_page_match(msi_page, dev, msi_addr)) return msi_page; + /* + * In nested stage mode, we do not allocate an MSI page in + * a range provided by the user. Instead, IOVA/IPA bindings are + * individually provided. We reuse thise IOVAs to build the + * GIOVA -> GPA -> MSI HPA nested stage mapping. + */ + if (cookie->type == IOMMU_DMA_NESTED_MSI_COOKIE) { + list_for_each_entry(msi_page, &cookie->msi_page_list, list) + if (!msi_page->phys && msi_page->dev == dev) { + int ret; + + /* do the stage 2 mapping */ + ret = iommu_map(domain, + msi_page->gpa, msi_addr, size, + IOMMU_MMIO | IOMMU_WRITE); + if (ret) { + pr_warn("MSI S2 mapping failed (%d)\n", + ret); + return NULL; + } + msi_page->phys = msi_addr; + return msi_page; + } + pr_warn("%s no MSI binding found\n", __func__); + return NULL; + } + msi_page = kzalloc(sizeof(*msi_page), GFP_ATOMIC); if (!msi_page) return NULL; diff --git a/include/linux/dma-iommu.h b/include/linux/dma-iommu.h index e760dc5d1fa8..6fc0f2b4a56a 100644 --- a/include/linux/dma-iommu.h +++ b/include/linux/dma-iommu.h @@ -24,6 +24,7 @@ #include #include #include +#include int iommu_dma_init(void); @@ -73,6 +74,10 @@ void iommu_dma_unmap_resource(struct device *dev, dma_addr_t handle, /* The DMA API isn't _quite_ the whole story, though... */ void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg); void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list); +int iommu_dma_bind_guest_msi(struct iommu_domain *domain, struct device *dev, + dma_addr_t iova, phys_addr_t gpa, size_t size); +void iommu_dma_unbind_guest_msi(struct iommu_domain *domain, + struct device *dev, dma_addr_t giova); #else @@ -103,6 +108,19 @@ static inline void iommu_dma_map_msi_msg(int irq, struct msi_msg *msg) { } +static inline int +iommu_dma_bind_guest_msi(struct iommu_domain *domain, struct device *dev, + dma_addr_t iova, phys_addr_t gpa, size_t size) +{ + return -ENODEV; +} + +static inline void +iommu_dma_unbind_guest_msi(struct iommu_domain *domain, + struct device *dev, dma_addr_t giova); +{ +} + static inline void iommu_dma_get_resv_regions(struct device *dev, struct list_head *list) { } -- 2.20.1