From: Zhen Lei <thunder.leizhen@huawei.com>
To: Will Deacon, Joerg Roedel, linux-arm-kernel, iommu, Robin Murphy, linux-kernel
Cc: Zefan Li, Xinwei Hu, Tianhong Ding, Hanjun Guo, Zhen Lei, John Garry
Subject: [PATCH 2/5] iommu: add a new member unmap_tlb_sync into struct iommu_ops
Date: Mon, 26 Jun 2017 21:38:47 +0800
Message-ID: <1498484330-10840-3-git-send-email-thunder.leizhen@huawei.com>
In-Reply-To: <1498484330-10840-1-git-send-email-thunder.leizhen@huawei.com>
References: <1498484330-10840-1-git-send-email-thunder.leizhen@huawei.com>

An iova range may contain many pages/blocks, especially in the unmap_sg case. Currently, each page/block unmapping is followed by a TLB invalidation operation plus a wait for its completion (the tlb_sync), but in fact only one tlb_sync is needed, at the very end. Look at the loop in iommu_unmap:

	while (unmapped < size) {
		...
		unmapped_page = domain->ops->unmap(domain, iova, pgsize);
		...
	}

Doing the tlb_sync inside domain->ops->unmap is therefore not a good idea. Deferring it to a single call after the loop brings real benefits, because the following costly actions can be reduced:

1. The IOMMU hardware is a resource shared between CPUs, so each tlb_sync operation needs lock protection.
2. The IOMMU hardware sits outside the CPU, so starting a tlb_sync and polling until it has finished can take a long time.

Some people might ask: is it safe to do so? The answer is yes. The standard processing flow is:

	alloc iova
	map
	process data
	unmap
	tlb invalidation and sync
	free iova

What must be guaranteed is that "free iova" happens after both "unmap" and the "tlbi operation", which is exactly what we do now. This ensures that all TLB entries for an iova range have been invalidated before the iova is reallocated.
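As an illustration only (not part of this patch), a driver whose ->unmap() callback merely queues TLB invalidation commands could wire up the new hook roughly as below; the names my_smmu_domain, to_my_smmu_domain, my_smmu_tlb_sync and my_smmu_unmap are hypothetical:

	/*
	 * Hypothetical driver-side callback: ->unmap() has only queued
	 * the TLB invalidation commands, so wait here, once per
	 * iommu_unmap() call, until all of them have completed.
	 */
	static void my_smmu_unmap_tlb_sync(struct iommu_domain *domain)
	{
		struct my_smmu_domain *smmu_domain = to_my_smmu_domain(domain);

		my_smmu_tlb_sync(smmu_domain);
	}

	static struct iommu_ops my_smmu_ops = {
		/* ... other callbacks ... */
		.unmap		= my_smmu_unmap,
		.unmap_tlb_sync	= my_smmu_unmap_tlb_sync,
	};

Since iommu_unmap checks the new member for NULL before calling it, drivers that keep syncing inside ->unmap need no change.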
Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 drivers/iommu/iommu.c | 3 +++
 include/linux/iommu.h | 1 +
 2 files changed, 4 insertions(+)

diff --git a/drivers/iommu/iommu.c b/drivers/iommu/iommu.c
index cf7ca7e..01e91a8 100644
--- a/drivers/iommu/iommu.c
+++ b/drivers/iommu/iommu.c
@@ -1610,6 +1610,9 @@ size_t iommu_unmap(struct iommu_domain *domain, unsigned long iova, size_t size)
 		unmapped += unmapped_page;
 	}
 
+	if (domain->ops->unmap_tlb_sync)
+		domain->ops->unmap_tlb_sync(domain);
+
 	trace_unmap(orig_iova, size, unmapped);
 	return unmapped;
 }
diff --git a/include/linux/iommu.h b/include/linux/iommu.h
index 2cb54ad..5964121 100644
--- a/include/linux/iommu.h
+++ b/include/linux/iommu.h
@@ -197,6 +197,7 @@ struct iommu_ops {
 		   phys_addr_t paddr, size_t size, int prot);
 	size_t (*unmap)(struct iommu_domain *domain, unsigned long iova,
 		     size_t size);
+	void (*unmap_tlb_sync)(struct iommu_domain *domain);
 	size_t (*map_sg)(struct iommu_domain *domain, unsigned long iova,
 			 struct scatterlist *sg, unsigned int nents, int prot);
 	phys_addr_t (*iova_to_phys)(struct iommu_domain *domain, dma_addr_t iova);
-- 
2.5.0