From: Joerg Roedel
To: Will Deacon
Cc: Joerg Roedel, iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org, Suravee Suthikulpanit, Alex Williamson, Robin Murphy
Subject: Re: [PATCH 02/13] iommu: Introduce Interface for IOMMU TLB Flushing
Date: Thu, 17 Aug 2017 18:50:40 +0200
Message-ID: <20170817165040.GK2853@suse.de>
References: <1502974596-23835-1-git-send-email-joro@8bytes.org> <1502974596-23835-3-git-send-email-joro@8bytes.org> <20170817163234.GB30719@arm.com>
In-Reply-To: <20170817163234.GB30719@arm.com>
User-Agent: Mutt/1.5.24 (2015-08-30)
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Will,

On Thu, Aug 17, 2017 at 05:32:35PM +0100, Will Deacon wrote:
> I really like the idea of this, but I have a couple of questions and
> comments below.

Great. Together with the common iova-flush code, this should make it
possible to solve the performance problems of the dma-iommu code.

> > +int iommu_map_sync(struct iommu_domain *domain, unsigned long iova,
> > +		    phys_addr_t paddr, size_t size, int prot)
> > +{
> > +	int ret = iommu_map(domain, iova, paddr, size, prot);
> > +
> > +	iommu_tlb_range_add(domain, iova, size);
> > +	iommu_tlb_sync(domain);
>
> Many IOMMUs don't need these callbacks on ->map operations, but they won't
> be able to distinguish them easily with this API. Could you add a flags
> parameter or something to the iommu_tlb_* functions, please?

Yeah, this is only needed for virtualized IOMMUs that have a
non-present cache.
My idea was to let the iommu drivers tell the common code whether the
IOMMU needs it; the code above would then just check a flag and omit
the calls to the flush functions. The problem currently is how to get
this information from 'struct iommu_device' to 'struct iommu_domain'.

As a workaround I am considering a per-domain flag in the iommu
drivers that tracks whether any unmap has happened, so the flush
call-back can do nothing when there was none.

> I think we will struggle to implement this efficiently on ARM SMMUv3. The
> way invalidation works there is that there is a single in-memory command
> queue into which we can put TLB invalidation commands (they are inserted
> under a lock). These are then processed asynchronously by the hardware, and
> you can complete them by inserting a SYNC command and waiting for that to
> be consumed by the SMMU. Sounds like a perfect fit, right?

Yes, it's basically the same way it works on AMD-Vi and Intel VT-d.

> The problem is that we want to add those invalidation commands as early
> as possible, so that they can be processed by the hardware concurrently
> with us unmapping other pages.

I think that's a bad idea, because then you re-introduce the
performance problems: everyone will spin on the cmd-queue lock in the
unmap path of the dma-api.

> That means adding the invalidation commands in the ->unmap callback
> and not bothering to implement ->iotlb_range_add callback at all.
> Then, we will do the sync in ->iotlb_sync. This falls apart if
> somebody decides to use iommu_flush_tlb_all(), where we would prefer
> not to insert all of the invalidation commands in unmap and instead
> insert a single invalidate-all command, followed up with a SYNC.

This problem can be solved with the deferred iova flushing code I
posted to the list. When a queue fills up, iommu_flush_tlb_all() is
called and every entry that was unmapped before can be released. This
works well on x86; are there reasons it wouldn't on ARM?

Regards,

	Joerg