Subject: Re: [PATCH 3/7] iommu/arm-smmu: Add tlb_sync implementation hook
To: Krishna Reddy
Cc: snikam@nvidia.com, thomasz@nvidia.com, jtukkinen@nvidia.com,
	mperttunen@nvidia.com, praithatha@nvidia.com,
	iommu@lists.linux-foundation.org, linux-kernel@vger.kernel.org,
	talho@nvidia.com, yhsu@nvidia.com, linux-tegra@vger.kernel.org,
	treding@nvidia.com, avanbrunt@nvidia.com,
	linux-arm-kernel@lists.infradead.org
References: <1567118827-26358-1-git-send-email-vdumpa@nvidia.com>
	<1567118827-26358-4-git-send-email-vdumpa@nvidia.com>
From: Robin Murphy
Message-ID: <554f8de1-1638-4eb9-59ae-8e1f0d786c44@arm.com>
Date: Fri, 30 Aug 2019 16:23:13 +0100
In-Reply-To: <1567118827-26358-4-git-send-email-vdumpa@nvidia.com>

On 29/08/2019 23:47, Krishna Reddy wrote:
> tlb_sync hook allows nvidia smmu handle tlb sync
> across multiple SMMUs as necessary.
>
> Signed-off-by: Krishna Reddy
> ---
>   drivers/iommu/arm-smmu-nvidia.c | 32 ++++++++++++++++++++++++++++++++
>   drivers/iommu/arm-smmu.c        |  8 +++++---
>   drivers/iommu/arm-smmu.h        |  4 ++++
>   3 files changed, 41 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/iommu/arm-smmu-nvidia.c b/drivers/iommu/arm-smmu-nvidia.c
> index d93ceda..a429b2c 100644
> --- a/drivers/iommu/arm-smmu-nvidia.c
> +++ b/drivers/iommu/arm-smmu-nvidia.c
> @@ -56,11 +56,43 @@ static void nsmmu_write_reg64(struct arm_smmu_device *smmu,
>   		writeq_relaxed(val, nsmmu_page(smmu, i, page) + offset);
>   }
>
> +static void nsmmu_tlb_sync_wait(struct arm_smmu_device *smmu, int page,
> +				int sync, int status, int inst)
> +{
> +	u32 reg;
> +	unsigned int spin_cnt, delay;
> +
> +	for (delay = 1; delay < TLB_LOOP_TIMEOUT; delay *= 2) {
> +		for (spin_cnt = TLB_SPIN_COUNT; spin_cnt > 0; spin_cnt--) {
> +			reg = readl_relaxed(
> +				nsmmu_page(smmu, inst, page) + status);
> +			if (!(reg & sTLBGSTATUS_GSACTIVE))
> +				return;
> +			cpu_relax();
> +		}
> +		udelay(delay);
> +	}
> +	dev_err_ratelimited(smmu->dev,
> +			    "TLB sync timed out -- SMMU may be deadlocked\n");
> +}
> +
> +static void nsmmu_tlb_sync(struct arm_smmu_device *smmu, int page,
> +			   int sync, int status)
> +{
> +	int i;
> +
> +	arm_smmu_writel(smmu, page, sync, 0);
> +
> +	for (i = 0; i < to_nsmmu(smmu)->num_inst; i++)

It might make more sense to make this the innermost loop, i.e.:

	for (i = 0; i < nsmmu->num_inst; i++)
		reg &= readl_relaxed(nsmmu_page(smmu, i, page)...

since polling the instances in parallel rather than in series seems like
it might be a bit more efficient. (There's a slightly fuller sketch of
what I mean at the bottom of this mail.)

> +		nsmmu_tlb_sync_wait(smmu, page, sync, status, i);
> +}
> +
>   static const struct arm_smmu_impl nsmmu_impl = {
>   	.read_reg = nsmmu_read_reg,
>   	.write_reg = nsmmu_write_reg,
>   	.read_reg64 = nsmmu_read_reg64,
>   	.write_reg64 = nsmmu_write_reg64,
> +	.tlb_sync = nsmmu_tlb_sync,
>   };
>
>   struct arm_smmu_device *nvidia_smmu_impl_init(struct arm_smmu_device *smmu)
> diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c
> index 46e1641..f5454e71 100644
> --- a/drivers/iommu/arm-smmu.c
> +++ b/drivers/iommu/arm-smmu.c
> @@ -52,9 +52,6 @@
>    */
>   #define QCOM_DUMMY_VAL	-1
>
> -#define TLB_LOOP_TIMEOUT	1000000	/* 1s! */
> -#define TLB_SPIN_COUNT		10
> -
>   #define MSI_IOVA_BASE		0x8000000
>   #define MSI_IOVA_LENGTH		0x100000
>
> @@ -244,6 +241,11 @@ static void __arm_smmu_tlb_sync(struct arm_smmu_device *smmu, int page,
>   	unsigned int spin_cnt, delay;
>   	u32 reg;
>
> +	if (smmu->impl->tlb_sync) {
> +		smmu->impl->tlb_sync(smmu, page, sync, status);

What I'd hoped is that rather than needing a hook for this, you could
just override smmu_domain->tlb_ops from .init_context to wire up the
alternate .sync method directly. That would save this extra level of
indirection. (Again, there's a rough sketch of that at the end of this
mail.)

Robin.
> +		return;
> +	}
> +
>   	arm_smmu_writel(smmu, page, sync, QCOM_DUMMY_VAL);
>   	for (delay = 1; delay < TLB_LOOP_TIMEOUT; delay *= 2) {
>   		for (spin_cnt = TLB_SPIN_COUNT; spin_cnt > 0; spin_cnt--) {
> diff --git a/drivers/iommu/arm-smmu.h b/drivers/iommu/arm-smmu.h
> index 9645bf1..d3217f1 100644
> --- a/drivers/iommu/arm-smmu.h
> +++ b/drivers/iommu/arm-smmu.h
> @@ -207,6 +207,8 @@ enum arm_smmu_cbar_type {
>   /* Maximum number of context banks per SMMU */
>   #define ARM_SMMU_MAX_CBS		128
>
> +#define TLB_LOOP_TIMEOUT		1000000	/* 1s! */
> +#define TLB_SPIN_COUNT			10
>
>   /* Shared driver definitions */
>   enum arm_smmu_arch_version {
> @@ -336,6 +338,8 @@ struct arm_smmu_impl {
>   	int (*cfg_probe)(struct arm_smmu_device *smmu);
>   	int (*reset)(struct arm_smmu_device *smmu);
>   	int (*init_context)(struct arm_smmu_domain *smmu_domain);
> +	void (*tlb_sync)(struct arm_smmu_device *smmu, int page, int sync,
> +			 int status);
>   };
>
>   static inline void __iomem *arm_smmu_page(struct arm_smmu_device *smmu, int n)
>
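
P.S. To flesh out the loop-inversion suggestion above: something like the
below is roughly the shape I had in mind. It is only a sketch built from
the helpers and constants already in this patch, with one deliberate
tweak - the per-instance status bits are OR-ed together so that we only
return once *every* instance has cleared GSACTIVE:

static void nsmmu_tlb_sync(struct arm_smmu_device *smmu, int page,
			   int sync, int status)
{
	unsigned int spin_cnt, delay;
	u32 reg;
	int i;

	arm_smmu_writel(smmu, page, sync, 0);

	for (delay = 1; delay < TLB_LOOP_TIMEOUT; delay *= 2) {
		for (spin_cnt = TLB_SPIN_COUNT; spin_cnt > 0; spin_cnt--) {
			/* Poll every instance on each spin... */
			reg = 0;
			for (i = 0; i < to_nsmmu(smmu)->num_inst; i++)
				reg |= readl_relaxed(nsmmu_page(smmu, i, page) + status);
			/* ...and stop only once they have all drained */
			if (!(reg & sTLBGSTATUS_GSACTIVE))
				return;
			cpu_relax();
		}
		udelay(delay);
	}
	dev_err_ratelimited(smmu->dev,
			    "TLB sync timed out -- SMMU may be deadlocked\n");
}

That way a slow instance only costs one shared spin/backoff sequence,
rather than each instance being handed the full timeout in turn.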
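
P.P.S. And the .init_context/tlb_ops alternative could look something
like the below. Again only a rough sketch: smmu_domain->tlb_ops is the
existing field I'm referring to, but the nsmmu_* callbacks are made-up
placeholder names, and the ops type assumes the current iommu_gather_ops
- adjust depending on what your branch is actually based on:

/* Hypothetical NVIDIA-specific TLB ops: the flush callbacks could wrap
 * (or simply reuse) the stock context-bank ones, and only .tlb_sync
 * points at a routine that syncs across every SMMU instance. */
static const struct iommu_gather_ops nsmmu_tlb_ops = {
	.tlb_flush_all	= nsmmu_tlb_inv_context,	/* placeholder */
	.tlb_add_flush	= nsmmu_tlb_inv_range,		/* placeholder */
	.tlb_sync	= nsmmu_tlb_sync_context,	/* placeholder */
};

static int nsmmu_init_context(struct arm_smmu_domain *smmu_domain)
{
	/* Point the domain straight at the NVIDIA-aware ops, so the
	 * per-domain sync never goes through an impl->tlb_sync hook. */
	smmu_domain->tlb_ops = &nsmmu_tlb_ops;
	return 0;
}

with .init_context = nsmmu_init_context wired up in nsmmu_impl alongside
the existing register accessors.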