Received: by 2002:ad5:474a:0:0:0:0:0 with SMTP id i10csp6348572imu; Mon, 21 Jan 2019 07:24:20 -0800 (PST) X-Google-Smtp-Source: ALg8bN6s522CCvn8WAZ9oYuFwgkZUTglyZq0UsdRJMA4cdeIBvtYOcuvla2Rp6P1XOUYknqGyjpT X-Received: by 2002:aa7:810c:: with SMTP id b12mr29896201pfi.44.1548084260904; Mon, 21 Jan 2019 07:24:20 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1548084260; cv=none; d=google.com; s=arc-20160816; b=0MQBdZBuLQgbiYGvtKRRjVz3XnMQA8rkwsOtubvk9n1t9JRKtQgJunJDMfhJ8z1Snq uumK7UvrUqg1honQUG0FfBA1SksVH+2eh8v3lP5uRX5lqvlTminm+b1yPdzHOsblVN02 koN+kLwPvhYggLbmm/sR1/WcLTVAtGOfW57VJmN8h5yQS+N+rZLo6FydnlUX9dDOJgDV Jbv8ZdG8eIokR5eji61/A+XVLc6onPDWRgPwW4FGoJLMVYXhpcvq1IcC5Iu+xUSjJ7OK CCf+HpEgLGVJU9sZ3p3zx75WTm7ctPVum4NvfeJLk+/zZ6f/nUS4ECT7reG9fqs35rVN VSMA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:references:cc:to:subject:from; bh=qjyEySp9kdwf1QuvhV7H+oLJqpBObiMyMhiVYVgG02g=; b=jbVyzEUuJgaezyJy8enVYtc4OA8qkLddmxRMuuo1E+GKVifakIMBjY7vtYw8YFsItL F52IF+Pb3W234qqKQ3rsgtfaLfadXmptlwlt4shJecqlVVEwyPiZ5ZQh7hjyuF8WjaRu yIOoVfekJzqKUoa8oIqK/zS+nbb3qw0sXMEOZU8r/k4VkxZR9klWBuRG4GHr3JTxbp+t G2TGlWANun731mxgmudeOB1q9XYpiVObFn0+hzoNVoooCzcz/QUopaH0e5Uo61LP55p9 VjCqzVhufMHz95JG6+O/LYat9oMPFIhcU/1Upp1bRTdbHtLwIHrjSyvB1/LjI0QbeEJP d9FQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id 1si12745099plp.114.2019.01.21.07.24.05; Mon, 21 Jan 2019 07:24:20 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728665AbfAUNMN (ORCPT + 99 others); Mon, 21 Jan 2019 08:12:13 -0500 Received: from usa-sjc-mx-foss1.foss.arm.com ([217.140.101.70]:33784 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726268AbfAUNMN (ORCPT ); Mon, 21 Jan 2019 08:12:13 -0500 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 8DE2580D; Mon, 21 Jan 2019 05:12:12 -0800 (PST) Received: from [10.1.196.75] (e110467-lin.cambridge.arm.com [10.1.196.75]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 3FE913F614; Mon, 21 Jan 2019 05:12:11 -0800 (PST) From: Robin Murphy Subject: Re: [PATCH 1/2] iommu/io-pgtable-arm: Add support for non-coherent page tables To: Vivek Gautam , will.deacon@arm.com, joro@8bytes.org, iommu@lists.linux-foundation.org Cc: linux-arm-msm@vger.kernel.org, linux-kernel@vger.kernel.org, linux-arm-kernel@lists.infradead.org References: <20190117092718.1396-1-vivek.gautam@codeaurora.org> <20190117092718.1396-2-vivek.gautam@codeaurora.org> Message-ID: Date: Mon, 21 Jan 2019 13:12:09 +0000 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.2.1 MIME-Version: 1.0 In-Reply-To: <20190117092718.1396-2-vivek.gautam@codeaurora.org> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-GB Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 17/01/2019 09:27, Vivek Gautam wrote: > From Robin's comment [1] about touching TCR configurations - > > "TBH if we're going to touch the TCR attributes at all then we should > probably correct that sloppiness first - there's an occasional argument > for using non-cacheable pagetables even on a coherent SMMU if reducing > snoop traffic/latency on walks outweighs the cost of cache maintenance > on PTE updates, but anyone thinking they can get that by overriding > dma-coherent silently gets the worst of both worlds thanks to this > current TCR value." > > We have IO_PGTABLE_QUIRK_NO_DMA quirk present, but we don't force > anybody _not_ using dma-coherent smmu to have non-cacheable page table > mappings. > Having another quirk flag can help in having non-cacheable memory for > page tables once and for all. > > [1] https://lore.kernel.org/patchwork/patch/1020906/ > > Signed-off-by: Vivek Gautam > --- > drivers/iommu/io-pgtable-arm.c | 17 ++++++++++++----- > drivers/iommu/io-pgtable.h | 6 ++++++ > 2 files changed, 18 insertions(+), 5 deletions(-) > > diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c > index 237cacd4a62b..c76919c30f1a 100644 > --- a/drivers/iommu/io-pgtable-arm.c > +++ b/drivers/iommu/io-pgtable-arm.c > @@ -780,7 +780,8 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie) > struct arm_lpae_io_pgtable *data; > > if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS | IO_PGTABLE_QUIRK_NO_DMA | > - IO_PGTABLE_QUIRK_NON_STRICT)) > + IO_PGTABLE_QUIRK_NON_STRICT | > + IO_PGTABLE_QUIRK_NON_COHERENT)) > return NULL; > > data = arm_lpae_alloc_pgtable(cfg); > @@ -788,9 +789,14 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie) > return NULL; > > /* TCR */ > - reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) | > - (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) | > - (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT); > + reg = ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT; > + > + if (cfg->quirks & IO_PGTABLE_QUIRK_NON_COHERENT) > + reg |= ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_IRGN0_SHIFT | > + ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_ORGN0_SHIFT; > + else > + reg |= ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT | > + ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT; > > switch (ARM_LPAE_GRANULE(data)) { > case SZ_4K: > @@ -873,7 +879,8 @@ arm_64_lpae_alloc_pgtable_s2(struct io_pgtable_cfg *cfg, void *cookie) > > /* The NS quirk doesn't apply at stage 2 */ > if (cfg->quirks & ~(IO_PGTABLE_QUIRK_NO_DMA | > - IO_PGTABLE_QUIRK_NON_STRICT)) > + IO_PGTABLE_QUIRK_NON_STRICT | > + IO_PGTABLE_QUIRK_NON_COHERENT)) > return NULL; > > data = arm_lpae_alloc_pgtable(cfg); > diff --git a/drivers/iommu/io-pgtable.h b/drivers/iommu/io-pgtable.h > index 47d5ae559329..46604cf7b017 100644 > --- a/drivers/iommu/io-pgtable.h > +++ b/drivers/iommu/io-pgtable.h > @@ -75,6 +75,11 @@ struct io_pgtable_cfg { > * IO_PGTABLE_QUIRK_NON_STRICT: Skip issuing synchronous leaf TLBIs > * on unmap, for DMA domains using the flush queue mechanism for > * delayed invalidation. > + * > + * IO_PGTABLE_QUIRK_NON_COHERENT: Enforce non-cacheable mappings for > + * pagetables even on a coherent SMMU for cases where reducing > + * snoop traffic/latency on walks outweighs the cost of cache > + * maintenance on PTE updates. Hmm, we can't actually "enforce" anything with this as-is - all we're doing is setting the attributes that the IOMMU will use for pagetable walks, and that has no impact on how the CPU actually writes PTEs to memory. In particular, in the case of a hardware-coherent IOMMU which is described as such, even if we make the dma_map/sync calls they still won't do anything since they 'know' that the IOMMU is coherent. Thus if we then set up a non-cacheable TCR we would have no proper means of making pagetables correctly visible to the walker. Aw crap, this is turning out to be a microcosm of the PCIe no-snoop mess... :( To start with, at least, what we want is to set a non-cacheable TCR if the IOMMU is *not* coherent (as far as Linux is concerned - that includes the firmware-lying-about-the-hardware situation I was alluding to before), but even that isn't necessarily as straightforward as it seems. AFAICS, if QUIRK_NO_DMA is set then we definitely have to use a cacheable TCR; we can't strictly rely on the inverse being true, but in practice we *might* get away with it since we already disallow most cases in which the DMA API calls would actually do anything for a known-coherent IOMMU device. Robin. > */ > #define IO_PGTABLE_QUIRK_ARM_NS BIT(0) > #define IO_PGTABLE_QUIRK_NO_PERMS BIT(1) > @@ -82,6 +87,7 @@ struct io_pgtable_cfg { > #define IO_PGTABLE_QUIRK_ARM_MTK_4GB BIT(3) > #define IO_PGTABLE_QUIRK_NO_DMA BIT(4) > #define IO_PGTABLE_QUIRK_NON_STRICT BIT(5) > + #define IO_PGTABLE_QUIRK_NON_COHERENT BIT(6) > unsigned long quirks; > unsigned long pgsize_bitmap; > unsigned int ias; >