From: Vivek Gautam
Date: Mon, 28 Jan 2019 17:50:42 +0530
Subject: Re: [PATCH 1/2] iommu/io-pgtable-arm: Add support for non-coherent page tables
To: Robin Murphy
Cc: Will Deacon, "list@263.net:IOMMU DRIVERS, Joerg Roedel", linux-arm-msm, open list, Linux ARM
References: <20190117092718.1396-1-vivek.gautam@codeaurora.org> <20190117092718.1396-2-vivek.gautam@codeaurora.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jan 21, 2019 at 6:43 PM Robin Murphy wrote:
>
> On 17/01/2019 09:27, Vivek Gautam wrote:
> > From Robin's comment [1] about touching TCR configurations -
> >
> > "TBH if we're going to touch the TCR attributes at all then we should
> > probably correct that sloppiness first - there's an occasional argument
> > for using non-cacheable pagetables even on a coherent SMMU if reducing
> > snoop traffic/latency on walks outweighs the cost of cache maintenance
> > on PTE updates, but anyone thinking they can get that by overriding
> > dma-coherent silently gets the worst of both worlds thanks to this
> > current TCR value."
> >
> > We have IO_PGTABLE_QUIRK_NO_DMA quirk present, but we don't force
> > anybody _not_ using dma-coherent smmu to have non-cacheable page table
> > mappings.
> > Having another quirk flag can help in having non-cacheable memory for
> > page tables once and for all.
> >
> > [1] https://lore.kernel.org/patchwork/patch/1020906/
> >
> > Signed-off-by: Vivek Gautam
> > ---
> >   drivers/iommu/io-pgtable-arm.c | 17 ++++++++++++-----
> >   drivers/iommu/io-pgtable.h     |  6 ++++++
> >   2 files changed, 18 insertions(+), 5 deletions(-)
> >
> > diff --git a/drivers/iommu/io-pgtable-arm.c b/drivers/iommu/io-pgtable-arm.c
> > index 237cacd4a62b..c76919c30f1a 100644
> > --- a/drivers/iommu/io-pgtable-arm.c
> > +++ b/drivers/iommu/io-pgtable-arm.c
> > @@ -780,7 +780,8 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
> >       struct arm_lpae_io_pgtable *data;
> >
> >       if (cfg->quirks & ~(IO_PGTABLE_QUIRK_ARM_NS | IO_PGTABLE_QUIRK_NO_DMA |
> > -                         IO_PGTABLE_QUIRK_NON_STRICT))
> > +                         IO_PGTABLE_QUIRK_NON_STRICT |
> > +                         IO_PGTABLE_QUIRK_NON_COHERENT))
> >               return NULL;
> >
> >       data = arm_lpae_alloc_pgtable(cfg);
> > @@ -788,9 +789,14 @@ arm_64_lpae_alloc_pgtable_s1(struct io_pgtable_cfg *cfg, void *cookie)
> >               return NULL;
> >
> >       /* TCR */
> > -     reg = (ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT) |
> > -           (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT) |
> > -           (ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT);
> > +     reg = ARM_LPAE_TCR_SH_IS << ARM_LPAE_TCR_SH0_SHIFT;
> > +
> > +     if (cfg->quirks & IO_PGTABLE_QUIRK_NON_COHERENT)
> > +             reg |= ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_IRGN0_SHIFT |
> > +                    ARM_LPAE_TCR_RGN_NC << ARM_LPAE_TCR_ORGN0_SHIFT;
> > +     else
> > +             reg |= ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_IRGN0_SHIFT |
> > +                    ARM_LPAE_TCR_RGN_WBWA << ARM_LPAE_TCR_ORGN0_SHIFT;
> >
> >       switch (ARM_LPAE_GRANULE(data)) {
> >       case SZ_4K:
> > @@ -873,7 +879,8 @@ arm_64_lpae_alloc_pgtable_s2(struct io_pgtable_cfg *cfg, void *cookie)
> >
> >       /* The NS quirk doesn't apply at stage 2 */
> >       if (cfg->quirks & ~(IO_PGTABLE_QUIRK_NO_DMA |
> > -                         IO_PGTABLE_QUIRK_NON_STRICT))
> > +                         IO_PGTABLE_QUIRK_NON_STRICT |
> > +                         IO_PGTABLE_QUIRK_NON_COHERENT))
> >               return NULL;
> >
> >       data = arm_lpae_alloc_pgtable(cfg);
> > diff --git a/drivers/iommu/io-pgtable.h b/drivers/iommu/io-pgtable.h
> > index 47d5ae559329..46604cf7b017 100644
> > --- a/drivers/iommu/io-pgtable.h
> > +++ b/drivers/iommu/io-pgtable.h
> > @@ -75,6 +75,11 @@ struct io_pgtable_cfg {
> >        * IO_PGTABLE_QUIRK_NON_STRICT: Skip issuing synchronous leaf TLBIs
> >        *      on unmap, for DMA domains using the flush queue mechanism for
> >        *      delayed invalidation.
> > +      *
> > +      * IO_PGTABLE_QUIRK_NON_COHERENT: Enforce non-cacheable mappings for
> > +      *      pagetables even on a coherent SMMU for cases where reducing
> > +      *      snoop traffic/latency on walks outweighs the cost of cache
> > +      *      maintenance on PTE updates.
>
> Hmm, we can't actually "enforce" anything with this as-is - all we're
> doing is setting the attributes that the IOMMU will use for pagetable
> walks, and that has no impact on how the CPU actually writes PTEs to
> memory. In particular, in the case of a hardware-coherent IOMMU which is
> described as such, even if we make the dma_map/sync calls they still
> won't do anything since they 'know' that the IOMMU is coherent. Thus if
> we then set up a non-cacheable TCR we would have no proper means of
> making pagetables correctly visible to the walker.

Right, I get this point.
With a non-cacheable TCR, the PTW will always go to main memory rather
than snooping the CPU caches for the latest page tables.

>
> Aw crap, this is turning out to be a microcosm of the PCIe no-snoop
> mess... :(
>
> To start with, at least, what we want is to set a non-cacheable TCR if
> the IOMMU is *not* coherent (as far as Linux is concerned - that
> includes the firmware-lying-about-the-hardware situation I was alluding
> to before), but even that isn't necessarily as straightforward as it
> seems. AFAICS, if QUIRK_NO_DMA is set then we definitely have to use a
> cacheable TCR;

Okay, so for QUIRK_NO_DMA we set IRGN0 and ORGN0 to WBWA in the TCR.
But for SMMUs that omit 'dma-coherent', so that QUIRK_NO_DMA is not set,
can we allow IRGN0 and ORGN0 to be set to Non-Cacheable? The PTW will
have to read from memory anyway, after the CPU flushes the PTEs to
memory (which we are already doing).

Regards
Vivek

> we can't strictly rely on the inverse being true, but in
> practice we *might* get away with it since we already disallow most
> cases in which the DMA API calls would actually do anything for a
> known-coherent IOMMU device.
>
> Robin.
>
> >        */
> >       #define IO_PGTABLE_QUIRK_ARM_NS         BIT(0)
> >       #define IO_PGTABLE_QUIRK_NO_PERMS       BIT(1)
> > @@ -82,6 +87,7 @@ struct io_pgtable_cfg {
> >       #define IO_PGTABLE_QUIRK_ARM_MTK_4GB    BIT(3)
> >       #define IO_PGTABLE_QUIRK_NO_DMA         BIT(4)
> >       #define IO_PGTABLE_QUIRK_NON_STRICT     BIT(5)
> > +     #define IO_PGTABLE_QUIRK_NON_COHERENT   BIT(6)
> >       unsigned long quirks;
> >       unsigned long pgsize_bitmap;
> >       unsigned int ias;
> >

--
QUALCOMM INDIA, on behalf of Qualcomm Innovation Center, Inc. is a member
of Code Aurora Forum, hosted by The Linux Foundation
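
For reference, the rule being discussed above (always stay cacheable when
QUIRK_NO_DMA is set, and consider Non-Cacheable walks only otherwise) could
be sketched roughly as below. This is only a standalone illustration, not
the kernel code and not what was eventually merged: every SKETCH_* name and
the helper sketch_tcr_walk_attrs() are hypothetical, and the field shifts
and encodings are assumptions modelled on the ARM_LPAE_TCR_* macros used in
the patch.

```c
#include <stdint.h>

/* Hypothetical stand-ins for the IO_PGTABLE_QUIRK_* bits in the patch. */
#define SKETCH_QUIRK_NO_DMA        (1UL << 4)
#define SKETCH_QUIRK_NON_COHERENT  (1UL << 6)

/* Assumed TCR field positions and encodings, for illustration only. */
#define SKETCH_TCR_SH0_SHIFT    12
#define SKETCH_TCR_ORGN0_SHIFT  10
#define SKETCH_TCR_IRGN0_SHIFT  8
#define SKETCH_TCR_SH_IS        3   /* inner shareable */
#define SKETCH_TCR_RGN_NC       0   /* non-cacheable walks */
#define SKETCH_TCR_RGN_WBWA     1   /* write-back, write-allocate walks */

/*
 * Pick the table-walk cacheability attributes from the quirk flags.
 *
 * With NO_DMA the CPU never does cache maintenance on PTE updates, so
 * the walker has to snoop the CPU caches and must stay cacheable.
 * Non-cacheable walks are only chosen when NON_COHERENT is requested
 * and NO_DMA is not set.
 */
static uint64_t sketch_tcr_walk_attrs(unsigned long quirks)
{
	uint64_t reg = (uint64_t)SKETCH_TCR_SH_IS << SKETCH_TCR_SH0_SHIFT;
	uint64_t rgn = SKETCH_TCR_RGN_WBWA;

	if ((quirks & SKETCH_QUIRK_NON_COHERENT) &&
	    !(quirks & SKETCH_QUIRK_NO_DMA))
		rgn = SKETCH_TCR_RGN_NC;

	reg |= rgn << SKETCH_TCR_IRGN0_SHIFT;
	reg |= rgn << SKETCH_TCR_ORGN0_SHIFT;
	return reg;
}
```

Under these assumed encodings, sketch_tcr_walk_attrs(SKETCH_QUIRK_NON_COHERENT)
gives 0x3000 (NC walks), while passing no quirks, or any combination that
includes SKETCH_QUIRK_NO_DMA, gives 0x3500 (WBWA walks). Whether it is safe
to infer "not coherent" from the absence of QUIRK_NO_DMA is exactly the open
question in this thread, so the sketch keys off an explicit flag instead.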