From: Ard Biesheuvel
Date: Mon, 21 Jan 2019 15:24:55 +0100
Subject: Re: [PATCH 0/3] iommu/arm-smmu: Add support to use Last level cache
To: Robin Murphy
Cc: Vivek Gautam, Will Deacon, Joerg Roedel,
    "list@263.net:IOMMU DRIVERS", pdaly@codeaurora.org, linux-arm-msm,
    Linux Kernel Mailing List, Tomasz Figa, Jordan Crouse,
    pratikp@codeaurora.org, linux-arm-kernel

On Mon, 21 Jan 2019 at 14:56, Robin Murphy wrote:
>
> On 21/01/2019 13:36, Ard Biesheuvel wrote:
> > On Mon, 21 Jan 2019 at 14:25, Robin Murphy wrote:
> >>
> >> On 21/01/2019 10:50, Ard Biesheuvel wrote:
> >>> On Mon, 21 Jan 2019 at 11:17, Vivek Gautam wrote:
> >>>>
> >>>> Hi,
> >>>>
> >>>>
> >>>> On Mon, Jan 21, 2019 at 12:56 PM Ard Biesheuvel wrote:
> >>>>>
> >>>>> On Mon, 21 Jan 2019 at 06:54, Vivek Gautam wrote:
> >>>>>>
> >>>>>> Qualcomm SoCs have an additional level of cache called as
> >>>>>> System cache, aka. Last level cache (LLC). This cache sits right
> >>>>>> before the DDR, and is tightly coupled with the memory controller.
> >>>>>> The clients using this cache request their slices from this
> >>>>>> system cache, make it active, and can then start using it.
> >>>>>> For these clients with smmu, to start using the system cache for
> >>>>>> buffers and, related page tables [1], memory attributes need to be
> >>>>>> set accordingly. This series add the required support.
> >>>>>>
> >>>>>
> >>>>> Does this actually improve performance on reads from a device? The
> >>>>> non-cache coherent DMA routines perform an unconditional D-cache
> >>>>> invalidate by VA to the PoC before reading from the buffers filled by
> >>>>> the device, and I would expect the PoC to be defined as lying beyond
> >>>>> the LLC to still guarantee the architected behavior.
> >>>>
> >>>> We have seen performance improvements when running Manhattan
> >>>> GFXBench benchmarks.
> >>>>
> >>>
> >>> Ah ok, that makes sense, since in that case, the data flow is mostly
> >>> to the device, not from the device.
> >>>
> >>>> As for the PoC, from my knowledge on sdm845 the system cache, aka
> >>>> Last level cache (LLC) lies beyond the point of coherency.
> >>>> Non-cache coherent buffers will not be cached to system cache also, and
> >>>> no additional software cache maintenance ops are required for system cache.
> >>>> Pratik can add more if I am missing something.
> >>>>
> >>>> To take care of the memory attributes from DMA APIs side, we can add a
> >>>> DMA_ATTR definition to take care of any dma non-coherent APIs calls.
> >>>>
> >>>
> >>> So does the device use the correct inner non-cacheable, outer
> >>> writeback cacheable attributes if the SMMU is in pass-through?
> >>>
> >>> We have been looking into another use case where the fact that the
> >>> SMMU overrides memory attributes is causing issues (WC mappings used
> >>> by the radeon and amdgpu driver). So if the SMMU would honour the
> >>> existing attributes, would you still need the SMMU changes?
> >>
> >> Even if we could force a stage 2 mapping with the weakest pagetable
> >> attributes (such that combining would work), there would still need to
> >> be a way to set the TCR attributes appropriately if this behaviour is
> >> wanted for the SMMU's own table walks as well.
> >>
> >
> > Isn't that just a matter of implementing support for SMMUs that lack
> > the 'dma-coherent' attribute?
>
> Not quite - in general they need INC-ONC attributes in case there
> actually is something in the architectural outer-cacheable domain.

But is it a problem to use INC-ONC attributes for the SMMU PTW on this
chip? AIUI, the reason for the SMMU changes is to avoid the
performance hit of snooping, which is more expensive than cache
maintenance of SMMU page tables. So are you saying the by-VA cache
maintenance is not relayed to this system cache, resulting in page
table updates being invisible to masters using INC-ONC attributes?

> The
> case of the outer cacheablility being not that but a hint to control
> non-CPU traffic through some not-quite-transparent cache behind the PoC
> definitely stays wrapped up in qcom-specific magic ;)
>

I'm not surprised ...