Date: Mon, 14 Jan 2019 19:35:48 +0000
From: Will Deacon
To: "Koenig, Christian"
Cc: Ard Biesheuvel, Michel Dänzer, Linux Kernel Mailing List,
    Carsten Haitzler, David Airlie, dri-devel, "Huang, Ray", "Zhang, Jerry",
    linux-arm-kernel, Bernhard Rosenkränzer, mpe@ellerman.id.au,
    benh@kernel.crashing.org
Subject: Re: [RFC PATCH] drm/ttm: force cached mappings for system RAM on ARM
Message-ID: <20190114193548.GB29600@fuggles.cambridge.arm.com>
References: <20190110072841.3283-1-ard.biesheuvel@linaro.org>
    <5d8135de-80fe-9c0e-2206-ecb809f64cdb@daenzer.net>
    <55facfb9-92af-86b8-40e9-d63b887b5592@amd.com>
    <9f956898-7973-98ee-6bf1-e1d445e9d365@amd.com>
    <20190114191350.GA29600@fuggles.cambridge.arm.com>

[+ BenH and MPE]

On Mon, Jan 14, 2019 at 07:21:08PM +0000, Koenig, Christian wrote:
> On 14.01.19 at 20:13, Will Deacon wrote:
> > On Mon, Jan 14, 2019 at 07:07:54PM +0000, Koenig, Christian wrote:
> >> On 14.01.19 at 18:32, Ard Biesheuvel wrote:
> >> - The reason remapping the CPU side as cacheable does work (which I
> >> did test) is because the GPU's uncacheable accesses (which I assume
> >> are made using the NoSnoop PCIe transaction attribute) are actually
> >> emitted as cacheable in some cases.
> >>   . On my AMD Seattle, with or without SMMU (which is stage 2 only), I
> >>   must use cacheable accesses from the CPU side or things are broken.
> >>   This might be a h/w flaw, though.
> >>   . On systems with stage 1+2 SMMUs, the driver uses stage 1
> >>   translations which always override the memory attributes to cacheable
> >>   for DMA coherent devices. This is what is affecting the Cavium
> >>   ThunderX2 (although it appears the attributes emitted by the RC may be
> >>   incorrect as well.)
> >>
> >> The latter issue is a shortcoming in the SMMU driver that we have to
> >> fix, i.e., it should take care not to modify the incoming attributes
> >> of DMA coherent PCIe devices for NoSnoop to be able to work.
> >>
> >> So in summary, the mismatch appears to be between the CPU accessing
> >> the vmap region with non-cacheable attributes and the GPU accessing
> >> the same memory with cacheable attributes, resulting in a loss of
> >> coherency and lots of visible corruption.
> >>
> >> Actually it is the other way around. The CPU thinks some data is in the
> >> cache and the GPU only updates the system memory version because the
> >> snoop flag is not set.
> >>
> >>
> >> That doesn't seem to be what is happening.
> >> As far as we can tell from
> >> our experiments, all inbound transactions are always cacheable, and so
> >> the only way to make things work is to ensure that the CPU uses the
> >> same attributes.
> >>
> >>
> >> Ok that doesn't make any sense. If inbound transactions are cacheable or not is
> >> irrelevant when the CPU always uses uncached accesses.
> >>
> >> See on the PCIe side you have the snoop bit in the read/write transactions
> >> which tells the root hub if the device wants to snoop caches or not.
> >>
> >> When the CPU accesses some memory as cached then devices need to snoop the
> >> cache for coherent accesses.
> >>
> >> When the CPU accesses some memory as uncached then devices can disable snooping
> >> to improve performance, but when they don't do this it is mandated by the spec
> >> that this still works.
> > Which spec?
>
> The PCIe spec. The snoop bit (or rather the NoSnoop) in the transaction
> is perfectly optional IIRC.

Thanks for the clarification. I suspect the devil is in the details, so I'll
try to dig up the spec.

> > The Arm architecture (and others including Power afaiu) doesn't
> > guarantee coherency when memory is accessed using mismatched cacheability
> > attributes.
>
> Well what exactly goes wrong on ARM?

Coherency (and any ordering guarantees) can be lost, so the device may see a
stale copy of the memory it is accessing. The architecture requires cache
maintenance to restore coherency between the mismatched aliases.

> As far as I know Power doesn't really supports un-cached memory at all,
> except for a very very old and odd configuration with AGP.

Hopefully Michael/Ben can elaborate here, but I was under the (possibly
mistaken) impression that mismatched attributes could cause a machine-check
on Power.

> I mean in theory I agree that devices should use matching cacheability
> attributes, but in practice I know of quite a bunch of devices/engines
> which fails to do this correctly.

Given that the experience of Ard and me so far has been that the system ends
up making everything cacheable after the RC, perhaps that's an attempt by
system designers to correct for these devices. Unfortunately, it doesn't help
if the CPU carefully goes ahead and establishes a non-cacheable mapping for
itself!

Will