Received: by 2002:ac0:a5a7:0:0:0:0:0 with SMTP id m36-v6csp753324imm; Wed, 8 Aug 2018 05:17:52 -0700 (PDT) X-Google-Smtp-Source: AA+uWPzRFSL5K6VkULcYBpHkWhRmh5v/1e+qdY7tWxciloLOnS9NDSguDPdy8Z6ZjrNmqKwgnVXZ X-Received: by 2002:a17:902:6684:: with SMTP id e4-v6mr2384264plk.35.1533730672389; Wed, 08 Aug 2018 05:17:52 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1533730672; cv=none; d=google.com; s=arc-20160816; b=DSNI0OAk8dxhUpkcx6x5WskarPEugvBYK/CHnPy9KiiZOlmdAUrA3s7OkYDsf9KH0p yAOu4jZ3qjK8QP0o9aBfjb8op4pr+hfxvoULR3UvBUOtc3Wi/KKS5xpPOBpjr7SEhuza FSpLsi3r1sSC2JiJAeapfima2Q91G983YDgGa48z7NHSHQs/6wL/BZSKpMHNT+8JquJd 4u5FExx1Nzuoyn7I69ybfHxzQ0ZfhsXbwJXrlVRUYHOQbuDBtORenS6QaQ4wxfAgY1n8 H/YOzvBOGw5rVCU/cczHvhM+CDZr5Mmpn+OSkivG02jpqtG8uucosagWCk1bpUwn4XGh B5eQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:arc-authentication-results; bh=jQb4kjLqKVhKf0Udsf0bAmf/akEFOrtCmo86MZoVagU=; b=RFBVZVhKDiUkz0T05QRT9NuZgS1hnuZJxlLHuMTUm7Ey6RziquWV+lANEhTFDNL8Kw WvCvWdUZNfAHNbAoFurogT+6w6H6LpBKi0H6Sryj8cO98ZyTv8UyY+kVfbEZ80iMRbFK sXwCuNByva3fBSOfujrARtO3q62X1j0L/7LjJfYC+KeDcbsnodgmwhk4Jl7cw4Sh0tUf qHwyfC4WL5jPpoU3BslFmgPETgTINuF/dAuhUefp6ofzAggy9bx8hTA2hAQf0ilO38Wq AnhJRBq72wtBeclAxhI8IeBYFq5JEsUoFmxGaheQhF6ZYJdUkL4rWDii0XPJCqNPEaVv PROg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id y92-v6si3253330plb.378.2018.08.08.05.17.37; Wed, 08 Aug 2018 05:17:52 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726974AbeHHOgN (ORCPT + 99 others); Wed, 8 Aug 2018 10:36:13 -0400 Received: from foss.arm.com ([217.140.101.70]:37918 "EHLO foss.arm.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726745AbeHHOgN (ORCPT ); Wed, 8 Aug 2018 10:36:13 -0400 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.72.51.249]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id AE795ED1; Wed, 8 Aug 2018 05:16:46 -0700 (PDT) Received: from iMac.local (usa-sjc-mx-foss1.foss.arm.com [217.140.101.70]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id BC0E93F5D0; Wed, 8 Aug 2018 05:16:44 -0700 (PDT) Date: Wed, 8 Aug 2018 13:16:42 +0100 From: Catalin Marinas To: Matt Sealey Cc: Mikulas Patocka , Thomas Petazzoni , Joao Pinto , Ard Biesheuvel , linux-pci , Jingoo Han , Will Deacon , Russell King , Linux Kernel Mailing List , linux-arm-kernel Subject: Re: framebuffer corruption due to overlapping stp instructions on arm64 Message-ID: <20180808121641.GB24736@iMac.local> References: <20180803094129.GB17798@arm.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.10.0 (2018-05-17) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi Matt, On Fri, Aug 03, 2018 at 03:44:44PM -0500, Matt Sealey wrote: > On 3 August 2018 at 13:25, Mikulas Patocka wrote: > > On Fri, 3 Aug 2018, Ard Biesheuvel wrote: > >> Are we still talking about overlapping unaligned accesses here? Or do > >> you see other failures as well? > > > > Yes - it is caused by overlapping unaligned accesses inside memcpy. > > When I put "dmb sy" between the overlapping accesses in > > glibc/sysdeps/aarch64/memcpy.S, this program doesn't detect any > > memory corruption. > > It is a symptom of generating reorderable accesses inside memcpy. It's nothing > to do with alignment, per se (see below). A dmb sy just hides the symptoms. > > What we're talking about here - yes, Ard, within certain amounts of > reason - is that you cannot use PCI BAR memory as 'Normal' - certainly > never cacheable memory, but Normal NC isn't good either. That is that > your CPU cannot post writes or reads towards PCI memory spaces unless > it is dealing with it as Device memory or very strictly controlled use > of Normal Non-Cacheable. I disagree that it's not possible to use Normal NC on prefetchable BARs. This particular case looks more like a hardware issue to me as other platforms don't exhibit the same behaviour. Note that allowing Normal NC mapping of prefetchable BARs together with unaliagned accesses is also a requirement for SBSA-compliant platforms ([1]; though I don't find the text in D.2 very clear). > >> > I tried to run it on system RAM mapped with the NC attribute and I didn't > >> > get any corruption - that suggests the the bug may be in the PCIE > >> > subsystem. > > Pure fluke. Do you mean you don't expect Mikulas' test to run fine on system RAM with Normal NC mapping? We would have bigger issues if this was the case. > I'll give a simple explanation. The Arm Architecture defines > single-copy and multi-copy atomic transactions. You can treat > 'single-copy' to mean that that transaction cannot be made partial, or > reordered within itself, i.e. it must modify memory (if it is a store) > in a single swift effort and any future reads from that memory must > return the FULL result of that write. > > Multi-copy means it can be resized and reordered a bit. Will Deacon is > going to crucify me for simplifying it, but.. let's proceed with a > poor example: Not sure about Will but I think you got them wrong ;). The single/multi copy atomicity is considered in respect to (multiple) observers, a.k.a. masters, and nothing to do with reordering a bit (see B2.2 in the ARMv8 ARM). > STR X0,[X1] on a 32-bit bus cannot ever be single-copy atomic, because > you cannot write 64-bits of data on a 32-bit bus in a single, > unbreakable transaction. This is because from one bus cycle to the > next, one half of the transaction will be in a different place. Your > interconnect will have latched and buffered 32-bits and the CPU is > holding the other. It depends on the implementation, interconnect, buses. Since single-copy atomicity refers to master accesses, the above transaction could be a burst of two 32-bit writes and treated atomically by the interconnect (i.e. not interruptible). > STP X0, X1, [X2] on a 64-bit bus can be single-copy atomic with > respect to the element size. But it is on the whole multi-copy atomic > - that is to say that it can provide a single transaction with > multiple elements which are transmitted, and those elements could be > messed with on the way down the pipe. This has nothing to do with multi-copy atomicity which actually refers to multiple observers seeing the same write. The ARM architecture is not exactly multi-copy atomic anyway (rather "other-multi-copy atomic"). Architecturally, STP is treated as two single-copy accesses (as you mentioned already). Anyway, the single/multiple copy atomicity is irrelevant for the C test from Mikulas where you have the same observer (the CPU) writing and reading the memory. I wonder whether writing a byte and reading a long back would show similar corruption. > And the granularity of the hazarding in your system, from the CPU > store buffer to the bus interface to the interconnect buffering to the > PCIe bridge to the PCIe EP is.. what? Not the same all the way down, > I'll bet you. I think hazarding is what goes wrong here, especially since with overlapping unaligned addresses. However, I disagree that it is impossible to implement this properly on a platform with PCIe so that Normal NC mappings can be used. Thanks. [1] https://developer.arm.com/docs/den0029/latest/server-base-system-architecture -- Catalin