From: Matt Sealey
Date: Thu, 2 Aug 2018 15:49:17 -0500
Subject: Re: framebuffer corruption due to overlapping stp instructions on arm64
To: Mikulas Patocka
Cc: Catalin Marinas, Russell King, Thomas Petazzoni, Will Deacon,
    libc-alpha@sourceware.org, linux-arm-kernel@lists.infradead.org,
    linux-kernel@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

The easiest explanation for this would be that the memory isn’t mapped
correctly. You can’t use PCIe memory spaces with anything other than
Device-nGnRE or stricter mappings. That’s just a difference between the
AMBA and PCIe (posted/non-posted) memory models.

Normal memory (cacheable or uncacheable, which Linux tends to call
“memory” and “writecombine” respectively) is not a good idea.

There are two options. The first: make sure Links maps its framebuffer as
Device memory, or that the driver does, or both - and make sure that only
aligned accesses happen (otherwise you’ll just get a synchronous
exception) and that there isn’t a Normal memory alias.

Alternatively, tell the PCIe driver that the framebuffer is in system
memory. You can then map it however you like; there’ll be a performance
hit if you start to use GPU acceleration, but a significant performance
boost from the point of view of the CPU. Only memory accessed from the
PCIe master interface (i.e. reads and writes generated by the card itself -
telling the GPU to pull from system memory, or other DMA) can be in Normal
memory, and this allows PCIe to be cache coherent with the right
interconnect. The slave port on a PCIe root complex (i.e. CPU writes)
can’t be used with Normal, or otherwise reorderable, mappings, and
therefore your 2GB of graphics memory is going to be slow from the point
of view of the CPU.

To find the correct mapping you’ll need to know just how cache coherent
the PCIe RC is...

Ta,
Matt

On Thu, Aug 2, 2018 at 14:31 Mikulas Patocka <mpatocka@redhat.com> wrote:

> Hi
>
> I tried to use a PCIe graphics card on the MacchiatoBIN board and I hit a
> strange problem.
>
> When I use the links browser in graphics mode on the framebuffer, I get
> occasional pixel corruption. Links does memcpy, memset and 4-byte writes
> on the framebuffer - nothing else.
>
> I found out that the pixel corruption is caused by overlapping unaligned
> stp instructions inside memcpy. In order to avoid branching, the arm64
> memcpy implementation may write the same destination twice with different
> alignment. If I put "dmb sy" between the overlapping stp instructions, the
> pixel corruption goes away.
>
> This seems like a hardware bug. Is it a known errata? Do you have any
> workarounds for it?
>
> I tried AMD card (HD 6350) and NVidia (NVS 285) and both exhibit the same
> corruption. OpenGL doesn't work (it results in artifacts on the AMD card
> and lock-up on the NVidia card), but it's quite expected if even simple
> writing to the framebuffer doesn't work.
>
> Mikulas
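
The overlapping-store pattern described in the quoted message can be sketched in portable C. This is only an illustration under stated assumptions, not the glibc or kernel memcpy: the name copy16_to_32, the use_barrier flag and the FULL_BARRIER macro are hypothetical, and a compiler is free to lower the 16-byte moves to something other than stp. The idea is that any length from 16 to 32 bytes is covered by one 16-byte store at the start and one 16-byte store ending at the last byte, so no branch on the exact length is needed and the two stores overlap by 32 - n bytes.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical macro: a full-system barrier on arm64 with GCC/Clang,
 * a no-op elsewhere so the sketch still builds on other machines. */
#if defined(__aarch64__) && defined(__GNUC__)
#define FULL_BARRIER() __asm__ volatile("dmb sy" ::: "memory")
#else
#define FULL_BARRIER() ((void)0)
#endif

/* Hypothetical helper: copy n bytes, 16 <= n <= 32, using two 16-byte
 * block moves whose destinations overlap by 32 - n bytes. */
static void copy16_to_32(unsigned char *dst, const unsigned char *src,
                         size_t n, int use_barrier)
{
    unsigned char head[16], tail[16];

    assert(n >= 16 && n <= 32);

    /* Load both blocks first, then store them. */
    memcpy(head, src, 16);
    memcpy(tail, src + n - 16, 16);

    memcpy(dst, head, 16);            /* first store: bytes 0..15 */
    if (use_barrier)
        FULL_BARRIER();               /* the experiment from the report */
    memcpy(dst + n - 16, tail, 16);   /* second store: bytes n-16..n-1 */
}
```

On ordinary RAM both variants give the same result; the report is that on the framebuffer mapping only the variant with the barrier between the two stores did.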
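
The reply's advice that only aligned accesses should reach a Device mapping can be illustrated with a copy loop that never issues an unaligned store. This is a minimal sketch under assumptions: the name copy_aligned_stores is hypothetical, the 32-bit store width is chosen only because the report mentions 4-byte writes, and it says nothing about how the framebuffer is actually mapped. It runs on ordinary memory; whether it helps on the hardware in question is untested here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical helper: copy n bytes to dst using only naturally aligned
 * stores. Single bytes are written until dst is 4-byte aligned, then
 * 32-bit words, then single bytes for the tail. The volatile qualifier
 * keeps the compiler from merging or widening the stores. */
static void copy_aligned_stores(volatile void *dst, const void *src, size_t n)
{
    volatile uint8_t *d = dst;
    const uint8_t *s = src;

    /* Head: byte stores are always aligned. */
    while (n > 0 && ((uintptr_t)d & 3u) != 0) {
        *d++ = *s++;
        n--;
    }

    /* Body: aligned 32-bit stores. */
    while (n >= 4) {
        uint32_t w;

        assert(((uintptr_t)d & 3u) == 0);
        memcpy(&w, s, 4);             /* the source may be unaligned RAM */
        *(volatile uint32_t *)d = w;
        d += 4;
        s += 4;
        n -= 4;
    }

    /* Tail: remaining bytes, one at a time. */
    while (n > 0) {
        *d++ = *s++;
        n--;
    }
}
```

The trade-off is more, smaller stores than an optimised memcpy would issue, which is consistent with the reply's point that CPU access through such a mapping is going to be slow.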