From: Daniel Vetter
Date: Thu, 26 Apr 2018 11:20:44 +0200
Subject: Re: [Linaro-mm-sig] noveau vs arm dma ops
To: Russell King - ARM Linux
Cc: Christoph Hellwig, Linux Kernel Mailing List, amd-gfx list,
 "moderated list:DMA BUFFER SHARING FRAMEWORK", Jerome Glisse,
 iommu@lists.linux-foundation.org, dri-devel, Dan Williams,
 Thierry Reding, Logan Gunthorpe, Christian König, Linux ARM,
 "open list:DMA BUFFER SHARING FRAMEWORK"
In-Reply-To: <20180425225443.GQ16141@n2100.armlinux.org.uk>
List-ID: linux-kernel@vger.kernel.org

On Thu, Apr 26, 2018 at 12:54 AM, Russell King - ARM Linux wrote:
> On Wed, Apr 25, 2018 at 08:33:12AM -0700, Christoph Hellwig wrote:
>> On Wed, Apr 25, 2018 at 12:04:29PM +0200, Daniel Vetter wrote:
>> > - dma api hides the cache flushing requirements from us. GPUs love
>> > non-snooped access, and worse give userspace control over that. We want
>> > a strict separation between mapping stuff and flushing stuff.
>> > With the IOMMU api we mostly have the former, but for the latter
>> > arch maintainers regularly tell us they won't allow that. So we
>> > have drm_clflush.c.
>>
>> The problem is that an entirely separate cache flushing API is hard.
>> That being said, if you look at my generic dma-noncoherent API series,
>> it tries to move that way. So far it is in early stages and apparently
>> rather buggy, unfortunately.
>
> And if folk want a cacheable mapping with explicit cache flushing, the
> cache flushing must not be defined in terms of "this is what this CPU
> seems to need" but from the point of view of a CPU with infinite
> prefetching, infinite caching and infinite capacity to perform
> writebacks of dirty cache lines at unexpected moments when the memory
> is mapped in a cacheable mapping.
>
> (The reason for that is that you're operating in a non-CPU-specific
> space, so you can't make any guarantees as to how much caching or
> prefetching will occur by the CPU - different CPUs will do different
> amounts.)
>
> So, for example, the sequence:
>
>   GPU writes to memory
>   CPU reads from cacheable memory
>
> if the memory was previously dirty (iow, the CPU has written to it),
> you need to flush the dirty cache lines _before_ the GPU writes
> happen, but you don't know whether the CPU has speculatively
> prefetched, so you need to flush any prefetched cache lines before
> reading from the cacheable memory _after_ the GPU has finished
> writing.
>
> Also note that "flush" there can be "clean the cache", "clean and
> invalidate the cache" or "invalidate the cache" as appropriate - some
> CPUs are able to perform all three of those operations, and the
> appropriate one depends not only on where in the above sequence it's
> being used, but also on what the operations are.
> So, the above sequence could be:
>
>   CPU invalidates cache for memory
>     (due to possible dirty cache lines)
>   GPU writes to memory
>   CPU invalidates cache for memory
>     (to get rid of any speculatively prefetched lines)
>   CPU reads from cacheable memory
>
> Yes, in the above case, _two_ cache operations are required to ensure
> correct behaviour. However, if you know for certain that the memory
> was previously clean, then the first cache operation can be skipped.
>
> What I'm pointing out is that there's much more here than just "I want
> to flush the cache", which is what DRM currently seems to assume with
> the code in drm_cache.c.
>
> If we can agree on a set of interfaces that allows _proper_ use of
> these facilities, one which can be used appropriately, then there
> shouldn't be a problem. The DMA API does that via its ideas about who
> owns a particular buffer (because of the above problem), and that's
> something which would need to be carried over to such a cache flushing
> API (it should be pretty obvious that having a GPU read or write
> memory while the cache for that memory is being cleaned will lead to
> unexpected results).
>
> Also note that things get even more interesting in an SMP environment
> if cache operations aren't broadcast across the SMP cluster (which
> means cache operations have to be IPI'd to other CPUs).
>
> The next issue, which I've brought up before, is the risk that comes
> with exposing cache flushing to userspace on architectures where it
> isn't already exposed. As has been shown by Google Project Zero, this
> risks exposing those architectures to Spectre and Meltdown exploits
> where they weren't at such a risk before. (I've pretty much shown here
> that you _do_ need to control which cache lines get flushed to make
> these exploits work, and flushing the cache by reading lots of data in
> lieu of having the ability to explicitly flush bits of cache makes it
> very difficult to impossible for them to work.)
The above is already what we're implementing in i915, at least
conceptually (it all boils down to clflush instructions, because those
both invalidate and flush). One architectural guarantee we're
exploiting is that prefetched (and hence non-dirty) cachelines will
never get written back, but dropped instead. We kinda need that,
because otherwise the CPU could randomly corrupt the data the GPU is
writing, and non-coherent access just wouldn't work on those platforms.

But aside from that, yes, we do an invalidate before reading and a
flush after every write (or anything else that could leave dirty
cachelines behind). Plus a bit of tracking in the driver (kernel and
userspace both do this, together, with some hilariously bad evolved
semantics, at least for i915, but oh well, can't fix uapi mistakes) to
avoid redundant cacheline flushes/invalidates.

So ack.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch