From: Ard Biesheuvel
Date: Tue, 22 Jan 2019 09:38:10 +0100
Subject: Re: [RFC PATCH] drm: disable WC optimization for cache coherent devices on non-x86
To: Michel Dänzer
Cc: Christoph Hellwig, Will Deacon, David Zhou, Maxime Ripard, Benjamin Herrenschmidt, David Airlie, Maarten Lankhorst, Linux Kernel Mailing List, amd-gfx@lists.freedesktop.org, Junwei Zhang, Huang Rui, dri-devel, Daniel Vetter, Michael Ellerman, Alex Deucher, Sean Paul, Christian Koenig, linux-arm-kernel
References: <20190121100617.2311-1-ard.biesheuvel@linaro.org> <20190121150734.GA30582@infradead.org> <20190121155908.GA8084@infradead.org> <20190121162238.GA17651@infradead.org> <59ccf85d-b99d-b5c8-ea87-66c2a892e197@daenzer.net> <850b6aee-0040-c333-b125-45211c18ada5@daenzer.net> <047667fd-17be-1c37-5d2a-26768cfd6ab8@daenzer.net>
In-Reply-To: <047667fd-17be-1c37-5d2a-26768cfd6ab8@daenzer.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, 21 Jan 2019 at 20:04, Michel Dänzer wrote:
>
> On 2019-01-21 7:28 p.m., Ard Biesheuvel wrote:
> > On Mon, 21 Jan 2019 at 19:24, Michel Dänzer wrote:
> >> On 2019-01-21 7:20 p.m., Ard Biesheuvel wrote:
> >>> On Mon, 21 Jan 2019 at 19:04, Michel Dänzer wrote:
> >>>> On 2019-01-21 6:59 p.m., Ard Biesheuvel wrote:
> >>>>> On Mon, 21 Jan 2019 at 18:55, Michel Dänzer wrote:
> >>>>>> On 2019-01-21 5:30 p.m., Ard Biesheuvel wrote:
> >>>>>>> On Mon, 21 Jan 2019 at 17:22, Christoph Hellwig wrote:
> >>>>>>>
> >>>>>>>> Until that happens we should just change the driver ifdefs to default
> >>>>>>>> the hacks to off and only enable them on setups where we 100%
> >>>>>>>> positively know that they actually work. And document that fact
> >>>>>>>> in big fat comments.
> >>>>>>>
> >>>>>>> Well, as I mentioned in my commit log as well, if we default to off
> >>>>>>> unless CONFIG_X86, we may break working setups on MIPS and Power where
> >>>>>>> the device is in fact non-cache coherent, and relies on this
> >>>>>>> 'optimization' to get things working.
> >>>>>>
> >>>>>> FWIW, the amdgpu driver doesn't rely on non-snooped transfers for
> >>>>>> correct basic operation (the scenario Christian brought up is a very
> >>>>>> specialized use-case), so that shouldn't be an issue.
> >>>>>
> >>>>> The point is that this is only true for x86.
> >>>>>
> >>>>> On other architectures, the use of non-cached mappings on the CPU side
> >>>>> means that you /do/ rely on non-snooped transfers, since if those
> >>>>> transfers turn out not to snoop inadvertently, the accesses are
> >>>>> incoherent with the CPU's view of memory.
> >>>>
> >>>> The driver generally only uses non-cached mappings if
> >>>> drm_arch/device_can_wc_memory returns true.
> >>>
> >>> Indeed. And so we should take care to only return 'true' from that
> >>> function if it is guaranteed that non-cached CPU mappings are coherent
> >>> with the mappings used by the GPU, either because that is always the
> >>> case (like on x86), or because we know that the platform in question
> >>> implements NoSnoop correctly throughout the interconnect.
> >>>
> >>> What seems to be complicating matters is that in some cases, the
> >>> device is non-cache coherent to begin with, so regardless of whether
> >>> the NoSnoop attribute is used or not, those accesses will not snoop in
> >>> the caches and be coherent with the non-cached mappings used by the
> >>> CPU. So if we restrict this optimization [on non-X86] to platforms
> >>> that are known to implement NoSnoop correctly, we may break platforms
> >>> that are implicitly NoSnoop all the time.
> >>
> >> Since the driver generally doesn't rely on non-snooped accesses for
> >> correctness, that couldn't "break" anything that hasn't always been broken.
> >
> > Again, that is only true on x86.
> >
> > On other architectures, DMA writes from the device may allocate in the
> > caches, and be invisible to the CPU when it uses non-cached mappings.
>
> Let me try one last time:
>
> If drm_arch_can_wc_memory returns false, the driver falls back to the
> normal mode of operation, using a cacheable CPU mapping and snooped GPU
> transfers, even if userspace asks (as a performance optimization) for a
> write-combined CPU mapping and non-snooped GPU transfers via
> AMDGPU_GEM_CREATE_CPU_GTT_USWC.

Another question: when userspace requests that such a mapping be
created, does this involve pages that are mapped cacheable into the
userland process?
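
For readers following the thread, the fallback described above (the
driver drops the write-combined request when drm_arch_can_wc_memory
returns false) can be sketched as a small standalone model. This is not
the kernel code: every sketch_* name, the struct fields, and the flag
value are assumptions made for illustration, and the predicate encodes
only the two conditions proposed earlier in this thread (x86, or a
platform known to implement NoSnoop correctly).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative stand-in for AMDGPU_GEM_CREATE_CPU_GTT_USWC; the bit
 * position used here is an assumption, not taken from the thread. */
#define SKETCH_GEM_CREATE_CPU_GTT_USWC (1u << 2)

/* Hypothetical description of a platform, for illustration only. */
struct sketch_platform {
	bool is_x86;             /* non-cached CPU mappings always coherent */
	bool nosnoop_known_good; /* NoSnoop honoured throughout the interconnect */
};

/* Models the policy proposed in the thread: report 'true' only when
 * non-cached CPU mappings are guaranteed coherent with GPU accesses. */
bool sketch_can_wc_memory(const struct sketch_platform *p)
{
	return p->is_x86 || p->nosnoop_known_good;
}

/* Models the fallback: if WC is not usable, silently clear the USWC
 * request so the buffer gets a cacheable mapping and snooped transfers. */
uint32_t sketch_filter_bo_flags(const struct sketch_platform *p,
				uint32_t requested)
{
	if (!sketch_can_wc_memory(p))
		requested &= ~SKETCH_GEM_CREATE_CPU_GTT_USWC;
	return requested;
}

/* Small usage example: the request survives on x86 and is dropped on a
 * platform where neither condition holds. */
void sketch_self_check(void)
{
	struct sketch_platform x86 = { .is_x86 = true };
	struct sketch_platform other = { .is_x86 = false };

	assert(sketch_filter_bo_flags(&x86, SKETCH_GEM_CREATE_CPU_GTT_USWC) ==
	       SKETCH_GEM_CREATE_CPU_GTT_USWC);
	assert(sketch_filter_bo_flags(&other, SKETCH_GEM_CREATE_CPU_GTT_USWC) == 0);
}
```

The model deliberately leaves out the case of devices that are
non-cache coherent to begin with, since how to treat those is exactly
the open point in this discussion.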