From: Daniel Vetter
Date: Thu, 15 Aug 2019 18:25:16 +0200
Subject: Re: [PATCH 2/5] kernel.h: Add non_block_start/end()
To: Jason Gunthorpe
Cc: Michal Hocko, Andrew Morton, LKML, Linux MM, DRI Development, Intel Graphics Development, Peter Zijlstra, Ingo Molnar, David Rientjes, Christian König, Jérôme Glisse, Masahiro Yamada, Wei Wang, Andy Shevchenko, Thomas Gleixner, Jann Horn, Feng Tang, Kees Cook, Randy Dunlap, Daniel Vetter
In-Reply-To: <20190815151028.GJ21596@ziepe.ca>
References: <20190814202027.18735-1-daniel.vetter@ffwll.ch> <20190814202027.18735-3-daniel.vetter@ffwll.ch> <20190814134558.fe659b1a9a169c0150c3e57c@linux-foundation.org> <20190815084429.GE9477@dhcp22.suse.cz> <20190815130415.GD21596@ziepe.ca> <20190815143759.GG21596@ziepe.ca> <20190815151028.GJ21596@ziepe.ca>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Aug 15, 2019 at 5:10 PM Jason Gunthorpe wrote:
>
> On Thu, Aug 15, 2019 at 04:43:38PM +0200, Daniel Vetter wrote:
>
> > You have to wait for the gpu to finish current processing in
> > invalidate_range_start. Otherwise there's no point to any of this
> > really. So the wait_event/dma_fence_wait are unavoidable really.
>
> I don't envy your task :|
>
> But, what you describe sure sounds like a 'registration cache' model,
> not the 'shadow pte' model of coherency.
>
> The key difference is that a registration cache is allowed to become
> incoherent with the VMAs because it holds page pins. It is a
> programming bug in userspace to change VA mappings via mmap/munmap/etc
> while the device is working on that VA, but it does not harm system
> integrity because of the page pin.
>
> The cache ensures that each initiated operation sees a DMA setup that
> matches the current VA map when the operation is initiated and allows
> expensive device DMA setups to be re-used.
>
> A 'shadow pte' model (i.e. hmm) *really* needs device support to
> directly block DMA access, i.e. trigger a 'device page fault'. That is,
> invalidate_start should inform the device to enter a fault mode and
> that is it. If the device can't do that, then the driver probably
> shouldn't pursue this level of coherency. The driver would quickly get
> into messy locking problems like dma_fence_wait from a notifier.
>
> It is important to identify what model you are going for, as defining a
> 'registration cache' coherence expectation allows the driver to skip
> blocking in invalidate_range_start. All it does is invalidate the
> cache so that future operations pick up the new VA mapping.
>
> Intel's HFI RDMA driver uses this model extensively, and I think it is
> well proven, within some limitations of course.
>
> At least, 'registration cache' is the only use model I know of where
> it is acceptable to skip invalidate_range_end.

I'm not really well versed in the details of our userptr, but both
amdgpu and i915 wait for the gpu to complete from
invalidate_range_start. Jerome has at least looked a lot at the amdgpu
one, so maybe he can explain what exactly it is we're doing ...
-Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch
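
For readers following the thread, the 'registration cache' behaviour described in the quoted text can be illustrated with a small userspace toy in C. This is only a sketch under stated assumptions: `reg_entry`, `reg_cache`, `cache_invalidate_range` and `cache_get` are hypothetical names invented for illustration, not kernel or driver APIs, and page pinning, locking and eviction are deliberately not modelled. It shows just the two properties named above: invalidation marks overlapping entries stale without waiting for the device, and the next initiated operation rebuilds the setup against the current mapping while unchanged ranges are re-used.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical toy model; none of these names exist in the kernel. */
struct reg_entry {
    uintptr_t start, end; /* cached VA range [start, end) */
    bool valid;           /* cleared when the range is invalidated */
};

#define CACHE_SLOTS 4

struct reg_cache {
    struct reg_entry slot[CACHE_SLOTS];
    unsigned setups; /* how many (expensive) DMA setups were built */
};

/*
 * Stand-in for what invalidate_range_start does under this model:
 * mark every overlapping entry stale and return immediately.
 * There is no wait for in-flight device work (real drivers rely on
 * page pins for safety here, which this sketch does not model).
 */
static void cache_invalidate_range(struct reg_cache *c,
                                   uintptr_t start, uintptr_t end)
{
    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        struct reg_entry *e = &c->slot[i];
        if (e->valid && e->start < end && start < e->end)
            e->valid = false;
    }
}

/*
 * Called when a new operation is initiated: re-use a valid entry for
 * the exact range, otherwise build a fresh setup in a free slot.
 * Returns NULL when the cache is full (eviction is omitted).
 */
static struct reg_entry *cache_get(struct reg_cache *c,
                                   uintptr_t start, uintptr_t end)
{
    struct reg_entry *free_slot = NULL;

    assert(start < end);
    for (size_t i = 0; i < CACHE_SLOTS; i++) {
        struct reg_entry *e = &c->slot[i];
        if (e->valid && e->start == start && e->end == end)
            return e; /* cache hit: setup is re-used */
        if (!e->valid && !free_slot)
            free_slot = e;
    }
    if (!free_slot)
        return NULL;

    *free_slot = (struct reg_entry){ .start = start, .end = end,
                                     .valid = true };
    c->setups++; /* a new setup against the current VA map */
    return free_slot;
}
```

In this toy, a second `cache_get` for the same range returns the same entry without a new setup, while a `cache_get` after an overlapping `cache_invalidate_range` builds a new one. A 'shadow pte' design would instead need the invalidation step to stop device access before returning, which is the part this sketch intentionally leaves out.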