Date: Thu, 15 Aug 2019 12:32:38 -0400
From: Jerome Glisse
To: Jason Gunthorpe
Cc: Daniel Vetter, Michal Hocko, Andrew Morton, LKML, Linux MM,
 DRI Development, Intel Graphics Development, Peter Zijlstra,
 Ingo Molnar, David Rientjes, Christian König, Masahiro Yamada,
 Wei Wang, Andy Shevchenko, Thomas Gleixner, Jann Horn, Feng Tang,
 Kees Cook, Randy Dunlap
Subject: Re: [PATCH 2/5] kernel.h: Add non_block_start/end()
Message-ID: <20190815163238.GA30781@redhat.com>
References: <20190814202027.18735-1-daniel.vetter@ffwll.ch>
 <20190814202027.18735-3-daniel.vetter@ffwll.ch>
 <20190814134558.fe659b1a9a169c0150c3e57c@linux-foundation.org>
 <20190815084429.GE9477@dhcp22.suse.cz>
 <20190815130415.GD21596@ziepe.ca>
 <20190815143759.GG21596@ziepe.ca>
 <20190815151028.GJ21596@ziepe.ca>
In-Reply-To: <20190815151028.GJ21596@ziepe.ca>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Aug 15, 2019 at 12:10:28PM -0300, Jason Gunthorpe wrote:
> On Thu, Aug 15, 2019 at 04:43:38PM +0200, Daniel Vetter wrote:
>
> > You have to wait for the gpu to finish current processing in
> > invalidate_range_start. Otherwise there's no point to any of this
> > really. So the wait_event/dma_fence_wait are unavoidable really.
>
> I don't envy your task :|
>
> But, what you describe sure sounds like a 'registration cache' model,
> not the 'shadow pte' model of coherency.
>
> The key difference is that a registration cache is allowed to become
> incoherent with the VMA's because it holds page pins. It is a
> programming bug in userspace to change VA mappings via mmap/munmap/etc
> while the device is working on that VA, but it does not harm system
> integrity because of the page pin.
>
> The cache ensures that each initiated operation sees a DMA setup that
> matches the current VA map when the operation is initiated and allows
> expensive device DMA setups to be re-used.
>
> A 'shadow pte' model (ie hmm) *really* needs device support to
> directly block DMA access - ie trigger 'device page fault'. ie the
> invalidate_start should inform the device to enter a fault mode and
> that is it.
> If the device can't do that, then the driver probably
> shouldn't pursue this level of coherency. The driver would quickly get
> into the messy locking problems like dma_fence_wait from a notifier.

I think here we do not agree on the hardware requirement. For GPUs we
will always need to be able to wait for some GPU fence from inside the
notifier callback; there is just no way around that for many of the
GPUs today (I do not see any indication of that changing).

The driver should avoid lock complexity by using a wait queue, so that
the driver notifier callback can wait without having to hold some driver
lock. However, there will be at least one lock needed to update the
internal driver state for the range being invalidated. That lock is just
the device driver page table lock for the GPU page table associated with
the mm_struct. In all GPU drivers so far it is a short-lived lock and
nothing blocking is done while holding it (it is really just about
updating the page table directory, whether filling it or clearing it).

>
> It is important to identify what model you are going for as defining a
> 'registration cache' coherence expectation allows the driver to skip
> blocking in invalidate_range_start. All it does is invalidate the
> cache so that future operations pick up the new VA mapping.
>
> Intel's HFI RDMA driver uses this model extensively, and I think it is
> well proven, within some limitations of course.
>
> At least, 'registration cache' is the only use model I know of where
> it is acceptable to skip invalidate_range_end.

Here GPUs are not in the registration cache model. I know it might look
like it because of GUP, but GUP was used just because hmm did not exist
at the time.

Cheers,
Jérôme
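
To make the locking pattern above a bit more concrete, here is a rough
userspace sketch. It is not kernel code and not taken from any driver:
pthreads stand in for the kernel primitives, a condition variable plays
the role of the wait queue, and every name in it (gpu_mirror,
mirror_submit, mirror_fence_signal, mirror_invalidate_start) is made up
for illustration. It only shows the ordering: take the page table lock
briefly to clear the range, drop it, and only then wait for the fence.

```c
/* Userspace model only: pthreads stand in for kernel primitives.
 * Every name below (gpu_mirror, mirror_*) is hypothetical. */
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define NPAGES 16

struct gpu_mirror {
	pthread_mutex_t pt_lock;     /* short-lived "GPU page table" lock */
	bool mapped[NPAGES];         /* which pages the device may access */

	pthread_mutex_t fence_lock;  /* internal to the wait-queue stand-in */
	pthread_cond_t fence_wq;     /* plays the role of a wait queue */
	unsigned long emitted;       /* last job seqno handed to the "GPU" */
	unsigned long signaled;      /* last job seqno the "GPU" finished */
};

void mirror_init(struct gpu_mirror *m)
{
	memset(m->mapped, 0, sizeof(m->mapped));
	m->emitted = 0;
	m->signaled = 0;
	pthread_mutex_init(&m->pt_lock, NULL);
	pthread_mutex_init(&m->fence_lock, NULL);
	pthread_cond_init(&m->fence_wq, NULL);
}

/* Submit a job; returns the fence seqno that marks its completion. */
unsigned long mirror_submit(struct gpu_mirror *m)
{
	unsigned long seqno;

	pthread_mutex_lock(&m->fence_lock);
	seqno = ++m->emitted;
	pthread_mutex_unlock(&m->fence_lock);
	return seqno;
}

/* Completion path: record jobs up to seqno as done and wake waiters. */
void mirror_fence_signal(struct gpu_mirror *m, unsigned long seqno)
{
	pthread_mutex_lock(&m->fence_lock);
	if (seqno > m->signaled)
		m->signaled = seqno;
	pthread_cond_broadcast(&m->fence_wq);
	pthread_mutex_unlock(&m->fence_lock);
}

/* Model of an invalidate_range_start callback for pages [start, end).
 * Returns the fence seqno it waited for. */
unsigned long mirror_invalidate_start(struct gpu_mirror *m,
				      size_t start, size_t end)
{
	unsigned long wait_for;
	size_t i;

	assert(start <= end && end <= NPAGES);

	/* Step 1: short critical section, nothing blocking inside. */
	pthread_mutex_lock(&m->pt_lock);
	for (i = start; i < end; i++)
		m->mapped[i] = false;
	pthread_mutex_unlock(&m->pt_lock);

	/* Step 2: wait for in-flight work with pt_lock already dropped. */
	pthread_mutex_lock(&m->fence_lock);
	wait_for = m->emitted;
	while (m->signaled < wait_for)
		pthread_cond_wait(&m->fence_wq, &m->fence_lock);
	pthread_mutex_unlock(&m->fence_lock);

	return wait_for;
}
```

In this model, mirror_fence_signal() would normally be called from a
different thread than the one sitting in mirror_invalidate_start().
fence_lock exists only because a pthread condition variable needs a
mutex; it is never held together with pt_lock, which is the point of
the sketch.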