Date: Wed, 30 Jan 2019 10:55:43 -0500
From: Jerome Glisse
To: "Koenig, Christian"
Cc: Christoph Hellwig, Jason Gunthorpe, Logan Gunthorpe,
    "linux-mm@kvack.org", "linux-kernel@vger.kernel.org",
    Greg Kroah-Hartman, "Rafael J . Wysocki", Bjorn Helgaas,
    "Kuehling, Felix", "linux-pci@vger.kernel.org",
    "dri-devel@lists.freedesktop.org", Marek Szyprowski, Robin Murphy,
    Joerg Roedel, "iommu@lists.linux-foundation.org"
Subject: Re: [RFC PATCH 3/5] mm/vma: add support for peer to peer to device vma
Message-ID: <20190130155543.GC3177@redhat.com>
References: <20190129174728.6430-1-jglisse@redhat.com>
    <20190129174728.6430-4-jglisse@redhat.com>
    <20190129191120.GE3176@redhat.com>
    <20190129193250.GK10108@mellanox.com>
    <99c228c6-ef96-7594-cb43-78931966c75d@deltatee.com>
    <20190129205827.GM10108@mellanox.com>
    <20190130080208.GC29665@lst.de>
    <4e0637ba-0d7c-66a5-d3de-bc1e7dc7c0ef@amd.com>
In-Reply-To: <4e0637ba-0d7c-66a5-d3de-bc1e7dc7c0ef@amd.com>
User-Agent: Mutt/1.10.1 (2018-07-13)
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Jan 30, 2019 at 10:33:39AM +0000, Koenig, Christian wrote:
> On 30.01.19 at 09:02, Christoph Hellwig wrote:
> > On Tue, Jan 29, 2019 at 08:58:35PM +0000, Jason Gunthorpe wrote:
> >> On Tue, Jan 29, 2019 at 01:39:49PM -0700, Logan Gunthorpe wrote:
> >>
> >>> implement the mapping. And I don't think we should have 'special' vma's
> >>> for this (though we may need something to ensure we don't get mapping
> >>> requests mixed with different types of pages...).
> >> I think Jerome explained the point here is to have a 'special vma'
> >> rather than a 'special struct page' as, really, we don't need a
> >> struct page at all to make this work.
> >>
> >> If I recall your earlier attempts at adding struct page for BAR
> >> memory, it ran aground on issues related to O_DIRECT/sgls, etc, etc.
> > Struct page is what makes O_DIRECT work, using sgls or biovecs, etc on
> > it work. Without struct page none of the above can work at all. That
> > is why we use struct page for backing BARs in the existing P2P code.
> > Not that I'm a particular fan of creating struct page for this device
> > memory, but without major invasive surgery to large parts of the kernel
> > it is the only way to make it work.
>
> The problem seems to be that struct page does two things:
>
> 1. Memory management for system memory.
> 2. The object to work with in the I/O layer.
>
> This was done because a good part of that stuff overlaps, like reference
> counting how often a page is used. The problem now is that this doesn't
> work very well for device memory in some cases.
>
> For example on GPUs you usually have a large amount of memory which is
> not even accessible by the CPU. In other words you can't easily create a
> struct page for it because you can't reference it with a physical CPU
> address.
>
> Maybe struct page should be split up into smaller structures? I mean
> it's really overloaded with data.

I think the simpler answer is that we do not want to allow GUP or anything
similar to pin BAR or device memory. Doing so can only hurt us in the long
term by fragmenting GPU memory and preventing us from moving things around.
For transparent use of device memory within a process, pinning is definitely
forbidden.

I do not see any good reason why we would want to pin device memory for the
existing GPU GEM objects. Userspace has always had very low expectations
about what it can do with an mmap of those objects, and I believe it is
better to keep expectations low here and say that nothing will work with
those pointers. I just do not see a valid and compelling use case to change
that :)

Even outside GPU drivers, device drivers like RDMA just want to share their
doorbells with other devices, and they do not want to see those doorbell
pages used in direct I/O or anything similar, AFAICT.

Cheers,
Jérôme