Date: Wed, 30 Jan 2019 15:43:32 -0500
From: Jerome Glisse
To: Jason Gunthorpe
Cc: Logan Gunthorpe, linux-mm@kvack.org, linux-kernel@vger.kernel.org,
    Greg Kroah-Hartman, "Rafael J. Wysocki", Bjorn Helgaas,
    Christian Koenig, Felix Kuehling, linux-pci@vger.kernel.org,
    dri-devel@lists.freedesktop.org, Christoph Hellwig,
    Marek Szyprowski, Robin Murphy, Joerg Roedel,
    iommu@lists.linux-foundation.org
Subject: Re: [RFC PATCH 3/5] mm/vma: add support for peer to peer to device vma
Message-ID: <20190130204332.GF5061@redhat.com>
In-Reply-To: <20190130201114.GB17915@mellanox.com>
On Wed, Jan 30, 2019 at 08:11:19PM +0000, Jason Gunthorpe wrote:
> On Wed, Jan 30, 2019 at 01:00:02PM -0700, Logan Gunthorpe wrote:
>
> > We never changed SGLs. We still use them to pass p2pdma pages; we
> > just need to be a bit careful where we send the entire SGL. I see
> > no reason why we can't continue to be careful once they're in
> > userspace, if there's something in GUP to deny them.
> >
> > It would be nice to have heterogeneous SGLs, and it is something we
> > should work toward, but in practice they aren't really necessary at
> > the moment.
>
> RDMA generally cannot cope well with an API that requires homogeneous
> SGLs: user space can construct complex MRs (particularly with the
> proposed SGL MR flow) and we must marshal that into a single SGL or
> the drivers fall apart.
>
> Jerome explained that GPU is worse: a single VMA may have a random
> mix of CPU and device pages.
>
> This is a pretty big blocker that would have to somehow be fixed.

Note that HMM takes care of that for RDMA ODP with my ODP-to-HMM patch:
what you get for an ODP umem is just a list of dma addresses that you
can program your device with. The aim is to spare the driver from
having to care about any of that.

The access policy when the UMEM object is created by userspace through
the verbs API should, however, ascertain that for an mmap of a device
file it only creates a UMEM that is fully covered by one and only one
vma. A GPU device driver will have one vma per logical GPU object, and
I expect other kinds of devices to do the same, so that they can match
a vma to a unique object in their driver.
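On the driver side that ends up being roughly the sketch below; every
name in it (struct my_dev, my_dev_write_pte(), ...) is made up for
illustration, this is not the actual ODP or HMM API:

/*
 * Hypothetical sketch only: once the umem is reduced to a plain
 * array of dma addresses, one per PAGE_SIZE chunk, mirroring it
 * into the device page table is a dumb loop. struct my_dev and
 * my_dev_write_pte() are illustrative stand-ins, not a real API.
 */
static void my_dev_program_umem(struct my_dev *mdev, u64 dev_va,
                                const dma_addr_t *dma_list,
                                unsigned long npages)
{
        unsigned long i;

        for (i = 0; i < npages; i++)
                my_dev_write_pte(mdev, dev_va + i * PAGE_SIZE,
                                 dma_list[i]);
}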
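The single-vma policy itself is cheap to enforce at UMEM creation
time. A rough sketch (find_vma() and mmap_sem are the real kernel
interfaces of this era; the helper itself only illustrates the policy,
it is not code from this patchset):

/*
 * Illustrative policy check: accept [start, start + len) only if it
 * is fully covered by exactly one vma and that vma is backed by a
 * file (the device file in the mmap case discussed above).
 */
static int umem_check_single_vma(struct mm_struct *mm,
                                 unsigned long start,
                                 unsigned long len)
{
        struct vm_area_struct *vma;
        int ret = -EINVAL;

        down_read(&mm->mmap_sem);
        vma = find_vma(mm, start);
        if (vma && vma->vm_start <= start &&
            start + len <= vma->vm_end && vma->vm_file)
                ret = 0;
        up_read(&mm->mmap_sem);

        return ret;
}

The driver can then key off vma->vm_file (or vma->vm_private_data) to
map the vma back to its own unique object.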
> > That doesn't even necessarily need to be the case. For HMM, as I
> > understand it, struct pages may not point to any accessible memory,
> > and the memory that backs them (or not) may change over their
> > lifetime. So they don't have to be strictly tied to BAR addresses.
> > p2pdma pages are strictly tied to BAR addresses, though.
>
> No idea, but at least for this case I don't think we need magic HMM
> pages to make simple VMA ops p2p_map/unmap work.

Yes, you do not need struct pages for a simple driver. If we start
creating struct pages for every PCIE BAR we are going to waste a lot
of memory and resources for no good reason, and I doubt that all of
the PCIE BARs of a device enabling p2p will ever be mapped as p2p.

So a simple driver does not need struct page, and a GPU driver that
does not use HMM (all GPUs that are more than 2 years old) does not
need struct page either. Struct page is a burden here more than
anything else; I have not seen one good thing it gives you.

Cheers,
Jérôme