Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1166181AbdDXHiG convert rfc822-to-8bit (ORCPT ); Mon, 24 Apr 2017 03:38:06 -0400
Received: from userp1040.oracle.com ([156.151.31.81]:40686 "EHLO userp1040.oracle.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1166035AbdDXHh4 (ORCPT ); Mon, 24 Apr 2017 03:37:56 -0400
Message-ID: <1493019397.3171.118.camel@oracle.com>
Subject: Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory
From: Knut Omang
To: Benjamin Herrenschmidt , Logan Gunthorpe , Dan Williams
Cc: Bjorn Helgaas , Jason Gunthorpe , Christoph Hellwig , Sagi Grimberg ,
 "James E.J. Bottomley" , "Martin K. Petersen" , Jens Axboe , Steve Wise ,
 Stephen Bates , Max Gurtovoy , Keith Busch , linux-pci@vger.kernel.org,
 linux-scsi , linux-nvme@lists.infradead.org, linux-rdma@vger.kernel.org,
 linux-nvdimm , "linux-kernel@vger.kernel.org" , Jerome Glisse
Date: Mon, 24 Apr 2017 09:36:37 +0200
In-Reply-To: <1492381907.25766.49.camel@kernel.crashing.org>
References: <1490911959-5146-1-git-send-email-logang@deltatee.com>
 <1491974532.7236.43.camel@kernel.crashing.org>
 <5ac22496-56ec-025d-f153-140001d2a7f9@deltatee.com>
 <1492034124.7236.77.camel@kernel.crashing.org>
 <81888a1e-eb0d-cbbc-dc66-0a09c32e4ea2@deltatee.com>
 <20170413232631.GB24910@bhelgaas-glaptop.roam.corp.google.com>
 <20170414041656.GA30694@obsidianresearch.com>
 <1492169849.25766.3.camel@kernel.crashing.org>
 <630c1c63-ff17-1116-e069-2b8f93e50fa2@deltatee.com>
 <20170414190452.GA15679@bhelgaas-glaptop.roam.corp.google.com>
 <1492207643.25766.18.camel@kernel.crashing.org>
 <1492311719.25766.37.camel@kernel.crashing.org>
 <5e43818e-8c6b-8be8-23ff-b798633d2a73@deltatee.com>
 <1492381907.25766.49.camel@kernel.crashing.org>
Content-Type: text/plain; charset="UTF-8"
X-Mailer: Evolution 3.20.5 (3.20.5-1.fc24)
Mime-Version: 1.0
Content-Transfer-Encoding: 8BIT
X-Source-IP: userv0021.oracle.com [156.151.31.71]
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2451
Lines: 55

On Mon, 2017-04-17 at 08:31 +1000, Benjamin Herrenschmidt wrote:
> On Sun, 2017-04-16 at 10:34 -0600, Logan Gunthorpe wrote:
> >
> > On 16/04/17 09:53 AM, Dan Williams wrote:
> > > ZONE_DEVICE allows you to redirect via get_dev_pagemap() to retrieve
> > > context about the physical address in question. I'm thinking you can
> > > hang bus address translation data off of that structure. This seems
> > > vaguely similar to what HMM is doing.
> >
> > Thanks! I didn't realize you had the infrastructure to look up a device
> > from a pfn/page. That would really come in handy for us.
>
> It does indeed. I won't be able to play with that much for a few weeks
> (see my other email) so if you're going to tackle this while I'm away,
> can you work with Jerome to make sure you don't conflict with HMM ?
>
> I really want a way for HMM to be able to layout struct pages over the
> GPU BARs rather than in "allocated free space" for the case where the
> BAR is big enough to cover all of the GPU memory.
>
> In general, I'd like a simple & generic way for any driver to ask the
> core to layout DMA'ble struct pages over BAR space. I am not convinced
> this requires a "p2mem device" to be created on top of this though but
> that's a different discussion.
>
> Of course the actual ability to perform the DMA mapping will be subject
> to various restrictions that will have to be implemented in the actual
> "dma_ops override" backend. We can have generic code to handle the case
> where devices reside on the same domain, which can deal with switch
> configuration etc... we will need to have iommu specific code to handle
> the case going through the fabric.
>
> Virtualization is a separate can of worms due to how qemu completely
> fakes the MMIO space, we can look into that later.

My first reflex when reading this thread was to think that this whole
domain lends itself excellently to testing via Qemu.
Could it be that doing this in the opposite direction might be a safer
approach in the long run, even though it means (significantly) more work
up front? E.g. start by fixing/providing/documenting suitable model(s)
for testing this in Qemu, then implement the patch set based on those
models?

Thanks,
Knut

>
> Cheers,
> Ben.