From: Logan Gunthorpe <logang@deltatee.com>
To: Jason Gunthorpe <jgunthorpe@obsidianresearch.com>, Dan Williams <dan.j.williams@intel.com>
Cc: Benjamin Herrenschmidt, Bjorn Helgaas, Christoph Hellwig, Sagi Grimberg,
 "James E.J. Bottomley", "Martin K. Petersen", Jens Axboe, Steve Wise,
 Stephen Bates, Max Gurtovoy, Keith Busch, linux-pci@vger.kernel.org,
 linux-scsi, linux-nvme@lists.infradead.org, linux-rdma@vger.kernel.org,
 linux-nvdimm, linux-kernel@vger.kernel.org, Jerome Glisse
Subject: Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory
Date: Tue, 18 Apr 2017 15:31:58 -0600
Message-ID: <9fc9352f-86fe-3a9e-e372-24b3346b518c@deltatee.com>
In-Reply-To: <20170418210339.GA24257@obsidianresearch.com>

On 18/04/17 03:03 PM, Jason Gunthorpe wrote:
> What about something more incremental like this instead:
>  - dma_ops will set map_sg_p2p == map_sg when they are updated to
>    support p2p, otherwise DMA on P2P pages will fail for those ops.
>  - When all ops support p2p we remove the if and ops->map_sg, then
>    just call map_sg_p2p
>  - For now the scatterlist maintains a bit when pages are added
>    indicating if p2p memory might be present in the list.
>  - Unmap for p2p and non-p2p is the same, the underlying ops driver has
>    to make it work.
>
> diff --git a/include/linux/dma-mapping.h b/include/linux/dma-mapping.h
> index 0977317c6835c2..505ed7d502053d 100644
> --- a/include/linux/dma-mapping.h
> +++ b/include/linux/dma-mapping.h
> @@ -103,6 +103,9 @@ struct dma_map_ops {
>  	int (*map_sg)(struct device *dev, struct scatterlist *sg,
>  		      int nents, enum dma_data_direction dir,
>  		      unsigned long attrs);
> +	int (*map_sg_p2p)(struct device *dev, struct scatterlist *sg,
> +			  int nents, enum dma_data_direction dir,
> +			  unsigned long attrs);
>  	void (*unmap_sg)(struct device *dev,
>  			 struct scatterlist *sg, int nents,
>  			 enum dma_data_direction dir,
> @@ -244,7 +247,15 @@ static inline int dma_map_sg_attrs(struct device *dev, struct scatterlist *sg,
>  	for_each_sg(sg, s, nents, i)
>  		kmemcheck_mark_initialized(sg_virt(s), s->length);
>  	BUG_ON(!valid_dma_direction(dir));
> -	ents = ops->map_sg(dev, sg, nents, dir, attrs);
> +
> +	if (sg_has_p2p(sg)) {
> +		if (ops->map_sg_p2p)
> +			ents = ops->map_sg_p2p(dev, sg, nents, dir, attrs);
> +		else
> +			return 0;
> +	} else
> +		ents = ops->map_sg(dev, sg, nents, dir, attrs);
> +
>  	BUG_ON(ents < 0);
>  	debug_dma_map_sg(dev, sg, nents, ents, dir);

I could get behind this. Though a couple of points:

1) It means that sg_has_p2p() has to walk the entire sg and check every
page. Then map_sg_p2p()/map_sg() has to walk it again, repeat the check
and do some operation per page. If anyone is concerned about dma_map
performance this could be an issue.

2) Without knowing exactly what the arch-specific code may need to do,
it's hard to say that this is exactly the right approach. If every
dma_ops provider has to do exactly this on every page, it may lead to a
lot of duplicate code:

	foreach_sg page:
		if (pci_page_is_p2p(page)) {
			dma_addr = pci_p2p_map_page(page);
			if (!dma_addr)
				return 0;
			continue;
		}
		...

The only thing I'm presently aware of is the segment check and applying
the offset to the physical address -- neither of which has much to do
with the specific dma_ops providers. It _may_ be that this needs to be
bus specific and not arch specific, which I think is what Dan may be
getting at. So it may make sense to just have a pci_map_sg_p2p() which
takes a dma_ops struct it would use for any page that isn't a p2p page.

Logan
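
A rough sketch of the kind of pci_map_sg_p2p() helper described above.
pci_page_is_p2p() and pci_p2p_map_page() are hypothetical stand-ins for
whatever the p2p infrastructure ends up providing (the same names used
in the pseudocode above), and falling back through the provider's
->map_page() for regular memory is just one possible choice, not a
settled design:

#include <linux/dma-mapping.h>
#include <linux/pci.h>
#include <linux/scatterlist.h>

static int pci_map_sg_p2p(struct device *dev, const struct dma_map_ops *ops,
			  struct scatterlist *sgl, int nents,
			  enum dma_data_direction dir, unsigned long attrs)
{
	struct scatterlist *sg;
	dma_addr_t addr;
	int i;

	for_each_sg(sgl, sg, nents, i) {
		if (pci_page_is_p2p(sg_page(sg))) {
			/*
			 * p2p page: the bus code does the segment check
			 * and applies the p2p offset to the physical
			 * address (hypothetical helper).
			 */
			addr = pci_p2p_map_page(dev, sg_page(sg),
						sg->offset, sg->length, dir);
			if (!addr)
				return 0;
		} else {
			/* Regular memory: defer to the provided dma_ops. */
			addr = ops->map_page(dev, sg_page(sg), sg->offset,
					     sg->length, dir, attrs);
			if (dma_mapping_error(dev, addr))
				return 0;
		}

		sg->dma_address = addr;
		sg_dma_len(sg) = sg->length;
	}

	return nents;
}

A real version would also have to unwind any entries already mapped
when a later entry fails; that is left out of the sketch.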