Message-ID: <1492564806.25766.124.camel@kernel.crashing.org>
Subject: Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory
From: Benjamin Herrenschmidt
To: Jason Gunthorpe, Dan Williams
Cc: Logan Gunthorpe, Bjorn Helgaas, Christoph Hellwig, Sagi Grimberg,
 "James E.J. Bottomley", "Martin K. Petersen", Jens Axboe, Steve Wise,
 Stephen Bates, Max Gurtovoy, Keith Busch, linux-pci@vger.kernel.org,
 linux-scsi, linux-nvme@lists.infradead.org, linux-rdma@vger.kernel.org,
 linux-nvdimm, "linux-kernel@vger.kernel.org", Jerome Glisse
Date: Wed, 19 Apr 2017 11:20:06 +1000
In-Reply-To: <20170418210339.GA24257@obsidianresearch.com>
References: <1492381396.25766.43.camel@kernel.crashing.org>
 <20170418164557.GA7181@obsidianresearch.com>
 <20170418190138.GH7181@obsidianresearch.com>
 <20170418210339.GA24257@obsidianresearch.com>

On Tue, 2017-04-18 at 15:03 -0600, Jason Gunthorpe wrote:
> I don't follow, when does get_dma_ops() return a p2p aware provider?
> It has no way to know if the DMA is going to involve p2p, get_dma_ops
> is called with the device initiating the DMA.
>
> So you'd always return the P2P shim on a system that has registered
> P2P memory?
>
> Even so, how does this shim work? dma_ops are not really intended to
> be stacked. How would we make unmap work, for instance? What happens
> when the underlying iommu dma ops actually natively understands p2p
> and doesn't want the shim?

Good point. We only know on a per-page basis ... ugh.

So we really need to change the arch main dma_ops. I'm not opposed to
that.

What we then need to do is have that main arch dma_map_sg, when it
encounters a "device" page, call into a helper attached to the devmap
to handle *that page*, providing sufficient context.

That helper wouldn't perform the actual iommu mapping. It would simply
return something along the lines of:

 - "use that alternate bus address and don't map in the iommu"
 - "use that alternate bus address and do map in the iommu"
 - "proceed as normal"
 - "fail"

What do you think?

Cheers,
Ben.
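
For illustration only, the per-page helper idea described above could look roughly like the following userspace C mock-up. It is a sketch under stated assumptions, not kernel code: every name in it (p2p_map_action, mock_devmap, mock_page, mock_dma_map_sg, mock_iommu_map, the example helpers and the MOCK_* constants) is hypothetical and does not correspond to an existing kernel API. The "iommu mapping" is faked by adding a constant offset, and the failure path skips the unwinding of earlier entries that a real dma_map_sg implementation would have to do.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical verdicts the devmap helper hands back to the arch code. */
enum p2p_map_action {
    P2P_MAP_NORMAL,       /* "proceed as normal" */
    P2P_MAP_ALT_NO_IOMMU, /* use alternate bus address, don't map in the iommu */
    P2P_MAP_ALT_IOMMU,    /* use alternate bus address, do map in the iommu */
    P2P_MAP_FAIL          /* "fail" */
};

struct mock_page;

/* Hypothetical devmap carrying the per-page helper. */
struct mock_devmap {
    enum p2p_map_action (*map_page)(const struct mock_page *pg,
                                    uint64_t *bus_addr);
};

struct mock_page {
    uint64_t phys;                    /* CPU physical address */
    const struct mock_devmap *devmap; /* NULL for ordinary RAM */
};

#define MOCK_IOVA_BASE  0x100000000ULL /* made-up IOVA window */
#define MOCK_BUS_OFFSET 0x4000ULL      /* made-up CPU-to-bus offset */

/* Stand-in for the real iommu mapping step. */
static uint64_t mock_iommu_map(uint64_t addr)
{
    return MOCK_IOVA_BASE + addr;
}

/*
 * Mock of the main arch dma_map_sg: only this function touches the
 * "iommu"; the helper merely says what to do with a device page.
 * Returns the number of entries mapped, or 0 on failure.
 */
static size_t mock_dma_map_sg(const struct mock_page *pages, size_t n,
                              uint64_t *dma_out)
{
    assert(pages != NULL && dma_out != NULL);

    for (size_t i = 0; i < n; i++) {
        uint64_t addr = pages[i].phys;
        enum p2p_map_action act = P2P_MAP_NORMAL;

        /* A "device" page: ask the helper attached to its devmap. */
        if (pages[i].devmap && pages[i].devmap->map_page)
            act = pages[i].devmap->map_page(&pages[i], &addr);

        switch (act) {
        case P2P_MAP_ALT_NO_IOMMU:
            dma_out[i] = addr;
            break;
        case P2P_MAP_ALT_IOMMU:
            dma_out[i] = mock_iommu_map(addr);
            break;
        case P2P_MAP_NORMAL:
            dma_out[i] = mock_iommu_map(pages[i].phys);
            break;
        case P2P_MAP_FAIL:
        default:
            return 0; /* real code would unmap entries 0..i-1 here */
        }
    }
    return n;
}

/* Example helper: peer reachable directly, bypass the iommu. */
static enum p2p_map_action helper_bypass(const struct mock_page *pg,
                                         uint64_t *bus_addr)
{
    *bus_addr = pg->phys + MOCK_BUS_OFFSET;
    return P2P_MAP_ALT_NO_IOMMU;
}

/* Example helper: alternate bus address, but still goes through the iommu. */
static enum p2p_map_action helper_via_iommu(const struct mock_page *pg,
                                            uint64_t *bus_addr)
{
    *bus_addr = pg->phys + MOCK_BUS_OFFSET;
    return P2P_MAP_ALT_IOMMU;
}

/* Example helper: this initiator cannot reach the peer at all. */
static enum p2p_map_action helper_fail(const struct mock_page *pg,
                                       uint64_t *bus_addr)
{
    (void)pg;
    (void)bus_addr;
    return P2P_MAP_FAIL;
}
```

The point of the split is that the verdict is taken per page, so one scatterlist can mix ordinary RAM with device pages, and an iommu implementation that natively understands p2p only needs a helper that answers "proceed as normal". Unmap is not covered by this sketch.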