From: Dan Williams
Date: Wed, 19 Apr 2017 10:41:42 -0700
Subject: Re: [RFC 0/8] Copy Offload with Peer-to-Peer PCI Memory
To: Jerome Glisse
Cc: Logan Gunthorpe, Jason Gunthorpe, Benjamin Herrenschmidt,
    Bjorn Helgaas, Christoph Hellwig, Sagi Grimberg,
    "James E.J. Bottomley", "Martin K. Petersen", Jens Axboe,
    Steve Wise, Stephen Bates, Max Gurtovoy, Keith Busch,
    linux-pci@vger.kernel.org, linux-scsi,
    linux-nvme@lists.infradead.org, linux-rdma@vger.kernel.org,
    linux-nvdimm, "linux-kernel@vger.kernel.org"

On Wed, Apr 19, 2017 at 10:32 AM, Jerome Glisse wrote:
> On Wed, Apr 19, 2017 at 10:01:23AM -0700, Dan Williams wrote:
>> On Wed, Apr 19, 2017 at 9:48 AM, Logan Gunthorpe wrote:
>> > On 19/04/17 09:55 AM, Jason Gunthorpe wrote:
>> >> I was thinking only this one would be supported with a core code
>> >> helper..
>> >
>> > Pivoting slightly: I was looking at how HMM uses ZONE_DEVICE. They add a
>> > type flag to the dev_pagemap structure which would be very useful to us.
>> > We could add another MEMORY_DEVICE_P2P type to distinguish p2p pages.
>> > Then, potentially, we could add a dma_map callback to the structure
>> > (possibly unioned with an hmm field). The dev_ops providers would then
>> > just need to do something like this (enclosed in a helper):
>> >
>> > if (is_zone_device_page(page)) {
>> >         pgmap = get_dev_pagemap(page_to_pfn(page));
>> >
>> >         /* only p2p pages that provide a dma_map callback qualify */
>> >         if (!pgmap || pgmap->type != MEMORY_DEVICE_P2P ||
>> >             !pgmap->dma_map)
>> >                 return 0;
>> >
>> >         /* the providing device translates the page to a bus address */
>> >         dma_addr = pgmap->dma_map(dev, pgmap->dev, page);
>> >         put_dev_pagemap(pgmap);
>> >         if (!dma_addr)
>> >                 return 0;
>> >         ...
>> > }
>> >
>> > The pci_enable_p2p_bar function would then just need to call
>> > devm_memremap_pages with the dma_map callback set to a function that
>> > does the segment check and the offset calculation.
>> >
>> > Thoughts?
>> >
>> > @Jerome: my feedback to you would be that your patch assumes all users
>> > of devm_memremap_pages are MEMORY_DEVICE_PERSISTENT. It would be more
>> > useful if it were generic. My suggestion would be to have the caller
>> > allocate the dev_pagemap structure, populate it, and pass it into
>> > devm_memremap_pages. Given that pretty much everything in that structure
>> > is already an argument to that function, I feel like this makes sense.
>> > This should also help to unify hmm_devmem_pages_create and
>> > devm_memremap_pages, which look very similar to each other.
>>
>> I like that change. Also, the types should describe the memory in terms
>> of its relationship to struct page, not whether it is persistent. I
>> would consider volatile and persistent memory that is attached to the
>> cpu memory controller and i/o coherent as the same type of memory. DMA
>> incoherent ranges like P2P and HMM should get their own types.
>
> Dan you asked me to not use devm_memremap_pages() because you didn't
> want to have HMM memory in the pgmap_radix, did you change your opinion
> on that? :)

No, not quite ;-). I still don't think we should require non-HMM users
to pass NULL for all the HMM arguments. What I like about Logan's
proposal is to have separate create and register steps for the
dev_pagemap. That way, call paths that don't care about HMM specifics
can just turn around and register the vanilla dev_pagemap.
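
To make that concrete, here is roughly what I am picturing (a sketch
only; devm_memremap_pages_register(), the ->dma_map member, and
p2p_dma_map() are made-up names for illustration, not existing
interfaces):

/*
 * Sketch of the create/register split: the caller open-codes the
 * allocation and setup, so the register step carries no HMM-specific
 * arguments at all.
 */
static void *p2p_map_bar_pages(struct pci_dev *pdev, struct resource *res)
{
        struct dev_pagemap *pgmap;

        /* step 1: the caller allocates and populates the dev_pagemap */
        pgmap = devm_kzalloc(&pdev->dev, sizeof(*pgmap), GFP_KERNEL);
        if (!pgmap)
                return ERR_PTR(-ENOMEM);

        pgmap->res = res;                       /* the BAR range to map */
        pgmap->type = MEMORY_DEVICE_P2P;        /* proposed new type */
        pgmap->dma_map = p2p_dma_map;           /* proposed callback */

        /* step 2: register it; HMM would fill in its union members first */
        return devm_memremap_pages_register(&pdev->dev, pgmap);
}

The point being that the allocation is plain code any caller can
open-code, and the register step stays vanilla.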
> Note I won't make any change now on that front, but if it makes sense
> I am happy to do it as a separate patchset on top of HMM.
>
> Also I don't want p2pmem to be mutually exclusive with HMM, we will
> want GPUs to do peer-to-peer DMA and thus HMM ZONE_DEVICE pages to
> support this too.

Yes, this makes sense. I think we really just want to distinguish host
memory or not in terms of the dev_pagemap type.
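
For illustration, that distinction could look something like this
(again just a sketch; MEMORY_DEVICE_P2P is Logan's proposal above, the
other names are invented):

/* sketch: invented names, only to show the host-or-not split */
enum dev_pagemap_type {
        MEMORY_DEVICE_HOST,     /* volatile or persistent, cpu-coherent */
        MEMORY_DEVICE_P2P,      /* pci bar space, dma-incoherent */
        MEMORY_DEVICE_HMM,      /* device-private pages managed by hmm */
};

static inline bool dev_pagemap_is_host_memory(const struct dev_pagemap *pgmap)
{
        /* only host memory can be handed to the dma api unchanged */
        return pgmap->type == MEMORY_DEVICE_HOST;
}

Anything that is not host memory would then fall through to a
type-specific path, such as the dma_map callback sketched above.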