Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
        id S1751675AbcLETPq (ORCPT ); Mon, 5 Dec 2016 14:15:46 -0500
Received: from quartz.orcorp.ca ([184.70.90.242]:57153 "EHLO quartz.orcorp.ca"
        rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
        id S1751455AbcLETPo (ORCPT ); Mon, 5 Dec 2016 14:15:44 -0500
Date: Mon, 5 Dec 2016 12:14:38 -0700
From: Jason Gunthorpe
To: Dan Williams
Cc: Logan Gunthorpe, Stephen Bates, Haggai Eran,
        "linux-kernel@vger.kernel.org", "linux-rdma@vger.kernel.org",
        "linux-nvdimm@ml01.01.org", "christian.koenig@amd.com",
        "Suravee.Suthikulpanit@amd.com", "John.Bridgman@amd.com",
        "Alexander.Deucher@amd.com", "Linux-media@vger.kernel.org",
        "dri-devel@lists.freedesktop.org", Max Gurtovoy,
        "linux-pci@vger.kernel.org", "serguei.sagalovitch@amd.com",
        "Paul.Blinzer@amd.com", "Felix.Kuehling@amd.com", "ben.sander@amd.com"
Subject: Re: Enabling peer to peer device transactions for PCIe devices
Message-ID: <20161205191438.GA20464@obsidianresearch.com>
References: <20161130162353.GA24639@obsidianresearch.com>
        <5f5b7989-84f5-737e-47c8-831f752d6280@deltatee.com>
        <61a2fb07344aacd81111449d222de66e.squirrel@webmail.raithlin.com>
        <20161205171830.GB27784@obsidianresearch.com>
        <20161205180231.GA28133@obsidianresearch.com>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To:
User-Agent: Mutt/1.5.24 (2015-08-30)
X-Broken-Reverse-DNS: no host name found for IP address 10.0.0.156
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1717
Lines: 34

On Mon, Dec 05, 2016 at 10:48:58AM -0800, Dan Williams wrote:
> On Mon, Dec 5, 2016 at 10:39 AM, Logan Gunthorpe wrote:
> > On 05/12/16 11:08 AM, Dan Williams wrote:
> >>
> >> I've already recommended that iopmem not be a block device and instead
> >> be a device-dax instance. I also don't think it should claim the PCI
> >> ID, rather the driver that wants to map one of its bars this way can
> >> register the memory region with the device-dax core.
> >>
> >> I'm not sure there are enough device drivers that want to do this to
> >> have it be a generic /sys/.../resource_dmableX capability. It still
> >> seems to be an exotic one-off type of configuration.
> >
> >
> > Yes, this is essentially my thinking. Except I think the userspace interface
> > should really depend on the device itself. Device dax is a good choice for
> > many and I agree the block device approach wouldn't be ideal.
> >
> > Specifically for NVME CMB: I think it would make a lot of sense to just hand
> > out these mappings with an mmap call on /dev/nvmeX. I expect CMB buffers
> > would be volatile and thus you wouldn't need to keep track of where in the
> > BAR the region came from. Thus, the mmap call would just be an allocator
> > from BAR memory. If device-dax were used, userspace would need to lookup
> > which device-dax instance corresponds to which nvme drive.
>
> I'm not opposed to mapping /dev/nvmeX. However, the lookup is trivial
> to accomplish in sysfs through /sys/dev/char to find the sysfs path
> of

But CMB sounds much more like the GPU case where there is a
specialized allocator handing out the BAR to consumers, so I'm not
sure a general purpose chardev makes a lot of sense?

Jason
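
For reference, the /sys/dev/char lookup mentioned in the quoted text can be sketched from userspace roughly as below. This is only an illustration, not code from the thread: the function names (chardev_key, sysfs_path_for_chardev) are made up for this example, and /dev/nvme0 in the usage note is just an assumed device node. It relies on the standard sysfs layout where /sys/dev/char/MAJOR:MINOR is a symlink into the device tree. What sits under the resulting directory (for instance, any device-dax instances) depends on the driver and is not shown here.

```python
import os
import stat


def chardev_key(rdev):
    """Format a device number as the MAJOR:MINOR name used in /sys/dev/char."""
    return "%d:%d" % (os.major(rdev), os.minor(rdev))


def sysfs_path_for_chardev(dev_node, sysfs_root="/sys"):
    """Resolve a character device node to its sysfs directory.

    Returns None when dev_node is not a character device or when sysfs
    has no matching entry (e.g. sysfs is not mounted).
    """
    st = os.stat(dev_node)
    if not stat.S_ISCHR(st.st_mode):
        return None
    # /sys/dev/char/MAJOR:MINOR is a symlink to the device's sysfs directory.
    link = os.path.join(sysfs_root, "dev", "char", chardev_key(st.st_rdev))
    if not os.path.islink(link):
        return None
    return os.path.realpath(link)
```

Calling sysfs_path_for_chardev("/dev/nvme0") on a machine with an NVMe controller should return that controller's directory under /sys/devices; the exact path varies by system.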