Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1752297AbcLFVrM (ORCPT ); Tue, 6 Dec 2016 16:47:12 -0500
Received: from ale.deltatee.com ([207.54.116.67]:50068 "EHLO ale.deltatee.com"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1750785AbcLFVrK (ORCPT ); Tue, 6 Dec 2016 16:47:10 -0500
To: Jason Gunthorpe
References: <61a2fb07344aacd81111449d222de66e.squirrel@webmail.raithlin.com>
	<20161205171830.GB27784@obsidianresearch.com>
	<20161205180231.GA28133@obsidianresearch.com>
	<20161206163850.GC28066@obsidianresearch.com>
	<20161206172838.GB19318@obsidianresearch.com>
Cc: Stephen Bates, Dan Williams, Haggai Eran,
	"linux-kernel@vger.kernel.org", "linux-rdma@vger.kernel.org",
	"linux-nvdimm@ml01.01.org", "christian.koenig@amd.com",
	"Suravee.Suthikulpanit@amd.com", "John.Bridgman@amd.com",
	"Alexander.Deucher@amd.com", "Linux-media@vger.kernel.org",
	"dri-devel@lists.freedesktop.org", Max Gurtovoy,
	"linux-pci@vger.kernel.org", "serguei.sagalovitch@amd.com",
	"Paul.Blinzer@amd.com", "Felix.Kuehling@amd.com", "ben.sander@amd.com"
From: Logan Gunthorpe
Message-ID:
Date: Tue, 6 Dec 2016 14:47:04 -0700
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:45.0) Gecko/20100101 Icedove/45.4.0
MIME-Version: 1.0
In-Reply-To: <20161206172838.GB19318@obsidianresearch.com>
Content-Type: text/plain; charset=windows-1252
Content-Transfer-Encoding: 7bit
X-SA-Exim-Connect-IP: 172.16.1.111
X-SA-Exim-Rcpt-To: ben.sander@amd.com, felix.kuehling@amd.com,
	paul.blinzer@amd.com, serguei.sagalovitch@amd.com,
	linux-pci@vger.kernel.org, maxg@mellanox.com,
	dri-devel@lists.freedesktop.org, linux-media@vger.kernel.org,
	alexander.deucher@amd.com, john.bridgman@amd.com,
	suravee.suthikulpanit@amd.com, christian.koenig@amd.com,
	linux-nvdimm@ml01.01.org, linux-rdma@vger.kernel.org,
	linux-kernel@vger.kernel.org, haggaie@mellanox.com,
	dan.j.williams@intel.com, sbates@raithlin.com,
	jgunthorpe@obsidianresearch.com
X-SA-Exim-Mail-From:
	logang@deltatee.com
Subject: Re: Enabling peer to peer device transactions for PCIe devices
X-SA-Exim-Version: 4.2.1 (built Mon, 26 Dec 2011 16:24:06 +0000)
X-SA-Exim-Scanned: Yes (on ale.deltatee.com)
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 1436
Lines: 38

Hey,

> Okay, so clearly this needs a kernel side NVMe specific allocator
> and locking so users don't step on each other..

Yup, ideally. That's why device dax isn't ideal for this application: it
doesn't provide any way to prevent users from stepping on each other.

> Or as Christoph says some kind of general mechanism to get these
> bounce buffers..

Yeah, I imagine a general allocate from BAR/region system would be very
useful.

> Ah, I see.
>
> As a first draft I'd stick with some kind of API built into the
> /dev/nvmeX that backs the filesystem. The user app would fstat the
> target file, open /dev/block/MAJOR(st_dev):MINOR(st_dev), do some
> ioctl to get a CMB mmap, and then proceed from there..
>
> When that is all working kernel-side, it would make sense to look at a
> more general mechanism that could be used unprivileged??

That makes a lot of sense to me. I suggested mmapping the char device
because it's really easy, but I can see that an ioctl on the block device
does seem more general and device agnostic.

> This is similar to the GPU issues too.. On NVMe you don't need to pin
> the pages, you just need to lock that VMA so it doesn't get freed from
> the NVMe CMB allocator while the IO is running...
> Probably in the long run the get_user_pages is going to have to be
> pushed down into drivers.. Future MMU coherent IO hardware also does
> not need the pinning or other overheads.

Yup. Yup.

Logan
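
For illustration only, the first step of the quoted draft (stat the target file, then derive /dev/block/MAJOR(st_dev):MINOR(st_dev)) could be sketched in userspace as below. The helper names backing_blockdev_path and backing_blockdev_for_file are made up for this sketch, and it uses stat() on a path where the quoted text says fstat on an open file. The later steps (the ioctl that would hand back a CMB mmap) are not shown as code because no such ioctl is defined in this thread; they appear only as comments.

```c
#include <stdio.h>
#include <sys/stat.h>
#include <sys/sysmacros.h>
#include <sys/types.h>

/*
 * Format "/dev/block/MAJOR:MINOR" for a device number.
 * Hypothetical helper; returns 0 on success, -1 if buf is too small.
 */
int backing_blockdev_path(dev_t dev, char *buf, size_t len)
{
    int n = snprintf(buf, len, "/dev/block/%u:%u",
                     (unsigned int)major(dev), (unsigned int)minor(dev));

    return (n < 0 || (size_t)n >= len) ? -1 : 0;
}

/*
 * Stat the target file and build the path of the block device that
 * st_dev refers to. Hypothetical helper; returns 0 on success, -1 on error.
 */
int backing_blockdev_for_file(const char *file, char *buf, size_t len)
{
    struct stat st;

    if (stat(file, &st) != 0)
        return -1;

    return backing_blockdev_path(st.st_dev, buf, len);
}

/*
 * Remaining steps from the quoted draft, NOT implemented here because the
 * interface does not exist yet:
 *   1. open() the path produced above;
 *   2. issue the (still undefined) ioctl to obtain a CMB mapping;
 *   3. mmap() the region and use it as the peer-to-peer buffer.
 */
```

A caller would pass the path of a file on the NVMe-backed filesystem and then open the resulting /dev/block/... node; everything past that point depends on the kernel-side API still under discussion.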