From: Logan Gunthorpe <logang@deltatee.com>
To: Sagi Grimberg, Christoph Hellwig, "James E.J. Bottomley",
 "Martin K. Petersen", Jens Axboe, Steve Wise, Stephen Bates,
 Max Gurtovoy, Dan Williams, Keith Busch, Jason Gunthorpe
Cc: linux-pci@vger.kernel.org, linux-scsi@vger.kernel.org,
 linux-nvme@lists.infradead.org, linux-rdma@vger.kernel.org,
 linux-nvdimm@ml01.01.org, linux-kernel@vger.kernel.org, Sinan Kaya
Subject: Re: [RFC 3/8] nvmet: Use p2pmem in nvme target
Date: Thu, 6 Apr 2017 09:52:36 -0600
Message-ID: <5f76a824-3918-b706-86bd-0ba455895185@deltatee.com>
In-Reply-To: <0689e764-bf04-6da2-3b7d-2cbf0b6b94a0@grimberg.me>
References: <1490911959-5146-1-git-send-email-logang@deltatee.com>
 <1490911959-5146-4-git-send-email-logang@deltatee.com>
 <0689e764-bf04-6da2-3b7d-2cbf0b6b94a0@grimberg.me>

Hey Sagi,

On 05/04/17 11:47 PM, Sagi Grimberg wrote:
> Because the user can get it wrong, and it's our job to do what we can
> in order to prevent the user from screwing itself.

Well, "screwing" themselves seems a bit strong. It wouldn't be much
different from a lot of other tunables in the system. For example, it
would be similar to the user choosing the wrong I/O scheduler for their
disk or workload. If you change this setting without measuring
performance, you probably don't care too much about the result anyway.

> I wasn't against it that much, I'm all for making things "just work"
> with minimal configuration steps, but I'm not sure we can get it
> right without it.

Ok, well in that case I may reconsider this in the next series.

>>>> Ideally, we'd want to use an NVMe CMB buffer as p2p memory. This
>>>> would save an extra PCI transfer as the NVMe card could just take
>>>> the data out of its own memory. However, at this time, cards with
>>>> CMB buffers don't seem to be available.
>>>
>>> Even if it was available, it would be hard to make real use of this
>>> given that we wouldn't know how to pre-post recv buffers (for
>>> in-capsule data). But let's leave this out of the scope entirely...
>>
>> I don't understand what you're referring to. We'd simply use the CMB
>> buffer as a p2pmem device, why does that change anything?
>
> I'm referring to the in-capsule data buffer pre-posts that we do.
> Because we prepare a buffer that would contain in-capsule data, we
> have no knowledge of which device the incoming I/O is directed to,
> which means we can (and will) have I/O where the data lies in the CMB
> of device A but is really targeted at device B - which sorta defeats
> the purpose of what we're trying to optimize here...

Well, the way I've had it is that each port gets one p2pmem device, so
you'd only want to put NVMe devices that will work with that p2pmem
device behind that port. Though I can see that being a difficult
restriction, seeing as it probably means you'll need one port per NVMe
device if you want to use the CMB buffer of each device. I'll have to
think about that some more.
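To make that a bit more concrete, the sort of thing I have in mind is a
single provider hanging off the port plus a compatibility check when a
namespace is added. This is only a rough sketch: the extra
nvmet_port/nvmet_ns members and the p2pmem_compatible() helper are
made-up placeholders for illustration, not the actual API from this
series.

/*
 * Rough sketch only -- the nvmet_port/nvmet_ns members and the
 * p2pmem_compatible() helper are placeholders and don't necessarily
 * match the names used in this patch set.
 */
struct nvmet_port {
        /* ... existing fields ... */
        struct p2pmem_dev       *p2pmem;  /* the one provider for this port */
};

static int nvmet_ns_check_p2pmem(struct nvmet_port *port,
                                 struct nvmet_ns *ns)
{
        if (!port->p2pmem)
                return 0;       /* port doesn't use p2pmem, nothing to check */

        /*
         * Only accept namespaces whose backing PCI device can actually
         * reach this port's p2pmem provider (e.g. sits behind the same
         * switch); otherwise reject the namespace up front.
         */
        if (!p2pmem_compatible(port->p2pmem, ns->pdev)) {
                pr_err("nvmet: %s is not compatible with the port's p2pmem device\n",
                       ns->device_path);
                return -EINVAL;
        }

        return 0;
}

The idea being that an incompatible namespace gets rejected when it's
enabled instead of silently having its data bounced through another
device's memory.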
Also, it's worth noting that we aren't even optimizing in-capsule data
at this time.

> Still the user can get it wrong. Not sure we can get away without
> keeping track of this as new devices join the subsystem.

Yeah, I understand. I'll have to think some more about all of this. I'm
starting to see some ways to improve things.

Thanks,

Logan