From: Suganath Prabu Subramani
Date: Thu, 31 Aug 2017 10:28:06 +0530
Subject: Re: [PATCH v4 00/14] mpt3sas driver NVMe support:
To: "Martin K. Petersen"
Cc: linux-scsi@vger.kernel.org, Sathya Prakash, Kashyap Desai, linux-kernel@vger.kernel.org, Chaitra Basappa, Sreekanth Reddy, linux-nvme@lists.infradead.org
References: <1503322344-5900-1-git-send-email-suganath-prabu.subramani@broadcom.com>

Hi Martin,

Replied inline.

Thanks,
Suganath Prabu S

On Thu, Aug 31, 2017 at 8:35 AM, Martin K. Petersen wrote:
>
> Hi Suganath,
>
>> Theoretically we want to use h/w capability (to translate IEEE to PRP)
>> for smaller IO size to leverage h/w capability.
>
> Nobody says we have to use the capability just because the hardware has
> it.
>
> Unlike some other operating systems, Linux will only submit I/Os to the
> driver that conform to the reported underlying constraints of the
> hardware.

In general, h/w constraints are handled. What we missed is the Fast Path
h/w, which is not exposed to the OS.

> I fail to understand how letting the HBA firmware translate an
> SGL to a PRP for a subset of I/Os could do anything but add latency.

Let me explain. NVMe device fast path is possible in two ways: IEEE SGL
and PRP SGL. Due to a h/w constraint we choose IEEE SGL only for smaller
IO sizes. Both of the above are true h/w Fast Path with no firmware
involvement.

> Plus complexity in the hot path of the driver.

Agree with you.
We are planning to see if we can keep only a simple Fast Path using only
PRP. It will take some time to finalize as we have to engage the h/w and
f/w teams. BTW - this area is h/w dependent and we do not see further
changes in this area.

>
>> - If the unmap translation in firmware is slow, why don't you translate
>> WRITE SAME w/ UNMAP set to DSM DEALLOCATE without requiring
>> applications to do encapsulated passthrough?
>
>> => As of now, current FW supports UNMAP command but not WRITE_SAME for
>> NVME drive. We did some experiment to convert UNMAP command in driver,
>> but that is not really giving any performance improvement.
>
> It is imperative that the common use case, Linux' discard
> infrastructure, is working correctly and is as performant as any
> application-driven passthrough workaround.
>
> Unlike SCSI-to-SATA translation you have the benefit of a 1:1 mapping
> between UNMAP and DEALLOCATE. I'm not even sure why there would be a
> significant performance penalty in the firmware?

I agree. Currently there is no performance issue for UNMAP translation
in FW.

>> We would like to continue with UNMAP (and all other non-read/write
>> commands) to be handled in FW.
>
> And yet patch 4 circumvents that statement by adding support for
> encapsulated commands to bypass the FW translation...

This path is not for performance reasons. Users want to interact with
the NVMe drive using native NVMe commands for management.

>
> --
> Martin K. Petersen	Oracle Linux Engineering