To: Keith Busch, Oliver
Cc: linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, linux-nvme@lists.infradead.org, linux-rdma@vger.kernel.org, linux-nvdimm@lists.01.org, linux-block@vger.kernel.org, Jens Axboe, Benjamin Herrenschmidt, Alex Williamson, Jérôme Glisse, Jason Gunthorpe, Bjorn Helgaas, Max Gurtovoy, Christoph Hellwig
References: <20180228234006.21093-1-logang@deltatee.com> <20180228234006.21093-8-logang@deltatee.com> <20180305160004.GA30975@localhost.localdomain>
From: Logan Gunthorpe
Message-ID: <3f56c76d-6a5c-7c2f-5442-c9209749b598@deltatee.com>
Date: Mon, 5 Mar 2018 10:10:41 -0700
In-Reply-To: <20180305160004.GA30975@localhost.localdomain>
Subject: Re: [PATCH v2 07/10] nvme-pci: Use PCI p2pmem subsystem to manage the CMB
X-Mailing-List: linux-kernel@vger.kernel.org

On 05/03/18 09:00 AM, Keith Busch wrote:
> On Mon, Mar 05, 2018 at 12:33:29PM +1100, Oliver wrote:
>> On Thu, Mar 1, 2018 at 10:40 AM, Logan Gunthorpe wrote:
>>> @@ -429,10 +429,7 @@ static void __nvme_submit_cmd(struct nvme_queue *nvmeq,
>>>  {
>>>         u16 tail = nvmeq->sq_tail;
>>
>>> -       if (nvmeq->sq_cmds_io)
>>> -               memcpy_toio(&nvmeq->sq_cmds_io[tail], cmd, sizeof(*cmd));
>>> -       else
>>> -               memcpy(&nvmeq->sq_cmds[tail], cmd, sizeof(*cmd));
>>> +       memcpy(&nvmeq->sq_cmds[tail], cmd, sizeof(*cmd));
>>
>> Hmm, how safe is replacing memcpy_toio() with regular memcpy()? On PPC
>> the _toio() variant enforces alignment, does the copy with 4 byte
>> stores, and has a full barrier after the copy. In comparison our
>> regular memcpy() does none of those things and may use unaligned and
>> vector load/stores. For normal (cacheable) memory that is perfectly
>> fine, but they can cause alignment faults when targeted at MMIO
>> (cache-inhibited) memory.
>>
>> I think in this particular case it might be ok since we know SQEs are
>> aligned to 64 byte boundaries and the copy is too small to use our
>> vectorised memcpy(). I'll assume we don't need explicit ordering
>> between writes of SQEs since the existing code doesn't seem to care
>> unless the doorbell is being rung, so you're probably fine there too.
>> That said, I still think this is a little bit sketchy and at the very
>> least you should add a comment explaining what's going on when the CMB
>> is being used. If someone more familiar with the NVMe driver could
>> chime in I would appreciate it.
>
> I may not be understanding the concern, but I'll give it a shot.
>
> You're right, the start of any SQE is always 64-byte aligned, so that
> should satisfy alignment requirements.
>
> The order when writing multiple/successive SQEs in a submission queue
> does matter, and this is currently serialized through the q_lock.
>
> The order in which the bytes of a single SQE is written doesn't really
> matter as long as the entire SQE is written into the CMB prior to writing
> that SQ's doorbell register.
>
> The doorbell register is written immediately after copying a command
> entry into the submission queue (ignore "shadow buffer" features),
> so the doorbells written to commands submitted is 1:1.
>
> If a CMB SQE and DB order is not enforced with the memcpy, then we do
> need a barrier after the SQE's memcpy and before the doorbell's writel.

Thanks for the information, Keith.

Adding to this: regular memcpy generally also enforces alignment as
unaligned access to regular memory is typically bad in some way on most
arches. The generic memcpy_toio also does not have any barrier as it is
just a call to memcpy. Arm64 also does not appear to have a barrier in
its implementation and in the short survey I did I could not find any
implementation with a barrier.
I also did not find a ppc implementation in the tree but it would be
weird for it to add a barrier when other arches do not appear to need it.

We've been operating on the assumption that memory mapped by
devm_memremap_pages() can be treated as regular memory. This is
emphasized by the fact that it does not return an __iomem pointer. If
this assumption does not hold for an arch then we cannot support P2P DMA
without an overhaul of many kernel interfaces or creating other backend
interfaces into the drivers which take different data types (i.e. we'd
have to bypass the entire block layer when trying to write data in
p2pmem to an nvme device). This is very undesirable.

Logan
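
P.S. To make the ordering requirement discussed above concrete (the whole
SQE must land in the queue before the doorbell is written), here is a
minimal user-space sketch in C11. It is not the nvme-pci code and not a
proposed patch: model_sqe, model_queue, model_submit_cmd and Q_DEPTH are
hypothetical names made up for illustration, the queue lives in ordinary
memory rather than a CMB, and atomic_thread_fence() only loosely stands
in for whatever barrier the kernel would need between the memcpy and the
doorbell's writel.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define SQE_SIZE 64	/* NVMe submission queue entries are 64 bytes */
#define Q_DEPTH  8	/* hypothetical queue depth for this model */

/* Hypothetical stand-in for struct nvme_command; only the size matters. */
struct model_sqe {
	uint8_t bytes[SQE_SIZE];
};

static_assert(sizeof(struct model_sqe) == SQE_SIZE, "SQE must be 64 bytes");

/* Hypothetical stand-in for struct nvme_queue. */
struct model_queue {
	/* Models the submission queue; every slot starts 64-byte aligned. */
	_Alignas(SQE_SIZE) struct model_sqe sq_cmds[Q_DEPTH];
	/* Models the SQ tail doorbell register. */
	_Atomic uint16_t doorbell;
	uint16_t sq_tail;
};

/*
 * Copy one command into the queue, then publish the new tail.
 * Assumption: the caller serializes submissions (the q_lock in the
 * discussion above), so successive SQEs are written in order.
 */
static void model_submit_cmd(struct model_queue *q,
			     const struct model_sqe *cmd)
{
	uint16_t tail = q->sq_tail;

	/* Byte order within a single SQE does not matter ... */
	memcpy(&q->sq_cmds[tail], cmd, sizeof(*cmd));

	/* ... as long as the whole SQE is visible before the doorbell. */
	atomic_thread_fence(memory_order_release);

	if (++tail == Q_DEPTH)
		tail = 0;

	/* One doorbell write per submitted command (1:1). */
	atomic_store_explicit(&q->doorbell, tail, memory_order_relaxed);
	q->sq_tail = tail;
}
```

A caller would zero-initialize a struct model_queue, fill a struct
model_sqe and call model_submit_cmd() once per command; after Q_DEPTH
submissions the modelled tail wraps back to 0. Whether a plain memcpy()
is acceptable for real CMB memory on a given arch is exactly the open
question in this thread, and this sketch does not settle it.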