Received: by 10.223.185.116 with SMTP id b49csp2741937wrg; Mon, 5 Mar 2018 08:02:08 -0800 (PST) X-Google-Smtp-Source: AG47ELvtbXihtDoq2sFRDS7tZKxGoRC0M+9O8c3aQH7/GFjYAOgXT5i4n7IttONAVqlHl69P9djj X-Received: by 2002:a17:902:bc3:: with SMTP id 61-v6mr13830555plr.398.1520265728343; Mon, 05 Mar 2018 08:02:08 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1520265728; cv=none; d=google.com; s=arc-20160816; b=lbdfTGpnwr2YHRcXxsvHqZ8m3fMfqYRq/Igpuz3SAozTza2jV5SdQ1JMe6ReI15m1Q kO99bRGVZbK7l0ZivWiPPyM/Yw8dbvxKzu0mll9Cy3xvEvblSoIvDvmBexRGS6gvPIuU uwqMrzT2pklZrRAHq5k4OJxsMRTbsleaIqLgmaxSeBH8KAHF8/ib4dclh7cmYRTXe0sE 9tU2Hb9b2E52hNNy1hOe20BBmj0WjsLJJXVcp9BZJJhy5sMjzAdWXfk4nDKFDCv0hChB 3RRjz8lFBy6FXn3wn5tE2EuKNR9zkoe1lRGg52Alb7fgGb+/E2Xtu0BAEKxfszDcTpC7 GDJQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date:arc-authentication-results; bh=4/UWRjmcQVT9+KdKY7up2vdSUDFbyHKdKMqiuMgVMeE=; b=ktK3GFSIJscga8Bph2r1/zzf2r9W/E4Tx6blFzN0s5wIlbljG+btanXtbyGFFSgmKZ barJ4ZJLQAr4PW37RSMjj20Okg2oPgiG50zFcRE3RwA7HaAzmCOXdeLhnZQY514M0jxv VQppdlK0gsWS7tEbwziUJtsDd6GIEU6pMb1/Rpr6dlvkNrMqizLaqi6pVCnnRqOT2ovj tXk3wEKAY9aFiwajqVvU2+uiK2wFmXQ/VBhd90x87Dso5NKxxt708tzjqAM8ykrZOlfI +9E4gxtUnTu3JART+Y5gbJohvPMcK/EyBm+5aTvE0GUuJyVCrTcpnJCmnL2ChuAupqBQ RUUg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id y64si7478758pgy.363.2018.03.05.08.01.52; Mon, 05 Mar 2018 08:02:08 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752401AbeCEQAG (ORCPT + 99 others); Mon, 5 Mar 2018 11:00:06 -0500 Received: from mga12.intel.com ([192.55.52.136]:7355 "EHLO mga12.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751144AbeCEQAE (ORCPT ); Mon, 5 Mar 2018 11:00:04 -0500 X-Amp-Result: UNKNOWN X-Amp-Original-Verdict: FILE UNKNOWN X-Amp-File-Uploaded: False Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by fmsmga106.fm.intel.com with ESMTP/TLS/DHE-RSA-AES256-GCM-SHA384; 05 Mar 2018 08:00:03 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.47,427,1515484800"; d="scan'208";a="208947363" Received: from unknown (HELO localhost.localdomain) ([10.232.112.44]) by fmsmga005.fm.intel.com with ESMTP; 05 Mar 2018 08:00:03 -0800 Date: Mon, 5 Mar 2018 09:00:05 -0700 From: Keith Busch To: Oliver Cc: Logan Gunthorpe , linux-kernel@vger.kernel.org, linux-pci@vger.kernel.org, linux-nvme@lists.infradead.org, linux-rdma@vger.kernel.org, "linux-nvdimm@lists.01.org" , linux-block@vger.kernel.org, Jens Axboe , Benjamin Herrenschmidt , Alex Williamson , =?iso-8859-1?B?Suly9G1l?= Glisse , Jason Gunthorpe , Bjorn Helgaas , Max Gurtovoy , Christoph Hellwig Subject: Re: [PATCH v2 07/10] nvme-pci: Use PCI p2pmem subsystem to manage the CMB Message-ID: <20180305160004.GA30975@localhost.localdomain> References: <20180228234006.21093-1-logang@deltatee.com> <20180228234006.21093-8-logang@deltatee.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.9.1 (2017-09-22) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Mar 05, 2018 at 12:33:29PM +1100, Oliver wrote: > On Thu, Mar 1, 2018 at 10:40 AM, Logan Gunthorpe wrote: > > @@ -429,10 +429,7 @@ static void __nvme_submit_cmd(struct nvme_queue *nvmeq, > > { > > u16 tail = nvmeq->sq_tail; > > > - if (nvmeq->sq_cmds_io) > > - memcpy_toio(&nvmeq->sq_cmds_io[tail], cmd, sizeof(*cmd)); > > - else > > - memcpy(&nvmeq->sq_cmds[tail], cmd, sizeof(*cmd)); > > + memcpy(&nvmeq->sq_cmds[tail], cmd, sizeof(*cmd)); > > Hmm, how safe is replacing memcpy_toio() with regular memcpy()? On PPC > the _toio() variant enforces alignment, does the copy with 4 byte > stores, and has a full barrier after the copy. In comparison our > regular memcpy() does none of those things and may use unaligned and > vector load/stores. For normal (cacheable) memory that is perfectly > fine, but they can cause alignment faults when targeted at MMIO > (cache-inhibited) memory. > > I think in this particular case it might be ok since we know SEQs are > aligned to 64 byte boundaries and the copy is too small to use our > vectorised memcpy(). I'll assume we don't need explicit ordering > between writes of SEQs since the existing code doesn't seem to care > unless the doorbell is being rung, so you're probably fine there too. > That said, I still think this is a little bit sketchy and at the very > least you should add a comment explaining what's going on when the CMB > is being used. If someone more familiar with the NVMe driver could > chime in I would appreciate it. I may not be understanding the concern, but I'll give it a shot. You're right, the start of any SQE is always 64-byte aligned, so that should satisfy alignment requirements. The order when writing multiple/successive SQEs in a submission queue does matter, and this is currently serialized through the q_lock. The order in which the bytes of a single SQE is written doesn't really matter as long as the entire SQE is written into the CMB prior to writing that SQ's doorbell register. The doorbell register is written immediately after copying a command entry into the submission queue (ignore "shadow buffer" features), so the doorbells written to commands submitted is 1:1. If a CMB SQE and DB order is not enforced with the memcpy, then we do need a barrier after the SQE's memcpy and before the doorbell's writel.