Date: Thu, 14 Dec 2023 15:37:09 +0100
From: Christoph Hellwig
To: John Garry
Cc: Christoph Hellwig, axboe@kernel.dk, kbusch@kernel.org, sagi@grimberg.me,
    jejb@linux.ibm.com, martin.petersen@oracle.com, djwong@kernel.org,
    viro@zeniv.linux.org.uk, brauner@kernel.org, dchinner@redhat.com,
    jack@suse.cz, linux-block@vger.kernel.org,
    linux-kernel@vger.kernel.org, linux-nvme@lists.infradead.org,
    linux-xfs@vger.kernel.org, linux-fsdevel@vger.kernel.org,
    tytso@mit.edu, jbongio@google.com, linux-scsi@vger.kernel.org,
    ming.lei@redhat.com, jaswin@linux.ibm.com, bvanassche@acm.org
Subject: Re: [PATCH v2 00/16] block atomic writes
Message-ID: <20231214143708.GA5331@lst.de>
References: <20231212110844.19698-1-john.g.garry@oracle.com>
    <20231212163246.GA24594@lst.de>
    <20231213154409.GA7724@lst.de>
In-Reply-To:
User-Agent: Mutt/1.5.17 (2007-11-01)
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline

On Wed, Dec 13, 2023 at 04:27:35PM +0000, John Garry wrote:
>>> Are there any patches yet for the change to always use SGLs for transfers
>>> larger than a single PRP?
>>
>> No.

Here is the WIP version.  With that you'd need to make atomic writes
conditional on !ctrl->need_virt_boundary.
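As a sketch of what that conditional could look like (illustration only:
the helper name and its placement are hypothetical, not part of this
patch or of the atomic writes series; the WIP patch itself follows
further below):

        /*
         * Hypothetical hook for wherever the atomic writes series sets
         * up the queue limits: skip advertising atomic writes when the
         * controller still needs the PRP virt boundary.  With that
         * boundary in place the block layer may split a bio that
         * violates it, and a split would break the atomicity guarantee
         * for multi-segment payloads.
         */
        static void nvme_update_atomic_write_limits(struct nvme_ns *ns)
        {
                if (ns->ctrl->need_virt_boundary)
                        return;         /* PRP-bound controller */

                /* derive and set the atomic write queue limits here */
        }

In the patch below, PCI controllers without SGL support and all RDMA
controllers keep the virt boundary, so a check like this covers both
cases.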
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 8ebdfd623e0f78..e04faffd6551fe 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1889,7 +1889,8 @@ static void nvme_set_queue_limits(struct nvme_ctrl *ctrl,
                 blk_queue_max_hw_sectors(q, ctrl->max_hw_sectors);
                 blk_queue_max_segments(q, min_t(u32, max_segments, USHRT_MAX));
         }
-        blk_queue_virt_boundary(q, NVME_CTRL_PAGE_SIZE - 1);
+        if (q == ctrl->admin_q || ctrl->need_virt_boundary)
+                blk_queue_virt_boundary(q, NVME_CTRL_PAGE_SIZE - 1);
         blk_queue_dma_alignment(q, 3);
         blk_queue_write_cache(q, vwc, vwc);
 }
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index e7411dac00f725..aa98794a3ec53d 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -262,6 +262,7 @@ enum nvme_ctrl_flags {
 struct nvme_ctrl {
         bool comp_seen;
         bool identified;
+        bool need_virt_boundary;
         enum nvme_ctrl_state state;
         spinlock_t lock;
         struct mutex scan_lock;
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 61af7ff1a9d6ba..a8d273b475cb40 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -60,8 +60,7 @@ MODULE_PARM_DESC(max_host_mem_size_mb,
 static unsigned int sgl_threshold = SZ_32K;
 module_param(sgl_threshold, uint, 0644);
 MODULE_PARM_DESC(sgl_threshold,
-                "Use SGLs when average request segment size is larger or equal to "
-                "this size. Use 0 to disable SGLs.");
+                "Use SGLs when > 0. Use 0 to disable SGLs.");
 
 #define NVME_PCI_MIN_QUEUE_SIZE 2
 #define NVME_PCI_MAX_QUEUE_SIZE 4095
@@ -504,23 +503,6 @@ static void nvme_commit_rqs(struct blk_mq_hw_ctx *hctx)
         spin_unlock(&nvmeq->sq_lock);
 }
 
-static inline bool nvme_pci_use_sgls(struct nvme_dev *dev, struct request *req,
-                int nseg)
-{
-        struct nvme_queue *nvmeq = req->mq_hctx->driver_data;
-        unsigned int avg_seg_size;
-
-        avg_seg_size = DIV_ROUND_UP(blk_rq_payload_bytes(req), nseg);
-
-        if (!nvme_ctrl_sgl_supported(&dev->ctrl))
-                return false;
-        if (!nvmeq->qid)
-                return false;
-        if (!sgl_threshold || avg_seg_size < sgl_threshold)
-                return false;
-        return true;
-}
-
 static void nvme_free_prps(struct nvme_dev *dev, struct request *req)
 {
         const int last_prp = NVME_CTRL_PAGE_SIZE / sizeof(__le64) - 1;
@@ -769,12 +751,14 @@ static blk_status_t nvme_setup_sgl_simple(struct nvme_dev *dev,
 static blk_status_t nvme_map_data(struct nvme_dev *dev, struct request *req,
                 struct nvme_command *cmnd)
 {
+        struct nvme_queue *nvmeq = req->mq_hctx->driver_data;
         struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
+        bool sgl_supported = nvme_ctrl_sgl_supported(&dev->ctrl) &&
+                nvmeq->qid && sgl_threshold;
         blk_status_t ret = BLK_STS_RESOURCE;
         int rc;
 
         if (blk_rq_nr_phys_segments(req) == 1) {
-                struct nvme_queue *nvmeq = req->mq_hctx->driver_data;
                 struct bio_vec bv = req_bvec(req);
 
                 if (!is_pci_p2pdma_page(bv.bv_page)) {
@@ -782,8 +766,7 @@ static blk_status_t nvme_map_data(struct nvme_dev *dev, struct request *req,
                                 return nvme_setup_prp_simple(dev, req,
                                                              &cmnd->rw, &bv);
 
-                        if (nvmeq->qid && sgl_threshold &&
-                            nvme_ctrl_sgl_supported(&dev->ctrl))
+                        if (sgl_supported)
                                 return nvme_setup_sgl_simple(dev, req,
                                                              &cmnd->rw, &bv);
                 }
@@ -806,7 +789,7 @@ static blk_status_t nvme_map_data(struct nvme_dev *dev, struct request *req,
                 goto out_free_sg;
         }
 
-        if (nvme_pci_use_sgls(dev, req, iod->sgt.nents))
+        if (sgl_supported)
                 ret = nvme_pci_setup_sgls(dev, req, &cmnd->rw);
         else
                 ret = nvme_pci_setup_prps(dev, req, &cmnd->rw);
@@ -3036,6 +3019,8 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
         result = nvme_init_ctrl_finish(&dev->ctrl, false);
         if (result)
                 goto out_disable;
+        if (!nvme_ctrl_sgl_supported(&dev->ctrl))
+                dev->ctrl.need_virt_boundary = true;
 
         nvme_dbbuf_dma_alloc(dev);
 
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 81e2621169e5d3..416a9fbcccfc74 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -838,6 +838,7 @@ static int nvme_rdma_configure_admin_queue(struct nvme_rdma_ctrl *ctrl,
         error = nvme_init_ctrl_finish(&ctrl->ctrl, false);
         if (error)
                 goto out_quiesce_queue;
+        ctrl->ctrl.need_virt_boundary = true;
 
         return 0;
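Note that with this patch sgl_threshold degenerates into an on/off
switch: any non-zero value now means "always use SGLs on I/O queues of
SGL-capable controllers", e.g. (example invocations, values arbitrary):

        modprobe nvme sgl_threshold=1   # always use SGLs where supported
        modprobe nvme sgl_threshold=0   # force PRPs even on capable controllers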