Date: Mon, 18 Dec 2023 15:50:16 -0700
From: Keith Busch
To: Christoph Hellwig
Cc: John Garry, axboe@kernel.dk, sagi@grimberg.me, jejb@linux.ibm.com,
	martin.petersen@oracle.com, djwong@kernel.org, viro@zeniv.linux.org.uk,
	brauner@kernel.org, dchinner@redhat.com, jack@suse.cz,
	linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-nvme@lists.infradead.org, linux-xfs@vger.kernel.org,
	linux-fsdevel@vger.kernel.org, tytso@mit.edu, jbongio@google.com,
	linux-scsi@vger.kernel.org, ming.lei@redhat.com, jaswin@linux.ibm.com,
	bvanassche@acm.org
Subject: Re: [PATCH v2 00/16] block atomic writes
References: <20231212110844.19698-1-john.g.garry@oracle.com>
	<20231212163246.GA24594@lst.de>
	<20231213154409.GA7724@lst.de>
	<20231214143708.GA5331@lst.de>
In-Reply-To: <20231214143708.GA5331@lst.de>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Dec 14, 2023 at 03:37:09PM +0100, Christoph Hellwig wrote:
> On Wed, Dec 13, 2023 at 04:27:35PM +0000, John Garry wrote:
> >>> Are there any patches yet for the change to always use SGLs for transfers
> >>> larger than a single PRP?
> >> No.
>
> Here is the WIP version. With that you'd need to make atomic writes
> conditional on !ctrl->need_virt_boundary.

This looks pretty good as-is!

> diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
> index 8ebdfd623e0f78..e04faffd6551fe 100644
> --- a/drivers/nvme/host/core.c
> +++ b/drivers/nvme/host/core.c
> @@ -1889,7 +1889,8 @@ static void nvme_set_queue_limits(struct nvme_ctrl *ctrl,
>  		blk_queue_max_hw_sectors(q, ctrl->max_hw_sectors);
>  		blk_queue_max_segments(q, min_t(u32, max_segments, USHRT_MAX));
>  	}
> -	blk_queue_virt_boundary(q, NVME_CTRL_PAGE_SIZE - 1);
> +	if (q == ctrl->admin_q || ctrl->need_virt_boundary)
> +		blk_queue_virt_boundary(q, NVME_CTRL_PAGE_SIZE - 1);
>  	blk_queue_dma_alignment(q, 3);
>  	blk_queue_write_cache(q, vwc, vwc);
>  }
> diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
> index e7411dac00f725..aa98794a3ec53d 100644
> --- a/drivers/nvme/host/nvme.h
> +++ b/drivers/nvme/host/nvme.h
> @@ -262,6 +262,7 @@ enum nvme_ctrl_flags {
>  struct nvme_ctrl {
>  	bool comp_seen;
>  	bool identified;
> +	bool need_virt_boundary;
>  	enum nvme_ctrl_state state;
>  	spinlock_t lock;
>  	struct mutex scan_lock;
> diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
> index 61af7ff1a9d6ba..a8d273b475cb40 100644
> --- a/drivers/nvme/host/pci.c
> +++ b/drivers/nvme/host/pci.c
> @@ -60,8 +60,7 @@ MODULE_PARM_DESC(max_host_mem_size_mb,
>  static unsigned int sgl_threshold = SZ_32K;
>  module_param(sgl_threshold, uint, 0644);
>  MODULE_PARM_DESC(sgl_threshold,
> -		"Use SGLs when average request segment size is larger or equal to "
> -		"this size. Use 0 to disable SGLs.");
> +		"Use SGLs when > 0. Use 0 to disable SGLs.");
>
>  #define NVME_PCI_MIN_QUEUE_SIZE 2
>  #define NVME_PCI_MAX_QUEUE_SIZE 4095
> @@ -504,23 +503,6 @@ static void nvme_commit_rqs(struct blk_mq_hw_ctx *hctx)
>  	spin_unlock(&nvmeq->sq_lock);
>  }
>
> -static inline bool nvme_pci_use_sgls(struct nvme_dev *dev, struct request *req,
> -				     int nseg)
> -{
> -	struct nvme_queue *nvmeq = req->mq_hctx->driver_data;
> -	unsigned int avg_seg_size;
> -
> -	avg_seg_size = DIV_ROUND_UP(blk_rq_payload_bytes(req), nseg);
> -
> -	if (!nvme_ctrl_sgl_supported(&dev->ctrl))
> -		return false;
> -	if (!nvmeq->qid)
> -		return false;
> -	if (!sgl_threshold || avg_seg_size < sgl_threshold)
> -		return false;
> -	return true;
> -}
> -
>  static void nvme_free_prps(struct nvme_dev *dev, struct request *req)
>  {
>  	const int last_prp = NVME_CTRL_PAGE_SIZE / sizeof(__le64) - 1;
> @@ -769,12 +751,14 @@ static blk_status_t nvme_setup_sgl_simple(struct nvme_dev *dev,
>  static blk_status_t nvme_map_data(struct nvme_dev *dev, struct request *req,
>  		struct nvme_command *cmnd)
>  {
> +	struct nvme_queue *nvmeq = req->mq_hctx->driver_data;
>  	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
> +	bool sgl_supported = nvme_ctrl_sgl_supported(&dev->ctrl) &&
> +			nvmeq->qid && sgl_threshold;
>  	blk_status_t ret = BLK_STS_RESOURCE;
>  	int rc;
>
>  	if (blk_rq_nr_phys_segments(req) == 1) {
> -		struct nvme_queue *nvmeq = req->mq_hctx->driver_data;
>  		struct bio_vec bv = req_bvec(req);
>
>  		if (!is_pci_p2pdma_page(bv.bv_page)) {
> @@ -782,8 +766,7 @@ static blk_status_t nvme_map_data(struct nvme_dev *dev, struct request *req,
>  				return nvme_setup_prp_simple(dev, req,
>  							     &cmnd->rw, &bv);
>
> -			if (nvmeq->qid && sgl_threshold &&
> -			    nvme_ctrl_sgl_supported(&dev->ctrl))
> +			if (sgl_supported)
>  				return nvme_setup_sgl_simple(dev, req,
>  							     &cmnd->rw, &bv);
>  		}
> @@ -806,7 +789,7 @@ static blk_status_t nvme_map_data(struct nvme_dev *dev, struct request *req,
>  		goto out_free_sg;
>  	}
>
> -	if (nvme_pci_use_sgls(dev, req, iod->sgt.nents))
> +	if (sgl_supported)
>  		ret = nvme_pci_setup_sgls(dev, req, &cmnd->rw);
>  	else
>  		ret = nvme_pci_setup_prps(dev, req, &cmnd->rw);
> @@ -3036,6 +3019,8 @@ static int nvme_probe(struct pci_dev *pdev, const struct pci_device_id *id)
>  	result = nvme_init_ctrl_finish(&dev->ctrl, false);
>  	if (result)
>  		goto out_disable;
> +	if (!nvme_ctrl_sgl_supported(&dev->ctrl))
> +		dev->ctrl.need_virt_boundary = true;
>
>  	nvme_dbbuf_dma_alloc(dev);
>
> diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
> index 81e2621169e5d3..416a9fbcccfc74 100644
> --- a/drivers/nvme/host/rdma.c
> +++ b/drivers/nvme/host/rdma.c
> @@ -838,6 +838,7 @@ static int nvme_rdma_configure_admin_queue(struct nvme_rdma_ctrl *ctrl,
>  	error = nvme_init_ctrl_finish(&ctrl->ctrl, false);
>  	if (error)
>  		goto out_quiesce_queue;
> +	ctrl->ctrl.need_virt_boundary = true;
>
>  	return 0;
>
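
For illustration only, here is a minimal userspace sketch of the gating suggested in the quoted text, where atomic writes are made conditional on !ctrl->need_virt_boundary. It is not kernel code. Every sketch_* name, the 4 KiB page size, and the simplified structs are hypothetical; only the need_virt_boundary flag and the "no SGL support means the boundary is needed" rule come from the patch. The boundary check reflects the usual meaning of a virt boundary mask (every segment but the first starts on a page boundary, every segment but the last ends on one), which is the layout PRPs require. The presumed reason for the gating is that a queue with a virt boundary may have to split a request, and a split write can no longer be atomic.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the controller page size (4 KiB assumed). */
#define SKETCH_CTRL_PAGE_SIZE 4096u

/* Hypothetical, simplified view of one DMA segment. */
struct sketch_seg {
	uint64_t addr;	/* bus address of the segment */
	uint32_t len;	/* length in bytes */
};

/* Hypothetical subset of controller state; only need_virt_boundary
 * corresponds to a field added by the patch. */
struct sketch_ctrl {
	bool sgl_supported;
	bool need_virt_boundary;
};

/* Mirrors the probe-time hunk: without SGL support the driver falls
 * back to PRPs, so the virt boundary has to stay in place. */
static void sketch_init_ctrl(struct sketch_ctrl *ctrl, bool sgl_supported)
{
	ctrl->sgl_supported = sgl_supported;
	ctrl->need_virt_boundary = !sgl_supported;
}

/* Returns true if the segment list can be described without violating
 * the virt boundary: no gaps inside a page between adjacent segments. */
static bool sketch_segs_respect_virt_boundary(const struct sketch_seg *segs,
					      size_t n)
{
	const uint64_t mask = SKETCH_CTRL_PAGE_SIZE - 1;

	for (size_t i = 0; i < n; i++) {
		/* Every segment except the first must start page-aligned. */
		if (i > 0 && (segs[i].addr & mask))
			return false;
		/* Every segment except the last must end page-aligned. */
		if (i + 1 < n && ((segs[i].addr + segs[i].len) & mask))
			return false;
	}
	return true;
}

/* The condition from the quoted text: only allow atomic writes when the
 * queue does not need a virt boundary. */
static bool sketch_atomic_writes_allowed(const struct sketch_ctrl *ctrl)
{
	return !ctrl->need_virt_boundary;
}
```

In this model a controller initialized with sgl_supported = true permits atomic writes, while one without SGL support does not, matching the nvme_probe() hunk. Fabrics transports such as RDMA, which set the flag unconditionally in the patch, would fall into the second case.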