From: Ming Lei
To: Jens Axboe, io-uring@vger.kernel.org, linux-block@vger.kernel.org
Cc: linux-kernel@vger.kernel.org, Miklos Szeredi, ZiyangZhang, Xiaoguang Wang, Bernd Schubert, Pavel Begunkov, Stefan Hajnoczi, Dan Williams, Ming Lei
Subject: [PATCH V6 13/17] block: ublk_drv: grab request reference when the request is handled by userspace
Date: Thu, 30 Mar 2023 19:36:26 +0800
Message-Id: <20230330113630.1388860-14-ming.lei@redhat.com>
In-Reply-To: <20230330113630.1388860-1-ming.lei@redhat.com>
References: <20230330113630.1388860-1-ming.lei@redhat.com>

Add a reference counter to the request pdu data, and hold this reference
for the request's lifetime. This way is always safe. In theory, the ublk
request won't be completed until its fused commands are done. However,
fused commands come from userspace, and an application can submit them
at will.

Prepare for supporting zero copy, which needs to retrieve request buffer
by fused command, so we have to guarantee:

- the fused command can't succeed unless the request isn't queued

- when any fused command is successful, this request can't be freed
  until all fused commands on this request are done.

Signed-off-by: Ming Lei
---
 drivers/block/ublk_drv.c | 67 ++++++++++++++++++++++++++++++++++++++--
 1 file changed, 64 insertions(+), 3 deletions(-)

diff --git a/drivers/block/ublk_drv.c b/drivers/block/ublk_drv.c
index cca0e95a89d8..0dc8eb04b9a5 100644
--- a/drivers/block/ublk_drv.c
+++ b/drivers/block/ublk_drv.c
@@ -43,6 +43,7 @@
 #include
 #include
 #include
+#include
 #include

 #define UBLK_MINORS (1U << MINORBITS)
@@ -62,6 +63,17 @@
 struct ublk_rq_data {
 	struct llist_node node;
 	struct callback_head work;
+
+	/*
+	 * Only for applying fused command to support zero copy:
+	 *
+	 * - if there is any fused command aiming at this request, not complete
+	 *   request until all fused commands are done
+	 *
+	 * - fused command has to fail unless this reference is grabbed
+	 *   successfully
+	 */
+	struct kref ref;
 };

 struct ublk_uring_cmd_pdu {
@@ -180,6 +192,9 @@ struct ublk_params_header {
 	__u32 types;
 };

+static inline void __ublk_complete_rq(struct request *req);
+static void ublk_complete_rq(struct kref *ref);
+
 static dev_t ublk_chr_devt;
 static struct class *ublk_chr_class;

@@ -288,6 +303,35 @@ static int ublk_apply_params(struct ublk_device *ub)
 	return 0;
 }

+static inline bool ublk_support_zc(const struct ublk_queue *ubq)
+{
+	return ubq->flags & UBLK_F_SUPPORT_ZERO_COPY;
+}
+
+static inline bool ublk_get_req_ref(const struct ublk_queue *ubq,
+		struct request *req)
+{
+	if (ublk_support_zc(ubq)) {
+		struct ublk_rq_data *data = blk_mq_rq_to_pdu(req);
+
+		return kref_get_unless_zero(&data->ref);
+	}
+
+	return true;
+}
+
+static inline void ublk_put_req_ref(const struct ublk_queue *ubq,
+		struct request *req)
+{
+	if (ublk_support_zc(ubq)) {
+		struct ublk_rq_data *data = blk_mq_rq_to_pdu(req);
+
+		kref_put(&data->ref, ublk_complete_rq);
+	} else {
+		__ublk_complete_rq(req);
+	}
+}
+
 static inline bool ublk_can_use_task_work(const struct ublk_queue *ubq)
 {
 	if (IS_BUILTIN(CONFIG_BLK_DEV_UBLK) &&
@@ -632,13 +676,19 @@ static inline bool ubq_daemon_is_dying(struct ublk_queue *ubq)
 }

 /* todo: handle partial completion */
-static void ublk_complete_rq(struct request *req)
+static inline void __ublk_complete_rq(struct request *req)
 {
 	struct ublk_queue *ubq = req->mq_hctx->driver_data;
 	struct ublk_io *io = &ubq->ios[req->tag];
 	unsigned int unmapped_bytes;
 	blk_status_t res = BLK_STS_OK;

+	/* called from ublk_abort_queue() code path */
+	if (io->flags & UBLK_IO_FLAG_ABORTED) {
+		res = BLK_STS_IOERR;
+		goto exit;
+	}
+
 	/* failed read IO if nothing is read */
 	if (!io->res && req_op(req) == REQ_OP_READ)
 		io->res = -EIO;
@@ -678,6 +728,15 @@ static void ublk_complete_rq(struct request *req)
 	blk_mq_end_request(req, res);
 }

+static void ublk_complete_rq(struct kref *ref)
+{
+	struct ublk_rq_data *data = container_of(ref, struct ublk_rq_data,
+			ref);
+	struct request *req = blk_mq_rq_from_pdu(data);
+
+	__ublk_complete_rq(req);
+}
+
 /*
  * Since __ublk_rq_task_work always fails requests immediately during
  * exiting, __ublk_fail_req() is only called from abort context during
@@ -696,7 +755,7 @@ static void __ublk_fail_req(struct ublk_queue *ubq, struct ublk_io *io,
 		if (ublk_queue_can_use_recovery_reissue(ubq))
 			blk_mq_requeue_request(req, false);
 		else
-			blk_mq_end_request(req, BLK_STS_IOERR);
+			ublk_put_req_ref(ubq, req);
 	}
 }

@@ -734,6 +793,7 @@ static inline void __ublk_rq_task_work(struct request *req,
 		unsigned issue_flags)
 {
 	struct ublk_queue *ubq = req->mq_hctx->driver_data;
+	struct ublk_rq_data *data = blk_mq_rq_to_pdu(req);
 	int tag = req->tag;
 	struct ublk_io *io = &ubq->ios[tag];
 	unsigned int mapped_bytes;
@@ -805,6 +865,7 @@ static inline void __ublk_rq_task_work(struct request *req,
 			mapped_bytes >> 9;
 	}

+	kref_init(&data->ref);
 	ubq_complete_io_cmd(io, UBLK_IO_RES_OK, issue_flags);
 }

@@ -1017,7 +1078,7 @@ static void ublk_commit_completion(struct ublk_device *ub,
 	req = blk_mq_tag_to_rq(ub->tag_set.tags[qid], tag);

 	if (req && likely(!blk_should_fake_timeout(req->q)))
-		ublk_complete_rq(req);
+		ublk_put_req_ref(ubq, req);
 }

 /*
--
2.39.2
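
For readers who want to try the lifetime rule outside the kernel, here is a minimal userspace sketch of the get-unless-zero / put pattern that the patch relies on. It is not the kernel's kref implementation: struct ref, ref_init(), ref_get_unless_zero(), ref_put() and complete_request() are hypothetical stand-ins built on C11 atomics. They only mirror the roles that kref_init(), kref_get_unless_zero(), kref_put() and ublk_complete_rq() play in the diff above.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for struct kref (assumption, not kernel code). */
struct ref {
	atomic_int count;
};

/* Counts how many times the release callback ran. */
static int completed;

/* Start the lifetime with one reference, held by the request itself. */
static void ref_init(struct ref *r)
{
	atomic_store(&r->count, 1);
}

/* Take a reference only while the count is still non-zero. */
static bool ref_get_unless_zero(struct ref *r)
{
	int old = atomic_load(&r->count);

	while (old != 0) {
		/* On failure, old is reloaded with the current count. */
		if (atomic_compare_exchange_weak(&r->count, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop one reference; call release() when the last one is dropped. */
static bool ref_put(struct ref *r, void (*release)(struct ref *))
{
	if (atomic_fetch_sub(&r->count, 1) == 1) {
		release(r);
		return true;
	}
	return false;
}

/* Stand-in for the completion callback: runs once per lifetime. */
static void complete_request(struct ref *r)
{
	(void)r;
	completed++;
}
```

In this sketch, a "fused command" would call ref_get_unless_zero() before touching the request and ref_put() when done. The request completes only when the final ref_put() runs, and any later ref_get_unless_zero() fails, which corresponds to the two guarantees listed in the commit message.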