Date: Mon, 4 Jul 2022 20:34:15 +0800
From: Ming Lei
To: Sagi Grimberg
Cc: Jens Axboe, linux-block@vger.kernel.org, Harris James R,
    linux-kernel@vger.kernel.org, io-uring@vger.kernel.org,
    Gabriel Krisman Bertazi, ZiyangZhang, Xiaoguang Wang,
    Stefan Hajnoczi, ming.lei@redhat.com
Subject: Re: [PATCH V3 1/1] ublk: add io_uring based userspace block driver
References: <20220628160807.148853-1-ming.lei@redhat.com>
    <20220628160807.148853-2-ming.lei@redhat.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jul 04, 2022 at 02:17:44PM +0300, Sagi Grimberg wrote:
>
> > This is the driver part of userspace block driver(ublk driver), the other
> > part is userspace daemon part(ublksrv)[1].
> >
> > The two parts communicate by io_uring's IORING_OP_URING_CMD with one
> > shared cmd buffer for storing io command, and the buffer is read only for
> > ublksrv, each io command is indexed by io request tag directly, and
> > is written by ublk driver.
> >
> > For example, when one READ io request is submitted to ublk block driver, ublk
> > driver stores the io command into cmd buffer first, then completes one
> > IORING_OP_URING_CMD for notifying ublksrv, and the URING_CMD is issued to
> > ublk driver beforehand by ublksrv for getting notification of any new io request,
> > and each URING_CMD is associated with one io request by tag.
> >
> > After ublksrv gets the io command, it translates and handles the ublk io
> > request, such as, for the ublk-loop target, ublksrv translates the request
> > into same request on another file or disk, like the kernel loop block
> > driver. In ublksrv's implementation, the io is still handled by io_uring,
> > and share same ring with IORING_OP_URING_CMD command. When the target io
> > request is done, the same IORING_OP_URING_CMD is issued to ublk driver for
> > both committing io request result and getting future notification of new
> > io request.
> >
> > Another thing done by ublk driver is to copy data between kernel io
> > request and ublksrv's io buffer:
> >
> > 1) before ublksrv handles WRITE request, copy the request's data into
> > ublksrv's userspace io buffer, so that ublksrv can handle the write
> > request
> >
> > 2) after ublksrv handles READ request, copy ublksrv's userspace io buffer
> > into this READ request, then ublk driver can complete the READ request
> >
> > Zero copy may be switched if mm is ready to support it.
> >
> > ublk driver doesn't handle any logic of the specific user space driver,
> > so it should be small/simple enough.
> >
> > [1] ublksrv
> >
> > https://github.com/ming1/ubdsrv
> >
> > Signed-off-by: Ming Lei
> > ---
> >  drivers/block/Kconfig         |    6 +
> >  drivers/block/Makefile        |    2 +
> >  drivers/block/ublk_drv.c      | 1603 +++++++++++++++++++++++++++++++++
> >  include/uapi/linux/ublk_cmd.h |  158 ++++
> >  4 files changed, 1769 insertions(+)
> >  create mode 100644 drivers/block/ublk_drv.c
> >  create mode 100644 include/uapi/linux/ublk_cmd.h
> >
> > diff --git a/drivers/block/Kconfig b/drivers/block/Kconfig
> > index fdb81f2794cd..d218089cdbec 100644
> > --- a/drivers/block/Kconfig
> > +++ b/drivers/block/Kconfig
> > @@ -408,6 +408,12 @@ config BLK_DEV_RBD
> >  	  If unsure, say N.
> > +config BLK_DEV_UBLK
> > +	bool "Userspace block driver"
>
> Really? Why compile this into the kernel and not tristate as a loadable
> module?

So far there is only one reason: task_work_add() is required, and it
isn't exported for modules.

Thanks,
Ming
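
For readers following the thread, the flow in the quoted patch description (a command buffer indexed by request tag, written by the driver and only read by the server, plus the two data copies for WRITE and READ) can be sketched in plain userspace C. This is only an illustration under stated assumptions: every name below (sketch_queue, driver_queue_rq, server_handle, driver_complete, sketch_roundtrip) is hypothetical and is not the ublk UAPI, and the io_uring notification step is reduced to a direct function call.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_QUEUE_DEPTH 4   /* number of request tags (assumed) */
#define SKETCH_NR_BLOCKS   4   /* size of the stand-in backing store */
#define SKETCH_BLOCK_SIZE  8   /* bytes per request in this sketch */

enum sketch_op { SKETCH_OP_READ, SKETCH_OP_WRITE };

/* One command slot per tag: written by the "driver", read by the "server". */
struct sketch_io_desc {
    enum sketch_op op;
    uint32_t block;
};

struct sketch_queue {
    struct sketch_io_desc cmd_buf[SKETCH_QUEUE_DEPTH];            /* shared cmd buffer */
    unsigned char io_buf[SKETCH_QUEUE_DEPTH][SKETCH_BLOCK_SIZE];  /* server io buffers */
    unsigned char backing[SKETCH_NR_BLOCKS][SKETCH_BLOCK_SIZE];   /* target file/disk */
};

/* Driver side: store the command in slot `tag`. For WRITE, the request's
 * data is copied into the server's io buffer before the server runs. */
static void driver_queue_rq(struct sketch_queue *q, unsigned tag,
                            enum sketch_op op, uint32_t block,
                            const unsigned char *payload)
{
    assert(tag < SKETCH_QUEUE_DEPTH);
    q->cmd_buf[tag].op = op;
    q->cmd_buf[tag].block = block;
    if (op == SKETCH_OP_WRITE)
        memcpy(q->io_buf[tag], payload, SKETCH_BLOCK_SIZE);
    /* A real driver would now complete the pending URING_CMD for this tag. */
}

/* Server side: look the command up by tag and apply it to the backing
 * store, loop-target style. Returns bytes handled or -1 on a bad block. */
static int server_handle(struct sketch_queue *q, unsigned tag)
{
    const struct sketch_io_desc *d;

    assert(tag < SKETCH_QUEUE_DEPTH);
    d = &q->cmd_buf[tag];
    if (d->block >= SKETCH_NR_BLOCKS)
        return -1;
    if (d->op == SKETCH_OP_WRITE)
        memcpy(q->backing[d->block], q->io_buf[tag], SKETCH_BLOCK_SIZE);
    else
        memcpy(q->io_buf[tag], q->backing[d->block], SKETCH_BLOCK_SIZE);
    return SKETCH_BLOCK_SIZE;
}

/* Driver side: commit the result. For READ, the server's io buffer is
 * copied into the request's destination before the request completes. */
static int driver_complete(struct sketch_queue *q, unsigned tag, int res,
                           unsigned char *dest)
{
    assert(tag < SKETCH_QUEUE_DEPTH);
    if (res > 0 && q->cmd_buf[tag].op == SKETCH_OP_READ)
        memcpy(dest, q->io_buf[tag], SKETCH_BLOCK_SIZE);
    return res;
}

/* Write one block under tag 1, then read it back under tag 3. */
int sketch_roundtrip(unsigned char out[SKETCH_BLOCK_SIZE])
{
    static const unsigned char payload[SKETCH_BLOCK_SIZE] =
        { 'u', 'b', 'l', 'k', 'd', 'e', 'm', 'o' };
    struct sketch_queue q;
    int res;

    memset(&q, 0, sizeof(q));
    driver_queue_rq(&q, 1, SKETCH_OP_WRITE, 2, payload);
    res = driver_complete(&q, 1, server_handle(&q, 1), NULL);
    if (res < 0)
        return res;
    driver_queue_rq(&q, 3, SKETCH_OP_READ, 2, NULL);
    return driver_complete(&q, 3, server_handle(&q, 3), out);
}
```

Calling sketch_roundtrip() with an 8-byte buffer runs one WRITE and one READ through the same three steps. In the real design each step boundary is an IORING_OP_URING_CMD completion or re-issue rather than a function call, and the same command both commits a result and waits for the next request.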