From: Yongji Xie
Date: Mon, 10 Jan 2022 23:24:40 +0800
Subject: Re: [PATCH v12 00/13] Introduce VDUSE - vDPA Device in Userspace
To: "Michael S. Tsirkin"
Cc: Jason Wang, Stefan Hajnoczi, Stefano Garzarella, Parav Pandit,
    Christoph Hellwig, Christian Brauner, Randy Dunlap, Matthew Wilcox,
    Al Viro, Jens Axboe, bcrl@kvack.org, Jonathan Corbet, Mika Penttilä,
    Dan Carpenter, joro@8bytes.org, Greg KH, He Zhe, Liu Xiaodong,
    Joe Perches, Robin Murphy, Will Deacon, John Garry,
    songmuchun@bytedance.com, virtualization, Netdev, kvm,
    linux-fsdevel@vger.kernel.org, iommu@lists.linux-foundation.org,
    linux-kernel
References: <20210830141737.181-1-xieyongji@bytedance.com>
    <20220110075546-mutt-send-email-mst@kernel.org>
    <20220110100911-mutt-send-email-mst@kernel.org>
In-Reply-To: <20220110100911-mutt-send-email-mst@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jan 10, 2022 at 11:10 PM Michael S. Tsirkin wrote:
>
> On Mon, Jan 10, 2022 at 09:54:08PM +0800, Yongji Xie wrote:
> > On Mon, Jan 10, 2022 at 8:57 PM Michael S. Tsirkin wrote:
> > >
> > > On Mon, Aug 30, 2021 at 10:17:24PM +0800, Xie Yongji wrote:
> > > > This series introduces a framework that makes it possible to implement
> > > > software-emulated vDPA devices in userspace. To make the device
> > > > emulation more secure, the emulated vDPA device's control path is handled
> > > > in the kernel and only the data path is implemented in userspace.
> > > >
> > > > Since the emulated vDPA device's control path is handled in the kernel,
> > > > a message mechanism is introduced to make userspace aware of
> > > > data-path-related changes. Userspace can use read()/write() to receive
> > > > and reply to the control messages.
> > > >
> > > > In the data path, the core task is mapping the DMA buffer into the VDUSE
> > > > daemon's address space, which can be implemented in different ways
> > > > depending on the vdpa bus to which the vDPA device is attached.
> > > >
> > > > In the virtio-vdpa case, we implement an MMU-based software IOTLB with a
> > > > bounce-buffering mechanism to achieve that. In the vhost-vdpa case, the
> > > > DMA buffer resides in a userspace memory region which can be shared with
> > > > the VDUSE userspace process by transferring the shmfd.
> > > >
> > > > The details and our use case are shown below:
> > > >
> > > > ------------------------    -------------------------   ----------------------------------------------
> > > > |            Container |    |              QEMU(VM) |   |                               VDUSE daemon |
> > > > |       ---------      |    |  -------------------  |   | ------------------------- ---------------- |
> > > > |       |dev/vdx|      |    |  |/dev/vhost-vdpa-x|  |   | | vDPA device emulation | | block driver | |
> > > > ------------+-----------     -----------+------------   -------------+----------------------+---------
> > > >             |                           |                            |                      |
> > > >             |                           |                            |                      |
> > > > ------------+---------------------------+----------------------------+----------------------+---------
> > > > |    | block device |           |  vhost device |            | vduse driver |          | TCP/IP |    |
> > > > |    -------+--------           --------+--------            -------+--------          -----+----    |
> > > > |           |                           |                           |                       |        |
> > > > | ----------+----------       ----------+-----------         -------+-------                |        |
> > > > | | virtio-blk driver |       |  vhost-vdpa driver |         | vdpa device |                |        |
> > > > | ----------+----------       ----------+-----------         -------+-------                |        |
> > > > |           |      virtio bus           |                           |                       |        |
> > > > |   --------+----+-----------           |                           |                       |        |
> > > > |                |                      |                           |                       |        |
> > > > |      ----------+----------            |                           |                       |        |
> > > > |      | virtio-blk device |            |                           |                       |        |
> > > > |      ----------+----------            |                           |                       |        |
> > > > |                |                      |                           |                       |        |
> > > > |     -----------+-----------           |                           |                       |        |
> > > > |     |  virtio-vdpa driver |           |                           |                       |        |
> > > > |     -----------+-----------           |                           |                       |        |
> > > > |                |                      |                           |    vdpa bus           |        |
> > > > |     -----------+----------------------+---------------------------+------------           |        |
> > > > |                                                                                        ---+---     |
> > > > -----------------------------------------------------------------------------------------| NIC |------
> > > >                                                                                          ---+---
> > > >                                                                                             |
> > > >                                                                                    ---------+---------
> > > >                                                                                    | Remote Storages |
> > > >                                                                                    -------------------
> > > >
> > > > We make use of it to implement a block device connecting to
> > > > our distributed storage, which can be used both in containers and
> > > > VMs. Thus, we can have a unified technology stack in these two cases.
> > > >
> > > > To test it with null-blk:
> > > >
> > > >   $ qemu-storage-daemon \
> > > >       --chardev socket,id=charmonitor,path=/tmp/qmp.sock,server,nowait \
> > > >       --monitor chardev=charmonitor \
> > > >       --blockdev driver=host_device,cache.direct=on,aio=native,filename=/dev/nullb0,node-name=disk0 \
> > > >       --export type=vduse-blk,id=test,node-name=disk0,writable=on,name=vduse-null,num-queues=16,queue-size=128
> > > >
> > > > The qemu-storage-daemon can be found at https://github.com/bytedance/qemu/tree/vduse
> > >
> > > It's been half a year - any plans to upstream this?
> >
> > Yeah, this is on my to-do list this month.
> >
> > Sorry for taking so long... I've been working on another project
> > enabling userspace RDMA with VDUSE for the past few months. So I
> > didn't have much time for this. Anyway, I will submit the first
> > version as soon as possible.
> >
> > Thanks,
> > Yongji
>
> Oh fun. You mean like virtio-rdma? Or RDMA as a backend for regular
> virtio?
>

Yes, like virtio-rdma. Then we can develop something like a userspace
rxe, siw, or a custom protocol with VDUSE.

Thanks,
Yongji