From: Yongji Xie
Date: Mon, 4 Jul 2022 18:02:26 +0800
Subject: Re: [PATCH 0/6] VDUSE: Support registering userspace memory as bounce buffer
To: Liu Xiaodong
Cc: "Michael S. Tsirkin", Jason Wang, Maxime Coquelin, Stefan Hajnoczi, virtualization, linux-kernel
References: <20220629082541.118-1-xieyongji@bytedance.com> <20220704092652.GB105370@storage2.sh.intel.com>
In-Reply-To: <20220704092652.GB105370@storage2.sh.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Xiaodong,

On Mon, Jul 4, 2022 at 5:27 PM Liu Xiaodong wrote:
>
> On Wed, Jun 29, 2022 at 04:25:35PM +0800, Xie Yongji wrote:
> > Hi all,
> >
> > This series introduces some new ioctls: VDUSE_IOTLB_GET_INFO,
> > VDUSE_IOTLB_REG_UMEM and VDUSE_IOTLB_DEREG_UMEM to support
> > registering and de-registering userspace memory for IOTLB
> > as bounce buffer in the virtio-vdpa case.
> >
> > The VDUSE_IOTLB_GET_INFO ioctl lets the user query IOTLB
> > information such as the bounce buffer size. The user can then
> > pass that information to the VDUSE_IOTLB_REG_UMEM and
> > VDUSE_IOTLB_DEREG_UMEM ioctls to register and de-register
> > userspace memory for IOTLB.
> >
> > During registering and de-registering, the DMA data in use
> > would be copied from kernel bounce pages to userspace bounce
> > pages and back.
> >
> > With this feature, some existing applications such as SPDK
> > and DPDK can leverage the datapath of VDUSE directly and
> > efficiently as discussed before [1]. They can register some
> > preallocated hugepages to VDUSE to avoid an extra memcpy
> > from bounce-buffer to hugepages.
>
> Hi, Yongji
>
> Very glad to see this enhancement in VDUSE. Thank you.
> It is really helpful and essential to SPDK.
> With this new feature, data transferred through VDUSE can be
> accessed directly by userspace physical backends, like RDMA
> and PCIe devices.
>
> On the SPDK roadmap, exporting block services to the local
> host is an important task, especially for container scenarios.
> This patch could help SPDK do that with its userspace
> backend stacks while keeping high efficiency and performance,
> so the whole SPDK ecosystem can benefit.
>
> Based on this enhancement, as discussed, I drafted a VDUSE
> prototype module in SPDK for initial evaluation:
> [TEST]vduse: prototype for initial draft
> https://review.spdk.io/gerrit/c/spdk/spdk/+/13534
>

Thanks for this nice work!

> Running SPDK on a single CPU core, configured with 2 P3700 NVMe
> devices, and exporting block devices to the local host kernel via
> different protocols, the randwrite IOPS through each protocol are:
>
>   Protocol            randwrite IOPS
>   NBD                 121K
>   NVMf-tcp loopback   274K
>   VDUSE               463K
>
> SPDK with RDMA backends should have a similar ratio.
> VDUSE has a great performance advantage for SPDK.
> We have been investigating this usage for years.
> Originally, some SPDK users used NBD. Then NVMf-tcp loopback
> became the way recommended by the SPDK community. In the future,
> VDUSE could be the preferred way.
>

Glad to see SPDK can benefit from this feature. I will continue to
improve this feature to make it available ASAP.

Thanks,
Yongji

> > The kernel and userspace code can be found on GitHub:
> >
> > https://github.com/bytedance/linux/tree/vduse-umem
> > https://github.com/bytedance/qemu/tree/vduse-umem
> >
> > To test it with qemu-storage-daemon:
> >
> > $ qemu-storage-daemon \
> >     --chardev socket,id=charmonitor,path=/tmp/qmp.sock,server=on,wait=off \
> >     --monitor chardev=charmonitor \
> >     --blockdev driver=host_device,cache.direct=on,aio=native,filename=/dev/nullb0,node-name=disk0 \
> >     --export type=vduse-blk,id=vduse-test,name=vduse-test,node-name=disk0,writable=on
> >
> > [1] https://lkml.org/lkml/2021/6/27/318
> >
> > Please review, thanks!
>
> Waiting for its review process.
>
> Thanks
> Xiaodong
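
A small sketch of the speedups implied by the three randwrite figures quoted in this message (NBD 121K, NVMf-tcp loopback 274K, VDUSE 463K). It uses only those numbers; the names IOPS_K and speedup are illustrative and come from neither SPDK nor the patch series.

```python
# Randwrite IOPS quoted in the message, in thousands, measured with SPDK
# on a single CPU core. No other data is assumed.
IOPS_K = {"NBD": 121, "NVMf-tcp loopback": 274, "VDUSE": 463}


def speedup(target: str, baseline: str, results: dict = IOPS_K) -> float:
    """Return how many times the IOPS of `target` exceed those of `baseline`."""
    return results[target] / results[baseline]


if __name__ == "__main__":
    for baseline in ("NBD", "NVMf-tcp loopback"):
        print(f"VDUSE vs {baseline}: {speedup('VDUSE', baseline):.2f}x")
```

On these figures VDUSE delivers roughly 3.8 times the IOPS of NBD and roughly 1.7 times that of NVMf-tcp loopback. Whether the RDMA-backed case shows a similar ratio is the expectation stated in the message, not something this arithmetic measures.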