Message-ID: <3abb1be9-b12c-e658-0391-8461e28f1b33@redhat.com>
Date: Thu, 18 Aug 2022 12:28:48 +0800
Subject: Re: [PATCH 0/6] virtio/vsock: introduce dgrams, sk_buff, and qdisc
From: Jason Wang
To: "Michael S. Tsirkin", Bobby Eshleman
Cc: Bobby Eshleman, Bobby Eshleman, Cong Wang, Jiang Wang, Stefan Hajnoczi, Stefano Garzarella, "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni, "K. Y. Srinivasan", Haiyang Zhang, Stephen Hemminger, Wei Liu, Dexuan Cui, kvm@vger.kernel.org, virtualization@lists.linux-foundation.org, netdev@vger.kernel.org, linux-kernel@vger.kernel.org, linux-hyperv@vger.kernel.org
References: <20220817025250-mutt-send-email-mst@kernel.org>
In-Reply-To: <20220817025250-mutt-send-email-mst@kernel.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On 2022/8/17 14:54, Michael S. Tsirkin wrote:
> On Mon, Aug 15, 2022 at 10:56:03AM -0700, Bobby Eshleman wrote:
>> Hey everybody,
>>
>> This series introduces datagrams, packet scheduling, and sk_buff usage
>> to virtio vsock.
>>
>> The usage of struct sk_buff benefits users by a) preparing vsock to use
>> other related systems that require sk_buff, such as sockmap and qdisc,
>> b) supporting basic congestion control via sock_alloc_send_skb, and c)
>> reducing copying when delivering packets to TAP.
>>
>> The socket layer no longer forces errors to be -ENOMEM, as typically
>> userspace expects -EAGAIN when the sk_sndbuf threshold is reached and
>> messages are being sent with option MSG_DONTWAIT.
>>
>> The datagram work is based off previous patches by Jiang Wang[1].
>>
>> The introduction of datagrams creates a transport layer fairness issue
>> where datagrams may freely starve streams of queue access.
>> This happens because, unlike streams, datagrams lack the transactions
>> necessary for calculating credits and throttling.
>>
>> Previous proposals introduce changes to the spec to add an additional
>> virtqueue pair for datagrams[1]. Although this solution works, using
>> Linux's qdisc for packet scheduling leverages already existing systems,
>> avoids the need to change the virtio specification, and gives additional
>> capabilities. The usage of SFQ or fq_codel, for example, may solve the
>> transport layer starvation problem. It is easy to imagine other use
>> cases as well. For example, services of varying importance may be
>> assigned different priorities, and qdisc will apply appropriate
>> priority-based scheduling. By default, the system default pfifo qdisc is
>> used. The qdisc may be bypassed and legacy queuing is resumed by simply
>> setting the virtio-vsock%d network device to state DOWN. This technique
>> still allows vsock to work with zero-configuration.
> The basic question to answer then is this: with a net device qdisc
> etc in the picture, how is this different from virtio net then?
> Why do you still want to use vsock?

Or maybe it's time to revisit an old idea[1] and unify at least the
driver part (e.g. use the virtio-net driver for vsock, so that we get
all the features that vsock is lacking now)?

Thanks

[1] https://lists.linuxfoundation.org/pipermail/virtualization/2018-November/039783.html

>
>> In summary, this series introduces these major changes to vsock:
>>
>> - virtio vsock supports datagrams
>> - virtio vsock uses struct sk_buff instead of virtio_vsock_pkt
>>   - Because virtio vsock uses sk_buff, it also uses sock_alloc_send_skb,
>>     which applies the throttling threshold sk_sndbuf.
>> - The vsock socket layer supports returning errors other than -ENOMEM.
>>   - This is used to return -EAGAIN when the sk_sndbuf threshold is
>>     reached.
>> - virtio vsock uses a net_device, through which qdisc may be used.
>>   - qdisc allows scheduling policies to be applied to vsock flows.
>>   - Some qdiscs, like SFQ, may allow vsock to avoid transport layer
>>     congestion. That is, it may avoid datagrams from flooding out
>>     stream flows. The benefit to this is that additional virtqueues
>>     are not needed for datagrams.
>> - The net_device and qdisc is bypassed by simply setting the
>>   net_device state to DOWN.
>>
>> [1]: https://lore.kernel.org/all/20210914055440.3121004-1-jiang.wang@bytedance.com/
>>
>> Bobby Eshleman (5):
>>   vsock: replace virtio_vsock_pkt with sk_buff
>>   vsock: return errors other than -ENOMEM to socket
>>   vsock: add netdev to vhost/virtio vsock
>>   virtio/vsock: add VIRTIO_VSOCK_F_DGRAM feature bit
>>   virtio/vsock: add support for dgram
>>
>> Jiang Wang (1):
>>   vsock_test: add tests for vsock dgram
>>
>>  drivers/vhost/vsock.c                   | 238 ++++----
>>  include/linux/virtio_vsock.h            |  73 ++-
>>  include/net/af_vsock.h                  |   2 +
>>  include/uapi/linux/virtio_vsock.h       |   2 +
>>  net/vmw_vsock/af_vsock.c                |  30 +-
>>  net/vmw_vsock/hyperv_transport.c        |   2 +-
>>  net/vmw_vsock/virtio_transport.c        | 237 +++++---
>>  net/vmw_vsock/virtio_transport_common.c | 771 ++++++++++++++++--------
>>  net/vmw_vsock/vmci_transport.c          |   9 +-
>>  net/vmw_vsock/vsock_loopback.c          |  51 +-
>>  tools/testing/vsock/util.c              | 105 ++++
>>  tools/testing/vsock/util.h              |   4 +
>>  tools/testing/vsock/vsock_test.c        | 195 ++++++
>>  13 files changed, 1176 insertions(+), 543 deletions(-)
>>
>> --
>> 2.35.1
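
For readers following the -EAGAIN point in the cover letter, here is a
minimal userspace sketch of what "userspace expects -EAGAIN with
MSG_DONTWAIT" looks like in practice. It is not taken from the series.
It assumes Linux and uses an AF_UNIX datagram socketpair as a stand-in,
since vsock datagrams need the patches under discussion. The helper
names try_send and fill_until_eagain are hypothetical.

```python
import socket


def try_send(sock, payload):
    """Attempt one non-blocking send.

    Returns True if the datagram was queued, False if the kernel reported
    EAGAIN/EWOULDBLOCK (Python surfaces that as BlockingIOError). Any other
    error, such as ENOMEM, propagates as OSError.
    """
    try:
        sock.send(payload, socket.MSG_DONTWAIT)
        return True
    except BlockingIOError:
        # The send buffer or the peer's queue is full: back off and retry later.
        return False


def fill_until_eagain(sock, payload, limit=100_000):
    """Send until the socket pushes back; return how many sends succeeded.

    'limit' is only a safety cap so the loop cannot run forever.
    """
    sent = 0
    while sent < limit and try_send(sock, payload):
        sent += 1
    return sent


if __name__ == "__main__":
    # Stand-in transport (assumption): a local datagram socketpair.
    a, b = socket.socketpair(socket.AF_UNIX, socket.SOCK_DGRAM)
    n = fill_until_eagain(a, b"x" * 1024)
    print(f"queued {n} datagrams before the socket reported EAGAIN")
    a.close()
    b.close()
```

A caller that sees False would typically wait for the socket to become
writable (poll/select) before retrying, which is the behaviour an
-ENOMEM-only error path makes hard to distinguish from a real failure.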