Date: Fri, 1 Dec 2023 10:48:50 +0100
From: Stefano Garzarella
To: Arseniy Krasnov
Cc: "Michael S. Tsirkin", Stefan Hajnoczi, "David S. Miller",
 Eric Dumazet, Jakub Kicinski, Paolo Abeni, Jason Wang, Bobby Eshleman,
 kvm@vger.kernel.org, virtualization@lists.linux-foundation.org,
 netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
 kernel@sberdevices.ru, oxffffaa@gmail.com
Subject: Re: [PATCH net-next v5 2/3] virtio/vsock: send credit update during setting SO_RCVLOWAT

On Fri, Dec 01, 2023 at 11:35:56AM +0300, Arseniy Krasnov wrote:
>
>
>On 01.12.2023 11:27, Stefano Garzarella wrote:
>> On Thu, Nov 30, 2023 at 12:40:43PM -0500, Michael S. Tsirkin wrote:
>>> On Thu, Nov 30, 2023 at 03:11:19PM +0100, Stefano Garzarella wrote:
>>>> On Thu, Nov 30, 2023 at 08:58:58AM -0500, Michael S. Tsirkin wrote:
>>>> > On Thu, Nov 30, 2023 at 04:43:34PM +0300, Arseniy Krasnov wrote:
>>>> > >
>>>> > >
>>>> > > On 30.11.2023 16:42, Michael S. Tsirkin wrote:
>>>> > > > On Thu, Nov 30, 2023 at 04:08:39PM +0300, Arseniy Krasnov wrote:
>>>> > > >> Send credit update message when SO_RCVLOWAT is updated and it is bigger
>>>> > > >> than number of bytes in rx queue. It is needed, because 'poll()' will
>>>> > > >> wait until number of bytes in rx queue will be not smaller than
>>>> > > >> SO_RCVLOWAT, so kick sender to send more data. Otherwise mutual hungup
>>>> > > >> for tx/rx is possible: sender waits for free space and receiver is
>>>> > > >> waiting data in 'poll()'.
>>>> > > >>
>>>> > > >> Signed-off-by: Arseniy Krasnov
>>>> > > >> ---
>>>> > > >>  Changelog:
>>>> > > >>  v1 -> v2:
>>>> > > >>   * Update commit message by removing 'This patch adds XXX' manner.
>>>> > > >>   * Do not initialize 'send_update' variable - set it directly during
>>>> > > >>     first usage.
>>>> > > >>  v3 -> v4:
>>>> > > >>   * Fit comment in 'virtio_transport_notify_set_rcvlowat()' to 80 chars.
>>>> > > >>  v4 -> v5:
>>>> > > >>   * Do not change callbacks order in transport structures.
>>>> > > >>
>>>> > > >>  drivers/vhost/vsock.c                   |  1 +
>>>> > > >>  include/linux/virtio_vsock.h            |  1 +
>>>> > > >>  net/vmw_vsock/virtio_transport.c        |  1 +
>>>> > > >>  net/vmw_vsock/virtio_transport_common.c | 27 +++++++++++++++++++++++++
>>>> > > >>  net/vmw_vsock/vsock_loopback.c          |  1 +
>>>> > > >>  5 files changed, 31 insertions(+)
>>>> > > >>
>>>> > > >> diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
>>>> > > >> index f75731396b7e..4146f80db8ac 100644
>>>> > > >> --- a/drivers/vhost/vsock.c
>>>> > > >> +++ b/drivers/vhost/vsock.c
>>>> > > >> @@ -451,6 +451,7 @@ static struct virtio_transport vhost_transport = {
>>>> > > >>          .notify_buffer_size       = virtio_transport_notify_buffer_size,
>>>> > > >>
>>>> > > >>          .read_skb = virtio_transport_read_skb,
>>>> > > >> +        .notify_set_rcvlowat      = virtio_transport_notify_set_rcvlowat
>>>> > > >>      },
>>>> > > >>
>>>> > > >>      .send_pkt = vhost_transport_send_pkt,
>>>> > > >> diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
>>>> > > >> index ebb3ce63d64d..c82089dee0c8 100644
>>>> > > >> --- a/include/linux/virtio_vsock.h
>>>> > > >> +++ b/include/linux/virtio_vsock.h
>>>> > > >> @@ -256,4 +256,5 @@ void virtio_transport_put_credit(struct virtio_vsock_sock *vvs, u32 credit);
>>>> > > >>  void virtio_transport_deliver_tap_pkt(struct sk_buff *skb);
>>>> > > >>  int virtio_transport_purge_skbs(void *vsk, struct sk_buff_head *list);
>>>> > > >>  int virtio_transport_read_skb(struct vsock_sock *vsk, skb_read_actor_t read_actor);
>>>> > > >> +int virtio_transport_notify_set_rcvlowat(struct vsock_sock *vsk, int val);
>>>> > > >>  #endif /* _LINUX_VIRTIO_VSOCK_H */
>>>> > > >> diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
>>>> > > >> index af5bab1acee1..8007593a3a93 100644
>>>> > > >> --- a/net/vmw_vsock/virtio_transport.c
>>>> > > >> +++ b/net/vmw_vsock/virtio_transport.c
>>>> > > >> @@ -539,6 +539,7 @@ static struct virtio_transport virtio_transport = {
>>>> > > >>          .notify_buffer_size       = virtio_transport_notify_buffer_size,
>>>> > > >>
>>>> > > >>          .read_skb = virtio_transport_read_skb,
>>>> > > >> +        .notify_set_rcvlowat      = virtio_transport_notify_set_rcvlowat
>>>> > > >>      },
>>>> > > >>
>>>> > > >>      .send_pkt = virtio_transport_send_pkt,
>>>> > > >> diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
>>>> > > >> index f6dc896bf44c..1cb556ad4597 100644
>>>> > > >> --- a/net/vmw_vsock/virtio_transport_common.c
>>>> > > >> +++ b/net/vmw_vsock/virtio_transport_common.c
>>>> > > >> @@ -1684,6 +1684,33 @@ int virtio_transport_read_skb(struct vsock_sock *vsk, skb_read_actor_t recv_acto
>>>> > > >>  }
>>>> > > >>  EXPORT_SYMBOL_GPL(virtio_transport_read_skb);
>>>> > > >>
>>>> > > >> +int virtio_transport_notify_set_rcvlowat(struct vsock_sock *vsk, int val)
>>>> > > >> +{
>>>> > > >> +    struct virtio_vsock_sock *vvs = vsk->trans;
>>>> > > >> +    bool send_update;
>>>> > > >> +
>>>> > > >> +    spin_lock_bh(&vvs->rx_lock);
>>>> > > >> +
>>>> > > >> +    /* If number of available bytes is less than new SO_RCVLOWAT value,
>>>> > > >> +     * kick sender to send more data, because sender may sleep in its
>>>> > > >> +     * 'send()' syscall waiting for enough space at our side.
>>>> > > >> +     */
>>>> > > >> +    send_update = vvs->rx_bytes < val;
>>>> > > >> +
>>>> > > >> +    spin_unlock_bh(&vvs->rx_lock);
>>>> > > >> +
>>>> > > >> +    if (send_update) {
>>>> > > >> +        int err;
>>>> > > >> +
>>>> > > >> +        err = virtio_transport_send_credit_update(vsk);
>>>> > > >> +        if (err < 0)
>>>> > > >> +            return err;
>>>> > > >> +    }
>>>> > > >> +
>>>> > > >> +    return 0;
>>>> > > >> +}
>>>> > > >
>>>> > > >
>>>> > > > I find it strange that this will send a credit update
>>>> > > > even if nothing changed since this was called previously.
>>>> > > > I'm not sure whether this is a problem protocol-wise,
>>>> > > > but it certainly was not envisioned when the protocol was
>>>> > > > built. WDYT?
>>>> > >
>>>> > > From virtio spec I found:
>>>> > >
>>>> > > It is also valid to send a VIRTIO_VSOCK_OP_CREDIT_UPDATE packet without previously receiving a
>>>> > > VIRTIO_VSOCK_OP_CREDIT_REQUEST packet. This allows communicating updates any time a change
>>>> > > in buffer space occurs.
>>>> > > So I guess there is no limitations to send such type of packet, e.g. it is not
>>>> > > required to be a reply for some another packet. Please, correct me if im wrong.
>>>> > >
>>>> > > Thanks, Arseniy
>>>> >
>>>> >
>>>> > Absolutely. My point was different - with this patch it is possible
>>>> > that you are not adding any credits at all since the previous
>>>> > VIRTIO_VSOCK_OP_CREDIT_UPDATE.
>>>>
>>>> I think the problem we're solving here is that since as an optimization we
>>>> avoid sending the update for every byte we consume, but we put a threshold,
>>>> then we make sure we update the peer.
>>>>
>>>> A credit update contains a snapshot and sending it the same as the previous
>>>> one should not create any problem.
>>>
>>> Well it consumes a buffer on the other side.
>>
>> Sure, but we are already speculating by not updating the other side when
>> we consume bytes before a certain threshold. This already avoids to
>> consume many buffers.
>>
>> Here we're only sending it once, when the user sets RCVLOWAT, so
>> basically I expect it won't affect performance.
>
>Moreover I think in practice setting RCVLOWAT is rare case, while this patch
>fixes real problem I guess
>
>
>>
>>>
>>>> My doubt now is that we only do this when we set RCVLOWAT , should we also
>>>> do something when we consume bytes to avoid the optimization we have?
>>>>
>>>> Stefano
>>>
>>> Isn't this why we have credit request?
>>
>> Yep, but in practice we never use it. It would also consume 2 buffers,
>> one at the transmitter and one at the receiver.
>>
>> However I agree that maybe we should start using it before we decide not
>> to send any more data.
>>
>> To be compatible with older devices, though, I think for now we also
>> need to send a credit update when the bytes in the receive queue are
>> less than RCVLOWAT, as Arseniy proposed in the other series.
>
>Looks like (in theory of course), that credit request is considered to be
>paired with credit update. While current usage of credit update is something
>like ACK packet in TCP, e.g. telling peer that we are ready to receive more
>data.

I honestly don't know what the original author's choice was, but I think
we reduce latency this way.

Effectively though, if we never sent a credit update when consuming bytes
and always left it up to the transmitter to ask for an update before
transmitting, we would save even more buffers than with the optimization
we have now, but the latency might grow a lot.

Stefano
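
For anyone following the thread, the stall and the effect of the patch can be sketched with a toy user-space model. This is only an illustration under assumptions, not the kernel code: the struct and function names (toy_rx, toy_tx, toy_*) are hypothetical, and only the buf_alloc/fwd_cnt counters and the credit formula follow the virtio vsock spec. The rx_bytes < val check mirrors the one in the quoted patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Toy model, NOT kernel code. All toy_* names are hypothetical. */
struct toy_rx {
	uint32_t buf_alloc; /* receive buffer size advertised to the peer */
	uint32_t fwd_cnt;   /* total bytes consumed by the application */
	uint32_t rx_bytes;  /* bytes queued and not yet read */
};

struct toy_tx {
	uint32_t tx_cnt;         /* total bytes sent so far */
	uint32_t peer_buf_alloc; /* last buf_alloc seen from the peer */
	uint32_t peer_fwd_cnt;   /* last fwd_cnt seen from the peer */
};

/* Free space the sender believes the receiver has (spec formula). */
static uint32_t toy_tx_credit(const struct toy_tx *tx)
{
	return tx->peer_buf_alloc - (tx->tx_cnt - tx->peer_fwd_cnt);
}

/* A credit update carries a snapshot of the receiver's counters. */
static void toy_send_credit_update(const struct toy_rx *rx, struct toy_tx *tx)
{
	tx->peer_buf_alloc = rx->buf_alloc;
	tx->peer_fwd_cnt = rx->fwd_cnt;
}

/* Application reads n bytes; models the case where the consumed amount
 * stays below the update threshold, so no credit update is sent.
 */
static void toy_rx_consume(struct toy_rx *rx, uint32_t n)
{
	rx->rx_bytes -= n;
	rx->fwd_cnt += n;
}

/* Same condition as the quoted patch: if fewer bytes are queued than the
 * new SO_RCVLOWAT, kick the sender. Returns true if an update was sent.
 */
static bool toy_set_rcvlowat(const struct toy_rx *rx, struct toy_tx *tx,
			     uint32_t val)
{
	if (rx->rx_bytes < val) {
		toy_send_credit_update(rx, tx);
		return true;
	}
	return false;
}
```

With a 64-byte buffer that the sender has filled, and 32 bytes then read without an update, toy_tx_credit() stays at 0, so a sender waiting for space and a receiver polling for 48 bytes would both wait. Calling toy_set_rcvlowat() with 48 sends the snapshot and the sender's view of the credit becomes 32.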