Date: Mon, 1 Feb 2021 12:02:58 +0100
From: Stefano Garzarella
To: Arseny Krasnov
Cc: Stefan Hajnoczi, "Michael S. Tsirkin", Jason Wang, "David S.
Miller", Jakub Kicinski, Colin Ian King, Andra Paraschiv,
 Jeff Vander Stoep, "kvm@vger.kernel.org",
 "virtualization@lists.linux-foundation.org", "netdev@vger.kernel.org",
 "linux-kernel@vger.kernel.org", "stsp2@yandex.ru", "oxffffaa@gmail.com"
Subject: Re: [RFC PATCH v3 00/13] virtio/vsock: introduce SOCK_SEQPACKET support
Message-ID: <20210201110258.7ze7a7izl7gesv4w@steredhat>
References: <20210125110903.597155-1-arseny.krasnov@kaspersky.com>
 <20210128171923.esyna5ccv5s27jyu@steredhat>
 <63459bb3-da22-b2a4-71ee-e67660fd2e12@kaspersky.com>
 <20210129092604.mgaw3ipiyv6xra3b@steredhat>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jan 29, 2021 at 06:52:23PM +0300, Arseny Krasnov wrote:
>
>On 29.01.2021 12:26, Stefano Garzarella wrote:
>> On Fri, Jan 29, 2021 at 09:41:50AM +0300, Arseny Krasnov wrote:
>>> On 28.01.2021 20:19, Stefano Garzarella wrote:
>>>> Hi Arseny,
>>>> I reviewed a part, tomorrow I hope to finish the other patches.
>>>>
>>>> Just a couple of comments in the TODOs below.
>>>>
>>>> On Mon, Jan 25, 2021 at 02:09:00PM +0300, Arseny Krasnov wrote:
>>>>> This patchset implements support of SOCK_SEQPACKET for virtio
>>>>> transport.
>>>>> As SOCK_SEQPACKET guarantees to preserve record boundaries, a new
>>>>> packet operation was added: it marks start of record (with
>>>>> record length in header), such packet doesn't carry any data. To send
>>>>> record, packet with start marker is sent first, then all data is sent
>>>>> as usual 'RW' packets. On receiver's side, length of record is known
>>>>> from packet with start record marker. Now as packets of one socket
>>>>> are not reordered on either vsock or vhost transport layers, such
>>>>> marker allows to restore original record on receiver's side.
>>>>> If user's buffer is smaller than record length, then all out of size
>>>>> data is dropped.
>>>>> As in stream socket, maximum length of datagram is not limited,
>>>>> because same credit logic is used. Difference with stream socket is
>>>>> that user is not woken up until whole record is received or error
>>>>> occurred. Implementation also supports 'MSG_EOR' and 'MSG_TRUNC' flags.
>>>>> Tests also implemented.
>>>>>
>>>>> Arseny Krasnov (13):
>>>>>   af_vsock: prepare for SOCK_SEQPACKET support
>>>>>   af_vsock: prepare 'vsock_connectible_recvmsg()'
>>>>>   af_vsock: implement SEQPACKET rx loop
>>>>>   af_vsock: implement send logic for SOCK_SEQPACKET
>>>>>   af_vsock: rest of SEQPACKET support
>>>>>   af_vsock: update comments for stream sockets
>>>>>   virtio/vsock: dequeue callback for SOCK_SEQPACKET
>>>>>   virtio/vsock: fetch length for SEQPACKET record
>>>>>   virtio/vsock: add SEQPACKET receive logic
>>>>>   virtio/vsock: rest of SOCK_SEQPACKET support
>>>>>   virtio/vsock: setup SEQPACKET ops for transport
>>>>>   vhost/vsock: setup SEQPACKET ops for transport
>>>>>   vsock_test: add SOCK_SEQPACKET tests
>>>>>
>>>>>  drivers/vhost/vsock.c                   |   7 +-
>>>>>  include/linux/virtio_vsock.h            |  12 +
>>>>>  include/net/af_vsock.h                  |   6 +
>>>>>  include/uapi/linux/virtio_vsock.h       |   9 +
>>>>>  net/vmw_vsock/af_vsock.c                | 543 ++++++++++++++++------
>>>>>  net/vmw_vsock/virtio_transport.c        |   4 +
>>>>>  net/vmw_vsock/virtio_transport_common.c | 295 ++++++++++--
>>>>>  tools/testing/vsock/util.c              |  32 +-
>>>>>  tools/testing/vsock/util.h              |   3 +
>>>>>  tools/testing/vsock/vsock_test.c        | 126 +++++
>>>>>  10 files changed, 862 insertions(+), 175 deletions(-)
>>>>>
>>>>> TODO:
>>>>> - Support for record integrity control. As transport could drop some
>>>>>   packets, something like "record-id" and record end marker need to
>>>>>   be implemented. Idea is that SEQ_BEGIN packet carries both record
>>>>>   length and record id, end marker (let it be SEQ_END) carries only
>>>>>   record id.
>>>>>   To be sure that no packet was lost, receiver checks
>>>>>   length of data between SEQ_BEGIN and SEQ_END (it must be same as
>>>>>   value in SEQ_BEGIN) and record ids of SEQ_BEGIN and SEQ_END (this
>>>>>   means that both markers were not dropped). I think that easiest way
>>>>>   to implement record id for SEQ_BEGIN is to reuse another field of
>>>>>   packet header (SEQ_BEGIN already uses 'flags' as record length). For
>>>>>   SEQ_END record id could be stored in 'flags'.
>>>> I don't really like the idea of reusing the 'flags' field for this
>>>> purpose.
>>>>
>>>>>   Another way to implement it is to move metadata of both SEQ_END
>>>>>   and SEQ_BEGIN to payload. But this approach has problem, because
>>>>>   if we move something to payload, such payload is accounted by
>>>>>   credit logic, which fragments payload, while payload with record
>>>>>   length and id couldn't be fragmented. One way to overcome it is to
>>>>>   ignore credit update for SEQ_BEGIN/SEQ_END packet. Another solution
>>>>>   is to update 'stream_has_space()' function: current implementation
>>>>>   returns non-zero when at least 1 byte is allowed to use, but updated
>>>>>   version will have extra argument, which is needed length. For 'RW'
>>>>>   packet this argument is 1, for SEQ_BEGIN it is sizeof(record len +
>>>>>   record id) and for SEQ_END it is sizeof(record id).
>>>> Is the payload accounted by credit logic also if hdr.op is not
>>>> VIRTIO_VSOCK_OP_RW?
>>> Yes, on send any packet with payload could be fragmented if
>>> there is not enough space at receiver. On receive 'fwd_cnt' and
>>> 'buf_alloc' are updated with header of every packet. Of course,
>>> to every such case I've described I can add check for 'RW'
>>> packet, to exclude payload from credit accounting, but this is
>>> bunch of dumb checks.
>>>
>>>> I think that we can define a specific header to put after the
>>>> virtio_vsock_hdr when hdr.op is SEQ_BEGIN or SEQ_END, and in this header
>>>> we can store the id and the length of the message.
>>> I think it is better than using payload and touching credit logic
>>>
>> Cool, so let's try this option, hoping there aren't a lot of issues.
>
>If I understand, current implementation has 'struct virtio_vsock_hdr',
>then I'll add 'struct virtio_vsock_hdr_seq' with message length and id.
>After that, in 'struct virtio_vsock_pkt' which describes packet, field for
>header (which is 'struct virtio_vsock_hdr') must be replaced with new
>structure which contains both 'struct virtio_vsock_hdr' and 'struct
>virtio_vsock_hdr_seq', because header field of 'struct virtio_vsock_pkt'
>is buffer for virtio layer. After it all accesses to header (for example to
>'buf_alloc' field) will go across new structure with both headers:
>
>pkt->hdr.buf_alloc   ->   pkt->extended_hdr.classic_hdr.buf_alloc
>
>Maybe to avoid this, packet's header could be allocated dynamically
>in the same manner as packet's buffer? Size of allocation is always
>sizeof(classic header) + sizeof(seq header). In 'struct virtio_vsock_pkt'
>such header will be implemented as union of two pointers: classic header
>and extended header containing classic and seq header. Which pointer
>to use depends on packet's op.

I think that the 'classic header' can stay as is, and the extended
header can be dynamically allocated, as we do for the payload.

But we have to be careful about what happens if the other peer doesn't
support SEQPACKET and if it counts this extra header as payload for the
credit mechanism.

I'll try to take a closer look in the next few days.

Thanks,
Stefano
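
For illustration only, the layout discussed above (classic header left
embedded in the packet, sequence header allocated separately for
SEQ_BEGIN/SEQ_END) together with the receiver-side integrity check from
the TODO could be sketched in userspace C roughly as below. Everything
prefixed with 'toy_' is a hypothetical stand-in: the op values, field
layouts and function names are assumptions made for the sketch, not the
kernel's 'struct virtio_vsock_hdr' / 'struct virtio_vsock_pkt' and not
code from the patchset. The sketch does not model credit accounting or a
peer without SEQPACKET support, which are the open questions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical op codes; NOT the values used by virtio or the patchset. */
enum toy_op { TOY_OP_RW = 1, TOY_OP_SEQ_BEGIN, TOY_OP_SEQ_END };

/* Simplified stand-in for the classic header (the real one has more fields). */
struct toy_hdr {
	uint16_t op;
	uint32_t len;		/* payload length of this packet */
	uint32_t buf_alloc;
	uint32_t fwd_cnt;
};

/* Stand-in for the proposed 'struct virtio_vsock_hdr_seq'. */
struct toy_hdr_seq {
	uint32_t msg_len;	/* meaningful in SEQ_BEGIN only */
	uint32_t msg_id;
};

/* Stand-in for the packet descriptor. */
struct toy_pkt {
	struct toy_hdr hdr;		/* classic header stays as is */
	struct toy_hdr_seq *seq_hdr;	/* allocated only for SEQ_BEGIN/SEQ_END */
};

static int toy_op_has_seq_hdr(uint16_t op)
{
	return op == TOY_OP_SEQ_BEGIN || op == TOY_OP_SEQ_END;
}

static struct toy_pkt *toy_pkt_alloc(uint16_t op, uint32_t len,
				     uint32_t msg_len, uint32_t msg_id)
{
	struct toy_pkt *pkt = calloc(1, sizeof(*pkt));

	if (!pkt)
		return NULL;
	pkt->hdr.op = op;
	pkt->hdr.len = len;
	if (toy_op_has_seq_hdr(op)) {
		/* Extra header is allocated dynamically, like a payload. */
		pkt->seq_hdr = calloc(1, sizeof(*pkt->seq_hdr));
		if (!pkt->seq_hdr) {
			free(pkt);
			return NULL;
		}
		pkt->seq_hdr->msg_len = msg_len;
		pkt->seq_hdr->msg_id = msg_id;
	}
	return pkt;
}

static void toy_pkt_free(struct toy_pkt *pkt)
{
	if (!pkt)
		return;
	free(pkt->seq_hdr);
	free(pkt);
}

/* Receiver state for the record currently being assembled. */
struct toy_rx_state {
	int in_record;
	uint32_t expected_len;
	uint32_t msg_id;
	uint32_t received;
};

/* Returns 1 on a complete intact record, 0 if more packets are needed,
 * -1 if the length or record id check fails. */
static int toy_rx_packet(struct toy_rx_state *st, const struct toy_pkt *pkt)
{
	switch (pkt->hdr.op) {
	case TOY_OP_SEQ_BEGIN:
		assert(pkt->seq_hdr != NULL);
		st->in_record = 1;
		st->expected_len = pkt->seq_hdr->msg_len;
		st->msg_id = pkt->seq_hdr->msg_id;
		st->received = 0;
		return 0;
	case TOY_OP_RW:
		if (!st->in_record)
			return -1;
		st->received += pkt->hdr.len;
		return 0;
	case TOY_OP_SEQ_END:
		assert(pkt->seq_hdr != NULL);
		if (!st->in_record)
			return -1;
		st->in_record = 0;
		if (pkt->seq_hdr->msg_id != st->msg_id ||
		    st->received != st->expected_len)
			return -1;
		return 1;
	default:
		return -1;
	}
}

/* Feeds one 8-byte record (two 4-byte RW packets) to the receiver,
 * optionally "losing" the second RW packet. Returns the last rx result,
 * or -2 if an allocation failed. */
int toy_demo_record(int drop_second_rw)
{
	struct toy_rx_state st = {0};
	struct toy_pkt *pkts[4];
	int i, ret = 0;

	pkts[0] = toy_pkt_alloc(TOY_OP_SEQ_BEGIN, 0, 8, 7);
	pkts[1] = toy_pkt_alloc(TOY_OP_RW, 4, 0, 0);
	pkts[2] = toy_pkt_alloc(TOY_OP_RW, 4, 0, 0);
	pkts[3] = toy_pkt_alloc(TOY_OP_SEQ_END, 0, 0, 7);

	for (i = 0; i < 4 && ret >= 0; i++) {
		if (!pkts[i])
			ret = -2;
		else if (!(drop_second_rw && i == 2))
			ret = toy_rx_packet(&st, pkts[i]);
	}
	for (i = 0; i < 4; i++)
		toy_pkt_free(pkts[i]);
	return ret;
}
```

In this sketch, accesses such as pkt->hdr.buf_alloc keep working unchanged
because only the sequence header sits behind a pointer. Calling
toy_demo_record(0) walks an intact record through the receiver, while
toy_demo_record(1) drops one 'RW' packet so the length check at SEQ_END
fails.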