2021-03-08 09:49:37

by Arseny Krasnov

Subject: [RFC PATCH v6 00/22] virtio/vsock: introduce SOCK_SEQPACKET support

This patchset implements SOCK_SEQPACKET support for the virtio
transport.
SOCK_SEQPACKET guarantees that record boundaries are preserved, so
two new packet operations were added: one marks the start of a record
and the other marks its end (SEQ_BEGIN and SEQ_END below). Both
operations carry metadata used to maintain record boundaries and
payload integrity. The metadata is a special header with two fields,
message id and message length:

struct virtio_vsock_seq_hdr {
__le32 msg_id;
__le32 msg_len;
} __attribute__((packed));

This header is transmitted as the payload of SEQ_BEGIN and SEQ_END
packets (the buffer of the second virtio descriptor in the chain), in
the same way as data is transmitted in RW packets. The payload was
chosen to carry this header to avoid touching the first virtio buffer,
which carries the packet header, because someone could check that the
size of that buffer is equal to the size of the packet header. To send
a record, a packet with the start marker is sent first (its header
carries the length of the record and the id), then all data is sent as
usual 'RW' packets, and finally SEQ_END is sent (it carries the id of
the message, which is equal to the id in SEQ_BEGIN). After SEQ_END is
sent, the id is incremented. On the receiver's side, the size of the
record is known from the packet with the start-of-record marker. To
check that no packets were dropped by the transport, the 'msg_id's of
a SEQ_BEGIN and the following SEQ_END are checked to be equal, and the
length of data between the two markers is compared to the length in
the SEQ_BEGIN header.
Because packets of one socket are not reordered on either the vsock
or the vhost transport layer, these markers allow the original record
to be restored on the receiver's side. If the user's buffer is smaller
than the record length, then all data that does not fit is dropped.
As with stream sockets, the maximum length of a datagram is not
limited, because the same credit logic is used. The difference from a
stream socket is that the user is not woken up until the whole record
is received or an error occurs. The implementation also supports the
'MSG_EOR' and 'MSG_TRUNC' flags.
Tests are also implemented.
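
The receiver-side integrity check described above can be sketched in
plain userspace C. This is only an illustration of the framing rules,
not code from the patchset: 'enum pkt_op', 'struct seq_rx_state' and
'seq_rx_feed()' are hypothetical names, and the header is kept in host
byte order for simplicity (on the wire the fields are __le32).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical packet kinds, standing in for the real virtio opcodes. */
enum pkt_op { PKT_SEQ_BEGIN, PKT_RW, PKT_SEQ_END };

/* Host-endian mirror of 'struct virtio_vsock_seq_hdr'. */
struct seq_hdr {
	uint32_t msg_id;
	uint32_t msg_len;
};

/* Hypothetical per-socket receive state. */
struct seq_rx_state {
	bool in_record;    /* SEQ_BEGIN seen, SEQ_END not yet seen */
	uint32_t msg_id;   /* id announced by SEQ_BEGIN */
	uint32_t msg_len;  /* record length announced by SEQ_BEGIN */
	uint32_t received; /* RW payload bytes seen since SEQ_BEGIN */
};

enum seq_rx_result { SEQ_RX_MORE, SEQ_RX_RECORD_OK, SEQ_RX_RECORD_BAD };

/* Feed one packet into the state machine. 'hdr' is used only for the two
 * markers, 'rw_len' only for RW packets.
 */
static enum seq_rx_result seq_rx_feed(struct seq_rx_state *st, enum pkt_op op,
				      const struct seq_hdr *hdr, uint32_t rw_len)
{
	bool ok;

	switch (op) {
	case PKT_SEQ_BEGIN:
		assert(hdr != NULL);
		/* A new start marker always restarts accounting; anything
		 * left over from an unfinished record is discarded.
		 */
		st->in_record = true;
		st->msg_id = hdr->msg_id;
		st->msg_len = hdr->msg_len;
		st->received = 0;
		return SEQ_RX_MORE;
	case PKT_RW:
		if (!st->in_record)
			return SEQ_RX_RECORD_BAD;
		st->received += rw_len;
		return SEQ_RX_MORE;
	case PKT_SEQ_END:
		assert(hdr != NULL);
		/* Ids of both markers must match and the byte count between
		 * them must equal the length announced in SEQ_BEGIN.
		 */
		ok = st->in_record && hdr->msg_id == st->msg_id &&
		     st->received == st->msg_len;
		st->in_record = false;
		return ok ? SEQ_RX_RECORD_OK : SEQ_RX_RECORD_BAD;
	}

	return SEQ_RX_RECORD_BAD;
}
```

Feeding SEQ_BEGIN(id 7, len 5), RW(3), RW(2), SEQ_END(id 7) yields
SEQ_RX_RECORD_OK, while losing one of the RW packets or receiving a
SEQ_END with a different id yields SEQ_RX_RECORD_BAD.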

Thanks to [email protected] for encouragements and initial design
recommendations.

Arseny Krasnov (22):
af_vsock: update functions for connectible socket
af_vsock: separate wait data loop
af_vsock: separate receive data loop
af_vsock: implement SEQPACKET receive loop
af_vsock: separate wait space loop
af_vsock: implement send logic for SEQPACKET
af_vsock: rest of SEQPACKET support
af_vsock: update comments for stream sockets
virtio/vsock: set packet's type in virtio_transport_send_pkt_info()
virtio/vsock: simplify credit update function API
virtio/vsock: dequeue callback for SOCK_SEQPACKET
virtio/vsock: fetch length for SEQPACKET record
virtio/vsock: add SEQPACKET receive logic
virtio/vsock: rest of SOCK_SEQPACKET support
virtio/vsock: SEQPACKET feature bit
vhost/vsock: SEQPACKET feature bit support
virtio/vsock: SEQPACKET feature bit support
virtio/vsock: setup SEQPACKET ops for transport
vhost/vsock: setup SEQPACKET ops for transport
vsock/loopback: setup SEQPACKET ops for transport
vsock_test: add SOCK_SEQPACKET tests
virtio/vsock: update trace event for SEQPACKET

drivers/vhost/vsock.c | 22 +-
include/linux/virtio_vsock.h | 22 +
include/net/af_vsock.h | 10 +
.../events/vsock_virtio_transport_common.h | 48 +-
include/uapi/linux/virtio_vsock.h | 19 +
net/vmw_vsock/af_vsock.c | 589 +++++++++++------
net/vmw_vsock/virtio_transport.c | 18 +
net/vmw_vsock/virtio_transport_common.c | 364 ++++++++--
net/vmw_vsock/vsock_loopback.c | 13 +
tools/testing/vsock/util.c | 32 +-
tools/testing/vsock/util.h | 3 +
tools/testing/vsock/vsock_test.c | 126 ++++
12 files changed, 1013 insertions(+), 253 deletions(-)

v5 -> v6:
General changelog:
- virtio transport specific callbacks which send SEQ_BEGIN or
SEQ_END are now hidden inside the virtio transport. Only enqueue,
dequeue and record length callbacks are provided by the transport.

- virtio feature bit for SEQPACKET socket support introduced:
VIRTIO_VSOCK_F_SEQPACKET.

- 'msg_cnt' field in 'struct virtio_vsock_seq_hdr' renamed to
'msg_id' and used as id.

Per patch changelog:
- 'af_vsock: separate wait data loop':
1) Commit message updated.
2) 'prepare_to_wait()' moved inside while loop(thanks to
Jorgen Hansen).
The patch was marked 'Reviewed-by' with 1), but because of 2) I
removed the R-b.

- 'af_vsock: separate receive data loop': commit message
updated.
Marked 'Reviewed-by' with that fix.

- 'af_vsock: implement SEQPACKET receive loop': style fixes.

- 'af_vsock: rest of SEQPACKET support':
1) 'module_put()' added when transport callback check failed.
2) Now only 'seqpacket_allow()' callback called to check
support of SEQPACKET by transport.

- 'af_vsock: update comments for stream sockets': commit message
updated.
Marked 'Reviewed-by' with that fix.

- 'virtio/vsock: set packet's type in send':
1) Commit message updated.
2) Parameter 'type' from 'virtio_transport_send_credit_update()'
also removed in this patch instead of in next.

- 'virtio/vsock: dequeue callback for SOCK_SEQPACKET': SEQPACKET
related state wrapped to special struct.

- 'virtio/vsock: update trace event for SEQPACKET': format strings
now not broken by new lines.

v4 -> v5:
- patches reorganized:
1) Setting of packet's type in 'virtio_transport_send_pkt_info()'
is moved to separate patch.
2) Simplifying of 'virtio_transport_send_credit_update()' is
moved to separate patch and before main virtio/vsock patches.
- style problem fixed
- in 'af_vsock: separate receive data loop' extra 'release_sock()'
removed
- added trace event fields for SEQPACKET
- in 'af_vsock: separate wait data loop':
1) 'vsock_wait_data()' removed 'goto out;'
2) Comment for invalid data amount is changed.
- in 'af_vsock: rest of SEQPACKET support', 'new_transport' pointer
check is moved after 'try_module_get()'
- in 'af_vsock: update comments for stream sockets', 'connect-oriented'
replaced with 'connection-oriented'
- in 'loopback/vsock: setup SEQPACKET ops for transport',
'loopback/vsock' replaced with 'vsock/loopback'

v3 -> v4:
- SEQPACKET specific metadata moved from packet header to payload
and called 'virtio_vsock_seq_hdr'
- record integrity check:
1) SEQ_END operation was added, which marks end of record.
2) Both SEQ_BEGIN and SEQ_END carry a counter which is incremented
on every marker send.
- af_vsock.c: socket operations for STREAM and SEQPACKET call the same
functions instead of having their own "gates" that differ only in name:
'vsock_seqpacket/stream_getsockopt()' is now replaced with
'vsock_connectible_getsockopt()'.
- af_vsock.c: 'seqpacket_dequeue' callback returns an error and a flag
that the record is ready. There is no need to return the number of
copied bytes, because the case when a record is received successfully
is checked at the virtio transport layer, when SEQ_END is processed.
Also, the user doesn't need the number of copied bytes, because
'recv()' from SEQPACKET can return an error, the length of the user's
buffer or the length of the whole record (both are known in
af_vsock.c).
- af_vsock.c: both wait loops in af_vsock.c(for data and space) moved
to separate functions because now both called from several places.
- af_vsock.c: 'vsock_assign_transport()' checks that 'new_transport'
pointer is not NULL and returns 'ESOCKTNOSUPPORT' instead of 'ENODEV'
if failed to use transport.
- tools/testing/vsock/vsock_test.c: rename tests

v2 -> v3:
- patches reorganized: split for prepare and implementation patches
- local variables are declared in "Reverse Christmas tree" manner
- virtio_transport_common.c: valid leXX_to_cpu() for vsock header
fields access
- af_vsock.c: 'vsock_connectible_*sockopt()' added as shared code
between stream and seqpacket sockets.
- af_vsock.c: loops in '__vsock_*_recvmsg()' refactored.
- af_vsock.c: 'vsock_wait_data()' refactored.

v1 -> v2:
- patches reordered: af_vsock.c related changes now before virtio vsock
- patches reorganized: more small patches, where +/- are not mixed
- tests for SOCK_SEQPACKET added
- all commit messages updated
- af_vsock.c: 'vsock_pre_recv_check()' inlined to
'vsock_connectible_recvmsg()'
- af_vsock.c: 'vsock_assign_transport()' returns ENODEV if transport
was not found
- virtio_transport_common.c: transport callback for seqpacket dequeue
- virtio_transport_common.c: simplified
'virtio_transport_recv_connected()'
- virtio_transport_common.c: send reset on socket and packet type
mismatch.

Signed-off-by: Arseny Krasnov <[email protected]>

--
2.25.1


2021-03-08 09:50:42

by Arseny Krasnov

Subject: [RFC PATCH v6 01/22] af_vsock: update functions for connectible socket

This prepares af_vsock.c for SEQPACKET support: some functions, such
as setsockopt(), getsockopt(), connect(), recvmsg() and sendmsg(), are
shared between both types of sockets, so rename them in a generic
manner.

Signed-off-by: Arseny Krasnov <[email protected]>
---
net/vmw_vsock/af_vsock.c | 64 +++++++++++++++++++++-------------------
1 file changed, 34 insertions(+), 30 deletions(-)

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 5546710d8ac1..656370e11707 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -604,8 +604,8 @@ static void vsock_pending_work(struct work_struct *work)

/**** SOCKET OPERATIONS ****/

-static int __vsock_bind_stream(struct vsock_sock *vsk,
- struct sockaddr_vm *addr)
+static int __vsock_bind_connectible(struct vsock_sock *vsk,
+ struct sockaddr_vm *addr)
{
static u32 port;
struct sockaddr_vm new_addr;
@@ -685,7 +685,7 @@ static int __vsock_bind(struct sock *sk, struct sockaddr_vm *addr)
switch (sk->sk_socket->type) {
case SOCK_STREAM:
spin_lock_bh(&vsock_table_lock);
- retval = __vsock_bind_stream(vsk, addr);
+ retval = __vsock_bind_connectible(vsk, addr);
spin_unlock_bh(&vsock_table_lock);
break;

@@ -767,6 +767,11 @@ static struct sock *__vsock_create(struct net *net,
return sk;
}

+static bool sock_type_connectible(u16 type)
+{
+ return type == SOCK_STREAM;
+}
+
static void __vsock_release(struct sock *sk, int level)
{
if (sk) {
@@ -785,7 +790,7 @@ static void __vsock_release(struct sock *sk, int level)

if (vsk->transport)
vsk->transport->release(vsk);
- else if (sk->sk_type == SOCK_STREAM)
+ else if (sock_type_connectible(sk->sk_type))
vsock_remove_sock(vsk);

sock_orphan(sk);
@@ -947,7 +952,7 @@ static int vsock_shutdown(struct socket *sock, int mode)
lock_sock(sk);
if (sock->state == SS_UNCONNECTED) {
err = -ENOTCONN;
- if (sk->sk_type == SOCK_STREAM)
+ if (sock_type_connectible(sk->sk_type))
goto out;
} else {
sock->state = SS_DISCONNECTING;
@@ -960,7 +965,7 @@ static int vsock_shutdown(struct socket *sock, int mode)
sk->sk_shutdown |= mode;
sk->sk_state_change(sk);

- if (sk->sk_type == SOCK_STREAM) {
+ if (sock_type_connectible(sk->sk_type)) {
sock_reset_flag(sk, SOCK_DONE);
vsock_send_shutdown(sk, mode);
}
@@ -1015,7 +1020,7 @@ static __poll_t vsock_poll(struct file *file, struct socket *sock,
if (!(sk->sk_shutdown & SEND_SHUTDOWN))
mask |= EPOLLOUT | EPOLLWRNORM | EPOLLWRBAND;

- } else if (sock->type == SOCK_STREAM) {
+ } else if (sock_type_connectible(sk->sk_type)) {
const struct vsock_transport *transport;

lock_sock(sk);
@@ -1262,8 +1267,8 @@ static void vsock_connect_timeout(struct work_struct *work)
sock_put(sk);
}

-static int vsock_stream_connect(struct socket *sock, struct sockaddr *addr,
- int addr_len, int flags)
+static int vsock_connect(struct socket *sock, struct sockaddr *addr,
+ int addr_len, int flags)
{
int err;
struct sock *sk;
@@ -1413,7 +1418,7 @@ static int vsock_accept(struct socket *sock, struct socket *newsock, int flags,

lock_sock(listener);

- if (sock->type != SOCK_STREAM) {
+ if (!sock_type_connectible(sock->type)) {
err = -EOPNOTSUPP;
goto out;
}
@@ -1490,7 +1495,7 @@ static int vsock_listen(struct socket *sock, int backlog)

lock_sock(sk);

- if (sock->type != SOCK_STREAM) {
+ if (!sock_type_connectible(sk->sk_type)) {
err = -EOPNOTSUPP;
goto out;
}
@@ -1534,11 +1539,11 @@ static void vsock_update_buffer_size(struct vsock_sock *vsk,
vsk->buffer_size = val;
}

-static int vsock_stream_setsockopt(struct socket *sock,
- int level,
- int optname,
- sockptr_t optval,
- unsigned int optlen)
+static int vsock_connectible_setsockopt(struct socket *sock,
+ int level,
+ int optname,
+ sockptr_t optval,
+ unsigned int optlen)
{
int err;
struct sock *sk;
@@ -1616,10 +1621,10 @@ static int vsock_stream_setsockopt(struct socket *sock,
return err;
}

-static int vsock_stream_getsockopt(struct socket *sock,
- int level, int optname,
- char __user *optval,
- int __user *optlen)
+static int vsock_connectible_getsockopt(struct socket *sock,
+ int level, int optname,
+ char __user *optval,
+ int __user *optlen)
{
int err;
int len;
@@ -1687,8 +1692,8 @@ static int vsock_stream_getsockopt(struct socket *sock,
return 0;
}

-static int vsock_stream_sendmsg(struct socket *sock, struct msghdr *msg,
- size_t len)
+static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
+ size_t len)
{
struct sock *sk;
struct vsock_sock *vsk;
@@ -1827,10 +1832,9 @@ static int vsock_stream_sendmsg(struct socket *sock, struct msghdr *msg,
return err;
}

-
static int
-vsock_stream_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
- int flags)
+vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
+ int flags)
{
struct sock *sk;
struct vsock_sock *vsk;
@@ -2006,7 +2010,7 @@ static const struct proto_ops vsock_stream_ops = {
.owner = THIS_MODULE,
.release = vsock_release,
.bind = vsock_bind,
- .connect = vsock_stream_connect,
+ .connect = vsock_connect,
.socketpair = sock_no_socketpair,
.accept = vsock_accept,
.getname = vsock_getname,
@@ -2014,10 +2018,10 @@ static const struct proto_ops vsock_stream_ops = {
.ioctl = sock_no_ioctl,
.listen = vsock_listen,
.shutdown = vsock_shutdown,
- .setsockopt = vsock_stream_setsockopt,
- .getsockopt = vsock_stream_getsockopt,
- .sendmsg = vsock_stream_sendmsg,
- .recvmsg = vsock_stream_recvmsg,
+ .setsockopt = vsock_connectible_setsockopt,
+ .getsockopt = vsock_connectible_getsockopt,
+ .sendmsg = vsock_connectible_sendmsg,
+ .recvmsg = vsock_connectible_recvmsg,
.mmap = sock_no_mmap,
.sendpage = sock_no_sendpage,
};
--
2.25.1

2021-03-08 09:51:49

by Arseny Krasnov

Subject: [RFC PATCH v6 02/22] af_vsock: separate wait data loop

This moves the wait loop for data to a dedicated function, because it
will later be used by the SEQPACKET data receive loop. While moving
the code around, let's update an old comment.

Signed-off-by: Arseny Krasnov <[email protected]>
---
net/vmw_vsock/af_vsock.c | 156 +++++++++++++++++++++------------------
1 file changed, 84 insertions(+), 72 deletions(-)

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 656370e11707..421c0303b26f 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1832,6 +1832,69 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
return err;
}

+static int vsock_wait_data(struct sock *sk, struct wait_queue_entry *wait,
+ long timeout,
+ struct vsock_transport_recv_notify_data *recv_data,
+ size_t target)
+{
+ const struct vsock_transport *transport;
+ struct vsock_sock *vsk;
+ s64 data;
+ int err;
+
+ vsk = vsock_sk(sk);
+ err = 0;
+ transport = vsk->transport;
+
+ while ((data = vsock_stream_has_data(vsk)) == 0) {
+ prepare_to_wait(sk_sleep(sk), wait, TASK_INTERRUPTIBLE);
+
+ if (sk->sk_err != 0 ||
+ (sk->sk_shutdown & RCV_SHUTDOWN) ||
+ (vsk->peer_shutdown & SEND_SHUTDOWN)) {
+ break;
+ }
+
+ /* Don't wait for non-blocking sockets. */
+ if (timeout == 0) {
+ err = -EAGAIN;
+ break;
+ }
+
+ if (recv_data) {
+ err = transport->notify_recv_pre_block(vsk, target, recv_data);
+ if (err < 0)
+ break;
+ }
+
+ release_sock(sk);
+ timeout = schedule_timeout(timeout);
+ lock_sock(sk);
+
+ if (signal_pending(current)) {
+ err = sock_intr_errno(timeout);
+ break;
+ } else if (timeout == 0) {
+ err = -EAGAIN;
+ break;
+ }
+ }
+
+ finish_wait(sk_sleep(sk), wait);
+
+ if (err)
+ return err;
+
+ /* Internal transport error when checking for available
+ * data. XXX This should be changed to a connection
+ * reset in a later change.
+ */
+ if (data < 0)
+ return -ENOMEM;
+
+ return data;
+}
+
static int
vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
int flags)
@@ -1911,85 +1974,34 @@ vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,


while (1) {
- s64 ready;
+ ssize_t read;

- prepare_to_wait(sk_sleep(sk), &wait, TASK_INTERRUPTIBLE);
- ready = vsock_stream_has_data(vsk);
+ err = vsock_wait_data(sk, &wait, timeout, &recv_data, target);
+ if (err <= 0)
+ break;

- if (ready == 0) {
- if (sk->sk_err != 0 ||
- (sk->sk_shutdown & RCV_SHUTDOWN) ||
- (vsk->peer_shutdown & SEND_SHUTDOWN)) {
- finish_wait(sk_sleep(sk), &wait);
- break;
- }
- /* Don't wait for non-blocking sockets. */
- if (timeout == 0) {
- err = -EAGAIN;
- finish_wait(sk_sleep(sk), &wait);
- break;
- }
-
- err = transport->notify_recv_pre_block(
- vsk, target, &recv_data);
- if (err < 0) {
- finish_wait(sk_sleep(sk), &wait);
- break;
- }
- release_sock(sk);
- timeout = schedule_timeout(timeout);
- lock_sock(sk);
-
- if (signal_pending(current)) {
- err = sock_intr_errno(timeout);
- finish_wait(sk_sleep(sk), &wait);
- break;
- } else if (timeout == 0) {
- err = -EAGAIN;
- finish_wait(sk_sleep(sk), &wait);
- break;
- }
- } else {
- ssize_t read;
-
- finish_wait(sk_sleep(sk), &wait);
-
- if (ready < 0) {
- /* Invalid queue pair content. XXX This should
- * be changed to a connection reset in a later
- * change.
- */
-
- err = -ENOMEM;
- goto out;
- }
-
- err = transport->notify_recv_pre_dequeue(
- vsk, target, &recv_data);
- if (err < 0)
- break;
+ err = transport->notify_recv_pre_dequeue(vsk, target,
+ &recv_data);
+ if (err < 0)
+ break;

- read = transport->stream_dequeue(
- vsk, msg,
- len - copied, flags);
- if (read < 0) {
- err = -ENOMEM;
- break;
- }
+ read = transport->stream_dequeue(vsk, msg, len - copied, flags);
+ if (read < 0) {
+ err = -ENOMEM;
+ break;
+ }

- copied += read;
+ copied += read;

- err = transport->notify_recv_post_dequeue(
- vsk, target, read,
- !(flags & MSG_PEEK), &recv_data);
- if (err < 0)
- goto out;
+ err = transport->notify_recv_post_dequeue(vsk, target, read,
+ !(flags & MSG_PEEK), &recv_data);
+ if (err < 0)
+ goto out;

- if (read >= target || flags & MSG_PEEK)
- break;
+ if (read >= target || flags & MSG_PEEK)
+ break;

- target -= read;
- }
+ target -= read;
}

if (sk->sk_err)
--
2.25.1

2021-03-08 09:52:20

by Arseny Krasnov

Subject: [RFC PATCH v6 03/22] af_vsock: separate receive data loop

Move the STREAM-specific data receive logic to a dedicated function,
'__vsock_stream_recvmsg()', while the checks that will be the same for
both STREAM and SEQPACKET sockets stay in the shared function
'vsock_connectible_recvmsg()'.

Signed-off-by: Arseny Krasnov <[email protected]>
Reviewed-by: Stefano Garzarella <[email protected]>
---
net/vmw_vsock/af_vsock.c | 116 ++++++++++++++++++++++-----------------
1 file changed, 67 insertions(+), 49 deletions(-)

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 421c0303b26f..0bc661e54262 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1895,65 +1895,22 @@ static int vsock_wait_data(struct sock *sk, struct wait_queue_entry *wait,
return data;
}

-static int
-vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
- int flags)
+static int __vsock_stream_recvmsg(struct sock *sk, struct msghdr *msg,
+ size_t len, int flags)
{
- struct sock *sk;
- struct vsock_sock *vsk;
+ struct vsock_transport_recv_notify_data recv_data;
const struct vsock_transport *transport;
- int err;
- size_t target;
+ struct vsock_sock *vsk;
ssize_t copied;
+ size_t target;
long timeout;
- struct vsock_transport_recv_notify_data recv_data;
+ int err;

DEFINE_WAIT(wait);

- sk = sock->sk;
vsk = vsock_sk(sk);
- err = 0;
-
- lock_sock(sk);
-
transport = vsk->transport;

- if (!transport || sk->sk_state != TCP_ESTABLISHED) {
- /* Recvmsg is supposed to return 0 if a peer performs an
- * orderly shutdown. Differentiate between that case and when a
- * peer has not connected or a local shutdown occured with the
- * SOCK_DONE flag.
- */
- if (sock_flag(sk, SOCK_DONE))
- err = 0;
- else
- err = -ENOTCONN;
-
- goto out;
- }
-
- if (flags & MSG_OOB) {
- err = -EOPNOTSUPP;
- goto out;
- }
-
- /* We don't check peer_shutdown flag here since peer may actually shut
- * down, but there can be data in the queue that a local socket can
- * receive.
- */
- if (sk->sk_shutdown & RCV_SHUTDOWN) {
- err = 0;
- goto out;
- }
-
- /* It is valid on Linux to pass in a zero-length receive buffer. This
- * is not an error. We may as well bail out now.
- */
- if (!len) {
- err = 0;
- goto out;
- }
-
/* We must not copy less than target bytes into the user's buffer
* before returning successfully, so we wait for the consume queue to
* have that much data to consume before dequeueing. Note that this
@@ -2012,6 +1969,67 @@ vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
if (copied > 0)
err = copied;

+out:
+ return err;
+}
+
+static int
+vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
+ int flags)
+{
+ struct sock *sk;
+ struct vsock_sock *vsk;
+ const struct vsock_transport *transport;
+ int err;
+
+ DEFINE_WAIT(wait);
+
+ sk = sock->sk;
+ vsk = vsock_sk(sk);
+ err = 0;
+
+ lock_sock(sk);
+
+ transport = vsk->transport;
+
+ if (!transport || sk->sk_state != TCP_ESTABLISHED) {
+ /* Recvmsg is supposed to return 0 if a peer performs an
+ * orderly shutdown. Differentiate between that case and when a
+ * peer has not connected or a local shutdown occurred with the
+ * SOCK_DONE flag.
+ */
+ if (sock_flag(sk, SOCK_DONE))
+ err = 0;
+ else
+ err = -ENOTCONN;
+
+ goto out;
+ }
+
+ if (flags & MSG_OOB) {
+ err = -EOPNOTSUPP;
+ goto out;
+ }
+
+ /* We don't check peer_shutdown flag here since peer may actually shut
+ * down, but there can be data in the queue that a local socket can
+ * receive.
+ */
+ if (sk->sk_shutdown & RCV_SHUTDOWN) {
+ err = 0;
+ goto out;
+ }
+
+ /* It is valid on Linux to pass in a zero-length receive buffer. This
+ * is not an error. We may as well bail out now.
+ */
+ if (!len) {
+ err = 0;
+ goto out;
+ }
+
+ err = __vsock_stream_recvmsg(sk, msg, len, flags);
+
out:
release_sock(sk);
return err;
--
2.25.1

2021-03-08 09:52:56

by Arseny Krasnov

Subject: [RFC PATCH v6 04/22] af_vsock: implement SEQPACKET receive loop

This adds the receive loop for SEQPACKET. It looks like the receive
loop for STREAM, but there are a few differences:
1) It doesn't call notify callbacks.
2) It doesn't care about 'SO_SNDLOWAT' and 'SO_RCVLOWAT' values, because
these values make no sense in the SEQPACKET case.
3) It waits until the whole record is received or an error is found
during receiving.
4) It processes and sets the 'MSG_TRUNC' flag.

So to avoid extra conditions for two types of socket inside one loop, two
independent functions were created.
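
As an illustration of point 4), the way the return value and the
'MSG_TRUNC' output flag are derived once a whole record is ready can
be sketched as a small pure function. This is a userspace sketch that
mirrors the tail of '__vsock_seqpacket_recvmsg()' in this patch;
'seqpacket_recv_result()' and its parameter names are hypothetical and
do not exist in the kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/socket.h>	/* MSG_TRUNC */
#include <sys/types.h>	/* ssize_t */

/* record_len: length of the whole record as announced by the sender
 * buf_len:    size of the user's buffer passed to recvmsg()
 * buf_left:   bytes of the user's buffer still unused after the copy
 * flags:      flags passed by the caller of recvmsg()
 * msg_flags:  output flags reported back to the caller
 */
static ssize_t seqpacket_recv_result(size_t record_len, size_t buf_len,
				     size_t buf_left, int flags,
				     int *msg_flags)
{
	assert(msg_flags != NULL);
	assert(buf_left <= buf_len);

	/* Always report truncation when the record did not fit. */
	if (record_len > buf_len)
		*msg_flags |= MSG_TRUNC;

	/* Caller asked for the real record length. */
	if (flags & MSG_TRUNC)
		return (ssize_t)record_len;

	/* Otherwise report how many bytes were actually copied. */
	return (ssize_t)(buf_len - buf_left);
}
```

For a 10-byte record and a 4-byte buffer this returns 4 without
'MSG_TRUNC' in 'flags' and 10 with it, and sets 'MSG_TRUNC' in
'*msg_flags' in both cases.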

Signed-off-by: Arseny Krasnov <[email protected]>
---
include/net/af_vsock.h | 5 +++
net/vmw_vsock/af_vsock.c | 95 +++++++++++++++++++++++++++++++++++++++-
2 files changed, 99 insertions(+), 1 deletion(-)

diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index b1c717286993..5ad7ee7f78fd 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -135,6 +135,11 @@ struct vsock_transport {
bool (*stream_is_active)(struct vsock_sock *);
bool (*stream_allow)(u32 cid, u32 port);

+ /* SEQ_PACKET. */
+ size_t (*seqpacket_seq_get_len)(struct vsock_sock *vsk);
+ int (*seqpacket_dequeue)(struct vsock_sock *vsk, struct msghdr *msg,
+ int flags, bool *msg_ready);
+
/* Notification. */
int (*notify_poll_in)(struct vsock_sock *, size_t, bool *);
int (*notify_poll_out)(struct vsock_sock *, size_t, bool *);
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 0bc661e54262..ac2f69362f2e 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1973,6 +1973,96 @@ static int __vsock_stream_recvmsg(struct sock *sk, struct msghdr *msg,
return err;
}

+static int __vsock_seqpacket_recvmsg(struct sock *sk, struct msghdr *msg,
+ size_t len, int flags)
+{
+ const struct vsock_transport *transport;
+ const struct iovec *orig_iov;
+ unsigned long orig_nr_segs;
+ bool msg_ready;
+ struct vsock_sock *vsk;
+ size_t record_len;
+ long timeout;
+ int err = 0;
+ DEFINE_WAIT(wait);
+
+ vsk = vsock_sk(sk);
+ transport = vsk->transport;
+
+ timeout = sock_rcvtimeo(sk, flags & MSG_DONTWAIT);
+ orig_nr_segs = msg->msg_iter.nr_segs;
+ orig_iov = msg->msg_iter.iov;
+ msg_ready = false;
+ record_len = 0;
+
+ while (1) {
+ err = vsock_wait_data(sk, &wait, timeout, NULL, 0);
+
+ if (err <= 0) {
+ /* In case of any loop break(timeout, signal
+ * interrupt or shutdown), we report user that
+ * nothing was copied.
+ */
+ err = 0;
+ break;
+ }
+
+ if (record_len == 0) {
+ record_len =
+ transport->seqpacket_seq_get_len(vsk);
+
+ if (record_len == 0)
+ continue;
+ }
+
+ err = transport->seqpacket_dequeue(vsk, msg, flags, &msg_ready);
+ if (err < 0) {
+ if (err == -EAGAIN) {
+ iov_iter_init(&msg->msg_iter, READ,
+ orig_iov, orig_nr_segs,
+ len);
+ /* Clear 'MSG_EOR' here, because dequeue
+ * callback above set it again if it was
+ * set by sender. This 'MSG_EOR' is from
+ * dropped record.
+ */
+ msg->msg_flags &= ~MSG_EOR;
+ record_len = 0;
+ continue;
+ }
+
+ err = -ENOMEM;
+ break;
+ }
+
+ if (msg_ready)
+ break;
+ }
+
+ if (sk->sk_err)
+ err = -sk->sk_err;
+ else if (sk->sk_shutdown & RCV_SHUTDOWN)
+ err = 0;
+
+ if (msg_ready) {
+ /* User sets MSG_TRUNC, so return real length of
+ * packet.
+ */
+ if (flags & MSG_TRUNC)
+ err = record_len;
+ else
+ err = len - msg->msg_iter.count;
+
+ /* Always set MSG_TRUNC if real length of packet is
+ * bigger than user's buffer.
+ */
+ if (record_len > len)
+ msg->msg_flags |= MSG_TRUNC;
+ }
+
+ return err;
+}
+
static int
vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
int flags)
@@ -2028,7 +2118,10 @@ vsock_connectible_recvmsg(struct socket *sock, struct msghdr *msg, size_t len,
goto out;
}

- err = __vsock_stream_recvmsg(sk, msg, len, flags);
+ if (sk->sk_type == SOCK_STREAM)
+ err = __vsock_stream_recvmsg(sk, msg, len, flags);
+ else
+ err = __vsock_seqpacket_recvmsg(sk, msg, len, flags);

out:
release_sock(sk);
--
2.25.1

2021-03-08 09:55:41

by Arseny Krasnov

Subject: [RFC PATCH v6 07/22] af_vsock: rest of SEQPACKET support

This adds the rest of SOCK_SEQPACKET support:
1) Adds socket ops for the SEQPACKET type.
2) Allows creating a socket with the SEQPACKET type.
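
From userspace, point 2) means that socket(AF_VSOCK, SOCK_SEQPACKET, 0)
is expected to succeed once this series is applied. A minimal sketch
of probing for that is shown below; 'open_vsock_seqpacket()' is a
hypothetical helper name, and the exact errno seen on a system without
support depends on the kernel and loaded modules (this patch returns
ESOCKTNOSUPPORT for unknown socket types in 'vsock_create()').

```c
#include <assert.h>
#include <errno.h>
#include <sys/socket.h>
#include <unistd.h>

/* Returns a new socket descriptor on success or a negative errno value
 * on failure, following the kernel's return convention.
 */
static int open_vsock_seqpacket(void)
{
	int fd = socket(AF_VSOCK, SOCK_SEQPACKET, 0);

	if (fd < 0) {
		/* socket() reports the failure reason through errno. */
		assert(errno > 0);
		return -errno;
	}

	return fd;
}
```

A caller would typically fall back to SOCK_STREAM when the helper
returns -ESOCKTNOSUPPORT. Note that with this series the transport
check ('seqpacket_allow()') happens later, when a transport is
assigned, so a successfully created socket may still fail to connect
with the same error.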

Signed-off-by: Arseny Krasnov <[email protected]>
---
include/net/af_vsock.h | 1 +
net/vmw_vsock/af_vsock.c | 36 +++++++++++++++++++++++++++++++++++-
2 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index aed306292ab3..ee16744ed4a8 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -141,6 +141,7 @@ struct vsock_transport {
int flags, bool *msg_ready);
int (*seqpacket_enqueue)(struct vsock_sock *vsk, struct msghdr *msg,
int flags, size_t len);
+ bool (*seqpacket_allow)(void);

/* Notification. */
int (*notify_poll_in)(struct vsock_sock *, size_t, bool *);
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index a031f165494d..673eb0de79fe 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -452,6 +452,7 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk)
new_transport = transport_dgram;
break;
case SOCK_STREAM:
+ case SOCK_SEQPACKET:
if (vsock_use_local_transport(remote_cid))
new_transport = transport_local;
else if (remote_cid <= VMADDR_CID_HOST || !transport_h2g ||
@@ -484,6 +485,14 @@ int vsock_assign_transport(struct vsock_sock *vsk, struct vsock_sock *psk)
if (!new_transport || !try_module_get(new_transport->module))
return -ENODEV;

+ if (sk->sk_type == SOCK_SEQPACKET) {
+ if (!new_transport->seqpacket_allow ||
+ !new_transport->seqpacket_allow()) {
+ module_put(new_transport->module);
+ return -ESOCKTNOSUPPORT;
+ }
+ }
+
ret = new_transport->init(vsk, psk);
if (ret) {
module_put(new_transport->module);
@@ -684,6 +693,7 @@ static int __vsock_bind(struct sock *sk, struct sockaddr_vm *addr)

switch (sk->sk_socket->type) {
case SOCK_STREAM:
+ case SOCK_SEQPACKET:
spin_lock_bh(&vsock_table_lock);
retval = __vsock_bind_connectible(vsk, addr);
spin_unlock_bh(&vsock_table_lock);
@@ -769,7 +779,7 @@ static struct sock *__vsock_create(struct net *net,

static bool sock_type_connectible(u16 type)
{
- return type == SOCK_STREAM;
+ return (type == SOCK_STREAM) || (type == SOCK_SEQPACKET);
}

static void __vsock_release(struct sock *sk, int level)
@@ -2182,6 +2192,27 @@ static const struct proto_ops vsock_stream_ops = {
.sendpage = sock_no_sendpage,
};

+static const struct proto_ops vsock_seqpacket_ops = {
+ .family = PF_VSOCK,
+ .owner = THIS_MODULE,
+ .release = vsock_release,
+ .bind = vsock_bind,
+ .connect = vsock_connect,
+ .socketpair = sock_no_socketpair,
+ .accept = vsock_accept,
+ .getname = vsock_getname,
+ .poll = vsock_poll,
+ .ioctl = sock_no_ioctl,
+ .listen = vsock_listen,
+ .shutdown = vsock_shutdown,
+ .setsockopt = vsock_connectible_setsockopt,
+ .getsockopt = vsock_connectible_getsockopt,
+ .sendmsg = vsock_connectible_sendmsg,
+ .recvmsg = vsock_connectible_recvmsg,
+ .mmap = sock_no_mmap,
+ .sendpage = sock_no_sendpage,
+};
+
static int vsock_create(struct net *net, struct socket *sock,
int protocol, int kern)
{
@@ -2202,6 +2233,9 @@ static int vsock_create(struct net *net, struct socket *sock,
case SOCK_STREAM:
sock->ops = &vsock_stream_ops;
break;
+ case SOCK_SEQPACKET:
+ sock->ops = &vsock_seqpacket_ops;
+ break;
default:
return -ESOCKTNOSUPPORT;
}
--
2.25.1

2021-03-08 09:56:49

by Arseny Krasnov

Subject: [RFC PATCH v6 10/22] virtio/vsock: simplify credit update function API

This function is static and 'hdr' arg was always NULL.

Signed-off-by: Arseny Krasnov <[email protected]>
---
net/vmw_vsock/virtio_transport_common.c | 7 +++----
1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index f69993d67f89..833104b71a1c 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -271,8 +271,7 @@ void virtio_transport_put_credit(struct virtio_vsock_sock *vvs, u32 credit)
}
EXPORT_SYMBOL_GPL(virtio_transport_put_credit);

-static int virtio_transport_send_credit_update(struct vsock_sock *vsk,
- struct virtio_vsock_hdr *hdr)
+static int virtio_transport_send_credit_update(struct vsock_sock *vsk)
{
struct virtio_vsock_pkt_info info = {
.op = VIRTIO_VSOCK_OP_CREDIT_UPDATE,
@@ -384,7 +383,7 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
* with different values.
*/
if (free_space < VIRTIO_VSOCK_MAX_PKT_BUF_SIZE)
- virtio_transport_send_credit_update(vsk, NULL);
+ virtio_transport_send_credit_update(vsk);

return total;

@@ -493,7 +492,7 @@ void virtio_transport_notify_buffer_size(struct vsock_sock *vsk, u64 *val)

vvs->buf_alloc = *val;

- virtio_transport_send_credit_update(vsk, NULL);
+ virtio_transport_send_credit_update(vsk);
}
EXPORT_SYMBOL_GPL(virtio_transport_notify_buffer_size);

--
2.25.1

2021-03-08 09:56:49

by Arseny Krasnov

Subject: [RFC PATCH v6 09/22] virtio/vsock: set packet's type in virtio_transport_send_pkt_info()

This moves setting the packet type from the callers' 'info' structure
into 'virtio_transport_send_pkt_info()'. A packet never needs a type
that differs from the type of its socket. Since only the stream type
is currently supported, set it directly in
'virtio_transport_send_pkt_info()', so callers don't need to set it.

Signed-off-by: Arseny Krasnov <[email protected]>
---
net/vmw_vsock/virtio_transport_common.c | 19 +++++--------------
1 file changed, 5 insertions(+), 14 deletions(-)

diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index e4370b1b7494..f69993d67f89 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -179,6 +179,8 @@ static int virtio_transport_send_pkt_info(struct vsock_sock *vsk,
struct virtio_vsock_pkt *pkt;
u32 pkt_len = info->pkt_len;

+ info->type = VIRTIO_VSOCK_TYPE_STREAM;
+
t_ops = virtio_transport_get_ops(vsk);
if (unlikely(!t_ops))
return -EFAULT;
@@ -270,12 +272,10 @@ void virtio_transport_put_credit(struct virtio_vsock_sock *vvs, u32 credit)
EXPORT_SYMBOL_GPL(virtio_transport_put_credit);

static int virtio_transport_send_credit_update(struct vsock_sock *vsk,
- int type,
struct virtio_vsock_hdr *hdr)
{
struct virtio_vsock_pkt_info info = {
.op = VIRTIO_VSOCK_OP_CREDIT_UPDATE,
- .type = type,
.vsk = vsk,
};

@@ -383,11 +383,8 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
* messages, we set the limit to a high value. TODO: experiment
* with different values.
*/
- if (free_space < VIRTIO_VSOCK_MAX_PKT_BUF_SIZE) {
- virtio_transport_send_credit_update(vsk,
- VIRTIO_VSOCK_TYPE_STREAM,
- NULL);
- }
+ if (free_space < VIRTIO_VSOCK_MAX_PKT_BUF_SIZE)
+ virtio_transport_send_credit_update(vsk, NULL);

return total;

@@ -496,8 +493,7 @@ void virtio_transport_notify_buffer_size(struct vsock_sock *vsk, u64 *val)

vvs->buf_alloc = *val;

- virtio_transport_send_credit_update(vsk, VIRTIO_VSOCK_TYPE_STREAM,
- NULL);
+ virtio_transport_send_credit_update(vsk, NULL);
}
EXPORT_SYMBOL_GPL(virtio_transport_notify_buffer_size);

@@ -624,7 +620,6 @@ int virtio_transport_connect(struct vsock_sock *vsk)
{
struct virtio_vsock_pkt_info info = {
.op = VIRTIO_VSOCK_OP_REQUEST,
- .type = VIRTIO_VSOCK_TYPE_STREAM,
.vsk = vsk,
};

@@ -636,7 +631,6 @@ int virtio_transport_shutdown(struct vsock_sock *vsk, int mode)
{
struct virtio_vsock_pkt_info info = {
.op = VIRTIO_VSOCK_OP_SHUTDOWN,
- .type = VIRTIO_VSOCK_TYPE_STREAM,
.flags = (mode & RCV_SHUTDOWN ?
VIRTIO_VSOCK_SHUTDOWN_RCV : 0) |
(mode & SEND_SHUTDOWN ?
@@ -665,7 +659,6 @@ virtio_transport_stream_enqueue(struct vsock_sock *vsk,
{
struct virtio_vsock_pkt_info info = {
.op = VIRTIO_VSOCK_OP_RW,
- .type = VIRTIO_VSOCK_TYPE_STREAM,
.msg = msg,
.pkt_len = len,
.vsk = vsk,
@@ -688,7 +681,6 @@ static int virtio_transport_reset(struct vsock_sock *vsk,
{
struct virtio_vsock_pkt_info info = {
.op = VIRTIO_VSOCK_OP_RST,
- .type = VIRTIO_VSOCK_TYPE_STREAM,
.reply = !!pkt,
.vsk = vsk,
};
@@ -990,7 +982,6 @@ virtio_transport_send_response(struct vsock_sock *vsk,
{
struct virtio_vsock_pkt_info info = {
.op = VIRTIO_VSOCK_OP_RESPONSE,
- .type = VIRTIO_VSOCK_TYPE_STREAM,
.remote_cid = le64_to_cpu(pkt->hdr.src_cid),
.remote_port = le32_to_cpu(pkt->hdr.src_port),
.reply = true,
--
2.25.1

2021-03-08 09:58:25

by Arseny Krasnov

Subject: [RFC PATCH v6 14/22] virtio/vsock: rest of SOCK_SEQPACKET support

This adds the rest of the logic for SEQPACKET:
1) SEQPACKET specific functions which send SEQ_BEGIN/SEQ_END.
Note that both functions may sleep while waiting for enough space
for the SEQPACKET header.
2) Handle SEQ_BEGIN/SEQ_END in TAP packet capture.
3) Send SHUTDOWN on socket close for SEQPACKET type.
4) Set SEQPACKET packet type during send.
5) Set the EOR flag in the packet when MSG_EOR is passed during send.
6) Add 'seqpacket_allow' flag to virtio transport.
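
As a rough illustration of the send path described above, here is a
minimal userspace sketch of how one record could be framed as SEQ_BEGIN,
a run of RW packets, and SEQ_END. It is not the kernel code: all
'sketch_*' names, the 'max_rw' chunk size and the output array are
assumptions made for the example. Only the ordering, the shared message
id and the id increment after SEQ_END follow the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-ins for the packet operations (illustration only). */
enum sketch_op { SKETCH_OP_RW, SKETCH_OP_SEQ_BEGIN, SKETCH_OP_SEQ_END };

struct sketch_pkt {
	enum sketch_op op;
	uint32_t msg_id;	/* meaningful for SEQ_BEGIN and SEQ_END only */
	uint32_t msg_len;	/* record length in SEQ_BEGIN, 0 in SEQ_END */
	size_t payload;		/* user data bytes carried by an RW packet */
};

struct sketch_tx_state {
	uint32_t next_tx_msg_id;
};

/*
 * Frame one record of 'len' bytes into 'out', splitting the data into RW
 * packets of at most 'max_rw' bytes each. Returns the number of packets
 * written, or 0 if 'max_rw' is 0 or 'out' is too small.
 */
static size_t sketch_frame_record(struct sketch_tx_state *st, size_t len,
				  size_t max_rw, struct sketch_pkt *out,
				  size_t out_cap)
{
	size_t n = 0, left = len, need;

	if (max_rw == 0)
		return 0;

	/* One SEQ_BEGIN, one SEQ_END, and ceil(len / max_rw) RW packets. */
	need = 2 + (len + max_rw - 1) / max_rw;
	if (need > out_cap)
		return 0;

	out[n++] = (struct sketch_pkt){ .op = SKETCH_OP_SEQ_BEGIN,
					.msg_id = st->next_tx_msg_id,
					.msg_len = (uint32_t)len };

	while (left > 0) {
		size_t chunk = left < max_rw ? left : max_rw;

		out[n++] = (struct sketch_pkt){ .op = SKETCH_OP_RW,
						.payload = chunk };
		left -= chunk;
	}

	/* SEQ_END repeats the id of SEQ_BEGIN and carries a zero length. */
	out[n++] = (struct sketch_pkt){ .op = SKETCH_OP_SEQ_END,
					.msg_id = st->next_tx_msg_id };

	/* The id is advanced only after the end marker has been emitted. */
	st->next_tx_msg_id++;

	return n;
}
```

For example, framing a 10 byte record with 'max_rw' of 4 yields five
packets: SEQ_BEGIN, three RW packets of 4, 4 and 2 bytes, and SEQ_END.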

Signed-off-by: Arseny Krasnov <[email protected]>
---
include/linux/virtio_vsock.h | 8 +++
net/vmw_vsock/virtio_transport_common.c | 87 ++++++++++++++++++++++++-
2 files changed, 93 insertions(+), 2 deletions(-)

diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
index d7edcfeb4cd2..6b45a8b98226 100644
--- a/include/linux/virtio_vsock.h
+++ b/include/linux/virtio_vsock.h
@@ -22,6 +22,7 @@ struct virtio_vsock_seqpack_state {
u32 user_read_seq_len;
u32 user_read_copied;
u32 curr_rx_msg_id;
+ u32 next_tx_msg_id;
};

/* Per-socket state (accessed via vsk->trans) */
@@ -76,6 +77,8 @@ struct virtio_transport {

/* Takes ownership of the packet */
int (*send_pkt)(struct virtio_vsock_pkt *pkt);
+
+ bool seqpacket_allow;
};

ssize_t
@@ -90,6 +93,11 @@ virtio_transport_dgram_dequeue(struct vsock_sock *vsk,

size_t virtio_transport_seqpacket_seq_get_len(struct vsock_sock *vsk);
int
+virtio_transport_seqpacket_enqueue(struct vsock_sock *vsk,
+ struct msghdr *msg,
+ int flags,
+ size_t len);
+int
virtio_transport_seqpacket_dequeue(struct vsock_sock *vsk,
struct msghdr *msg,
int flags,
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index 9d86375935ce..8e9fdd8aba5d 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -139,6 +139,8 @@ static struct sk_buff *virtio_transport_build_skb(void *opaque)
break;
case VIRTIO_VSOCK_OP_CREDIT_UPDATE:
case VIRTIO_VSOCK_OP_CREDIT_REQUEST:
+ case VIRTIO_VSOCK_OP_SEQ_BEGIN:
+ case VIRTIO_VSOCK_OP_SEQ_END:
hdr->op = cpu_to_le16(AF_VSOCK_OP_CONTROL);
break;
default:
@@ -187,7 +189,12 @@ static int virtio_transport_send_pkt_info(struct vsock_sock *vsk,
struct virtio_vsock_pkt *pkt;
u32 pkt_len = info->pkt_len;

- info->type = VIRTIO_VSOCK_TYPE_STREAM;
+ info->type = virtio_transport_get_type(sk_vsock(vsk));
+
+ if (info->type == VIRTIO_VSOCK_TYPE_SEQPACKET &&
+ info->msg &&
+ info->msg->msg_flags & MSG_EOR)
+ info->flags |= VIRTIO_VSOCK_RW_EOR;

t_ops = virtio_transport_get_ops(vsk);
if (unlikely(!t_ops))
@@ -401,6 +408,43 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
return err;
}

+static int virtio_transport_seqpacket_send_ctrl(struct vsock_sock *vsk,
+ int type,
+ size_t len,
+ int flags)
+{
+ struct virtio_vsock_sock *vvs = vsk->trans;
+ struct virtio_vsock_pkt_info info = {
+ .op = type,
+ .vsk = vsk,
+ .pkt_len = sizeof(struct virtio_vsock_seq_hdr)
+ };
+
+ struct virtio_vsock_seq_hdr seq_hdr = {
+ .msg_id = cpu_to_le32(vvs->seqpacket_state.next_tx_msg_id),
+ .msg_len = cpu_to_le32(len)
+ };
+
+ struct kvec seq_hdr_kiov = {
+ .iov_base = (void *)&seq_hdr,
+ .iov_len = sizeof(struct virtio_vsock_seq_hdr)
+ };
+
+ struct msghdr msg = {0};
+
+ //XXX: do we need 'vsock_transport_send_notify_data' pointer?
+ if (vsock_wait_space(sk_vsock(vsk),
+ sizeof(struct virtio_vsock_seq_hdr),
+ flags, NULL))
+ return -1;
+
+ iov_iter_kvec(&msg.msg_iter, WRITE, &seq_hdr_kiov, 1, sizeof(seq_hdr));
+
+ info.msg = &msg;
+
+ return virtio_transport_send_pkt_info(vsk, &info);
+}
+
static inline void virtio_transport_remove_pkt(struct virtio_vsock_pkt *pkt)
{
list_del(&pkt->list);
@@ -582,6 +626,45 @@ virtio_transport_seqpacket_dequeue(struct vsock_sock *vsk,
}
EXPORT_SYMBOL_GPL(virtio_transport_seqpacket_dequeue);

+int
+virtio_transport_seqpacket_enqueue(struct vsock_sock *vsk,
+ struct msghdr *msg,
+ int flags,
+ size_t len)
+{
+ int written;
+
+ if (msg->msg_iter.iov_offset == 0) {
+ /* Send SEQBEGIN. */
+ if (virtio_transport_seqpacket_send_ctrl(vsk,
+ VIRTIO_VSOCK_OP_SEQ_BEGIN,
+ len,
+ flags) < 0)
+ return -1;
+ }
+
+ written = virtio_transport_stream_enqueue(vsk, msg, len);
+
+ if (written < 0)
+ return -1;
+
+ if (msg->msg_iter.count == 0) {
+ struct virtio_vsock_sock *vvs = vsk->trans;
+
+ /* Send SEQEND. */
+ if (virtio_transport_seqpacket_send_ctrl(vsk,
+ VIRTIO_VSOCK_OP_SEQ_END,
+ 0,
+ flags) < 0)
+ return -1;
+
+ vvs->seqpacket_state.next_tx_msg_id++;
+ }
+
+ return written;
+}
+EXPORT_SYMBOL_GPL(virtio_transport_seqpacket_enqueue);
+
int
virtio_transport_dgram_dequeue(struct vsock_sock *vsk,
struct msghdr *msg,
@@ -1001,7 +1084,7 @@ void virtio_transport_release(struct vsock_sock *vsk)
struct sock *sk = &vsk->sk;
bool remove_sock = true;

- if (sk->sk_type == SOCK_STREAM)
+ if (sk->sk_type == SOCK_STREAM || sk->sk_type == SOCK_SEQPACKET)
remove_sock = virtio_transport_close(vsk);

list_for_each_entry_safe(pkt, tmp, &vvs->rx_queue, list) {
--
2.25.1

2021-03-08 10:08:49

by Arseny Krasnov

Subject: [RFC PATCH v6 17/22] virtio/vsock: SEQPACKET feature bit support

This adds handling of the SEQPACKET feature bit: the guest tries to
negotiate it with vhost.

Signed-off-by: Arseny Krasnov <[email protected]>
---
net/vmw_vsock/virtio_transport.c | 5 +++++
1 file changed, 5 insertions(+)

diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index 2700a63ab095..41c5d0a31e08 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -612,6 +612,10 @@ static int virtio_vsock_probe(struct virtio_device *vdev)
rcu_assign_pointer(the_virtio_vsock, vsock);

mutex_unlock(&the_virtio_vsock_mutex);
+
+ if (vdev->features & (1ULL << VIRTIO_VSOCK_F_SEQPACKET))
+ virtio_transport.seqpacket_allow = true;
+
return 0;

out:
@@ -695,6 +699,7 @@ static struct virtio_device_id id_table[] = {
};

static unsigned int features[] = {
+ VIRTIO_VSOCK_F_SEQPACKET
};

static struct virtio_driver virtio_vsock_driver = {
--
2.25.1

2021-03-08 10:09:13

by Arseny Krasnov

Subject: [RFC PATCH v6 18/22] virtio/vsock: setup SEQPACKET ops for transport

This adds SEQPACKET ops for virtio transport and 'seqpacket_allow()'
callback.
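
The callback pattern used here can be sketched in isolation as follows.
This is a simplified userspace model, not the kernel structures: the
'sketch_*' types and the 'sketch_core_can_use_seqpacket()' helper are
made up for illustration. It only shows why a forward declaration is
needed when the callback reads a flag from the very object that stores
the callback pointer.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical, reduced version of a transport callback table. */
struct sketch_transport_ops {
	bool (*seqpacket_allow)(void);
};

struct sketch_virtio_transport {
	struct sketch_transport_ops transport;
	bool seqpacket_allow;	/* set once the feature was negotiated */
};

/* Declared first, because the initializer below takes its address. */
static bool sketch_seqpacket_allow(void);

static struct sketch_virtio_transport sketch_transport = {
	.transport = {
		.seqpacket_allow = sketch_seqpacket_allow,
	},
	.seqpacket_allow = false,
};

/* Defined after the object, because it reads a field of that object. */
static bool sketch_seqpacket_allow(void)
{
	return sketch_transport.seqpacket_allow;
}

/* What a generic socket layer might check before using SOCK_SEQPACKET. */
static bool sketch_core_can_use_seqpacket(const struct sketch_transport_ops *ops)
{
	return ops->seqpacket_allow && ops->seqpacket_allow();
}
```

With the flag left at its default the check fails, and it starts to
succeed once 'sketch_transport.seqpacket_allow' is set to true.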

Signed-off-by: Arseny Krasnov <[email protected]>
---
net/vmw_vsock/virtio_transport.c | 13 +++++++++++++
1 file changed, 13 insertions(+)

diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index 41c5d0a31e08..1b957b5477c1 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -443,6 +443,8 @@ static void virtio_vsock_rx_done(struct virtqueue *vq)
queue_work(virtio_vsock_workqueue, &vsock->rx_work);
}

+static bool virtio_transport_seqpacket_allow(void);
+
static struct virtio_transport virtio_transport = {
.transport = {
.module = THIS_MODULE,
@@ -469,6 +471,11 @@ static struct virtio_transport virtio_transport = {
.stream_is_active = virtio_transport_stream_is_active,
.stream_allow = virtio_transport_stream_allow,

+ .seqpacket_seq_get_len = virtio_transport_seqpacket_seq_get_len,
+ .seqpacket_dequeue = virtio_transport_seqpacket_dequeue,
+ .seqpacket_enqueue = virtio_transport_seqpacket_enqueue,
+ .seqpacket_allow = virtio_transport_seqpacket_allow,
+
.notify_poll_in = virtio_transport_notify_poll_in,
.notify_poll_out = virtio_transport_notify_poll_out,
.notify_recv_init = virtio_transport_notify_recv_init,
@@ -483,8 +490,14 @@ static struct virtio_transport virtio_transport = {
},

.send_pkt = virtio_transport_send_pkt,
+ .seqpacket_allow = false
};

+static bool virtio_transport_seqpacket_allow(void)
+{
+ return virtio_transport.seqpacket_allow;
+}
+
static void virtio_transport_rx_work(struct work_struct *work)
{
struct virtio_vsock *vsock =
--
2.25.1

2021-03-08 10:09:19

by Arseny Krasnov

Subject: [RFC PATCH v6 16/22] vhost/vsock: SEQPACKET feature bit support

This adds handling of SEQPACKET bit: if guest sets features with
this bit cleared, then SOCK_SEQPACKET support will be disabled.
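
A minimal sketch of the behaviour described above, written as plain
userspace C. The bit number 'SKETCH_F_SEQPACKET' is a placeholder chosen
for the example (the real VIRTIO_VSOCK_F_SEQPACKET value comes from the
uapi header, which is not shown here), and 'struct sketch_vsock' is a
hypothetical stand-in. Note that this sketch assigns the flag in both
directions, which is a choice of the example; the hunk below only sets
it to true when the bit is present.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit number, for illustration only. */
#define SKETCH_F_SEQPACKET 1

struct sketch_vsock {
	bool seqpacket_allow;
};

/*
 * Derive SEQPACKET availability from the feature mask the guest sets:
 * bit present means allowed, bit cleared means disabled.
 */
static void sketch_set_features(struct sketch_vsock *v, uint64_t features)
{
	v->seqpacket_allow = (features & (1ULL << SKETCH_F_SEQPACKET)) != 0;
}
```

Calling 'sketch_set_features()' with a mask that lacks the bit leaves
'seqpacket_allow' false, so SOCK_SEQPACKET sockets would be refused.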

Signed-off-by: Arseny Krasnov <[email protected]>
---
drivers/vhost/vsock.c | 6 +++++-
1 file changed, 5 insertions(+), 1 deletion(-)

diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index 5e78fb719602..3b0a50e6de12 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -31,7 +31,8 @@

enum {
VHOST_VSOCK_FEATURES = VHOST_FEATURES |
- (1ULL << VIRTIO_F_ACCESS_PLATFORM)
+ (1ULL << VIRTIO_F_ACCESS_PLATFORM) |
+ (1ULL << VIRTIO_VSOCK_F_SEQPACKET)
};

enum {
@@ -785,6 +786,9 @@ static int vhost_vsock_set_features(struct vhost_vsock *vsock, u64 features)
goto err;
}

+ if (features & (1ULL << VIRTIO_VSOCK_F_SEQPACKET))
+ vhost_transport.seqpacket_allow = true;
+
for (i = 0; i < ARRAY_SIZE(vsock->vqs); i++) {
vq = &vsock->vqs[i];
mutex_lock(&vq->mutex);
--
2.25.1

2021-03-08 10:09:45

by Arseny Krasnov

Subject: [RFC PATCH v6 19/22] vhost/vsock: setup SEQPACKET ops for transport

This also stops ignoring packets of non-stream type and adds the
'seqpacket_allow()' callback.

Signed-off-by: Arseny Krasnov <[email protected]>
---
drivers/vhost/vsock.c | 16 ++++++++++++++--
1 file changed, 14 insertions(+), 2 deletions(-)

diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index 3b0a50e6de12..9d4bbf9b71e2 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -355,8 +355,7 @@ vhost_vsock_alloc_pkt(struct vhost_virtqueue *vq,
return NULL;
}

- if (le16_to_cpu(pkt->hdr.type) == VIRTIO_VSOCK_TYPE_STREAM)
- pkt->len = le32_to_cpu(pkt->hdr.len);
+ pkt->len = le32_to_cpu(pkt->hdr.len);

/* No payload */
if (!pkt->len)
@@ -399,6 +398,8 @@ static bool vhost_vsock_more_replies(struct vhost_vsock *vsock)
return val < vq->num;
}

+static bool vhost_transport_seqpacket_allow(void);
+
static struct virtio_transport vhost_transport = {
.transport = {
.module = THIS_MODULE,
@@ -425,6 +426,11 @@ static struct virtio_transport vhost_transport = {
.stream_is_active = virtio_transport_stream_is_active,
.stream_allow = virtio_transport_stream_allow,

+ .seqpacket_seq_get_len = virtio_transport_seqpacket_seq_get_len,
+ .seqpacket_dequeue = virtio_transport_seqpacket_dequeue,
+ .seqpacket_enqueue = virtio_transport_seqpacket_enqueue,
+ .seqpacket_allow = vhost_transport_seqpacket_allow,
+
.notify_poll_in = virtio_transport_notify_poll_in,
.notify_poll_out = virtio_transport_notify_poll_out,
.notify_recv_init = virtio_transport_notify_recv_init,
@@ -440,8 +446,14 @@ static struct virtio_transport vhost_transport = {
},

.send_pkt = vhost_transport_send_pkt,
+ .seqpacket_allow = false
};

+static bool vhost_transport_seqpacket_allow(void)
+{
+ return vhost_transport.seqpacket_allow;
+}
+
static void vhost_vsock_handle_tx_kick(struct vhost_work *work)
{
struct vhost_virtqueue *vq = container_of(work, struct vhost_virtqueue,
--
2.25.1

2021-03-08 10:12:00

by Arseny Krasnov

Subject: [RFC PATCH v6 21/22] vsock_test: add SOCK_SEQPACKET tests

This adds two tests for SOCK_SEQPACKET sockets: both transfer data and
then check the MSG_EOR and MSG_TRUNC flags respectively. Cases for
connect(), bind(), etc. are not tested, because they behave the same as
for stream sockets.
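
The MSG_TRUNC behaviour that the second test relies on can be tried
without a guest/host pair. The sketch below uses an AF_UNIX
SOCK_SEQPACKET socketpair as a stand-in for AF_VSOCK, which is an
assumption of this example and not part of the patch; the helper name
'sketch_trunc_roundtrip()' is likewise made up. It assumes Linux, where
recvmsg() with MSG_TRUNC on a sequenced-packet UNIX socket reports the
real record length even if the buffer is smaller.

```c
#include <assert.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/*
 * Send one record of 'record_len' bytes over an AF_UNIX SOCK_SEQPACKET
 * pair and read it into a 'buf_len' byte buffer with MSG_TRUNC.
 * On success returns 0 and stores the recvmsg() return value in '*ret'
 * and the resulting msg_flags in '*flags'; returns -1 on any failure.
 */
static int sketch_trunc_roundtrip(size_t record_len, size_t buf_len,
				  ssize_t *ret, int *flags)
{
	char record[64] = {0};
	char buf[64];
	struct iovec iov = { .iov_base = buf, .iov_len = buf_len };
	struct msghdr msg = {0};
	int sv[2];
	int rc = -1;

	if (record_len > sizeof(record) || buf_len > sizeof(buf))
		return -1;

	if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv) < 0)
		return -1;

	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;

	/* The whole record is sent with a single send() call. */
	if (send(sv[0], record, record_len, 0) == (ssize_t)record_len) {
		*ret = recvmsg(sv[1], &msg, MSG_TRUNC);
		*flags = msg.msg_flags;
		rc = 0;
	}

	close(sv[0]);
	close(sv[1]);

	return rc;
}
```

Reading a 32 byte record into a 16 byte buffer is the same shape as the
MSG_TRUNC test in this patch: the call is expected to return 32 and to
set MSG_TRUNC in 'msg_flags'.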

Signed-off-by: Arseny Krasnov <[email protected]>
---
tools/testing/vsock/util.c | 32 ++++++--
tools/testing/vsock/util.h | 3 +
tools/testing/vsock/vsock_test.c | 126 +++++++++++++++++++++++++++++++
3 files changed, 156 insertions(+), 5 deletions(-)

diff --git a/tools/testing/vsock/util.c b/tools/testing/vsock/util.c
index 93cbd6f603f9..2acbb7703c6a 100644
--- a/tools/testing/vsock/util.c
+++ b/tools/testing/vsock/util.c
@@ -84,7 +84,7 @@ void vsock_wait_remote_close(int fd)
}

/* Connect to <cid, port> and return the file descriptor. */
-int vsock_stream_connect(unsigned int cid, unsigned int port)
+static int vsock_connect(unsigned int cid, unsigned int port, int type)
{
union {
struct sockaddr sa;
@@ -101,7 +101,7 @@ int vsock_stream_connect(unsigned int cid, unsigned int port)

control_expectln("LISTENING");

- fd = socket(AF_VSOCK, SOCK_STREAM, 0);
+ fd = socket(AF_VSOCK, type, 0);

timeout_begin(TIMEOUT);
do {
@@ -120,11 +120,21 @@ int vsock_stream_connect(unsigned int cid, unsigned int port)
return fd;
}

+int vsock_stream_connect(unsigned int cid, unsigned int port)
+{
+ return vsock_connect(cid, port, SOCK_STREAM);
+}
+
+int vsock_seqpacket_connect(unsigned int cid, unsigned int port)
+{
+ return vsock_connect(cid, port, SOCK_SEQPACKET);
+}
+
/* Listen on <cid, port> and return the first incoming connection. The remote
* address is stored to clientaddrp. clientaddrp may be NULL.
*/
-int vsock_stream_accept(unsigned int cid, unsigned int port,
- struct sockaddr_vm *clientaddrp)
+static int vsock_accept(unsigned int cid, unsigned int port,
+ struct sockaddr_vm *clientaddrp, int type)
{
union {
struct sockaddr sa;
@@ -145,7 +155,7 @@ int vsock_stream_accept(unsigned int cid, unsigned int port,
int client_fd;
int old_errno;

- fd = socket(AF_VSOCK, SOCK_STREAM, 0);
+ fd = socket(AF_VSOCK, type, 0);

if (bind(fd, &addr.sa, sizeof(addr.svm)) < 0) {
perror("bind");
@@ -189,6 +199,18 @@ int vsock_stream_accept(unsigned int cid, unsigned int port,
return client_fd;
}

+int vsock_stream_accept(unsigned int cid, unsigned int port,
+ struct sockaddr_vm *clientaddrp)
+{
+ return vsock_accept(cid, port, clientaddrp, SOCK_STREAM);
+}
+
+int vsock_seqpacket_accept(unsigned int cid, unsigned int port,
+ struct sockaddr_vm *clientaddrp)
+{
+ return vsock_accept(cid, port, clientaddrp, SOCK_SEQPACKET);
+}
+
/* Transmit one byte and check the return value.
*
* expected_ret:
diff --git a/tools/testing/vsock/util.h b/tools/testing/vsock/util.h
index e53dd09d26d9..a3375ad2fb7f 100644
--- a/tools/testing/vsock/util.h
+++ b/tools/testing/vsock/util.h
@@ -36,8 +36,11 @@ struct test_case {
void init_signals(void);
unsigned int parse_cid(const char *str);
int vsock_stream_connect(unsigned int cid, unsigned int port);
+int vsock_seqpacket_connect(unsigned int cid, unsigned int port);
int vsock_stream_accept(unsigned int cid, unsigned int port,
struct sockaddr_vm *clientaddrp);
+int vsock_seqpacket_accept(unsigned int cid, unsigned int port,
+ struct sockaddr_vm *clientaddrp);
void vsock_wait_remote_close(int fd);
void send_byte(int fd, int expected_ret, int flags);
void recv_byte(int fd, int expected_ret, int flags);
diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
index 5a4fb80fa832..5fca9be5b1dd 100644
--- a/tools/testing/vsock/vsock_test.c
+++ b/tools/testing/vsock/vsock_test.c
@@ -14,6 +14,8 @@
#include <errno.h>
#include <unistd.h>
#include <linux/kernel.h>
+#include <sys/types.h>
+#include <sys/socket.h>

#include "timeout.h"
#include "control.h"
@@ -279,6 +281,120 @@ static void test_stream_msg_peek_server(const struct test_opts *opts)
close(fd);
}

+#define MESSAGES_CNT 7
+#define MESSAGE_EOR_IDX (MESSAGES_CNT / 2)
+static void test_seqpacket_msg_eor_client(const struct test_opts *opts)
+{
+ int fd;
+
+ fd = vsock_seqpacket_connect(opts->peer_cid, 1234);
+ if (fd < 0) {
+ perror("connect");
+ exit(EXIT_FAILURE);
+ }
+
+ /* Send several messages, one with MSG_EOR flag */
+ for (int i = 0; i < MESSAGES_CNT; i++)
+ send_byte(fd, 1, (i != MESSAGE_EOR_IDX) ? 0 : MSG_EOR);
+
+ control_writeln("SENDDONE");
+ close(fd);
+}
+
+static void test_seqpacket_msg_eor_server(const struct test_opts *opts)
+{
+ int fd;
+ char buf[16];
+ struct msghdr msg = {0};
+ struct iovec iov = {0};
+
+ fd = vsock_seqpacket_accept(VMADDR_CID_ANY, 1234, NULL);
+ if (fd < 0) {
+ perror("accept");
+ exit(EXIT_FAILURE);
+ }
+
+ control_expectln("SENDDONE");
+ iov.iov_base = buf;
+ iov.iov_len = sizeof(buf);
+ msg.msg_iov = &iov;
+ msg.msg_iovlen = 1;
+
+ for (int i = 0; i < MESSAGES_CNT; i++) {
+ if (recvmsg(fd, &msg, 0) != 1) {
+ perror("message bound violated");
+ exit(EXIT_FAILURE);
+ }
+
+ if (i == MESSAGE_EOR_IDX) {
+ if (!(msg.msg_flags & MSG_EOR)) {
+ fprintf(stderr, "MSG_EOR flag expected\n");
+ exit(EXIT_FAILURE);
+ }
+ } else {
+ if (msg.msg_flags & MSG_EOR) {
+ fprintf(stderr, "unexpected MSG_EOR flag\n");
+ exit(EXIT_FAILURE);
+ }
+ }
+ }
+
+ close(fd);
+}
+
+#define MESSAGE_TRUNC_SZ 32
+static void test_seqpacket_msg_trunc_client(const struct test_opts *opts)
+{
+ int fd;
+ char buf[MESSAGE_TRUNC_SZ];
+
+ fd = vsock_seqpacket_connect(opts->peer_cid, 1234);
+ if (fd < 0) {
+ perror("connect");
+ exit(EXIT_FAILURE);
+ }
+
+ if (send(fd, buf, sizeof(buf), 0) != sizeof(buf)) {
+ perror("send failed");
+ exit(EXIT_FAILURE);
+ }
+
+ control_writeln("SENDDONE");
+ close(fd);
+}
+
+static void test_seqpacket_msg_trunc_server(const struct test_opts *opts)
+{
+ int fd;
+ char buf[MESSAGE_TRUNC_SZ / 2];
+ struct msghdr msg = {0};
+ struct iovec iov = {0};
+
+ fd = vsock_seqpacket_accept(VMADDR_CID_ANY, 1234, NULL);
+ if (fd < 0) {
+ perror("accept");
+ exit(EXIT_FAILURE);
+ }
+
+ control_expectln("SENDDONE");
+ iov.iov_base = buf;
+ iov.iov_len = sizeof(buf);
+ msg.msg_iov = &iov;
+ msg.msg_iovlen = 1;
+
+ if (recvmsg(fd, &msg, MSG_TRUNC) != MESSAGE_TRUNC_SZ) {
+ perror("MSG_TRUNC doesn't work");
+ exit(EXIT_FAILURE);
+ }
+
+ if (!(msg.msg_flags & MSG_TRUNC)) {
+ fprintf(stderr, "MSG_TRUNC expected\n");
+ exit(EXIT_FAILURE);
+ }
+
+ close(fd);
+}
+
static struct test_case test_cases[] = {
{
.name = "SOCK_STREAM connection reset",
@@ -309,6 +425,16 @@ static struct test_case test_cases[] = {
.run_client = test_stream_msg_peek_client,
.run_server = test_stream_msg_peek_server,
},
+ {
+ .name = "SOCK_SEQPACKET send data MSG_EOR",
+ .run_client = test_seqpacket_msg_eor_client,
+ .run_server = test_seqpacket_msg_eor_server,
+ },
+ {
+ .name = "SOCK_SEQPACKET send data MSG_TRUNC",
+ .run_client = test_seqpacket_msg_trunc_client,
+ .run_server = test_seqpacket_msg_trunc_server,
+ },
{},
};

--
2.25.1

2021-03-10 10:08:11

by Stefano Garzarella

Subject: Re: [RFC PATCH v6 00/22] virtio/vsock: introduce SOCK_SEQPACKET support

Hi Arseny,
thanks for this new version.

It's a busy week for me, but I hope to review this series by the end of
this week :-)

Thanks,
Stefano


2021-03-10 10:15:15

by Arseny Krasnov

Subject: Re: [RFC PATCH v6 00/22] virtio/vsock: introduce SOCK_SEQPACKET support

Hello, great, no problem!

Thanks

On 10.03.2021 13:06, Stefano Garzarella wrote:
> Hi Arseny,
> thanks for this new version.
>
> It's a busy week for me, but I hope to review this series by the end of
> this week :-)
>
> Thanks,
> Stefano
>
> On Sun, Mar 07, 2021 at 08:57:19PM +0300, Arseny Krasnov wrote:
>> This patchset implements support of SOCK_SEQPACKET for virtio
>> transport.
>> As SOCK_SEQPACKET guarantees to save record boundaries, so to
>> do it, two new packet operations were added: first for start of record
>> and second to mark end of record(SEQ_BEGIN and SEQ_END later). Also,
>> both operations carries metadata - to maintain boundaries and payload
>> integrity. Metadata is introduced by adding special header with two
>> fields - message id and message length:
>>
>> struct virtio_vsock_seq_hdr {
>> __le32 msg_id;
>> __le32 msg_len;
>> } __attribute__((packed));
>>
>> This header is transmitted as payload of SEQ_BEGIN and SEQ_END
>> packets(buffer of second virtio descriptor in chain) in the same way as
>> data transmitted in RW packets. Payload was chosen as buffer for this
>> header to avoid touching first virtio buffer which carries header of
>> packet, because someone could check that size of this buffer is equal
>> to size of packet header. To send record, packet with start marker is
>> sent first(it's header carries length of record and id),then all data
>> is sent as usual 'RW' packets and finally SEQ_END is sent(it carries
>> id of message, which is equal to id of SEQ_BEGIN), also after sending
>> SEQ_END id is incremented. On receiver's side,size of record is known
>> from packet with start record marker. To check that no packets were
>> dropped by transport, 'msg_id's of two sequential SEQ_BEGIN and SEQ_END
>> are checked to be equal and length of data between two markers is
>> compared to then length in SEQ_BEGIN header.
>> Now as packets of one socket are not reordered neither on
>> vsock nor on vhost transport layers, such markers allows to restore
>> original record on receiver's side. If user's buffer is smaller that
>> record length, when all out of size data is dropped.
>> Maximum length of datagram is not limited as in stream socket,
>> because same credit logic is used. Difference with stream socket is
>> that user is not woken up until whole record is received or error
>> occurred. Implementation also supports 'MSG_EOR' and 'MSG_TRUNC' flags.
>> Tests also implemented.
>>
>> Thanks to [email protected] for encouragements and initial design
>> recommendations.
>>
>> Arseny Krasnov (22):
>> af_vsock: update functions for connectible socket
>> af_vsock: separate wait data loop
>> af_vsock: separate receive data loop
>> af_vsock: implement SEQPACKET receive loop
>> af_vsock: separate wait space loop
>> af_vsock: implement send logic for SEQPACKET
>> af_vsock: rest of SEQPACKET support
>> af_vsock: update comments for stream sockets
>> virtio/vsock: set packet's type in virtio_transport_send_pkt_info()
>> virtio/vsock: simplify credit update function API
>> virtio/vsock: dequeue callback for SOCK_SEQPACKET
>> virtio/vsock: fetch length for SEQPACKET record
>> virtio/vsock: add SEQPACKET receive logic
>> virtio/vsock: rest of SOCK_SEQPACKET support
>> virtio/vsock: SEQPACKET feature bit
>> vhost/vsock: SEQPACKET feature bit support
>> virtio/vsock: SEQPACKET feature bit support
>> virtio/vsock: setup SEQPACKET ops for transport
>> vhost/vsock: setup SEQPACKET ops for transport
>> vsock/loopback: setup SEQPACKET ops for transport
>> vsock_test: add SOCK_SEQPACKET tests
>> virtio/vsock: update trace event for SEQPACKET
>>
>> drivers/vhost/vsock.c | 22 +-
>> include/linux/virtio_vsock.h | 22 +
>> include/net/af_vsock.h | 10 +
>> .../events/vsock_virtio_transport_common.h | 48 +-
>> include/uapi/linux/virtio_vsock.h | 19 +
>> net/vmw_vsock/af_vsock.c | 589 +++++++++++------
>> net/vmw_vsock/virtio_transport.c | 18 +
>> net/vmw_vsock/virtio_transport_common.c | 364 ++++++++--
>> net/vmw_vsock/vsock_loopback.c | 13 +
>> tools/testing/vsock/util.c | 32 +-
>> tools/testing/vsock/util.h | 3 +
>> tools/testing/vsock/vsock_test.c | 126 ++++
>> 12 files changed, 1013 insertions(+), 253 deletions(-)
>>
>> v5 -> v6:
>> General changelog:
>> - virtio transport specific callbacks which send SEQ_BEGIN or
>> SEQ_END now hidden inside virtio transport. Only enqueue,
>> dequeue and record length callbacks are provided by transport.
>>
>> - virtio feature bit for SEQPACKET socket support introduced:
>> VIRTIO_VSOCK_F_SEQPACKET.
>>
>> - 'msg_cnt' field in 'struct virtio_vsock_seq_hdr' renamed to
>> 'msg_id' and used as id.
>>
>> Per patch changelog:
>> - 'af_vsock: separate wait data loop':
>> 1) Commit message updated.
>> 2) 'prepare_to_wait()' moved inside while loop(thanks to
>> Jorgen Hansen).
>> Marked 'Reviewed-by' with 1), but as 2) I removed R-b.
>>
>> - 'af_vsock: separate receive data loop': commit message
>> updated.
>> Marked 'Reviewed-by' with that fix.
>>
>> - 'af_vsock: implement SEQPACKET receive loop': style fixes.
>>
>> - 'af_vsock: rest of SEQPACKET support':
>> 1) 'module_put()' added when transport callback check failed.
>> 2) Now only 'seqpacket_allow()' callback called to check
>> support of SEQPACKET by transport.
>>
>> - 'af_vsock: update comments for stream sockets': commit message
>> updated.
>> Marked 'Reviewed-by' with that fix.
>>
>> - 'virtio/vsock: set packet's type in send':
>> 1) Commit message updated.
>> 2) Parameter 'type' from 'virtio_transport_send_credit_update()'
>> also removed in this patch instead of in next.
>>
>> - 'virtio/vsock: dequeue callback for SOCK_SEQPACKET': SEQPACKET
>> related state wrapped to special struct.
>>
>> - 'virtio/vsock: update trace event for SEQPACKET': format strings
>> now not broken by new lines.
>>
>> v4 -> v5:
>> - patches reorganized:
>> 1) Setting of packet's type in 'virtio_transport_send_pkt_info()'
>> is moved to separate patch.
>> 2) Simplifying of 'virtio_transport_send_credit_update()' is
>> moved to separate patch and before main virtio/vsock patches.
>> - style problem fixed
>> - in 'af_vsock: separate receive data loop' extra 'release_sock()'
>> removed
>> - added trace event fields for SEQPACKET
>> - in 'af_vsock: separate wait data loop':
>> 1) 'vsock_wait_data()' removed 'goto out;'
>> 2) Comment for invalid data amount is changed.
>> - in 'af_vsock: rest of SEQPACKET support', 'new_transport' pointer
>> check is moved after 'try_module_get()'
>> - in 'af_vsock: update comments for stream sockets', 'connect-oriented'
>> replaced with 'connection-oriented'
>> - in 'loopback/vsock: setup SEQPACKET ops for transport',
>> 'loopback/vsock' replaced with 'vsock/loopback'
>>
>> v3 -> v4:
>> - SEQPACKET specific metadata moved from packet header to payload
>> and called 'virtio_vsock_seq_hdr'
>> - record integrity check:
>> 1) SEQ_END operation was added, which marks end of record.
>> 2) Both SEQ_BEGIN and SEQ_END carries counter which is incremented
>> on every marker send.
>> - af_vsock.c: socket operations for STREAM and SEQPACKET call same
>> functions instead of having own "gates" differs only by names:
>> 'vsock_seqpacket/stream_getsockopt()' now replaced with
>> 'vsock_connectible_getsockopt()'.
>> - af_vsock.c: 'seqpacket_dequeue' callback returns error and flag that
>> record ready. There is no need to return number of copied bytes,
>> because case when record received successfully is checked at virtio
>> transport layer, when SEQ_END is processed. Also user doesn't need
>> number of copied bytes, because 'recv()' from SEQPACKET could return
>> error, length of users's buffer or length of whole record(both are
>> known in af_vsock.c).
>> - af_vsock.c: both wait loops in af_vsock.c(for data and space) moved
>> to separate functions because now both called from several places.
>> - af_vsock.c: 'vsock_assign_transport()' checks that 'new_transport'
>> pointer is not NULL and returns 'ESOCKTNOSUPPORT' instead of 'ENODEV'
>> if failed to use transport.
>> - tools/testing/vsock/vsock_test.c: rename tests
>>
>> v2 -> v3:
>> - patches reorganized: split for prepare and implementation patches
>> - local variables are declared in "Reverse Christmas tree" manner
>> - virtio_transport_common.c: valid leXX_to_cpu() for vsock header
>> fields access
>> - af_vsock.c: 'vsock_connectible_*sockopt()' added as shared code
>> between stream and seqpacket sockets.
>> - af_vsock.c: loops in '__vsock_*_recvmsg()' refactored.
>> - af_vsock.c: 'vsock_wait_data()' refactored.
>>
>> v1 -> v2:
>> - patches reordered: af_vsock.c related changes now before virtio vsock
>> - patches reorganized: more small patches, where +/- are not mixed
>> - tests for SOCK_SEQPACKET added
>> - all commit messages updated
>> - af_vsock.c: 'vsock_pre_recv_check()' inlined to
>> 'vsock_connectible_recvmsg()'
>> - af_vsock.c: 'vsock_assign_transport()' returns ENODEV if transport
>> was not found
>> - virtio_transport_common.c: transport callback for seqpacket dequeue
>> - virtio_transport_common.c: simplified
>> 'virtio_transport_recv_connected()'
>> - virtio_transport_common.c: send reset on socket and packet type
>> mismatch.
>>
>> Signed-off-by: Arseny Krasnov <[email protected]>
>>
>> --
>> 2.25.1
>>
>
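
The record framing described in the quoted cover letter (SEQ_BEGIN, then RW packets, then SEQ_END, with the receiver comparing ids and lengths) can be sketched in plain user-space C. This is only an illustration under stated assumptions: 'struct seq_hdr', 'struct seq_rx_state' and the 'seq_rx_*()' helpers are hypothetical names, the header is kept in host byte order, and nothing here is taken from the patchset's actual receive path.

```c
#include <stdbool.h>
#include <stdint.h>

/* Host-order stand-in for the on-wire 'struct virtio_vsock_seq_hdr'
 * (the real one uses __le32 fields and is packed).
 */
struct seq_hdr {
	uint32_t msg_id;
	uint32_t msg_len;
};

/* Hypothetical per-socket receive state for one record. */
struct seq_rx_state {
	struct seq_hdr begin;	/* copy of the SEQ_BEGIN header */
	uint32_t rw_bytes;	/* payload bytes seen in RW packets so far */
	bool in_record;		/* true between SEQ_BEGIN and SEQ_END */
};

/* SEQ_BEGIN: remember the announced id and length, reset the counter. */
static void seq_rx_begin(struct seq_rx_state *st, const struct seq_hdr *hdr)
{
	st->begin = *hdr;
	st->rw_bytes = 0;
	st->in_record = true;
}

/* RW packet: account for the payload that arrived inside the record. */
static void seq_rx_data(struct seq_rx_state *st, uint32_t len)
{
	if (st->in_record)
		st->rw_bytes += len;
}

/* SEQ_END: the record is intact only if the id matches the one from
 * SEQ_BEGIN and the received byte count equals the announced length.
 */
static bool seq_rx_end(struct seq_rx_state *st, const struct seq_hdr *hdr)
{
	bool ok = st->in_record &&
		  hdr->msg_id == st->begin.msg_id &&
		  st->rw_bytes == st->begin.msg_len;

	st->in_record = false;
	return ok;
}
```

A dropped RW packet shows up as a length mismatch, and a dropped marker shows up as an id mismatch, which is the reason the cover letter gives for carrying both fields.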

2021-03-12 14:39:45

by Stefano Garzarella

Subject: Re: [RFC PATCH v6 01/22] af_vsock: update functions for connectible socket

On Sun, Mar 07, 2021 at 08:58:39PM +0300, Arseny Krasnov wrote:
>This prepares af_vsock.c for SEQPACKET support: some functions such
>as setsockopt(), getsockopt(), connect(), recvmsg(), sendmsg() are
>shared between both types of sockets, so rename them in general
>manner.
>
>Signed-off-by: Arseny Krasnov <[email protected]>
>---
> net/vmw_vsock/af_vsock.c | 64 +++++++++++++++++++++-------------------
> 1 file changed, 34 insertions(+), 30 deletions(-)

Reviewed-by: Stefano Garzarella <[email protected]>

2021-03-12 14:42:33

by Stefano Garzarella

Subject: Re: [RFC PATCH v6 02/22] af_vsock: separate wait data loop

On Sun, Mar 07, 2021 at 08:59:01PM +0300, Arseny Krasnov wrote:
>This moves wait loop for data to dedicated function, because later it
>will be used by SEQPACKET data receive loop. While moving the code
>around, let's update an old comment.
>
>Signed-off-by: Arseny Krasnov <[email protected]>
>---
> net/vmw_vsock/af_vsock.c | 156 +++++++++++++++++++++------------------
> 1 file changed, 84 insertions(+), 72 deletions(-)

Reviewed-by: Stefano Garzarella <[email protected]>

2021-03-12 15:03:54

by Stefano Garzarella

Subject: Re: [RFC PATCH v6 04/22] af_vsock: implement SEQPACKET receive loop

On Sun, Mar 07, 2021 at 08:59:45PM +0300, Arseny Krasnov wrote:
>This adds a receive loop for SEQPACKET. It looks like the receive loop for
>STREAM, but there are a few differences:
>1) It doesn't call notify callbacks.
>2) It doesn't care about 'SO_SNDLOWAT' and 'SO_RCVLOWAT' values, because
> these values make no sense in the SEQPACKET case.
>3) It waits until whole record is received or error is found during
> receiving.
>4) It processes and sets 'MSG_TRUNC' flag.
>
>So to avoid extra conditions for two types of socket inside one loop, two
>independent functions were created.
>
>Signed-off-by: Arseny Krasnov <[email protected]>
>---
> include/net/af_vsock.h | 5 +++
> net/vmw_vsock/af_vsock.c | 95 +++++++++++++++++++++++++++++++++++++++-
> 2 files changed, 99 insertions(+), 1 deletion(-)

Reviewed-by: Stefano Garzarella <[email protected]>
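
Points 3) and 4) of the quoted commit message describe ordinary SOCK_SEQPACKET semantics: a read returns one record, and 'MSG_TRUNC' is reported when the buffer is too small. A minimal user-space sketch of that behaviour follows. It uses an AF_UNIX socket pair as a stand-in, on the assumption that AF_VSOCK behaves the same way once this series is applied; 'seqpacket_trunc_demo()' is a hypothetical helper, not part of the patch or its tests.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Sends two records over an AF_UNIX SOCK_SEQPACKET pair and reads the
 * first one into a buffer of 'buf_len' bytes (at most 16).
 * On success returns the number of bytes copied for the first record,
 * stores whether MSG_TRUNC was reported in '*truncated' and stores the
 * first byte of the next record in '*next_first'. Returns -1 on error.
 */
static ssize_t seqpacket_trunc_demo(size_t buf_len, int *truncated,
				    char *next_first)
{
	char buf[16];
	struct iovec iov = { .iov_base = buf, .iov_len = buf_len };
	struct msghdr msg;
	ssize_t first = -1;
	int sv[2];

	if (buf_len > sizeof(buf))
		return -1;

	if (socketpair(AF_UNIX, SOCK_SEQPACKET, 0, sv) < 0)
		return -1;

	/* Two records: 8 bytes, then 3 bytes. */
	if (send(sv[0], "AAAAAAAA", 8, 0) != 8 || send(sv[0], "BBB", 3, 0) != 3)
		goto out;

	memset(&msg, 0, sizeof(msg));
	msg.msg_iov = &iov;
	msg.msg_iovlen = 1;

	first = recvmsg(sv[1], &msg, 0);
	if (first < 0)
		goto out;

	*truncated = !!(msg.msg_flags & MSG_TRUNC);

	/* The tail of the first record is gone; this reads record two. */
	if (recv(sv[1], buf, sizeof(buf), 0) != 3) {
		first = -1;
		goto out;
	}
	*next_first = buf[0];
out:
	close(sv[0]);
	close(sv[1]);
	return first;
}
```

With a 4-byte buffer the first read copies 4 bytes and reports truncation, and the following read starts at the second record rather than at the remainder of the first, which is the boundary-preserving behaviour the receive loop has to provide.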

2021-03-12 15:21:13

by Stefano Garzarella

Subject: Re: [RFC PATCH v6 04/22] af_vsock: implement SEQPACKET receive loop

On Sun, Mar 07, 2021 at 08:59:45PM +0300, Arseny Krasnov wrote:
>This adds a receive loop for SEQPACKET. It looks like the receive loop for
>STREAM, but there are a few differences:
>1) It doesn't call notify callbacks.
>2) It doesn't care about 'SO_SNDLOWAT' and 'SO_RCVLOWAT' values, because
> these values make no sense in the SEQPACKET case.
>3) It waits until whole record is received or error is found during
> receiving.
>4) It processes and sets 'MSG_TRUNC' flag.
>
>So to avoid extra conditions for two types of socket inside one loop, two
>independent functions were created.
>
>Signed-off-by: Arseny Krasnov <[email protected]>
>---
> include/net/af_vsock.h | 5 +++
> net/vmw_vsock/af_vsock.c | 95 +++++++++++++++++++++++++++++++++++++++-
> 2 files changed, 99 insertions(+), 1 deletion(-)
>
>diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
>index b1c717286993..5ad7ee7f78fd 100644
>--- a/include/net/af_vsock.h
>+++ b/include/net/af_vsock.h
>@@ -135,6 +135,11 @@ struct vsock_transport {
> bool (*stream_is_active)(struct vsock_sock *);
> bool (*stream_allow)(u32 cid, u32 port);
>
>+ /* SEQ_PACKET. */
>+ size_t (*seqpacket_seq_get_len)(struct vsock_sock *vsk);
>+ int (*seqpacket_dequeue)(struct vsock_sock *vsk, struct msghdr *msg,
>+ int flags, bool *msg_ready);
>+
> /* Notification. */
> int (*notify_poll_in)(struct vsock_sock *, size_t, bool *);
> int (*notify_poll_out)(struct vsock_sock *, size_t, bool *);
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index 0bc661e54262..ac2f69362f2e 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -1973,6 +1973,96 @@ static int __vsock_stream_recvmsg(struct sock *sk, struct msghdr *msg,
> return err;
> }
>
>+static int __vsock_seqpacket_recvmsg(struct sock *sk, struct msghdr *msg,
>+ size_t len, int flags)
>+{
>+ const struct vsock_transport *transport;
>+ const struct iovec *orig_iov;
>+ unsigned long orig_nr_segs;
>+ bool msg_ready;
>+ struct vsock_sock *vsk;
>+ size_t record_len;
>+ long timeout;
>+ int err = 0;
>+ DEFINE_WAIT(wait);
>+
>+ vsk = vsock_sk(sk);
>+ transport = vsk->transport;
>+
>+ timeout = sock_rcvtimeo(sk, flags & MSG_DONTWAIT);
>+ orig_nr_segs = msg->msg_iter.nr_segs;
>+ orig_iov = msg->msg_iter.iov;
>+ msg_ready = false;
>+ record_len = 0;
>+
>+ while (1) {
>+ err = vsock_wait_data(sk, &wait, timeout, NULL, 0);
>+
>+ if (err <= 0) {
>+ /* In case of any loop break(timeout, signal
>+ * interrupt or shutdown), we report user that
>+ * nothing was copied.
>+ */
>+ err = 0;
>+ break;
>+ }
>+
>+ if (record_len == 0) {
>+ record_len =
>+ transport->seqpacket_seq_get_len(vsk);
>+
>+ if (record_len == 0)
>+ continue;
>+ }
>+
>+ err = transport->seqpacket_dequeue(vsk, msg, flags, &msg_ready);

In order to simplify the transport interface, can we do the work of
seqpacket_seq_get_len() at the beginning of seqpacket_dequeue()?

So in this way seqpacket_dequeue() can return the 'record_len' or an
error.
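
A rough user-space sketch of the interface shape suggested above: a single dequeue callback that looks up the record length itself and returns either that length or a negative error. Everything here is an assumption made for illustration: 'struct mock_transport', 'mock_seqpacket_dequeue()' and 'record_truncated()' are hypothetical names, and the error codes are picked only to make the example concrete.

```c
#include <errno.h>
#include <stddef.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical, much simplified transport state: one queued record. */
struct mock_transport {
	const char *record;	/* payload of the queued record, or NULL */
	size_t record_len;	/* its length, as announced by SEQ_BEGIN */
	int complete;		/* non-zero once the record passed its checks */
};

/* Combined callback: fetches the record length first (the work that
 * seqpacket_seq_get_len() does in the quoted patch), then copies as much
 * as fits. Returns the full record length or a negative errno value.
 */
static ssize_t mock_seqpacket_dequeue(struct mock_transport *t,
				      char *buf, size_t buf_len)
{
	size_t record_len;
	size_t to_copy;

	if (!t->record)
		return -EAGAIN;		/* no record queued yet */
	if (!t->complete)
		return -EIO;		/* record failed the integrity check */

	record_len = t->record_len;
	to_copy = record_len < buf_len ? record_len : buf_len;
	memcpy(buf, t->record, to_copy);

	t->record = NULL;		/* record consumed, tail dropped */
	return (ssize_t)record_len;
}

/* The caller can derive MSG_TRUNC from the return value alone. */
static int record_truncated(ssize_t record_len, size_t buf_len)
{
	return record_len > 0 && (size_t)record_len > buf_len;
}
```

With this shape the socket layer needs one transport callback instead of two, and it still knows both the user's buffer length and the record length when it builds the return value for 'recv()'.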

2021-03-12 15:30:26

by Stefano Garzarella

Subject: Re: [RFC PATCH v6 07/22] af_vsock: rest of SEQPACKET support

On Sun, Mar 07, 2021 at 09:00:47PM +0300, Arseny Krasnov wrote:
>This does rest of SOCK_SEQPACKET support:
>1) Adds socket ops for SEQPACKET type.
>2) Allows to create socket with SEQPACKET type.
>
>Signed-off-by: Arseny Krasnov <[email protected]>
>---
> include/net/af_vsock.h | 1 +
> net/vmw_vsock/af_vsock.c | 36 +++++++++++++++++++++++++++++++++++-
> 2 files changed, 36 insertions(+), 1 deletion(-)

Reviewed-by: Stefano Garzarella <[email protected]>

2021-03-12 15:33:21

by Stefano Garzarella

Subject: Re: [RFC PATCH v6 09/22] virtio/vsock: set packet's type in virtio_transport_send_pkt_info()

On Sun, Mar 07, 2021 at 09:01:22PM +0300, Arseny Krasnov wrote:
>This moves passing type of packet from 'info' structure to 'virtio_
>transport_send_pkt_info()' function. There is no need to set type of
>packet which differs from type of socket. Since at current time only
>stream type is supported, set it directly in 'virtio_transport_send_
>pkt_info()', so callers don't need to set it.
>
>Signed-off-by: Arseny Krasnov <[email protected]>
>---
> net/vmw_vsock/virtio_transport_common.c | 19 +++++--------------
> 1 file changed, 5 insertions(+), 14 deletions(-)

Reviewed-by: Stefano Garzarella <[email protected]>

2021-03-12 15:37:16

by Stefano Garzarella

Subject: Re: [RFC PATCH v6 10/22] virtio/vsock: simplify credit update function API

On Sun, Mar 07, 2021 at 09:01:44PM +0300, Arseny Krasnov wrote:
>This function is static and 'hdr' arg was always NULL.
>
>Signed-off-by: Arseny Krasnov <[email protected]>
>---
> net/vmw_vsock/virtio_transport_common.c | 7 +++----
> 1 file changed, 3 insertions(+), 4 deletions(-)

Reviewed-by: Stefano Garzarella <[email protected]>

2021-03-15 07:51:22

by Arseny Krasnov

Subject: Re: [RFC PATCH v6 04/22] af_vsock: implement SEQPACKET receive loop


On 12.03.2021 18:17, Stefano Garzarella wrote:
> On Sun, Mar 07, 2021 at 08:59:45PM +0300, Arseny Krasnov wrote:
>> This adds a receive loop for SEQPACKET. It looks like the receive loop for
>> STREAM, but there are a few differences:
>> 1) It doesn't call notify callbacks.
>> 2) It doesn't care about 'SO_SNDLOWAT' and 'SO_RCVLOWAT' values, because
>> these values make no sense in the SEQPACKET case.
>> 3) It waits until whole record is received or error is found during
>> receiving.
>> 4) It processes and sets 'MSG_TRUNC' flag.
>>
>> So to avoid extra conditions for two types of socket inside one loop, two
>> independent functions were created.
>>
>> Signed-off-by: Arseny Krasnov <[email protected]>
>> ---
>> include/net/af_vsock.h | 5 +++
>> net/vmw_vsock/af_vsock.c | 95 +++++++++++++++++++++++++++++++++++++++-
>> 2 files changed, 99 insertions(+), 1 deletion(-)
>>
>> diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
>> index b1c717286993..5ad7ee7f78fd 100644
>> --- a/include/net/af_vsock.h
>> +++ b/include/net/af_vsock.h
>> @@ -135,6 +135,11 @@ struct vsock_transport {
>> bool (*stream_is_active)(struct vsock_sock *);
>> bool (*stream_allow)(u32 cid, u32 port);
>>
>> + /* SEQ_PACKET. */
>> + size_t (*seqpacket_seq_get_len)(struct vsock_sock *vsk);
>> + int (*seqpacket_dequeue)(struct vsock_sock *vsk, struct msghdr *msg,
>> + int flags, bool *msg_ready);
>> +
>> /* Notification. */
>> int (*notify_poll_in)(struct vsock_sock *, size_t, bool *);
>> int (*notify_poll_out)(struct vsock_sock *, size_t, bool *);
>> diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>> index 0bc661e54262..ac2f69362f2e 100644
>> --- a/net/vmw_vsock/af_vsock.c
>> +++ b/net/vmw_vsock/af_vsock.c
>> @@ -1973,6 +1973,96 @@ static int __vsock_stream_recvmsg(struct sock *sk, struct msghdr *msg,
>> return err;
>> }
>>
>> +static int __vsock_seqpacket_recvmsg(struct sock *sk, struct msghdr *msg,
>> + size_t len, int flags)
>> +{
>> + const struct vsock_transport *transport;
>> + const struct iovec *orig_iov;
>> + unsigned long orig_nr_segs;
>> + bool msg_ready;
>> + struct vsock_sock *vsk;
>> + size_t record_len;
>> + long timeout;
>> + int err = 0;
>> + DEFINE_WAIT(wait);
>> +
>> + vsk = vsock_sk(sk);
>> + transport = vsk->transport;
>> +
>> + timeout = sock_rcvtimeo(sk, flags & MSG_DONTWAIT);
>> + orig_nr_segs = msg->msg_iter.nr_segs;
>> + orig_iov = msg->msg_iter.iov;
>> + msg_ready = false;
>> + record_len = 0;
>> +
>> + while (1) {
>> + err = vsock_wait_data(sk, &wait, timeout, NULL, 0);
>> +
>> + if (err <= 0) {
>> + /* In case of any loop break(timeout, signal
>> + * interrupt or shutdown), we report user that
>> + * nothing was copied.
>> + */
>> + err = 0;
>> + break;
>> + }
>> +
>> + if (record_len == 0) {
>> + record_len =
>> + transport->seqpacket_seq_get_len(vsk);
>> +
>> + if (record_len == 0)
>> + continue;
>> + }
>> +
>> + err = transport->seqpacket_dequeue(vsk, msg, flags, &msg_ready);
> In order to simplify the transport interface, can we do the work of
> seqpacket_seq_get_len() at the beginning of seqpacket_dequeue()?
>
> So in this way seqpacket_dequeue() can return the 'record_len' or an
> error.
Ack
>
>

2021-03-15 11:29:00

by Stefano Garzarella

Subject: Re: [RFC PATCH v6 14/22] virtio/vsock: rest of SOCK_SEQPACKET support

On Sun, Mar 07, 2021 at 09:03:09PM +0300, Arseny Krasnov wrote:
>This adds rest of logic for SEQPACKET:
>1) SEQPACKET specific functions which send SEQ_BEGIN/SEQ_END.
> Note that both functions may sleep to wait enough space for
> SEQPACKET header.
>2) SEQ_BEGIN/SEQ_END in TAP packet capture.
>3) Send SHUTDOWN on socket close for SEQPACKET type.
>4) Set SEQPACKET packet type during send.
>5) Set MSG_EOR in flags for SEQPACKET during send.
>6) 'seqpacket_allow' flag to virtio transport.
>
>Signed-off-by: Arseny Krasnov <[email protected]>
>---
> include/linux/virtio_vsock.h | 8 +++
> net/vmw_vsock/virtio_transport_common.c | 87 ++++++++++++++++++++++++-
> 2 files changed, 93 insertions(+), 2 deletions(-)
>
>diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
>index d7edcfeb4cd2..6b45a8b98226 100644
>--- a/include/linux/virtio_vsock.h
>+++ b/include/linux/virtio_vsock.h
>@@ -22,6 +22,7 @@ struct virtio_vsock_seqpack_state {
> u32 user_read_seq_len;
> u32 user_read_copied;
> u32 curr_rx_msg_id;
>+ u32 next_tx_msg_id;
> };
>
> /* Per-socket state (accessed via vsk->trans) */
>@@ -76,6 +77,8 @@ struct virtio_transport {
>
> /* Takes ownership of the packet */
> int (*send_pkt)(struct virtio_vsock_pkt *pkt);
>+
>+ bool seqpacket_allow;
> };
>
> ssize_t
>@@ -90,6 +93,11 @@ virtio_transport_dgram_dequeue(struct vsock_sock *vsk,
>
> size_t virtio_transport_seqpacket_seq_get_len(struct vsock_sock *vsk);
> int
>+virtio_transport_seqpacket_enqueue(struct vsock_sock *vsk,
>+ struct msghdr *msg,
>+ int flags,
>+ size_t len);
>+int
> virtio_transport_seqpacket_dequeue(struct vsock_sock *vsk,
> struct msghdr *msg,
> int flags,
>diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
>index 9d86375935ce..8e9fdd8aba5d 100644
>--- a/net/vmw_vsock/virtio_transport_common.c
>+++ b/net/vmw_vsock/virtio_transport_common.c
>@@ -139,6 +139,8 @@ static struct sk_buff *virtio_transport_build_skb(void *opaque)
> break;
> case VIRTIO_VSOCK_OP_CREDIT_UPDATE:
> case VIRTIO_VSOCK_OP_CREDIT_REQUEST:
>+ case VIRTIO_VSOCK_OP_SEQ_BEGIN:
>+ case VIRTIO_VSOCK_OP_SEQ_END:
> hdr->op = cpu_to_le16(AF_VSOCK_OP_CONTROL);
> break;
> default:
>@@ -187,7 +189,12 @@ static int virtio_transport_send_pkt_info(struct vsock_sock *vsk,
> struct virtio_vsock_pkt *pkt;
> u32 pkt_len = info->pkt_len;
>
>- info->type = VIRTIO_VSOCK_TYPE_STREAM;
>+ info->type = virtio_transport_get_type(sk_vsock(vsk));
>+
>+ if (info->type == VIRTIO_VSOCK_TYPE_SEQPACKET &&
>+ info->msg &&
>+ info->msg->msg_flags & MSG_EOR)
>+ info->flags |= VIRTIO_VSOCK_RW_EOR;
>
> t_ops = virtio_transport_get_ops(vsk);
> if (unlikely(!t_ops))
>@@ -401,6 +408,43 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
> return err;
> }
>
>+static int virtio_transport_seqpacket_send_ctrl(struct vsock_sock *vsk,
>+ int type,
>+ size_t len,
>+ int flags)
>+{
>+ struct virtio_vsock_sock *vvs = vsk->trans;
>+ struct virtio_vsock_pkt_info info = {
>+ .op = type,
>+ .vsk = vsk,
>+ .pkt_len = sizeof(struct virtio_vsock_seq_hdr)
>+ };
>+
>+ struct virtio_vsock_seq_hdr seq_hdr = {
>+ .msg_id = cpu_to_le32(vvs->seqpacket_state.next_tx_msg_id),
>+ .msg_len = cpu_to_le32(len)
>+ };
>+
>+ struct kvec seq_hdr_kiov = {
>+ .iov_base = (void *)&seq_hdr,
>+ .iov_len = sizeof(struct virtio_vsock_seq_hdr)
>+ };
>+
>+ struct msghdr msg = {0};
>+
>+ //XXX: do we need 'vsock_transport_send_notify_data' pointer?
>+ if (vsock_wait_space(sk_vsock(vsk),
>+ sizeof(struct virtio_vsock_seq_hdr),
>+ flags, NULL))
>+ return -1;
>+
>+ iov_iter_kvec(&msg.msg_iter, WRITE, &seq_hdr_kiov, 1, sizeof(seq_hdr));
>+
>+ info.msg = &msg;
>+
>+ return virtio_transport_send_pkt_info(vsk, &info);
>+}
>+
> static inline void virtio_transport_remove_pkt(struct virtio_vsock_pkt *pkt)
> {
> list_del(&pkt->list);
>@@ -582,6 +626,45 @@ virtio_transport_seqpacket_dequeue(struct vsock_sock *vsk,
> }
> EXPORT_SYMBOL_GPL(virtio_transport_seqpacket_dequeue);
>
>+int
>+virtio_transport_seqpacket_enqueue(struct vsock_sock *vsk,
>+ struct msghdr *msg,
>+ int flags,
>+ size_t len)
>+{
>+ int written;
>+
>+ if (msg->msg_iter.iov_offset == 0) {
>+ /* Send SEQBEGIN. */
>+ if (virtio_transport_seqpacket_send_ctrl(vsk,
>+ VIRTIO_VSOCK_OP_SEQ_BEGIN,
>+ len,
>+ flags) < 0)
>+ return -1;
>+ }
>+
>+ written = virtio_transport_stream_enqueue(vsk, msg, len);
>+
>+ if (written < 0)
>+ return -1;
>+
>+ if (msg->msg_iter.count == 0) {
>+ struct virtio_vsock_sock *vvs = vsk->trans;
>+
>+ /* Send SEQEND. */
>+ if (virtio_transport_seqpacket_send_ctrl(vsk,
>+ VIRTIO_VSOCK_OP_SEQ_END,
>+ 0,
>+ flags) < 0)
>+ return -1;
>+
>+ vvs->seqpacket_state.next_tx_msg_id++;
>+ }

I suspect we should increment next_tx_msg_id even in case of an error to
avoid issues with packets with same IDs, so in case of error I would do:

	if (/* error */) {
		written = -1;
		goto out;
	}

Then we can add the 'out' label and the id increment:

out:
	vvs->seqpacket_state.next_tx_msg_id++;
>+
>+ return written;
>+}
>+EXPORT_SYMBOL_GPL(virtio_transport_seqpacket_enqueue);
>+
> int
> virtio_transport_dgram_dequeue(struct vsock_sock *vsk,
> struct msghdr *msg,
>@@ -1001,7 +1084,7 @@ void virtio_transport_release(struct vsock_sock *vsk)
> struct sock *sk = &vsk->sk;
> bool remove_sock = true;
>
>- if (sk->sk_type == SOCK_STREAM)
>+ if (sk->sk_type == SOCK_STREAM || sk->sk_type == SOCK_SEQPACKET)
> remove_sock = virtio_transport_close(vsk);
>
> list_for_each_entry_safe(pkt, tmp, &vvs->rx_queue, list) {
>--
>2.25.1
>
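
The error-path suggestion made inline above (advance 'next_tx_msg_id' on every exit so that a failed record never shares an id with the next one) can be sketched as a small user-space mock. It assumes the whole record is sent in a single call, and 'struct mock_tx', 'mock_send_ctrl()', 'mock_send_data()' and 'mock_seqpacket_enqueue()' are hypothetical stand-ins, not functions from the patch.

```c
#include <stdint.h>

/* Hypothetical transmit state with switches to force each step to fail. */
struct mock_tx {
	uint32_t next_tx_msg_id;
	int fail_begin;		/* make sending SEQ_BEGIN fail */
	int fail_data;		/* make sending the RW packets fail */
	int fail_end;		/* make sending SEQ_END fail */
};

/* Stand-in for sending a SEQ_BEGIN or SEQ_END control packet. */
static int mock_send_ctrl(int fail)
{
	return fail ? -1 : 0;
}

/* Stand-in for sending the record payload as RW packets. */
static int mock_send_data(const struct mock_tx *tx, int len)
{
	return tx->fail_data ? -1 : len;
}

/* Sends one whole record. The id advances on every exit path, so ids
 * are never reused after a failure.
 */
static int mock_seqpacket_enqueue(struct mock_tx *tx, int len)
{
	int written;

	if (mock_send_ctrl(tx->fail_begin) < 0) {
		written = -1;
		goto out;
	}

	written = mock_send_data(tx, len);
	if (written < 0) {
		written = -1;
		goto out;
	}

	if (mock_send_ctrl(tx->fail_end) < 0)
		written = -1;

out:
	tx->next_tx_msg_id++;
	return written;
}
```

If a record can span several enqueue calls, as the 'iov_offset' and 'count' checks in the quoted patch suggest, the increment would presumably have to be limited to the error paths and the SEQ_END path; the sketch leaves that case out.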

2021-03-15 11:30:27

by Stefano Garzarella

Subject: Re: [RFC PATCH v6 16/22] vhost/vsock: SEQPACKET feature bit support

On Sun, Mar 07, 2021 at 09:03:41PM +0300, Arseny Krasnov wrote:
>This adds handling of SEQPACKET bit: if guest sets features with
>this bit cleared, then SOCK_SEQPACKET support will be disabled.
>
>Signed-off-by: Arseny Krasnov <[email protected]>
>---
> drivers/vhost/vsock.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)

I think it is better to move this patch after we set the seqpacket ops,
so we are really able to handle SEQPACKET traffic.

>
>diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
>index 5e78fb719602..3b0a50e6de12 100644
>--- a/drivers/vhost/vsock.c
>+++ b/drivers/vhost/vsock.c
>@@ -31,7 +31,8 @@
>
> enum {
> VHOST_VSOCK_FEATURES = VHOST_FEATURES |
>- (1ULL << VIRTIO_F_ACCESS_PLATFORM)
>+ (1ULL << VIRTIO_F_ACCESS_PLATFORM) |
>+ (1ULL << VIRTIO_VSOCK_F_SEQPACKET)
> };
>
> enum {
>@@ -785,6 +786,9 @@ static int vhost_vsock_set_features(struct vhost_vsock *vsock, u64 features)
> goto err;
> }
>
>+ if (features & (1ULL << VIRTIO_VSOCK_F_SEQPACKET))
>+ vhost_transport.seqpacket_allow = true;
>+
> for (i = 0; i < ARRAY_SIZE(vsock->vqs); i++) {
> vq = &vsock->vqs[i];
> mutex_lock(&vq->mutex);
>--
>2.25.1
>
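
The feature-bit test in the quoted hunk boils down to a single mask check on the acknowledged feature word. A minimal sketch, assuming a placeholder bit number: 'EXAMPLE_F_SEQPACKET' and 'seqpacket_negotiated()' are hypothetical names, and the real value of VIRTIO_VSOCK_F_SEQPACKET is whatever the uapi header in this series defines.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit number, chosen only for this example; the real value
 * is defined in include/uapi/linux/virtio_vsock.h by the patchset.
 */
#define EXAMPLE_F_SEQPACKET 1

/* True if the guest acknowledged the SEQPACKET feature bit. */
static bool seqpacket_negotiated(uint64_t acked_features)
{
	return (acked_features & (1ULL << EXAMPLE_F_SEQPACKET)) != 0;
}
```

Assigning the result of such a check to the 'seqpacket_allow' flag, rather than only setting the flag when the bit is present, would also cover the "bit cleared" case that the commit message describes.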

2021-03-15 11:31:10

by Stefano Garzarella

Subject: Re: [RFC PATCH v6 17/22] virtio/vsock: SEQPACKET feature bit support

On Sun, Mar 07, 2021 at 09:04:01PM +0300, Arseny Krasnov wrote:
>This adds handling of SEQPACKET bit: guest tries to negotiate it
>with vhost.
>
>Signed-off-by: Arseny Krasnov <[email protected]>
>---
> net/vmw_vsock/virtio_transport.c | 5 +++++
> 1 file changed, 5 insertions(+)

Also for this patch, I think it is better to move it after we set the
seqpacket ops.

>
>diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
>index 2700a63ab095..41c5d0a31e08 100644
>--- a/net/vmw_vsock/virtio_transport.c
>+++ b/net/vmw_vsock/virtio_transport.c
>@@ -612,6 +612,10 @@ static int virtio_vsock_probe(struct virtio_device *vdev)
> rcu_assign_pointer(the_virtio_vsock, vsock);
>
> mutex_unlock(&the_virtio_vsock_mutex);
>+
>+ if (vdev->features & (1ULL << VIRTIO_VSOCK_F_SEQPACKET))
>+ virtio_transport.seqpacket_allow = true;
>+
> return 0;
>
> out:
>@@ -695,6 +699,7 @@ static struct virtio_device_id id_table[] = {
> };
>
> static unsigned int features[] = {
>+ VIRTIO_VSOCK_F_SEQPACKET
> };
>
> static struct virtio_driver virtio_vsock_driver = {
>--
>2.25.1
>

2021-03-15 11:44:10

by Stefano Garzarella

Subject: Re: [RFC PATCH v6 00/22] virtio/vsock: introduce SOCK_SEQPACKET support

Hi Arseny,

On Sun, Mar 07, 2021 at 08:57:19PM +0300, Arseny Krasnov wrote:
> This patchset implements support of SOCK_SEQPACKET for virtio
>transport.
> As SOCK_SEQPACKET guarantees to preserve record boundaries, two new
>packet operations were added: one to mark the start of a record and one
>to mark its end (SEQ_BEGIN and SEQ_END below). Both operations also
>carry metadata to maintain boundaries and payload integrity. The
>metadata is carried in a special header with two fields, message id
>and message length:
>
> struct virtio_vsock_seq_hdr {
> __le32 msg_id;
> __le32 msg_len;
> } __attribute__((packed));
>
> This header is transmitted as the payload of SEQ_BEGIN and SEQ_END
>packets (the buffer of the second virtio descriptor in the chain), in
>the same way as data is transmitted in RW packets. The payload was
>chosen for this header to avoid touching the first virtio buffer, which
>carries the packet header, because someone could check that the size of
>that buffer equals the size of the packet header. To send a record, a
>packet with the start marker is sent first (its header carries the
>length of the record and the id), then all data is sent as usual 'RW'
>packets, and finally SEQ_END is sent (it carries the id of the message,
>which is equal to the id in SEQ_BEGIN). After SEQ_END is sent, the id
>is incremented. On the receiver's side, the size of the record is known
>from the packet with the start-of-record marker. To check that no
>packets were dropped by the transport, the 'msg_id's of two sequential
>SEQ_BEGIN and SEQ_END packets are checked to be equal, and the length
>of data between the two markers is compared to the length in the
>SEQ_BEGIN header.
> As packets of one socket are not reordered on either the vsock or the
>vhost transport layer, such markers allow the original record to be
>restored on the receiver's side. If the user's buffer is smaller than
>the record length, then all data beyond the buffer size is dropped.
> The maximum length of a datagram is not limited, as with a stream
>socket, because the same credit logic is used. The difference from a
>stream socket is that the user is not woken up until the whole record
>is received or an error occurs. The implementation also supports the
>'MSG_EOR' and 'MSG_TRUNC' flags.
> Tests are also implemented.
>
> Thanks to [email protected] for encouragements and initial design
>recommendations.
>
> Arseny Krasnov (22):
> af_vsock: update functions for connectible socket
> af_vsock: separate wait data loop
> af_vsock: separate receive data loop
> af_vsock: implement SEQPACKET receive loop
> af_vsock: separate wait space loop
> af_vsock: implement send logic for SEQPACKET
> af_vsock: rest of SEQPACKET support
> af_vsock: update comments for stream sockets
> virtio/vsock: set packet's type in virtio_transport_send_pkt_info()
> virtio/vsock: simplify credit update function API
> virtio/vsock: dequeue callback for SOCK_SEQPACKET
> virtio/vsock: fetch length for SEQPACKET record
> virtio/vsock: add SEQPACKET receive logic
> virtio/vsock: rest of SOCK_SEQPACKET support
> virtio/vsock: SEQPACKET feature bit
> vhost/vsock: SEQPACKET feature bit support
> virtio/vsock: SEQPACKET feature bit support
> virtio/vsock: setup SEQPACKET ops for transport
> vhost/vsock: setup SEQPACKET ops for transport
> vsock/loopback: setup SEQPACKET ops for transport
> vsock_test: add SOCK_SEQPACKET tests
> virtio/vsock: update trace event for SEQPACKET
>
> drivers/vhost/vsock.c | 22 +-
> include/linux/virtio_vsock.h | 22 +
> include/net/af_vsock.h | 10 +
> .../events/vsock_virtio_transport_common.h | 48 +-
> include/uapi/linux/virtio_vsock.h | 19 +
> net/vmw_vsock/af_vsock.c | 589 +++++++++++------
> net/vmw_vsock/virtio_transport.c | 18 +
> net/vmw_vsock/virtio_transport_common.c | 364 ++++++++--
> net/vmw_vsock/vsock_loopback.c | 13 +
> tools/testing/vsock/util.c | 32 +-
> tools/testing/vsock/util.h | 3 +
> tools/testing/vsock/vsock_test.c | 126 ++++
> 12 files changed, 1013 insertions(+), 253 deletions(-)
>
> v5 -> v6:
> General changelog:
> - virtio transport specific callbacks which send SEQ_BEGIN or
> SEQ_END now hidden inside virtio transport. Only enqueue,
> dequeue and record length callbacks are provided by transport.
>
> - virtio feature bit for SEQPACKET socket support introduced:
> VIRTIO_VSOCK_F_SEQPACKET.
>
> - 'msg_cnt' field in 'struct virtio_vsock_seq_hdr' renamed to
> 'msg_id' and used as id.
>
> Per patch changelog:
> - 'af_vsock: separate wait data loop':
> 1) Commit message updated.
> 2) 'prepare_to_wait()' moved inside while loop(thanks to
> Jorgen Hansen).
> Marked 'Reviewed-by' with 1), but as 2) I removed R-b.
>
> - 'af_vsock: separate receive data loop': commit message
> updated.
> Marked 'Reviewed-by' with that fix.
>
> - 'af_vsock: implement SEQPACKET receive loop': style fixes.
>
> - 'af_vsock: rest of SEQPACKET support':
> 1) 'module_put()' added when transport callback check failed.
> 2) Now only 'seqpacket_allow()' callback called to check
> support of SEQPACKET by transport.
>
> - 'af_vsock: update comments for stream sockets': commit message
> updated.
> Marked 'Reviewed-by' with that fix.
>
> - 'virtio/vsock: set packet's type in send':
> 1) Commit message updated.
> 2) Parameter 'type' from 'virtio_transport_send_credit_update()'
> also removed in this patch instead of in next.
>
> - 'virtio/vsock: dequeue callback for SOCK_SEQPACKET': SEQPACKET
> related state wrapped to special struct.
>
> - 'virtio/vsock: update trace event for SEQPACKET': format strings
> now not broken by new lines.

I left a bunch of comments in the patches, I hope they are easy to fix
:-)

Thanks for the changelogs. About 'per patch changelog', it is very
useful!
Just a suggestion: I think it is better to include them in each patch after
the '---' to simplify the review.

You can use git-notes(1) or you can simply edit the format-patch and add
the changelog after the 3 dashes, so that they are ignored when the
patch is applied.

Thanks,
Stefano

2021-03-15 15:24:13

by Arseny Krasnov

Subject: Re: [RFC PATCH v6 00/22] virtio/vsock: introduce SOCK_SEQPACKET support


On 15.03.2021 14:40, Stefano Garzarella wrote:
> Hi Arseny,
>
> On Sun, Mar 07, 2021 at 08:57:19PM +0300, Arseny Krasnov wrote:
>> This patchset implements support of SOCK_SEQPACKET for virtio
>> transport.
>> As SOCK_SEQPACKET guarantees to save record boundaries, so to
>> do it, two new packet operations were added: first for start of record
>> and second to mark end of record(SEQ_BEGIN and SEQ_END later). Also,
>> both operations carries metadata - to maintain boundaries and payload
>> integrity. Metadata is introduced by adding special header with two
>> fields - message id and message length:
>>
>> struct virtio_vsock_seq_hdr {
>> __le32 msg_id;
>> __le32 msg_len;
>> } __attribute__((packed));
>>
>> This header is transmitted as payload of SEQ_BEGIN and SEQ_END
>> packets(buffer of second virtio descriptor in chain) in the same way as
>> data transmitted in RW packets. Payload was chosen as buffer for this
>> header to avoid touching first virtio buffer which carries header of
>> packet, because someone could check that size of this buffer is equal
>> to size of packet header. To send record, packet with start marker is
>> sent first(it's header carries length of record and id),then all data
>> is sent as usual 'RW' packets and finally SEQ_END is sent(it carries
>> id of message, which is equal to id of SEQ_BEGIN), also after sending
>> SEQ_END id is incremented. On receiver's side,size of record is known
>> from packet with start record marker. To check that no packets were
>> dropped by transport, 'msg_id's of two sequential SEQ_BEGIN and SEQ_END
>> are checked to be equal and length of data between two markers is
>> compared to then length in SEQ_BEGIN header.
>> Now as packets of one socket are not reordered neither on
>> vsock nor on vhost transport layers, such markers allows to restore
>> original record on receiver's side. If user's buffer is smaller that
>> record length, when all out of size data is dropped.
>> Maximum length of datagram is not limited as in stream socket,
>> because same credit logic is used. Difference with stream socket is
>> that user is not woken up until whole record is received or error
>> occurred. Implementation also supports 'MSG_EOR' and 'MSG_TRUNC' flags.
>> Tests also implemented.
>>
>> Thanks to [email protected] for encouragements and initial design
>> recommendations.
>>
>> Arseny Krasnov (22):
>> af_vsock: update functions for connectible socket
>> af_vsock: separate wait data loop
>> af_vsock: separate receive data loop
>> af_vsock: implement SEQPACKET receive loop
>> af_vsock: separate wait space loop
>> af_vsock: implement send logic for SEQPACKET
>> af_vsock: rest of SEQPACKET support
>> af_vsock: update comments for stream sockets
>> virtio/vsock: set packet's type in virtio_transport_send_pkt_info()
>> virtio/vsock: simplify credit update function API
>> virtio/vsock: dequeue callback for SOCK_SEQPACKET
>> virtio/vsock: fetch length for SEQPACKET record
>> virtio/vsock: add SEQPACKET receive logic
>> virtio/vsock: rest of SOCK_SEQPACKET support
>> virtio/vsock: SEQPACKET feature bit
>> vhost/vsock: SEQPACKET feature bit support
>> virtio/vsock: SEQPACKET feature bit support
>> virtio/vsock: setup SEQPACKET ops for transport
>> vhost/vsock: setup SEQPACKET ops for transport
>> vsock/loopback: setup SEQPACKET ops for transport
>> vsock_test: add SOCK_SEQPACKET tests
>> virtio/vsock: update trace event for SEQPACKET
>>
>> drivers/vhost/vsock.c | 22 +-
>> include/linux/virtio_vsock.h | 22 +
>> include/net/af_vsock.h | 10 +
>> .../events/vsock_virtio_transport_common.h | 48 +-
>> include/uapi/linux/virtio_vsock.h | 19 +
>> net/vmw_vsock/af_vsock.c | 589 +++++++++++------
>> net/vmw_vsock/virtio_transport.c | 18 +
>> net/vmw_vsock/virtio_transport_common.c | 364 ++++++++--
>> net/vmw_vsock/vsock_loopback.c | 13 +
>> tools/testing/vsock/util.c | 32 +-
>> tools/testing/vsock/util.h | 3 +
>> tools/testing/vsock/vsock_test.c | 126 ++++
>> 12 files changed, 1013 insertions(+), 253 deletions(-)
>>
>> v5 -> v6:
>> General changelog:
>> - virtio transport specific callbacks which send SEQ_BEGIN or
>> SEQ_END now hidden inside virtio transport. Only enqueue,
>> dequeue and record length callbacks are provided by transport.
>>
>> - virtio feature bit for SEQPACKET socket support introduced:
>> VIRTIO_VSOCK_F_SEQPACKET.
>>
>> - 'msg_cnt' field in 'struct virtio_vsock_seq_hdr' renamed to
>> 'msg_id' and used as id.
>>
>> Per patch changelog:
>> - 'af_vsock: separate wait data loop':
>> 1) Commit message updated.
>> 2) 'prepare_to_wait()' moved inside while loop(thanks to
>> Jorgen Hansen).
>> Marked 'Reviewed-by' with 1), but as 2) I removed R-b.
>>
>> - 'af_vsock: separate receive data loop': commit message
>> updated.
>> Marked 'Reviewed-by' with that fix.
>>
>> - 'af_vsock: implement SEQPACKET receive loop': style fixes.
>>
>> - 'af_vsock: rest of SEQPACKET support':
>> 1) 'module_put()' added when transport callback check failed.
>> 2) Now only 'seqpacket_allow()' callback called to check
>> support of SEQPACKET by transport.
>>
>> - 'af_vsock: update comments for stream sockets': commit message
>> updated.
>> Marked 'Reviewed-by' with that fix.
>>
>> - 'virtio/vsock: set packet's type in send':
>> 1) Commit message updated.
>> 2) Parameter 'type' from 'virtio_transport_send_credit_update()'
>> also removed in this patch instead of in next.
>>
>> - 'virtio/vsock: dequeue callback for SOCK_SEQPACKET': SEQPACKET
>> related state wrapped to special struct.
>>
>> - 'virtio/vsock: update trace event for SEQPACKET': format strings
>> now not broken by new lines.
> I left a bunch of comments in the patches, I hope they are easy to fix
> :-)
Thank you, yes, there are still small fixes.
>
> Thanks for the changelogs. About 'per patch changelog', it is very
> useful!
> Just a suggestion: I think it is better to include them in each patch after
> the '---' to simplify the review.
Ack
>
> You can use git-notes(1) or you can simply edit the format-patch and add
> the changelog after the 3 dashes, so that they are ignored when the
> patch is applied.
>
> Thanks,
> Stefano
>
>

2021-03-16 09:33:29

by Stefano Garzarella

[permalink] [raw]
Subject: Re: [RFC PATCH v6 00/22] virtio/vsock: introduce SOCK_SEQPACKET support

On Tue, Mar 16, 2021 at 06:37:31AM +0300, Arseny Krasnov wrote:
>
>On 15.03.2021 18:22, Arseny Krasnov wrote:
>> On 15.03.2021 14:40, Stefano Garzarella wrote:
>>> Hi Arseny,
>>>
>>> On Sun, Mar 07, 2021 at 08:57:19PM +0300, Arseny Krasnov wrote:
>>>> This patchset implements support of SOCK_SEQPACKET for virtio
>>>> transport.
>>>> As SOCK_SEQPACKET guarantees to save record boundaries, so to
>>>> do it, two new packet operations were added: first for start of record
>>>> and second to mark end of record(SEQ_BEGIN and SEQ_END later). Also,
>>>> both operations carries metadata - to maintain boundaries and payload
>>>> integrity. Metadata is introduced by adding special header with two
>>>> fields - message id and message length:
>>>>
>>>> struct virtio_vsock_seq_hdr {
>>>> __le32 msg_id;
>>>> __le32 msg_len;
>>>> } __attribute__((packed));
>>>>
>>>> This header is transmitted as payload of SEQ_BEGIN and SEQ_END
>>>> packets(buffer of second virtio descriptor in chain) in the same way as
>>>> data transmitted in RW packets. Payload was chosen as buffer for this
>>>> header to avoid touching first virtio buffer which carries header of
>>>> packet, because someone could check that size of this buffer is equal
>>>> to size of packet header. To send record, packet with start marker is
>>>> sent first(it's header carries length of record and id),then all data
>>>> is sent as usual 'RW' packets and finally SEQ_END is sent(it carries
>>>> id of message, which is equal to id of SEQ_BEGIN), also after sending
>>>> SEQ_END id is incremented. On receiver's side,size of record is known
>>>> from packet with start record marker. To check that no packets were
>>>> dropped by transport, 'msg_id's of two sequential SEQ_BEGIN and SEQ_END
>>>> are checked to be equal and length of data between two markers is
>>>> compared to then length in SEQ_BEGIN header.
>>>> Now as packets of one socket are not reordered neither on
>>>> vsock nor on vhost transport layers, such markers allows to restore
>>>> original record on receiver's side. If user's buffer is smaller that
>>>> record length, when all out of size data is dropped.
>>>> Maximum length of datagram is not limited as in stream socket,
>>>> because same credit logic is used. Difference with stream socket is
>>>> that user is not woken up until whole record is received or error
>>>> occurred. Implementation also supports 'MSG_EOR' and 'MSG_TRUNC' flags.
>>>> Tests also implemented.
>>>>
>>>> Thanks to [email protected] for encouragements and initial design
>>>> recommendations.
>>>>
>>>> Arseny Krasnov (22):
>>>> af_vsock: update functions for connectible socket
>>>> af_vsock: separate wait data loop
>>>> af_vsock: separate receive data loop
>>>> af_vsock: implement SEQPACKET receive loop
>>>> af_vsock: separate wait space loop
>>>> af_vsock: implement send logic for SEQPACKET
>>>> af_vsock: rest of SEQPACKET support
>>>> af_vsock: update comments for stream sockets
>>>> virtio/vsock: set packet's type in virtio_transport_send_pkt_info()
>>>> virtio/vsock: simplify credit update function API
>>>> virtio/vsock: dequeue callback for SOCK_SEQPACKET
>>>> virtio/vsock: fetch length for SEQPACKET record
>>>> virtio/vsock: add SEQPACKET receive logic
>>>> virtio/vsock: rest of SOCK_SEQPACKET support
>>>> virtio/vsock: SEQPACKET feature bit
>>>> vhost/vsock: SEQPACKET feature bit support
>>>> virtio/vsock: SEQPACKET feature bit support
>>>> virtio/vsock: setup SEQPACKET ops for transport
>>>> vhost/vsock: setup SEQPACKET ops for transport
>>>> vsock/loopback: setup SEQPACKET ops for transport
>>>> vsock_test: add SOCK_SEQPACKET tests
>>>> virtio/vsock: update trace event for SEQPACKET
>>>>
>>>> drivers/vhost/vsock.c | 22 +-
>>>> include/linux/virtio_vsock.h | 22 +
>>>> include/net/af_vsock.h | 10 +
>>>> .../events/vsock_virtio_transport_common.h | 48 +-
>>>> include/uapi/linux/virtio_vsock.h | 19 +
>>>> net/vmw_vsock/af_vsock.c | 589 +++++++++++------
>>>> net/vmw_vsock/virtio_transport.c | 18 +
>>>> net/vmw_vsock/virtio_transport_common.c | 364 ++++++++--
>>>> net/vmw_vsock/vsock_loopback.c | 13 +
>>>> tools/testing/vsock/util.c | 32 +-
>>>> tools/testing/vsock/util.h | 3 +
>>>> tools/testing/vsock/vsock_test.c | 126 ++++
>>>> 12 files changed, 1013 insertions(+), 253 deletions(-)
>>>>
>>>> v5 -> v6:
>>>> General changelog:
>>>> - virtio transport specific callbacks which send SEQ_BEGIN or
>>>> SEQ_END now hidden inside virtio transport. Only enqueue,
>>>> dequeue and record length callbacks are provided by transport.
>>>>
>>>> - virtio feature bit for SEQPACKET socket support introduced:
>>>> VIRTIO_VSOCK_F_SEQPACKET.
>>>>
>>>> - 'msg_cnt' field in 'struct virtio_vsock_seq_hdr' renamed to
>>>> 'msg_id' and used as id.
>>>>
>>>> Per patch changelog:
>>>> - 'af_vsock: separate wait data loop':
>>>> 1) Commit message updated.
>>>> 2) 'prepare_to_wait()' moved inside while loop(thanks to
>>>> Jorgen Hansen).
>>>> Marked 'Reviewed-by' with 1), but as 2) I removed R-b.
>>>>
>>>> - 'af_vsock: separate receive data loop': commit message
>>>> updated.
>>>> Marked 'Reviewed-by' with that fix.
>>>>
>>>> - 'af_vsock: implement SEQPACKET receive loop': style fixes.
>>>>
>>>> - 'af_vsock: rest of SEQPACKET support':
>>>> 1) 'module_put()' added when transport callback check failed.
>>>> 2) Now only 'seqpacket_allow()' callback called to check
>>>> support of SEQPACKET by transport.
>>>>
>>>> - 'af_vsock: update comments for stream sockets': commit message
>>>> updated.
>>>> Marked 'Reviewed-by' with that fix.
>>>>
>>>> - 'virtio/vsock: set packet's type in send':
>>>> 1) Commit message updated.
>>>> 2) Parameter 'type' from 'virtio_transport_send_credit_update()'
>>>> also removed in this patch instead of in next.
>>>>
>>>> - 'virtio/vsock: dequeue callback for SOCK_SEQPACKET': SEQPACKET
>>>> related state wrapped to special struct.
>>>>
>>>> - 'virtio/vsock: update trace event for SEQPACKET': format strings
>>>> now not broken by new lines.
>>> I left a bunch of comments in the patches, I hope they are easy to fix
>>> :-)
>> Thank you, yes, there are still small fixes.
>
>So one more question: is this the final review for this version of the patchset, so I can
>prepare the next version with fixes? Will all the other patches be reviewed in the next version?
>

For me yes, the other patches seem okay, but I would like to see them
later, in the right order and with the last things fixed.

Maybe to merge all this we should wait for agreement on the specs
patch.

Thanks,
Stefano

2021-03-16 11:28:45

by Arseny Krasnov

[permalink] [raw]
Subject: Re: [RFC PATCH v6 00/22] virtio/vsock: introduce SOCK_SEQPACKET support


On 15.03.2021 18:22, Arseny Krasnov wrote:
> On 15.03.2021 14:40, Stefano Garzarella wrote:
>> Hi Arseny,
>>
>> On Sun, Mar 07, 2021 at 08:57:19PM +0300, Arseny Krasnov wrote:
>>> This patchset implements support of SOCK_SEQPACKET for virtio
>>> transport.
>>> As SOCK_SEQPACKET guarantees to save record boundaries, so to
>>> do it, two new packet operations were added: first for start of record
>>> and second to mark end of record(SEQ_BEGIN and SEQ_END later). Also,
>>> both operations carries metadata - to maintain boundaries and payload
>>> integrity. Metadata is introduced by adding special header with two
>>> fields - message id and message length:
>>>
>>> struct virtio_vsock_seq_hdr {
>>> __le32 msg_id;
>>> __le32 msg_len;
>>> } __attribute__((packed));
>>>
>>> This header is transmitted as payload of SEQ_BEGIN and SEQ_END
>>> packets(buffer of second virtio descriptor in chain) in the same way as
>>> data transmitted in RW packets. Payload was chosen as buffer for this
>>> header to avoid touching first virtio buffer which carries header of
>>> packet, because someone could check that size of this buffer is equal
>>> to size of packet header. To send record, packet with start marker is
>>> sent first(it's header carries length of record and id),then all data
>>> is sent as usual 'RW' packets and finally SEQ_END is sent(it carries
>>> id of message, which is equal to id of SEQ_BEGIN), also after sending
>>> SEQ_END id is incremented. On receiver's side,size of record is known
>>> from packet with start record marker. To check that no packets were
>>> dropped by transport, 'msg_id's of two sequential SEQ_BEGIN and SEQ_END
>>> are checked to be equal and length of data between two markers is
>>> compared to then length in SEQ_BEGIN header.
>>> Now as packets of one socket are not reordered neither on
>>> vsock nor on vhost transport layers, such markers allows to restore
>>> original record on receiver's side. If user's buffer is smaller that
>>> record length, when all out of size data is dropped.
>>> Maximum length of datagram is not limited as in stream socket,
>>> because same credit logic is used. Difference with stream socket is
>>> that user is not woken up until whole record is received or error
>>> occurred. Implementation also supports 'MSG_EOR' and 'MSG_TRUNC' flags.
>>> Tests also implemented.
>>>
>>> Thanks to [email protected] for encouragements and initial design
>>> recommendations.
>>>
>>> Arseny Krasnov (22):
>>> af_vsock: update functions for connectible socket
>>> af_vsock: separate wait data loop
>>> af_vsock: separate receive data loop
>>> af_vsock: implement SEQPACKET receive loop
>>> af_vsock: separate wait space loop
>>> af_vsock: implement send logic for SEQPACKET
>>> af_vsock: rest of SEQPACKET support
>>> af_vsock: update comments for stream sockets
>>> virtio/vsock: set packet's type in virtio_transport_send_pkt_info()
>>> virtio/vsock: simplify credit update function API
>>> virtio/vsock: dequeue callback for SOCK_SEQPACKET
>>> virtio/vsock: fetch length for SEQPACKET record
>>> virtio/vsock: add SEQPACKET receive logic
>>> virtio/vsock: rest of SOCK_SEQPACKET support
>>> virtio/vsock: SEQPACKET feature bit
>>> vhost/vsock: SEQPACKET feature bit support
>>> virtio/vsock: SEQPACKET feature bit support
>>> virtio/vsock: setup SEQPACKET ops for transport
>>> vhost/vsock: setup SEQPACKET ops for transport
>>> vsock/loopback: setup SEQPACKET ops for transport
>>> vsock_test: add SOCK_SEQPACKET tests
>>> virtio/vsock: update trace event for SEQPACKET
>>>
>>> drivers/vhost/vsock.c | 22 +-
>>> include/linux/virtio_vsock.h | 22 +
>>> include/net/af_vsock.h | 10 +
>>> .../events/vsock_virtio_transport_common.h | 48 +-
>>> include/uapi/linux/virtio_vsock.h | 19 +
>>> net/vmw_vsock/af_vsock.c | 589 +++++++++++------
>>> net/vmw_vsock/virtio_transport.c | 18 +
>>> net/vmw_vsock/virtio_transport_common.c | 364 ++++++++--
>>> net/vmw_vsock/vsock_loopback.c | 13 +
>>> tools/testing/vsock/util.c | 32 +-
>>> tools/testing/vsock/util.h | 3 +
>>> tools/testing/vsock/vsock_test.c | 126 ++++
>>> 12 files changed, 1013 insertions(+), 253 deletions(-)
>>>
>>> v5 -> v6:
>>> General changelog:
>>> - virtio transport specific callbacks which send SEQ_BEGIN or
>>> SEQ_END now hidden inside virtio transport. Only enqueue,
>>> dequeue and record length callbacks are provided by transport.
>>>
>>> - virtio feature bit for SEQPACKET socket support introduced:
>>> VIRTIO_VSOCK_F_SEQPACKET.
>>>
>>> - 'msg_cnt' field in 'struct virtio_vsock_seq_hdr' renamed to
>>> 'msg_id' and used as id.
>>>
>>> Per patch changelog:
>>> - 'af_vsock: separate wait data loop':
>>> 1) Commit message updated.
>>> 2) 'prepare_to_wait()' moved inside while loop(thanks to
>>> Jorgen Hansen).
>>> Marked 'Reviewed-by' with 1), but as 2) I removed R-b.
>>>
>>> - 'af_vsock: separate receive data loop': commit message
>>> updated.
>>> Marked 'Reviewed-by' with that fix.
>>>
>>> - 'af_vsock: implement SEQPACKET receive loop': style fixes.
>>>
>>> - 'af_vsock: rest of SEQPACKET support':
>>> 1) 'module_put()' added when transport callback check failed.
>>> 2) Now only 'seqpacket_allow()' callback called to check
>>> support of SEQPACKET by transport.
>>>
>>> - 'af_vsock: update comments for stream sockets': commit message
>>> updated.
>>> Marked 'Reviewed-by' with that fix.
>>>
>>> - 'virtio/vsock: set packet's type in send':
>>> 1) Commit message updated.
>>> 2) Parameter 'type' from 'virtio_transport_send_credit_update()'
>>> also removed in this patch instead of in next.
>>>
>>> - 'virtio/vsock: dequeue callback for SOCK_SEQPACKET': SEQPACKET
>>> related state wrapped to special struct.
>>>
>>> - 'virtio/vsock: update trace event for SEQPACKET': format strings
>>> now not broken by new lines.
>> I left a bunch of comments in the patches, I hope they are easy to fix
>> :-)
> Thank you, yes, there are still small fixes.

So one more question: is this the final review for this version of the patchset, so I can
prepare the next version with fixes? Will all the other patches be reviewed in the next version?

Thank You

>> Thanks for the changelogs. About 'per patch changelog', it is very
>> useful!
>> Just a suggestion: I think it is better to include them in each patch after
>> the '---' to simplify the review.
> Ack
>> You can use git-notes(1) or you can simply edit the format-patch and add
>> the changelog after the 3 dashes, so that they are ignored when the
>> patch is applied.
>>
>> Thanks,
>> Stefano
>>
>>