This patchset modifies receive logic for SOCK_SEQPACKET.
Difference between current implementation and this version is that
now reader is woken up when there is at least one RW packet in rx
queue of socket and data is copied to user's buffer, while merged
approach wake up user only when whole message is received and kept
in queue. New implementation has several advantages:
1) There is no limit for message length. Merged approach requires
that length must be smaller than 'peer_buf_alloc', otherwise
transmission will stuck.
2) There is no need to keep whole message in queue, thus no
'kmalloc()' memory will be wasted until EOR is received.
Also new approach has some feature: as fragments of message
are copied until EOR is received, it is possible that part of
message will be already in user's buffer, while rest of message
still not received. And if user will be interrupted by signal or
timeout with part of message in buffer, it will exit receive loop,
leaving rest of message in queue. To solve this problem special
callback was added to transport: it is called when user was forced
to leave exit loop and tells transport to drop any packet until
EOR met. When EOR is found, this mode is disabled and normal packet
processing started. Note, that when 'drop until EOR' mode is on,
incoming packets still inserted in queue, reader will be woken up,
tries to copy data, but nothing will be copied until EOR found.
It was possible to drain such unneeded packets it rx work without
kicking user, but implemented way is simplest. Anyway, i think
such cases are rare.
New test also added - it tries to copy to invalid user's
buffer.
Arseny Krasnov (16):
af_vsock/virtio/vsock: change seqpacket receive logic
af_vsock/virtio/vsock: remove 'seqpacket_has_data' callback
virtio/vsock: remove 'msg_count' based logic
af_vsock/virtio/vsock: add 'seqpacket_drop()' callback
virtio/vsock: remove record size limit for SEQPACKET
vsock_test: SEQPACKET read to broken buffer
drivers/vhost/vsock.c | 2 +-
include/linux/virtio_vsock.h | 7 +-
include/net/af_vsock.h | 4 +-
net/vmw_vsock/af_vsock.c | 44 ++++----
net/vmw_vsock/virtio_transport.c | 2 +-
net/vmw_vsock/virtio_transport_common.c | 103 ++++++++-----------
net/vmw_vsock/vsock_loopback.c | 2 +-
tools/testing/vsock/vsock_test.c | 120 ++++++++++++++++++++++
8 files changed, 193 insertions(+), 91 deletions(-)
v1 -> v2:
Patches reordered and reorganized.
Signed-off-by: Arseny Krasnov <[email protected]>
---
cv.txt | 0
1 file changed, 0 insertions(+), 0 deletions(-)
create mode 100644 cv.txt
diff --git a/cv.txt b/cv.txt
new file mode 100644
index 000000000000..e69de29bb2d1
--
2.25.1
1) In af_vsock "loop" now is really loop: it receives
message fragments one by one, until 'msg_ready'
value is returned by transport.
2) In virtio transport, dequeue callback is called
everytime when at least one fragment of message is
received.
Signed-off-by: Arseny Krasnov <[email protected]>
---
include/linux/virtio_vsock.h | 3 +-
include/net/af_vsock.h | 2 +-
net/vmw_vsock/af_vsock.c | 33 +++++++++----
net/vmw_vsock/virtio_transport_common.c | 62 +++++++++++--------------
4 files changed, 52 insertions(+), 48 deletions(-)
diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
index 35d7eedb5e8e..e68b4029f038 100644
--- a/include/linux/virtio_vsock.h
+++ b/include/linux/virtio_vsock.h
@@ -88,7 +88,8 @@ virtio_transport_seqpacket_enqueue(struct vsock_sock *vsk,
ssize_t
virtio_transport_seqpacket_dequeue(struct vsock_sock *vsk,
struct msghdr *msg,
- int flags);
+ int flags,
+ bool *msg_ready);
s64 virtio_transport_stream_has_data(struct vsock_sock *vsk);
s64 virtio_transport_stream_has_space(struct vsock_sock *vsk);
u32 virtio_transport_seqpacket_has_data(struct vsock_sock *vsk);
diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index ab207677e0a8..c40d341611b0 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -137,7 +137,7 @@ struct vsock_transport {
/* SEQ_PACKET. */
ssize_t (*seqpacket_dequeue)(struct vsock_sock *vsk, struct msghdr *msg,
- int flags);
+ int flags, bool *msg_ready);
int (*seqpacket_enqueue)(struct vsock_sock *vsk, struct msghdr *msg,
size_t len);
bool (*seqpacket_allow)(u32 remote_cid);
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 3e02cc3b24f8..b66884def8e8 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1881,7 +1881,7 @@ static int vsock_connectible_wait_data(struct sock *sk,
err = 0;
transport = vsk->transport;
- while ((data = vsock_connectible_has_data(vsk)) == 0) {
+ while ((data = vsock_stream_has_data(vsk)) == 0) {
prepare_to_wait(sk_sleep(sk), wait, TASK_INTERRUPTIBLE);
if (sk->sk_err != 0 ||
@@ -2013,6 +2013,7 @@ static int __vsock_seqpacket_recvmsg(struct sock *sk, struct msghdr *msg,
size_t len, int flags)
{
const struct vsock_transport *transport;
+ bool msg_ready;
struct vsock_sock *vsk;
ssize_t record_len;
long timeout;
@@ -2023,23 +2024,36 @@ static int __vsock_seqpacket_recvmsg(struct sock *sk, struct msghdr *msg,
transport = vsk->transport;
timeout = sock_rcvtimeo(sk, flags & MSG_DONTWAIT);
+ msg_ready = false;
+ record_len = 0;
- err = vsock_connectible_wait_data(sk, &wait, timeout, NULL, 0);
- if (err <= 0)
- goto out;
+ while (!msg_ready) {
+ ssize_t fragment_len;
+ int intr_err;
- record_len = transport->seqpacket_dequeue(vsk, msg, flags);
+ intr_err = vsock_connectible_wait_data(sk, &wait, timeout, NULL, 0);
+ if (intr_err <= 0) {
+ err = intr_err;
+ break;
+ }
- if (record_len < 0) {
- err = -ENOMEM;
- goto out;
+ fragment_len = transport->seqpacket_dequeue(vsk, msg, flags, &msg_ready);
+
+ if (fragment_len < 0) {
+ err = -ENOMEM;
+ break;
+ }
+
+ record_len += fragment_len;
}
if (sk->sk_err) {
err = -sk->sk_err;
} else if (sk->sk_shutdown & RCV_SHUTDOWN) {
err = 0;
- } else {
+ }
+
+ if (msg_ready && !err) {
/* User sets MSG_TRUNC, so return real length of
* packet.
*/
@@ -2055,7 +2069,6 @@ static int __vsock_seqpacket_recvmsg(struct sock *sk, struct msghdr *msg,
msg->msg_flags |= MSG_TRUNC;
}
-out:
return err;
}
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index 169ba8b72a63..053bcea1a03f 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -407,58 +407,48 @@ virtio_transport_stream_do_dequeue(struct vsock_sock *vsk,
static int virtio_transport_seqpacket_do_dequeue(struct vsock_sock *vsk,
struct msghdr *msg,
- int flags)
+ int flags,
+ bool *msg_ready)
{
struct virtio_vsock_sock *vvs = vsk->trans;
struct virtio_vsock_pkt *pkt;
int dequeued_len = 0;
size_t user_buf_len = msg_data_left(msg);
- bool msg_ready = false;
+ *msg_ready = false;
spin_lock_bh(&vvs->rx_lock);
- if (vvs->msg_count == 0) {
- spin_unlock_bh(&vvs->rx_lock);
- return 0;
- }
+ while (!*msg_ready && !list_empty(&vvs->rx_queue) && dequeued_len >= 0) {
+ size_t pkt_len;
+ size_t bytes_to_copy;
- while (!msg_ready) {
pkt = list_first_entry(&vvs->rx_queue, struct virtio_vsock_pkt, list);
+ pkt_len = (size_t)le32_to_cpu(pkt->hdr.len);
- if (dequeued_len >= 0) {
- size_t pkt_len;
- size_t bytes_to_copy;
+ bytes_to_copy = min(user_buf_len, pkt_len);
- pkt_len = (size_t)le32_to_cpu(pkt->hdr.len);
- bytes_to_copy = min(user_buf_len, pkt_len);
-
- if (bytes_to_copy) {
- int err;
-
- /* sk_lock is held by caller so no one else can dequeue.
- * Unlock rx_lock since memcpy_to_msg() may sleep.
- */
- spin_unlock_bh(&vvs->rx_lock);
+ if (bytes_to_copy) {
+ int err;
+ /* sk_lock is held by caller so no one else can dequeue.
+ * Unlock rx_lock since memcpy_to_msg() may sleep.
+ */
+ spin_unlock_bh(&vvs->rx_lock);
- err = memcpy_to_msg(msg, pkt->buf, bytes_to_copy);
- if (err) {
- /* Copy of message failed. Rest of
- * fragments will be freed without copy.
- */
- dequeued_len = err;
- } else {
- user_buf_len -= bytes_to_copy;
- }
+ err = memcpy_to_msg(msg, pkt->buf, bytes_to_copy);
- spin_lock_bh(&vvs->rx_lock);
- }
+ spin_lock_bh(&vvs->rx_lock);
- if (dequeued_len >= 0)
- dequeued_len += pkt_len;
+ if (err)
+ dequeued_len = err;
+ else
+ user_buf_len -= bytes_to_copy;
}
+ if (dequeued_len >= 0)
+ dequeued_len += pkt_len;
+
if (le32_to_cpu(pkt->hdr.flags) & VIRTIO_VSOCK_SEQ_EOR) {
- msg_ready = true;
+ *msg_ready = true;
vvs->msg_count--;
}
@@ -489,12 +479,12 @@ EXPORT_SYMBOL_GPL(virtio_transport_stream_dequeue);
ssize_t
virtio_transport_seqpacket_dequeue(struct vsock_sock *vsk,
struct msghdr *msg,
- int flags)
+ int flags, bool *msg_ready)
{
if (flags & MSG_PEEK)
return -EOPNOTSUPP;
- return virtio_transport_seqpacket_do_dequeue(vsk, msg, flags);
+ return virtio_transport_seqpacket_do_dequeue(vsk, msg, flags, msg_ready);
}
EXPORT_SYMBOL_GPL(virtio_transport_seqpacket_dequeue);
--
2.25.1
'rx_bytes' is used to check data presence on both SOCK_STREAM
and SOCK_SEQPACKET socket types for virtio/vsock.
Signed-off-by: Arseny Krasnov <[email protected]>
---
drivers/vhost/vsock.c | 1 -
include/linux/virtio_vsock.h | 1 -
include/net/af_vsock.h | 1 -
net/vmw_vsock/af_vsock.c | 10 ----------
net/vmw_vsock/virtio_transport.c | 1 -
net/vmw_vsock/virtio_transport_common.c | 13 -------------
net/vmw_vsock/vsock_loopback.c | 1 -
7 files changed, 28 deletions(-)
diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index d38c996b4f46..c9713d8db0f4 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -451,7 +451,6 @@ static struct virtio_transport vhost_transport = {
.seqpacket_dequeue = virtio_transport_seqpacket_dequeue,
.seqpacket_enqueue = virtio_transport_seqpacket_enqueue,
.seqpacket_allow = vhost_transport_seqpacket_allow,
- .seqpacket_has_data = virtio_transport_seqpacket_has_data,
.notify_poll_in = virtio_transport_notify_poll_in,
.notify_poll_out = virtio_transport_notify_poll_out,
diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
index e68b4029f038..8d34f3d73bbb 100644
--- a/include/linux/virtio_vsock.h
+++ b/include/linux/virtio_vsock.h
@@ -92,7 +92,6 @@ virtio_transport_seqpacket_dequeue(struct vsock_sock *vsk,
bool *msg_ready);
s64 virtio_transport_stream_has_data(struct vsock_sock *vsk);
s64 virtio_transport_stream_has_space(struct vsock_sock *vsk);
-u32 virtio_transport_seqpacket_has_data(struct vsock_sock *vsk);
int virtio_transport_do_socket_init(struct vsock_sock *vsk,
struct vsock_sock *psk);
diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index c40d341611b0..1747c0b564ef 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -141,7 +141,6 @@ struct vsock_transport {
int (*seqpacket_enqueue)(struct vsock_sock *vsk, struct msghdr *msg,
size_t len);
bool (*seqpacket_allow)(u32 remote_cid);
- u32 (*seqpacket_has_data)(struct vsock_sock *vsk);
/* Notification. */
int (*notify_poll_in)(struct vsock_sock *, size_t, bool *);
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index b66884def8e8..87955f9ff065 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -860,16 +860,6 @@ s64 vsock_stream_has_data(struct vsock_sock *vsk)
}
EXPORT_SYMBOL_GPL(vsock_stream_has_data);
-static s64 vsock_connectible_has_data(struct vsock_sock *vsk)
-{
- struct sock *sk = sk_vsock(vsk);
-
- if (sk->sk_type == SOCK_SEQPACKET)
- return vsk->transport->seqpacket_has_data(vsk);
- else
- return vsock_stream_has_data(vsk);
-}
-
s64 vsock_stream_has_space(struct vsock_sock *vsk)
{
return vsk->transport->stream_has_space(vsk);
diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index e0c2c992ad9c..2a7c56fcb062 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -475,7 +475,6 @@ static struct virtio_transport virtio_transport = {
.seqpacket_dequeue = virtio_transport_seqpacket_dequeue,
.seqpacket_enqueue = virtio_transport_seqpacket_enqueue,
.seqpacket_allow = virtio_transport_seqpacket_allow,
- .seqpacket_has_data = virtio_transport_seqpacket_has_data,
.notify_poll_in = virtio_transport_notify_poll_in,
.notify_poll_out = virtio_transport_notify_poll_out,
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index 053bcea1a03f..37d4ed526750 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -530,19 +530,6 @@ s64 virtio_transport_stream_has_data(struct vsock_sock *vsk)
}
EXPORT_SYMBOL_GPL(virtio_transport_stream_has_data);
-u32 virtio_transport_seqpacket_has_data(struct vsock_sock *vsk)
-{
- struct virtio_vsock_sock *vvs = vsk->trans;
- u32 msg_count;
-
- spin_lock_bh(&vvs->rx_lock);
- msg_count = vvs->msg_count;
- spin_unlock_bh(&vvs->rx_lock);
-
- return msg_count;
-}
-EXPORT_SYMBOL_GPL(virtio_transport_seqpacket_has_data);
-
static s64 virtio_transport_has_space(struct vsock_sock *vsk)
{
struct virtio_vsock_sock *vvs = vsk->trans;
diff --git a/net/vmw_vsock/vsock_loopback.c b/net/vmw_vsock/vsock_loopback.c
index 169a8cf65b39..809f807d0710 100644
--- a/net/vmw_vsock/vsock_loopback.c
+++ b/net/vmw_vsock/vsock_loopback.c
@@ -94,7 +94,6 @@ static struct virtio_transport loopback_transport = {
.seqpacket_dequeue = virtio_transport_seqpacket_dequeue,
.seqpacket_enqueue = virtio_transport_seqpacket_enqueue,
.seqpacket_allow = vsock_loopback_seqpacket_allow,
- .seqpacket_has_data = virtio_transport_seqpacket_has_data,
.notify_poll_in = virtio_transport_notify_poll_in,
.notify_poll_out = virtio_transport_notify_poll_out,
--
2.25.1
Add special callback for SEQPACKET socket which is called when
we need to drop current in-progress record: part of record was
copied successfully, reader wait rest of record, but signal
interrupts it and reader leaves it's loop, leaving packets of
current record still in queue. So to avoid copy of "orphaned"
record, we tell transport to drop every packet until EOR will
be found.
Signed-off-by: Arseny Krasnov <[email protected]>
---
drivers/vhost/vsock.c | 1 +
include/linux/virtio_vsock.h | 2 ++
include/net/af_vsock.h | 1 +
net/vmw_vsock/af_vsock.c | 1 +
net/vmw_vsock/virtio_transport.c | 1 +
net/vmw_vsock/virtio_transport_common.c | 23 +++++++++++++++++++----
net/vmw_vsock/vsock_loopback.c | 1 +
7 files changed, 26 insertions(+), 4 deletions(-)
diff --git a/drivers/vhost/vsock.c b/drivers/vhost/vsock.c
index c9713d8db0f4..731b9fe07cd3 100644
--- a/drivers/vhost/vsock.c
+++ b/drivers/vhost/vsock.c
@@ -447,6 +447,7 @@ static struct virtio_transport vhost_transport = {
.stream_rcvhiwat = virtio_transport_stream_rcvhiwat,
.stream_is_active = virtio_transport_stream_is_active,
.stream_allow = virtio_transport_stream_allow,
+ .seqpacket_drop = virtio_transport_seqpacket_drop,
.seqpacket_dequeue = virtio_transport_seqpacket_dequeue,
.seqpacket_enqueue = virtio_transport_seqpacket_enqueue,
diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
index 7360ab7ea0af..18a50f64bf54 100644
--- a/include/linux/virtio_vsock.h
+++ b/include/linux/virtio_vsock.h
@@ -36,6 +36,7 @@ struct virtio_vsock_sock {
u32 rx_bytes;
u32 buf_alloc;
struct list_head rx_queue;
+ bool drop_until_eor;
};
struct virtio_vsock_pkt {
@@ -89,6 +90,7 @@ virtio_transport_seqpacket_dequeue(struct vsock_sock *vsk,
struct msghdr *msg,
int flags,
bool *msg_ready);
+void virtio_transport_seqpacket_drop(struct vsock_sock *vsk);
s64 virtio_transport_stream_has_data(struct vsock_sock *vsk);
s64 virtio_transport_stream_has_space(struct vsock_sock *vsk);
diff --git a/include/net/af_vsock.h b/include/net/af_vsock.h
index 1747c0b564ef..356878aabbd4 100644
--- a/include/net/af_vsock.h
+++ b/include/net/af_vsock.h
@@ -141,6 +141,7 @@ struct vsock_transport {
int (*seqpacket_enqueue)(struct vsock_sock *vsk, struct msghdr *msg,
size_t len);
bool (*seqpacket_allow)(u32 remote_cid);
+ void (*seqpacket_drop)(struct vsock_sock *vsk);
/* Notification. */
int (*notify_poll_in)(struct vsock_sock *, size_t, bool *);
diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 87955f9ff065..380a90c758c4 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -2024,6 +2024,7 @@ static int __vsock_seqpacket_recvmsg(struct sock *sk, struct msghdr *msg,
intr_err = vsock_connectible_wait_data(sk, &wait, timeout, NULL, 0);
if (intr_err <= 0) {
err = intr_err;
+ transport->seqpacket_drop(vsk);
break;
}
diff --git a/net/vmw_vsock/virtio_transport.c b/net/vmw_vsock/virtio_transport.c
index 2a7c56fcb062..2f7d54071ee2 100644
--- a/net/vmw_vsock/virtio_transport.c
+++ b/net/vmw_vsock/virtio_transport.c
@@ -475,6 +475,7 @@ static struct virtio_transport virtio_transport = {
.seqpacket_dequeue = virtio_transport_seqpacket_dequeue,
.seqpacket_enqueue = virtio_transport_seqpacket_enqueue,
.seqpacket_allow = virtio_transport_seqpacket_allow,
+ .seqpacket_drop = virtio_transport_seqpacket_drop,
.notify_poll_in = virtio_transport_notify_poll_in,
.notify_poll_out = virtio_transport_notify_poll_out,
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index ce67cf449ef8..52765754edcd 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -425,7 +425,7 @@ static int virtio_transport_seqpacket_do_dequeue(struct vsock_sock *vsk,
pkt = list_first_entry(&vvs->rx_queue, struct virtio_vsock_pkt, list);
pkt_len = (size_t)le32_to_cpu(pkt->hdr.len);
- bytes_to_copy = min(user_buf_len, pkt_len);
+ bytes_to_copy = vvs->drop_until_eor ? 0 : min(user_buf_len, pkt_len);
if (bytes_to_copy) {
int err;
@@ -438,17 +438,22 @@ static int virtio_transport_seqpacket_do_dequeue(struct vsock_sock *vsk,
spin_lock_bh(&vvs->rx_lock);
- if (err)
+ if (err) {
dequeued_len = err;
- else
+ vvs->drop_until_eor = true;
+ } else {
user_buf_len -= bytes_to_copy;
+ }
}
if (dequeued_len >= 0)
dequeued_len += pkt_len;
if (le32_to_cpu(pkt->hdr.flags) & VIRTIO_VSOCK_SEQ_EOR) {
- *msg_ready = true;
+ if (vvs->drop_until_eor)
+ vvs->drop_until_eor = false;
+ else
+ *msg_ready = true;
}
virtio_transport_dec_rx_pkt(vvs, pkt);
@@ -487,6 +492,16 @@ virtio_transport_seqpacket_dequeue(struct vsock_sock *vsk,
}
EXPORT_SYMBOL_GPL(virtio_transport_seqpacket_dequeue);
+void virtio_transport_seqpacket_drop(struct vsock_sock *vsk)
+{
+ struct virtio_vsock_sock *vvs = vsk->trans;
+
+ spin_lock_bh(&vvs->rx_lock);
+ vvs->drop_until_eor = true;
+ spin_unlock_bh(&vvs->rx_lock);
+}
+EXPORT_SYMBOL_GPL(virtio_transport_seqpacket_drop);
+
int
virtio_transport_seqpacket_enqueue(struct vsock_sock *vsk,
struct msghdr *msg,
diff --git a/net/vmw_vsock/vsock_loopback.c b/net/vmw_vsock/vsock_loopback.c
index 809f807d0710..d9030a46e4b9 100644
--- a/net/vmw_vsock/vsock_loopback.c
+++ b/net/vmw_vsock/vsock_loopback.c
@@ -94,6 +94,7 @@ static struct virtio_transport loopback_transport = {
.seqpacket_dequeue = virtio_transport_seqpacket_dequeue,
.seqpacket_enqueue = virtio_transport_seqpacket_enqueue,
.seqpacket_allow = vsock_loopback_seqpacket_allow,
+ .seqpacket_drop = virtio_transport_seqpacket_drop,
.notify_poll_in = virtio_transport_notify_poll_in,
.notify_poll_out = virtio_transport_notify_poll_out,
--
2.25.1
Remove record size limit which was 'peer_buf_alloc' value.
New approach doesn't need this, because data is copied to
user's buffer in stream manner(we don't wait until whole
record is received).
Signed-off-by: Arseny Krasnov <[email protected]>
---
net/vmw_vsock/virtio_transport_common.c | 11 -----------
1 file changed, 11 deletions(-)
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index 52765754edcd..5cfdf701a8af 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -507,17 +507,6 @@ virtio_transport_seqpacket_enqueue(struct vsock_sock *vsk,
struct msghdr *msg,
size_t len)
{
- struct virtio_vsock_sock *vvs = vsk->trans;
-
- spin_lock_bh(&vvs->tx_lock);
-
- if (len > vvs->peer_buf_alloc) {
- spin_unlock_bh(&vvs->tx_lock);
- return -EMSGSIZE;
- }
-
- spin_unlock_bh(&vvs->tx_lock);
-
return virtio_transport_stream_enqueue(vsk, msg, len);
}
EXPORT_SYMBOL_GPL(virtio_transport_seqpacket_enqueue);
--
2.25.1
'msg_count' is obsolete, because 'rx_bytes' used instead.
Signed-off-by: Arseny Krasnov <[email protected]>
---
include/linux/virtio_vsock.h | 1 -
net/vmw_vsock/virtio_transport_common.c | 4 ----
2 files changed, 5 deletions(-)
diff --git a/include/linux/virtio_vsock.h b/include/linux/virtio_vsock.h
index 8d34f3d73bbb..7360ab7ea0af 100644
--- a/include/linux/virtio_vsock.h
+++ b/include/linux/virtio_vsock.h
@@ -36,7 +36,6 @@ struct virtio_vsock_sock {
u32 rx_bytes;
u32 buf_alloc;
struct list_head rx_queue;
- u32 msg_count;
};
struct virtio_vsock_pkt {
diff --git a/net/vmw_vsock/virtio_transport_common.c b/net/vmw_vsock/virtio_transport_common.c
index 37d4ed526750..ce67cf449ef8 100644
--- a/net/vmw_vsock/virtio_transport_common.c
+++ b/net/vmw_vsock/virtio_transport_common.c
@@ -449,7 +449,6 @@ static int virtio_transport_seqpacket_do_dequeue(struct vsock_sock *vsk,
if (le32_to_cpu(pkt->hdr.flags) & VIRTIO_VSOCK_SEQ_EOR) {
*msg_ready = true;
- vvs->msg_count--;
}
virtio_transport_dec_rx_pkt(vvs, pkt);
@@ -1006,9 +1005,6 @@ virtio_transport_recv_enqueue(struct vsock_sock *vsk,
goto out;
}
- if (le32_to_cpu(pkt->hdr.flags) & VIRTIO_VSOCK_SEQ_EOR)
- vvs->msg_count++;
-
/* Try to copy small packets into the buffer of last packet queued,
* to avoid wasting memory queueing the entire buffer with a small
* payload.
--
2.25.1
Add test where sender sends two message, each with own
data pattern. Reader tries to read first to broken buffer:
it has three pages size, but middle page is unmapped. Then,
reader tries to read second message to valid buffer. Test
checks, that uncopied part of first message was dropped
and thus not copied as part of second message.
Signed-off-by: Arseny Krasnov <[email protected]>
---
tools/testing/vsock/vsock_test.c | 120 +++++++++++++++++++++++++++++++
1 file changed, 120 insertions(+)
diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
index 67766bfe176f..cdaa154fc3a9 100644
--- a/tools/testing/vsock/vsock_test.c
+++ b/tools/testing/vsock/vsock_test.c
@@ -16,6 +16,7 @@
#include <linux/kernel.h>
#include <sys/types.h>
#include <sys/socket.h>
+#include <sys/mman.h>
#include "timeout.h"
#include "control.h"
@@ -385,6 +386,120 @@ static void test_seqpacket_msg_trunc_server(const struct test_opts *opts)
close(fd);
}
+#define BUF_PATTERN_1 'a'
+#define BUF_PATTERN_2 'b'
+
+static void test_seqpacket_invalid_rec_buffer_client(const struct test_opts *opts)
+{
+ int fd;
+ unsigned char *buf1;
+ unsigned char *buf2;
+ int buf_size = getpagesize() * 3;
+
+ fd = vsock_seqpacket_connect(opts->peer_cid, 1234);
+ if (fd < 0) {
+ perror("connect");
+ exit(EXIT_FAILURE);
+ }
+
+ buf1 = malloc(buf_size);
+ if (!buf1) {
+ perror("'malloc()' for 'buf1'");
+ exit(EXIT_FAILURE);
+ }
+
+ buf2 = malloc(buf_size);
+ if (!buf2) {
+ perror("'malloc()' for 'buf2'");
+ exit(EXIT_FAILURE);
+ }
+
+ memset(buf1, BUF_PATTERN_1, buf_size);
+ memset(buf2, BUF_PATTERN_2, buf_size);
+
+ if (send(fd, buf1, buf_size, 0) != buf_size) {
+ perror("send failed");
+ exit(EXIT_FAILURE);
+ }
+
+ if (send(fd, buf2, buf_size, 0) != buf_size) {
+ perror("send failed");
+ exit(EXIT_FAILURE);
+ }
+
+ close(fd);
+}
+
+static void test_seqpacket_invalid_rec_buffer_server(const struct test_opts *opts)
+{
+ int fd;
+ unsigned char *broken_buf;
+ unsigned char *valid_buf;
+ int page_size = getpagesize();
+ int buf_size = page_size * 3;
+ ssize_t res;
+ int prot = PROT_READ | PROT_WRITE;
+ int flags = MAP_PRIVATE | MAP_ANONYMOUS;
+ int i;
+
+ fd = vsock_seqpacket_accept(VMADDR_CID_ANY, 1234, NULL);
+ if (fd < 0) {
+ perror("accept");
+ exit(EXIT_FAILURE);
+ }
+
+ /* Setup first buffer. */
+ broken_buf = mmap(NULL, buf_size, prot, flags, -1, 0);
+ if (broken_buf == MAP_FAILED) {
+ perror("mmap for 'broken_buf'");
+ exit(EXIT_FAILURE);
+ }
+
+ /* Unmap "hole" in buffer. */
+ if (munmap(broken_buf + page_size, page_size)) {
+ perror("'broken_buf' setup");
+ exit(EXIT_FAILURE);
+ }
+
+ valid_buf = mmap(NULL, buf_size, prot, flags, -1, 0);
+ if (valid_buf == MAP_FAILED) {
+ perror("mmap for 'valid_buf'");
+ exit(EXIT_FAILURE);
+ }
+
+ /* Try to fill buffer with unmapped middle. */
+ res = read(fd, broken_buf, buf_size);
+ if (res != -1) {
+ perror("invalid read result of 'broken_buf'");
+ exit(EXIT_FAILURE);
+ }
+
+ if (errno != ENOMEM) {
+ perror("invalid errno of 'broken_buf'");
+ exit(EXIT_FAILURE);
+ }
+
+ /* Try to fill valid buffer. */
+ res = read(fd, valid_buf, buf_size);
+ if (res != buf_size) {
+ perror("invalid read result of 'valid_buf'");
+ exit(EXIT_FAILURE);
+ }
+
+ for (i = 0; i < buf_size; i++) {
+ if (valid_buf[i] != BUF_PATTERN_2) {
+ perror("invalid pattern for valid buf");
+ exit(EXIT_FAILURE);
+ }
+ }
+
+ /* Unmap buffers. */
+ munmap(broken_buf, page_size);
+ munmap(broken_buf + page_size * 2, page_size);
+ munmap(valid_buf, buf_size);
+ close(fd);
+}
+
static struct test_case test_cases[] = {
{
.name = "SOCK_STREAM connection reset",
@@ -425,6 +540,11 @@ static struct test_case test_cases[] = {
.run_client = test_seqpacket_msg_trunc_client,
.run_server = test_seqpacket_msg_trunc_server,
},
+ {
+ .name = "SOCK_SEQPACKET invalid receive buffer",
+ .run_client = test_seqpacket_invalid_rec_buffer_client,
+ .run_server = test_seqpacket_invalid_rec_buffer_server,
+ },
{},
};
--
2.25.1
On 04.07.2021 11:08, Arseny Krasnov wrote:
> This patchset modifies receive logic for SOCK_SEQPACKET.
> Difference between current implementation and this version is that
> now reader is woken up when there is at least one RW packet in rx
> queue of socket and data is copied to user's buffer, while merged
> approach wake up user only when whole message is received and kept
> in queue. New implementation has several advantages:
> 1) There is no limit for message length. Merged approach requires
> that length must be smaller than 'peer_buf_alloc', otherwise
> transmission will stuck.
> 2) There is no need to keep whole message in queue, thus no
> 'kmalloc()' memory will be wasted until EOR is received.
>
> Also new approach has some feature: as fragments of message
> are copied until EOR is received, it is possible that part of
> message will be already in user's buffer, while rest of message
> still not received. And if user will be interrupted by signal or
> timeout with part of message in buffer, it will exit receive loop,
> leaving rest of message in queue. To solve this problem special
> callback was added to transport: it is called when user was forced
> to leave exit loop and tells transport to drop any packet until
> EOR met. When EOR is found, this mode is disabled and normal packet
> processing started. Note, that when 'drop until EOR' mode is on,
> incoming packets still inserted in queue, reader will be woken up,
> tries to copy data, but nothing will be copied until EOR found.
> It was possible to drain such unneeded packets it rx work without
> kicking user, but implemented way is simplest. Anyway, i think
> such cases are rare.
>
> New test also added - it tries to copy to invalid user's
> buffer.
>
> Arseny Krasnov (16):
> af_vsock/virtio/vsock: change seqpacket receive logic
> af_vsock/virtio/vsock: remove 'seqpacket_has_data' callback
> virtio/vsock: remove 'msg_count' based logic
> af_vsock/virtio/vsock: add 'seqpacket_drop()' callback
> virtio/vsock: remove record size limit for SEQPACKET
> vsock_test: SEQPACKET read to broken buffer
>
> drivers/vhost/vsock.c | 2 +-
> include/linux/virtio_vsock.h | 7 +-
> include/net/af_vsock.h | 4 +-
> net/vmw_vsock/af_vsock.c | 44 ++++----
> net/vmw_vsock/virtio_transport.c | 2 +-
> net/vmw_vsock/virtio_transport_common.c | 103 ++++++++-----------
> net/vmw_vsock/vsock_loopback.c | 2 +-
> tools/testing/vsock/vsock_test.c | 120 ++++++++++++++++++++++
> 8 files changed, 193 insertions(+), 91 deletions(-)
>
> v1 -> v2:
> Patches reordered and reorganized.
>
> Signed-off-by: Arseny Krasnov <[email protected]>
> ---
> cv.txt | 0
> 1 file changed, 0 insertions(+), 0 deletions(-)
> create mode 100644 cv.txt
>
> diff --git a/cv.txt b/cv.txt
> new file mode 100644
> index 000000000000..e69de29bb2d1
Sorry, forget to remove my tmp file
On Sun, Jul 04, 2021 at 11:08:13AM +0300, Arseny Krasnov wrote:
> This patchset modifies receive logic for SOCK_SEQPACKET.
> Difference between current implementation and this version is that
> now reader is woken up when there is at least one RW packet in rx
> queue of socket and data is copied to user's buffer, while merged
> approach wake up user only when whole message is received and kept
> in queue. New implementation has several advantages:
> 1) There is no limit for message length. Merged approach requires
> that length must be smaller than 'peer_buf_alloc', otherwise
> transmission will stuck.
> 2) There is no need to keep whole message in queue, thus no
> 'kmalloc()' memory will be wasted until EOR is received.
>
> Also new approach has some feature: as fragments of message
> are copied until EOR is received, it is possible that part of
> message will be already in user's buffer, while rest of message
> still not received. And if user will be interrupted by signal or
> timeout with part of message in buffer, it will exit receive loop,
> leaving rest of message in queue. To solve this problem special
> callback was added to transport: it is called when user was forced
> to leave exit loop and tells transport to drop any packet until
> EOR met.
Sorry about commenting late in the game. I'm a bit lost
SOCK_SEQPACKET
Provides sequenced, reliable, bidirectional, connection-mode transmission paths for records. A record can be sent using one or more output operations and received using one or more input operations, but a single operation never transfers part of more than one record. Record boundaries are visible to the receiver via the MSG_EOR flag.
it's supposed to be reliable - how is it legal to drop packets?
> When EOR is found, this mode is disabled and normal packet
> processing started. Note, that when 'drop until EOR' mode is on,
> incoming packets still inserted in queue, reader will be woken up,
> tries to copy data, but nothing will be copied until EOR found.
> It was possible to drain such unneeded packets it rx work without
> kicking user, but implemented way is simplest. Anyway, i think
> such cases are rare.
> New test also added - it tries to copy to invalid user's
> buffer.
>
> Arseny Krasnov (16):
> af_vsock/virtio/vsock: change seqpacket receive logic
> af_vsock/virtio/vsock: remove 'seqpacket_has_data' callback
> virtio/vsock: remove 'msg_count' based logic
> af_vsock/virtio/vsock: add 'seqpacket_drop()' callback
> virtio/vsock: remove record size limit for SEQPACKET
> vsock_test: SEQPACKET read to broken buffer
>
> drivers/vhost/vsock.c | 2 +-
> include/linux/virtio_vsock.h | 7 +-
> include/net/af_vsock.h | 4 +-
> net/vmw_vsock/af_vsock.c | 44 ++++----
> net/vmw_vsock/virtio_transport.c | 2 +-
> net/vmw_vsock/virtio_transport_common.c | 103 ++++++++-----------
> net/vmw_vsock/vsock_loopback.c | 2 +-
> tools/testing/vsock/vsock_test.c | 120 ++++++++++++++++++++++
> 8 files changed, 193 insertions(+), 91 deletions(-)
>
> v1 -> v2:
> Patches reordered and reorganized.
>
> Signed-off-by: Arseny Krasnov <[email protected]>
> ---
> cv.txt | 0
> 1 file changed, 0 insertions(+), 0 deletions(-)
> create mode 100644 cv.txt
>
> diff --git a/cv.txt b/cv.txt
> new file mode 100644
> index 000000000000..e69de29bb2d1
> --
> 2.25.1
On 04.07.2021 11:30, Michael S. Tsirkin wrote:
> On Sun, Jul 04, 2021 at 11:08:13AM +0300, Arseny Krasnov wrote:
>> This patchset modifies receive logic for SOCK_SEQPACKET.
>> Difference between current implementation and this version is that
>> now reader is woken up when there is at least one RW packet in rx
>> queue of socket and data is copied to user's buffer, while merged
>> approach wake up user only when whole message is received and kept
>> in queue. New implementation has several advantages:
>> 1) There is no limit for message length. Merged approach requires
>> that length must be smaller than 'peer_buf_alloc', otherwise
>> transmission will stuck.
>> 2) There is no need to keep whole message in queue, thus no
>> 'kmalloc()' memory will be wasted until EOR is received.
>>
>> Also new approach has some feature: as fragments of message
>> are copied until EOR is received, it is possible that part of
>> message will be already in user's buffer, while rest of message
>> still not received. And if user will be interrupted by signal or
>> timeout with part of message in buffer, it will exit receive loop,
>> leaving rest of message in queue. To solve this problem special
>> callback was added to transport: it is called when user was forced
>> to leave exit loop and tells transport to drop any packet until
>> EOR met.
> Sorry about commenting late in the game. I'm a bit lost
>
>
> SOCK_SEQPACKET
> Provides sequenced, reliable, bidirectional, connection-mode transmission paths for records. A record can be sent using one or more output operations and received using one or more input operations, but a single operation never transfers part of more than one record. Record boundaries are visible to the receiver via the MSG_EOR flag.
>
> it's supposed to be reliable - how is it legal to drop packets?
Sorry, seems i need to rephrase description. "Packet" here means fragment of record(message) at transport
layer. As this is SEQPACKET mode, receiver could get only whole message or error, so if only several fragments
of message was copied (if signal received for example) we can't return it to user - it breaks SEQPACKET sense. I think,
in this case we can drop rest of record's fragments legally.
Thank You
>
>
>> When EOR is found, this mode is disabled and normal packet
>> processing started. Note, that when 'drop until EOR' mode is on,
>> incoming packets still inserted in queue, reader will be woken up,
>> tries to copy data, but nothing will be copied until EOR found.
>> It was possible to drain such unneeded packets it rx work without
>> kicking user, but implemented way is simplest. Anyway, i think
>> such cases are rare.
>
>> New test also added - it tries to copy to invalid user's
>> buffer.
>>
>> Arseny Krasnov (16):
>> af_vsock/virtio/vsock: change seqpacket receive logic
>> af_vsock/virtio/vsock: remove 'seqpacket_has_data' callback
>> virtio/vsock: remove 'msg_count' based logic
>> af_vsock/virtio/vsock: add 'seqpacket_drop()' callback
>> virtio/vsock: remove record size limit for SEQPACKET
>> vsock_test: SEQPACKET read to broken buffer
>>
>> drivers/vhost/vsock.c | 2 +-
>> include/linux/virtio_vsock.h | 7 +-
>> include/net/af_vsock.h | 4 +-
>> net/vmw_vsock/af_vsock.c | 44 ++++----
>> net/vmw_vsock/virtio_transport.c | 2 +-
>> net/vmw_vsock/virtio_transport_common.c | 103 ++++++++-----------
>> net/vmw_vsock/vsock_loopback.c | 2 +-
>> tools/testing/vsock/vsock_test.c | 120 ++++++++++++++++++++++
>> 8 files changed, 193 insertions(+), 91 deletions(-)
>>
>> v1 -> v2:
>> Patches reordered and reorganized.
>>
>> Signed-off-by: Arseny Krasnov <[email protected]>
>> ---
>> cv.txt | 0
>> 1 file changed, 0 insertions(+), 0 deletions(-)
>> create mode 100644 cv.txt
>>
>> diff --git a/cv.txt b/cv.txt
>> new file mode 100644
>> index 000000000000..e69de29bb2d1
>> --
>> 2.25.1
>
On Sun, Jul 04, 2021 at 12:23:03PM +0300, Arseny Krasnov wrote:
>
> On 04.07.2021 11:30, Michael S. Tsirkin wrote:
> > On Sun, Jul 04, 2021 at 11:08:13AM +0300, Arseny Krasnov wrote:
> >> This patchset modifies receive logic for SOCK_SEQPACKET.
> >> Difference between current implementation and this version is that
> >> now reader is woken up when there is at least one RW packet in rx
> >> queue of socket and data is copied to user's buffer, while merged
> >> approach wake up user only when whole message is received and kept
> >> in queue. New implementation has several advantages:
> >> 1) There is no limit for message length. Merged approach requires
> >> that length must be smaller than 'peer_buf_alloc', otherwise
> >> transmission will stuck.
> >> 2) There is no need to keep whole message in queue, thus no
> >> 'kmalloc()' memory will be wasted until EOR is received.
> >>
> >> Also new approach has some feature: as fragments of message
> >> are copied until EOR is received, it is possible that part of
> >> message will be already in user's buffer, while rest of message
> >> still not received. And if user will be interrupted by signal or
> >> timeout with part of message in buffer, it will exit receive loop,
> >> leaving rest of message in queue. To solve this problem special
> >> callback was added to transport: it is called when user was forced
> >> to leave exit loop and tells transport to drop any packet until
> >> EOR met.
> > Sorry about commenting late in the game. I'm a bit lost
> >
> >
> > SOCK_SEQPACKET
> > Provides sequenced, reliable, bidirectional, connection-mode transmission paths for records. A record can be sent using one or more output operations and received using one or more input operations, but a single operation never transfers part of more than one record. Record boundaries are visible to the receiver via the MSG_EOR flag.
> >
> > it's supposed to be reliable - how is it legal to drop packets?
>
> Sorry, seems i need to rephrase description. "Packet" here means fragment of record(message) at transport
>
> layer. As this is SEQPACKET mode, receiver could get only whole message or error, so if only several fragments
>
> of message was copied (if signal received for example) we can't return it to user - it breaks SEQPACKET sense. I think,
>
> in this case we can drop rest of record's fragments legally.
>
>
> Thank You
Would not that violate the reliable property? IIUC it's only ok to
return an error if socket gets closed. Just like e.g. TCP ...
> >
> >
> >> When EOR is found, this mode is disabled and normal packet
> >> processing started. Note, that when 'drop until EOR' mode is on,
> >> incoming packets still inserted in queue, reader will be woken up,
> >> tries to copy data, but nothing will be copied until EOR found.
> >> It was possible to drain such unneeded packets it rx work without
> >> kicking user, but implemented way is simplest. Anyway, i think
> >> such cases are rare.
> >
> >> New test also added - it tries to copy to invalid user's
> >> buffer.
> >>
> >> Arseny Krasnov (16):
> >> af_vsock/virtio/vsock: change seqpacket receive logic
> >> af_vsock/virtio/vsock: remove 'seqpacket_has_data' callback
> >> virtio/vsock: remove 'msg_count' based logic
> >> af_vsock/virtio/vsock: add 'seqpacket_drop()' callback
> >> virtio/vsock: remove record size limit for SEQPACKET
> >> vsock_test: SEQPACKET read to broken buffer
> >>
> >> drivers/vhost/vsock.c | 2 +-
> >> include/linux/virtio_vsock.h | 7 +-
> >> include/net/af_vsock.h | 4 +-
> >> net/vmw_vsock/af_vsock.c | 44 ++++----
> >> net/vmw_vsock/virtio_transport.c | 2 +-
> >> net/vmw_vsock/virtio_transport_common.c | 103 ++++++++-----------
> >> net/vmw_vsock/vsock_loopback.c | 2 +-
> >> tools/testing/vsock/vsock_test.c | 120 ++++++++++++++++++++++
> >> 8 files changed, 193 insertions(+), 91 deletions(-)
> >>
> >> v1 -> v2:
> >> Patches reordered and reorganized.
> >>
> >> Signed-off-by: Arseny Krasnov <[email protected]>
> >> ---
> >> cv.txt | 0
> >> 1 file changed, 0 insertions(+), 0 deletions(-)
> >> create mode 100644 cv.txt
> >>
> >> diff --git a/cv.txt b/cv.txt
> >> new file mode 100644
> >> index 000000000000..e69de29bb2d1
> >> --
> >> 2.25.1
> >
On 04.07.2021 12:54, Michael S. Tsirkin wrote:
> On Sun, Jul 04, 2021 at 12:23:03PM +0300, Arseny Krasnov wrote:
>> On 04.07.2021 11:30, Michael S. Tsirkin wrote:
>>> On Sun, Jul 04, 2021 at 11:08:13AM +0300, Arseny Krasnov wrote:
>>>> This patchset modifies receive logic for SOCK_SEQPACKET.
>>>> Difference between current implementation and this version is that
>>>> now reader is woken up when there is at least one RW packet in rx
>>>> queue of socket and data is copied to user's buffer, while merged
>>>> approach wake up user only when whole message is received and kept
>>>> in queue. New implementation has several advantages:
>>>> 1) There is no limit for message length. Merged approach requires
>>>> that length must be smaller than 'peer_buf_alloc', otherwise
>>>> transmission will stuck.
>>>> 2) There is no need to keep whole message in queue, thus no
>>>> 'kmalloc()' memory will be wasted until EOR is received.
>>>>
>>>> Also new approach has some feature: as fragments of message
>>>> are copied until EOR is received, it is possible that part of
>>>> message will be already in user's buffer, while rest of message
>>>> still not received. And if user will be interrupted by signal or
>>>> timeout with part of message in buffer, it will exit receive loop,
>>>> leaving rest of message in queue. To solve this problem special
>>>> callback was added to transport: it is called when user was forced
>>>> to leave exit loop and tells transport to drop any packet until
>>>> EOR met.
>>> Sorry about commenting late in the game. I'm a bit lost
>>>
>>>
>>> SOCK_SEQPACKET
>>> Provides sequenced, reliable, bidirectional, connection-mode transmission paths for records. A record can be sent using one or more output operations and received using one or more input operations, but a single operation never transfers part of more than one record. Record boundaries are visible to the receiver via the MSG_EOR flag.
>>>
>>> it's supposed to be reliable - how is it legal to drop packets?
>> Sorry, seems i need to rephrase description. "Packet" here means fragment of record(message) at transport
>>
>> layer. As this is SEQPACKET mode, receiver could get only whole message or error, so if only several fragments
>>
>> of message was copied (if signal received for example) we can't return it to user - it breaks SEQPACKET sense. I think,
>>
>> in this case we can drop rest of record's fragments legally.
>>
>>
>> Thank You
> Would not that violate the reliable property? IIUC it's only ok to
> return an error if socket gets closed. Just like e.g. TCP ...
>
Sorry for late answer, yes You're right, seems this is unwanted drop...
Lets wait for Stefano Garzarella feedback
Thank You
>
>>>
>>>> When EOR is found, this mode is disabled and normal packet
>>>> processing started. Note, that when 'drop until EOR' mode is on,
>>>> incoming packets still inserted in queue, reader will be woken up,
>>>> tries to copy data, but nothing will be copied until EOR found.
>>>> It was possible to drain such unneeded packets it rx work without
>>>> kicking user, but implemented way is simplest. Anyway, i think
>>>> such cases are rare.
>>>> New test also added - it tries to copy to invalid user's
>>>> buffer.
>>>>
>>>> Arseny Krasnov (16):
>>>> af_vsock/virtio/vsock: change seqpacket receive logic
>>>> af_vsock/virtio/vsock: remove 'seqpacket_has_data' callback
>>>> virtio/vsock: remove 'msg_count' based logic
>>>> af_vsock/virtio/vsock: add 'seqpacket_drop()' callback
>>>> virtio/vsock: remove record size limit for SEQPACKET
>>>> vsock_test: SEQPACKET read to broken buffer
>>>>
>>>> drivers/vhost/vsock.c | 2 +-
>>>> include/linux/virtio_vsock.h | 7 +-
>>>> include/net/af_vsock.h | 4 +-
>>>> net/vmw_vsock/af_vsock.c | 44 ++++----
>>>> net/vmw_vsock/virtio_transport.c | 2 +-
>>>> net/vmw_vsock/virtio_transport_common.c | 103 ++++++++-----------
>>>> net/vmw_vsock/vsock_loopback.c | 2 +-
>>>> tools/testing/vsock/vsock_test.c | 120 ++++++++++++++++++++++
>>>> 8 files changed, 193 insertions(+), 91 deletions(-)
>>>>
>>>> v1 -> v2:
>>>> Patches reordered and reorganized.
>>>>
>>>> Signed-off-by: Arseny Krasnov <[email protected]>
>>>> ---
>>>> cv.txt | 0
>>>> 1 file changed, 0 insertions(+), 0 deletions(-)
>>>> create mode 100644 cv.txt
>>>>
>>>> diff --git a/cv.txt b/cv.txt
>>>> new file mode 100644
>>>> index 000000000000..e69de29bb2d1
>>>> --
>>>> 2.25.1
>
On Mon, Jul 05, 2021 at 01:48:28PM +0300, Arseny Krasnov wrote:
>
>On 04.07.2021 12:54, Michael S. Tsirkin wrote:
>> On Sun, Jul 04, 2021 at 12:23:03PM +0300, Arseny Krasnov wrote:
>>> On 04.07.2021 11:30, Michael S. Tsirkin wrote:
>>>> On Sun, Jul 04, 2021 at 11:08:13AM +0300, Arseny Krasnov wrote:
>>>>> This patchset modifies receive logic for SOCK_SEQPACKET.
>>>>> Difference between current implementation and this version is that
>>>>> now reader is woken up when there is at least one RW packet in rx
>>>>> queue of socket and data is copied to user's buffer, while merged
>>>>> approach wake up user only when whole message is received and kept
>>>>> in queue. New implementation has several advantages:
>>>>> 1) There is no limit for message length. Merged approach requires
>>>>> that length must be smaller than 'peer_buf_alloc', otherwise
>>>>> transmission will stuck.
>>>>> 2) There is no need to keep whole message in queue, thus no
>>>>> 'kmalloc()' memory will be wasted until EOR is received.
>>>>>
>>>>> Also new approach has some feature: as fragments of message
>>>>> are copied until EOR is received, it is possible that part of
>>>>> message will be already in user's buffer, while rest of message
>>>>> still not received. And if user will be interrupted by signal or
>>>>> timeout with part of message in buffer, it will exit receive loop,
>>>>> leaving rest of message in queue. To solve this problem special
>>>>> callback was added to transport: it is called when user was forced
>>>>> to leave exit loop and tells transport to drop any packet until
>>>>> EOR met.
>>>> Sorry about commenting late in the game. I'm a bit lost
>>>>
>>>>
>>>> SOCK_SEQPACKET
>>>> Provides sequenced, reliable, bidirectional, connection-mode transmission paths for records. A record can be sent using one or more output operations and received using one or more input operations, but a single operation never transfers part of more than one record. Record boundaries are visible to the receiver via the MSG_EOR flag.
>>>>
>>>> it's supposed to be reliable - how is it legal to drop packets?
>>> Sorry, seems i need to rephrase description. "Packet" here means fragment of record(message) at transport
>>>
>>> layer. As this is SEQPACKET mode, receiver could get only whole message or error, so if only several fragments
>>>
>>> of message was copied (if signal received for example) we can't return it to user - it breaks SEQPACKET sense. I think,
>>>
>>> in this case we can drop rest of record's fragments legally.
>>>
>>>
>>> Thank You
>> Would not that violate the reliable property? IIUC it's only ok to
>> return an error if socket gets closed. Just like e.g. TCP ...
>>
>Sorry for late answer, yes You're right, seems this is unwanted drop...
>
>Lets wait for Stefano Garzarella feedback
It was the same concern I had with the series that introduced SEQPACKET
for vsock, which is why I suggested to wait until the message is
complete, before copying it to the user's buffer.
IIUC, with the current upstream implementation, we don't have this
problem, right?
I'm not sure how to fix this, other than by keeping all the fragments
queued until we've successfully copied them to user space, which is what
we should do without this series applied IIUC.
Stefano