2022-11-25 17:11:15

by Arseniy Krasnov

[permalink] [raw]
Subject: [RFC PATCH v2 0/6] vsock: update tools and error handling

Patchset consists of two parts:

1) Kernel patches
Three patches from Bobby Eshleman. I took single patch from Bobby:
https://lore.kernel.org/lkml/d81818b868216c774613dd03641fcfe63cc55a45
[email protected]/ and split it to three
patches according different parts of vsock subsystem.

I used it, because for SOCK_SEQPACKET big messages handling was broken -
ENOMEM was returned instead of EMSGSIZE. And anyway, current logic which
always replaces any error code returned by transport to ENOMEM looks
strange for me also(for example in EMSGSIZE case it was changed to
ENOMEM). So, one of three patches updates af_vsock.c, keeping error
code from transport untouched, while another 2 patches save original
behaviour for Hyper-V and VMCI.

Please, Hyper-V and VMCI guys, could You take a look? Is previous
behaviour really needed?

2) Tool patches
Since there is work on several significant updates for vsock(virtio/
vsock especially): skbuff, DGRAM, zerocopy rx/tx, so I think that this
patchset will be useful.

This patchset updates vsock tests and tools a little bit. First of all
it updates test suite: two new tests are added. One test is reworked
message bound test. Now it is more complex. Instead of sending 1 byte
messages with one MSG_EOR bit, it sends messages of random length(one
half of messages are smaller than page size, second half are bigger)
with random number of MSG_EOR bits set. Receiver also don't know total
number of messages. Message bounds control is maintained by hash sum
of messages length calculation. Second test is for SOCK_SEQPACKET - it
tries to send message with length more than allowed. I think both tests
will be useful for DGRAM support also.

Third thing that this patchset adds is small utility to test vsock
performance for both rx and tx. I think this util could be useful as
'iperf', because:
1) It is small comparing to 'iperf()', so it very easy to add new
mode or feature to it(especially vsock specific).
2) It is located in kernel source tree, so it could be updated by the
same patchset which changes related kernel functionality in vsock.

I used this util very often to check performance of my rx zerocopy
support(this tool has rx zerocopy support, but not in this patchset).

Patchset was rebased and tested on skbuff v4 patch from Bobby Eshleman:
https://lore.kernel.org/netdev/[email protected]/

Changelog:
v1 -> v2:
- Three new patches from Bobby Eshleman to kernel part
- Message bounds test: some refactoring and add comment to describe
hashing purpose
- Big message test: check 'errno' for EMSGSIZE and move new test to
the end of tests array
- vsock_perf:
- update README file
- add simple usage example to commit message
- update '-h' (help) output
- use 'stdout' for output instead of 'stderr'
- use 'strtol' instead of 'atoi'

Bobby Eshleman(3):
vsock: return errors other than -ENOMEM to socket
hv_sock: always return ENOMEM in case of error
vsock/vmci: always return ENOMEM in case of error

Arseniy Krasnov(3):
test/vsock: rework message bound test
test/vsock: add big message test
test/vsock: vsock_perf utility

net/vmw_vsock/af_vsock.c | 3 +-
net/vmw_vsock/hyperv_transport.c | 2 +-
net/vmw_vsock/vmci_transport.c | 9 +-
tools/testing/vsock/Makefile | 1 +
tools/testing/vsock/README | 34 ++++
tools/testing/vsock/control.c | 28 +++
tools/testing/vsock/control.h | 2 +
tools/testing/vsock/util.c | 13 ++
tools/testing/vsock/util.h | 1 +
tools/testing/vsock/vsock_perf.c | 400 +++++++++++++++++++++++++++++++++++++++
tools/testing/vsock/vsock_test.c | 193 +++++++++++++++++--
11 files changed, 670 insertions(+), 16 deletions(-)

--
2.25.1


2022-11-25 17:12:30

by Arseniy Krasnov

[permalink] [raw]
Subject: [RFC PATCH v2 2/6] hv_sock: always return ENOMEM in case of error

From: Bobby Eshleman <[email protected]>

This saves original behaviour from af_vsock.c - switch any error
code returned from transport layer to ENOMEM.

Signed-off-by: Bobby Eshleman <[email protected]>
Signed-off-by: Arseniy Krasnov <[email protected]>
---
net/vmw_vsock/hyperv_transport.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c
index 59c3e2697069..fbbe55133da2 100644
--- a/net/vmw_vsock/hyperv_transport.c
+++ b/net/vmw_vsock/hyperv_transport.c
@@ -687,7 +687,7 @@ static ssize_t hvs_stream_enqueue(struct vsock_sock *vsk, struct msghdr *msg,
if (bytes_written)
ret = bytes_written;
kfree(send_buf);
- return ret;
+ return ret < 0 ? -ENOMEM : ret;
}

static s64 hvs_stream_has_data(struct vsock_sock *vsk)
--
2.25.1

2022-11-25 17:42:46

by Arseniy Krasnov

[permalink] [raw]
Subject: [RFC PATCH v2 4/6] test/vsock: rework message bounds test

This updates message bound test making it more complex. Instead of
sending 1 bytes messages with one MSG_EOR bit, it sends messages of
random length(one half of messages are smaller than page size, second
half are bigger) with random number of MSG_EOR bits set. Receiver
also don't know total number of messages.

Signed-off-by: Arseniy Krasnov <[email protected]>
---
tools/testing/vsock/control.c | 28 +++++++
tools/testing/vsock/control.h | 2 +
tools/testing/vsock/util.c | 13 ++++
tools/testing/vsock/util.h | 1 +
tools/testing/vsock/vsock_test.c | 124 +++++++++++++++++++++++++++----
5 files changed, 155 insertions(+), 13 deletions(-)

diff --git a/tools/testing/vsock/control.c b/tools/testing/vsock/control.c
index 4874872fc5a3..d2deb4b15b94 100644
--- a/tools/testing/vsock/control.c
+++ b/tools/testing/vsock/control.c
@@ -141,6 +141,34 @@ void control_writeln(const char *str)
timeout_end();
}

+void control_writeulong(unsigned long value)
+{
+ char str[32];
+
+ if (snprintf(str, sizeof(str), "%lu", value) >= sizeof(str)) {
+ perror("snprintf");
+ exit(EXIT_FAILURE);
+ }
+
+ control_writeln(str);
+}
+
+unsigned long control_readulong(void)
+{
+ unsigned long value;
+ char *str;
+
+ str = control_readln();
+
+ if (!str)
+ exit(EXIT_FAILURE);
+
+ value = strtoul(str, NULL, 10);
+ free(str);
+
+ return value;
+}
+
/* Return the next line from the control socket (without the trailing newline).
*
* The program terminates if a timeout occurs.
diff --git a/tools/testing/vsock/control.h b/tools/testing/vsock/control.h
index 51814b4f9ac1..c1f77fdb2c7a 100644
--- a/tools/testing/vsock/control.h
+++ b/tools/testing/vsock/control.h
@@ -9,7 +9,9 @@ void control_init(const char *control_host, const char *control_port,
void control_cleanup(void);
void control_writeln(const char *str);
char *control_readln(void);
+unsigned long control_readulong(void);
void control_expectln(const char *str);
bool control_cmpln(char *line, const char *str, bool fail);
+void control_writeulong(unsigned long value);

#endif /* CONTROL_H */
diff --git a/tools/testing/vsock/util.c b/tools/testing/vsock/util.c
index 2acbb7703c6a..01b636d3039a 100644
--- a/tools/testing/vsock/util.c
+++ b/tools/testing/vsock/util.c
@@ -395,3 +395,16 @@ void skip_test(struct test_case *test_cases, size_t test_cases_len,

test_cases[test_id].skip = true;
}
+
+unsigned long hash_djb2(const void *data, size_t len)
+{
+ unsigned long hash = 5381;
+ int i = 0;
+
+ while (i < len) {
+ hash = ((hash << 5) + hash) + ((unsigned char *)data)[i];
+ i++;
+ }
+
+ return hash;
+}
diff --git a/tools/testing/vsock/util.h b/tools/testing/vsock/util.h
index a3375ad2fb7f..fb99208a95ea 100644
--- a/tools/testing/vsock/util.h
+++ b/tools/testing/vsock/util.h
@@ -49,4 +49,5 @@ void run_tests(const struct test_case *test_cases,
void list_tests(const struct test_case *test_cases);
void skip_test(struct test_case *test_cases, size_t test_cases_len,
const char *test_id_str);
+unsigned long hash_djb2(const void *data, size_t len);
#endif /* UTIL_H */
diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
index bb6d691cb30d..12ef0cca6f93 100644
--- a/tools/testing/vsock/vsock_test.c
+++ b/tools/testing/vsock/vsock_test.c
@@ -284,10 +284,14 @@ static void test_stream_msg_peek_server(const struct test_opts *opts)
close(fd);
}

-#define MESSAGES_CNT 7
-#define MSG_EOR_IDX (MESSAGES_CNT / 2)
+#define SOCK_BUF_SIZE (2 * 1024 * 1024)
+#define MAX_MSG_SIZE (32 * 1024)
+
static void test_seqpacket_msg_bounds_client(const struct test_opts *opts)
{
+ unsigned long curr_hash;
+ int page_size;
+ int msg_count;
int fd;

fd = vsock_seqpacket_connect(opts->peer_cid, 1234);
@@ -296,18 +300,78 @@ static void test_seqpacket_msg_bounds_client(const struct test_opts *opts)
exit(EXIT_FAILURE);
}

- /* Send several messages, one with MSG_EOR flag */
- for (int i = 0; i < MESSAGES_CNT; i++)
- send_byte(fd, 1, (i == MSG_EOR_IDX) ? MSG_EOR : 0);
+ /* Wait, until receiver sets buffer size. */
+ control_expectln("SRVREADY");
+
+ curr_hash = 0;
+ page_size = getpagesize();
+ msg_count = SOCK_BUF_SIZE / MAX_MSG_SIZE;
+
+ for (int i = 0; i < msg_count; i++) {
+ ssize_t send_size;
+ size_t buf_size;
+ int flags;
+ void *buf;
+
+ /* Use "small" buffers and "big" buffers. */
+ if (i & 1)
+ buf_size = page_size +
+ (rand() % (MAX_MSG_SIZE - page_size));
+ else
+ buf_size = 1 + (rand() % page_size);
+
+ buf = malloc(buf_size);
+
+ if (!buf) {
+ perror("malloc");
+ exit(EXIT_FAILURE);
+ }
+
+ /* Set at least one MSG_EOR + some random. */
+ if (i == (msg_count / 2) || (rand() & 1)) {
+ flags = MSG_EOR;
+ curr_hash++;
+ } else {
+ flags = 0;
+ }
+
+ send_size = send(fd, buf, buf_size, flags);
+
+ if (send_size < 0) {
+ perror("send");
+ exit(EXIT_FAILURE);
+ }
+
+ if (send_size != buf_size) {
+ fprintf(stderr, "Invalid send size\n");
+ exit(EXIT_FAILURE);
+ }
+
+ /*
+ * Hash sum is computed at both client and server in
+ * the same way: H = hash(H + 'message length'). Idea
+ * is simple: such hash "accumulates" length and order
+ * of every sent/received message. After data exchange,
+ * both sums are compared using control socket, and if
+ * message bounds wasn't broken - two values must be
+ * equal.
+ */
+ curr_hash += send_size;
+ curr_hash = hash_djb2(&curr_hash, sizeof(curr_hash));
+ }

control_writeln("SENDDONE");
+ control_writeulong(curr_hash);
close(fd);
}

static void test_seqpacket_msg_bounds_server(const struct test_opts *opts)
{
+ unsigned long sock_buf_size;
+ unsigned long remote_hash;
+ unsigned long curr_hash;
int fd;
- char buf[16];
+ char buf[MAX_MSG_SIZE];
struct msghdr msg = {0};
struct iovec iov = {0};

@@ -317,25 +381,58 @@ static void test_seqpacket_msg_bounds_server(const struct test_opts *opts)
exit(EXIT_FAILURE);
}

+ sock_buf_size = SOCK_BUF_SIZE;
+
+ if (setsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_BUFFER_MAX_SIZE,
+ &sock_buf_size, sizeof(sock_buf_size))) {
+ perror("getsockopt");
+ exit(EXIT_FAILURE);
+ }
+
+ if (setsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_BUFFER_SIZE,
+ &sock_buf_size, sizeof(sock_buf_size))) {
+ perror("getsockopt");
+ exit(EXIT_FAILURE);
+ }
+
+ /* Ready to receive data. */
+ control_writeln("SRVREADY");
+ /* Wait, until peer sends whole data. */
control_expectln("SENDDONE");
iov.iov_base = buf;
iov.iov_len = sizeof(buf);
msg.msg_iov = &iov;
msg.msg_iovlen = 1;

- for (int i = 0; i < MESSAGES_CNT; i++) {
- if (recvmsg(fd, &msg, 0) != 1) {
- perror("message bound violated");
- exit(EXIT_FAILURE);
- }
+ curr_hash = 0;

- if ((i == MSG_EOR_IDX) ^ !!(msg.msg_flags & MSG_EOR)) {
- perror("MSG_EOR");
+ while (1) {
+ ssize_t recv_size;
+
+ recv_size = recvmsg(fd, &msg, 0);
+
+ if (!recv_size)
+ break;
+
+ if (recv_size < 0) {
+ perror("recvmsg");
exit(EXIT_FAILURE);
}
+
+ if (msg.msg_flags & MSG_EOR)
+ curr_hash++;
+
+ curr_hash += recv_size;
+ curr_hash = hash_djb2(&curr_hash, sizeof(curr_hash));
}

close(fd);
+ remote_hash = control_readulong();
+
+ if (curr_hash != remote_hash) {
+ fprintf(stderr, "Message bounds broken\n");
+ exit(EXIT_FAILURE);
+ }
}

#define MESSAGE_TRUNC_SZ 32
@@ -837,6 +934,7 @@ int main(int argc, char **argv)
.peer_cid = VMADDR_CID_ANY,
};

+ srand(time(NULL));
init_signals();

for (;;) {
--
2.25.1

2022-11-25 18:17:23

by Arseniy Krasnov

[permalink] [raw]
Subject: [RFC PATCH v2 6/6] test/vsock: vsock_perf utility

This adds utility to check vsock rx/tx performance.

Usage as sender:
./vsock_perf -p <port> -m <bytes to send)
Usage as receiver:
./vsock_perf -c <cid> -p <port> -r <SO_RCVLOWAT>

Signed-off-by: Arseniy Krasnov <[email protected]>
---
tools/testing/vsock/Makefile | 1 +
tools/testing/vsock/README | 34 +++
tools/testing/vsock/vsock_perf.c | 400 +++++++++++++++++++++++++++++++
3 files changed, 435 insertions(+)
create mode 100644 tools/testing/vsock/vsock_perf.c

diff --git a/tools/testing/vsock/Makefile b/tools/testing/vsock/Makefile
index f8293c6910c9..d36fdd59fe2e 100644
--- a/tools/testing/vsock/Makefile
+++ b/tools/testing/vsock/Makefile
@@ -3,6 +3,7 @@ all: test
test: vsock_test vsock_diag_test
vsock_test: vsock_test.o timeout.o control.o util.o
vsock_diag_test: vsock_diag_test.o timeout.o control.o util.o
+vsock_perf: vsock_perf.o

CFLAGS += -g -O2 -Werror -Wall -I. -I../../include -I../../../usr/include -Wno-pointer-sign -fno-strict-overflow -fno-strict-aliasing -fno-common -MMD -U_FORTIFY_SOURCE -D_GNU_SOURCE
.PHONY: all test clean
diff --git a/tools/testing/vsock/README b/tools/testing/vsock/README
index 4d5045e7d2c3..dbeba1d62016 100644
--- a/tools/testing/vsock/README
+++ b/tools/testing/vsock/README
@@ -35,3 +35,37 @@ Invoke test binaries in both directions as follows:
--control-port=$GUEST_IP \
--control-port=1234 \
--peer-cid=3
+
+vsock_perf utility
+-------------------
+'vsock_perf' is simple tool to measure vsock performance. It works in
+sender/receiver modes: sender waits for connection at specified port,
+and after accept, starts data transmission to receiver. After data
+processing is done, it prints several metrics(see below).
+
+Usage:
+# run as sender
+# listen port 1234, tx buffer size is 1MB, send of data 1G
+./vsock_perf -s -p 1234 -b 1MB -m 1G
+
+Output:
+tx performance: A Gb/s
+
+Output explanation:
+A is calculated as "number of bytes to send" / "time in tx loop"
+
+# run as receiver
+# connect to CID 2, port 1234, rx buffer size is 1MB, peer buf is 1G, SO_RCVLOWAT is 65536
+./vsock_perf -c 2 -p 1234 -b 1MB -v 1G -r 65536
+
+Output:
+rx performance: A Gb/s
+total in 'read()': B sec
+POLLIN wakeups: C
+average in 'read()': D ns
+
+Output explanation:
+A is calculated as "number of received bytes" / "time in rx loop".
+B is time, spent in 'read()' system call(excluding 'poll()')
+C is number of 'poll()' wake ups with POLLIN bit set.
+D is B / C, e.g. average amount of time, spent in single 'read()'.
diff --git a/tools/testing/vsock/vsock_perf.c b/tools/testing/vsock/vsock_perf.c
new file mode 100644
index 000000000000..bca2027de52d
--- /dev/null
+++ b/tools/testing/vsock/vsock_perf.c
@@ -0,0 +1,400 @@
+// SPDX-License-Identifier: GPL-2.0-only
+/*
+ * vsock_perf - benchmark utility for vsock.
+ *
+ * Copyright (C) 2022 SberDevices.
+ *
+ * Author: Arseniy Krasnov <[email protected]>
+ */
+#include <getopt.h>
+#include <stdio.h>
+#include <stdlib.h>
+#include <stdbool.h>
+#include <string.h>
+#include <errno.h>
+#include <unistd.h>
+#include <time.h>
+#include <sys/mman.h>
+#include <stdint.h>
+#include <poll.h>
+#include <sys/socket.h>
+#include <linux/vm_sockets.h>
+
+#define DEFAULT_BUF_SIZE_BYTES (128 * 1024)
+#define DEFAULT_TO_SEND_BYTES (64 * 1024)
+#define DEFAULT_VSOCK_BUF_BYTES (256 * 1024)
+#define DEFAULT_RCVLOWAT_BYTES 1
+#define DEFAULT_PORT 1234
+#define DEFAULT_CID 2
+
+#define BYTES_PER_GB (1024 * 1024 * 1024ULL)
+#define NSEC_PER_SEC (1000000000ULL)
+
+static unsigned int port = DEFAULT_PORT;
+static unsigned long buf_size_bytes = DEFAULT_BUF_SIZE_BYTES;
+static unsigned long vsock_buf_bytes = DEFAULT_VSOCK_BUF_BYTES;
+
+static inline time_t current_nsec(void)
+{
+ struct timespec ts;
+
+ if (clock_gettime(CLOCK_REALTIME, &ts)) {
+ perror("clock_gettime");
+ exit(EXIT_FAILURE);
+ }
+
+ return (ts.tv_sec * NSEC_PER_SEC) + ts.tv_nsec;
+}
+
+/* From lib/cmdline.c. */
+static unsigned long memparse(const char *ptr)
+{
+ char *endptr;
+
+ unsigned long long ret = strtoull(ptr, &endptr, 0);
+
+ switch (*endptr) {
+ case 'E':
+ case 'e':
+ ret <<= 10;
+ case 'P':
+ case 'p':
+ ret <<= 10;
+ case 'T':
+ case 't':
+ ret <<= 10;
+ case 'G':
+ case 'g':
+ ret <<= 10;
+ case 'M':
+ case 'm':
+ ret <<= 10;
+ case 'K':
+ case 'k':
+ ret <<= 10;
+ endptr++;
+ default:
+ break;
+ }
+
+ return ret;
+}
+
+static void vsock_increase_buf_size(int fd)
+{
+ if (setsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_BUFFER_MAX_SIZE,
+ &vsock_buf_bytes, sizeof(vsock_buf_bytes))) {
+ perror("setsockopt");
+ exit(EXIT_FAILURE);
+ }
+
+ if (setsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_BUFFER_SIZE,
+ &vsock_buf_bytes, sizeof(vsock_buf_bytes))) {
+ perror("setsockopt");
+ exit(EXIT_FAILURE);
+ }
+}
+
+static int vsock_connect(unsigned int cid, unsigned int port)
+{
+ union {
+ struct sockaddr sa;
+ struct sockaddr_vm svm;
+ } addr = {
+ .svm = {
+ .svm_family = AF_VSOCK,
+ .svm_port = port,
+ .svm_cid = cid,
+ },
+ };
+ int fd;
+
+ fd = socket(AF_VSOCK, SOCK_STREAM, 0);
+
+ if (fd < 0)
+ return -1;
+
+ vsock_increase_buf_size(fd);
+
+ if (connect(fd, &addr.sa, sizeof(addr.svm)) < 0) {
+ close(fd);
+ return -1;
+ }
+
+ return fd;
+}
+
+static float get_gbps(unsigned long bytes, time_t ns_delta)
+{
+ return ((float)bytes / BYTES_PER_GB) /
+ ((float)ns_delta / NSEC_PER_SEC);
+}
+
+static void run_sender(unsigned long to_send_bytes)
+{
+ time_t tx_begin_ns;
+ size_t total_send;
+ int client_fd;
+ char *data;
+ int fd;
+ union {
+ struct sockaddr sa;
+ struct sockaddr_vm svm;
+ } addr = {
+ .svm = {
+ .svm_family = AF_VSOCK,
+ .svm_port = port,
+ .svm_cid = VMADDR_CID_ANY,
+ },
+ };
+ union {
+ struct sockaddr sa;
+ struct sockaddr_vm svm;
+ } clientaddr;
+
+ socklen_t clientaddr_len = sizeof(clientaddr.svm);
+
+ printf("Run as sender\n");
+ printf("Listen port %u\n", port);
+ printf("Send %lu bytes\n", to_send_bytes);
+ printf("TX buffer %lu bytes\n", buf_size_bytes);
+ printf("Peer buffer %lu bytes\n", vsock_buf_bytes);
+
+ fd = socket(AF_VSOCK, SOCK_STREAM, 0);
+
+ if (fd < 0) {
+ perror("socket");
+ exit(EXIT_FAILURE);
+ }
+
+ if (bind(fd, &addr.sa, sizeof(addr.svm)) < 0) {
+ perror("bind");
+ exit(EXIT_FAILURE);
+ }
+
+ if (listen(fd, 1) < 0) {
+ perror("listen");
+ exit(EXIT_FAILURE);
+ }
+
+ client_fd = accept(fd, &clientaddr.sa, &clientaddr_len);
+
+ if (client_fd < 0) {
+ perror("accept");
+ exit(EXIT_FAILURE);
+ }
+
+ vsock_increase_buf_size(client_fd);
+
+ data = malloc(buf_size_bytes);
+
+ if (!data) {
+ printf("malloc failed\n");
+ exit(EXIT_FAILURE);
+ }
+
+ memset(data, 0, buf_size_bytes);
+ total_send = 0;
+ tx_begin_ns = current_nsec();
+
+ while (total_send < to_send_bytes) {
+ ssize_t sent;
+
+ sent = write(client_fd, data, buf_size_bytes);
+
+ if (sent <= 0) {
+ perror("write");
+ exit(EXIT_FAILURE);
+ }
+
+ total_send += sent;
+ }
+
+ printf("tx performance: %f Gb/s\n",
+ get_gbps(total_send, current_nsec() - tx_begin_ns));
+
+ close(client_fd);
+ close(fd);
+
+ free(data);
+}
+
+static void run_receiver(int peer_cid, unsigned long rcvlowat_bytes)
+{
+ unsigned int read_cnt;
+ time_t rx_begin_ns;
+ time_t in_read_ns;
+ size_t total_recv;
+ void *data;
+ int fd;
+
+ printf("Run as receiver\n");
+ printf("Connect to %i:%u\n", peer_cid, port);
+ printf("RX buffer %lu bytes\n", buf_size_bytes);
+ printf("Peer buffer %lu bytes\n", vsock_buf_bytes);
+ printf("SO_RCVLOWAT %lu bytes\n", rcvlowat_bytes);
+
+ fd = vsock_connect(peer_cid, port);
+
+ if (fd < 0) {
+ perror("socket");
+ exit(EXIT_FAILURE);
+ }
+
+ if (setsockopt(fd, SOL_SOCKET, SO_RCVLOWAT,
+ &rcvlowat_bytes,
+ sizeof(rcvlowat_bytes))) {
+ perror("setsockopt");
+ exit(EXIT_FAILURE);
+ }
+
+ data = mmap(NULL, buf_size_bytes, PROT_READ | PROT_WRITE,
+ MAP_ANONYMOUS | MAP_PRIVATE | MAP_POPULATE, -1, 0);
+
+ if (data == MAP_FAILED) {
+ perror("mmap");
+ exit(EXIT_FAILURE);
+ }
+
+ read_cnt = 0;
+ in_read_ns = 0;
+ total_recv = 0;
+ rx_begin_ns = current_nsec();
+
+ while (1) {
+ struct pollfd fds = { 0 };
+
+ fds.fd = fd;
+ fds.events = POLLIN | POLLERR | POLLHUP |
+ POLLRDHUP | POLLNVAL;
+
+ if (poll(&fds, 1, -1) < 0) {
+ perror("poll");
+ exit(EXIT_FAILURE);
+ }
+
+ if (fds.revents & POLLERR) {
+ printf("'poll()' error\n");
+ exit(EXIT_FAILURE);
+ }
+
+ if (fds.revents & POLLIN) {
+ ssize_t bytes_read;
+ time_t t;
+
+ t = current_nsec();
+ bytes_read = read(fd, data, buf_size_bytes);
+ in_read_ns += (current_nsec() - t);
+ read_cnt++;
+
+ if (!bytes_read)
+ break;
+
+ if (bytes_read < 0) {
+ perror("recv");
+ exit(EXIT_FAILURE);
+ }
+
+ total_recv += bytes_read;
+ }
+
+ if (fds.revents & (POLLHUP | POLLRDHUP))
+ break;
+ }
+
+ printf("rx performance: %f Gb/s\n",
+ get_gbps(total_recv, current_nsec() - rx_begin_ns));
+ printf("total in 'read()': %f sec\n", (float)in_read_ns / NSEC_PER_SEC);
+ printf("POLLIN wakeups: %i\n", read_cnt);
+ printf("average in 'read()': %f ns\n", (float)in_read_ns / read_cnt);
+
+ munmap(data, buf_size_bytes);
+ close(fd);
+}
+
+static void usage(void)
+{
+ printf("Help:\n"
+ "\n"
+ "This is benchmarking utility, to test vsock performance.\n"
+ "It runs in two modes: sender or receiver. In sender mode, it waits\n"
+ "connection from receiver, and when established, sender starts data\n"
+ "transmission.\n"
+ "\n"
+ "Options:\n"
+ " -h This help message\n"
+ " -s Sender mode(receiver default)\n"
+ " -p <port> Port (%d)\n"
+ " -c <cid> CID of the peer (%d)\n"
+ " -m <bytes to send> Bytes to send (%d)\n"
+ " -b <buffer size> Rx/Tx buffer size (%d). In sender mode\n"
+ " it is size of buffer passed to 'write()'.\n"
+ " In receiver mode it is size of buffer passed\n"
+ " to 'read()'.\n"
+ " -v <peer buffer size> Peer buffer size (%d)\n"
+ " -r <SO_RCVLOWAT> SO_RCVLOWAT (%d)\n"
+ "\n", DEFAULT_PORT, DEFAULT_CID, DEFAULT_TO_SEND_BYTES,
+ DEFAULT_BUF_SIZE_BYTES, DEFAULT_VSOCK_BUF_BYTES,
+ DEFAULT_RCVLOWAT_BYTES);
+ exit(EXIT_FAILURE);
+}
+
+static long strtolx(const char *arg)
+{
+ long value;
+ char *end;
+
+ value = strtol(arg, &end, 10);
+
+ if (end != arg + strlen(arg))
+ usage();
+
+ return value;
+}
+
+int main(int argc, char **argv)
+{
+ unsigned long to_send_bytes = DEFAULT_TO_SEND_BYTES;
+ unsigned long rcvlowat_bytes = DEFAULT_RCVLOWAT_BYTES;
+ bool receiver_mode = true;
+ int peer_cid = DEFAULT_CID;
+ int c;
+
+ while ((c = getopt(argc, argv, "v:r:c:p:m:b:sh")) != -1) {
+ switch (c) {
+ case 'v': /* Peer buffer size. */
+ vsock_buf_bytes = memparse(optarg);
+ break;
+ case 'r': /* SO_RCVLOWAT value. */
+ rcvlowat_bytes = memparse(optarg);
+ break;
+ case 'c': /* CID to connect to. */
+ peer_cid = strtolx(optarg);
+ break;
+ case 'p': /* Port to connect to. */
+ port = strtolx(optarg);
+ break;
+ case 'm': /* Bytes to send. */
+ to_send_bytes = memparse(optarg);
+ break;
+ case 'b': /* Size of rx/tx buffer. */
+ buf_size_bytes = memparse(optarg);
+ break;
+ case 's': /* Sender mode. */
+ receiver_mode = false;
+ break;
+ case 'h': /* Help. */
+ usage();
+ break;
+ default:
+ usage();
+ }
+ }
+
+ if (receiver_mode)
+ run_receiver(peer_cid, rcvlowat_bytes);
+ else
+ run_sender(to_send_bytes);
+
+ return 0;
+}
--
2.25.1

2022-11-25 18:18:45

by Arseniy Krasnov

[permalink] [raw]
Subject: [RFC PATCH v2 1/6] vsock: return errors other than -ENOMEM to socket

From: Bobby Eshleman <[email protected]>

This removes behaviour, where error code returned from any
transport was always switched to ENOMEM.

Signed-off-by: Bobby Eshleman <[email protected]>
Signed-off-by: Arseniy Krasnov <[email protected]>
---
net/vmw_vsock/af_vsock.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
index 884eca7f6743..61ddab664c33 100644
--- a/net/vmw_vsock/af_vsock.c
+++ b/net/vmw_vsock/af_vsock.c
@@ -1862,8 +1862,9 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
written = transport->stream_enqueue(vsk,
msg, len - total_written);
}
+
if (written < 0) {
- err = -ENOMEM;
+ err = written;
goto out_err;
}

--
2.25.1

2022-11-25 18:19:22

by Arseniy Krasnov

[permalink] [raw]
Subject: [RFC PATCH v2 3/6] vsock/vmci: always return ENOMEM in case of error

From: Bobby Eshleman <[email protected]>

This saves original behaviour from af_vsock.c - switch any error
code returned from transport layer to ENOMEM.

Signed-off-by: Bobby Eshleman <[email protected]>
Signed-off-by: Arseniy Krasnov <[email protected]>
---
net/vmw_vsock/vmci_transport.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c
index 842c94286d31..289a36a203a2 100644
--- a/net/vmw_vsock/vmci_transport.c
+++ b/net/vmw_vsock/vmci_transport.c
@@ -1838,7 +1838,14 @@ static ssize_t vmci_transport_stream_enqueue(
struct msghdr *msg,
size_t len)
{
- return vmci_qpair_enquev(vmci_trans(vsk)->qpair, msg, len, 0);
+ int err;
+
+ err = vmci_qpair_enquev(vmci_trans(vsk)->qpair, msg, len, 0);
+
+ if (err < 0)
+ err = -ENOMEM;
+
+ return err;
}

static s64 vmci_transport_stream_has_data(struct vsock_sock *vsk)
--
2.25.1

2022-11-25 18:19:45

by Arseniy Krasnov

[permalink] [raw]
Subject: [RFC PATCH v2 5/6] test/vsock: add big message test

This adds test for sending message, bigger than peer's buffer size.
For SOCK_SEQPACKET socket it must fail, as this type of socket has
message size limit.

Signed-off-by: Arseniy Krasnov <[email protected]>
---
tools/testing/vsock/vsock_test.c | 69 ++++++++++++++++++++++++++++++++
1 file changed, 69 insertions(+)

diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
index 12ef0cca6f93..a8e43424fb32 100644
--- a/tools/testing/vsock/vsock_test.c
+++ b/tools/testing/vsock/vsock_test.c
@@ -569,6 +569,70 @@ static void test_seqpacket_timeout_server(const struct test_opts *opts)
close(fd);
}

+static void test_seqpacket_bigmsg_client(const struct test_opts *opts)
+{
+ unsigned long sock_buf_size;
+ ssize_t send_size;
+ socklen_t len;
+ void *data;
+ int fd;
+
+ len = sizeof(sock_buf_size);
+
+ fd = vsock_seqpacket_connect(opts->peer_cid, 1234);
+ if (fd < 0) {
+ perror("connect");
+ exit(EXIT_FAILURE);
+ }
+
+ if (getsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_BUFFER_SIZE,
+ &sock_buf_size, &len)) {
+ perror("getsockopt");
+ exit(EXIT_FAILURE);
+ }
+
+ sock_buf_size++;
+
+ data = malloc(sock_buf_size);
+ if (!data) {
+ perror("malloc");
+ exit(EXIT_FAILURE);
+ }
+
+ send_size = send(fd, data, sock_buf_size, 0);
+ if (send_size != -1) {
+ fprintf(stderr, "expected 'send(2)' failure, got %zi\n",
+ send_size);
+ exit(EXIT_FAILURE);
+ }
+
+ if (errno != EMSGSIZE) {
+ fprintf(stderr, "expected EMSGSIZE in 'errno', got %i\n",
+ errno);
+ exit(EXIT_FAILURE);
+ }
+
+ control_writeln("CLISENT");
+
+ free(data);
+ close(fd);
+}
+
+static void test_seqpacket_bigmsg_server(const struct test_opts *opts)
+{
+ int fd;
+
+ fd = vsock_seqpacket_accept(VMADDR_CID_ANY, 1234, NULL);
+ if (fd < 0) {
+ perror("accept");
+ exit(EXIT_FAILURE);
+ }
+
+ control_expectln("CLISENT");
+
+ close(fd);
+}
+
#define BUF_PATTERN_1 'a'
#define BUF_PATTERN_2 'b'

@@ -851,6 +915,11 @@ static struct test_case test_cases[] = {
.run_client = test_stream_poll_rcvlowat_client,
.run_server = test_stream_poll_rcvlowat_server,
},
+ {
+ .name = "SOCK_SEQPACKET big message",
+ .run_client = test_seqpacket_bigmsg_client,
+ .run_server = test_seqpacket_bigmsg_server,
+ },
{},
};

--
2.25.1

2022-12-01 09:44:03

by Stefano Garzarella

[permalink] [raw]
Subject: Re: [RFC PATCH v2 1/6] vsock: return errors other than -ENOMEM to socket

On Fri, Nov 25, 2022 at 05:03:06PM +0000, Arseniy Krasnov wrote:
>From: Bobby Eshleman <[email protected]>
>
>This removes behaviour, where error code returned from any
>transport was always switched to ENOMEM.
>
>Signed-off-by: Bobby Eshleman <[email protected]>
>Signed-off-by: Arseniy Krasnov <[email protected]>
>---
> net/vmw_vsock/af_vsock.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)

This patch LGTM, but I would move after the 2 patches that change vmci
and hyperv transports.

First we should fix the transports by returning the error we think is
right, and then expose it to the user.

Thanks,
Stefano

>
>diff --git a/net/vmw_vsock/af_vsock.c b/net/vmw_vsock/af_vsock.c
>index 884eca7f6743..61ddab664c33 100644
>--- a/net/vmw_vsock/af_vsock.c
>+++ b/net/vmw_vsock/af_vsock.c
>@@ -1862,8 +1862,9 @@ static int vsock_connectible_sendmsg(struct socket *sock, struct msghdr *msg,
> written = transport->stream_enqueue(vsk,
> msg, len - total_written);
> }
>+
> if (written < 0) {
>- err = -ENOMEM;
>+ err = written;
> goto out_err;
> }
>
>--
>2.25.1

2022-12-01 09:53:50

by Stefano Garzarella

[permalink] [raw]
Subject: Re: [RFC PATCH v2 4/6] test/vsock: rework message bounds test

On Fri, Nov 25, 2022 at 05:10:35PM +0000, Arseniy Krasnov wrote:
>This updates message bound test making it more complex. Instead of
>sending 1 bytes messages with one MSG_EOR bit, it sends messages of
>random length(one half of messages are smaller than page size, second
>half are bigger) with random number of MSG_EOR bits set. Receiver
>also don't know total number of messages.
>
>Signed-off-by: Arseniy Krasnov <[email protected]>
>---
> tools/testing/vsock/control.c | 28 +++++++
> tools/testing/vsock/control.h | 2 +
> tools/testing/vsock/util.c | 13 ++++
> tools/testing/vsock/util.h | 1 +
> tools/testing/vsock/vsock_test.c | 124 +++++++++++++++++++++++++++----
> 5 files changed, 155 insertions(+), 13 deletions(-)
>
>diff --git a/tools/testing/vsock/control.c b/tools/testing/vsock/control.c
>index 4874872fc5a3..d2deb4b15b94 100644
>--- a/tools/testing/vsock/control.c
>+++ b/tools/testing/vsock/control.c
>@@ -141,6 +141,34 @@ void control_writeln(const char *str)
> timeout_end();
> }
>
>+void control_writeulong(unsigned long value)
>+{
>+ char str[32];
>+
>+ if (snprintf(str, sizeof(str), "%lu", value) >= sizeof(str)) {
>+ perror("snprintf");
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ control_writeln(str);
>+}
>+
>+unsigned long control_readulong(void)
>+{
>+ unsigned long value;
>+ char *str;
>+
>+ str = control_readln();
>+
>+ if (!str)
>+ exit(EXIT_FAILURE);
>+
>+ value = strtoul(str, NULL, 10);
>+ free(str);
>+
>+ return value;
>+}
>+
> /* Return the next line from the control socket (without the trailing newline).
> *
> * The program terminates if a timeout occurs.
>diff --git a/tools/testing/vsock/control.h b/tools/testing/vsock/control.h
>index 51814b4f9ac1..c1f77fdb2c7a 100644
>--- a/tools/testing/vsock/control.h
>+++ b/tools/testing/vsock/control.h
>@@ -9,7 +9,9 @@ void control_init(const char *control_host, const char *control_port,
> void control_cleanup(void);
> void control_writeln(const char *str);
> char *control_readln(void);
>+unsigned long control_readulong(void);
> void control_expectln(const char *str);
> bool control_cmpln(char *line, const char *str, bool fail);
>+void control_writeulong(unsigned long value);
>
> #endif /* CONTROL_H */
>diff --git a/tools/testing/vsock/util.c b/tools/testing/vsock/util.c
>index 2acbb7703c6a..01b636d3039a 100644
>--- a/tools/testing/vsock/util.c
>+++ b/tools/testing/vsock/util.c
>@@ -395,3 +395,16 @@ void skip_test(struct test_case *test_cases, size_t test_cases_len,
>
> test_cases[test_id].skip = true;
> }
>+
>+unsigned long hash_djb2(const void *data, size_t len)
>+{
>+ unsigned long hash = 5381;
>+ int i = 0;
>+
>+ while (i < len) {
>+ hash = ((hash << 5) + hash) + ((unsigned char *)data)[i];
>+ i++;
>+ }
>+
>+ return hash;
>+}
>diff --git a/tools/testing/vsock/util.h b/tools/testing/vsock/util.h
>index a3375ad2fb7f..fb99208a95ea 100644
>--- a/tools/testing/vsock/util.h
>+++ b/tools/testing/vsock/util.h
>@@ -49,4 +49,5 @@ void run_tests(const struct test_case *test_cases,
> void list_tests(const struct test_case *test_cases);
> void skip_test(struct test_case *test_cases, size_t test_cases_len,
> const char *test_id_str);
>+unsigned long hash_djb2(const void *data, size_t len);
> #endif /* UTIL_H */
>diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
>index bb6d691cb30d..12ef0cca6f93 100644
>--- a/tools/testing/vsock/vsock_test.c
>+++ b/tools/testing/vsock/vsock_test.c
>@@ -284,10 +284,14 @@ static void test_stream_msg_peek_server(const struct test_opts *opts)
> close(fd);
> }
>
>-#define MESSAGES_CNT 7
>-#define MSG_EOR_IDX (MESSAGES_CNT / 2)
>+#define SOCK_BUF_SIZE (2 * 1024 * 1024)
>+#define MAX_MSG_SIZE (32 * 1024)
>+
> static void test_seqpacket_msg_bounds_client(const struct test_opts *opts)
> {
>+ unsigned long curr_hash;
>+ int page_size;
>+ int msg_count;
> int fd;
>
> fd = vsock_seqpacket_connect(opts->peer_cid, 1234);
>@@ -296,18 +300,78 @@ static void test_seqpacket_msg_bounds_client(const struct test_opts *opts)
> exit(EXIT_FAILURE);
> }
>
>- /* Send several messages, one with MSG_EOR flag */
>- for (int i = 0; i < MESSAGES_CNT; i++)
>- send_byte(fd, 1, (i == MSG_EOR_IDX) ? MSG_EOR : 0);
>+ /* Wait, until receiver sets buffer size. */
>+ control_expectln("SRVREADY");
>+
>+ curr_hash = 0;
>+ page_size = getpagesize();
>+ msg_count = SOCK_BUF_SIZE / MAX_MSG_SIZE;
>+
>+ for (int i = 0; i < msg_count; i++) {
>+ ssize_t send_size;
>+ size_t buf_size;
>+ int flags;
>+ void *buf;
>+
>+ /* Use "small" buffers and "big" buffers. */
>+ if (i & 1)
>+ buf_size = page_size +
>+ (rand() % (MAX_MSG_SIZE - page_size));
>+ else
>+ buf_size = 1 + (rand() % page_size);
>+
>+ buf = malloc(buf_size);
>+
>+ if (!buf) {
>+ perror("malloc");
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ /* Set at least one MSG_EOR + some random. */
>+ if (i == (msg_count / 2) || (rand() & 1)) {
>+ flags = MSG_EOR;
>+ curr_hash++;
>+ } else {
>+ flags = 0;
>+ }
>+
>+ send_size = send(fd, buf, buf_size, flags);
>+
>+ if (send_size < 0) {
>+ perror("send");
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ if (send_size != buf_size) {
>+ fprintf(stderr, "Invalid send size\n");
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ /*
>+ * Hash sum is computed at both client and server in
>+ * the same way: H = hash(H + 'message length'). Idea
>+ * is simple: such hash "accumulates" length and order
>+ * of every sent/received message. After data exchange,
>+ * both sums are compared using control socket, and if
>+ * message bounds wasn't broken - two values must be
>+ * equal.
>+ */
>+ curr_hash += send_size;

Sorry, I thought about it now, but instead of hashing the size, couldn't
we just fill the buffer with a pattern (maybe random?) and hash the
content?

That way we see both that the bounds were met and that the content was
transferred correctly.

What do you think?

The rest LGTM!

Stefano

>+ curr_hash = hash_djb2(&curr_hash, sizeof(curr_hash));
>+ }
>
> control_writeln("SENDDONE");
>+ control_writeulong(curr_hash);
> close(fd);
> }
>
> static void test_seqpacket_msg_bounds_server(const struct test_opts *opts)
> {
>+ unsigned long sock_buf_size;
>+ unsigned long remote_hash;
>+ unsigned long curr_hash;
> int fd;
>- char buf[16];
>+ char buf[MAX_MSG_SIZE];
> struct msghdr msg = {0};
> struct iovec iov = {0};
>
>@@ -317,25 +381,58 @@ static void test_seqpacket_msg_bounds_server(const struct test_opts *opts)
> exit(EXIT_FAILURE);
> }
>
>+ sock_buf_size = SOCK_BUF_SIZE;
>+
>+ if (setsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_BUFFER_MAX_SIZE,
>+ &sock_buf_size, sizeof(sock_buf_size))) {
>+ perror("getsockopt");
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ if (setsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_BUFFER_SIZE,
>+ &sock_buf_size, sizeof(sock_buf_size))) {
>+ perror("getsockopt");
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ /* Ready to receive data. */
>+ control_writeln("SRVREADY");
>+ /* Wait, until peer sends whole data. */
> control_expectln("SENDDONE");
> iov.iov_base = buf;
> iov.iov_len = sizeof(buf);
> msg.msg_iov = &iov;
> msg.msg_iovlen = 1;
>
>- for (int i = 0; i < MESSAGES_CNT; i++) {
>- if (recvmsg(fd, &msg, 0) != 1) {
>- perror("message bound violated");
>- exit(EXIT_FAILURE);
>- }
>+ curr_hash = 0;
>
>- if ((i == MSG_EOR_IDX) ^ !!(msg.msg_flags & MSG_EOR)) {
>- perror("MSG_EOR");
>+ while (1) {
>+ ssize_t recv_size;
>+
>+ recv_size = recvmsg(fd, &msg, 0);
>+
>+ if (!recv_size)
>+ break;
>+
>+ if (recv_size < 0) {
>+ perror("recvmsg");
> exit(EXIT_FAILURE);
> }
>+
>+ if (msg.msg_flags & MSG_EOR)
>+ curr_hash++;
>+
>+ curr_hash += recv_size;
>+ curr_hash = hash_djb2(&curr_hash, sizeof(curr_hash));
> }
>
> close(fd);
>+ remote_hash = control_readulong();
>+
>+ if (curr_hash != remote_hash) {
>+ fprintf(stderr, "Message bounds broken\n");
>+ exit(EXIT_FAILURE);
>+ }
> }
>
> #define MESSAGE_TRUNC_SZ 32
>@@ -837,6 +934,7 @@ int main(int argc, char **argv)
> .peer_cid = VMADDR_CID_ANY,
> };
>
>+ srand(time(NULL));
> init_signals();
>
> for (;;) {
>--
>2.25.1

2022-12-01 10:09:46

by Stefano Garzarella

[permalink] [raw]
Subject: Re: [RFC PATCH v2 5/6] test/vsock: add big message test

On Fri, Nov 25, 2022 at 05:13:06PM +0000, Arseniy Krasnov wrote:
>This adds test for sending message, bigger than peer's buffer size.
>For SOCK_SEQPACKET socket it must fail, as this type of socket has
>message size limit.
>
>Signed-off-by: Arseniy Krasnov <[email protected]>
>---
> tools/testing/vsock/vsock_test.c | 69 ++++++++++++++++++++++++++++++++
> 1 file changed, 69 insertions(+)
>
>diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
>index 12ef0cca6f93..a8e43424fb32 100644
>--- a/tools/testing/vsock/vsock_test.c
>+++ b/tools/testing/vsock/vsock_test.c
>@@ -569,6 +569,70 @@ static void test_seqpacket_timeout_server(const struct test_opts *opts)
> close(fd);
> }
>
>+static void test_seqpacket_bigmsg_client(const struct test_opts *opts)
>+{
>+ unsigned long sock_buf_size;
>+ ssize_t send_size;
>+ socklen_t len;
>+ void *data;
>+ int fd;
>+
>+ len = sizeof(sock_buf_size);
>+
>+ fd = vsock_seqpacket_connect(opts->peer_cid, 1234);
>+ if (fd < 0) {
>+ perror("connect");
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ if (getsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_BUFFER_SIZE,
>+ &sock_buf_size, &len)) {
>+ perror("getsockopt");
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ sock_buf_size++;
>+
>+ data = malloc(sock_buf_size);
>+ if (!data) {
>+ perror("malloc");
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ send_size = send(fd, data, sock_buf_size, 0);
>+ if (send_size != -1) {
>+ fprintf(stderr, "expected 'send(2)' failure, got %zi\n",
>+ send_size);
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ if (errno != EMSGSIZE) {
>+ fprintf(stderr, "expected EMSGSIZE in 'errno', got %i\n",
>+ errno);
>+ exit(EXIT_FAILURE);
>+ }

We should make sure that this is true for all transports, but since now
only virtio-vsock supports it, we should be okay.

>+
>+ control_writeln("CLISENT");
>+
>+ free(data);
>+ close(fd);
>+}
>+
>+static void test_seqpacket_bigmsg_server(const struct test_opts *opts)
>+{
>+ int fd;
>+
>+ fd = vsock_seqpacket_accept(VMADDR_CID_ANY, 1234, NULL);
>+ if (fd < 0) {
>+ perror("accept");
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ control_expectln("CLISENT");
>+
>+ close(fd);
>+}
>+
> #define BUF_PATTERN_1 'a'
> #define BUF_PATTERN_2 'b'
>
>@@ -851,6 +915,11 @@ static struct test_case test_cases[] = {
> .run_client = test_stream_poll_rcvlowat_client,
> .run_server = test_stream_poll_rcvlowat_server,
> },
>+ {
>+ .name = "SOCK_SEQPACKET big message",
>+ .run_client = test_seqpacket_bigmsg_client,
>+ .run_server = test_seqpacket_bigmsg_server,
>+ },
> {},
> };
>
>--
>2.25.1

LGTM!

Reviewed-by: Stefano Garzarella <[email protected]>

2022-12-01 10:31:26

by Stefano Garzarella

[permalink] [raw]
Subject: Re: [RFC PATCH v2 2/6] hv_sock: always return ENOMEM in case of error

On Fri, Nov 25, 2022 at 05:05:53PM +0000, Arseniy Krasnov wrote:
>From: Bobby Eshleman <[email protected]>
>
>This saves original behaviour from af_vsock.c - switch any error
>code returned from transport layer to ENOMEM.
>
>Signed-off-by: Bobby Eshleman <[email protected]>
>Signed-off-by: Arseniy Krasnov <[email protected]>
>---
> net/vmw_vsock/hyperv_transport.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
>diff --git a/net/vmw_vsock/hyperv_transport.c b/net/vmw_vsock/hyperv_transport.c
>index 59c3e2697069..fbbe55133da2 100644
>--- a/net/vmw_vsock/hyperv_transport.c
>+++ b/net/vmw_vsock/hyperv_transport.c
>@@ -687,7 +687,7 @@ static ssize_t hvs_stream_enqueue(struct vsock_sock *vsk, struct msghdr *msg,
> if (bytes_written)
> ret = bytes_written;
> kfree(send_buf);
>- return ret;
>+ return ret < 0 ? -ENOMEM : ret;

I'm not sure for hyperv we want to preserve -ENOMEM. This transport was
added after virtio-vsock, so I think we can return the error directly.

@Dexuan what do you think?

Thanks,
Stefano

2022-12-01 10:34:30

by Stefano Garzarella

[permalink] [raw]
Subject: Re: [RFC PATCH v2 6/6] test/vsock: vsock_perf utility

On Fri, Nov 25, 2022 at 05:15:08PM +0000, Arseniy Krasnov wrote:
>This adds utility to check vsock rx/tx performance.
>
>Usage as sender:
>./vsock_perf -p <port> -m <bytes to send)
>Usage as receiver:
>./vsock_perf -c <cid> -p <port> -r <SO_RCVLOWAT>
>
>Signed-off-by: Arseniy Krasnov <[email protected]>
>---
> tools/testing/vsock/Makefile | 1 +
> tools/testing/vsock/README | 34 +++
> tools/testing/vsock/vsock_perf.c | 400 +++++++++++++++++++++++++++++++
> 3 files changed, 435 insertions(+)
> create mode 100644 tools/testing/vsock/vsock_perf.c
>
>diff --git a/tools/testing/vsock/Makefile b/tools/testing/vsock/Makefile
>index f8293c6910c9..d36fdd59fe2e 100644
>--- a/tools/testing/vsock/Makefile
>+++ b/tools/testing/vsock/Makefile
>@@ -3,6 +3,7 @@ all: test
> test: vsock_test vsock_diag_test
> vsock_test: vsock_test.o timeout.o control.o util.o
> vsock_diag_test: vsock_diag_test.o timeout.o control.o util.o
>+vsock_perf: vsock_perf.o
>
> CFLAGS += -g -O2 -Werror -Wall -I. -I../../include -I../../../usr/include -Wno-pointer-sign -fno-strict-overflow -fno-strict-aliasing -fno-common -MMD -U_FORTIFY_SOURCE -D_GNU_SOURCE
> .PHONY: all test clean
>diff --git a/tools/testing/vsock/README b/tools/testing/vsock/README
>index 4d5045e7d2c3..dbeba1d62016 100644
>--- a/tools/testing/vsock/README
>+++ b/tools/testing/vsock/README
>@@ -35,3 +35,37 @@ Invoke test binaries in both directions as follows:
> --control-port=$GUEST_IP \
> --control-port=1234 \
> --peer-cid=3

I'm not a native speaker, so the suggestions below may be wrong.

>+
>+vsock_perf utility
>+-------------------
>+'vsock_perf' is simple tool to measure vsock performance. It works in

is a simple tool

>+sender/receiver modes: sender waits for connection at specified port,
>+and after accept, starts data transmission to receiver. After data
^
and after accepting it
^
to the receiver

>+processing is done, it prints several metrics(see below).
>+
>+Usage:
>+# run as sender
>+# listen port 1234, tx buffer size is 1MB, send of data 1G
>+./vsock_perf -s -p 1234 -b 1MB -m 1G

Like for vsock_test and vsock_diag_test I would use only the long form
for the parameters (e.g. --sender, --port, etc.)

>+
>+Output:
>+tx performance: A Gb/s
>+
>+Output explanation:
>+A is calculated as "number of bytes to send" / "time in tx loop"
>+
>+# run as receiver
>+# connect to CID 2, port 1234, rx buffer size is 1MB, peer buf is 1G, SO_RCVLOWAT is 65536
>+./vsock_perf -c 2 -p 1234 -b 1MB -v 1G -r 65536
>+
>+Output:
>+rx performance: A Gb/s
>+total in 'read()': B sec
>+POLLIN wakeups: C
>+average in 'read()': D ns
>+
>+Output explanation:
>+A is calculated as "number of received bytes" / "time in rx loop".
>+B is time, spent in 'read()' system call(excluding 'poll()')
>+C is number of 'poll()' wake ups with POLLIN bit set.
>+D is B / C, e.g. average amount of time, spent in single 'read()'.
>diff --git a/tools/testing/vsock/vsock_perf.c b/tools/testing/vsock/vsock_perf.c
>new file mode 100644
>index 000000000000..bca2027de52d
>--- /dev/null
>+++ b/tools/testing/vsock/vsock_perf.c
>@@ -0,0 +1,400 @@
>+// SPDX-License-Identifier: GPL-2.0-only
>+/*
>+ * vsock_perf - benchmark utility for vsock.
>+ *
>+ * Copyright (C) 2022 SberDevices.
>+ *
>+ * Author: Arseniy Krasnov <[email protected]>
>+ */
>+#include <getopt.h>
>+#include <stdio.h>
>+#include <stdlib.h>
>+#include <stdbool.h>
>+#include <string.h>
>+#include <errno.h>
>+#include <unistd.h>
>+#include <time.h>
>+#include <sys/mman.h>
>+#include <stdint.h>
>+#include <poll.h>
>+#include <sys/socket.h>
>+#include <linux/vm_sockets.h>
>+
>+#define DEFAULT_BUF_SIZE_BYTES (128 * 1024)
>+#define DEFAULT_TO_SEND_BYTES (64 * 1024)
>+#define DEFAULT_VSOCK_BUF_BYTES (256 * 1024)
>+#define DEFAULT_RCVLOWAT_BYTES 1
>+#define DEFAULT_PORT 1234
>+#define DEFAULT_CID 2
>+
>+#define BYTES_PER_GB (1024 * 1024 * 1024ULL)
>+#define NSEC_PER_SEC (1000000000ULL)
>+
>+static unsigned int port = DEFAULT_PORT;
>+static unsigned long buf_size_bytes = DEFAULT_BUF_SIZE_BYTES;
>+static unsigned long vsock_buf_bytes = DEFAULT_VSOCK_BUF_BYTES;
>+
>+static inline time_t current_nsec(void)
>+{
>+ struct timespec ts;
>+
>+ if (clock_gettime(CLOCK_REALTIME, &ts)) {
>+ perror("clock_gettime");
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ return (ts.tv_sec * NSEC_PER_SEC) + ts.tv_nsec;
>+}
>+
>+/* From lib/cmdline.c. */
>+static unsigned long memparse(const char *ptr)
>+{
>+ char *endptr;
>+
>+ unsigned long long ret = strtoull(ptr, &endptr, 0);
>+
>+ switch (*endptr) {
>+ case 'E':
>+ case 'e':
>+ ret <<= 10;
>+ case 'P':
>+ case 'p':
>+ ret <<= 10;
>+ case 'T':
>+ case 't':
>+ ret <<= 10;
>+ case 'G':
>+ case 'g':
>+ ret <<= 10;
>+ case 'M':
>+ case 'm':
>+ ret <<= 10;
>+ case 'K':
>+ case 'k':
>+ ret <<= 10;
>+ endptr++;
>+ default:
>+ break;
>+ }
>+
>+ return ret;
>+}
>+
>+static void vsock_increase_buf_size(int fd)
>+{
>+ if (setsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_BUFFER_MAX_SIZE,
>+ &vsock_buf_bytes, sizeof(vsock_buf_bytes))) {
>+ perror("setsockopt");
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ if (setsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_BUFFER_SIZE,
>+ &vsock_buf_bytes, sizeof(vsock_buf_bytes))) {
>+ perror("setsockopt");
>+ exit(EXIT_FAILURE);
>+ }
>+}
>+
>+static int vsock_connect(unsigned int cid, unsigned int port)
>+{
>+ union {
>+ struct sockaddr sa;
>+ struct sockaddr_vm svm;
>+ } addr = {
>+ .svm = {
>+ .svm_family = AF_VSOCK,
>+ .svm_port = port,
>+ .svm_cid = cid,
>+ },
>+ };
>+ int fd;
>+
>+ fd = socket(AF_VSOCK, SOCK_STREAM, 0);
>+
>+ if (fd < 0)
>+ return -1;
>+
>+ vsock_increase_buf_size(fd);
>+
>+ if (connect(fd, &addr.sa, sizeof(addr.svm)) < 0) {
>+ close(fd);
>+ return -1;
>+ }
>+
>+ return fd;
>+}
>+
>+static float get_gbps(unsigned long bytes, time_t ns_delta)
>+{
>+ return ((float)bytes / BYTES_PER_GB) /
>+ ((float)ns_delta / NSEC_PER_SEC);
>+}
>+
>+static void run_sender(unsigned long to_send_bytes)
>+{
>+ time_t tx_begin_ns;
>+ size_t total_send;
>+ int client_fd;
>+ char *data;
>+ int fd;
>+ union {
>+ struct sockaddr sa;
>+ struct sockaddr_vm svm;
>+ } addr = {
>+ .svm = {
>+ .svm_family = AF_VSOCK,
>+ .svm_port = port,
>+ .svm_cid = VMADDR_CID_ANY,
>+ },
>+ };
>+ union {
>+ struct sockaddr sa;
>+ struct sockaddr_vm svm;
>+ } clientaddr;
>+
>+ socklen_t clientaddr_len = sizeof(clientaddr.svm);
>+
>+ printf("Run as sender\n");
>+ printf("Listen port %u\n", port);
>+ printf("Send %lu bytes\n", to_send_bytes);
>+ printf("TX buffer %lu bytes\n", buf_size_bytes);
>+ printf("Peer buffer %lu bytes\n", vsock_buf_bytes);
>+
>+ fd = socket(AF_VSOCK, SOCK_STREAM, 0);
>+
>+ if (fd < 0) {
>+ perror("socket");
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ if (bind(fd, &addr.sa, sizeof(addr.svm)) < 0) {
>+ perror("bind");
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ if (listen(fd, 1) < 0) {
>+ perror("listen");
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ client_fd = accept(fd, &clientaddr.sa, &clientaddr_len);
>+
>+ if (client_fd < 0) {
>+ perror("accept");
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ vsock_increase_buf_size(client_fd);
>+
>+ data = malloc(buf_size_bytes);
>+
>+ if (!data) {
>+ printf("malloc failed\n");
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ memset(data, 0, buf_size_bytes);
>+ total_send = 0;
>+ tx_begin_ns = current_nsec();
>+
>+ while (total_send < to_send_bytes) {
>+ ssize_t sent;
>+
>+ sent = write(client_fd, data, buf_size_bytes);
>+
>+ if (sent <= 0) {
>+ perror("write");
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ total_send += sent;
>+ }
>+
>+ printf("tx performance: %f Gb/s\n",
>+ get_gbps(total_send, current_nsec() - tx_begin_ns));
>+
>+ close(client_fd);
>+ close(fd);
>+
>+ free(data);
>+}
>+
>+static void run_receiver(int peer_cid, unsigned long rcvlowat_bytes)
>+{
>+ unsigned int read_cnt;
>+ time_t rx_begin_ns;
>+ time_t in_read_ns;
>+ size_t total_recv;
>+ void *data;
>+ int fd;
>+
>+ printf("Run as receiver\n");
>+ printf("Connect to %i:%u\n", peer_cid, port);
>+ printf("RX buffer %lu bytes\n", buf_size_bytes);
>+ printf("Peer buffer %lu bytes\n", vsock_buf_bytes);
>+ printf("SO_RCVLOWAT %lu bytes\n", rcvlowat_bytes);
>+
>+ fd = vsock_connect(peer_cid, port);
>+
>+ if (fd < 0) {
>+ perror("socket");
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ if (setsockopt(fd, SOL_SOCKET, SO_RCVLOWAT,
>+ &rcvlowat_bytes,
>+ sizeof(rcvlowat_bytes))) {
>+ perror("setsockopt");
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ data = mmap(NULL, buf_size_bytes, PROT_READ | PROT_WRITE,
>+ MAP_ANONYMOUS | MAP_PRIVATE | MAP_POPULATE, -1, 0);
>+
>+ if (data == MAP_FAILED) {
>+ perror("mmap");
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ read_cnt = 0;
>+ in_read_ns = 0;
>+ total_recv = 0;
>+ rx_begin_ns = current_nsec();
>+
>+ while (1) {
>+ struct pollfd fds = { 0 };
>+
>+ fds.fd = fd;
>+ fds.events = POLLIN | POLLERR | POLLHUP |
>+ POLLRDHUP | POLLNVAL;
>+
>+ if (poll(&fds, 1, -1) < 0) {
>+ perror("poll");
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ if (fds.revents & POLLERR) {
>+ printf("'poll()' error\n");
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ if (fds.revents & POLLIN) {
>+ ssize_t bytes_read;
>+ time_t t;
>+
>+ t = current_nsec();
>+ bytes_read = read(fd, data, buf_size_bytes);
>+ in_read_ns += (current_nsec() - t);
>+ read_cnt++;
>+
>+ if (!bytes_read)
>+ break;
>+
>+ if (bytes_read < 0) {
>+ perror("recv");
>+ exit(EXIT_FAILURE);
>+ }
>+
>+ total_recv += bytes_read;
>+ }
>+
>+ if (fds.revents & (POLLHUP | POLLRDHUP))
>+ break;
>+ }
>+
>+ printf("rx performance: %f Gb/s\n",
>+ get_gbps(total_recv, current_nsec() - rx_begin_ns));
>+ printf("total in 'read()': %f sec\n", (float)in_read_ns / NSEC_PER_SEC);

Maybe better "total time in ..."

>+ printf("POLLIN wakeups: %i\n", read_cnt);
>+ printf("average in 'read()': %f ns\n", (float)in_read_ns / read_cnt);

Ditto.

>+
>+ munmap(data, buf_size_bytes);
>+ close(fd);
>+}
>+
>+static void usage(void)
>+{
>+ printf("Help:\n"
>+ "\n"
>+ "This is benchmarking utility, to test vsock performance.\n"
>+ "It runs in two modes: sender or receiver. In sender mode, it waits\n"
>+ "connection from receiver, and when established, sender starts data\n"
>+ "transmission.\n"
>+ "\n"
>+ "Options:\n"
>+ " -h This help message\n"
>+ " -s Sender mode(receiver default)\n"
>+ " -p <port> Port (%d)\n"
>+ " -c <cid> CID of the peer (%d)\n"
>+ " -m <bytes to send> Bytes to send (%d)\n"
>+ " -b <buffer size> Rx/Tx buffer size (%d). In sender mode\n"
>+ " it is size of buffer passed to 'write()'.\n"
>+ " In receiver mode it is size of buffer passed\n"
>+ " to 'read()'.\n"
>+ " -v <peer buffer size> Peer buffer size (%d)\n"

Instead of "peer buffer", what about "socket buffer"?

>+ " -r <SO_RCVLOWAT> SO_RCVLOWAT (%d)\n"
>+ "\n", DEFAULT_PORT, DEFAULT_CID, DEFAULT_TO_SEND_BYTES,
>+ DEFAULT_BUF_SIZE_BYTES, DEFAULT_VSOCK_BUF_BYTES,
>+ DEFAULT_RCVLOWAT_BYTES);
>+ exit(EXIT_FAILURE);
>+}
>+
>+static long strtolx(const char *arg)
>+{
>+ long value;
>+ char *end;
>+
>+ value = strtol(arg, &end, 10);
>+
>+ if (end != arg + strlen(arg))
>+ usage();
>+
>+ return value;
>+}
>+
>+int main(int argc, char **argv)
>+{
>+ unsigned long to_send_bytes = DEFAULT_TO_SEND_BYTES;
>+ unsigned long rcvlowat_bytes = DEFAULT_RCVLOWAT_BYTES;
>+ bool receiver_mode = true;
>+ int peer_cid = DEFAULT_CID;
>+ int c;
>+
>+ while ((c = getopt(argc, argv, "v:r:c:p:m:b:sh")) != -1) {
>+ switch (c) {
>+ case 'v': /* Peer buffer size. */
>+ vsock_buf_bytes = memparse(optarg);
>+ break;
>+ case 'r': /* SO_RCVLOWAT value. */
>+ rcvlowat_bytes = memparse(optarg);
>+ break;
>+ case 'c': /* CID to connect to. */
>+ peer_cid = strtolx(optarg);
>+ break;
>+ case 'p': /* Port to connect to. */
>+ port = strtolx(optarg);
>+ break;
>+ case 'm': /* Bytes to send. */
>+ to_send_bytes = memparse(optarg);
>+ break;
>+ case 'b': /* Size of rx/tx buffer. */
>+ buf_size_bytes = memparse(optarg);
>+ break;
>+ case 's': /* Sender mode. */
>+ receiver_mode = false;
>+ break;
>+ case 'h': /* Help. */
>+ usage();
>+ break;
>+ default:
>+ usage();
>+ }
>+ }
>+
>+ if (receiver_mode)
>+ run_receiver(peer_cid, rcvlowat_bytes);
>+ else
>+ run_sender(to_send_bytes);
>+
>+ return 0;
>+}
>--
>2.25.1

2022-12-01 12:10:22

by Arseniy Krasnov

[permalink] [raw]
Subject: Re: [RFC PATCH v2 4/6] test/vsock: rework message bounds test

On 01.12.2022 12:41, Stefano Garzarella wrote:
> On Fri, Nov 25, 2022 at 05:10:35PM +0000, Arseniy Krasnov wrote:
>> This updates message bound test making it more complex. Instead of
>> sending 1 bytes messages with one MSG_EOR bit, it sends messages of
>> random length(one half of messages are smaller than page size, second
>> half are bigger) with random number of MSG_EOR bits set. Receiver
>> also don't know total number of messages.
>>
>> Signed-off-by: Arseniy Krasnov <[email protected]>
>> ---
>> tools/testing/vsock/control.c    |  28 +++++++
>> tools/testing/vsock/control.h    |   2 +
>> tools/testing/vsock/util.c       |  13 ++++
>> tools/testing/vsock/util.h       |   1 +
>> tools/testing/vsock/vsock_test.c | 124 +++++++++++++++++++++++++++----
>> 5 files changed, 155 insertions(+), 13 deletions(-)
>>
>> diff --git a/tools/testing/vsock/control.c b/tools/testing/vsock/control.c
>> index 4874872fc5a3..d2deb4b15b94 100644
>> --- a/tools/testing/vsock/control.c
>> +++ b/tools/testing/vsock/control.c
>> @@ -141,6 +141,34 @@ void control_writeln(const char *str)
>>     timeout_end();
>> }
>>
>> +void control_writeulong(unsigned long value)
>> +{
>> +    char str[32];
>> +
>> +    if (snprintf(str, sizeof(str), "%lu", value) >= sizeof(str)) {
>> +        perror("snprintf");
>> +        exit(EXIT_FAILURE);
>> +    }
>> +
>> +    control_writeln(str);
>> +}
>> +
>> +unsigned long control_readulong(void)
>> +{
>> +    unsigned long value;
>> +    char *str;
>> +
>> +    str = control_readln();
>> +
>> +    if (!str)
>> +        exit(EXIT_FAILURE);
>> +
>> +    value = strtoul(str, NULL, 10);
>> +    free(str);
>> +
>> +    return value;
>> +}
>> +
>> /* Return the next line from the control socket (without the trailing newline).
>>  *
>>  * The program terminates if a timeout occurs.
>> diff --git a/tools/testing/vsock/control.h b/tools/testing/vsock/control.h
>> index 51814b4f9ac1..c1f77fdb2c7a 100644
>> --- a/tools/testing/vsock/control.h
>> +++ b/tools/testing/vsock/control.h
>> @@ -9,7 +9,9 @@ void control_init(const char *control_host, const char *control_port,
>> void control_cleanup(void);
>> void control_writeln(const char *str);
>> char *control_readln(void);
>> +unsigned long control_readulong(void);
>> void control_expectln(const char *str);
>> bool control_cmpln(char *line, const char *str, bool fail);
>> +void control_writeulong(unsigned long value);
>>
>> #endif /* CONTROL_H */
>> diff --git a/tools/testing/vsock/util.c b/tools/testing/vsock/util.c
>> index 2acbb7703c6a..01b636d3039a 100644
>> --- a/tools/testing/vsock/util.c
>> +++ b/tools/testing/vsock/util.c
>> @@ -395,3 +395,16 @@ void skip_test(struct test_case *test_cases, size_t test_cases_len,
>>
>>     test_cases[test_id].skip = true;
>> }
>> +
>> +unsigned long hash_djb2(const void *data, size_t len)
>> +{
>> +    unsigned long hash = 5381;
>> +    int i = 0;
>> +
>> +    while (i < len) {
>> +        hash = ((hash << 5) + hash) + ((unsigned char *)data)[i];
>> +        i++;
>> +    }
>> +
>> +    return hash;
>> +}
>> diff --git a/tools/testing/vsock/util.h b/tools/testing/vsock/util.h
>> index a3375ad2fb7f..fb99208a95ea 100644
>> --- a/tools/testing/vsock/util.h
>> +++ b/tools/testing/vsock/util.h
>> @@ -49,4 +49,5 @@ void run_tests(const struct test_case *test_cases,
>> void list_tests(const struct test_case *test_cases);
>> void skip_test(struct test_case *test_cases, size_t test_cases_len,
>>            const char *test_id_str);
>> +unsigned long hash_djb2(const void *data, size_t len);
>> #endif /* UTIL_H */
>> diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
>> index bb6d691cb30d..12ef0cca6f93 100644
>> --- a/tools/testing/vsock/vsock_test.c
>> +++ b/tools/testing/vsock/vsock_test.c
>> @@ -284,10 +284,14 @@ static void test_stream_msg_peek_server(const struct test_opts *opts)
>>     close(fd);
>> }
>>
>> -#define MESSAGES_CNT 7
>> -#define MSG_EOR_IDX (MESSAGES_CNT / 2)
>> +#define SOCK_BUF_SIZE (2 * 1024 * 1024)
>> +#define MAX_MSG_SIZE (32 * 1024)
>> +
>> static void test_seqpacket_msg_bounds_client(const struct test_opts *opts)
>> {
>> +    unsigned long curr_hash;
>> +    int page_size;
>> +    int msg_count;
>>     int fd;
>>
>>     fd = vsock_seqpacket_connect(opts->peer_cid, 1234);
>> @@ -296,18 +300,78 @@ static void test_seqpacket_msg_bounds_client(const struct test_opts *opts)
>>         exit(EXIT_FAILURE);
>>     }
>>
>> -    /* Send several messages, one with MSG_EOR flag */
>> -    for (int i = 0; i < MESSAGES_CNT; i++)
>> -        send_byte(fd, 1, (i == MSG_EOR_IDX) ? MSG_EOR : 0);
>> +    /* Wait, until receiver sets buffer size. */
>> +    control_expectln("SRVREADY");
>> +
>> +    curr_hash = 0;
>> +    page_size = getpagesize();
>> +    msg_count = SOCK_BUF_SIZE / MAX_MSG_SIZE;
>> +
>> +    for (int i = 0; i < msg_count; i++) {
>> +        ssize_t send_size;
>> +        size_t buf_size;
>> +        int flags;
>> +        void *buf;
>> +
>> +        /* Use "small" buffers and "big" buffers. */
>> +        if (i & 1)
>> +            buf_size = page_size +
>> +                    (rand() % (MAX_MSG_SIZE - page_size));
>> +        else
>> +            buf_size = 1 + (rand() % page_size);
>> +
>> +        buf = malloc(buf_size);
>> +
>> +        if (!buf) {
>> +            perror("malloc");
>> +            exit(EXIT_FAILURE);
>> +        }
>> +
>> +        /* Set at least one MSG_EOR + some random. */
>> +        if (i == (msg_count / 2) || (rand() & 1)) {
>> +            flags = MSG_EOR;
>> +            curr_hash++;
>> +        } else {
>> +            flags = 0;
>> +        }
>> +
>> +        send_size = send(fd, buf, buf_size, flags);
>> +
>> +        if (send_size < 0) {
>> +            perror("send");
>> +            exit(EXIT_FAILURE);
>> +        }
>> +
>> +        if (send_size != buf_size) {
>> +            fprintf(stderr, "Invalid send size\n");
>> +            exit(EXIT_FAILURE);
>> +        }
>> +
>> +        /*
>> +         * Hash sum is computed at both client and server in
>> +         * the same way: H = hash(H + 'message length'). Idea
>> +         * is simple: such hash "accumulates" length and order
>> +         * of every sent/received message. After data exchange,
>> +         * both sums are compared using control socket, and if
>> +         * message bounds wasn't broken - two values must be
>> +         * equal.
>> +         */
>> +        curr_hash += send_size;
>
> Sorry, I thought about it now, but instead of hashing the size, couldn't we just fill the buffer with a pattern (maybe random?) and hash the content?
>
> That way we see both that the bounds were met and that the content was transferred correctly.
>
> What do you think?
>
> The rest LGTM!
>
> Stefano
Yes, I'll implement it in the next version as it gives exactly the same result

Thanks
>
>> +        curr_hash = hash_djb2(&curr_hash, sizeof(curr_hash));
>> +    }
>>
>>     control_writeln("SENDDONE");
>> +    control_writeulong(curr_hash);
>>     close(fd);
>> }
>>
>> static void test_seqpacket_msg_bounds_server(const struct test_opts *opts)
>> {
>> +    unsigned long sock_buf_size;
>> +    unsigned long remote_hash;
>> +    unsigned long curr_hash;
>>     int fd;
>> -    char buf[16];
>> +    char buf[MAX_MSG_SIZE];
>>     struct msghdr msg = {0};
>>     struct iovec iov = {0};
>>
>> @@ -317,25 +381,58 @@ static void test_seqpacket_msg_bounds_server(const struct test_opts *opts)
>>         exit(EXIT_FAILURE);
>>     }
>>
>> +    sock_buf_size = SOCK_BUF_SIZE;
>> +
>> +    if (setsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_BUFFER_MAX_SIZE,
>> +               &sock_buf_size, sizeof(sock_buf_size))) {
>> +        perror("getsockopt");
>> +        exit(EXIT_FAILURE);
>> +    }
>> +
>> +    if (setsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_BUFFER_SIZE,
>> +               &sock_buf_size, sizeof(sock_buf_size))) {
>> +        perror("getsockopt");
>> +        exit(EXIT_FAILURE);
>> +    }
>> +
>> +    /* Ready to receive data. */
>> +    control_writeln("SRVREADY");
>> +    /* Wait, until peer sends whole data. */
>>     control_expectln("SENDDONE");
>>     iov.iov_base = buf;
>>     iov.iov_len = sizeof(buf);
>>     msg.msg_iov = &iov;
>>     msg.msg_iovlen = 1;
>>
>> -    for (int i = 0; i < MESSAGES_CNT; i++) {
>> -        if (recvmsg(fd, &msg, 0) != 1) {
>> -            perror("message bound violated");
>> -            exit(EXIT_FAILURE);
>> -        }
>> +    curr_hash = 0;
>>
>> -        if ((i == MSG_EOR_IDX) ^ !!(msg.msg_flags & MSG_EOR)) {
>> -            perror("MSG_EOR");
>> +    while (1) {
>> +        ssize_t recv_size;
>> +
>> +        recv_size = recvmsg(fd, &msg, 0);
>> +
>> +        if (!recv_size)
>> +            break;
>> +
>> +        if (recv_size < 0) {
>> +            perror("recvmsg");
>>             exit(EXIT_FAILURE);
>>         }
>> +
>> +        if (msg.msg_flags & MSG_EOR)
>> +            curr_hash++;
>> +
>> +        curr_hash += recv_size;
>> +        curr_hash = hash_djb2(&curr_hash, sizeof(curr_hash));
>>     }
>>
>>     close(fd);
>> +    remote_hash = control_readulong();
>> +
>> +    if (curr_hash != remote_hash) {
>> +        fprintf(stderr, "Message bounds broken\n");
>> +        exit(EXIT_FAILURE);
>> +    }
>> }
>>
>> #define MESSAGE_TRUNC_SZ 32
>> @@ -837,6 +934,7 @@ int main(int argc, char **argv)
>>         .peer_cid = VMADDR_CID_ANY,
>>     };
>>
>> +    srand(time(NULL));
>>     init_signals();
>>
>>     for (;;) {
>> -- 
>> 2.25.1
>

2022-12-01 12:18:43

by Arseniy Krasnov

[permalink] [raw]
Subject: Re: [RFC PATCH v2 6/6] test/vsock: vsock_perf utility

On 01.12.2022 13:21, Stefano Garzarella wrote:
> On Fri, Nov 25, 2022 at 05:15:08PM +0000, Arseniy Krasnov wrote:
>> This adds utility to check vsock rx/tx performance.
>>
>> Usage as sender:
>> ./vsock_perf -p <port> -m <bytes to send)
>> Usage as receiver:
>> ./vsock_perf -c <cid> -p <port> -r <SO_RCVLOWAT>
>>
>> Signed-off-by: Arseniy Krasnov <[email protected]>
>> ---
>> tools/testing/vsock/Makefile     |   1 +
>> tools/testing/vsock/README       |  34 +++
>> tools/testing/vsock/vsock_perf.c | 400 +++++++++++++++++++++++++++++++
>> 3 files changed, 435 insertions(+)
>> create mode 100644 tools/testing/vsock/vsock_perf.c
>>
>> diff --git a/tools/testing/vsock/Makefile b/tools/testing/vsock/Makefile
>> index f8293c6910c9..d36fdd59fe2e 100644
>> --- a/tools/testing/vsock/Makefile
>> +++ b/tools/testing/vsock/Makefile
>> @@ -3,6 +3,7 @@ all: test
>> test: vsock_test vsock_diag_test
>> vsock_test: vsock_test.o timeout.o control.o util.o
>> vsock_diag_test: vsock_diag_test.o timeout.o control.o util.o
>> +vsock_perf: vsock_perf.o
>>
>> CFLAGS += -g -O2 -Werror -Wall -I. -I../../include -I../../../usr/include -Wno-pointer-sign -fno-strict-overflow -fno-strict-aliasing -fno-common -MMD -U_FORTIFY_SOURCE -D_GNU_SOURCE
>> .PHONY: all test clean
>> diff --git a/tools/testing/vsock/README b/tools/testing/vsock/README
>> index 4d5045e7d2c3..dbeba1d62016 100644
>> --- a/tools/testing/vsock/README
>> +++ b/tools/testing/vsock/README
>> @@ -35,3 +35,37 @@ Invoke test binaries in both directions as follows:
>>                        --control-port=$GUEST_IP \
>>                        --control-port=1234 \
>>                        --peer-cid=3
>
> I'm not a native speaker, so the suggestions below may be wrong.Ack, this and every comment below
Thanks for review!
Arseniy
>
>> +
>> +vsock_perf utility
>> +-------------------
>> +'vsock_perf' is simple tool to measure vsock performance. It works in
>
> is a simple tool
>
>> +sender/receiver modes: sender waits for connection at specified port,
>> +and after accept, starts data transmission to receiver. After data
>   ^
> and after accepting it
>                                              ^
>                                              to the receiver
>
>> +processing is done, it prints several metrics(see below).
>> +
>> +Usage:
>> +# run as sender
>> +# listen port 1234, tx buffer size is 1MB, send of data 1G
>> +./vsock_perf -s -p 1234 -b 1MB -m 1G
>
> Like for vsock_test and vsock_diag_test I would use only the long form for the parameters (e.g. --sender, --port, etc.)
>
>> +
>> +Output:
>> +tx performance: A Gb/s
>> +
>> +Output explanation:
>> +A is calculated as "number of bytes to send" / "time in tx loop"
>> +
>> +# run as receiver
>> +# connect to CID 2, port 1234, rx buffer size is 1MB, peer buf is 1G, SO_RCVLOWAT is 65536
>> +./vsock_perf -c 2 -p 1234 -b 1MB -v 1G -r 65536
>> +
>> +Output:
>> +rx performance: A Gb/s
>> +total in 'read()': B sec
>> +POLLIN wakeups: C
>> +average in 'read()': D ns
>> +
>> +Output explanation:
>> +A is calculated as "number of received bytes" / "time in rx loop".
>> +B is time, spent in 'read()' system call(excluding 'poll()')
>> +C is number of 'poll()' wake ups with POLLIN bit set.
>> +D is B / C, e.g. average amount of time, spent in single 'read()'.
>> diff --git a/tools/testing/vsock/vsock_perf.c b/tools/testing/vsock/vsock_perf.c
>> new file mode 100644
>> index 000000000000..bca2027de52d
>> --- /dev/null
>> +++ b/tools/testing/vsock/vsock_perf.c
>> @@ -0,0 +1,400 @@
>> +// SPDX-License-Identifier: GPL-2.0-only
>> +/*
>> + * vsock_perf - benchmark utility for vsock.
>> + *
>> + * Copyright (C) 2022 SberDevices.
>> + *
>> + * Author: Arseniy Krasnov <[email protected]>
>> + */
>> +#include <getopt.h>
>> +#include <stdio.h>
>> +#include <stdlib.h>
>> +#include <stdbool.h>
>> +#include <string.h>
>> +#include <errno.h>
>> +#include <unistd.h>
>> +#include <time.h>
>> +#include <sys/mman.h>
>> +#include <stdint.h>
>> +#include <poll.h>
>> +#include <sys/socket.h>
>> +#include <linux/vm_sockets.h>
>> +
>> +#define DEFAULT_BUF_SIZE_BYTES    (128 * 1024)
>> +#define DEFAULT_TO_SEND_BYTES    (64 * 1024)
>> +#define DEFAULT_VSOCK_BUF_BYTES (256 * 1024)
>> +#define DEFAULT_RCVLOWAT_BYTES    1
>> +#define DEFAULT_PORT        1234
>> +#define DEFAULT_CID        2
>> +
>> +#define BYTES_PER_GB        (1024 * 1024 * 1024ULL)
>> +#define NSEC_PER_SEC        (1000000000ULL)
>> +
>> +static unsigned int port = DEFAULT_PORT;
>> +static unsigned long buf_size_bytes = DEFAULT_BUF_SIZE_BYTES;
>> +static unsigned long vsock_buf_bytes = DEFAULT_VSOCK_BUF_BYTES;
>> +
>> +static inline time_t current_nsec(void)
>> +{
>> +    struct timespec ts;
>> +
>> +    if (clock_gettime(CLOCK_REALTIME, &ts)) {
>> +        perror("clock_gettime");
>> +        exit(EXIT_FAILURE);
>> +    }
>> +
>> +    return (ts.tv_sec * NSEC_PER_SEC) + ts.tv_nsec;
>> +}
>> +
>> +/* From lib/cmdline.c. */
>> +static unsigned long memparse(const char *ptr)
>> +{
>> +    char *endptr;
>> +
>> +    unsigned long long ret = strtoull(ptr, &endptr, 0);
>> +
>> +    switch (*endptr) {
>> +    case 'E':
>> +    case 'e':
>> +        ret <<= 10;
>> +    case 'P':
>> +    case 'p':
>> +        ret <<= 10;
>> +    case 'T':
>> +    case 't':
>> +        ret <<= 10;
>> +    case 'G':
>> +    case 'g':
>> +        ret <<= 10;
>> +    case 'M':
>> +    case 'm':
>> +        ret <<= 10;
>> +    case 'K':
>> +    case 'k':
>> +        ret <<= 10;
>> +        endptr++;
>> +    default:
>> +        break;
>> +    }
>> +
>> +    return ret;
>> +}
>> +
>> +static void vsock_increase_buf_size(int fd)
>> +{
>> +    if (setsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_BUFFER_MAX_SIZE,
>> +               &vsock_buf_bytes, sizeof(vsock_buf_bytes))) {
>> +        perror("setsockopt");
>> +        exit(EXIT_FAILURE);
>> +    }
>> +
>> +    if (setsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_BUFFER_SIZE,
>> +               &vsock_buf_bytes, sizeof(vsock_buf_bytes))) {
>> +        perror("setsockopt");
>> +        exit(EXIT_FAILURE);
>> +    }
>> +}
>> +
>> +static int vsock_connect(unsigned int cid, unsigned int port)
>> +{
>> +    union {
>> +        struct sockaddr sa;
>> +        struct sockaddr_vm svm;
>> +    } addr = {
>> +        .svm = {
>> +            .svm_family = AF_VSOCK,
>> +            .svm_port = port,
>> +            .svm_cid = cid,
>> +        },
>> +    };
>> +    int fd;
>> +
>> +    fd = socket(AF_VSOCK, SOCK_STREAM, 0);
>> +
>> +    if (fd < 0)
>> +        return -1;
>> +
>> +    vsock_increase_buf_size(fd);
>> +
>> +    if (connect(fd, &addr.sa, sizeof(addr.svm)) < 0) {
>> +        close(fd);
>> +        return -1;
>> +    }
>> +
>> +    return fd;
>> +}
>> +
>> +static float get_gbps(unsigned long bytes, time_t ns_delta)
>> +{
>> +    return ((float)bytes / BYTES_PER_GB) /
>> +           ((float)ns_delta / NSEC_PER_SEC);
>> +}
>> +
>> +static void run_sender(unsigned long to_send_bytes)
>> +{
>> +    time_t tx_begin_ns;
>> +    size_t total_send;
>> +    int client_fd;
>> +    char *data;
>> +    int fd;
>> +    union {
>> +        struct sockaddr sa;
>> +        struct sockaddr_vm svm;
>> +    } addr = {
>> +        .svm = {
>> +            .svm_family = AF_VSOCK,
>> +            .svm_port = port,
>> +            .svm_cid = VMADDR_CID_ANY,
>> +        },
>> +    };
>> +    union {
>> +        struct sockaddr sa;
>> +        struct sockaddr_vm svm;
>> +    } clientaddr;
>> +
>> +    socklen_t clientaddr_len = sizeof(clientaddr.svm);
>> +
>> +    printf("Run as sender\n");
>> +    printf("Listen port %u\n", port);
>> +    printf("Send %lu bytes\n", to_send_bytes);
>> +    printf("TX buffer %lu bytes\n", buf_size_bytes);
>> +    printf("Peer buffer %lu bytes\n", vsock_buf_bytes);
>> +
>> +    fd = socket(AF_VSOCK, SOCK_STREAM, 0);
>> +
>> +    if (fd < 0) {
>> +        perror("socket");
>> +        exit(EXIT_FAILURE);
>> +    }
>> +
>> +    if (bind(fd, &addr.sa, sizeof(addr.svm)) < 0) {
>> +        perror("bind");
>> +        exit(EXIT_FAILURE);
>> +    }
>> +
>> +    if (listen(fd, 1) < 0) {
>> +        perror("listen");
>> +        exit(EXIT_FAILURE);
>> +    }
>> +
>> +    client_fd = accept(fd, &clientaddr.sa, &clientaddr_len);
>> +
>> +    if (client_fd < 0) {
>> +        perror("accept");
>> +        exit(EXIT_FAILURE);
>> +    }
>> +
>> +    vsock_increase_buf_size(client_fd);
>> +
>> +    data = malloc(buf_size_bytes);
>> +
>> +    if (!data) {
>> +        printf("malloc failed\n");
>> +        exit(EXIT_FAILURE);
>> +    }
>> +
>> +    memset(data, 0, buf_size_bytes);
>> +    total_send = 0;
>> +    tx_begin_ns = current_nsec();
>> +
>> +    while (total_send < to_send_bytes) {
>> +        ssize_t sent;
>> +
>> +        sent = write(client_fd, data, buf_size_bytes);
>> +
>> +        if (sent <= 0) {
>> +            perror("write");
>> +            exit(EXIT_FAILURE);
>> +        }
>> +
>> +        total_send += sent;
>> +    }
>> +
>> +    printf("tx performance: %f Gb/s\n",
>> +           get_gbps(total_send, current_nsec() - tx_begin_ns));
>> +
>> +    close(client_fd);
>> +    close(fd);
>> +
>> +    free(data);
>> +}
>> +
>> +static void run_receiver(int peer_cid, unsigned long rcvlowat_bytes)
>> +{
>> +    unsigned int read_cnt;
>> +    time_t rx_begin_ns;
>> +    time_t in_read_ns;
>> +    size_t total_recv;
>> +    void *data;
>> +    int fd;
>> +
>> +    printf("Run as receiver\n");
>> +    printf("Connect to %i:%u\n", peer_cid, port);
>> +    printf("RX buffer %lu bytes\n", buf_size_bytes);
>> +    printf("Peer buffer %lu bytes\n", vsock_buf_bytes);
>> +    printf("SO_RCVLOWAT %lu bytes\n", rcvlowat_bytes);
>> +
>> +    fd = vsock_connect(peer_cid, port);
>> +
>> +    if (fd < 0) {
>> +        perror("socket");
>> +        exit(EXIT_FAILURE);
>> +    }
>> +
>> +    if (setsockopt(fd, SOL_SOCKET, SO_RCVLOWAT,
>> +               &rcvlowat_bytes,
>> +               sizeof(rcvlowat_bytes))) {
>> +        perror("setsockopt");
>> +        exit(EXIT_FAILURE);
>> +    }
>> +
>> +    data = mmap(NULL, buf_size_bytes, PROT_READ | PROT_WRITE,
>> +            MAP_ANONYMOUS | MAP_PRIVATE | MAP_POPULATE, -1, 0);
>> +
>> +    if (data == MAP_FAILED) {
>> +        perror("mmap");
>> +        exit(EXIT_FAILURE);
>> +    }
>> +
>> +    read_cnt = 0;
>> +    in_read_ns = 0;
>> +    total_recv = 0;
>> +    rx_begin_ns = current_nsec();
>> +
>> +    while (1) {
>> +        struct pollfd fds = { 0 };
>> +
>> +        fds.fd = fd;
>> +        fds.events = POLLIN | POLLERR | POLLHUP |
>> +                 POLLRDHUP | POLLNVAL;
>> +
>> +        if (poll(&fds, 1, -1) < 0) {
>> +            perror("poll");
>> +            exit(EXIT_FAILURE);
>> +        }
>> +
>> +        if (fds.revents & POLLERR) {
>> +            printf("'poll()' error\n");
>> +            exit(EXIT_FAILURE);
>> +        }
>> +
>> +        if (fds.revents & POLLIN) {
>> +            ssize_t bytes_read;
>> +            time_t t;
>> +
>> +            t = current_nsec();
>> +            bytes_read = read(fd, data, buf_size_bytes);
>> +            in_read_ns += (current_nsec() - t);
>> +            read_cnt++;
>> +
>> +            if (!bytes_read)
>> +                break;
>> +
>> +            if (bytes_read < 0) {
>> +                perror("recv");
>> +                exit(EXIT_FAILURE);
>> +            }
>> +
>> +            total_recv += bytes_read;
>> +        }
>> +
>> +        if (fds.revents & (POLLHUP | POLLRDHUP))
>> +            break;
>> +    }
>> +
>> +    printf("rx performance: %f Gb/s\n",
>> +           get_gbps(total_recv, current_nsec() - rx_begin_ns));
>> +    printf("total in 'read()': %f sec\n", (float)in_read_ns / NSEC_PER_SEC);
>
> Maybe better "total time in ..."
>
>> +    printf("POLLIN wakeups: %i\n", read_cnt);
>> +    printf("average in 'read()': %f ns\n", (float)in_read_ns / read_cnt);
>
> Ditto.
>
>> +
>> +    munmap(data, buf_size_bytes);
>> +    close(fd);
>> +}
>> +
>> +static void usage(void)
>> +{
>> +    printf("Help:\n"
>> +           "\n"
>> +           "This is benchmarking utility, to test vsock performance.\n"
>> +           "It runs in two modes: sender or receiver. In sender mode, it waits\n"
>> +           "connection from receiver, and when established, sender starts data\n"
>> +           "transmission.\n"
>> +           "\n"
>> +           "Options:\n"
>> +           "  -h                This help message\n"
>> +           "  -s                Sender mode(receiver default)\n"
>> +           "  -p <port>            Port (%d)\n"
>> +           "  -c <cid>            CID of the peer (%d)\n"
>> +           "  -m <bytes to send>        Bytes to send (%d)\n"
>> +           "  -b <buffer size>        Rx/Tx buffer size (%d). In sender mode\n"
>> +           "                                it is size of buffer passed to 'write()'.\n"
>> +           "                                In receiver mode it is size of buffer passed\n"
>> +           "                                to 'read()'.\n"
>> +           "  -v <peer buffer size>        Peer buffer size (%d)\n"
>
> Instead of "peer buffer", what about "socket buffer"?
>
>> +           "  -r <SO_RCVLOWAT>        SO_RCVLOWAT (%d)\n"
>> +           "\n", DEFAULT_PORT, DEFAULT_CID, DEFAULT_TO_SEND_BYTES,
>> +           DEFAULT_BUF_SIZE_BYTES, DEFAULT_VSOCK_BUF_BYTES,
>> +           DEFAULT_RCVLOWAT_BYTES);
>> +    exit(EXIT_FAILURE);
>> +}
>> +
>> +static long strtolx(const char *arg)
>> +{
>> +    long value;
>> +    char *end;
>> +
>> +    value = strtol(arg, &end, 10);
>> +
>> +    if (end != arg + strlen(arg))
>> +        usage();
>> +
>> +    return value;
>> +}
>> +
>> +int main(int argc, char **argv)
>> +{
>> +    unsigned long to_send_bytes = DEFAULT_TO_SEND_BYTES;
>> +    unsigned long rcvlowat_bytes = DEFAULT_RCVLOWAT_BYTES;
>> +    bool receiver_mode = true;
>> +    int peer_cid = DEFAULT_CID;
>> +    int c;
>> +
>> +    while ((c = getopt(argc, argv, "v:r:c:p:m:b:sh")) != -1) {
>> +        switch (c) {
>> +        case 'v': /* Peer buffer size. */
>> +            vsock_buf_bytes = memparse(optarg);
>> +            break;
>> +        case 'r': /* SO_RCVLOWAT value. */
>> +            rcvlowat_bytes = memparse(optarg);
>> +            break;
>> +        case 'c': /* CID to connect to. */
>> +            peer_cid = strtolx(optarg);
>> +            break;
>> +        case 'p': /* Port to connect to. */
>> +            port = strtolx(optarg);
>> +            break;
>> +        case 'm': /* Bytes to send. */
>> +            to_send_bytes = memparse(optarg);
>> +            break;
>> +        case 'b': /* Size of rx/tx buffer. */
>> +            buf_size_bytes = memparse(optarg);
>> +            break;
>> +        case 's': /* Sender mode. */
>> +            receiver_mode = false;
>> +            break;
>> +        case 'h': /* Help. */
>> +            usage();
>> +            break;
>> +        default:
>> +            usage();
>> +        }
>> +    }
>> +
>> +    if (receiver_mode)
>> +        run_receiver(peer_cid, rcvlowat_bytes);
>> +    else
>> +        run_sender(to_send_bytes);
>> +
>> +    return 0;
>> +}
>> -- 
>> 2.25.1
>

2022-12-01 12:19:15

by Arseniy Krasnov

[permalink] [raw]
Subject: Re: [RFC PATCH v2 3/6] vsock/vmci: always return ENOMEM in case of error

On 01.12.2022 12:30, Stefano Garzarella wrote:
> On Fri, Nov 25, 2022 at 05:08:06PM +0000, Arseniy Krasnov wrote:
>> From: Bobby Eshleman <[email protected]>
>>
>> This saves original behaviour from af_vsock.c - switch any error
>> code returned from transport layer to ENOMEM.
>>
>> Signed-off-by: Bobby Eshleman <[email protected]>
>> Signed-off-by: Arseniy Krasnov <[email protected]>
>> ---
>> net/vmw_vsock/vmci_transport.c | 9 ++++++++-
>> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> @Bryan @Vishnu what do you think about this patch?
>
> A bit of context:
>
> Before this series, the af_vsock core always returned ENOMEM to the user if the transport failed to queue the packet.
>
> Now we are changing it by returning the transport error. So I think here we want to preserve the previous behavior for vmci, but I don't know if that's the right thing.
>
>
>
> @Arseniy please in the next versions describe better in the commit messages the reasons for these changes, so it is easier review for others and also in the future by reading the commit message we can understand the reason for the change.
Hello,

Sure! Sorry for that! Also, I can send both vmci and hyperv patches in the next version(e.g. not waiting for
reviewers reply and reorder them with 1/6 as You asked), as result of review could be dropped patch only.

Thanks, Arseniy
>
> Thanks,
> Stefano
>
>>
>> diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c
>> index 842c94286d31..289a36a203a2 100644
>> --- a/net/vmw_vsock/vmci_transport.c
>> +++ b/net/vmw_vsock/vmci_transport.c
>> @@ -1838,7 +1838,14 @@ static ssize_t vmci_transport_stream_enqueue(
>>     struct msghdr *msg,
>>     size_t len)
>> {
>> -    return vmci_qpair_enquev(vmci_trans(vsk)->qpair, msg, len, 0);
>> +    int err;
>> +
>> +    err = vmci_qpair_enquev(vmci_trans(vsk)->qpair, msg, len, 0);
>> +
>> +    if (err < 0)
>> +        err = -ENOMEM;
>> +
>> +    return err;
>> }
>>
>> static s64 vmci_transport_stream_has_data(struct vsock_sock *vsk)
>> -- 
>> 2.25.1
>

2022-12-01 12:48:15

by Arseniy Krasnov

[permalink] [raw]
Subject: Re: [RFC PATCH v2 5/6] test/vsock: add big message test

On 01.12.2022 12:45, Stefano Garzarella wrote:
> On Fri, Nov 25, 2022 at 05:13:06PM +0000, Arseniy Krasnov wrote:
>> This adds test for sending message, bigger than peer's buffer size.
>> For SOCK_SEQPACKET socket it must fail, as this type of socket has
>> message size limit.
>>
>> Signed-off-by: Arseniy Krasnov <[email protected]>
>> ---
>> tools/testing/vsock/vsock_test.c | 69 ++++++++++++++++++++++++++++++++
>> 1 file changed, 69 insertions(+)
>>
>> diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
>> index 12ef0cca6f93..a8e43424fb32 100644
>> --- a/tools/testing/vsock/vsock_test.c
>> +++ b/tools/testing/vsock/vsock_test.c
>> @@ -569,6 +569,70 @@ static void test_seqpacket_timeout_server(const struct test_opts *opts)
>>     close(fd);
>> }
>>
>> +static void test_seqpacket_bigmsg_client(const struct test_opts *opts)
>> +{
>> +    unsigned long sock_buf_size;
>> +    ssize_t send_size;
>> +    socklen_t len;
>> +    void *data;
>> +    int fd;
>> +
>> +    len = sizeof(sock_buf_size);
>> +
>> +    fd = vsock_seqpacket_connect(opts->peer_cid, 1234);
>> +    if (fd < 0) {
>> +        perror("connect");
>> +        exit(EXIT_FAILURE);
>> +    }
>> +
>> +    if (getsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_BUFFER_SIZE,
>> +               &sock_buf_size, &len)) {
>> +        perror("getsockopt");
>> +        exit(EXIT_FAILURE);
>> +    }
>> +
>> +    sock_buf_size++;
>> +
>> +    data = malloc(sock_buf_size);
>> +    if (!data) {
>> +        perror("malloc");
>> +        exit(EXIT_FAILURE);
>> +    }
>> +
>> +    send_size = send(fd, data, sock_buf_size, 0);
>> +    if (send_size != -1) {
>> +        fprintf(stderr, "expected 'send(2)' failure, got %zi\n",
>> +            send_size);
>> +        exit(EXIT_FAILURE);
>> +    }
>> +
>> +    if (errno != EMSGSIZE) {
>> +        fprintf(stderr, "expected EMSGSIZE in 'errno', got %i\n",
>> +            errno);
>> +        exit(EXIT_FAILURE);
>> +    }
>
> We should make sure that this is true for all transports, but since now only virtio-vsock supports it, we should be okay.
Hm, in general: I've tested this test suite for vmci may be several months ago, and found, that some tests
didn't work. I'm thinking about reworking this test suite a little bit: each transport must have own set of
tests for features that it supports. I had feeling, that all these tests are run only with virtio transport :)
Because for example SEQPACKET mode is suported only for virtio.

Thanks
>
>> +
>> +    control_writeln("CLISENT");
>> +
>> +    free(data);
>> +    close(fd);
>> +}
>> +
>> +static void test_seqpacket_bigmsg_server(const struct test_opts *opts)
>> +{
>> +    int fd;
>> +
>> +    fd = vsock_seqpacket_accept(VMADDR_CID_ANY, 1234, NULL);
>> +    if (fd < 0) {
>> +        perror("accept");
>> +        exit(EXIT_FAILURE);
>> +    }
>> +
>> +    control_expectln("CLISENT");
>> +
>> +    close(fd);
>> +}
>> +
>> #define BUF_PATTERN_1 'a'
>> #define BUF_PATTERN_2 'b'
>>
>> @@ -851,6 +915,11 @@ static struct test_case test_cases[] = {
>>         .run_client = test_stream_poll_rcvlowat_client,
>>         .run_server = test_stream_poll_rcvlowat_server,
>>     },
>> +    {
>> +        .name = "SOCK_SEQPACKET big message",
>> +        .run_client = test_seqpacket_bigmsg_client,
>> +        .run_server = test_seqpacket_bigmsg_server,
>> +    },
>>     {},
>> };
>>
>> -- 
>> 2.25.1
>
> LGTM!
>
> Reviewed-by: Stefano Garzarella <[email protected]>
>

2022-12-01 16:14:34

by Arseniy Krasnov

[permalink] [raw]
Subject: Re: [RFC PATCH v2 3/6] vsock/vmci: always return ENOMEM in case of error

On 01.12.2022 18:14, Vishnu Dasa wrote:
>
>
>> On Dec 1, 2022, at 1:30 AM, Stefano Garzarella <[email protected]> wrote:
>>
>> !! External Email
>>
>> On Fri, Nov 25, 2022 at 05:08:06PM +0000, Arseniy Krasnov wrote:
>>> From: Bobby Eshleman <[email protected]>
>>>
>>> This saves original behaviour from af_vsock.c - switch any error
>>> code returned from transport layer to ENOMEM.
>>>
>>> Signed-off-by: Bobby Eshleman <[email protected]>
>>> Signed-off-by: Arseniy Krasnov <[email protected]>
>>> ---
>>> net/vmw_vsock/vmci_transport.c | 9 ++++++++-
>>> 1 file changed, 8 insertions(+), 1 deletion(-)
>>
>> @Bryan @Vishnu what do you think about this patch?
>>
>> A bit of context:
>>
>> Before this series, the af_vsock core always returned ENOMEM to the user
>> if the transport failed to queue the packet.
>>
>> Now we are changing it by returning the transport error. So I think here
>> we want to preserve the previous behavior for vmci, but I don't know if
>> that's the right thing.
>>
>
> Agree with Stefano. I don't think we need to preserve the previous
> behavior for vmci.
Good! I'll remove this patch from the next version

Thanks, Arseniy
>
>>
>> @Arseniy please in the next versions describe better in the commit
>> messages the reasons for these changes, so it is easier review for
>> others and also in the future by reading the commit message we can
>> understand the reason for the change.
>>
>> Thanks,
>> Stefano
>>
>>>
>>> diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c
>>> index 842c94286d31..289a36a203a2 100644
>>> --- a/net/vmw_vsock/vmci_transport.c
>>> +++ b/net/vmw_vsock/vmci_transport.c
>>> @@ -1838,7 +1838,14 @@ static ssize_t vmci_transport_stream_enqueue(
>>> struct msghdr *msg,
>>> size_t len)
>>> {
>>> - return vmci_qpair_enquev(vmci_trans(vsk)->qpair, msg, len, 0);
>>> + int err;
>>> +
>>> + err = vmci_qpair_enquev(vmci_trans(vsk)->qpair, msg, len, 0);
>>> +
>>> + if (err < 0)
>>> + err = -ENOMEM;
>>> +
>>> + return err;
>>> }
>>>
>>> static s64 vmci_transport_stream_has_data(struct vsock_sock *vsk)
>>> --
>>> 2.25.1
>>
>>
>> !! External Email: This email originated from outside of the organization. Do not click links or open attachments unless you recognize the sender.
>

2022-12-01 16:39:20

by Vishnu Dasa

[permalink] [raw]
Subject: Re: [RFC PATCH v2 3/6] vsock/vmci: always return ENOMEM in case of error



> On Dec 1, 2022, at 1:30 AM, Stefano Garzarella <[email protected]> wrote:
>
> !! External Email
>
> On Fri, Nov 25, 2022 at 05:08:06PM +0000, Arseniy Krasnov wrote:
>> From: Bobby Eshleman <[email protected]>
>>
>> This saves original behaviour from af_vsock.c - switch any error
>> code returned from transport layer to ENOMEM.
>>
>> Signed-off-by: Bobby Eshleman <[email protected]>
>> Signed-off-by: Arseniy Krasnov <[email protected]>
>> ---
>> net/vmw_vsock/vmci_transport.c | 9 ++++++++-
>> 1 file changed, 8 insertions(+), 1 deletion(-)
>
> @Bryan @Vishnu what do you think about this patch?
>
> A bit of context:
>
> Before this series, the af_vsock core always returned ENOMEM to the user
> if the transport failed to queue the packet.
>
> Now we are changing it by returning the transport error. So I think here
> we want to preserve the previous behavior for vmci, but I don't know if
> that's the right thing.
>

Agree with Stefano. I don't think we need to preserve the previous
behavior for vmci.

>
> @Arseniy please in the next versions describe better in the commit
> messages the reasons for these changes, so it is easier review for
> others and also in the future by reading the commit message we can
> understand the reason for the change.
>
> Thanks,
> Stefano
>
>>
>> diff --git a/net/vmw_vsock/vmci_transport.c b/net/vmw_vsock/vmci_transport.c
>> index 842c94286d31..289a36a203a2 100644
>> --- a/net/vmw_vsock/vmci_transport.c
>> +++ b/net/vmw_vsock/vmci_transport.c
>> @@ -1838,7 +1838,14 @@ static ssize_t vmci_transport_stream_enqueue(
>> struct msghdr *msg,
>> size_t len)
>> {
>> - return vmci_qpair_enquev(vmci_trans(vsk)->qpair, msg, len, 0);
>> + int err;
>> +
>> + err = vmci_qpair_enquev(vmci_trans(vsk)->qpair, msg, len, 0);
>> +
>> + if (err < 0)
>> + err = -ENOMEM;
>> +
>> + return err;
>> }
>>
>> static s64 vmci_transport_stream_has_data(struct vsock_sock *vsk)
>> --
>> 2.25.1
>
>
> !! External Email: This email originated from outside of the organization. Do not click links or open attachments unless you recognize the sender.

2022-12-01 22:13:48

by Dexuan Cui

[permalink] [raw]
Subject: RE: [RFC PATCH v2 2/6] hv_sock: always return ENOMEM in case of error

> From: Stefano Garzarella <[email protected]>
> Sent: Thursday, December 1, 2022 1:24 AM
> > [...]
> >--- a/net/vmw_vsock/hyperv_transport.c
> >+++ b/net/vmw_vsock/hyperv_transport.c
> >@@ -687,7 +687,7 @@ static ssize_t hvs_stream_enqueue(struct vsock_sock
> *vsk, struct msghdr *msg,
> > if (bytes_written)
> > ret = bytes_written;
> > kfree(send_buf);
> >- return ret;
> >+ return ret < 0 ? -ENOMEM : ret;
>
> I'm not sure for hyperv we want to preserve -ENOMEM. This transport was
> added after virtio-vsock, so I think we can return the error directly.
>
> @Dexuan what do you think?
>
> Thanks,
> Stefano

I also think we can return the error directly.
BTW, I doubt any user really depends on the value of a non-zero error code.

2022-12-05 10:36:21

by Stefano Garzarella

[permalink] [raw]
Subject: Re: [RFC PATCH v2 5/6] test/vsock: add big message test

On Thu, Dec 01, 2022 at 11:44:39AM +0000, Arseniy Krasnov wrote:
>On 01.12.2022 12:45, Stefano Garzarella wrote:
>> On Fri, Nov 25, 2022 at 05:13:06PM +0000, Arseniy Krasnov wrote:
>>> This adds test for sending message, bigger than peer's buffer size.
>>> For SOCK_SEQPACKET socket it must fail, as this type of socket has
>>> message size limit.
>>>
>>> Signed-off-by: Arseniy Krasnov <[email protected]>
>>> ---
>>> tools/testing/vsock/vsock_test.c | 69 ++++++++++++++++++++++++++++++++
>>> 1 file changed, 69 insertions(+)
>>>
>>> diff --git a/tools/testing/vsock/vsock_test.c b/tools/testing/vsock/vsock_test.c
>>> index 12ef0cca6f93..a8e43424fb32 100644
>>> --- a/tools/testing/vsock/vsock_test.c
>>> +++ b/tools/testing/vsock/vsock_test.c
>>> @@ -569,6 +569,70 @@ static void test_seqpacket_timeout_server(const struct test_opts *opts)
>>> ????close(fd);
>>> }
>>>
>>> +static void test_seqpacket_bigmsg_client(const struct test_opts *opts)
>>> +{
>>> +??? unsigned long sock_buf_size;
>>> +??? ssize_t send_size;
>>> +??? socklen_t len;
>>> +??? void *data;
>>> +??? int fd;
>>> +
>>> +??? len = sizeof(sock_buf_size);
>>> +
>>> +??? fd = vsock_seqpacket_connect(opts->peer_cid, 1234);
>>> +??? if (fd < 0) {
>>> +??????? perror("connect");
>>> +??????? exit(EXIT_FAILURE);
>>> +??? }
>>> +
>>> +??? if (getsockopt(fd, AF_VSOCK, SO_VM_SOCKETS_BUFFER_SIZE,
>>> +?????????????? &sock_buf_size, &len)) {
>>> +??????? perror("getsockopt");
>>> +??????? exit(EXIT_FAILURE);
>>> +??? }
>>> +
>>> +??? sock_buf_size++;
>>> +
>>> +??? data = malloc(sock_buf_size);
>>> +??? if (!data) {
>>> +??????? perror("malloc");
>>> +??????? exit(EXIT_FAILURE);
>>> +??? }
>>> +
>>> +??? send_size = send(fd, data, sock_buf_size, 0);
>>> +??? if (send_size != -1) {
>>> +??????? fprintf(stderr, "expected 'send(2)' failure, got %zi\n",
>>> +??????????? send_size);
>>> +??????? exit(EXIT_FAILURE);
>>> +??? }
>>> +
>>> +??? if (errno != EMSGSIZE) {
>>> +??????? fprintf(stderr, "expected EMSGSIZE in 'errno', got %i\n",
>>> +??????????? errno);
>>> +??????? exit(EXIT_FAILURE);
>>> +??? }
>>
>> We should make sure that this is true for all transports, but since now only virtio-vsock supports it, we should be okay.
>Hm, in general: I've tested this test suite for vmci may be several months ago, and found, that some tests
>didn't work. I'm thinking about reworking this test suite a little bit: each transport must have own set of
>tests for features that it supports. I had feeling, that all these tests are run only with virtio transport :)
>Because for example SEQPACKET mode is suported only for virtio.

Yep, when we developed it, we added the "--skip" param for that.
Ideally there should be no difference, but I remember VMCI had a
different behavior and we couldn't change it for backward compatibility,
so we added "--skip".

Thanks,
Steano