2022-03-11 15:54:39

by Mingbao Sun

Subject: [PATCH v2 1/3] tcp: export symbol tcp_set_congestion_control

From: Mingbao Sun <[email protected]>

Congestion-control could have a noticeable impact on the
performance of TCP-based communications. This is of course true
for NVMe/TCP in the kernel.

Different congestion-control algorithms (e.g., cubic, dctcp) are
suitable for different scenarios. Adopting a proper congestion-control
would benefit performance, while a poor choice could severely degrade it.

So to achieve good performance across different network
environments, NVMe/TCP should support specifying the
congestion-control.

This means NVMe/TCP (a kernel user) needs to set the congestion-control
of its TCP sockets.

Since the kernel API 'kernel_setsockopt' was removed, and since
'tcp_set_congestion_control' is the underlying function that actually
handles this job, it makes sense to export it.

Signed-off-by: Mingbao Sun <[email protected]>
---
net/ipv4/tcp_cong.c | 19 +++++++++++++++----
1 file changed, 15 insertions(+), 4 deletions(-)

diff --git a/net/ipv4/tcp_cong.c b/net/ipv4/tcp_cong.c
index db5831e6c136..1d6a23e42f7d 100644
--- a/net/ipv4/tcp_cong.c
+++ b/net/ipv4/tcp_cong.c
@@ -344,10 +344,20 @@ int tcp_set_allowed_congestion_control(char *val)
return ret;
}

-/* Change congestion control for socket. If load is false, then it is the
- * responsibility of the caller to call tcp_init_congestion_control or
- * tcp_reinit_congestion_control (if the current congestion control was
- * already initialized.
+/**
+ * tcp_set_congestion_control - set a sock's congestion control
+ * @sk: the sock.
+ * @name: the desired congestion control.
+ * @load: whether to load the required module if it is not already loaded.
+ * @cap_net_admin: indicates whether the caller has CAP_NET_ADMIN.
+ *
+ * Returns 0 or an error.
+ *
+ * Must be called on a locked sock.
+ *
+ * If load is false, then it is the responsibility of the caller to call
+ * tcp_init_congestion_control or tcp_reinit_congestion_control (if the
+ * current congestion control was already initialized).
*/
int tcp_set_congestion_control(struct sock *sk, const char *name, bool load,
bool cap_net_admin)
@@ -383,6 +393,7 @@ int tcp_set_congestion_control(struct sock *sk, const char *name, bool load,
rcu_read_unlock();
return err;
}
+EXPORT_SYMBOL_GPL(tcp_set_congestion_control);

/* Slow start is used when congestion window is no greater than the slow start
* threshold. We base on RFC2581 and also handle stretch ACKs properly.
--
2.26.2


2022-03-11 20:55:43

by Mingbao Sun

Subject: [PATCH v2 3/3] nvmet-tcp: support specifying the congestion-control

From: Mingbao Sun <[email protected]>

Congestion-control could have a noticeable impact on the
performance of TCP-based communications. This is of course true
for NVMe/TCP.

Different congestion-control algorithms (e.g., cubic, dctcp) are
suitable for different scenarios. Adopting a proper congestion-control
would benefit performance, while a poor choice could severely degrade it.

We could specify the congestion-control for NVMe/TCP by writing
'/proc/sys/net/ipv4/tcp_congestion_control', but this also changes the
congestion-control of all future TCP sockets that have not been
explicitly assigned one, potentially impacting their performance.

So it makes sense to let NVMe/TCP support specifying the
congestion-control. This commit addresses the target side.

Implementation approach:
a new configfs entry is created for the user to specify the
congestion-control of each nvmet port:
'/sys/kernel/config/nvmet/ports/X/tcp_congestion'
Then, in nvmet_tcp_add_port, the specified congestion-control is
applied to the listening socket of the nvmet port.
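
For illustration, once this is in place, a target port could be
configured roughly as follows (a sketch only; the port number '1' and
the choice of dctcp are placeholders, and the value must be written
before the port is enabled):

  echo dctcp > /sys/kernel/config/nvmet/ports/1/tcp_congestion
  cat /sys/kernel/config/nvmet/ports/1/tcp_congestion
  dctcp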

Signed-off-by: Mingbao Sun <[email protected]>
---
drivers/nvme/target/configfs.c | 42 ++++++++++++++++++++++++++++++++++
drivers/nvme/target/nvmet.h | 1 +
drivers/nvme/target/tcp.c | 13 +++++++++++
3 files changed, 56 insertions(+)

diff --git a/drivers/nvme/target/configfs.c b/drivers/nvme/target/configfs.c
index 091a0ca16361..7b7d95f6f582 100644
--- a/drivers/nvme/target/configfs.c
+++ b/drivers/nvme/target/configfs.c
@@ -11,6 +11,7 @@
#include <linux/ctype.h>
#include <linux/pci.h>
#include <linux/pci-p2pdma.h>
+#include <net/tcp.h>

#include "nvmet.h"

@@ -222,6 +223,45 @@ static ssize_t nvmet_addr_trsvcid_store(struct config_item *item,

CONFIGFS_ATTR(nvmet_, addr_trsvcid);

+static ssize_t nvmet_tcp_congestion_show(struct config_item *item,
+ char *page)
+{
+ struct nvmet_port *port = to_nvmet_port(item);
+
+ return snprintf(page, PAGE_SIZE, "%s\n",
+ port->tcp_congestion ? port->tcp_congestion : "");
+}
+
+static ssize_t nvmet_tcp_congestion_store(struct config_item *item,
+ const char *page, size_t count)
+{
+ struct nvmet_port *port = to_nvmet_port(item);
+ int len;
+ char *buf;
+
+ len = strcspn(page, "\n");
+ if (!len)
+ return -EINVAL;
+
+ if (nvmet_is_port_enabled(port, __func__))
+ return -EACCES;
+
+ buf = kmemdup_nul(page, len, GFP_KERNEL);
+ if (!buf)
+ return -ENOMEM;
+ if (strlen(buf) >= TCP_CA_NAME_MAX) {
+ kfree(buf);
+ return -EINVAL;
+ }
+
+ kfree(port->tcp_congestion);
+ port->tcp_congestion = buf;
+
+ return count;
+}
+
+CONFIGFS_ATTR(nvmet_, tcp_congestion);
+
static ssize_t nvmet_param_inline_data_size_show(struct config_item *item,
char *page)
{
@@ -1597,6 +1637,7 @@ static void nvmet_port_release(struct config_item *item)
list_del(&port->global_entry);

kfree(port->ana_state);
+ kfree(port->tcp_congestion);
kfree(port);
}

@@ -1605,6 +1646,7 @@ static struct configfs_attribute *nvmet_port_attrs[] = {
&nvmet_attr_addr_treq,
&nvmet_attr_addr_traddr,
&nvmet_attr_addr_trsvcid,
+ &nvmet_attr_tcp_congestion,
&nvmet_attr_addr_trtype,
&nvmet_attr_param_inline_data_size,
#ifdef CONFIG_BLK_DEV_INTEGRITY
diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h
index 69637bf8f8e1..76a57c4c3456 100644
--- a/drivers/nvme/target/nvmet.h
+++ b/drivers/nvme/target/nvmet.h
@@ -145,6 +145,7 @@ struct nvmet_port {
struct config_group ana_groups_group;
struct nvmet_ana_group ana_default_group;
enum nvme_ana_state *ana_state;
+ const char *tcp_congestion;
void *priv;
bool enabled;
int inline_data_size;
diff --git a/drivers/nvme/target/tcp.c b/drivers/nvme/target/tcp.c
index 83ca577f72be..311383c6d7da 100644
--- a/drivers/nvme/target/tcp.c
+++ b/drivers/nvme/target/tcp.c
@@ -1741,6 +1741,19 @@ static int nvmet_tcp_add_port(struct nvmet_port *nport)
if (so_priority > 0)
sock_set_priority(port->sock->sk, so_priority);

+ if (nport->tcp_congestion) {
+ lock_sock(port->sock->sk);
+ ret = tcp_set_congestion_control(port->sock->sk,
+ nport->tcp_congestion,
+ true, true);
+ release_sock(port->sock->sk);
+ if (ret) {
+ pr_err("failed to set port socket's congestion to %s: %d\n",
+ nport->tcp_congestion, ret);
+ goto err_sock;
+ }
+ }
+
ret = kernel_bind(port->sock, (struct sockaddr *)&port->addr,
sizeof(port->addr));
if (ret) {
--
2.26.2

2022-03-11 22:41:18

by Mingbao Sun

Subject: [PATCH v2 2/3] nvme-tcp: support specifying the congestion-control

From: Mingbao Sun <[email protected]>

Congestion-control could have a noticeable impact on the
performance of TCP-based communications. This is of course true
for NVMe/TCP.

Different congestion-control algorithms (e.g., cubic, dctcp) are
suitable for different scenarios. Adopting a proper congestion-control
would benefit performance, while a poor choice could severely degrade it.

We could specify the congestion-control for NVMe/TCP by writing
'/proc/sys/net/ipv4/tcp_congestion_control', but this also changes the
congestion-control of all future TCP sockets that have not been
explicitly assigned one, potentially impacting their performance.

So it makes sense to let NVMe/TCP support specifying the
congestion-control. This commit addresses the host side.

Implementation approach:
a new option called 'tcp_congestion' is added to the fabrics opt_tokens
so that the 'nvme connect' command can pass in the congestion-control
specified by the user.
Then, in nvme_tcp_alloc_queue, the specified congestion-control is
applied to the host-side sockets.
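
For illustration, once this option exists, a connection could request
dctcp by adding it to the options string written to /dev/nvme-fabrics
(a sketch only; the address, port and NQN are placeholders, and the
matching nvme-cli flag is a separate change that is not part of this
series):

  echo "transport=tcp,traddr=192.168.1.10,trsvcid=4420,nqn=testnqn,tcp_congestion=dctcp" \
    > /dev/nvme-fabrics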

Signed-off-by: Mingbao Sun <[email protected]>
---
drivers/nvme/host/fabrics.c | 18 ++++++++++++++++++
drivers/nvme/host/fabrics.h | 2 ++
drivers/nvme/host/tcp.c | 17 ++++++++++++++++-
3 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/drivers/nvme/host/fabrics.c b/drivers/nvme/host/fabrics.c
index ee79a6d639b4..ecd1ec4e473a 100644
--- a/drivers/nvme/host/fabrics.c
+++ b/drivers/nvme/host/fabrics.c
@@ -10,6 +10,7 @@
#include <linux/mutex.h>
#include <linux/parser.h>
#include <linux/seq_file.h>
+#include <net/tcp.h>
#include "nvme.h"
#include "fabrics.h"

@@ -548,6 +549,7 @@ static const match_table_t opt_tokens = {
{ NVMF_OPT_TOS, "tos=%d" },
{ NVMF_OPT_FAIL_FAST_TMO, "fast_io_fail_tmo=%d" },
{ NVMF_OPT_DISCOVERY, "discovery" },
+ { NVMF_OPT_TCP_CONGESTION, "tcp_congestion=%s" },
{ NVMF_OPT_ERR, NULL }
};

@@ -829,6 +831,21 @@ static int nvmf_parse_options(struct nvmf_ctrl_options *opts,
case NVMF_OPT_DISCOVERY:
opts->discovery_nqn = true;
break;
+ case NVMF_OPT_TCP_CONGESTION:
+ p = match_strdup(args);
+ if (!p) {
+ ret = -ENOMEM;
+ goto out;
+ }
+ if (strlen(p) >= TCP_CA_NAME_MAX) {
+ ret = -EINVAL;
+ kfree(p);
+ goto out;
+ }
+
+ kfree(opts->tcp_congestion);
+ opts->tcp_congestion = p;
+ break;
default:
pr_warn("unknown parameter or missing value '%s' in ctrl creation request\n",
p);
@@ -947,6 +964,7 @@ void nvmf_free_options(struct nvmf_ctrl_options *opts)
kfree(opts->subsysnqn);
kfree(opts->host_traddr);
kfree(opts->host_iface);
+ kfree(opts->tcp_congestion);
kfree(opts);
}
EXPORT_SYMBOL_GPL(nvmf_free_options);
diff --git a/drivers/nvme/host/fabrics.h b/drivers/nvme/host/fabrics.h
index c3203ff1c654..25fdc169949d 100644
--- a/drivers/nvme/host/fabrics.h
+++ b/drivers/nvme/host/fabrics.h
@@ -68,6 +68,7 @@ enum {
NVMF_OPT_FAIL_FAST_TMO = 1 << 20,
NVMF_OPT_HOST_IFACE = 1 << 21,
NVMF_OPT_DISCOVERY = 1 << 22,
+ NVMF_OPT_TCP_CONGESTION = 1 << 23,
};

/**
@@ -117,6 +118,7 @@ struct nvmf_ctrl_options {
unsigned int nr_io_queues;
unsigned int reconnect_delay;
bool discovery_nqn;
+ const char *tcp_congestion;
bool duplicate_connect;
unsigned int kato;
struct nvmf_host *host;
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 10fc45d95b86..8491f96a39e5 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -1487,6 +1487,20 @@ static int nvme_tcp_alloc_queue(struct nvme_ctrl *nctrl,
if (nctrl->opts->tos >= 0)
ip_sock_set_tos(queue->sock->sk, nctrl->opts->tos);

+ if (nctrl->opts->mask & NVMF_OPT_TCP_CONGESTION) {
+ lock_sock(queue->sock->sk);
+ ret = tcp_set_congestion_control(queue->sock->sk,
+ nctrl->opts->tcp_congestion,
+ true, true);
+ release_sock(queue->sock->sk);
+ if (ret) {
+ dev_err(nctrl->device,
+ "failed to set TCP congestion to %s: %d\n",
+ nctrl->opts->tcp_congestion, ret);
+ goto err_sock;
+ }
+ }
+
/* Set 10 seconds timeout for icresp recvmsg */
queue->sock->sk->sk_rcvtimeo = 10 * HZ;

@@ -2650,7 +2664,8 @@ static struct nvmf_transport_ops nvme_tcp_transport = {
NVMF_OPT_HOST_TRADDR | NVMF_OPT_CTRL_LOSS_TMO |
NVMF_OPT_HDR_DIGEST | NVMF_OPT_DATA_DIGEST |
NVMF_OPT_NR_WRITE_QUEUES | NVMF_OPT_NR_POLL_QUEUES |
- NVMF_OPT_TOS | NVMF_OPT_HOST_IFACE,
+ NVMF_OPT_TOS | NVMF_OPT_HOST_IFACE |
+ NVMF_OPT_TCP_CONGESTION,
.create_ctrl = nvme_tcp_create_ctrl,
};

--
2.26.2

2022-03-14 06:52:04

by Sagi Grimberg

Subject: Re: [PATCH v2 2/3] nvme-tcp: support specifying the congestion-control


> From: Mingbao Sun <[email protected]>

Hey Mingbao,

> Congestion-control could have a noticeable impact on the
> performance of TCP-based communications. This is of course true
> for NVMe/TCP.
>
> Different congestion-control algorithms (e.g., cubic, dctcp) are
> suitable for different scenarios. Adopting a proper congestion-control
> would benefit performance, while a poor choice could severely degrade it.
>
> We could specify the congestion-control for NVMe/TCP by writing
> '/proc/sys/net/ipv4/tcp_congestion_control', but this also changes the
> congestion-control of all future TCP sockets that have not been
> explicitly assigned one, potentially impacting their performance.
>
> So it makes sense to let NVMe/TCP support specifying the
> congestion-control. This commit addresses the host side.

Thanks for this patchset.

Generally, I'm not opposed to allowing users to customize what they want
to do, but in order to add something like this we need a few
justifications.

1. Can you please provide your measurements that support your claims?

2. Can you please provide a real, existing use-case where this provides
true, measurable value? And more specifically, please clarify how the
use-case needs a local tuning for nvme-tcp that would not hold for
other tcp streams that are running on the host (and vice-versa).

3. There are quite a few TCP tuning knobs that will affect how
nvme-tcp performs, just like any TCP application running on Linux.
However, application-level TCP tuning is not widespread at all. What
makes nvme-tcp special enough to allow this, and why is the TCP congestion
control more important than other tuning knobs? I am not supportive of
exporting all or some TCP-level knobs as a local shadow of sysctl.

Adding tunables, especially ones that address niche use-cases, can
easily become a slippery slope for a set of rarely touched code and
interfaces we are left stuck with for a long time...

But while this feels a bit random to me, I'm not objecting to adding this
to the driver. I just want to make sure that this is something that is
a) really required and b) does not backfire on us or the user.

> Implementation approach:
> a new option called 'tcp_congestion' is added to the fabrics opt_tokens
> so that the 'nvme connect' command can pass in the congestion-control
> specified by the user.
> Then, in nvme_tcp_alloc_queue, the specified congestion-control is
> applied to the host-side sockets.

Specifically on the interface, I'm wondering if this is the right
interface... The user is used to sysctl with the semantics that it
provides, so wouldn't it be better to expose the exact same interface
just for nvme-tcp sockets?

Something like sysctl nvme.tcp_congestion_control ?

2022-03-14 10:10:43

by Sagi Grimberg

Subject: Re: [PATCH v2 3/3] nvmet-tcp: support specifying the congestion-control


> From: Mingbao Sun <[email protected]>

Hey Mingbao,

> Congestion-control could have a noticeable impact on the
> performance of TCP-based communications. This is of course true
> for NVMe/TCP.
>
> Different congestion-control algorithms (e.g., cubic, dctcp) are
> suitable for different scenarios. Adopting a proper congestion-control
> would benefit performance, while a poor choice could severely degrade it.
>
> We could specify the congestion-control for NVMe/TCP by writing
> '/proc/sys/net/ipv4/tcp_congestion_control', but this also changes the
> congestion-control of all future TCP sockets that have not been
> explicitly assigned one, potentially impacting their performance.
>
> So it makes sense to let NVMe/TCP support specifying the
> congestion-control. This commit addresses the target side.
>
> Implementation approach:
> a new configfs entry is created for the user to specify the
> congestion-control of each nvmet port:
> '/sys/kernel/config/nvmet/ports/X/tcp_congestion'
> Then, in nvmet_tcp_add_port, the specified congestion-control is
> applied to the listening socket of the nvmet port.

Please see my comments on the host side patch.

In addition, specifically on the chosen interface, why should this
be port specific? What is the use-case to configure this per-port?

2022-03-14 10:39:44

by Mingbao Sun

Subject: Re: [PATCH v2 2/3] nvme-tcp: support specifying the congestion-control

Before answering the questions, I’d like to address the motivation
behind this patchset.

You know, InfiniBand/RoCE provides NVMe-oF with a lossless network
environment (that is, zero packet loss), which is a great advantage
for performance.

In contrast, 'TCP/IP + Ethernet' is often a lossy network environment
in which packets are frequently dropped. Once a packet is dropped,
timeout-retransmission may be triggered, and when that happens the
bandwidth suddenly drops to 0. This does great damage to performance.

So although NVMe/TCP may offer bandwidth competitive with that of
NVMe/RDMA, the packet dropping of the former is a flaw in its
performance.

However, with the combination of the following conditions, NVMe/TCP
can become much more competitive with NVMe/RDMA in the data center.

- Ethernet NICs that support QoS configuration (mapping the TOS/DSCP
  field of the IP header to a priority, adjusting the buffer size of
  each priority, and PFC)

- Ethernet switches that support ECN marking and adjusting the buffer
  size of each priority

- NVMe/TCP supports specifying the TOS of its TCP traffic
  (already implemented)

- NVMe/TCP supports specifying dctcp as the congestion-control of its
  TCP sockets (the subject of this series)

So this feature is the last software piece needed to complete the
above combination.
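
As a concrete illustration of the last two items, the host side of the
combination can be requested in a single connect string (a sketch only;
the address, NQN and TOS value are placeholders, and tcp_congestion is
the fabrics option added by this series):

  echo "transport=tcp,traddr=192.168.1.10,trsvcid=4420,nqn=testnqn,tos=16,tcp_congestion=dctcp" \
    > /dev/nvme-fabrics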

2022-03-14 16:31:11

by Christoph Hellwig

Subject: Re: [PATCH v2 2/3] nvme-tcp: support specifying the congestion-control

The implementation of this and the target side looks sane to me.
I'll wait for the opinion of the NVMe/TCP maintainers on the usefulness
of this, and for the networking maintainers on an ACK for patch 1.

2022-03-25 19:18:21

by Mingbao Sun

Subject: Re: [PATCH v2 2/3] nvme-tcp: support specifying the congestion-control

[Combination 3] dctcp + switch ECN-marking off

This is an incorrect usage of dctcp: the ECN marking required by dctcp
is disabled on the switch, so packets are dropped and the bandwidth is
about as bad as in [Combination 2].

/*
* before loading traffic, disable ECN-marking and clear the
* counters on the 2 switches.
*/

hound-dirt# configure terminal
hound-dirt(config)# wred LOSSLESS_ECN_5
hound-dirt(config-wred)# no random-detect ecn
hound-dirt(config-wred)# end
hound-dirt#
hound-dirt# clear qos statistics type queuing interface ethernet 1/1/4
hound-dirt# show queuing statistics interface ethernet 1/1/4
Interface ethernet1/1/4
Queue Packets Bytes Dropped-Packets Dropped-Bytes
0 0 0 0 0
1 0 0 0 0
2 0 0 0 0
3 0 0 0 0
4 0 0 0 0
5 0 0 0 0
6 0 0 0 0
7 0 0 0 0
hound-dirt#

fox-dirt# configure terminal
fox-dirt(config)# wred LOSSLESS_ECN_5
fox-dirt(config-wred)# no random-detect ecn
fox-dirt(config-wred)# end
fox-dirt#
fox-dirt# clear qos statistics type queuing interface ethernet 1/1/4
fox-dirt# show queuing statistics interface ethernet 1/1/4
Interface ethernet1/1/4
Queue Packets Bytes Dropped-Packets Dropped-Bytes
0 2 128 0 0
1 2 196 0 0
2 0 0 0 0
3 0 0 0 0
4 0 0 0 0
5 0 0 0 0
6 0 0 0 0
7 0 0 0 0
fox-dirt#



/*
* logs of RX node.
*/

ogden-dirt:/home/admin/tyler # echo dctcp >/proc/sys/net/ipv4/tcp_congestion_control
ogden-dirt:/home/admin/tyler # cat /proc/sys/net/ipv4/tcp_congestion_control
dctcp

ogden-dirt:/home/admin/tyler # date; ./tcp_perf.exe --server-ip 169.254.85.4 --server-port 10001 --msg-len 65536 --conn-num 24 -s -r --data-random --svr-no-wait-all-conn-OK --show-bandwidth-only
Fri Mar 25 09:53:47 EDT 2022

DBG:/mnt/hgfs/src/linux-dev-framework-master/libs/app_utils/src/socket.c(104)-socket_init_2:
bind socket fd 3 to 169.254.85.4:10001 succeed
DBG:perf_frmwk.c(1472)-create_tcp_conns_start_load:
start listen on fd 3
conn [0] local 169.254.85.4:10001 peer 169.254.85.3:59692 created.
rx thread of conn 0 started
conn [1] local 169.254.85.4:10001 peer 169.254.85.3:59694 created.
rx thread of conn 1 started
conn [2] local 169.254.85.4:10001 peer 169.254.85.3:59696 created.
rx thread of conn 2 started
conn [3] local 169.254.85.4:10001 peer 169.254.85.3:59698 created.
rx thread of conn 3 started
conn [4] local 169.254.85.4:10001 peer 169.254.85.3:59700 created.
conn [5] local 169.254.85.4:10001 peer 169.254.85.3:59702 created.
rx thread of conn 4 started
conn [6] local 169.254.85.4:10001 peer 169.254.85.3:59706 created.
rx thread of conn 5 started
conn [7] local 169.254.85.4:10001 peer 169.254.85.3:59708 created.
rx thread of conn 6 started
rx thread of conn 7 started
conn [8] local 169.254.85.4:10001 peer 169.254.85.2:37992 created.
rx thread of conn 8 started
conn [9] local 169.254.85.4:10001 peer 169.254.85.2:37994 created.
rx thread of conn 9 started
conn [10] local 169.254.85.4:10001 peer 169.254.85.2:37996 created.
rx thread of conn 10 started
conn [11] local 169.254.85.4:10001 peer 169.254.85.2:37998 created.
rx thread of conn 11 started
conn [12] local 169.254.85.4:10001 peer 169.254.85.2:38000 created.
rx thread of conn 12 started
conn [13] local 169.254.85.4:10001 peer 169.254.85.2:38002 created.
rx thread of conn 13 started
conn [14] local 169.254.85.4:10001 peer 169.254.85.2:38004 created.
conn [15] local 169.254.85.4:10001 peer 169.254.85.2:38006 created.
rx thread of conn 14 started
rx thread of conn 15 started

[time lasts]: 1
rx_msg_succ_bytes 0x15b4f0000 (5,826,871,296)
conn [16] local 169.254.85.4:10001 peer 169.254.85.1:49500 created.
rx thread of conn 16 started
conn [17] local 169.254.85.4:10001 peer 169.254.85.1:49502 created.
rx thread of conn 17 started
conn [18] local 169.254.85.4:10001 peer 169.254.85.1:49504 created.
rx thread of conn 18 started
conn [19] local 169.254.85.4:10001 peer 169.254.85.1:49506 created.
rx thread of conn 19 started
conn [20] local 169.254.85.4:10001 peer 169.254.85.1:49508 created.
rx thread of conn 20 started
conn [21] local 169.254.85.4:10001 peer 169.254.85.1:49510 created.
rx thread of conn 21 started
conn [22] local 169.254.85.4:10001 peer 169.254.85.1:49512 created.
rx thread of conn 22 started
conn [23] local 169.254.85.4:10001 peer 169.254.85.1:49514 created.
24 connection(s) created in total
rx thread of conn 23 started

[time lasts]: 2
rx_msg_succ_bytes 0x159db0000 (5,802,491,904)

[time lasts]: 3
rx_msg_succ_bytes 0x15a770000 (5,812,715,520)

[time lasts]: 4
rx_msg_succ_bytes 0x159680000 (5,794,955,264)

[time lasts]: 5
rx_msg_succ_bytes 0x15b130000 (5,822,939,136)

[time lasts]: 6
rx_msg_succ_bytes 0x1574b0000 (5,759,500,288)

[time lasts]: 7
rx_msg_succ_bytes 0x159850000 (5,796,855,808)

[time lasts]: 8
rx_msg_succ_bytes 0x15ae60000 (5,819,990,016)

[time lasts]: 9
rx_msg_succ_bytes 0x15be90000 (5,836,963,840)

[time lasts]: 10
rx_msg_succ_bytes 0x158ef0000 (5,787,025,408)

[time lasts]: 11
rx_msg_succ_bytes 0x15ad30000 (5,818,744,832)

[time lasts]: 12
rx_msg_succ_bytes 0x159ee0000 (5,803,737,088)

[time lasts]: 13
rx_msg_succ_bytes 0x15bc00000 (5,834,276,864)

[time lasts]: 14
rx_msg_succ_bytes 0x1589d0000 (5,781,651,456)

[time lasts]: 15
rx_msg_succ_bytes 0x157f60000 (5,770,706,944)

[time lasts]: 16
rx_msg_succ_bytes 0x15a290000 (5,807,603,712)

[time lasts]: 17
rx_msg_succ_bytes 0x1582f0000 (5,774,442,496)

[time lasts]: 18
rx_msg_succ_bytes 0x15b3e0000 (5,825,757,184)

[time lasts]: 19
rx_msg_succ_bytes 0x15abc0000 (5,817,237,504)

[time lasts]: 20
rx_msg_succ_bytes 0x159010000 (5,788,205,056)

[time lasts]: 21
rx_msg_succ_bytes 0x15c080000 (5,838,995,456)

[time lasts]: 22
rx_msg_succ_bytes 0x159410000 (5,792,399,360)

[time lasts]: 23
rx_msg_succ_bytes 0x158fc0000 (5,787,877,376)

[time lasts]: 24
rx_msg_succ_bytes 0x1531c0000 (5,689,311,232)

[time lasts]: 25
rx_msg_succ_bytes 0x158520000 (5,776,736,256)

[time lasts]: 26
rx_msg_succ_bytes 0x15a720000 (5,812,387,840)

[time lasts]: 27
rx_msg_succ_bytes 0x157980000 (5,764,546,560)

[time lasts]: 28
rx_msg_succ_bytes 0x159660000 (5,794,824,192)

[time lasts]: 29
rx_msg_succ_bytes 0x157d30000 (5,768,413,184)

[time lasts]: 30
rx_msg_succ_bytes 0x15a890000 (5,813,895,168)

[time lasts]: 31
rx_msg_succ_bytes 0x159630000 (5,794,627,584)

[time lasts]: 32
rx_msg_succ_bytes 0x15cff0000 (5,855,182,848)

[time lasts]: 33
rx_msg_succ_bytes 0x15d700000 (5,862,588,416)

[time lasts]: 34
rx_msg_succ_bytes 0x158100000 (5,772,410,880)

[time lasts]: 35
rx_msg_succ_bytes 0x15ccf0000 (5,852,037,120)

[time lasts]: 36
rx_msg_succ_bytes 0x15b790000 (5,829,623,808)

[time lasts]: 37
rx_msg_succ_bytes 0x159570000 (5,793,841,152)

[time lasts]: 38
rx_msg_succ_bytes 0x15c070000 (5,838,929,920)

[time lasts]: 39
rx_msg_succ_bytes 0x158380000 (5,775,032,320)

[time lasts]: 40
rx_msg_succ_bytes 0x155d40000 (5,734,924,288)

[time lasts]: 41
rx_msg_succ_bytes 0x15af50000 (5,820,973,056)

[time lasts]: 42
rx_msg_succ_bytes 0x157c20000 (5,767,299,072)

[time lasts]: 43
rx_msg_succ_bytes 0x158530000 (5,776,801,792)

[time lasts]: 44
rx_msg_succ_bytes 0x15b2f0000 (5,824,774,144)

[time lasts]: 45
rx_msg_succ_bytes 0x15a660000 (5,811,601,408)

[time lasts]: 46
rx_msg_succ_bytes 0x158840000 (5,780,013,056)

[time lasts]: 47
rx_msg_succ_bytes 0x1585e0000 (5,777,522,688)

[time lasts]: 48
rx_msg_succ_bytes 0x158070000 (5,771,821,056)

[time lasts]: 49
rx_msg_succ_bytes 0x156ce0000 (5,751,308,288)

[time lasts]: 50
rx_msg_succ_bytes 0x158ed0000 (5,786,894,336)

[time lasts]: 51
rx_msg_succ_bytes 0x15a580000 (5,810,683,904)

[time lasts]: 52
rx_msg_succ_bytes 0x15adc0000 (5,819,334,656)

[time lasts]: 53
rx_msg_succ_bytes 0x1589d0000 (5,781,651,456)

[time lasts]: 54
rx_msg_succ_bytes 0x15b410000 (5,825,953,792)

[time lasts]: 55
rx_msg_succ_bytes 0x158890000 (5,780,340,736)

[time lasts]: 56
rx_msg_succ_bytes 0x153660000 (5,694,160,896)

[time lasts]: 57
rx_msg_succ_bytes 0x15a7a0000 (5,812,912,128)

[time lasts]: 58
rx_msg_succ_bytes 0x158ce0000 (5,784,862,720)

[time lasts]: 59
rx_msg_succ_bytes 0x157720000 (5,762,056,192)

[time lasts]: 60
rx_msg_succ_bytes 0x158d80000 (5,785,518,080)

[time lasts]: 61
rx_msg_succ_bytes 0x158440000 (5,775,818,752)

[time lasts]: 62
rx_msg_succ_bytes 0x157430000 (5,758,976,000)

[time lasts]: 63
rx_msg_succ_bytes 0x159c60000 (5,801,115,648)

[time lasts]: 64
rx_msg_succ_bytes 0x15a440000 (5,809,373,184)
^Ccaught signal 2

/*
* logs of TX node 1.
*/

provo-dirt:/home/admin/tyler # echo dctcp >/proc/sys/net/ipv4/tcp_congestion_control
provo-dirt:/home/admin/tyler # cat /proc/sys/net/ipv4/tcp_congestion_control
dctcp

provo-dirt:/home/admin/tyler # date; ./tcp_perf.exe --server-ip 169.254.85.4 --server-port 10001 --msg-len 65536 --conn-num 8 -c -t --data-random --show-bandwidth-only
Fri Mar 25 09:53:52 EDT 2022

conn [0] local 169.254.85.1:49500 peer 169.254.85.4:10001 created.
conn [1] local 169.254.85.1:49502 peer 169.254.85.4:10001 created.
conn [2] local 169.254.85.1:49504 peer 169.254.85.4:10001 created.
conn [3] local 169.254.85.1:49506 peer 169.254.85.4:10001 created.
conn [4] local 169.254.85.1:49508 peer 169.254.85.4:10001 created.
conn [5] local 169.254.85.1:49510 peer 169.254.85.4:10001 created.
conn [6] local 169.254.85.1:49512 peer 169.254.85.4:10001 created.
conn [7] local 169.254.85.1:49514 peer 169.254.85.4:10001 created.
8 connection(s) created in total
tx thread of conn 0 started
tx thread of conn 1 started
tx thread of conn 2 started
tx thread of conn 3 started
tx thread of conn 4 started
tx thread of conn 5 started
tx thread of conn 6 started
tx thread of conn 7 started

[time lasts]: 1
tx_succ_bytes 0x82fd0000 (2,197,618,688)

[time lasts]: 2
tx_succ_bytes 0x7dbe0000 (2,109,603,840)

[time lasts]: 3
tx_succ_bytes 0x82120000 (2,182,217,728)

[time lasts]: 4
tx_succ_bytes 0x75890000 (1,971,912,704)

[time lasts]: 5
tx_succ_bytes 0x67040000 (1,728,315,392)

[time lasts]: 6
tx_succ_bytes 0x754d0000 (1,967,980,544)

[time lasts]: 7
tx_succ_bytes 0x74be0000 (1,958,608,896)

[time lasts]: 8
tx_succ_bytes 0x79670000 (2,036,793,344)

[time lasts]: 9
tx_succ_bytes 0x7fb00000 (2,142,240,768)

[time lasts]: 10
tx_succ_bytes 0x730d0000 (1,930,231,808)

[time lasts]: 11
tx_succ_bytes 0x6f9b0000 (1,872,429,056)

[time lasts]: 12
tx_succ_bytes 0x58d60000 (1,490,419,712)

[time lasts]: 13
tx_succ_bytes 0x72b40000 (1,924,399,104)

[time lasts]: 14
tx_succ_bytes 0x5f190000 (1,595,473,920)

[time lasts]: 15
tx_succ_bytes 0x66590000 (1,717,108,736)

[time lasts]: 16
tx_succ_bytes 0x72bc0000 (1,924,923,392)

[time lasts]: 17
tx_succ_bytes 0x69460000 (1,766,195,200)

[time lasts]: 18
tx_succ_bytes 0x7a8c0000 (2,055,995,392)

[time lasts]: 19
tx_succ_bytes 0x630c0000 (1,661,730,816)

[time lasts]: 20
tx_succ_bytes 0x608c0000 (1,619,787,776)

[time lasts]: 21
tx_succ_bytes 0x64680000 (1,684,537,344)

[time lasts]: 22
tx_succ_bytes 0x5d250000 (1,562,705,920)

[time lasts]: 23
tx_succ_bytes 0x7ba60000 (2,074,476,544)

[time lasts]: 24
tx_succ_bytes 0x76120000 (1,980,891,136)

[time lasts]: 25
tx_succ_bytes 0x5a2a0000 (1,512,701,952)

[time lasts]: 26
tx_succ_bytes 0x6e550000 (1,851,064,320)

[time lasts]: 27
tx_succ_bytes 0x60650000 (1,617,231,872)

[time lasts]: 28
tx_succ_bytes 0x78cb0000 (2,026,569,728)

[time lasts]: 29
tx_succ_bytes 0x71c80000 (1,908,932,608)

[time lasts]: 30
tx_succ_bytes 0x66190000 (1,712,914,432)

[time lasts]: 31
tx_succ_bytes 0x6c760000 (1,819,672,576)

[time lasts]: 32
tx_succ_bytes 0x77b20000 (2,008,154,112)

[time lasts]: 33
tx_succ_bytes 0x6eab0000 (1,856,700,416)

[time lasts]: 34
tx_succ_bytes 0x85160000 (2,232,811,520)

[time lasts]: 35
tx_succ_bytes 0x71f70000 (1,912,012,800)

[time lasts]: 36
tx_succ_bytes 0x6ec30000 (1,858,273,280)

[time lasts]: 37
tx_succ_bytes 0x7b1d0000 (2,065,498,112)

[time lasts]: 38
tx_succ_bytes 0x74eb0000 (1,961,558,016)

[time lasts]: 39
tx_succ_bytes 0x72eb0000 (1,928,003,584)

[time lasts]: 40
tx_succ_bytes 0x76700000 (1,987,051,520)

[time lasts]: 41
tx_succ_bytes 0x721a0000 (1,914,306,560)

[time lasts]: 42
tx_succ_bytes 0x69500000 (1,766,850,560)

[time lasts]: 43
tx_succ_bytes 0x70fe0000 (1,895,694,336)

[time lasts]: 44
tx_succ_bytes 0x7e640000 (2,120,482,816)

[time lasts]: 45
tx_succ_bytes 0x6a090000 (1,778,974,720)

[time lasts]: 46
tx_succ_bytes 0x6e2b0000 (1,848,311,808)

[time lasts]: 47
tx_succ_bytes 0x5e0f0000 (1,578,041,344)

[time lasts]: 48
tx_succ_bytes 0x670f0000 (1,729,036,288)

[time lasts]: 49
tx_succ_bytes 0x77390000 (2,000,224,256)

[time lasts]: 50
tx_succ_bytes 0x77360000 (2,000,027,648)

[time lasts]: 51
tx_succ_bytes 0x7c040000 (2,080,636,928)

[time lasts]: 52
tx_succ_bytes 0x678c0000 (1,737,228,288)

[time lasts]: 53
tx_succ_bytes 0x69720000 (1,769,078,784)

[time lasts]: 54
tx_succ_bytes 0x683a0000 (1,748,631,552)

[time lasts]: 55
tx_succ_bytes 0x769d0000 (1,990,000,640)

[time lasts]: 56
tx_succ_bytes 0x77270000 (1,999,044,608)

[time lasts]: 57
tx_succ_bytes 0x671a0000 (1,729,757,184)

[time lasts]: 58
tx_succ_bytes 0x6c4a0000 (1,816,788,992)

[time lasts]: 59
tx_succ_bytes 0x7c370000 (2,083,979,264)

[time lasts]: 60
tx_succ_bytes 0x82a00000 (2,191,523,840)

[time lasts]: 61
tx_succ_bytes 0x6f980000 (1,872,232,448)

[time lasts]: 62
tx_succ_bytes 0x75d50000 (1,976,893,440)

[time lasts]: 63
tx_succ_bytes 0x75a20000 (1,973,551,104)

/*
* logs of TX node 2.
*/

sandy-dirt:/home/admin/tyler # echo dctcp >/proc/sys/net/ipv4/tcp_congestion_control
sandy-dirt:/home/admin/tyler # cat /proc/sys/net/ipv4/tcp_congestion_control
dctcp

sandy-dirt:/home/admin/tyler # date; ./tcp_perf.exe --server-ip 169.254.85.4 --server-port 10001 --msg-len 65536 --conn-num 8 -c -t --data-random --show-bandwidth-only
Fri Mar 25 09:53:51 EDT 2022

conn [0] local 169.254.85.2:37992 peer 169.254.85.4:10001 created.
conn [1] local 169.254.85.2:37994 peer 169.254.85.4:10001 created.
conn [2] local 169.254.85.2:37996 peer 169.254.85.4:10001 created.
conn [3] local 169.254.85.2:37998 peer 169.254.85.4:10001 created.
conn [4] local 169.254.85.2:38000 peer 169.254.85.4:10001 created.
conn [5] local 169.254.85.2:38002 peer 169.254.85.4:10001 created.
conn [6] local 169.254.85.2:38004 peer 169.254.85.4:10001 created.
conn [7] local 169.254.85.2:38006 peer 169.254.85.4:10001 created.
8 connection(s) created in total
tx thread of conn 0 started
tx thread of conn 1 started
tx thread of conn 2 started
tx thread of conn 3 started
tx thread of conn 4 started
tx thread of conn 5 started
tx thread of conn 6 started
tx thread of conn 7 started

[time lasts]: 1
tx_succ_bytes 0x9acb0000 (2,596,995,072)

[time lasts]: 2
tx_succ_bytes 0x6d5b0000 (1,834,680,320)

[time lasts]: 3
tx_succ_bytes 0x77070000 (1,996,947,456)

[time lasts]: 4
tx_succ_bytes 0x5f3b0000 (1,597,702,144)

[time lasts]: 5
tx_succ_bytes 0x6f690000 (1,869,152,256)

[time lasts]: 6
tx_succ_bytes 0x80690000 (2,154,364,928)

[time lasts]: 7
tx_succ_bytes 0x6c880000 (1,820,852,224)

[time lasts]: 8
tx_succ_bytes 0x6a640000 (1,784,938,496)

[time lasts]: 9
tx_succ_bytes 0x6dc50000 (1,841,627,136)

[time lasts]: 10
tx_succ_bytes 0x694d0000 (1,766,653,952)

[time lasts]: 11
tx_succ_bytes 0x69ee0000 (1,777,205,248)

[time lasts]: 12
tx_succ_bytes 0x6e630000 (1,851,981,824)

[time lasts]: 13
tx_succ_bytes 0x79a50000 (2,040,856,576)

[time lasts]: 14
tx_succ_bytes 0x73f90000 (1,945,698,304)

[time lasts]: 15
tx_succ_bytes 0x79280000 (2,032,664,576)

[time lasts]: 16
tx_succ_bytes 0x743b0000 (1,950,023,680)

[time lasts]: 17
tx_succ_bytes 0x6be00000 (1,809,842,176)

[time lasts]: 18
tx_succ_bytes 0x63d10000 (1,674,641,408)

[time lasts]: 19
tx_succ_bytes 0x69570000 (1,767,309,312)

[time lasts]: 20
tx_succ_bytes 0x87ed0000 (2,280,456,192)

[time lasts]: 21
tx_succ_bytes 0x76730000 (1,987,248,128)

[time lasts]: 22
tx_succ_bytes 0x75fa0000 (1,979,318,272)

[time lasts]: 23
tx_succ_bytes 0x84760000 (2,222,325,760)

[time lasts]: 24
tx_succ_bytes 0x6de60000 (1,843,789,824)

[time lasts]: 25
tx_succ_bytes 0x701b0000 (1,880,817,664)

[time lasts]: 26
tx_succ_bytes 0x7d400000 (2,101,346,304)

[time lasts]: 27
tx_succ_bytes 0x77650000 (2,003,107,840)

[time lasts]: 28
tx_succ_bytes 0x71570000 (1,901,527,040)

[time lasts]: 29
tx_succ_bytes 0x798e0000 (2,039,349,248)

[time lasts]: 30
tx_succ_bytes 0x70490000 (1,883,832,320)

[time lasts]: 31
tx_succ_bytes 0x84820000 (2,223,112,192)

[time lasts]: 32
tx_succ_bytes 0x88690000 (2,288,582,656)

[time lasts]: 33
tx_succ_bytes 0x72a30000 (1,923,284,992)

[time lasts]: 34
tx_succ_bytes 0x79ae0000 (2,041,446,400)

[time lasts]: 35
tx_succ_bytes 0x6fce0000 (1,875,771,392)

[time lasts]: 36
tx_succ_bytes 0x69680000 (1,768,423,424)

[time lasts]: 37
tx_succ_bytes 0x65430000 (1,698,889,728)

[time lasts]: 38
tx_succ_bytes 0x84ff0000 (2,231,304,192)

[time lasts]: 39
tx_succ_bytes 0x7dd10000 (2,110,849,024)

[time lasts]: 40
tx_succ_bytes 0x70480000 (1,883,766,784)

[time lasts]: 41
tx_succ_bytes 0x7b8c0000 (2,072,772,608)

[time lasts]: 42
tx_succ_bytes 0x75a90000 (1,974,009,856)

[time lasts]: 43
tx_succ_bytes 0x83110000 (2,198,929,408)

[time lasts]: 44
tx_succ_bytes 0x739e0000 (1,939,734,528)

[time lasts]: 45
tx_succ_bytes 0x665a0000 (1,717,174,272)

[time lasts]: 46
tx_succ_bytes 0x6f0c0000 (1,863,057,408)

[time lasts]: 47
tx_succ_bytes 0x7b530000 (2,069,037,056)

[time lasts]: 48
tx_succ_bytes 0x850d0000 (2,232,221,696)

[time lasts]: 49
tx_succ_bytes 0x7c6c0000 (2,087,452,672)

[time lasts]: 50
tx_succ_bytes 0x72d60000 (1,926,627,328)

[time lasts]: 51
tx_succ_bytes 0x6f790000 (1,870,200,832)

[time lasts]: 52
tx_succ_bytes 0x65be0000 (1,706,950,656)

[time lasts]: 53
tx_succ_bytes 0x89470000 (2,303,131,648)

[time lasts]: 54
tx_succ_bytes 0x84290000 (2,217,279,488)

[time lasts]: 55
tx_succ_bytes 0x739a0000 (1,939,472,384)

[time lasts]: 56
tx_succ_bytes 0x60fb0000 (1,627,062,272)

[time lasts]: 57
tx_succ_bytes 0x82570000 (2,186,739,712)

[time lasts]: 58
tx_succ_bytes 0x76fb0000 (1,996,161,024)

[time lasts]: 59
tx_succ_bytes 0x71dc0000 (1,910,243,328)

[time lasts]: 60
tx_succ_bytes 0x73fd0000 (1,945,960,448)

[time lasts]: 61
tx_succ_bytes 0x710c0000 (1,896,611,840)

[time lasts]: 62
tx_succ_bytes 0x77ea0000 (2,011,824,128)

[time lasts]: 63
tx_succ_bytes 0x6b9b0000 (1,805,320,192)

/*
* logs of TX node 3.
*/

orem-dirt:/home/admin/tyler # echo dctcp >/proc/sys/net/ipv4/tcp_congestion_control
orem-dirt:/home/admin/tyler # cat /proc/sys/net/ipv4/tcp_congestion_control
dctcp

orem-dirt:/home/admin/tyler # date; ./tcp_perf.exe --server-ip 169.254.85.4 --server-port 10001 --msg-len 65536 --conn-num 8 -c -t --data-random --show-bandwidth-only
Fri Mar 25 09:53:50 EDT 2022

conn [0] local 169.254.85.3:59692 peer 169.254.85.4:10001 created.
conn [1] local 169.254.85.3:59694 peer 169.254.85.4:10001 created.
conn [2] local 169.254.85.3:59696 peer 169.254.85.4:10001 created.
conn [3] local 169.254.85.3:59698 peer 169.254.85.4:10001 created.
conn [4] local 169.254.85.3:59700 peer 169.254.85.4:10001 created.
conn [5] local 169.254.85.3:59702 peer 169.254.85.4:10001 created.
conn [6] local 169.254.85.3:59706 peer 169.254.85.4:10001 created.
conn [7] local 169.254.85.3:59708 peer 169.254.85.4:10001 created.
8 connection(s) created in total
tx thread of conn 0 started
tx thread of conn 1 started
tx thread of conn 2 started
tx thread of conn 3 started
tx thread of conn 4 started
tx thread of conn 5 started
tx thread of conn 6 started
tx thread of conn 7 started

[time lasts]: 1
tx_succ_bytes 0x123670000 (4,888,920,064)

[time lasts]: 2
tx_succ_bytes 0x83950000 (2,207,580,160)

[time lasts]: 3
tx_succ_bytes 0x76440000 (1,984,167,936)

[time lasts]: 4
tx_succ_bytes 0x69280000 (1,764,229,120)

[time lasts]: 5
tx_succ_bytes 0x79240000 (2,032,402,432)

[time lasts]: 6
tx_succ_bytes 0x79ab0000 (2,041,249,792)

[time lasts]: 7
tx_succ_bytes 0x70370000 (1,882,652,672)

[time lasts]: 8
tx_succ_bytes 0x79660000 (2,036,727,808)

[time lasts]: 9
tx_succ_bytes 0x7e170000 (2,115,436,544)

[time lasts]: 10
tx_succ_bytes 0x719f0000 (1,906,245,632)

[time lasts]: 11
tx_succ_bytes 0x69e20000 (1,776,418,816)

[time lasts]: 12
tx_succ_bytes 0x7bbd0000 (2,075,983,872)

[time lasts]: 13
tx_succ_bytes 0x89910000 (2,307,981,312)

[time lasts]: 14
tx_succ_bytes 0x7a750000 (2,054,488,064)

[time lasts]: 15
tx_succ_bytes 0x7c7e0000 (2,088,632,320)

[time lasts]: 16
tx_succ_bytes 0x88b00000 (2,293,235,712)

[time lasts]: 17
tx_succ_bytes 0x71d10000 (1,909,522,432)

[time lasts]: 18
tx_succ_bytes 0x897a0000 (2,306,473,984)

[time lasts]: 19
tx_succ_bytes 0x78cf0000 (2,026,831,872)

[time lasts]: 20
tx_succ_bytes 0x7cc80000 (2,093,481,984)

[time lasts]: 21
tx_succ_bytes 0x803f0000 (2,151,612,416)

[time lasts]: 22
tx_succ_bytes 0x77170000 (1,997,996,032)

[time lasts]: 23
tx_succ_bytes 0x7bf80000 (2,079,850,496)

[time lasts]: 24
tx_succ_bytes 0x686c0000 (1,751,908,352)

[time lasts]: 25
tx_succ_bytes 0x764b0000 (1,984,626,688)

[time lasts]: 26
tx_succ_bytes 0x78b80000 (2,025,324,544)

[time lasts]: 27
tx_succ_bytes 0x77d40000 (2,010,382,336)

[time lasts]: 28
tx_succ_bytes 0x86830000 (2,256,732,160)

[time lasts]: 29
tx_succ_bytes 0x761b0000 (1,981,480,960)

[time lasts]: 30
tx_succ_bytes 0x6d880000 (1,837,629,440)

[time lasts]: 31
tx_succ_bytes 0x74060000 (1,946,550,272)

[time lasts]: 32
tx_succ_bytes 0x74aa0000 (1,957,298,176)

[time lasts]: 33
tx_succ_bytes 0x61a60000 (1,638,268,928)

[time lasts]: 34
tx_succ_bytes 0x735b0000 (1,935,343,616)

[time lasts]: 35
tx_succ_bytes 0x6f220000 (1,864,499,200)

[time lasts]: 36
tx_succ_bytes 0x73240000 (1,931,739,136)

[time lasts]: 37
tx_succ_bytes 0x755c0000 (1,968,963,584)

[time lasts]: 38
tx_succ_bytes 0x6e0a0000 (1,846,149,120)

[time lasts]: 39
tx_succ_bytes 0x682f0000 (1,747,910,656)

[time lasts]: 40
tx_succ_bytes 0x676b0000 (1,735,065,600)

[time lasts]: 41
tx_succ_bytes 0x71ca0000 (1,909,063,680)

[time lasts]: 42
tx_succ_bytes 0x6d740000 (1,836,318,720)

[time lasts]: 43
tx_succ_bytes 0x6f9d0000 (1,872,560,128)

[time lasts]: 44
tx_succ_bytes 0x6cec0000 (1,827,405,824)

[time lasts]: 45
tx_succ_bytes 0x82ae0000 (2,192,441,344)

[time lasts]: 46
tx_succ_bytes 0x783b0000 (2,017,132,544)

[time lasts]: 47
tx_succ_bytes 0x73300000 (1,932,525,568)

[time lasts]: 48
tx_succ_bytes 0x6e580000 (1,851,260,928)

[time lasts]: 49
tx_succ_bytes 0x77ab0000 (2,007,695,360)

[time lasts]: 50
tx_succ_bytes 0x67a80000 (1,739,063,296)

[time lasts]: 51
tx_succ_bytes 0x7a490000 (2,051,604,480)

[time lasts]: 52
tx_succ_bytes 0x73d00000 (1,943,011,328)

[time lasts]: 53
tx_succ_bytes 0x74110000 (1,947,271,168)

[time lasts]: 54
tx_succ_bytes 0x6d360000 (1,832,255,488)

[time lasts]: 55
tx_succ_bytes 0x730c0000 (1,930,166,272)

[time lasts]: 56
tx_succ_bytes 0x6ed20000 (1,859,256,320)

[time lasts]: 57
tx_succ_bytes 0x73de0000 (1,943,928,832)

[time lasts]: 58
tx_succ_bytes 0x70660000 (1,885,732,864)

[time lasts]: 59
tx_succ_bytes 0x723e0000 (1,916,665,856)

[time lasts]: 60
tx_succ_bytes 0x7a4c0000 (2,051,801,088)

[time lasts]: 61
tx_succ_bytes 0x6d8a0000 (1,837,760,512)

[time lasts]: 62
tx_succ_bytes 0x5f650000 (1,600,454,656)

[time lasts]: 63
tx_succ_bytes 0x74180000 (1,947,729,920)

/*
* counters on the switch.
* we can see that the packet drop rate is quite high (~5%).
*/

hound-dirt# show queuing statistics interface ethernet 1/1/4
Interface ethernet1/1/4
Queue Packets Bytes Dropped-Packets Dropped-Bytes
0 18 1218 0 0
1 36 3528 0 0
2 0 0 0 0
3 0 0 0 0
4 0 0 0 0
5 21037361 189283361906 953632 8583259654
6 0 0 0 0
7 0 0 0 0
hound-dirt#

fox-dirt# show queuing statistics interface ethernet 1/1/4
Interface ethernet1/1/4
Queue Packets Bytes Dropped-Packets Dropped-Bytes
0 41 2666 0 0
1 28 2744 0 0
2 0 0 0 0
3 0 0 0 0
4 0 0 0 0
5 21026196 189216241992 962777 8667910211
6 0 0 0 0
7 0 0 0 0
fox-dirt#

2022-03-25 19:21:30

by Sagi Grimberg

Subject: Re: [PATCH v2 2/3] nvme-tcp: support specifying the congestion-control



On 3/25/22 15:11, Mingbao Sun wrote:
>> 1. Can you please provide your measurements that support your claims?
>
> Yes. I would provide a series of the testing result.
> In the bottom of this mail, I would provide the first one.
>
>>
>> 2. Can you please provide a real, existing use-case where this provides
>> true, measurable value? And more specifically, please clarify how the
>> use-case needs a local tuning for nvme-tcp that would not hold for
>> other tcp streams that are running on the host (and vice-versa).
>>
>
> As for the use-case:
> I think multiple NVMe/TCP hosts simultaneously writing data to a single
> target is a very common use-case.
> And this patchset addresses the performance issue of exactly that use-case.

Thanks Mingbao,

Long email, haven't read it all yet.

But this doesn't answer my specific question. I was asking why the
tcp congestion should be controlled locally to nvme. You could just as
easily change these knobs via sysctl and achieve the expected result
that dctcp handles congestion better than cubic (which was not even
tested with nvme, btw).

As I said, TCP can be tuned in various ways, congestion being just one
of them. I'm sure you can find a workload where rmem/wmem will make
a difference.

In addition, based on my knowledge, application-specific TCP-level
tuning (like congestion) is not really a common thing to do. So why in
nvme-tcp?

So to me at least, it is not clear why we should add it to the driver.

2022-03-25 19:51:10

by Mingbao Sun

Subject: Re: [PATCH v2 2/3] nvme-tcp: support specifying the congestion-control

[Combination 2] cubic + switch ECN-marking on

This is a bad scenario: the switch supports ECN marking, but cubic
cannot take advantage of it. We can see the bandwidth of each TX node
suddenly drop by several hundred MB/s due to packet dropping and
timeout-retransmission, and the total bandwidth on the RX node cannot
reach the full link bandwidth (which is about 6 GB/s).

/*
* before loading traffic, clear the counters on the 2 switches.
*/

hound-dirt# clear qos statistics type queuing interface ethernet 1/1/4
hound-dirt# show queuing statistics interface ethernet 1/1/4
Interface ethernet1/1/4
Queue Packets Bytes Dropped-Packets Dropped-Bytes
0 0 0 0 0
1 0 0 0 0
2 0 0 0 0
3 0 0 0 0
4 0 0 0 0
5 0 0 0 0
6 0 0 0 0
7 0 0 0 0
hound-dirt#

fox-dirt# clear qos statistics type queuing interface ethernet 1/1/4
fox-dirt# show queuing statistics interface ethernet 1/1/4
Interface ethernet1/1/4
Queue Packets Bytes Dropped-Packets Dropped-Bytes
0 0 0 0 0
1 0 0 0 0
2 0 0 0 0
3 0 0 0 0
4 0 0 0 0
5 0 0 0 0
6 0 0 0 0
7 0 0 0 0
fox-dirt#



/*
* logs of RX node.
*/

ogden-dirt:/home/admin/tyler # echo cubic >/proc/sys/net/ipv4/tcp_congestion_control
ogden-dirt:/home/admin/tyler # cat /proc/sys/net/ipv4/tcp_congestion_control
cubic
ogden-dirt:/home/admin/tyler # date; ./tcp_perf.exe --server-ip 169.254.85.4 --server-port 10001 --msg-len 65536 --conn-num 24 -s -r --data-random --svr-no-wait-all-conn-OK --show-bandwidth-only
Fri Mar 25 08:23:13 EDT 2022
DBG:/mnt/hgfs/src/linux-dev-framework-master/libs/app_utils/src/socket.c(104)-socket_init_2:
bind socket fd 3 to 169.254.85.4:10001 succeed
DBG:perf_frmwk.c(1472)-create_tcp_conns_start_load:
start listen on fd 3
conn [0] local 169.254.85.4:10001 peer 169.254.85.3:59676 created.
rx thread of conn 0 started
conn [1] local 169.254.85.4:10001 peer 169.254.85.3:59678 created.
rx thread of conn 1 started
conn [2] local 169.254.85.4:10001 peer 169.254.85.3:59680 created.
rx thread of conn 2 started
conn [3] local 169.254.85.4:10001 peer 169.254.85.3:59682 created.
rx thread of conn 3 started
conn [4] local 169.254.85.4:10001 peer 169.254.85.3:59684 created.
conn [5] local 169.254.85.4:10001 peer 169.254.85.3:59686 created.
rx thread of conn 4 started
conn [6] local 169.254.85.4:10001 peer 169.254.85.3:59688 created.
rx thread of conn 5 started
rx thread of conn 6 started
conn [7] local 169.254.85.4:10001 peer 169.254.85.3:59690 created.
rx thread of conn 7 started
conn [8] local 169.254.85.4:10001 peer 169.254.85.2:37974 created.
rx thread of conn 8 started
conn [9] local 169.254.85.4:10001 peer 169.254.85.2:37978 created.
rx thread of conn 9 started
conn [10] local 169.254.85.4:10001 peer 169.254.85.2:37980 created.
rx thread of conn 10 started
conn [11] local 169.254.85.4:10001 peer 169.254.85.2:37982 created.
rx thread of conn 11 started
conn [12] local 169.254.85.4:10001 peer 169.254.85.2:37984 created.
rx thread of conn 12 started
conn [13] local 169.254.85.4:10001 peer 169.254.85.2:37986 created.
conn [14] local 169.254.85.4:10001 peer 169.254.85.2:37988 created.
rx thread of conn 13 started
conn [15] local 169.254.85.4:10001 peer 169.254.85.2:37990 created.
rx thread of conn 14 started
rx thread of conn 15 started

[time lasts]: 1
rx_msg_succ_bytes 0x1625d0000 (5,945,229,312)
conn [16] local 169.254.85.4:10001 peer 169.254.85.1:49484 created.
rx thread of conn 16 started
conn [17] local 169.254.85.4:10001 peer 169.254.85.1:49486 created.
rx thread of conn 17 started
conn [18] local 169.254.85.4:10001 peer 169.254.85.1:49490 created.
rx thread of conn 18 started
conn [19] local 169.254.85.4:10001 peer 169.254.85.1:49492 created.
rx thread of conn 19 started
conn [20] local 169.254.85.4:10001 peer 169.254.85.1:49494 created.
rx thread of conn 20 started
conn [21] local 169.254.85.4:10001 peer 169.254.85.1:49496 created.
rx thread of conn 21 started
conn [22] local 169.254.85.4:10001 peer 169.254.85.1:49498 created.
rx thread of conn 22 started
conn [23] local 169.254.85.4:10001 peer 169.254.85.1:49488 created.
24 connection(s) created in total
rx thread of conn 23 started

[time lasts]: 2
rx_msg_succ_bytes 0x153a90000 (5,698,551,808)

[time lasts]: 3
rx_msg_succ_bytes 0x1589a0000 (5,781,454,848)

[time lasts]: 4
rx_msg_succ_bytes 0x156770000 (5,745,606,656)

[time lasts]: 5
rx_msg_succ_bytes 0x155e30000 (5,735,907,328)

[time lasts]: 6
rx_msg_succ_bytes 0x1583f0000 (5,775,491,072)

[time lasts]: 7
rx_msg_succ_bytes 0x159de0000 (5,802,688,512)

[time lasts]: 8
rx_msg_succ_bytes 0x153b00000 (5,699,010,560)

[time lasts]: 9
rx_msg_succ_bytes 0x153a50000 (5,698,289,664)

[time lasts]: 10
rx_msg_succ_bytes 0x1568a0000 (5,746,851,840)

[time lasts]: 11
rx_msg_succ_bytes 0x157bb0000 (5,766,840,320)

[time lasts]: 12
rx_msg_succ_bytes 0x15a2a0000 (5,807,669,248)

[time lasts]: 13
rx_msg_succ_bytes 0x155020000 (5,721,161,728)

[time lasts]: 14
rx_msg_succ_bytes 0x154b40000 (5,716,049,920)

[time lasts]: 15
rx_msg_succ_bytes 0x157610000 (5,760,942,080)

[time lasts]: 16
rx_msg_succ_bytes 0x157cd0000 (5,768,019,968)

[time lasts]: 17
rx_msg_succ_bytes 0x153cf0000 (5,701,042,176)

[time lasts]: 18
rx_msg_succ_bytes 0x151820000 (5,662,441,472)

[time lasts]: 19
rx_msg_succ_bytes 0x153fa0000 (5,703,860,224)

[time lasts]: 20
rx_msg_succ_bytes 0x150af0000 (5,648,613,376)

[time lasts]: 21
rx_msg_succ_bytes 0x152a70000 (5,681,643,520)

[time lasts]: 22
rx_msg_succ_bytes 0x154e10000 (5,718,999,040)

[time lasts]: 23
rx_msg_succ_bytes 0x157d60000 (5,768,609,792)

[time lasts]: 24
rx_msg_succ_bytes 0x1581d0000 (5,773,262,848)

[time lasts]: 25
rx_msg_succ_bytes 0x157240000 (5,756,944,384)

[time lasts]: 26
rx_msg_succ_bytes 0x159e50000 (5,803,147,264)

[time lasts]: 27
rx_msg_succ_bytes 0x1535b0000 (5,693,440,000)

[time lasts]: 28
rx_msg_succ_bytes 0x157390000 (5,758,320,640)

[time lasts]: 29
rx_msg_succ_bytes 0x155620000 (5,727,453,184)

[time lasts]: 30
rx_msg_succ_bytes 0x153c80000 (5,700,583,424)

[time lasts]: 31
rx_msg_succ_bytes 0x154cb0000 (5,717,557,248)

[time lasts]: 32
rx_msg_succ_bytes 0x153ea0000 (5,702,811,648)

[time lasts]: 33
rx_msg_succ_bytes 0x1577f0000 (5,762,908,160)

[time lasts]: 34
rx_msg_succ_bytes 0x158090000 (5,771,952,128)

[time lasts]: 35
rx_msg_succ_bytes 0x154040000 (5,704,515,584)

[time lasts]: 36
rx_msg_succ_bytes 0x1577f0000 (5,762,908,160)

[time lasts]: 37
rx_msg_succ_bytes 0x154900000 (5,713,690,624)

[time lasts]: 38
rx_msg_succ_bytes 0x155750000 (5,728,698,368)

[time lasts]: 39
rx_msg_succ_bytes 0x154ca0000 (5,717,491,712)

[time lasts]: 40
rx_msg_succ_bytes 0x158110000 (5,772,476,416)

[time lasts]: 41
rx_msg_succ_bytes 0x155c90000 (5,734,203,392)

[time lasts]: 42
rx_msg_succ_bytes 0x1563a0000 (5,741,608,960)

[time lasts]: 43
rx_msg_succ_bytes 0x1566e0000 (5,745,016,832)

[time lasts]: 44
rx_msg_succ_bytes 0x158d20000 (5,785,124,864)

[time lasts]: 45
rx_msg_succ_bytes 0x156c70000 (5,750,849,536)

[time lasts]: 46
rx_msg_succ_bytes 0x1536c0000 (5,694,554,112)

[time lasts]: 47
rx_msg_succ_bytes 0x15b030000 (5,821,890,560)

[time lasts]: 48
rx_msg_succ_bytes 0x156ac0000 (5,749,080,064)

[time lasts]: 49
rx_msg_succ_bytes 0x151950000 (5,663,686,656)

[time lasts]: 50
rx_msg_succ_bytes 0x156f80000 (5,754,060,800)

[time lasts]: 51
rx_msg_succ_bytes 0x158250000 (5,773,787,136)

[time lasts]: 52
rx_msg_succ_bytes 0x156830000 (5,746,393,088)

[time lasts]: 53
rx_msg_succ_bytes 0x157470000 (5,759,238,144)

[time lasts]: 54
rx_msg_succ_bytes 0x1557a0000 (5,729,026,048)

[time lasts]: 55
rx_msg_succ_bytes 0x156270000 (5,740,363,776)

[time lasts]: 56
rx_msg_succ_bytes 0x158da0000 (5,785,649,152)

[time lasts]: 57
rx_msg_succ_bytes 0x158980000 (5,781,323,776)

[time lasts]: 58
rx_msg_succ_bytes 0x154fe0000 (5,720,899,584)

[time lasts]: 59
rx_msg_succ_bytes 0x155530000 (5,726,470,144)

[time lasts]: 60
rx_msg_succ_bytes 0x156da0000 (5,752,094,720)

[time lasts]: 61
rx_msg_succ_bytes 0x155f70000 (5,737,218,048)

[time lasts]: 62
rx_msg_succ_bytes 0x154500000 (5,709,496,320)

[time lasts]: 63
rx_msg_succ_bytes 0x157c50000 (5,767,495,680)

[time lasts]: 64
rx_msg_succ_bytes 0x1550b0000 (5,721,751,552)

[time lasts]: 65
rx_msg_succ_bytes 0x1586c0000 (5,778,440,192)
^Ccaught signal 2

/*
* logs of TX node 1.
*/

provo-dirt:/home/admin/tyler # echo cubic >/proc/sys/net/ipv4/tcp_congestion_control
provo-dirt:/home/admin/tyler # cat /proc/sys/net/ipv4/tcp_congestion_control
cubic
provo-dirt:/home/admin/tyler # date; ./tcp_perf.exe --server-ip 169.254.85.4 --server-port 10001 --msg-len 65536 --conn-num 8 -c -t --data-random --show-bandwidth-only
Fri Mar 25 08:23:18 EDT 2022
conn [0] local 169.254.85.1:49484 peer 169.254.85.4:10001 created.
conn [1] local 169.254.85.1:49486 peer 169.254.85.4:10001 created.
conn [2] local 169.254.85.1:49488 peer 169.254.85.4:10001 created.
conn [3] local 169.254.85.1:49490 peer 169.254.85.4:10001 created.
conn [4] local 169.254.85.1:49492 peer 169.254.85.4:10001 created.
conn [5] local 169.254.85.1:49494 peer 169.254.85.4:10001 created.
conn [6] local 169.254.85.1:49496 peer 169.254.85.4:10001 created.
conn [7] local 169.254.85.1:49498 peer 169.254.85.4:10001 created.
8 connection(s) created in total
tx thread of conn 0 started
tx thread of conn 1 started
tx thread of conn 2 started
tx thread of conn 3 started
tx thread of conn 4 started
tx thread of conn 5 started
tx thread of conn 6 started
tx thread of conn 7 started

[time lasts]: 1
tx_succ_bytes 0x76930000 (1,989,345,280)

[time lasts]: 2
tx_succ_bytes 0x639f0000 (1,671,364,608)

[time lasts]: 3
tx_succ_bytes 0x7e1d0000 (2,115,829,760)

[time lasts]: 4
tx_succ_bytes 0x770b0000 (1,997,209,600)

[time lasts]: 5
tx_succ_bytes 0x68900000 (1,754,267,648)

[time lasts]: 6
tx_succ_bytes 0x53dd0000 (1,406,992,384)

[time lasts]: 7
tx_succ_bytes 0x6eab0000 (1,856,700,416)

[time lasts]: 8
tx_succ_bytes 0x4c200000 (1,277,165,568)

[time lasts]: 9
tx_succ_bytes 0x6eab0000 (1,856,700,416)

[time lasts]: 10
tx_succ_bytes 0x793e0000 (2,034,106,368)

[time lasts]: 11
tx_succ_bytes 0x5b7a0000 (1,534,722,048)

[time lasts]: 12
tx_succ_bytes 0x6c220000 (1,814,167,552)

[time lasts]: 13
tx_succ_bytes 0x65660000 (1,701,183,488)

[time lasts]: 14
tx_succ_bytes 0x77860000 (2,005,270,528)

[time lasts]: 15
tx_succ_bytes 0x6eff0000 (1,862,205,440)

[time lasts]: 16
tx_succ_bytes 0x65330000 (1,697,841,152)

[time lasts]: 17
tx_succ_bytes 0x73600000 (1,935,671,296)

[time lasts]: 18
tx_succ_bytes 0x7cf30000 (2,096,300,032)

[time lasts]: 19
tx_succ_bytes 0x838d0000 (2,207,055,872)

[time lasts]: 20
tx_succ_bytes 0x8ea90000 (2,393,440,256)

[time lasts]: 21
tx_succ_bytes 0x5ef50000 (1,593,114,624)

[time lasts]: 22
tx_succ_bytes 0x693e0000 (1,765,670,912)

[time lasts]: 23
tx_succ_bytes 0x64390000 (1,681,457,152)

[time lasts]: 24
tx_succ_bytes 0x69400000 (1,765,801,984)

[time lasts]: 25
tx_succ_bytes 0x69fb0000 (1,778,057,216)

[time lasts]: 26
tx_succ_bytes 0x710a0000 (1,896,480,768)

[time lasts]: 27
tx_succ_bytes 0x70170000 (1,880,555,520)

[time lasts]: 28
tx_succ_bytes 0x7ed30000 (2,127,757,312)

[time lasts]: 29
tx_succ_bytes 0x70830000 (1,887,633,408)

[time lasts]: 30
tx_succ_bytes 0x6fcb0000 (1,875,574,784)

[time lasts]: 31
tx_succ_bytes 0x6dc30000 (1,841,496,064)

[time lasts]: 32
tx_succ_bytes 0x7f060000 (2,131,099,648)

[time lasts]: 33
tx_succ_bytes 0x79760000 (2,037,776,384)

[time lasts]: 34
tx_succ_bytes 0x78f30000 (2,029,191,168)

[time lasts]: 35
tx_succ_bytes 0x621a0000 (1,645,871,104)

[time lasts]: 36
tx_succ_bytes 0x7a5d0000 (2,052,915,200)

[time lasts]: 37
tx_succ_bytes 0x7fb80000 (2,142,765,056)

[time lasts]: 38
tx_succ_bytes 0x73ed0000 (1,944,911,872)

[time lasts]: 39
tx_succ_bytes 0x6fb20000 (1,873,936,384)

[time lasts]: 40
tx_succ_bytes 0x72110000 (1,913,716,736)

[time lasts]: 41
tx_succ_bytes 0x59240000 (1,495,531,520)

[time lasts]: 42
tx_succ_bytes 0x55e30000 (1,440,940,032)

[time lasts]: 43
tx_succ_bytes 0x69e20000 (1,776,418,816)

[time lasts]: 44
tx_succ_bytes 0x849a0000 (2,224,685,056)

[time lasts]: 45
tx_succ_bytes 0x7a930000 (2,056,454,144)

[time lasts]: 46
tx_succ_bytes 0x79230000 (2,032,336,896)

[time lasts]: 47
tx_succ_bytes 0x6d690000 (1,835,597,824)

[time lasts]: 48
tx_succ_bytes 0x77c60000 (2,009,464,832)

[time lasts]: 49
tx_succ_bytes 0x5c670000 (1,550,254,080)

[time lasts]: 50
tx_succ_bytes 0x6a210000 (1,780,547,584)

[time lasts]: 51
tx_succ_bytes 0x64ef0000 (1,693,384,704)

[time lasts]: 52
tx_succ_bytes 0x810a0000 (2,164,916,224)

[time lasts]: 53
tx_succ_bytes 0x7a4f0000 (2,051,997,696)

[time lasts]: 54
tx_succ_bytes 0x70ac0000 (1,890,320,384)

[time lasts]: 55
tx_succ_bytes 0x92260000 (2,451,963,904)

[time lasts]: 56
tx_succ_bytes 0x5d500000 (1,565,523,968)

[time lasts]: 57
tx_succ_bytes 0x59030000 (1,493,368,832)

[time lasts]: 58
tx_succ_bytes 0x7ad60000 (2,060,845,056)

[time lasts]: 59
tx_succ_bytes 0x765a0000 (1,985,609,728)

[time lasts]: 60
tx_succ_bytes 0x6c1e0000 (1,813,905,408)

[time lasts]: 61
tx_succ_bytes 0x6eeb0000 (1,860,894,720)

[time lasts]: 62
tx_succ_bytes 0x6ab50000 (1,790,246,912)

[time lasts]: 63
tx_succ_bytes 0x64cf0000 (1,691,287,552)

/*
* logs of TX node 2.
*/

sandy-dirt:/home/admin/tyler # echo cubic >/proc/sys/net/ipv4/tcp_congestion_control
sandy-dirt:/home/admin/tyler # cat /proc/sys/net/ipv4/tcp_congestion_control
cubic
sandy-dirt:/home/admin/tyler # date; ./tcp_perf.exe --server-ip 169.254.85.4 --server-port 10001 --msg-len 65536 --conn-num 8 -c -t --data-random --show-bandwidth-only
Fri Mar 25 08:23:16 EDT 2022
conn [0] local 169.254.85.2:37974 peer 169.254.85.4:10001 created.
conn [1] local 169.254.85.2:37978 peer 169.254.85.4:10001 created.
conn [2] local 169.254.85.2:37980 peer 169.254.85.4:10001 created.
conn [3] local 169.254.85.2:37982 peer 169.254.85.4:10001 created.
conn [4] local 169.254.85.2:37984 peer 169.254.85.4:10001 created.
conn [5] local 169.254.85.2:37986 peer 169.254.85.4:10001 created.
conn [6] local 169.254.85.2:37988 peer 169.254.85.4:10001 created.
conn [7] local 169.254.85.2:37990 peer 169.254.85.4:10001 created.
8 connection(s) created in total
tx thread of conn 0 started
tx thread of conn 1 started
tx thread of conn 2 started
tx thread of conn 3 started
tx thread of conn 4 started
tx thread of conn 5 started
tx thread of conn 6 started
tx thread of conn 7 started

[time lasts]: 1
tx_succ_bytes 0x87790000 (2,272,854,016)

[time lasts]: 2
tx_succ_bytes 0x62f50000 (1,660,223,488)

[time lasts]: 3
tx_succ_bytes 0x70030000 (1,879,244,800)

[time lasts]: 4
tx_succ_bytes 0x5b990000 (1,536,753,664)

[time lasts]: 5
tx_succ_bytes 0x64210000 (1,679,884,288)

[time lasts]: 6
tx_succ_bytes 0x6a0a0000 (1,779,040,256)

[time lasts]: 7
tx_succ_bytes 0x75720000 (1,970,405,376)

[time lasts]: 8
tx_succ_bytes 0x7d980000 (2,107,113,472)

[time lasts]: 9
tx_succ_bytes 0x8d070000 (2,366,046,208)

[time lasts]: 10
tx_succ_bytes 0x74030000 (1,946,353,664)

[time lasts]: 11
tx_succ_bytes 0x71f20000 (1,911,685,120)

[time lasts]: 12
tx_succ_bytes 0x7f900000 (2,140,143,616)

[time lasts]: 13
tx_succ_bytes 0x6f690000 (1,869,152,256)

[time lasts]: 14
tx_succ_bytes 0x8edd0000 (2,396,848,128)

[time lasts]: 15
tx_succ_bytes 0x80c30000 (2,160,263,168)

[time lasts]: 16
tx_succ_bytes 0x79a40000 (2,040,791,040)

[time lasts]: 17
tx_succ_bytes 0x82120000 (2,182,217,728)

[time lasts]: 18
tx_succ_bytes 0x69e30000 (1,776,484,352)

[time lasts]: 19
tx_succ_bytes 0x64c80000 (1,690,828,800)

[time lasts]: 20
tx_succ_bytes 0x6e430000 (1,849,884,672)

[time lasts]: 21
tx_succ_bytes 0x6b110000 (1,796,276,224)

[time lasts]: 22
tx_succ_bytes 0x753c0000 (1,966,866,432)

[time lasts]: 23
tx_succ_bytes 0x80c30000 (2,160,263,168)

[time lasts]: 24
tx_succ_bytes 0x7c700000 (2,087,714,816)

[time lasts]: 25
tx_succ_bytes 0x72ec0000 (1,928,069,120)

[time lasts]: 26
tx_succ_bytes 0x84d20000 (2,228,355,072)

[time lasts]: 27
tx_succ_bytes 0x70f60000 (1,895,170,048)

[time lasts]: 28
tx_succ_bytes 0x613c0000 (1,631,322,112)

[time lasts]: 29
tx_succ_bytes 0x7cf00000 (2,096,103,424)

[time lasts]: 30
tx_succ_bytes 0x61270000 (1,629,945,856)

[time lasts]: 31
tx_succ_bytes 0x608d0000 (1,619,853,312)

[time lasts]: 32
tx_succ_bytes 0x7cdd0000 (2,094,858,240)

[time lasts]: 33
tx_succ_bytes 0x6e430000 (1,849,884,672)

[time lasts]: 34
tx_succ_bytes 0x5d820000 (1,568,800,768)

[time lasts]: 35
tx_succ_bytes 0x78f40000 (2,029,256,704)

[time lasts]: 36
tx_succ_bytes 0x66c20000 (1,723,990,016)

[time lasts]: 37
tx_succ_bytes 0x7ba60000 (2,074,476,544)

[time lasts]: 38
tx_succ_bytes 0x58600000 (1,482,686,464)

[time lasts]: 39
tx_succ_bytes 0x64710000 (1,685,127,168)

[time lasts]: 40
tx_succ_bytes 0x6e410000 (1,849,753,600)

[time lasts]: 41
tx_succ_bytes 0x65d00000 (1,708,130,304)

[time lasts]: 42
tx_succ_bytes 0x83490000 (2,202,599,424)

[time lasts]: 43
tx_succ_bytes 0x95950000 (2,509,570,048)

[time lasts]: 44
tx_succ_bytes 0x81980000 (2,174,222,336)

[time lasts]: 45
tx_succ_bytes 0x6acb0000 (1,791,688,704)

[time lasts]: 46
tx_succ_bytes 0x73320000 (1,932,656,640)

[time lasts]: 47
tx_succ_bytes 0x70000000 (1,879,048,192)

[time lasts]: 48
tx_succ_bytes 0x73b40000 (1,941,176,320)

[time lasts]: 49
tx_succ_bytes 0x5bdd0000 (1,541,210,112)

[time lasts]: 50
tx_succ_bytes 0x7e980000 (2,123,890,688)

[time lasts]: 51
tx_succ_bytes 0x7cf00000 (2,096,103,424)

[time lasts]: 52
tx_succ_bytes 0x802c0000 (2,150,367,232)

[time lasts]: 53
tx_succ_bytes 0x61540000 (1,632,894,976)

[time lasts]: 54
tx_succ_bytes 0x5f690000 (1,600,716,800)

[time lasts]: 55
tx_succ_bytes 0x6cde0000 (1,826,488,320)

[time lasts]: 56
tx_succ_bytes 0x6af90000 (1,794,703,360)

[time lasts]: 57
tx_succ_bytes 0x9ba10000 (2,611,019,776)

[time lasts]: 58
tx_succ_bytes 0x82ea0000 (2,196,373,504)

[time lasts]: 59
tx_succ_bytes 0x80a30000 (2,158,166,016)

[time lasts]: 60
tx_succ_bytes 0x5d4c0000 (1,565,261,824)

[time lasts]: 61
tx_succ_bytes 0x789e0000 (2,023,620,608)

[time lasts]: 62
tx_succ_bytes 0x65cf0000 (1,708,064,768)

[time lasts]: 63
tx_succ_bytes 0x83050000 (2,198,142,976)

/*
* logs of TX node 3.
*/

orem-dirt:/home/admin/tyler # echo cubic >/proc/sys/net/ipv4/tcp_congestion_control
orem-dirt:/home/admin/tyler # cat /proc/sys/net/ipv4/tcp_congestion_control
cubic
orem-dirt:/home/admin/tyler # date; ./tcp_perf.exe --server-ip 169.254.85.4 --server-port 10001 --msg-len 65536 --conn-num 8 -c -t --data-random --show-bandwidth-only
Fri Mar 25 08:23:15 EDT 2022
conn [0] local 169.254.85.3:59676 peer 169.254.85.4:10001 created.
conn [1] local 169.254.85.3:59678 peer 169.254.85.4:10001 created.
conn [2] local 169.254.85.3:59680 peer 169.254.85.4:10001 created.
conn [3] local 169.254.85.3:59682 peer 169.254.85.4:10001 created.
conn [4] local 169.254.85.3:59684 peer 169.254.85.4:10001 created.
conn [5] local 169.254.85.3:59686 peer 169.254.85.4:10001 created.
conn [6] local 169.254.85.3:59688 peer 169.254.85.4:10001 created.
conn [7] local 169.254.85.3:59690 peer 169.254.85.4:10001 created.
8 connection(s) created in total
tx thread of conn 0 started
tx thread of conn 1 started
tx thread of conn 2 started
tx thread of conn 3 started
tx thread of conn 4 started
tx thread of conn 5 started
tx thread of conn 6 started
tx thread of conn 7 started

[time lasts]: 1
tx_succ_bytes 0x15b6d0000 (5,828,837,376)

[time lasts]: 2
tx_succ_bytes 0xb1590000 (2,975,399,936)

[time lasts]: 3
tx_succ_bytes 0x8e2e0000 (2,385,379,328)

[time lasts]: 4
tx_succ_bytes 0x79cd0000 (2,043,478,016)

[time lasts]: 5
tx_succ_bytes 0x7e280000 (2,116,550,656)

[time lasts]: 6
tx_succ_bytes 0x80dc0000 (2,161,901,568)

[time lasts]: 7
tx_succ_bytes 0x7fa90000 (2,141,782,016)

[time lasts]: 8
tx_succ_bytes 0x8d530000 (2,371,026,944)

[time lasts]: 9
tx_succ_bytes 0x6bf10000 (1,810,956,288)

[time lasts]: 10
tx_succ_bytes 0x7d2c0000 (2,100,035,584)

[time lasts]: 11
tx_succ_bytes 0x6cca0000 (1,825,177,600)

[time lasts]: 12
tx_succ_bytes 0x755f0000 (1,969,160,192)

[time lasts]: 13
tx_succ_bytes 0x78240000 (2,015,625,216)

[time lasts]: 14
tx_succ_bytes 0x7a880000 (2,055,733,248)

[time lasts]: 15
tx_succ_bytes 0x5e980000 (1,587,019,776)

[time lasts]: 16
tx_succ_bytes 0x59fe0000 (1,509,818,368)

[time lasts]: 17
tx_succ_bytes 0x6e370000 (1,849,098,240)

[time lasts]: 18
tx_succ_bytes 0x6db80000 (1,840,775,168)

[time lasts]: 19
tx_succ_bytes 0x72f60000 (1,928,724,480)

[time lasts]: 20
tx_succ_bytes 0x6b7c0000 (1,803,288,576)

[time lasts]: 21
tx_succ_bytes 0x64040000 (1,677,983,744)

[time lasts]: 22
tx_succ_bytes 0x63c80000 (1,674,051,584)

[time lasts]: 23
tx_succ_bytes 0x7abd0000 (2,059,206,656)

[time lasts]: 24
tx_succ_bytes 0x78b60000 (2,025,193,472)

[time lasts]: 25
tx_succ_bytes 0x73c00000 (1,941,962,752)

[time lasts]: 26
tx_succ_bytes 0x779e0000 (2,006,843,392)

[time lasts]: 27
tx_succ_bytes 0x6ba70000 (1,806,106,624)

[time lasts]: 28
tx_succ_bytes 0x73140000 (1,930,690,560)

[time lasts]: 29
tx_succ_bytes 0x79e20000 (2,044,854,272)

[time lasts]: 30
tx_succ_bytes 0x64210000 (1,679,884,288)

[time lasts]: 31
tx_succ_bytes 0x80b70000 (2,159,476,736)

[time lasts]: 32
tx_succ_bytes 0x79450000 (2,034,565,120)

[time lasts]: 33
tx_succ_bytes 0x72e30000 (1,927,479,296)

[time lasts]: 34
tx_succ_bytes 0x664f0000 (1,716,453,376)

[time lasts]: 35
tx_succ_bytes 0x81120000 (2,165,440,512)

[time lasts]: 36
tx_succ_bytes 0x6abd0000 (1,790,771,200)

[time lasts]: 37
tx_succ_bytes 0x88530000 (2,287,140,864)

[time lasts]: 38
tx_succ_bytes 0x5a210000 (1,512,112,128)

[time lasts]: 39
tx_succ_bytes 0x7e4a0000 (2,118,778,880)

[time lasts]: 40
tx_succ_bytes 0x86fe0000 (2,264,793,088)

[time lasts]: 41
tx_succ_bytes 0x73c10000 (1,942,028,288)

[time lasts]: 42
tx_succ_bytes 0x83620000 (2,204,237,824)

[time lasts]: 43
tx_succ_bytes 0x7f900000 (2,140,143,616)

[time lasts]: 44
tx_succ_bytes 0x699c0000 (1,771,831,296)

[time lasts]: 45
tx_succ_bytes 0x5e0d0000 (1,577,910,272)

[time lasts]: 46
tx_succ_bytes 0x66890000 (1,720,254,464)

[time lasts]: 47
tx_succ_bytes 0x75a10000 (1,973,485,568)

[time lasts]: 48
tx_succ_bytes 0x6b3f0000 (1,799,290,880)

[time lasts]: 49
tx_succ_bytes 0x764d0000 (1,984,757,760)

[time lasts]: 50
tx_succ_bytes 0x83320000 (2,201,092,096)

[time lasts]: 51
tx_succ_bytes 0x75ca0000 (1,976,172,544)

[time lasts]: 52
tx_succ_bytes 0x759e0000 (1,973,288,960)

[time lasts]: 53
tx_succ_bytes 0x6b160000 (1,796,603,904)

[time lasts]: 54
tx_succ_bytes 0x74430000 (1,950,547,968)

[time lasts]: 55
tx_succ_bytes 0x7ed90000 (2,128,150,528)

[time lasts]: 56
tx_succ_bytes 0x74ff0000 (1,962,868,736)

[time lasts]: 57
tx_succ_bytes 0x60dd0000 (1,625,096,192)

[time lasts]: 58
tx_succ_bytes 0x5eca0000 (1,590,296,576)

[time lasts]: 59
tx_succ_bytes 0x73890000 (1,938,358,272)

[time lasts]: 60
tx_succ_bytes 0x5e8c0000 (1,586,233,344)

[time lasts]: 61
tx_succ_bytes 0x7d010000 (2,097,217,536)

[time lasts]: 62
tx_succ_bytes 0x76cd0000 (1,993,146,368)

[time lasts]: 63
tx_succ_bytes 0x81620000 (2,170,683,392)

[time lasts]: 64
tx_succ_bytes 0x6b980000 (1,805,123,584)

/*
 * counters on the switches.
 * As we can see, the packet-drop rate is quite high (~5%).
 */

hound-dirt# show queuing statistics interface ethernet 1/1/4
Interface ethernet1/1/4
Queue Packets Bytes Dropped-Packets Dropped-Bytes
0 0 0 0 0
1 9 882 0 0
2 0 0 0 0
3 0 0 0 0
4 0 0 0 0
5 21377065 191400217674 1263665 11321730380
6 0 0 0 0
7 0 0 0 0
hound-dirt#

fox-dirt# show queuing statistics interface ethernet 1/1/4
Interface ethernet1/1/4
Queue Packets Bytes Dropped-Packets Dropped-Bytes
0 10 654 0 0
1 6 588 0 0
2 0 0 0 0
3 0 0 0 0
4 0 0 0 0
5 21228491 189957815718 1192655 10677602488
6 0 0 0 0
7 0 0 0 0
fox-dirt#

2022-03-25 20:00:53

by Mingbao Sun

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] nvme-tcp: support specifying the congestion-control

Hope the 3 combinations can support the claims in the commit message.

For the latter 2 combinations, due to packet dropping and
timeout-retransmission, the bandwidth of each TX node could suddenly
drop by a few hundred MB/s.
And on the RX node, the total bandwidth cannot reach the full link
bandwidth (which is about 6 GB/s).

In contrast, for the first combination, the bandwidth of each TX node
is stable at ~2 GB/s.
And on the RX node, the total bandwidth reaches the full link bandwidth.
And no packet dropping occurs on the 2 switches.
This is even competitive with the performance of RDMA.

2022-03-25 20:03:58

by Mingbao Sun

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] nvme-tcp: support specifying the congestion-control

> 1. Can you please provide your measurements that support your claims?

Yes. I will provide a series of test results.
The first one is included at the bottom of this mail.

>
> 2. Can you please provide a real, existing use-case where this provides
> true, measureable value? And more specifically, please clarify how the
> use-case needs a local tuning for nvme-tcp that would not hold for
> other tcp streams that are running on the host (and vice-versa).
>

As for the use-case:
multiple NVMe/TCP hosts simultaneously writing data to a single target
is a very common use-case.
And this patchset addresses exactly the performance issue of that use-case.


Now let's see the measurements.

I have 4 servers with exactly the same HW/SW combination.
Each server has a 50Gb bonding port consisting of 2 25Gb Mellanox ConnectX-4
Lx ports.

Basic SW info:
provo-dirt:~ # cat /etc/os-release
NAME="SLES"
VERSION="15-SP3"
VERSION_ID="15.3"
PRETTY_NAME="SUSE Linux Enterprise Server 15 SP3"
......
provo-dirt:~ # uname -r
5.14.21-SP4-9c51d4e2+

Test method:
the first 3 servers transmit (TX), and the last one receives (RX).
A small program was written to generate the traffic.
It prints the relative and absolute counters every second.
Each round lasts about 1 minute.
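
The source of tcp_perf.exe is not included in this mail; as a rough
illustration only, a single-connection sender loop of this kind could look
like the minimal sketch below (server IP, port and message length are taken
from the command lines in the logs; everything else, including the counter
formatting, is simplified):

/*
 * Minimal single-connection sender sketch (not the actual tcp_perf.exe).
 * It sends fixed-size messages in a loop and prints the number of bytes
 * successfully sent during each elapsed second.
 */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <time.h>
#include <unistd.h>

int main(void)
{
	const char *server_ip = "169.254.85.4";	/* the RX node */
	const int server_port = 10001;
	const size_t msg_len = 65536;		/* --msg-len 65536 */
	unsigned long long sec_bytes = 0;
	time_t last = time(NULL);

	char *buf = malloc(msg_len);
	if (!buf)
		return 1;
	memset(buf, 0xab, msg_len);

	int fd = socket(AF_INET, SOCK_STREAM, 0);
	struct sockaddr_in addr = { .sin_family = AF_INET,
				    .sin_port = htons(server_port) };
	inet_pton(AF_INET, server_ip, &addr.sin_addr);
	if (fd < 0 || connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
		perror("connect");
		return 1;
	}

	for (;;) {
		ssize_t n = send(fd, buf, msg_len, 0);
		if (n < 0) {
			perror("send");
			break;
		}
		sec_bytes += (unsigned long long)n;

		time_t now = time(NULL);
		if (now != last) {	/* print once per elapsed second */
			printf("tx_succ_bytes 0x%llx (%llu)\n",
			       sec_bytes, sec_bytes);
			sec_bytes = 0;
			last = now;
		}
	}

	close(fd);
	free(buf);
	return 0;
}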


Here is the info and the role of these servers.

No. NAME IP ROLE
1 provo-dirt 169.254.85.1 TX
2 sandy-dirt 169.254.85.2 TX
3 orem-dirt 169.254.85.3 TX
4 ogden-dirt 169.254.85.4 RX


The output looks like this on the TX nodes:

[time lasts]: 1
[conn-num 8][tx on][rx off][msg-len 65536][data-type seq-u64]

[Relative Stat]
tx_total 0x7bc6 (31,686)
tx_total_bytes 0x7bc60000 (2,076,573,696)
tx_succ 0x7bbe (31,678)
tx_succ_bytes 0x7bbe0000 (2,076,049,408)

[Absolute Stat]
tx_total 0x7bc6 (31,686)
tx_total_bytes 0x7bc60000 (2,076,573,696)
tx_succ 0x7bbe (31,678)
tx_succ_bytes 0x7bbe0000 (2,076,049,408)

[time lasts]: 2
[conn-num 8][tx on][rx off][msg-len 65536][data-type seq-u64]

[Relative Stat]
tx_total 0x7bfd (31,741)
tx_total_bytes 0x7bfd0000 (2,080,178,176)
tx_succ 0x7bfd (31,741)
tx_succ_bytes 0x7bfd0000 (2,080,178,176)

[Absolute Stat]
tx_total 0xf7c3 (63,427)
tx_total_bytes 0xf7c30000 (4,156,751,872)
tx_succ 0xf7bb (63,419)
tx_succ_bytes 0xf7bb0000 (4,156,227,584)

[time lasts]: 3
[conn-num 8][tx on][rx off][msg-len 65536][data-type seq-u64]

[Relative Stat]
tx_total 0x7b8b (31,627)
tx_total_bytes 0x7b8b0000 (2,072,707,072)
tx_succ 0x7b8b (31,627)
tx_succ_bytes 0x7b8b0000 (2,072,707,072)

[Absolute Stat]
tx_total 0x1734e (95,054)
tx_total_bytes 0x1734e0000 (6,229,458,944)
tx_succ 0x17346 (95,046)
tx_succ_bytes 0x173460000 (6,228,934,656)


And the output looks like this on the RX node:

[time lasts]: 3
[conn-num 24][tx off][rx on][msg-len 65536][data-type seq-u64]

[Relative Stat]
rx_total 0x170af (94,383)
rx_total_bytes 0x170af0000 (6,185,484,288)
rx_msg 0x170af (94,383)
rx_msg_bytes 0x170af0000 (6,185,484,288)
rx_msg_succ 0x170af (94,383)
rx_msg_succ_bytes 0x170af0000 (6,185,484,288)

[Absolute Stat]
rx_total 0x44150 (278,864)
rx_total_bytes 0x441500000 (18,275,631,104)
rx_msg 0x44138 (278,840)
rx_msg_bytes 0x441380000 (18,274,058,240)
rx_msg_succ 0x44138 (278,840)
rx_msg_succ_bytes 0x441380000 (18,274,058,240)

[time lasts]: 4
[conn-num 24][tx off][rx on][msg-len 65536][data-type seq-u64]

[Relative Stat]
rx_total 0x170ab (94,379)
rx_total_bytes 0x170ab0000 (6,185,222,144)
rx_msg 0x170ab (94,379)
rx_msg_bytes 0x170ab0000 (6,185,222,144)
rx_msg_succ 0x170ab (94,379)
rx_msg_succ_bytes 0x170ab0000 (6,185,222,144)

[Absolute Stat]
rx_total 0x5b1fb (373,243)
rx_total_bytes 0x5b1fb0000 (24,460,853,248)
rx_msg 0x5b1e3 (373,219)
rx_msg_bytes 0x5b1e30000 (24,459,280,384)
rx_msg_succ 0x5b1e3 (373,219)
rx_msg_succ_bytes 0x5b1e30000 (24,459,280,384)

[time lasts]: 5
[conn-num 24][tx off][rx on][msg-len 65536][data-type seq-u64]

[Relative Stat]
rx_total 0x170ae (94,382)
rx_total_bytes 0x170ae0000 (6,185,418,752)
rx_msg 0x170ae (94,382)
rx_msg_bytes 0x170ae0000 (6,185,418,752)
rx_msg_succ 0x170ae (94,382)
rx_msg_succ_bytes 0x170ae0000 (6,185,418,752)

[Absolute Stat]
rx_total 0x722a9 (467,625)
rx_total_bytes 0x722a90000 (30,646,272,000)
rx_msg 0x72291 (467,601)
rx_msg_bytes 0x722910000 (30,644,699,136)
rx_msg_succ 0x72291 (467,601)
rx_msg_succ_bytes 0x722910000 (30,644,699,136)

But to make it easier to focus on the fluctuation of the bandwidth,
and to reduce the length of the mail,
all other lines are muted; only tx_succ_bytes and rx_msg_succ_bytes
are shown for each second.
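
As a rough sanity check on the numbers that follow: the RX node sits behind
a 50Gb bonding port, i.e. 50 Gb/s / 8 = 6.25 GB/s of raw link bandwidth.
A steady rx_msg_succ_bytes of about 6,185,000,000 bytes per second is
roughly 6.19 GB/s (~49.5 Gb/s of payload), so in that case the link is
essentially saturated.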

[Combination 1] dctcp + switch ECN-marking on

This is the perfect scenario.
Bandwidth of each node is perfectly stable.
No packet dropping on the switches at all.

/*
* before loading traffic, clear the counters on the 2 switches.
*/

hound-dirt# clear qos statistics type queuing interface ethernet 1/1/4
hound-dirt# show queuing statistics interface ethernet 1/1/4
Interface ethernet1/1/4
Queue Packets Bytes Dropped-Packets Dropped-Bytes
0 0 0 0 0
1 0 0 0 0
2 0 0 0 0
3 0 0 0 0
4 0 0 0 0
5 0 0 0 0
6 0 0 0 0
7 0 0 0 0
hound-dirt#

hound-dirt# clear qos statistics type queuing interface ethernet 1/1/4
hound-dirt# show queuing statistics interface ethernet 1/1/4
Interface ethernet1/1/4
Queue Packets Bytes Dropped-Packets Dropped-Bytes
0 0 0 0 0
1 0 0 0 0
2 0 0 0 0
3 0 0 0 0
4 0 0 0 0
5 0 0 0 0
6 0 0 0 0
7 0 0 0 0
hound-dirt#



/*
* logs of RX node.
*/

ogden-dirt:/home/admin/tyler # cat /proc/sys/net/ipv4/tcp_congestion_control
dctcp

ogden-dirt:/home/admin/tyler # date; ./tcp_perf.exe --server-ip 169.254.85.4 --server-port 10001 --msg-len 65536 --conn-num 24 -s -r --data-random --svr-no-wait-all-conn-OK --show-bandwidth-only
Fri Mar 25 07:58:48 EDT 2022

DBG:/mnt/hgfs/src/linux-dev-framework-master/libs/app_utils/src/socket.c(104)-socket_init_2:
bind socket fd 3 to 169.254.85.4:10001 succeed
DBG:perf_frmwk.c(1472)-create_tcp_conns_start_load:
start listen on fd 3
conn [0] local 169.254.85.4:10001 peer 169.254.85.3:59660 created.
rx thread of conn 0 started
conn [1] local 169.254.85.4:10001 peer 169.254.85.3:59662 created.
rx thread of conn 1 started
conn [2] local 169.254.85.4:10001 peer 169.254.85.3:59664 created.
rx thread of conn 2 started
conn [3] local 169.254.85.4:10001 peer 169.254.85.3:59666 created.
rx thread of conn 3 started
conn [4] local 169.254.85.4:10001 peer 169.254.85.3:59668 created.
conn [5] local 169.254.85.4:10001 peer 169.254.85.3:59670 created.
rx thread of conn 4 started
conn [6] local 169.254.85.4:10001 peer 169.254.85.3:59672 created.
rx thread of conn 5 started
conn [7] local 169.254.85.4:10001 peer 169.254.85.3:59674 created.
rx thread of conn 6 started
rx thread of conn 7 started
conn [8] local 169.254.85.4:10001 peer 169.254.85.2:37956 created.
rx thread of conn 8 started
conn [9] local 169.254.85.4:10001 peer 169.254.85.2:37958 created.
rx thread of conn 9 started
conn [10] local 169.254.85.4:10001 peer 169.254.85.2:37960 created.
rx thread of conn 10 started
conn [11] local 169.254.85.4:10001 peer 169.254.85.2:37962 created.
rx thread of conn 11 started
conn [12] local 169.254.85.4:10001 peer 169.254.85.2:37964 created.
rx thread of conn 12 started
conn [13] local 169.254.85.4:10001 peer 169.254.85.2:37966 created.
rx thread of conn 13 started
conn [14] local 169.254.85.4:10001 peer 169.254.85.2:37968 created.
rx thread of conn 14 started
conn [15] local 169.254.85.4:10001 peer 169.254.85.2:37970 created.
rx thread of conn 15 started

[time lasts]: 1
rx_msg_succ_bytes 0x161500000 (5,927,600,128)
conn [16] local 169.254.85.4:10001 peer 169.254.85.1:49468 created.
rx thread of conn 16 started
conn [17] local 169.254.85.4:10001 peer 169.254.85.1:49470 created.
rx thread of conn 17 started
conn [18] local 169.254.85.4:10001 peer 169.254.85.1:49472 created.
rx thread of conn 18 started
conn [19] local 169.254.85.4:10001 peer 169.254.85.1:49474 created.
rx thread of conn 19 started
conn [20] local 169.254.85.4:10001 peer 169.254.85.1:49476 created.
rx thread of conn 20 started
conn [21] local 169.254.85.4:10001 peer 169.254.85.1:49478 created.
rx thread of conn 21 started
conn [22] local 169.254.85.4:10001 peer 169.254.85.1:49480 created.
rx thread of conn 22 started
conn [23] local 169.254.85.4:10001 peer 169.254.85.1:49482 created.
24 connection(s) created in total
rx thread of conn 23 started

[time lasts]: 2
rx_msg_succ_bytes 0x1709a0000 (6,184,108,032)

[time lasts]: 3
rx_msg_succ_bytes 0x170ac0000 (6,185,287,680)

[time lasts]: 4
rx_msg_succ_bytes 0x170ad0000 (6,185,353,216)

[time lasts]: 5
rx_msg_succ_bytes 0x170ad0000 (6,185,353,216)

[time lasts]: 6
rx_msg_succ_bytes 0x170b00000 (6,185,549,824)

[time lasts]: 7
rx_msg_succ_bytes 0x170590000 (6,179,848,192)

[time lasts]: 8
rx_msg_succ_bytes 0x170a90000 (6,185,091,072)

[time lasts]: 9
rx_msg_succ_bytes 0x170af0000 (6,185,484,288)

[time lasts]: 10
rx_msg_succ_bytes 0x170af0000 (6,185,484,288)

[time lasts]: 11
rx_msg_succ_bytes 0x170ad0000 (6,185,353,216)

[time lasts]: 12
rx_msg_succ_bytes 0x170ad0000 (6,185,353,216)

[time lasts]: 13
rx_msg_succ_bytes 0x170ab0000 (6,185,222,144)

[time lasts]: 14
rx_msg_succ_bytes 0x170b20000 (6,185,680,896)

[time lasts]: 15
rx_msg_succ_bytes 0x170a70000 (6,184,960,000)

[time lasts]: 16
rx_msg_succ_bytes 0x170ab0000 (6,185,222,144)

[time lasts]: 17
rx_msg_succ_bytes 0x170b20000 (6,185,680,896)

[time lasts]: 18
rx_msg_succ_bytes 0x170ab0000 (6,185,222,144)

[time lasts]: 19
rx_msg_succ_bytes 0x170ac0000 (6,185,287,680)

[time lasts]: 20
rx_msg_succ_bytes 0x170ac0000 (6,185,287,680)

[time lasts]: 21
rx_msg_succ_bytes 0x170ab0000 (6,185,222,144)

[time lasts]: 22
rx_msg_succ_bytes 0x170ac0000 (6,185,287,680)

[time lasts]: 23
rx_msg_succ_bytes 0x170ae0000 (6,185,418,752)

[time lasts]: 24
rx_msg_succ_bytes 0x170ad0000 (6,185,353,216)

[time lasts]: 25
rx_msg_succ_bytes 0x170ae0000 (6,185,418,752)

[time lasts]: 26
rx_msg_succ_bytes 0x170aa0000 (6,185,156,608)

[time lasts]: 27
rx_msg_succ_bytes 0x170ae0000 (6,185,418,752)

[time lasts]: 28
rx_msg_succ_bytes 0x170ad0000 (6,185,353,216)

[time lasts]: 29
rx_msg_succ_bytes 0x170aa0000 (6,185,156,608)

[time lasts]: 30
rx_msg_succ_bytes 0x170ac0000 (6,185,287,680)

[time lasts]: 31
rx_msg_succ_bytes 0x170ac0000 (6,185,287,680)

[time lasts]: 32
rx_msg_succ_bytes 0x170ae0000 (6,185,418,752)

[time lasts]: 33
rx_msg_succ_bytes 0x170ae0000 (6,185,418,752)

[time lasts]: 34
rx_msg_succ_bytes 0x170aa0000 (6,185,156,608)

[time lasts]: 35
rx_msg_succ_bytes 0x170ab0000 (6,185,222,144)

[time lasts]: 36
rx_msg_succ_bytes 0x170b10000 (6,185,615,360)

[time lasts]: 37
rx_msg_succ_bytes 0x170ad0000 (6,185,353,216)

[time lasts]: 38
rx_msg_succ_bytes 0x170a90000 (6,185,091,072)

[time lasts]: 39
rx_msg_succ_bytes 0x170ab0000 (6,185,222,144)

[time lasts]: 40
rx_msg_succ_bytes 0x170ad0000 (6,185,353,216)

[time lasts]: 41
rx_msg_succ_bytes 0x170ae0000 (6,185,418,752)

[time lasts]: 42
rx_msg_succ_bytes 0x170aa0000 (6,185,156,608)

[time lasts]: 43
rx_msg_succ_bytes 0x170ac0000 (6,185,287,680)

[time lasts]: 44
rx_msg_succ_bytes 0x170af0000 (6,185,484,288)

[time lasts]: 45
rx_msg_succ_bytes 0x170ac0000 (6,185,287,680)

[time lasts]: 46
rx_msg_succ_bytes 0x170aa0000 (6,185,156,608)

[time lasts]: 47
rx_msg_succ_bytes 0x170ac0000 (6,185,287,680)

[time lasts]: 48
rx_msg_succ_bytes 0x170af0000 (6,185,484,288)

[time lasts]: 49
rx_msg_succ_bytes 0x170ac0000 (6,185,287,680)

[time lasts]: 50
rx_msg_succ_bytes 0x170ae0000 (6,185,418,752)

[time lasts]: 51
rx_msg_succ_bytes 0x170ad0000 (6,185,353,216)

[time lasts]: 52
rx_msg_succ_bytes 0x170ab0000 (6,185,222,144)

[time lasts]: 53
rx_msg_succ_bytes 0x170ae0000 (6,185,418,752)

[time lasts]: 54
rx_msg_succ_bytes 0x170ac0000 (6,185,287,680)

[time lasts]: 55
rx_msg_succ_bytes 0x170ad0000 (6,185,353,216)

[time lasts]: 56
rx_msg_succ_bytes 0x170ae0000 (6,185,418,752)

[time lasts]: 57
rx_msg_succ_bytes 0x170a90000 (6,185,091,072)

[time lasts]: 58
rx_msg_succ_bytes 0x170af0000 (6,185,484,288)

[time lasts]: 59
rx_msg_succ_bytes 0x170aa0000 (6,185,156,608)

[time lasts]: 60
rx_msg_succ_bytes 0x170a50000 (6,184,828,928)

[time lasts]: 61
rx_msg_succ_bytes 0x170aa0000 (6,185,156,608)

[time lasts]: 62
rx_msg_succ_bytes 0x170ad0000 (6,185,353,216)

[time lasts]: 63
rx_msg_succ_bytes 0x170ab0000 (6,185,222,144)
^Ccaught signal 2


/*
* logs of TX node 1.
*/

provo-dirt:/home/admin/tyler # cat /proc/sys/net/ipv4/tcp_congestion_control
dctcp

provo-dirt:/home/admin/tyler # date; ./tcp_perf.exe --server-ip 169.254.85.4 --server-port 10001 --msg-len 65536 --conn-num 8 -c -t --data-random --show-bandwidth-only
Fri Mar 25 07:58:53 EDT 2022

conn [0] local 169.254.85.1:49468 peer 169.254.85.4:10001 created.
conn [1] local 169.254.85.1:49470 peer 169.254.85.4:10001 created.
conn [2] local 169.254.85.1:49472 peer 169.254.85.4:10001 created.
conn [3] local 169.254.85.1:49474 peer 169.254.85.4:10001 created.
conn [4] local 169.254.85.1:49476 peer 169.254.85.4:10001 created.
conn [5] local 169.254.85.1:49478 peer 169.254.85.4:10001 created.
conn [6] local 169.254.85.1:49480 peer 169.254.85.4:10001 created.
conn [7] local 169.254.85.1:49482 peer 169.254.85.4:10001 created.
8 connection(s) created in total
tx thread of conn 0 started
tx thread of conn 1 started
tx thread of conn 2 started
tx thread of conn 3 started
tx thread of conn 4 started
tx thread of conn 5 started
tx thread of conn 6 started
tx thread of conn 7 started

[time lasts]: 1
tx_succ_bytes 0x765c0000 (1,985,740,800)

[time lasts]: 2
tx_succ_bytes 0x78f30000 (2,029,191,168)

[time lasts]: 3
tx_succ_bytes 0x783d0000 (2,017,263,616)

[time lasts]: 4
tx_succ_bytes 0x78c90000 (2,026,438,656)

[time lasts]: 5
tx_succ_bytes 0x76bf0000 (1,992,228,864)

[time lasts]: 6
tx_succ_bytes 0x784f0000 (2,018,443,264)

[time lasts]: 7
tx_succ_bytes 0x765c0000 (1,985,740,800)

[time lasts]: 8
tx_succ_bytes 0x78140000 (2,014,576,640)

[time lasts]: 9
tx_succ_bytes 0x77bf0000 (2,009,006,080)

[time lasts]: 10
tx_succ_bytes 0x79210000 (2,032,205,824)

[time lasts]: 11
tx_succ_bytes 0x786d0000 (2,020,409,344)

[time lasts]: 12
tx_succ_bytes 0x79480000 (2,034,761,728)

[time lasts]: 13
tx_succ_bytes 0x76610000 (1,986,068,480)

[time lasts]: 14
tx_succ_bytes 0x76940000 (1,989,410,816)

[time lasts]: 15
tx_succ_bytes 0x76ef0000 (1,995,374,592)

[time lasts]: 16
tx_succ_bytes 0x76f20000 (1,995,571,200)

[time lasts]: 17
tx_succ_bytes 0x78190000 (2,014,904,320)

[time lasts]: 18
tx_succ_bytes 0x775c0000 (2,002,518,016)

[time lasts]: 19
tx_succ_bytes 0x77040000 (1,996,750,848)

[time lasts]: 20
tx_succ_bytes 0x77d90000 (2,010,710,016)

[time lasts]: 21
tx_succ_bytes 0x79470000 (2,034,696,192)

[time lasts]: 22
tx_succ_bytes 0x79730000 (2,037,579,776)

[time lasts]: 23
tx_succ_bytes 0x760c0000 (1,980,497,920)

[time lasts]: 24
tx_succ_bytes 0x779e0000 (2,006,843,392)

[time lasts]: 25
tx_succ_bytes 0x78100000 (2,014,314,496)

[time lasts]: 26
tx_succ_bytes 0x78bb0000 (2,025,521,152)

[time lasts]: 27
tx_succ_bytes 0x77df0000 (2,011,103,232)

[time lasts]: 28
tx_succ_bytes 0x75840000 (1,971,585,024)

[time lasts]: 29
tx_succ_bytes 0x772d0000 (1,999,437,824)

[time lasts]: 30
tx_succ_bytes 0x785d0000 (2,019,360,768)

[time lasts]: 31
tx_succ_bytes 0x766f0000 (1,986,985,984)

[time lasts]: 32
tx_succ_bytes 0x77510000 (2,001,797,120)

[time lasts]: 33
tx_succ_bytes 0x78720000 (2,020,737,024)

[time lasts]: 34
tx_succ_bytes 0x790e0000 (2,030,960,640)

[time lasts]: 35
tx_succ_bytes 0x797b0000 (2,038,104,064)

[time lasts]: 36
tx_succ_bytes 0x78710000 (2,020,671,488)

[time lasts]: 37
tx_succ_bytes 0x79bb0000 (2,042,298,368)

[time lasts]: 38
tx_succ_bytes 0x76f80000 (1,995,964,416)

[time lasts]: 39
tx_succ_bytes 0x79780000 (2,037,907,456)

[time lasts]: 40
tx_succ_bytes 0x79270000 (2,032,599,040)

[time lasts]: 41
tx_succ_bytes 0x78660000 (2,019,950,592)

[time lasts]: 42
tx_succ_bytes 0x77970000 (2,006,384,640)

[time lasts]: 43
tx_succ_bytes 0x78270000 (2,015,821,824)

[time lasts]: 44
tx_succ_bytes 0x774d0000 (2,001,534,976)

[time lasts]: 45
tx_succ_bytes 0x77a80000 (2,007,498,752)

[time lasts]: 46
tx_succ_bytes 0x78790000 (2,021,195,776)

[time lasts]: 47
tx_succ_bytes 0x7a730000 (2,054,356,992)

[time lasts]: 48
tx_succ_bytes 0x78c80000 (2,026,373,120)

[time lasts]: 49
tx_succ_bytes 0x78eb0000 (2,028,666,880)

[time lasts]: 50
tx_succ_bytes 0x76fd0000 (1,996,292,096)

[time lasts]: 51
tx_succ_bytes 0x77e30000 (2,011,365,376)

[time lasts]: 52
tx_succ_bytes 0x78de0000 (2,027,814,912)

[time lasts]: 53
tx_succ_bytes 0x779f0000 (2,006,908,928)

[time lasts]: 54
tx_succ_bytes 0x78310000 (2,016,477,184)

[time lasts]: 55
tx_succ_bytes 0x77b20000 (2,008,154,112)

[time lasts]: 56
tx_succ_bytes 0x76760000 (1,987,444,736)

[time lasts]: 57
tx_succ_bytes 0x78060000 (2,013,659,136)

[time lasts]: 58
tx_succ_bytes 0x78a30000 (2,023,948,288)

[time lasts]: 59
tx_succ_bytes 0x79c20000 (2,042,757,120)

[time lasts]: 60
tx_succ_bytes 0x76ff0000 (1,996,423,168)

[time lasts]: 61
tx_succ_bytes 0x78d30000 (2,027,094,016)

[time lasts]: 62
tx_succ_bytes 0x77cd0000 (2,009,923,584)

/*
* logs of TX node 2.
*/

sandy-dirt:/home/admin/tyler # cat /proc/sys/net/ipv4/tcp_congestion_control
dctcp

sandy-dirt:/home/admin/tyler # date; ./tcp_perf.exe --server-ip 169.254.85.4 --server-port 10001 --msg-len 65536 --conn-num 8 -c -t --data-random --show-bandwidth-only
Fri Mar 25 07:58:52 EDT 2022

conn [0] local 169.254.85.2:37956 peer 169.254.85.4:10001 created.
conn [1] local 169.254.85.2:37958 peer 169.254.85.4:10001 created.
conn [2] local 169.254.85.2:37960 peer 169.254.85.4:10001 created.
conn [3] local 169.254.85.2:37962 peer 169.254.85.4:10001 created.
conn [4] local 169.254.85.2:37964 peer 169.254.85.4:10001 created.
conn [5] local 169.254.85.2:37966 peer 169.254.85.4:10001 created.
conn [6] local 169.254.85.2:37968 peer 169.254.85.4:10001 created.
conn [7] local 169.254.85.2:37970 peer 169.254.85.4:10001 created.
8 connection(s) created in total
tx thread of conn 0 started
tx thread of conn 1 started
tx thread of conn 2 started
tx thread of conn 3 started
tx thread of conn 4 started
tx thread of conn 5 started
tx thread of conn 6 started
tx thread of conn 7 started

[time lasts]: 1
tx_succ_bytes 0xb7980000 (3,080,192,000)

[time lasts]: 2
tx_succ_bytes 0x83350000 (2,201,288,704)

[time lasts]: 3
tx_succ_bytes 0x80fd0000 (2,164,064,256)

[time lasts]: 4
tx_succ_bytes 0x80790000 (2,155,413,504)

[time lasts]: 5
tx_succ_bytes 0x82a00000 (2,191,523,840)

[time lasts]: 6
tx_succ_bytes 0x83350000 (2,201,288,704)

[time lasts]: 7
tx_succ_bytes 0x82350000 (2,184,511,488)

[time lasts]: 8
tx_succ_bytes 0x84260000 (2,217,082,880)

[time lasts]: 9
tx_succ_bytes 0x835c0000 (2,203,844,608)

[time lasts]: 10
tx_succ_bytes 0x81880000 (2,173,173,760)

[time lasts]: 11
tx_succ_bytes 0x80a70000 (2,158,428,160)

[time lasts]: 12
tx_succ_bytes 0x80db0000 (2,161,836,032)

[time lasts]: 13
tx_succ_bytes 0x7faa0000 (2,141,847,552)

[time lasts]: 14
tx_succ_bytes 0x82b10000 (2,192,637,952)

[time lasts]: 15
tx_succ_bytes 0x837e0000 (2,206,072,832)

[time lasts]: 16
tx_succ_bytes 0x82a60000 (2,191,917,056)

[time lasts]: 17
tx_succ_bytes 0x83f60000 (2,213,937,152)

[time lasts]: 18
tx_succ_bytes 0x82ac0000 (2,192,310,272)

[time lasts]: 19
tx_succ_bytes 0x83e90000 (2,213,085,184)

[time lasts]: 20
tx_succ_bytes 0x82420000 (2,185,363,456)

[time lasts]: 21
tx_succ_bytes 0x83ce0000 (2,211,315,712)

[time lasts]: 22
tx_succ_bytes 0x82480000 (2,185,756,672)

[time lasts]: 23
tx_succ_bytes 0x82470000 (2,185,691,136)

[time lasts]: 24
tx_succ_bytes 0x83450000 (2,202,337,280)

[time lasts]: 25
tx_succ_bytes 0x80820000 (2,156,003,328)

[time lasts]: 26
tx_succ_bytes 0x82d80000 (2,195,193,856)

[time lasts]: 27
tx_succ_bytes 0x81930000 (2,173,894,656)

[time lasts]: 28
tx_succ_bytes 0x816a0000 (2,171,207,680)

[time lasts]: 29
tx_succ_bytes 0x83cf0000 (2,211,381,248)

[time lasts]: 30
tx_succ_bytes 0x82330000 (2,184,380,416)

[time lasts]: 31
tx_succ_bytes 0x828d0000 (2,190,278,656)

[time lasts]: 32
tx_succ_bytes 0x82a50000 (2,191,851,520)

[time lasts]: 33
tx_succ_bytes 0x81600000 (2,170,552,320)

[time lasts]: 34
tx_succ_bytes 0x82060000 (2,181,431,296)

[time lasts]: 35
tx_succ_bytes 0x814b0000 (2,169,176,064)

[time lasts]: 36
tx_succ_bytes 0x81370000 (2,167,865,344)

[time lasts]: 37
tx_succ_bytes 0x811c0000 (2,166,095,872)

[time lasts]: 38
tx_succ_bytes 0x83010000 (2,197,880,832)

[time lasts]: 39
tx_succ_bytes 0x83c50000 (2,210,725,888)

[time lasts]: 40
tx_succ_bytes 0x82f70000 (2,197,225,472)

[time lasts]: 41
tx_succ_bytes 0x83730000 (2,205,351,936)

[time lasts]: 42
tx_succ_bytes 0x816a0000 (2,171,207,680)

[time lasts]: 43
tx_succ_bytes 0x83360000 (2,201,354,240)

[time lasts]: 44
tx_succ_bytes 0x81170000 (2,165,768,192)

[time lasts]: 45
tx_succ_bytes 0x83930000 (2,207,449,088)

[time lasts]: 46
tx_succ_bytes 0x84430000 (2,218,983,424)

[time lasts]: 47
tx_succ_bytes 0x81600000 (2,170,552,320)

[time lasts]: 48
tx_succ_bytes 0x812a0000 (2,167,013,376)

[time lasts]: 49
tx_succ_bytes 0x80860000 (2,156,265,472)

[time lasts]: 50
tx_succ_bytes 0x81b70000 (2,176,253,952)

[time lasts]: 51
tx_succ_bytes 0x83610000 (2,204,172,288)

[time lasts]: 52
tx_succ_bytes 0x82dc0000 (2,195,456,000)

[time lasts]: 53
tx_succ_bytes 0x83760000 (2,205,548,544)

[time lasts]: 54
tx_succ_bytes 0x83360000 (2,201,354,240)

[time lasts]: 55
tx_succ_bytes 0x82570000 (2,186,739,712)

[time lasts]: 56
tx_succ_bytes 0x82370000 (2,184,642,560)

[time lasts]: 57
tx_succ_bytes 0x83430000 (2,202,206,208)

[time lasts]: 58
tx_succ_bytes 0x81e40000 (2,179,203,072)

[time lasts]: 59
tx_succ_bytes 0x81d30000 (2,178,088,960)

[time lasts]: 60
tx_succ_bytes 0x80b50000 (2,159,345,664)

[time lasts]: 61
tx_succ_bytes 0x838c0000 (2,206,990,336)

[time lasts]: 62
tx_succ_bytes 0x82580000 (2,186,805,248)

[time lasts]: 63
tx_succ_bytes 0x81eb0000 (2,179,661,824)

/*
* logs of TX node 3.
*/

orem-dirt:/home/admin/tyler # cat /proc/sys/net/ipv4/tcp_congestion_control
dctcp

orem-dirt:/home/admin/tyler # date; ./tcp_perf.exe --server-ip 169.254.85.4 --server-port 10001 --msg-len 65536 --conn-num 8 -c -t --data-random --show-bandwidth-only
Fri Mar 25 07:58:51 EDT 2022

conn [0] local 169.254.85.3:59660 peer 169.254.85.4:10001 created.
conn [1] local 169.254.85.3:59662 peer 169.254.85.4:10001 created.
conn [2] local 169.254.85.3:59664 peer 169.254.85.4:10001 created.
conn [3] local 169.254.85.3:59666 peer 169.254.85.4:10001 created.
conn [4] local 169.254.85.3:59668 peer 169.254.85.4:10001 created.
conn [5] local 169.254.85.3:59670 peer 169.254.85.4:10001 created.
conn [6] local 169.254.85.3:59672 peer 169.254.85.4:10001 created.
conn [7] local 169.254.85.3:59674 peer 169.254.85.4:10001 created.
8 connection(s) created in total
tx thread of conn 0 started
tx thread of conn 1 started
tx thread of conn 2 started
tx thread of conn 3 started
tx thread of conn 4 started
tx thread of conn 5 started
tx thread of conn 6 started
tx thread of conn 7 started

[time lasts]: 1
tx_succ_bytes 0x125530000 (4,921,163,776)

[time lasts]: 2
tx_succ_bytes 0xa22d0000 (2,720,858,112)

[time lasts]: 3
tx_succ_bytes 0x77530000 (2,001,928,192)

[time lasts]: 4
tx_succ_bytes 0x76b90000 (1,991,835,648)

[time lasts]: 5
tx_succ_bytes 0x78070000 (2,013,724,672)

[time lasts]: 6
tx_succ_bytes 0x74cd0000 (1,959,591,936)

[time lasts]: 7
tx_succ_bytes 0x76a80000 (1,990,721,536)

[time lasts]: 8
tx_succ_bytes 0x75dc0000 (1,977,352,192)

[time lasts]: 9
tx_succ_bytes 0x756b0000 (1,969,946,624)

[time lasts]: 10
tx_succ_bytes 0x76620000 (1,986,134,016)

[time lasts]: 11
tx_succ_bytes 0x772f0000 (1,999,568,896)

[time lasts]: 12
tx_succ_bytes 0x77090000 (1,997,078,528)

[time lasts]: 13
tx_succ_bytes 0x77220000 (1,998,716,928)

[time lasts]: 14
tx_succ_bytes 0x77ce0000 (2,009,989,120)

[time lasts]: 15
tx_succ_bytes 0x781d0000 (2,015,166,464)

[time lasts]: 16
tx_succ_bytes 0x762b0000 (1,982,529,536)

[time lasts]: 17
tx_succ_bytes 0x765c0000 (1,985,740,800)

[time lasts]: 18
tx_succ_bytes 0x77010000 (1,996,554,240)

[time lasts]: 19
tx_succ_bytes 0x74ad0000 (1,957,494,784)

[time lasts]: 20
tx_succ_bytes 0x76820000 (1,988,231,168)

[time lasts]: 21
tx_succ_bytes 0x75b70000 (1,974,927,360)

[time lasts]: 22
tx_succ_bytes 0x754f0000 (1,968,111,616)

[time lasts]: 23
tx_succ_bytes 0x75e30000 (1,977,810,944)

[time lasts]: 24
tx_succ_bytes 0x75730000 (1,970,470,912)

[time lasts]: 25
tx_succ_bytes 0x77380000 (2,000,158,720)

[time lasts]: 26
tx_succ_bytes 0x77900000 (2,005,925,888)

[time lasts]: 27
tx_succ_bytes 0x76c00000 (1,992,294,400)

[time lasts]: 28
tx_succ_bytes 0x75e10000 (1,977,679,872)

[time lasts]: 29
tx_succ_bytes 0x784c0000 (2,018,246,656)

[time lasts]: 30
tx_succ_bytes 0x768c0000 (1,988,886,528)

[time lasts]: 31
tx_succ_bytes 0x76c70000 (1,992,753,152)

[time lasts]: 32
tx_succ_bytes 0x76c80000 (1,992,818,688)

[time lasts]: 33
tx_succ_bytes 0x77c10000 (2,009,137,152)

[time lasts]: 34
tx_succ_bytes 0x78010000 (2,013,331,456)

[time lasts]: 35
tx_succ_bytes 0x74ef0000 (1,961,820,160)

[time lasts]: 36
tx_succ_bytes 0x76fe0000 (1,996,357,632)

[time lasts]: 37
tx_succ_bytes 0x76600000 (1,986,002,944)

[time lasts]: 38
tx_succ_bytes 0x76360000 (1,983,250,432)

[time lasts]: 39
tx_succ_bytes 0x73e40000 (1,944,322,048)

[time lasts]: 40
tx_succ_bytes 0x76120000 (1,980,891,136)

[time lasts]: 41
tx_succ_bytes 0x735b0000 (1,935,343,616)

[time lasts]: 42
tx_succ_bytes 0x74e30000 (1,961,033,728)

[time lasts]: 43
tx_succ_bytes 0x772f0000 (1,999,568,896)

[time lasts]: 44
tx_succ_bytes 0x75d30000 (1,976,762,368)

[time lasts]: 45
tx_succ_bytes 0x76a00000 (1,990,197,248)

[time lasts]: 46
tx_succ_bytes 0x76400000 (1,983,905,792)

[time lasts]: 47
tx_succ_bytes 0x75820000 (1,971,453,952)

[time lasts]: 48
tx_succ_bytes 0x768f0000 (1,989,083,136)

[time lasts]: 49
tx_succ_bytes 0x752c0000 (1,965,817,856)

[time lasts]: 50
tx_succ_bytes 0x76050000 (1,980,039,168)

[time lasts]: 51
tx_succ_bytes 0x77280000 (1,999,110,144)

[time lasts]: 52
tx_succ_bytes 0x757b0000 (1,970,995,200)

[time lasts]: 53
tx_succ_bytes 0x76220000 (1,981,939,712)

[time lasts]: 54
tx_succ_bytes 0x74de0000 (1,960,706,048)

[time lasts]: 55
tx_succ_bytes 0x75040000 (1,963,196,416)

[time lasts]: 56
tx_succ_bytes 0x76c10000 (1,992,359,936)

[time lasts]: 57
tx_succ_bytes 0x779d0000 (2,006,777,856)

[time lasts]: 58
tx_succ_bytes 0x766d0000 (1,986,854,912)

[time lasts]: 59
tx_succ_bytes 0x76a40000 (1,990,459,392)

[time lasts]: 60
tx_succ_bytes 0x760e0000 (1,980,628,992)

[time lasts]: 61
tx_succ_bytes 0x76910000 (1,989,214,208)

[time lasts]: 62
tx_succ_bytes 0x75270000 (1,965,490,176)

[time lasts]: 63
tx_succ_bytes 0x769d0000 (1,990,000,640)

[time lasts]: 64
tx_succ_bytes 0x77470000 (2,001,141,760)

/*
* counters on the switch.
* no packet dropping at all.
*/

hound-dirt# show queuing statistics interface ethernet 1/1/4
Interface ethernet1/1/4
Queue Packets Bytes Dropped-Packets Dropped-Bytes
0 0 0 0 0
1 14 1372 0 0
2 0 0 0 0
3 0 0 0 0
4 0 0 0 0
5 22209070 200252338712 0 0
6 0 0 0 0
7 0 0 0 0
hound-dirt#

hound-dirt# show queuing statistics interface ethernet 1/1/4
Interface ethernet1/1/4
Queue Packets Bytes Dropped-Packets Dropped-Bytes
0 0 0 0 0
1 14 1372 0 0
2 0 0 0 0
3 0 0 0 0
4 0 0 0 0
5 22209070 200252338712 0 0
6 0 0 0 0
7 0 0 0 0
hound-dirt#

2022-03-29 04:30:43

by Mingbao Sun

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] nvme-tcp: support specifying the congestion-control

> As I said, TCP can be tuned in various ways, congestion being just one
> of them. I'm sure you can find a workload where rmem/wmem will make
> a difference.

Agreed.
But the difference with the rmem/wmem knobs is:
we could enlarge rmem/wmem for NVMe/TCP via sysctl,
and that would not bring a downside to any other sockets whose
rmem/wmem are not explicitly specified.

> In addition, based on my knowledge, application specific TCP level
> tuning (like congestion) is not really a common thing to do. So why in
> nvme-tcp?
>
> So to me at least, it is not clear why we should add it to the driver.

As mentioned in the commit message, we can specify the
congestion-control of NVMe_over_TCP via sysctl (writing
'/proc/sys/net/ipv4/tcp_congestion_control'), but this also
changes the congestion-control of all future TCP sockets on
the same host that have not been explicitly assigned a
congestion-control, thus potentially impacting their
performance.

For example:

A server in a data-center with the following 2 NICs:

- NIC_front-end, for interacting with clients through WAN
(high latency, ms-level)

- NIC_back-end, for interacting with NVMe/TCP target through LAN
(low latency, ECN-enabled, ideal for dctcp)

This server interacts with clients (handling requests) via the front-end
network and accesses the NVMe/TCP storage via the back-end network.
This is a normal use case, right?

For the client devices, we can’t determine their congestion-control.
But normally it’s cubic by default (per the CONFIG_DEFAULT_TCP_CONG).
So if we change the default congestion-control on the server to dctcp
on behalf of the NVMe/TCP traffic on the LAN side, it would at the
same time switch the congestion-control of the front-end sockets
to dctcp, while the congestion-control on the client side remains cubic.
This is an unexpected scenario.

In addition, distributed storage products like the following also have
the above problem:

- The product consists of a cluster of servers.

- Each server serves clients via its front-end NIC
(WAN, high latency).

- All servers interact with each other over NVMe/TCP via the back-end NIC
(LAN, low latency, ECN-enabled, ideal for dctcp).
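
For comparison, a userspace application can already make this kind of
per-connection choice via the standard TCP_CONGESTION socket option,
without touching the host-wide sysctl default. A minimal sketch (the
algorithm name is only an example, and the corresponding module must be
available on the host):

/*
 * Set the congestion control of a single TCP socket.  Only this socket
 * is affected; the sysctl default used by other sockets stays unchanged.
 */
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

static int set_socket_congctl(int fd, const char *name)
{
	if (setsockopt(fd, IPPROTO_TCP, TCP_CONGESTION,
		       name, strlen(name)) < 0) {
		perror("setsockopt(TCP_CONGESTION)");
		return -1;
	}
	return 0;
}

int main(void)
{
	int fd = socket(AF_INET, SOCK_STREAM, 0);

	if (fd < 0 || set_socket_congctl(fd, "dctcp") < 0)
		return 1;
	/* ... connect() and use the socket as usual ... */
	return 0;
}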

2022-03-29 08:31:23

by Jakub Kicinski

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] nvme-tcp: support specifying the congestion-control

On Tue, 29 Mar 2022 10:48:06 +0800 Mingbao Sun wrote:
> A server in a data-center with the following 2 NICs:
>
> - NIC_fron-end, for interacting with clients through WAN
> (high latency, ms-level)
>
> - NIC_back-end, for interacting with NVMe/TCP target through LAN
> (low latency, ECN-enabled, ideal for dctcp)
>
> This server interacts with clients (handling requests) via the fron-end
> network and accesses the NVMe/TCP storage via the back-end network.
> This is a normal use case, right?

Well, if you have clearly separated networks you can set the congestion
control algorithm per route, right? man ip-route, search congctl.

> For the client devices, we can’t determine their congestion-control.
> But normally it’s cubic by default (per the CONFIG_DEFAULT_TCP_CONG).
> So if we change the default congestion control on the server to dctcp
> on behalf of the NVMe/TCP traffic of the LAN side, it could at the
> same time change the congestion-control of the front-end sockets
> to dctcp while the congestion-control of the client-side is cubic.
> So this is an unexpected scenario.
>
> In addition, distributed storage products like the following also have
> the above problem:
>
> - The product consists of a cluster of servers.
>
> - Each server serves clients via its front-end NIC
> (WAN, high latency).
>
> - All servers interact with each other via NVMe/TCP via back-end NIC
> (LAN, low latency, ECN-enabled, ideal for dctcp).

2022-03-29 13:26:44

by Sagi Grimberg

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] nvme-tcp: support specifying the congestion-control


>> As I said, TCP can be tuned in various ways, congestion being just one
>> of them. I'm sure you can find a workload where rmem/wmem will make
>> a difference.
>
> agree.
> but the difference for the knob of rmem/wmem is:
> we could enlarge rmem/wmem for NVMe/TCP via sysctl,
> and it would not bring downside to any other sockets whose
> rmem/wmem are not explicitly specified.

It can most certainly affect them, positively or negatively, depends
on the use-case.

>> In addition, based on my knowledge, application specific TCP level
>> tuning (like congestion) is not really a common thing to do. So why in
>> nvme-tcp?
>>
>> So to me at least, it is not clear why we should add it to the driver.
>
> As mentioned in the commit message, though we can specify the
> congestion-control of NVMe_over_TCP via sysctl or writing
> '/proc/sys/net/ipv4/tcp_congestion_control', but this also
> changes the congestion-control of all the future TCP sockets on
> the same host that have not been explicitly assigned the
> congestion-control, thus bringing potential impaction on their
> performance.
>
> For example:
>
> A server in a data-center with the following 2 NICs:
>
> - NIC_fron-end, for interacting with clients through WAN
> (high latency, ms-level)
>
> - NIC_back-end, for interacting with NVMe/TCP target through LAN
> (low latency, ECN-enabled, ideal for dctcp)
>
> This server interacts with clients (handling requests) via the fron-end
> network and accesses the NVMe/TCP storage via the back-end network.
> This is a normal use case, right?
>
> For the client devices, we can’t determine their congestion-control.
> But normally it’s cubic by default (per the CONFIG_DEFAULT_TCP_CONG).
> So if we change the default congestion control on the server to dctcp
> on behalf of the NVMe/TCP traffic of the LAN side, it could at the
> same time change the congestion-control of the front-end sockets
> to dctcp while the congestion-control of the client-side is cubic.
> So this is an unexpected scenario.
>
> In addition, distributed storage products like the following also have
> the above problem:
>
> - The product consists of a cluster of servers.
>
> - Each server serves clients via its front-end NIC
> (WAN, high latency).
>
> - All servers interact with each other via NVMe/TCP via back-end NIC
> (LAN, low latency, ECN-enabled, ideal for dctcp).

Separate networks are still not application (nvme-tcp) specific and as
mentioned, we have a way to control that. IMO, this still does not
qualify as solid justification to add this to nvme-tcp.

What do others think?

2022-03-30 17:24:50

by Mingbao Sun

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] nvme-tcp: support specifying the congestion-control

On Mon, 28 Mar 2022 21:33:53 -0700
Jakub Kicinski <[email protected]> wrote:

> On Tue, 29 Mar 2022 10:48:06 +0800 Mingbao Sun wrote:
> > A server in a data-center with the following 2 NICs:
> >
> > - NIC_fron-end, for interacting with clients through WAN
> > (high latency, ms-level)
> >
> > - NIC_back-end, for interacting with NVMe/TCP target through LAN
> > (low latency, ECN-enabled, ideal for dctcp)
> >
> > This server interacts with clients (handling requests) via the fron-end
> > network and accesses the NVMe/TCP storage via the back-end network.
> > This is a normal use case, right?
>
> Well, if you have clearly separated networks you can set the congestion
> control algorithm per route, right? man ip-route, search congctl.

Cool, many thanks for the education.

I verified this approach, and it did work well.
And I further found the commit
‘net: tcp: add per route congestion control’, which addresses exactly
the requirement of this scenario (separated network).
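
A per-route setting of this kind looks roughly as follows (the prefix is
the back-end subnet from the test setup above; the device name is only
illustrative):

  ip route replace 169.254.85.0/24 dev bond0 congctl dctcp

With that in place, only connections routed over that prefix use dctcp,
while the host-wide default congestion-control stays unchanged.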

So with this approach, the requirements of our use case are
roughly satisfied.

Thanks again ^_^

2022-03-31 05:13:48

by Mingbao Sun

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] nvme-tcp: support specifying the congestion-control

On Tue, 29 Mar 2022 10:46:08 +0300
Sagi Grimberg <[email protected]> wrote:

> >> As I said, TCP can be tuned in various ways, congestion being just one
> >> of them. I'm sure you can find a workload where rmem/wmem will make
> >> a difference.
> >
> > agree.
> > but the difference for the knob of rmem/wmem is:
> > we could enlarge rmem/wmem for NVMe/TCP via sysctl,
> > and it would not bring downside to any other sockets whose
> > rmem/wmem are not explicitly specified.
>
> It can most certainly affect them, positively or negatively, depends
> on the use-case.

Agreed.
Your statement is more rigorous.

> >> In addition, based on my knowledge, application specific TCP level
> >> tuning (like congestion) is not really a common thing to do. So why in
> >> nvme-tcp?
> >>
> >> So to me at least, it is not clear why we should add it to the driver.
> >
> > As mentioned in the commit message, though we can specify the
> > congestion-control of NVMe_over_TCP via sysctl or writing
> > '/proc/sys/net/ipv4/tcp_congestion_control', but this also
> > changes the congestion-control of all the future TCP sockets on
> > the same host that have not been explicitly assigned the
> > congestion-control, thus bringing potential impaction on their
> > performance.
> >
> > For example:
> >
> > A server in a data-center with the following 2 NICs:
> >
> > - NIC_fron-end, for interacting with clients through WAN
> > (high latency, ms-level)
> >
> > - NIC_back-end, for interacting with NVMe/TCP target through LAN
> > (low latency, ECN-enabled, ideal for dctcp)
> >
> > This server interacts with clients (handling requests) via the fron-end
> > network and accesses the NVMe/TCP storage via the back-end network.
> > This is a normal use case, right?
> >
> > For the client devices, we can’t determine their congestion-control.
> > But normally it’s cubic by default (per the CONFIG_DEFAULT_TCP_CONG).
> > So if we change the default congestion control on the server to dctcp
> > on behalf of the NVMe/TCP traffic of the LAN side, it could at the
> > same time change the congestion-control of the front-end sockets
> > to dctcp while the congestion-control of the client-side is cubic.
> > So this is an unexpected scenario.
> >
> > In addition, distributed storage products like the following also have
> > the above problem:
> >
> > - The product consists of a cluster of servers.
> >
> > - Each server serves clients via its front-end NIC
> > (WAN, high latency).
> >
> > - All servers interact with each other via NVMe/TCP via back-end NIC
> > (LAN, low latency, ECN-enabled, ideal for dctcp).
>
> Separate networks are still not application (nvme-tcp) specific and as
> mentioned, we have a way to control that. IMO, this still does not
> qualify as solid justification to add this to nvme-tcp.
>
> What do others think?

Well, given that the approach (‘ip route …’) proposed
by Jakub largely covers the per-link requirement on
congestion-control, the usefulness of this patchset is really
not that significant.

So I am closing all the threads of this patchset here.

Finally, many thanks to all of you for reviewing this patchset.

2022-03-31 05:51:26

by Mingbao Sun

[permalink] [raw]
Subject: Re: [PATCH v2 2/3] nvme-tcp: support specifying the congestion-control

Oops, my mail service had some issues.
Sorry for the duplicated emails.